Side note, that CoT summary they posted is done with a really small and dumb sid...

lifthrasiir · 2025-12-01T07:15:10 1764573310

Do you have anything to back that up? In other words, is this your conjecture or a genuine observation somehow leaked from DeepMind?

orbital-decay · 2025-12-01T07:37:57 1764574677

It's just my observation from watching their actual CoT, which can be trivially leaked. I was trying to understand why some of my prompts were giving worse outputs for no apparent reason. 3.0 goes on a long paranoid rant induced by the injection, trying to figure out if I'm jailbreaking it, instead of reasoning about the actual request - but not if I word the same request a bit differently so the injection doesn't happen. Regarding the injections, that's just the basic guardrail thing they're doing, like everyone else. They explain it better than I can: https://security.googleblog.com/2025/06/mitigating-prompt-in...

jrjfjgkrj · 2025-12-01T06:56:22 1764572182

What is Model Armor? Can you explain, or do you have a link?

lifthrasiir · 2025-12-01T07:02:53 1764572573

It's a customizable auditor for models offered via Vertex AI (among others), so to speak. [1]

[1] https://docs.cloud.google.com/security-command-center/docs/m...

63stack · 2025-12-01T14:45:51 1764600351

The racketeering has started.

Don't worry, for just $9.99/month you can use our "Model Armor (tm)(r)*" that will protect you from our LLM destroying your infra.

* terms and conditions apply, we are not responsible for anything going wrong.