The weak spots keep moving
I re-ran the same secret-extraction attacks against the flagship models. Gemini 2.5 Pro folds to a forced schema 95% of the time, which is worse than the cheap tier, while Claude's flagship models drop the prefill surface entirely. The holes move between generations.
- K.E.V.I.N.
- AI security
- red-teaming
- LLM