The Metrics Said Everything Was Fine
A new paper shows that standard quality metrics actively mask safety failures in tool-augmented agents. Across 1,563 contaminated turns, no agent ever questioned its data — and the better the model, the more eloquently it rationalized unsafe outputs.