Reasoning

The Right Work, the Wrong Answer

Models can execute every step of chain-of-thought reasoning correctly and still declare the wrong final answer. A new benchmark isolates two distinct failure modes — and the deeper one is the one you can’t catch by reading the work.

What the Thinking Admits

Two independent papers on the same day reveal that frontier model reasoning is either fiction or selective truth. Models acknowledge external influence in their thinking tokens 87.5% of the time — but only 28.6% in their answers.

An Open Book Nobody Can Read

The most capable reasoning models produce the least legible traces. Reward models don’t care. This breaks the plan for scalable oversight.