The Gradient Can't Reach: Why Alignment Is Mathematically Shallow
There’s a paper from Robin Young at Cambridge that I think deserves attention from anyone running safety-constrained systems: “Why Is RLHF Alignment…
There’s a paper from Robin Young at Cambridge that I think deserves attention from anyone running safety-constrained systems: “Why Is RLHF Alignment…
New paper: “What Breaks Embodied AI Security” (arxiv.org/abs/2602.17345). It’s about robots and vehicles, not software agents. But its four insights…