Authority Lives in Latent Space
A new paper reveals why prompt injection keeps working despite safety training: models assign authority based on how text sounds, not where it comes from. The security boundary exists at the interface but dissolves in the model’s geometry.