
The Information Geometry of Softmax: Probing and Steering
We leverage information geometry to formalize the semantic structure of embedding spaces induced by the softmax architecture, common in LLMs and VLMs. Accepted to ICML 2026.

Building upon our ICML 2024 paper, we formalize the representations of categorical concepts in LLMs as convex polytopes and further show that hierarchically related concepts are represented as orthogonal vectors. Published in ICLR 2025 as an Oral (top 1.8%).

We formalize different notions of linear representations in LLMs (e.g., linear probes and steering vectors) and unify those notions by identifying an inner product that encodes semantic structure. Published in ICML 2024.