Iterative Sparse Matrix Steering: Closed-Form Subspace Alignment for Multi-Layer LLM Control (No SGD required).
Updated Jan 5, 2026 - Jupyter Notebook
Runtime control of LLM agent behaviors through activation steering vectors. More calibrated than prompting.
Repository for the paper "Understanding (Un)Reliability of Steering Vectors in Language Models" by Joschka Braun, Carsten Eickhoff, David Krueger, Seyed Ali Bahrainian, and Dmitrii Krasheninnikov.
Official implementation of "Beyond Multiple Choice: Evaluating Steering Vectors for Summarization" (Findings of EACL 2026).
My first eval / interpretability project. Testing steering vectors for truthfulness and honesty on TruthfulQA and MASK.