[Paper Discussion] Correctness-Optimized Residual Activation Lens (CORAL): Transferrable and Calibration-Aware Inference-Time Steering

## Paper Information

**Title**: [Correctness-Optimized Residual Activation Lens (CORAL): Transferrable and Calibration-Aware Inference-Time Steering](https://arxiv.org/abs/2602.06022v1)
**Authors**: Miranda Muqing Miao, Young-Min Cho, Lyle Ungar
**Published**: 2026-02-05
**Category**: cs.AI
**PDF**: [Download](https://arxiv.org/pdf/2602.06022v1.pdf)

## Abstract

Large language models (LLMs) exhibit persistent miscalibration, especially after instruction tuning and preference alignment. Modified training objectives can improve calibration, but retraining is expensive. Inference-time steering offers a lightweight alternative, yet most existing methods optimize proxies for correctness rather than correctness itself. We introduce CORAL (Correctness-Optimized Residual Activation Lens), a regularized inference-time steering method that captures distributed correctness signals from model internal activations using weight-decay MLP probes. We evaluate CORAL across three 7B-parameter models and find that it consistently improves accuracy by 10% and expected calibration error (ECE) by 50% on average. We additionally demonstrate that these gains transfer without retraining to the complete published test sets of four held-out benchmarks (ARC-Challenge, HellaSwag, Math-MC, OpenBookQA), averaging 14% accuracy improvements and 49% ECE improvements. Our results support the hypothesis that distributed information in model internals can be extracted using regularized probes when individual neurons are insufficient. CORAL thus provides a compute-efficient, transferable, and calibration-aware approach to improve MCQA performance during inference.
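For readers less familiar with the calibration metric cited in the abstract, here is a minimal sketch of the commonly used equal-width-bin form of expected calibration error. It is an illustration only: the function name `expected_calibration_error`, the choice of 10 bins, and the half-open binning convention are assumptions, and the paper's exact ECE variant may differ.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: bin-size-weighted mean of |accuracy - confidence|.

    confidences: predicted probability of the chosen answer, each in [0, 1].
    correct:     truthy/falsy flags saying whether each answer was right.
    n_bins:      number of equal-width bins (10 is an assumed default).
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Bins are [k/n_bins, (k+1)/n_bins); a confidence of 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


if __name__ == "__main__":
    # Four answers at 90% confidence, three of them right: the gap is |0.75 - 0.9|.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

Lower ECE means stated confidence tracks observed accuracy more closely, so an "ECE improvement" in the abstract refers to a reduction in this quantity.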

## Why Recommended

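Paper 1, recommended for discussion: CORAL demonstrates a new paradigm for extracting distributed information. The 50% ECE improvement and the transfer across test sets are significant contributions, and the "weight-decay MLP probe" design merits deeper discussion.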
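As a starting point for discussing the probe design, the toy sketch below trains a small MLP with L2 weight decay to predict a binary "correct / incorrect" label from a feature vector. Everything in it is an assumption made for illustration: the synthetic features stand in for real model activations, and the one-hidden-layer architecture, the hyperparameters, and the names `make_toy_data`, `train_probe`, and `predict` are hypothetical rather than taken from the paper. It also does not show how CORAL uses a probe for steering.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_toy_data(n=200, dim=8, seed=0):
    """Synthetic stand-in for activations: the label depends on ALL features
    (the sign of their sum), so no single feature is decisive on its own."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    y = [1 if sum(x) > 0 else 0 for x in X]
    return X, y


def predict(params, x):
    """Return the probe's estimated probability that the label is 1."""
    W1, b1, W2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    return sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)


def train_probe(X, y, hidden=4, lr=0.1, weight_decay=1e-2, epochs=30, seed=0):
    """Per-sample SGD on binary cross-entropy plus an L2 penalty on the weights."""
    rng = random.Random(seed)
    dim = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            # Forward pass.
            h = [math.tanh(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j])
                 for j in range(hidden)]
            p = sigmoid(sum(W2[j] * h[j] for j in range(hidden)) + b2)
            dz = p - t  # gradient of BCE w.r.t. the output logit
            for j in range(hidden):
                # Backpropagate through tanh before W2[j] is updated.
                dh = dz * W2[j] * (1.0 - h[j] ** 2)
                # Weight decay adds weight_decay * w to each weight gradient.
                W2[j] -= lr * (dz * h[j] + weight_decay * W2[j])
                for i in range(dim):
                    W1[j][i] -= lr * (dh * x[i] + weight_decay * W1[j][i])
                b1[j] -= lr * dh  # biases are left undecayed
            b2 -= lr * dz
    return W1, b1, W2, b2


if __name__ == "__main__":
    X, y = make_toy_data()
    params = train_probe(X, y)
    hits = sum((predict(params, x) > 0.5) == (t == 1) for x, t in zip(X, y))
    print("training accuracy:", hits / len(X))
```

The probe's output is a probability, so it could in principle be scored with a calibration metric as well as with accuracy. Whether the paper's probes resemble this layout is an open question for the thread.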

## Discussion

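Please share your thoughts on this paper:
- What are the paper's novel contributions?
- Is the method sound?
- Are the experimental results credible?
- What could be improved?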

---
_Automatically created by arXiv Monitor_
