Use the measured per-channel rho as the no-op gate and add a second correction pass when the analytic floor is not enough. This keeps the clamp aligned with the actual model metric so the all_layer Colab run does not fail on later layers.
Clamp the Mamba recurrence at the discrete_A level instead of rewriting dt. This aligns the intervention with the paper equation and preserves the input-dependent B term so near-no-op targets like rho=0.99 do not collapse accuracy. Add coverage for token-wise discrete_A scaling so the clamp stays a no-op below target and respects the sanity check above target.
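A minimal sketch of the idea, not the script's actual helpers: it assumes a diagonal state matrix with zero-order-hold discretization `discrete_A = exp(dt * A)`, and it takes rho to be the largest magnitude in a token's `discrete_A` row. The names `discretize_a` and `clamp_discrete_a` are hypothetical, and plain Python lists stand in for tensors. Because only `discrete_A` is rescaled, `dt` (and therefore the input-dependent B term) is left untouched.

```python
import math


def discretize_a(dt, a):
    """Return exp(dt_t * a_n) for every token step dt_t and state entry a_n."""
    return [[math.exp(dt_t * a_n) for a_n in a] for dt_t in dt]


def clamp_discrete_a(discrete_a, target_rho):
    """Rescale each token's row so its largest magnitude is at most target_rho.

    Rows already at or below the target are returned unchanged (no-op gate).
    The definition of rho used here is an assumption made for illustration.
    """
    clamped = []
    for row in discrete_a:
        rho = max(abs(value) for value in row)
        # Gate: only rows whose measured rho exceeds the target are scaled.
        scale = 1.0 if rho <= target_rho else target_rho / rho
        clamped.append([value * scale for value in row])
    return clamped
```

With `dt = [0.001, 0.5]`, `a = [-1.0, -4.0]`, and `target_rho = 0.99`, only the first token's row exceeds the target and is scaled down; the second row passes through unchanged.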
Caution: Review failed. The pull request is closed.
Walkthrough
The pull request replaces the forward-hook-based clamping mechanism with a patched forward-path approach in the causal intervention script. New helper functions compute and clamp discrete-A parameters, and the clamping logic now applies iterative refinement during the forward pass. A corrupted file path in a notebook is also corrected.
Estimated code review effort: 3 (Moderate), about 25 minutes