Clamp discrete_A in float64 with a small internal safety buffer so the cast back to float32 cannot reintroduce a tiny rho overshoot during long Colab sweeps. Add a regression test that keeps feeding residual overshoots past three correction passes to cover the exact resume failure mode.
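A minimal sketch of why the buffer matters, using only the standard library. The PR does not show how rho is computed or what bound is used, so this treats rho as the largest absolute entry and uses a bound of 0.999 purely for illustration. The helper names `to_float32`, `naive_clamp`, and `clamp_with_buffer` and the `1e-6` buffer are assumptions, not the repository's code.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def naive_clamp(values, rho_max):
    """Clamp in float64 exactly at rho_max, then cast to float32."""
    return [to_float32(max(-rho_max, min(rho_max, v))) for v in values]


def clamp_with_buffer(values, rho_max, buffer=1e-6):
    """Clamp in float64 slightly inside rho_max, then cast to float32.

    The relative buffer (hypothetical default) is larger than the float32
    rounding error of about 6e-8, so rounding cannot cross rho_max again.
    """
    limit = rho_max * (1.0 - buffer)
    return [to_float32(max(-limit, min(limit, v))) for v in values]


rho_max = 0.999  # not exactly representable in float32
overshoot = naive_clamp([1.5], rho_max)[0]
safe = clamp_with_buffer([1.5], rho_max)[0]

assert overshoot > rho_max  # the cast rounds up past the bound
assert safe <= rho_max      # the buffer absorbs the rounding error
```

In a real sweep the same idea would likely be applied to tensors rather than Python lists; the list version here only isolates the rounding behavior.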
Add a recovery-first notebook that rebuilds checkpoint CSVs from an executed 07 resume notebook before loading the model. Split the remaining execution into smaller cells so interrupted Colab runs can continue from much smaller checkpoints instead of replaying the whole sweep.
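The recovery step could look roughly like the sketch below, which reads an executed notebook's JSON and pulls result rows out of its stream outputs. The `cells`, `outputs`, `output_type`, and `text` keys follow the standard notebook JSON layout; the row schema (`seed,protocol,layer,rho,accuracy`) and the function names are hypothetical, since the PR text does not show the actual table format.

```python
import json

KEY_FIELDS = ("seed", "protocol", "layer", "rho")  # hypothetical dedup key
FIELDS = KEY_FIELDS + ("accuracy",)                # hypothetical row schema


def iter_stream_lines(notebook: dict):
    """Yield every line printed to a stream output of a code cell."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            if output.get("output_type") != "stream":
                continue
            text = output.get("text", "")
            if isinstance(text, list):  # text may be stored as a list of lines
                text = "".join(text)
            yield from text.splitlines()


def recover_rows(notebook: dict):
    """Collect valid result rows; a later duplicate replaces an earlier one."""
    rows = {}
    for line in iter_stream_lines(notebook):
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != len(FIELDS):
            continue  # not a result row (log line, progress bar, ...)
        try:
            int(parts[0]), int(parts[2]), float(parts[3]), float(parts[4])
        except ValueError:
            continue  # header line or malformed row
        row = dict(zip(FIELDS, parts))
        rows[tuple(row[k] for k in KEY_FIELDS)] = row
    return list(rows.values())


def recover_from_file(path: str):
    """Load an executed .ipynb file and recover its rows."""
    with open(path, encoding="utf-8") as f:
        return recover_rows(json.load(f))
```

Because recovery only needs the notebook file, it can run before the model is loaded, which keeps a failed recovery cheap.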
Caution: Review failed. The pull request is closed.
Run configuration: defaults. Review profile: CHILL. Plan: Pro. Files selected for processing: 4.
Walkthrough
A new Jupyter notebook and builder script enable resumable causal intervention experiments with recovery capabilities. The notebook automates repository setup, extracts previously-computed protocol results from notebook outputs, validates and merges them into checkpoints, tracks remaining tasks, then resumes targeted model evaluations with incremental checkpointing.
Sequence Diagram
sequenceDiagram
    actor User
    participant Notebook as 08 Targeted Resume
    participant Recovery as Recovery Logic
    participant Checkpoint as CSV Checkpoints
    participant Model as Model Loader
    participant Evaluator as Accuracy Evaluator
    User->>Notebook: Run notebook
    Notebook->>Notebook: Initialize repo & config
    Notebook->>Recovery: Find prior recovery notebook
    Recovery->>Recovery: Parse notebook JSON & extract tables
    Recovery->>Checkpoint: Merge recovered results (dedup)
    Recovery->>Notebook: Return status frame (missing targets)
    Notebook->>User: Display recovery preview
    User->>Model: Approve to load model/tokenizer
    Model->>Notebook: Model ready
    Notebook->>Notebook: For each (seed, protocol, layer)
    Notebook->>Checkpoint: Check completed rho targets
    Notebook->>Checkpoint: Add baseline row if missing
    loop For each missing rho
        Notebook->>Evaluator: evaluate_accuracy(...)
        Evaluator->>Notebook: Accuracy result
        Notebook->>Checkpoint: Write row incrementally
    end
    Notebook->>Notebook: Consolidate all checkpoints
    Notebook->>Checkpoint: Verify all expected files exist
    Notebook->>User: Display unified results & statistics
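The inner loop of the diagram (check completed rho targets, evaluate only the missing ones, write each row immediately) can be sketched as follows. The checkpoint schema, the one-file-per-(seed, protocol, layer) layout, and the `evaluate` callable are assumptions for illustration; `evaluate` stands in for `evaluate_accuracy(...)`, whose real arguments are not shown in the review.

```python
import csv
import os

FIELDS = ["seed", "protocol", "layer", "rho", "accuracy"]  # hypothetical schema


def completed_rhos(path):
    """Return the rho values already present in a checkpoint CSV."""
    if not os.path.exists(path):
        return set()
    with open(path, newline="") as f:
        return {row["rho"] for row in csv.DictReader(f)}


def resume_sweep(path, seed, protocol, layer, rhos, evaluate):
    """Evaluate only the missing rho targets, appending one row at a time."""
    done = completed_rhos(path)
    need_header = not os.path.exists(path) or os.path.getsize(path) == 0
    evaluated = []
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if need_header:
            writer.writeheader()
        for rho in rhos:
            key = repr(float(rho))  # stable text form used for matching
            if key in done:
                continue  # already checkpointed by an earlier run
            accuracy = evaluate(seed, protocol, layer, rho)
            writer.writerow({"seed": seed, "protocol": protocol,
                             "layer": layer, "rho": key, "accuracy": accuracy})
            f.flush()  # push the row out so an interruption loses at most one target
            evaluated.append(rho)
    return evaluated
```

Writing and flushing after every evaluation trades a little I/O for the property that an interrupted run only has to repeat the target that was in flight.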
Estimated code review effort: 4 (Complex), ~60 minutes
Summary by CodeRabbit
New Features
Improvements
Tests