fix/colab causal repro notebook#9

Merged
DaviBonetto merged 2 commits into main from codex/fix/colab-causal-repro-notebook
Mar 19, 2026

@DaviBonetto DaviBonetto commented Mar 19, 2026

Summary by CodeRabbit

  • New Features

    • Added a resumable Jupyter notebook for causal intervention experiments that recovers from prior runs and skips completed tasks.
  • Improvements

    • Enhanced spectral-radius clamping computation with improved tolerance handling and iterative correction logic.
  • Tests

    • Added test coverage for edge cases requiring multiple correction iterations.

Clamp discrete_A in float64 with a small internal safety buffer so the cast back to float32 cannot reintroduce a tiny rho overshoot during long Colab sweeps.
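The float64 clamp with a safety buffer can be sketched as below. This is a minimal illustration, not the code in `scripts/run_causal_intervention.py`: it uses NumPy rather than the project's tensor library, and it assumes `discrete_A` is stored as a diagonal, so its spectral radius is the largest absolute entry. The function name `clamp_spectral_radius` and the `margin` default are hypothetical.

```python
import numpy as np

def clamp_spectral_radius(a, target_rho, margin=1e-6):
    """Rescale a diagonal state matrix so max|a| stays at or below target_rho.

    Assumption: `a` holds the diagonal of discrete_A, so the spectral radius
    is simply the largest absolute entry.
    """
    out_dtype = a.dtype
    a64 = a.astype(np.float64)          # do the arithmetic in float64
    rho = float(np.max(np.abs(a64)))
    if rho <= target_rho:
        return a                        # already inside the bound
    # Aim slightly below the target so that rounding on the cast back to
    # float32 cannot push the result over target_rho again.
    safe_target = target_rho * (1.0 - margin)
    a64 *= safe_target / rho
    return a64.astype(out_dtype)

a = np.array([0.5, -1.3, 1.0000001, 0.99], dtype=np.float32)
clamped = clamp_spectral_radius(a, target_rho=0.95)
rho_after = float(np.max(np.abs(clamped.astype(np.float64))))
```

The buffer works because a float32 cast changes a value by a relative error of at most about 6e-8, which is well inside a relative margin of 1e-6.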

Add a regression test that keeps feeding residual overshoots past three correction passes to cover the exact resume failure mode.
Add a recovery-first notebook that rebuilds checkpoint CSVs from an executed 07 resume notebook before loading the model.
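A rough sketch of the recovery step: read an executed notebook's JSON, pull result rows out of the cell outputs, deduplicate them, and write a checkpoint CSV. The row format is an assumption made for illustration only. The sketch pretends the earlier run printed lines starting with a hypothetical `RESULT,` prefix, and the column names in `FIELDS` are likewise hypothetical. The actual notebook may extract tables differently.

```python
import csv
import io
import json

FIELDS = ["seed", "protocol", "layer", "rho", "accuracy"]  # hypothetical schema

def load_notebook(path):
    """Load an .ipynb file; notebooks are plain JSON documents."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)

def iter_output_text(notebook):
    """Yield the text of every stream output in an nbformat-4 notebook dict."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        for out in cell.get("outputs", []):
            if out.get("output_type") == "stream":
                text = out.get("text", "")
                # nbformat allows either one string or a list of strings
                yield text if isinstance(text, str) else "".join(text)

def recover_rows(notebook, prefix="RESULT,"):
    """Collect result rows printed by a previous run, dropping duplicates."""
    seen, rows = set(), []
    for text in iter_output_text(notebook):
        for line in text.splitlines():
            if not line.startswith(prefix):
                continue
            values = next(csv.reader([line[len(prefix):]]))
            if len(values) != len(FIELDS):
                continue  # skip malformed lines
            key = tuple(values[:4])  # (seed, protocol, layer, rho)
            if key not in seen:
                seen.add(key)
                rows.append(dict(zip(FIELDS, values)))
    return rows

# Tiny in-memory notebook standing in for an executed resume notebook.
demo = {"cells": [
    {"cell_type": "markdown", "source": ["notes"]},
    {"cell_type": "code", "outputs": [
        {"output_type": "stream", "name": "stdout",
         "text": ["RESULT,42,A,3,0.9,0.81\n", "progress 1/2\n",
                  "RESULT,42,A,3,0.9,0.81\n", "RESULT,42,A,3,1.0,0.77\n"]},
    ]},
]}
rows = recover_rows(demo)

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
checkpoint_csv = buf.getvalue()
```

Because this step only parses JSON and text, it can run before the model is loaded, which is the point of a recovery-first layout.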

Split the remaining execution into smaller cells so interrupted Colab runs can continue from much smaller checkpoints instead of replaying the whole sweep.
@DaviBonetto DaviBonetto merged commit dd9af75 into main Mar 19, 2026
1 of 2 checks passed

coderabbitai bot commented Mar 19, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c40ca6db-1816-4fe5-a0b3-65beab1ea883

📥 Commits

Reviewing files that changed from the base of the PR and between d1d35ad and 8768550.

📒 Files selected for processing (4)
  • notebooks/08_Causal_Intervention_Targeted_Resume.ipynb
  • scripts/build_targeted_resume_notebook.py
  • scripts/run_causal_intervention.py
  • tests/test_run_causal_intervention.py

Walkthrough

A new Jupyter notebook and its builder script enable resumable causal intervention experiments with recovery from earlier runs. The notebook automates repository setup, extracts previously computed protocol results from notebook outputs, validates and merges them into checkpoints, tracks the remaining tasks, and then resumes targeted model evaluations with incremental checkpointing.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Notebook Generation Infrastructure | scripts/build_targeted_resume_notebook.py | New script that programmatically constructs 08_Causal_Intervention_Targeted_Resume.ipynb, defining cell templates for repository initialization, configuration, recovery helpers (notebook detection, table extraction, deduplication), run helpers (prompt caching, evaluation), and execution/consolidation logic. |
| Causal Intervention Targeted Resume Notebook | notebooks/08_Causal_Intervention_Targeted_Resume.ipynb | New notebook providing resumable execution of causal intervention experiments. Includes recovery logic to detect and merge prior protocol results, task status tracking showing completed vs. missing evaluations, deferred model loading, per-protocol execution cells for remaining work across seeds (42, 123, 456), and final consolidation producing unified results CSVs and statistics. |
| Spectral-Radius Clamping Refinement | scripts/run_causal_intervention.py | Updated discrete-A clamping logic to use float64 internally with dtype re-casting. Changed overshoot detection from fixed 1e-6 tolerance to margin-based target (target_rho - margin), replaced hardcoded 3-pass residual loop with bounded multi-pass controlled by _MAX_DISCRETE_A_CLAMP_PASSES, and adjusted validation threshold constant. |
| Clamping Logic Test Coverage | tests/test_run_causal_intervention.py | New test test_clamp_discrete_a_keeps_correcting_when_residuals_need_more_than_three_passes verifies that the updated multi-pass clamping loop performs sufficient iterations (≥6 calls) when float residuals require repeated corrections beyond the previous 3-pass baseline. |
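The bounded multi-pass loop and the kind of regression test described above might look roughly like this. It is a sketch under stated assumptions, not the repository's code: `clamp_until_within`, `make_stubborn_estimator`, and the value of `MAX_CLAMP_PASSES` are hypothetical, the matrix is again treated as a diagonal, and the loop here rescales toward `target_rho - margin` while checking against `target_rho`, which is only one plausible reading of the summary. The spectral-radius estimator is injected so that a test double can keep reporting a residual overshoot.

```python
MAX_CLAMP_PASSES = 8  # hypothetical; stands in for _MAX_DISCRETE_A_CLAMP_PASSES

def clamp_until_within(values, target_rho, estimate_rho, margin=1e-6,
                       max_passes=MAX_CLAMP_PASSES):
    """Rescale `values` until estimate_rho(values) <= target_rho, or fail."""
    safe_target = target_rho - margin   # margin-based target
    for _ in range(max_passes):
        rho = estimate_rho(values)
        if rho <= target_rho:
            return values
        values = [v * (safe_target / rho) for v in values]
    # Final validation after the pass budget is exhausted.
    if estimate_rho(values) > target_rho:
        raise RuntimeError("spectral radius still above target after clamping")
    return values

def make_stubborn_estimator(target_rho, fake_overshoots):
    """Estimator that reports a small residual overshoot for the first N calls."""
    calls = {"n": 0}
    def estimate(values):
        calls["n"] += 1
        if calls["n"] <= fake_overshoots:
            return target_rho + 1e-5
        return max(abs(v) for v in values)
    return estimate, calls

estimate, calls = make_stubborn_estimator(0.95, fake_overshoots=5)
clamped = clamp_until_within([1.2, -0.4], 0.95, estimate)
```

With five simulated overshoots, a fixed three-pass loop would stop while the estimator still reports a violation, whereas the bounded loop keeps correcting until the measurement is within the target.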

Sequence Diagram

sequenceDiagram
    actor User
    participant Notebook as 08 Targeted Resume
    participant Recovery as Recovery Logic
    participant Checkpoint as CSV Checkpoints
    participant Model as Model Loader
    participant Evaluator as Accuracy Evaluator
    
    User->>Notebook: Run notebook
    Notebook->>Notebook: Initialize repo & config
    Notebook->>Recovery: Find prior recovery notebook
    Recovery->>Recovery: Parse notebook JSON & extract tables
    Recovery->>Checkpoint: Merge recovered results (dedup)
    Recovery->>Notebook: Return status frame (missing targets)
    Notebook->>User: Display recovery preview
    User->>Model: Approve to load model/tokenizer
    Model->>Notebook: Model ready
    Notebook->>Notebook: For each (seed, protocol, layer)
    Notebook->>Checkpoint: Check completed rho targets
    Notebook->>Checkpoint: Add baseline row if missing
    loop For each missing rho
        Notebook->>Evaluator: evaluate_accuracy(...)
        Evaluator->>Notebook: Accuracy result
        Notebook->>Checkpoint: Write row incrementally
    end
    Notebook->>Notebook: Consolidate all checkpoints
    Notebook->>Checkpoint: Verify all expected files exist
    Notebook->>User: Display unified results & statistics
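The inner loop of the diagram (check completed targets, evaluate only the missing ones, write each row incrementally) can be approximated with the standard library alone. The column names, the protocol label `"A"`, the layer index, and `fake_evaluate` are placeholders for illustration. In the notebook the evaluation step is `evaluate_accuracy(...)`, whose signature is not shown here.

```python
import csv
import os
import tempfile

FIELDS = ["seed", "protocol", "layer", "rho", "accuracy"]  # hypothetical schema

def completed_keys(path):
    """Return the (seed, protocol, layer, rho) keys already in the checkpoint."""
    if not os.path.exists(path):
        return set()
    with open(path, newline="", encoding="utf-8") as fh:
        return {(r["seed"], r["protocol"], r["layer"], r["rho"])
                for r in csv.DictReader(fh)}

def run_missing(path, tasks, evaluate):
    """Evaluate only tasks absent from the checkpoint, appending one row each."""
    done = completed_keys(path)
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    ran = 0
    with open(path, "a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        for task in tasks:
            key = tuple(str(x) for x in task)
            if key in done:
                continue                  # finished in an earlier run
            accuracy = evaluate(*task)
            writer.writerow(dict(zip(FIELDS, key + (str(accuracy),))))
            fh.flush()                    # push the row out before the next task
            done.add(key)
            ran += 1
    return ran

def fake_evaluate(seed, protocol, layer, rho):
    """Placeholder for the real, expensive accuracy evaluation."""
    return round(1.0 - 0.1 * rho, 3)

tasks = [(seed, "A", 3, rho) for seed in (42, 123, 456) for rho in (0.9, 1.0)]
with tempfile.TemporaryDirectory() as tmp:
    ckpt = os.path.join(tmp, "checkpoint.csv")
    first = run_missing(ckpt, tasks[:4], fake_evaluate)  # simulated interrupted run
    second = run_missing(ckpt, tasks, fake_evaluate)     # resume: only the remainder
    total_rows = len(completed_keys(ckpt))
```

Writing one row per evaluation keeps the cost of an interrupted Colab session to at most a single task, at the price of more frequent file writes.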

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 A notebook that remembers what came before,
Recovers lost tables from output galore,
No need to rerun—just resume and restore,
While spectral clamping keeps margins in score,
Off we bounce, achieving more! ✨
