fix/colab causal repro notebook by DaviBonetto · Pull Request #5 · DaviBonetto/spectralguard

DaviBonetto · 2026-03-17T23:33:50Z

Summary by CodeRabbit

Documentation
- Clarified reproducibility notebook and CSV guidance for experiment runs.
New Features
- Colab-ready workflow supports larger reproducibility runs with multi-threshold grids.
- New debug flag enables more verbose runtime diagnostics.
Chores
- Increased default sample size to 100 and expanded threshold grid (0.99, 0.95, 0.90, 0.70, 0.50).
Tests
- Updated tests to match revised statistic reporting and clamping behavior.
Bug Fixes
- Improved runtime robustness and logging for reproducible experiments.

coderabbitai · 2026-03-17T23:34:06Z

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

Updates the reproducibility notebook to run 100 validated prompts across multiple thresholds and refactors the causal intervention script to compute three rho metrics (rho_channels, rho_tokens, rho_current), propagate per-layer min_dt constraints, modify clamping logic, and add a --debug flag (and apply_clamp_hooks debug parameter); tests adjusted accordingly.

Changes

Cohort / File(s)	Summary
Repro Notebook `notebooks/06_Causal_Intervention_Reproduction.ipynb`	Expanded experiment config: `N_SAMPLES` increased from 20→100 and `THRESHOLDS` expanded to `[0.99, 0.95, 0.90, 0.70, 0.50]`. Minor formatting; run loop updated to iterate over full threshold grid and log results.
Causal Intervention Script `scripts/run_causal_intervention.py`	Statistics/helpers refactor: `_compute_rho_stats` now returns `(rho_channels, rho_tokens, rho_current)`; `_clamp_dt_to_target` accepts/uses per-layer `min_dt_required` and returns additional rho diagnostics; hook logic precomputes per-layer dt floors, uses dt_floor-based decisioning, and calls clamp with propagated min_dt. `apply_clamp_hooks` signature now accepts `debug: bool = False`. CLI adds `--debug` and local imports/logging adjusted.
Tests `tests/test_run_causal_intervention.py`	Tests updated to expect three-value return from `_compute_rho_stats`, compute expected `rho_tokens` (mean over last dim), and assert `rho_current` derived from `rho_tokens`. Assertions for clamping behavior and rho relationships updated to match new outputs.

Sequence Diagram(s)

sequenceDiagram
  participant CLI
  participant Evaluator
  participant HookSystem
  participant Model
  participant Stats
  CLI->>Evaluator: start run (thresholds, N_SAMPLES, debug)
  Evaluator->>Model: load model & prompts
  Evaluator->>HookSystem: install clamp hooks (layer_indices, target_rho, debug)
  HookSystem->>Stats: compute_rho_stats(dt, A)
  Stats-->>HookSystem: rho_channels, rho_tokens, rho_current
  HookSystem->>HookSystem: compare dt vs per-layer min_dt (dt_floor)
  alt dt < dt_floor or clamp needed
    HookSystem->>Stats: _clamp_dt_to_target(..., min_dt_required)
    Stats-->>HookSystem: dt_scaled, rho_before, rho_after
    HookSystem->>Model: apply scaled dt to layer
  end
  Evaluator->>Model: run forward passes, collect metrics
  Evaluator->>CLI: write results (repro CSVs)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

fix/colab causal repro notebook #1: Overlaps edits to scripts/run_causal_intervention.py (clamp/hook logic and apply_clamp_hooks signature) and updates to notebooks/06_Causal_Intervention_Reproduction.ipynb.

Poem

🐰 I hop through code with careful paws,

I count three rhos and tweak the laws,
A hundred prompts, five thresholds gleam,
Hooks hum softly, clamps redeem,
Repro ready — nibble, nibble, cheers! 🥕

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The PR title 'fix/colab causal repro notebook' is vague and generic. While it mentions 'causal repro notebook,' it uses non-descriptive terminology ('fix') and doesn't convey the actual changes: expanding sampling from 20 to 100, adding threshold grid exploration, or significant API changes to rho statistics computation.	Consider a more descriptive title such as 'Expand causal intervention validation to 100 samples with multi-threshold grid' or 'Refactor rho statistics to support per-layer min thresholds and three-part outputs' to better reflect the substantive changes.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch codex/fix/colab-causal-repro-notebook

📝 Coding Plan

Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (1)

scripts/run_causal_intervention.py (1)

126-127: Prefix unused unpacked variables with underscore.

rho_channels_before and rho_tokens_before are unpacked but never used in this function. Prefix them with underscores to signal intent and silence the linter.

Suggested fix

-    rho_channels_before, rho_tokens_before, rho_current_before = _compute_rho_stats(dt, A)
+    _rho_channels_before, _rho_tokens_before, rho_current_before = _compute_rho_stats(dt, A)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/run_causal_intervention.py` around lines 126 - 127, The unpacking of
_compute_rho_stats(dt, A) returns three values but the first two
(rho_channels_before, rho_tokens_before) are never used; rename them to start
with underscores (e.g., _rho_channels_before, _rho_tokens_before) in the
assignment to indicate they are intentionally unused and silence the linter
while leaving rho_current_before unchanged.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@scripts/run_causal_intervention.py`:
- Around line 126-127: The unpacking of _compute_rho_stats(dt, A) returns three
values but the first two (rho_channels_before, rho_tokens_before) are never
used; rename them to start with underscores (e.g., _rho_channels_before,
_rho_tokens_before) in the assignment to indicate they are intentionally unused
and silence the linter while leaving rho_current_before unchanged.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 229a9a27-34a7-445f-8293-f397ed6915ab

📥 Commits

Reviewing files that changed from the base of the PR and between 8c87472 and e746929.

📒 Files selected for processing (3)

notebooks/06_Causal_Intervention_Reproduction.ipynb
scripts/run_causal_intervention.py
tests/test_run_causal_intervention.py

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

notebooks/06_Causal_Intervention_Reproduction.ipynb (1)
66-68: Consider removing machine-specific fallback paths

Line 66-68 embeds personal/local absolute directories. For a public Colab repro notebook, this adds brittle environment coupling; REPO_ROOT_OVERRIDE plus clone fallback is usually enough.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@notebooks/06_Causal_Intervention_Reproduction.ipynb` around lines 66 - 68,
Remove the hard-coded, machine-specific absolute Paths from the repo-root
fallback list (the entries like Path("/content/drive/MyDrive/Research - code -
copy") and similar) and rely on REPO_ROOT_OVERRIDE plus the existing
clone/fallback logic; update the code that builds the candidate Paths (where the
list of Path(...) is defined) to omit personal/local directories so the notebook
uses REPO_ROOT_OVERRIDE or the repo clone fallback only.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@notebooks/06_Causal_Intervention_Reproduction.ipynb`:
- Around line 50-52: The notebook currently sets REPO_REF to a mutable branch
and only clones into CLONE_DIR without enforcing that an existing repository is
checked out to that ref; update the clone/update logic (the block that handles
cloning and existing .git in CLONE_DIR) to: validate that REPO_REF is a specific
immutable ref (preferably a tag or commit hash) or allow an explicit override,
and when CLONE_DIR contains a .git, fetch and checkout REPO_REF (or reset to the
commit) so the working tree matches REPO_REF; reference the symbols REPO_REF and
CLONE_DIR and modify the existing clone/update block to run git fetch && git
checkout <REPO_REF> (or git reset --hard <commit>) to guarantee reproducible
code.

In `@scripts/run_causal_intervention.py`:
- Around line 158-159: The hooks created by make_hook close over the loop-scoped
variables A and min_abs_A, causing all registered hooks (created per layer via
make_hook) to see the final loop values; fix this by capturing those tensors in
the hook's local scope when the hook is created—e.g., pass A and min_abs_A into
make_hook (or into the inner hook) as parameters or bind them as default
arguments so the inner function uses the per-layer copies rather than the outer
loop variables; update references in the hook (the function named hook inside
make_hook and any uses of A/min_abs_A elsewhere in that closure) to use the
captured local names (e.g., A_local, min_abs_A_local) so each registered hook
uses the correct layer-specific tensors.

---

Nitpick comments:
In `@notebooks/06_Causal_Intervention_Reproduction.ipynb`:
- Around line 66-68: Remove the hard-coded, machine-specific absolute Paths from
the repo-root fallback list (the entries like
Path("/content/drive/MyDrive/Research - code - copy") and similar) and rely on
REPO_ROOT_OVERRIDE plus the existing clone/fallback logic; update the code that
builds the candidate Paths (where the list of Path(...) is defined) to omit
personal/local directories so the notebook uses REPO_ROOT_OVERRIDE or the repo
clone fallback only.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f752ab96-a9e1-4495-8b2e-41ec8845617e

📥 Commits

Reviewing files that changed from the base of the PR and between e746929 and f70ed3d.

📒 Files selected for processing (3)

notebooks/06_Causal_Intervention_Reproduction.ipynb
scripts/run_causal_intervention.py
tests/test_run_causal_intervention.py

🚧 Files skipped from review as they are similar to previous changes (1)

tests/test_run_causal_intervention.py

coderabbitai · 2026-03-17T23:45:44Z

notebooks/06_Causal_Intervention_Reproduction.ipynb

+        "REPO_URL = \"https://github.com/DaviBonetto/spectralguard.git\"\n",
+        "REPO_REF = \"codex/fix/colab-causal-repro-notebook\"\n",
+        "CLONE_DIR = Path(\"/content/spectralguard\")\n",


⚠️ Potential issue | 🟠 Major

Reproducibility risk: mutable branch ref and no ref enforcement on existing clone

Line 51 pins to a branch name, and Line 83-100 does not ensure CLONE_DIR is checked out to that ref when .git already exists. This can silently run different code across sessions.

🔧 Proposed hardening

-REPO_REF = "codex/fix/colab-causal-repro-notebook" +# Prefer an immutable commit SHA for reproducibility. +REPO_REF = "PUT_COMMIT_SHA_HERE" @@ - if not (CLONE_DIR / ".git").exists(): + if not (CLONE_DIR / ".git").exists(): print(f"Cloning {REPO_URL} ({REPO_REF}) into {CLONE_DIR} ...") subprocess.run( @@ check=True, ) + else: + # Ensure we are on the intended ref even when reusing an existing clone. + subprocess.run( + ["git", "-C", str(CLONE_DIR), "fetch", "origin", REPO_REF, "--depth", "1"], + check=True, + ) + subprocess.run( + ["git", "-C", str(CLONE_DIR), "checkout", REPO_REF], + check=True, + )

Also applies to: 83-100

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@notebooks/06_Causal_Intervention_Reproduction.ipynb` around lines 50 - 52, The notebook currently sets REPO_REF to a mutable branch and only clones into CLONE_DIR without enforcing that an existing repository is checked out to that ref; update the clone/update logic (the block that handles cloning and existing .git in CLONE_DIR) to: validate that REPO_REF is a specific immutable ref (preferably a tag or commit hash) or allow an explicit override, and when CLONE_DIR contains a .git, fetch and checkout REPO_REF (or reset to the commit) so the working tree matches REPO_REF; reference the symbols REPO_REF and CLONE_DIR and modify the existing clone/update block to run git fetch && git checkout <REPO_REF> (or git reset --hard <commit>) to guarantee reproducible code.

scripts/run_causal_intervention.py

Switch the intervention check to the repo's aggregated rho metric, avoid float32 saturation near 1.0, and use a monotonic search so the clamp stays finite and respects the post-condition.

Keep the hook aligned with _compute_rho_stats returning channels, tokens, and current rho. This removes the unpacking crash in the Colab reproduction path while preserving the clamp logic.

Raise the causal reproduction notebook to the paper's Table 3 configuration with 100 validated prompts and the full threshold grid. This removes the mini-run mismatch so the rerun can be compared directly against the published table.

Replace the brittle global scale search with the analytical per-channel minimum step size derived from A. This makes the causal intervention follow the diagonal operator bound more closely and avoids false clamp failures on long prompts.

coderabbitai bot reviewed Mar 17, 2026

View reviewed changes

DaviBonetto force-pushed the codex/fix/colab-causal-repro-notebook branch from e746929 to f70ed3d Compare March 17, 2026 23:40

coderabbitai bot reviewed Mar 17, 2026

View reviewed changes

DaviBonetto added 4 commits March 17, 2026 21:04

fix(causal): Stabilize rho clamp search

3174bcd

Switch the intervention check to the repo's aggregated rho metric, avoid float32 saturation near 1.0, and use a monotonic search so the clamp stays finite and respects the post-condition.

fix(causal): Match rho stats return shape

56a88d7

Keep the hook aligned with _compute_rho_stats returning channels, tokens, and current rho. This removes the unpacking crash in the Colab reproduction path while preserving the clamp logic.

fix(notebook): Match Table 3 reproduction settings

e7be4af

Raise the causal reproduction notebook to the paper's Table 3 configuration with 100 validated prompts and the full threshold grid. This removes the mini-run mismatch so the rerun can be compared directly against the published table.

DaviBonetto force-pushed the codex/fix/colab-causal-repro-notebook branch from f70ed3d to 3474071 Compare March 18, 2026 00:07

DaviBonetto merged commit d2e6b23 into main Mar 18, 2026
1 of 2 checks passed

coderabbitai bot mentioned this pull request Mar 18, 2026

fix/colab causal repro notebook #6

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix/colab causal repro notebook#5

fix/colab causal repro notebook#5
DaviBonetto merged 4 commits intomainfrom
codex/fix/colab-causal-repro-notebook

DaviBonetto commented Mar 17, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Mar 17, 2026 •

edited

Loading

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Mar 17, 2026

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

DaviBonetto commented Mar 17, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

DaviBonetto commented Mar 17, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 17, 2026 •

edited

Loading