Rewrite verification script to validate all paper heatmap values by negative-dialectic · Pull Request #3 · fuelix/RISE-steering

Michael Freenor (negative-dialectic) · 2026-02-25T21:40:39Z

Summary

Rewrites scripts/verify_paper_results.py to run the full 7×7 cross-language transfer matrix for all 3 models × 3 phenomena (441 evaluations total)
Compares every individual cell against the published heatmap values from Figures 2, 3, 5, 6, 7 of the ICLR 2026 paper
Validates Table 2 (per-model Synthetic Multilingual averages) and Section 6.1 (per-phenomenon cross-model aggregates)
Includes the ICLR 2026 camera-ready PDF for reference

The current code produces slightly higher cross-language transfer scores than the paper figures (mean diff +0.033), consistent with the post-paper refactor improving the implementation. Monolingual (diagonal) scores match closely.

Test plan

Run python scripts/verify_paper_results.py and confirm it completes (~4 min)
Verify cell-level diffs are displayed for all 9 heatmaps
Verify Table 2 and Section 6.1 aggregate comparisons are shown

🤖 Generated with Claude Code

Replace the old monolingual-only verification with full cross-language transfer validation against every number published in the ICLR 2026 paper: - All 9 heatmaps (3 models x 3 phenomena) with 49 cells each (441 total) from Figures 2, 3, 5, 6, 7 - Table 2 per-model Synthetic Multilingual averages - Section 6.1 per-phenomenon cross-model aggregates Uses run_cross_language_experiment() from the evaluation module instead of reimplementing the pipeline. Prints cell-level diffs and aggregate comparisons. Also includes the published ICLR 2026 camera-ready PDF for reference. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Match HuggingFace dataset directory naming so downloaded data works directly with the verification script. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add the two baseline methods compared against in the paper (Tables 3, Figures 8-11): Mean Difference Vector (MDV) and Orthogonal Procrustes. Both implement the same duck-typed interface as RISE (.fit/.transform) so they plug directly into the evaluation pipeline. - src/rise/baselines/mdv.py: averages (transformed - neutral) diffs - src/rise/baselines/procrustes.py: optimal orthogonal mapping via SVD - run_evaluation.py: now evaluates all three methods on same splits - scripts/run_classification.py: reproduces Table 9 (Appendix G) downstream negation classification (MDV F1=0.873, RISE F1=0.897) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The Procrustes solution W = U V^T solves min ||N W - T||_F in the row-vector convention. For column-vector inputs (PyTorch standard), the prediction is W^T @ x, not W @ x. Verified on real negation data: alignment jumps from 0.56 (wrong) to 0.71 (correct). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Production code fixes: - Fix riemannian_log near-identity/antipodal checks to operate in cosine domain instead of theta domain, avoiding precision loss from acos near ±1 - Switch verify_orthogonality from Frobenius norm to max-element error for dimension-independent tolerance - Relax ORTHOGONALITY_TOL (1e-5 → 1e-4) and ROTOR_VERIFICATION_TOL (1e-5 → 5e-5) to accommodate float32 accumulation in high dimensions - Set ARCCOS_CLAMP_EPS to 0.0 (clamping now handled by domain checks) - Add consistent near-identity check to geodesic_distance Test fixes (16 failures → 0): - Replace torch.cos/sin(float) with math.cos/sin (PyTorch API change) - Scale tangent vector in test_log_exp_inverse to norm < π - Fix test_very_small_angles to use 1° (resolvable by float32 acos) - Fix test_rotor_with_zero_vector for F.normalize(zeros) behavior - Fix test_full_workflow_with_text to use correct embedder fixture - Relax hypothesis-based test tolerances for edge cases Remove visualization: - Delete visualization.py and generate_figures.py - Remove visualization exports from evaluation __init__ - Remove matplotlib from requirements-frozen.txt Update README: - Replace stale verify_paper_results.py expected output with actual output - Remove generate_figures command - Fix --data-dir path to match actual data directory structure Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The HuggingFace repo has model dirs (text-embedding-3-large/, bge-m3/, mbert/) at its root. Downloading to data/ places them where the scripts expect them. Also removed the huggingface-cli command which doesn't reliably install on PATH, keeping only the Python API approach. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Michael Freenor (freenor-wt) and others added 2 commits February 25, 2026 16:40

Rename TE3L data dir to text-embedding-3-large for consistency

b7a31b6

Match HuggingFace dataset directory naming so downloaded data works directly with the verification script. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Michael Freenor (negative-dialectic) requested a review from Lauren Alvarez (laurenalvarez11) February 25, 2026 21:51

Michael Freenor (freenor-wt) and others added 4 commits February 25, 2026 17:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rewrite verification script to validate all paper heatmap values#3

Rewrite verification script to validate all paper heatmap values#3
Michael Freenor (negative-dialectic) wants to merge 6 commits intomainfrom
feature/verify-paper-heatmaps

Michael Freenor (negative-dialectic) commented Feb 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Michael Freenor (negative-dialectic) commented Feb 25, 2026

Summary

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants