DWCM Phase 4: fix bad-seed convergence via θ-space coordinate Newton solver #12

Merged
fabiosaracco merged 3 commits into main from copilot/fix-bad-seeds-and-convergence-testing on Mar 13, 2026

Copilot AI commented Mar 13, 2026

For power-law networks at N≥5k, a handful of super-hub nodes (e.g. s_out=419, s_in=520) cause the β-space fixed-point iteration to request β>1 on the very first update. Clamping to β≈1 creates a period-3 oscillation (residuals cycling 1.5×10¹⁰ → 420 → 520 → …) that neither Anderson mixing nor damping escaped in any of the benchmark runs below.

Root cause

The β-space FP update β_i^new = s_i / D_i(β^old) can yield β>1 when D_i < s_i, i.e. when a hub's observed strength exceeds what the current multipliers can produce. Clamping to the boundary poisons downstream updates for all other nodes.
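
The overshoot can be reproduced on a toy example. The sketch below assumes the usual DWCM form D_i = Σ_{j≠i} β_in_j / (1 − β_out_i β_in_j), which is consistent with W_ij = 1/expm1(θ_out_i + θ_in_j) under β = exp(−θ); the function name `beta_fp_update` and the starting multipliers are hypothetical, and only the hub strength 419 is taken from the description above.

```python
def beta_fp_update(i, beta_out, beta_in, s_out_i):
    """One β-space fixed-point update for node i: β_i^new = s_i / D_i(β^old).

    Assumed form: D_i = sum over j != i of beta_in[j] / (1 - beta_out[i] * beta_in[j]).
    """
    d_i = sum(
        b_in / (1.0 - beta_out[i] * b_in)
        for j, b_in in enumerate(beta_in)
        if j != i
    )
    return s_out_i / d_i


# Hypothetical starting point: five nodes, all multipliers at 0.5.
beta_out = [0.5] * 5
beta_in = [0.5] * 5

hub_beta = beta_fp_update(0, beta_out, beta_in, 419.0)  # super-hub strength
leaf_beta = beta_fp_update(1, beta_out, beta_in, 2.0)   # ordinary node

print(hub_beta > 1.0)   # the hub update leaves the feasible region
print(leaf_beta < 1.0)  # the ordinary node stays feasible
```

Here D_i is about 2.67 for every node, so any node with s_i above that value is pushed past β=1 and has to be clamped.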

Fix: θ-space coordinate Newton ("theta-newton" variant)

Rewriting the DWCM equation as:

Σ_{j≠i}  1/expm1(θ_out_i + θ_in_j)  =  s_out_i

gives a monotone, always-feasible 1-D equation in θ_i ∈ (0, ∞). The exact per-node Newton step is:

W_ij    = 1 / expm1(θ_out_i + θ_in_j)     # expected weight
F_i     = Σ_{j≠i} W_ij − s_out_i          # residual (expected − observed strength)
F′_i    = −Σ_{j≠i} W_ij (1 + W_ij)        # diagonal Hessian entry (≤ 0)
Δθ_i    = −F_i / F′_i                     # Newton step, clipped to ±max_step
θ_i^new = clamp(θ_i + Δθ_i, ETA_MIN, ETA_MAX)

This step is always finite and moves monotonically toward the solution — no β>1 state is ever produced.
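
A minimal pure-Python sketch of the per-node step may help make this concrete. It is not the PR's `_theta_newton_step_dense` helper: the function name `newton_update_out` and the numeric values of `ETA_MIN` and `ETA_MAX` are assumptions for illustration, and only the out-strength coordinate is updated while all other multipliers stay fixed.

```python
import math

# Hypothetical clamp bounds; the PR names ETA_MIN/ETA_MAX without giving values.
ETA_MIN = 1e-10
ETA_MAX = 50.0


def newton_update_out(i, theta_out, theta_in, s_out_i, max_step=1.0):
    """Return theta_out[i] after one clipped Newton step, other multipliers fixed."""
    f = -s_out_i   # residual F_i = sum_j W_ij - s_out_i
    f_prime = 0.0  # derivative F'_i = -sum_j W_ij * (1 + W_ij)
    for j, t_in in enumerate(theta_in):
        if j == i:
            continue
        w = 1.0 / math.expm1(theta_out[i] + t_in)  # expected weight W_ij
        f += w
        f_prime -= w * (1.0 + w)
    step = -f / f_prime
    step = max(-max_step, min(max_step, step))  # clip to +/- max_step
    return max(ETA_MIN, min(ETA_MAX, theta_out[i] + step))


# Toy check: build the target strength from a known theta_out[0] = 0.7,
# then start far away at 3.0 and iterate the 1-D update.
theta_in = [1.0, 1.5, 2.0, 2.5]
theta_out = [0.7, 1.0, 1.0, 1.0]
target = sum(1.0 / math.expm1(0.7 + t) for t in theta_in[1:])

theta_out[0] = 3.0
for _ in range(50):
    theta_out[0] = newton_update_out(0, theta_out, theta_in, target)

print(theta_out[0])  # close to 0.7
```

In this toy run the first two steps are clipped to −1.0 by `max_step`; once the iterate is within one unit of the root, the unclipped Newton step takes over.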

Changes

  • src/solvers/fixed_point_dwcm.py

    • New "theta-newton" variant accepted by solve_fixed_point_dwcm; Gauss-Seidel ordering (θ_out updated first, θ_in uses fresh values); zero-strength nodes pinned to _ETA_MAX and never updated
    • New max_step: float = 1.0 parameter controls per-node Newton step clipping for "theta-newton"; damping is now exclusively the β-space blend factor and is skipped for theta-newton (no double attenuation)
    • _theta_newton_step_dense() — full N×N path for N ≤ _LARGE_N_THRESHOLD
    • _theta_newton_step_chunked() — memory-efficient chunked path for large N (same O(chunk×N) budget as existing chunked β-FP)
    • _ANDERSON_MAX_NORM = 1e6 promoted to module-level constant (was re-assigned on every loop iteration)
    • Anderson mixing extended with a residual-norm filter (‖r_k‖∞ < _ANDERSON_MAX_NORM) that prevents extreme theta-newton iterates from contaminating the mixing history
    • Module-level docstring updated to describe all three variants; variant, damping, and max_step params fully documented
  • tests/test_dwcm.py

    • New TestThetaNewtonDWCM class with 5 tests: convergence on N=4 and N=10, convergence with Anderson acceleration, zero-strength node pinning (θ = _ETA_MAX), chunked==dense step consistency, result fields (elapsed_time, peak_ram_bytes, residuals)
  • src/benchmarks/dwcm_comparison.py

    • Aggregate statistics (Time, Iters, RAM, MaxRelErr) now computed over converged runs only; methods with zero convergences show "—" in the table
    • run_multi_seed_comparison returns (agg, bad_seeds) — seeds where no method converged are collected, printed, and (in --phase4 mode) saved to bad_seeds_phase4.txt
    • FP-GS α=0.3 added to solver roster
    • θ-Newton Anderson(10) multi-init added to solver roster
    • New --phase4 CLI flag: N=5k, 5 seeds, all key methods; bad seeds saved to bad_seeds_phase4.txt
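
The residual-norm filter on the Anderson history listed above can be sketched as follows. This is an illustration under assumptions, not the code from src/solvers/fixed_point_dwcm.py: `push_history` is a hypothetical helper, plain lists stand in for tensors, and the history depth of 10 is taken from the "Anderson(10)" label.

```python
from collections import deque

_ANDERSON_MAX_NORM = 1e6  # module-level constant, value as stated in the PR


def push_history(hist_g, hist_r, theta_fp, r_k, max_norm=_ANDERSON_MAX_NORM):
    """Append (iterate, residual) to the mixing history only if ||r_k||_inf < max_norm.

    Returns True when the pair was stored, False when it was filtered out.
    """
    r_k_norm = max(abs(x) for x in r_k)
    if r_k_norm >= max_norm:
        return False  # extreme iterate: keep it out of the history
    hist_g.append(list(theta_fp))
    hist_r.append(list(r_k))
    return True


# Hypothetical history buffers with depth 10.
hist_g = deque(maxlen=10)
hist_r = deque(maxlen=10)

kept = push_history(hist_g, hist_r, [1.0, 2.0], [0.3, -0.1])
dropped = push_history(hist_g, hist_r, [1.0, 2.0], [1.5e10, 420.0])

print(kept, dropped, len(hist_g))
```

The second call mimics the 1.5×10¹⁰ residual from the oscillating runs and leaves the history unchanged, so the least-squares mixing only sees well-scaled residuals.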

Benchmark Results: N=5,000, seeds 0–4, TOL=1e-6, timeout=60 s/solver

Performance metrics computed over converged runs only (non-converged runs show —).

Per-seed results

| Seed | FP-GS α=1 | FP-GS α=0.5 | FP-GS α=0.3 | FP-GS Anderson(10) | θ-Newton Anderson(10) | L-BFGS |
|------|-----------|-------------|-------------|--------------------|-----------------------|--------|
| 0 | ✗ 150 iters | ✗ 150 iters | ✗ 150 iters | ✗ TIMEOUT | ✓ 25 iters · 11.1 s | ✗ TIMEOUT |
| 1 | ✗ 150 iters | ✗ 150 iters | ✗ 150 iters | ✗ TIMEOUT | ✓ 24 iters · 10.5 s | ✗ TIMEOUT |
| 2 | ✗ 150 iters | ✗ 150 iters | ✗ 150 iters | ✗ TIMEOUT | ✓ 24 iters · 10.6 s | ✓ 39 iters · 24.1 s |
| 3 | ✗ 150 iters | ✗ 150 iters | ✗ 150 iters | ✗ TIMEOUT | ✓ 25 iters · 11.8 s | ✗ TIMEOUT |
| 4 | ✗ 150 iters | ✗ 150 iters | ✗ 150 iters | ✗ TIMEOUT | ✓ 25 iters · 11.0 s | ✗ TIMEOUT |

Aggregate table (converged runs only)

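| Method | Conv | Time (s) mean±2σ | Iters mean±2σ | MaxRelErr mean±2σ |
|--------|------|------------------|---------------|-------------------|
| FP-GS α=1.0 | 0/5 | — | — | — |
| FP-GS α=0.5 | 0/5 | — | — | — |
| FP-GS α=0.3 | 0/5 | — | — | — |
| FP-GS Anderson(10) multi-init | 0/5 | — | — | — |
| θ-Newton Anderson(10) multi-init | 5/5 | 11.0 ± 1.0 | 25 ± 1 | 6.3e-11 ± 3.2e-11 |
| L-BFGS (multi-start) | 1/5 | 24.1 | 39 | 2.96e-10 |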
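
The converged-only aggregation can be illustrated with the per-seed timings reported above. This sketch is not the code in src/benchmarks/dwcm_comparison.py: `summarize` is a hypothetical helper, and the use of the sample standard deviation is an assumption (it happens to reproduce the 11.0 ± 1.0 entry).

```python
import statistics


def summarize(values):
    """Format mean ± 2σ over converged runs; "—" when nothing converged."""
    if not values:
        return "—"
    if len(values) == 1:
        return f"{values[0]:.1f}"  # a single run has no spread to report
    mean = statistics.mean(values)
    two_sigma = 2.0 * statistics.stdev(values)  # sample std (assumption)
    return f"{mean:.1f} ± {two_sigma:.1f}"


# Wall-clock times (s) of converged runs, copied from the per-seed results.
theta_newton_times = [11.1, 10.5, 10.6, 11.8, 11.0]
lbfgs_times = [24.1]
fp_gs_times = []  # no converged run on any seed

print(summarize(theta_newton_times))  # 11.0 ± 1.0
print(summarize(lbfgs_times))         # 24.1
print(summarize(fp_gs_times))         # —
```

Excluding non-converged runs keeps timeouts and iteration caps from dominating the averages, which is why the table pairs each summary with the Conv column.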

Key findings

  • θ-Newton is the only method that converges reliably at N=5k (5/5 seeds), in ~11 s and ~25 iterations.
  • L-BFGS converged on 1/5 seeds within 60 s; the other 4 timed out — the O(N²) gradient cost at N=5k (~180 ms/eval) leaves too few iterations within the time budget.
  • All β-space FP variants (including Anderson acceleration) failed on all 5 seeds — the period-3 hub oscillation cannot be escaped by Anderson or damping alone.
  • No bad seeds remain: θ-Newton Anderson(10) achieves 100% convergence across all tested seeds.
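
The bad-seed criterion behind the last finding (a seed is bad when no method converges on it) can be sketched with the convergence flags from the per-seed results. The helper `find_bad_seeds` and the dictionary layout are hypothetical and do not reflect the actual return structure of run_multi_seed_comparison.

```python
def find_bad_seeds(results):
    """Return the sorted seeds on which no method converged.

    results maps seed -> {method name: converged flag}.
    """
    return sorted(
        seed for seed, per_method in results.items()
        if not any(per_method.values())
    )


# Convergence flags taken from the per-seed results (seeds 0-4).
lbfgs_ok = {0: False, 1: False, 2: True, 3: False, 4: False}
before = {
    seed: {"FP-GS Anderson(10)": False, "L-BFGS": lbfgs_ok[seed]}
    for seed in range(5)
}
# Same roster with the θ-Newton solver added, which converged on every seed.
after = {
    seed: {**before[seed], "θ-Newton Anderson(10)": True}
    for seed in range(5)
}

print(find_bad_seeds(before))  # [0, 1, 3, 4]
print(find_bad_seeds(after))   # []
```

Without θ-Newton, only seed 2 is rescued (by L-BFGS); with it in the roster the bad-seed list is empty.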
Original prompt

This section details the original issue you should resolve

<issue_title>Phase 4 again: DWCM, resolving bad seeds</issue_title>
<issue_description>We are still following the plan in AGENTS.md.

I ran some tests locally for N=10k (it cannot handle anything bigger), and the performance is indeed quite poor. As it is infeasible for you to test all dimensions extensively, let's plan a different process:

  1. Modify the test files so that the statistics they return focus on performance when the various methods converge (otherwise, they are difficult to read).
  2. Focus on tests for 5 networks with N=5k, and consider for the moment only FP-GS with different alphas, FP-GS Anderson and L-BFGS (as they are the most promising). Rerun the same tests you did in the last step of the previous issue, returning the statistics as defined in point 1. Collect the seeds of the networks where none of the methods converge and save them somewhere.
  3. Go back to the problematic seeds and check what the issue is. Try to solve it.
  4. Go back to point 2 and repeat the steps down to this one, until at least one of the methods can always solve the equations.

Reference: AGENTS.md for the full plan, .github/copilot-instructions.md for coding conventions.</issue_description>

Comments on the Issue (you are @copilot in this section)



Co-authored-by: fabiosaracco <43313433+fabiosaracco@users.noreply.github.com>
Copilot AI changed the title from "[WIP] [PHASE-4] Modify tests for convergence performance on networks" to "DWCM Phase 4: fix bad-seed convergence via θ-space coordinate Newton solver" Mar 13, 2026
Copilot AI requested a review from fabiosaracco March 13, 2026 09:49
fabiosaracco marked this pull request as ready for review March 13, 2026 09:52
Copilot AI review requested due to automatic review settings March 13, 2026 09:52

Copilot AI left a comment

Pull request overview

This PR adds a "theta-newton" variant to the DWCM fixed-point solver to address convergence failures (period-3 oscillations) that occur with high-strength hub nodes in power-law networks at N≥5k. The new solver works in θ-space using coordinate-wise Newton steps, avoiding the β>1 clamping instability of the existing β-space fixed-point iteration.

Changes:

  • New "theta-newton" variant in solve_fixed_point_dwcm with Gauss-Seidel ordering and per-node Newton step clamping
  • Two new helper functions: _theta_newton_step_dense() for small N and _theta_newton_step_chunked() for large N (memory-efficient)
  • Anderson mixing extended with a residual-norm filter to prevent extreme theta-newton iterates from contaminating mixing history

Comment thread src/solvers/fixed_point_dwcm.py Outdated
Comment on lines +192 to +197
if effective_chunk > 0:
    fp_raw = _theta_newton_step_chunked(
        theta, s_out, s_in, effective_chunk, damping
    )
else:
    fp_raw = _theta_newton_step_dense(theta, s_out, s_in, damping)
Comment on lines +137 to +140
if variant not in ("jacobi", "gauss-seidel", "theta-newton"):
    raise ValueError(
        f"Unknown variant {variant!r}. Choose 'jacobi', 'gauss-seidel', or 'theta-newton'."
    )
Comment thread src/solvers/fixed_point_dwcm.py Outdated
_and_g.append(theta_fp.clone())
_and_r.append(r_k.clone())
r_k_norm = r_k.abs().max().item()
_ANDERSON_MAX_NORM = 1e6
Comment thread src/solvers/fixed_point_dwcm.py Outdated
Comment on lines +185 to +197
if variant == "theta-newton":
    # θ-space coordinate Newton step (avoids β>1 clamping oscillations).
    # For each node i: Δθ_i = −F_i / F′_i where
    #   F_i  = Σ_{j≠i} W_ij − s_i         (expected − observed strength)
    #   F′_i = −Σ_{j≠i} W_ij(1+W_ij)      (diagonal Hessian entry, ≤ 0)
    #   W_ij = 1/expm1(θ_out_i + θ_in_j)
    # Step is clipped to [−damping, +damping] to prevent overshooting.
    if effective_chunk > 0:
        fp_raw = _theta_newton_step_chunked(
            theta, s_out, s_in, effective_chunk, damping
        )
    else:
        fp_raw = _theta_newton_step_dense(theta, s_out, s_in, damping)
@fabiosaracco
Owner

@copilot apply changes based on the comments in this thread. Then, go on with the remaining tasks.

…tant, add tests, converged-only stats, bad-seed tracking, FP-GS α=0.3, theta-newton solver, --phase4 CLI

Co-authored-by: fabiosaracco <43313433+fabiosaracco@users.noreply.github.com>
Contributor Author

Copilot AI commented Mar 13, 2026

@copilot apply changes based on the comments in this thread. Then, go on with the remaining tasks.

Done in commit 6949321. Applied all four review comments and completed the remaining tasks:

  • damping/max_step separated: theta-newton uses max_step for its per-node step clip; the damping blend is skipped entirely for that variant so there's no double attenuation
  • _ANDERSON_MAX_NORM moved to module-level constant with a clarifying comment
  • Docstrings updated: module header now lists all 3 variants; variant, damping, and max_step params are fully documented
  • 5 new unit tests for "theta-newton": convergence (N=4,10), convergence+Anderson (N=4,10), zero-strength node pinning, chunked==dense helper, result fields
  • Aggregate stats show converged-only performance (time/iters/RAM/error); methods with 0 convergences show "—"
  • Bad-seed tracking: run_multi_seed_comparison now returns (agg, bad_seeds); seeds where no method converges are reported and saved with --phase4
  • FP-GS α=0.3 and θ-Newton Anderson(10) multi-init added to solver roster
  • --phase4 CLI mode: N=5k, 5 seeds, saves bad seeds to bad_seeds_phase4.txt
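
The damping/max_step separation in the first item can be pictured with a small sketch. Everything here is an assumption made for illustration: `apply_update` is a hypothetical helper, the state is taken to be stored as θ with β = exp(−θ), and the blend convention β_new = damping·β_fp + (1 − damping)·β_old is inferred from "FP-GS α=1" being the undamped case.

```python
import math


def apply_update(theta_old, theta_fp, variant, damping=1.0):
    """Combine the previous iterate with the raw update for one solver sweep."""
    if variant == "theta-newton":
        # The step was already clipped by max_step inside the Newton helper,
        # so the damping blend is skipped (no double attenuation).
        return list(theta_fp)
    blended = []
    for t_old, t_fp in zip(theta_old, theta_fp):
        # β-space blend (assumed convention), then map back to θ = -log(β).
        beta = damping * math.exp(-t_fp) + (1.0 - damping) * math.exp(-t_old)
        blended.append(-math.log(beta))
    return blended


newton_result = apply_update([1.0], [2.0], "theta-newton", damping=0.5)
gs_result = apply_update([1.0], [2.0], "gauss-seidel", damping=0.5)

print(newton_result)  # raw update passed through unchanged
print(gs_result)      # lies between the old and the raw value
```

Keeping the two knobs apart means that changing damping for the β-space variants cannot silently shrink the Newton step as well.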

@fabiosaracco fabiosaracco merged commit 1c43ef6 into main Mar 13, 2026
1 check passed
Development

Successfully merging this pull request may close these issues.

Phase 4 again: DWCM, resolving bad seeds
