
Record: Complementary Training + Backoff N-gram Mixer — 0.4377 BPB#811

Closed
quietsmile wants to merge 1 commit into openai:main from quietsmile:submission/complementary-backoff-ngram-mixer

Conversation

@quietsmile

Summary

Key Techniques

  1. Complementary Training (COMPLEMENT_ALPHA=0.5): bigram-weighted loss reweighting
  2. BackoffNgramMixer: orders 2-10, entropy-adaptive alpha mixing
  3. Legal score-first AdamW TTT: 4 epochs, lr=5e-4, freeze first 2 blocks
  4. Stride=128: negligible BPB impact, halves eval time

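The bigram-weighted loss reweighting in technique 1 can be sketched as follows. This is a minimal illustration, not the PR's actual code: the function name, the tensor layout, and the way bigram surprisal is normalized into per-token weights are all assumptions; only `COMPLEMENT_ALPHA=0.5` and the general idea (upweight tokens the n-gram cache predicts poorly) come from the description above.

```python
import torch
import torch.nn.functional as F

def complementary_loss(logits, targets, bigram_logprob, alpha=0.5):
    """Bigram-weighted loss reweighting (COMPLEMENT_ALPHA=0.5), sketch only.

    logits:         (B, T, V) neural model outputs
    targets:        (B, T)    next-token ids
    bigram_logprob: (B, T)    log-prob of each target under the bigram cache
    """
    # Per-token cross-entropy of the neural model (no reduction).
    ce = F.cross_entropy(logits.transpose(1, 2), targets, reduction="none")
    # Bigram surprisal is high exactly where the n-gram cache fails,
    # so those tokens get more weight in the neural model's loss.
    surprisal = -bigram_logprob
    weight = (1 - alpha) + alpha * surprisal / surprisal.mean().clamp_min(1e-8)
    return (weight.detach() * ce).mean()
```

With `alpha=0.5`, half of each token's weight is uniform and half is proportional to its bigram surprisal, so the neural model still sees every token but specializes on the cache's blind spots.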
Acknowledgment

Based on PR #803 by @pentxayc. The core complementary-training idea is their contribution.

Test plan

  • Verified on 2 seeds (1337, 42) with consistent results
  • Training completes within 10-min wallclock
  • Eval completes within 10-min budget (450s)
  • Artifact under 16MB
  • All TTT is legal score-first (tokens scored before any update)
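The "legal score-first" constraint in the last bullet can be sketched as a loop shape: every chunk is scored with the current weights before any gradient step has seen it, so no token's BPB is computed by a model already trained on that token. The function name and loop structure below are illustrative assumptions; the AdamW/lr=5e-4 settings come from the description above.

```python
import torch

def score_first_ttt(model, optimizer, chunks, loss_fn):
    """Score-first test-time training sketch: score, THEN adapt.

    chunks: iterable of (inputs, targets) evaluation chunks.
    Returns average per-token NLL accumulated before each update.
    """
    total_nll, total_tokens = 0.0, 0
    for inputs, targets in chunks:
        # 1) Score the chunk with the current weights; this is what
        #    counts toward the reported BPB.
        model.eval()
        with torch.no_grad():
            nll = loss_fn(model(inputs), targets)
        total_nll += nll.item() * targets.numel()
        total_tokens += targets.numel()
        # 2) Only afterwards take a gradient step on the same chunk
        #    (in the PR: AdamW, lr=5e-4, first 2 blocks frozen).
        model.train()
        optimizer.zero_grad()
        loss_fn(model(inputs), targets).backward()
        optimizer.step()
    return total_nll / total_tokens
```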

🤖 Generated with Claude Code

Reproduction of PR openai#803's complementary training approach on 8x L20Z (H100).
Two-seed validation: 0.4377 (seed=1337), 0.4380 (seed=42).

Key: bigram-weighted loss reweighting (COMPLEMENT_ALPHA=0.5) trains the
neural model to specialize on tokens n-gram caches can't predict, combined
with BackoffNgramMixer (orders 2-10) and legal score-first AdamW TTT.
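A minimal sketch of what a `BackoffNgramMixer` over orders 2-10 with entropy-adaptive alpha could look like. The class interface, count storage, and the specific alpha schedule (`base_alpha * exp(-entropy)`, trusting the cache more when its distribution is sharp) are assumptions for illustration, not the submission's actual implementation.

```python
import math
from collections import defaultdict

class BackoffNgramMixer:
    """Backoff n-gram cache (orders 2..max_order) with entropy-adaptive
    mixing into neural next-token probabilities. Sketch only."""

    def __init__(self, max_order=10, base_alpha=0.5):
        self.max_order = max_order
        self.base_alpha = base_alpha
        # counts[n][context_tuple][next_token] -> count, for each order n.
        self.counts = {n: defaultdict(lambda: defaultdict(int))
                       for n in range(2, max_order + 1)}

    def update(self, tokens):
        # Record every (context of length n-1, next token) pair.
        for n in range(2, self.max_order + 1):
            for i in range(n - 1, len(tokens)):
                ctx = tuple(tokens[i - n + 1:i])
                self.counts[n][ctx][tokens[i]] += 1

    def ngram_probs(self, context):
        # Back off from the highest order whose context has been seen.
        for n in range(self.max_order, 1, -1):
            bucket = self.counts[n].get(tuple(context[-(n - 1):]))
            if bucket:
                total = sum(bucket.values())
                return {t: c / total for t, c in bucket.items()}
        return None

    def mix(self, neural_probs, context):
        cache = self.ngram_probs(context)
        if cache is None:
            return neural_probs
        # Entropy-adaptive alpha (assumed schedule): a sharp, low-entropy
        # cache distribution gets more weight; a flat one gets less.
        ent = -sum(p * math.log(p) for p in cache.values() if p > 0)
        alpha = self.base_alpha * math.exp(-ent)
        return [(1 - alpha) * p + alpha * cache.get(t, 0.0)
                for t, p in enumerate(neural_probs)]
```

Under this scheme the mixer degrades gracefully: an unseen context falls through all orders and leaves the neural distribution untouched.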

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
