
Record: Backoff N-gram Cache + LeakyReLU(0.9)² (val_bpb=0.6678) #806

Open

ibarrajo wants to merge 1 commit into openai:main from ibarrajo:submission/ngram-cache-0.6678

Conversation

@ibarrajo

Summary

  • val_bpb: 0.6678 (seed 1337, additional seeds pending)
  • Multi-order backoff n-gram eval cache (orders 2-7) with entropy-adaptive alpha mixing
  • Distributed cache pre-fill for multi-GPU coherence (rank 7 pre-fills 54M tokens in 68s)
  • LeakyReLU(0.9)² activation (~0.013 BPB improvement over relu²)
  • Neural base: 1.1371 BPB (sliding window), n-gram cache: 0.6678 BPB
  • Artifact: 8.6MB (well under 16MB limit)
  • 8xH100 SXM, 7189 steps in 600s, eval in 200s
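The LeakyReLU(0.9)² activation is named but not spelled out here; a minimal sketch of one plausible reading (LeakyReLU with negative slope 0.9 followed by an elementwise square, by analogy with the relu² it replaces) is:

```python
import numpy as np

def leaky_relu_sq(x: np.ndarray, neg_slope: float = 0.9) -> np.ndarray:
    """LeakyReLU with the given negative slope, then an elementwise square.

    Reduces to the relu^2 activation when neg_slope = 0; a slope of 0.9
    keeps most of the gradient signal alive on the negative side.
    """
    y = np.where(x >= 0, x, neg_slope * x)
    return y * y
```

Note that squaring makes the negative branch non-monotone; a sign-preserving variant (`y * np.abs(y)`) is equally consistent with the title alone.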

Key implementation details

  • Score-first legality: Every token scored under inference_mode() BEFORE cache update
  • Entropy-adaptive alpha: 0.05 + 0.55 * sigmoid(2*(H-4)) — no oracle/hindsight selection
  • Pre-fill: Each GPU rank pre-populates cache with all preceding tokens (pure numpy, no NCCL)
  • No pre-eval TTT — removed illegal pre-eval adaptation entirely
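The bullets above can be sketched roughly as follows. This is a hedged reconstruction, not the submitted code: the count structures, the smoothing, and which side of the mixture receives alpha are assumptions; only the alpha formula, the order range, and the score-before-update ordering come from the description.

```python
import math
from collections import defaultdict

ORDERS = range(2, 8)  # n-gram orders 2-7 (context lengths 1-6)

class NgramCache:
    def __init__(self, vocab_size: int):
        self.vocab_size = vocab_size
        # counts[n][context_tuple][token] for each order n
        self.counts = {n: defaultdict(lambda: defaultdict(int)) for n in ORDERS}

    def ngram_prob(self, history, token):
        # Backoff: use the highest order whose context has been seen.
        for n in sorted(ORDERS, reverse=True):
            ctx = tuple(history[-(n - 1):])
            if len(ctx) == n - 1 and ctx in self.counts[n]:
                bucket = self.counts[n][ctx]
                total = sum(bucket.values())
                # add-1 smoothing inside the matched bucket (an assumption)
                return (bucket[token] + 1) / (total + self.vocab_size)
        return 1.0 / self.vocab_size  # nothing cached yet: uniform

    def update(self, history, token):
        for n in ORDERS:
            if len(history) >= n - 1:
                self.counts[n][tuple(history[-(n - 1):])][token] += 1

def adaptive_alpha(neural_probs):
    # Entropy-adaptive mixing weight: 0.05 + 0.55 * sigmoid(2 * (H - 4)),
    # with H the entropy of the neural distribution in bits.
    H = -sum(p * math.log2(p) for p in neural_probs if p > 0)
    return 0.05 + 0.55 / (1.0 + math.exp(-2.0 * (H - 4.0)))

def score_token(cache, history, token, neural_probs):
    # Score-first legality: the token is scored BEFORE the cache sees it.
    alpha = adaptive_alpha(neural_probs)
    p = (1 - alpha) * neural_probs[token] + alpha * cache.ngram_prob(history, token)
    cache.update(history, token)
    return -math.log2(p)  # bits contributed by this token
```

Here alpha ranges over (0.05, 0.60) and grows with the neural model's entropy H, so the cache gets more weight exactly where the network is least confident; since alpha depends only on the neural distribution, no oracle or hindsight selection is involved.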

Results

| Eval method | val_bpb |
| --- | --- |
| Non-overlapping (post-quant) | 1.1594 |
| Sliding window (stride=64) | 1.1371 |
| N-gram cache (orders 2-7) | 0.6678 |
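The gap between the first two rows comes from the evaluation scheme, not the model: non-overlapping eval scores the early tokens of each chunk with almost no context. A minimal sketch of sliding-window scoring with stride 64 (the context length and interface are illustrative, not from the diff):

```python
def sliding_windows(num_tokens: int, ctx_len: int, stride: int = 64):
    """Yield (window_start, score_from, score_to) triples.

    Feed tokens [window_start, score_to) to the model but keep the loss
    only on [score_from, score_to). Every token is scored exactly once,
    and after the first window each scored token sees at least
    ctx_len - stride tokens of preceding context.
    """
    yield 0, 0, min(ctx_len, num_tokens)  # first window scores everything
    pos = ctx_len
    while pos < num_tokens:
        end = min(pos + stride, num_tokens)
        yield pos - (ctx_len - stride), pos, end
        pos = end
```

The cost is roughly ctx_len / stride forward passes per token instead of one, which is why the cheaper non-overlapping number is also reported.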

Test plan

  • Validated on 1xH100 (0.8556 BPB with undertrained model)
  • Full run on 8xH100 SXM (0.6678 BPB)
  • 2 additional seeds for statistical significance (pending)
  • Verify reproducibility from records/ folder

🤖 Generated with Claude Code

Multi-order backoff n-gram eval cache (orders 2-7) with entropy-adaptive
alpha mixing and distributed cache pre-fill for multi-GPU coherence.
Neural base 1.1371 BPB, n-gram cache drops to 0.6678. 8xH100 SXM,
7189 steps in 600s. Single seed (1337), additional seeds pending.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>