Record: 0.6364 BPB - Depth Recurrence + Multi-Order N-gram Backoff#808

Open
Naazimsnh02 wants to merge 2 commits into openai:main from Naazimsnh02:ngram-depth-recurrence-0.6364

Conversation

@Naazimsnh02

Summary

val_bpb: 0.6364 (seed 1337) | ~15.95 MB | 8×H100 SXM | 2 seeds

Adds multi-order n-gram backoff (orders 2-7) with entropy-adaptive alpha to the depth recurrence stack, achieving a new record.

Key contributions

  • Multi-order n-gram backoff (orders 2-7): hash-table n-gram counting at eval time. The highest-order match is tried first, cascading down one order on each miss. Zero training cost; the mechanism is purely eval-time.
  • Entropy-adaptive alpha: alpha = 0.05 + 0.55 * sigmoid(2 * (H − 4.0)), where H is the neural model's predictive entropy. The blend trusts the n-gram more when the model is uncertain and the model more when it is confident.
  • Multi-GPU n-gram prefill: each rank pre-populates its hash tables with all tokens scored by earlier ranks, fixing the table-fragmentation problem on multi-GPU setups (without this, the 8-GPU run gets 0.87 BPB instead of 0.64).
  • Depth recurrence: layers 4 and 5 are repeated, giving 13 virtual layers from 11 physical layers at zero parameter cost (carried over from the previous submission).
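The backoff-plus-blend mechanism in the first two bullets can be sketched as follows. This is a minimal sketch, not the PR's implementation: Python dicts stand in for the hash tables, and `build_ngram_tables`, `ngram_prob`, `adaptive_alpha`, and `blend` are hypothetical names. The alpha formula is taken verbatim from the description above, with H assumed to be in nats.

```python
import math
from collections import defaultdict

def build_ngram_tables(tokens, max_order=7):
    # Count n-grams of orders 2..max_order at eval time.
    # Plain dicts stand in for the PR's hash tables (sketch only).
    tables = {n: defaultdict(lambda: defaultdict(int))
              for n in range(2, max_order + 1)}
    for i, tok in enumerate(tokens):
        for n in range(2, max_order + 1):
            if i >= n - 1:
                ctx = tuple(tokens[i - n + 1:i])
                tables[n][ctx][tok] += 1
    return tables

def ngram_prob(tables, context, token, max_order=7):
    # Highest-order match first; cascade down one order on each miss.
    for n in range(max_order, 1, -1):
        ctx = tuple(context[-(n - 1):])
        counts = tables[n].get(ctx)
        if counts:
            return counts.get(token, 0) / sum(counts.values())
    return None  # no context seen at any order

def adaptive_alpha(entropy):
    # alpha = 0.05 + 0.55 * sigmoid(2 * (H - 4.0)): more n-gram
    # weight when the neural model's entropy H is high.
    return 0.05 + 0.55 / (1.0 + math.exp(-2.0 * (entropy - 4.0)))

def blend(p_neural, p_ngram, entropy):
    # Mix neural and n-gram probabilities with the adaptive weight.
    a = adaptive_alpha(entropy)
    return (1.0 - a) * p_neural + a * p_ngram
```

Note how alpha is bounded in (0.05, 0.6): even a fully confident model keeps a small n-gram floor, and even a maximally uncertain one retains 40% neural weight.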

Results

Seed   BPB
1337   0.6364
42     0.6382
Mean   0.6373

Built on PR #549 stack (LeakyReLU(0.5)², BigramHash(2048), XSA4, Partial RoPE, LN Scale, VE128, EMA+SWA, Parameter Banking + Parallel Muon, int6 GPTQ-lite + lzma).

Credits

