Record: Order-20 Dirichlet Posterior + Phrase Cache — 0.11545 BPB (3-seed) #968

Open
dentity007 wants to merge 3 commits into openai:main from NathanMaine:submission/nathanmaine-order20-dirichlet

@dentity007

Order-20 Dirichlet Posterior + Per-Order OBCL + Phrase Cache

val_bpb: 0.11545 (3-seed mean, std ≈ 0.00001) | ~15.1 MB | 8xH100 SXM

Extends n-gram backoff from order 15 (PR #948) to order 20. Improvement validated via 6-test ablation on 1xH100.
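The PR text does not spell out the estimator. As an assumption, one common way to write a Dirichlet posterior over backed-off n-gram predictions is the recursion below, where the order-$(k-1)$ prediction acts as the prior mean for order $k$. The symbols $c$, $h_k$, $\alpha_k$, and $V$ are notation introduced here for illustration, not taken from the submission.

```latex
% Assumed form: posterior predictive at order k with concentration \alpha_k.
% h_k     = the previous k-1 symbols (context for order k)
% c(h_k, w) = times w followed h_k in the already-scored prefix
% c(h_k)    = \sum_w c(h_k, w)
\[
  p_k(w \mid h_k) \;=\;
  \frac{c(h_k, w) + \alpha_k \, p_{k-1}(w \mid h_{k-1})}{c(h_k) + \alpha_k},
  \qquad k = 1, \dots, K,
  \qquad p_0(w) = \frac{1}{V}.
\]
% Under this reading, the PR raises K from 15 to 20 and sets
% \alpha_{16} = \dots = \alpha_{20} = 1.86.
```

In this form, an unseen context ($c(h_k) = 0$) falls back exactly to the lower-order prediction, and a frequently seen context is dominated by its own counts.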

3-seed validation (8xH100, Montréal CA, 747 TFLOPS)

| Seed | Val BPB | Eval time |
|------|---------|-----------|
| 1337 | 0.11544435 | 459s |
| 42 | 0.11546433 | 435s |
| 2025 | 0.11544736 | 438s |
| Mean | 0.11545 (std ≈ 0.00001) | |
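
The summary row can be re-derived from the three per-seed values. The snippet below is a minimal check using only the numbers listed for each seed. The variable names are illustrative, and the choice of the sample (n - 1) standard deviation is an assumption, since the PR does not say which form it reports.

```python
import statistics

# Per-seed validation BPB values copied from the seed table.
val_bpb = {1337: 0.11544435, 42: 0.11546433, 2025: 0.11544736}
values = list(val_bpb.values())

mean_bpb = statistics.mean(values)
# Sample standard deviation (n - 1 denominator); statistics.pstdev gives
# the population form if that is what is wanted instead.
std_bpb = statistics.stdev(values)

print(f"mean       = {mean_bpb:.5f}")
print(f"sample std = {std_bpb:.2e}")
```

The mean rounds to 0.11545, and the spread across seeds is on the order of 1e-5 BPB.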

What changed from PR #948

  • NGRAM_ORDER=20 (was 15)
  • 5 additional per-order concentrations (all 1.86)
  • Everything else identical
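
As a rough illustration of the change list above, the per-order concentration list could be extended as follows. Only `NGRAM_ORDER=20` and the value 1.86 come from the PR. The helper name, the other variable names, and the placeholder values for orders 1 to 15 are hypothetical, because the PR does not list them.

```python
# Hypothetical config sketch: only NGRAM_ORDER=20 and 1.86 are from the PR.
PREV_ORDER = 15
NGRAM_ORDER = 20
NEW_CONCENTRATION = 1.86


def extend_concentrations(prev: list[float], new_order: int, value: float) -> list[float]:
    """Return a per-order list of length new_order, keeping existing entries."""
    if new_order < len(prev):
        raise ValueError("new_order must be at least the current number of orders")
    return prev + [value] * (new_order - len(prev))


# Placeholder values for orders 1..15; the real ones are not given in the PR.
prev_concentrations = [1.0] * PREV_ORDER
concentrations = extend_concentrations(prev_concentrations, NGRAM_ORDER, NEW_CONCENTRATION)

print(len(concentrations), concentrations[PREV_ORDER:])
```

Keeping the first 15 entries untouched mirrors the "everything else identical" claim: only the five new orders receive a value.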

Compliance

  • Training: 560s on 8xH100 (within the 600s limit)
  • Eval: 459s worst case (within the 600s limit)
  • Artifact: ~15.1 MB (within the 16,000,000-byte limit)
  • All caches strictly backward-looking (causal)
  • Score-first evaluation
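
The last two compliance points (backward-looking caches, score-first evaluation) can be sketched as a loop that scores each symbol before the cache is allowed to see it. This is a toy n-gram-only illustration under stated assumptions, not the submission's code. `BackoffCache`, `score_first_bpb`, the byte-level vocabulary, and the Dirichlet-style update rule are all hypothetical, and the actual submission also includes a trained model and a phrase cache that are not modelled here.

```python
import math
from collections import Counter


class BackoffCache:
    """Toy causal n-gram cache with an assumed Dirichlet-style backoff rule."""

    def __init__(self, max_order: int, concentrations: list[float], vocab_size: int = 256):
        self.max_order = max_order
        self.alpha = concentrations  # alpha[k - 1] is used for order k
        self.vocab_size = vocab_size
        # counts[k][(ctx, symbol)] and totals[k][ctx], with ctx = previous k-1 symbols
        self.counts = [Counter() for _ in range(max_order + 1)]
        self.totals = [Counter() for _ in range(max_order + 1)]

    def _contexts(self, history: list[int]):
        # Yield (order, context) for every order the history is long enough for.
        for k in range(1, self.max_order + 1):
            n = k - 1
            if n > len(history):
                break
            yield k, tuple(history[len(history) - n:])

    def prob(self, history: list[int], symbol: int) -> float:
        p = 1.0 / self.vocab_size  # order-0 base: uniform over the vocabulary
        for k, ctx in self._contexts(history):
            a = self.alpha[k - 1]
            p = (self.counts[k][(ctx, symbol)] + a * p) / (self.totals[k][ctx] + a)
        return p

    def update(self, history: list[int], symbol: int) -> None:
        for k, ctx in self._contexts(history):
            self.counts[k][(ctx, symbol)] += 1
            self.totals[k][ctx] += 1


def score_first_bpb(data: bytes, cache: BackoffCache) -> float:
    """Average bits per byte, scoring each byte before it enters the cache."""
    bits, history = 0.0, []
    for symbol in data:
        bits -= math.log2(cache.prob(history, symbol))  # score using the past only
        cache.update(history, symbol)                   # then reveal the symbol
        history.append(symbol)
    return bits / len(data)


data = b"abababab" * 8
bpb = score_first_bpb(data, BackoffCache(3, [1.86] * 3))
print(f"toy bpb = {bpb:.3f}")
```

Because `update` runs only after `prob`, the count for a symbol can never influence its own score, which is the property the compliance list calls causal.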

Credits

Same lineage as PR #948. Built on @Robby955 (PR #900), @signalrush (PR #414), @himanshudongre (PR #846), @deanbrr (PR #659), @newjordan (PR #674), @pentxayc (PR #803).

@dentity007 dentity007 changed the title Order-20 Dirichlet Posterior + Phrase Cache — 0.11545 BPB (3-seed) Record: Order-20 Dirichlet Posterior + Phrase Cache — 0.11545 BPB (3-seed) Mar 27, 2026