
Adopt the new BranchStore API from liblloyal for high-throughput multi-branch operations #16

Merged
lloyal-research merged 5 commits into main from feat/throughput on Feb 14, 2026

Conversation

@lloyal-research (Contributor)

  • Adopt the new BranchStore API from liblloyal for high-throughput multi-branch operations
  • Expose per-branch logits snapshots

  BranchStore packs N branches into a single llama_decode() via
  decode_each (commit) and decode_scatter (prefill), replacing per-branch
  GPU dispatches. The tuple API [branch, token(s)][] makes the binding
  structural rather than positional.

  Branch.getLogits() returns a Float32Array copy of the captured logits
  snapshot, enabling distributional analysis (entropy, KL divergence)
  from JS without crossing the sampling chain.
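  The kind of distributional analysis this enables can be sketched in pure JS. The helper below is illustrative only (not part of the binding): it assumes getLogits() returns raw, unnormalized logits in a Float32Array, as described above, and computes Shannon entropy (in nats) over their softmax.

  ```javascript
  // Illustrative sketch: Shannon entropy (nats) over the softmax of a raw
  // logits vector, the kind of Float32Array Branch.getLogits() is described
  // as returning. Pure JS; no native binding involved.
  function logitsEntropy(logits) {
    let max = -Infinity;
    for (const x of logits) if (x > max) max = x; // subtract max to stabilize exp()
    let z = 0;
    const exps = new Float64Array(logits.length);
    for (let i = 0; i < logits.length; i++) {
      exps[i] = Math.exp(logits[i] - max);
      z += exps[i];
    }
    let h = 0;
    for (let i = 0; i < logits.length; i++) {
      const p = exps[i] / z;
      if (p > 0) h -= p * Math.log(p); // -sum p log p
    }
    return h;
  }
  ```

  A uniform logits vector of length V yields the maximum entropy log(V); a sharply peaked one approaches 0.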

  Tests cover best-of-N, rehydrate+generate, produce→commit interop,
  mixed single/batched paths, getLogits→modelEntropy integration, and
  independent per-branch EOG with steer-forced termination.
  SessionContext uses span-based DecodeEachItem/DecodeScatterItem and
  BranchStore-internal scratch buffers, removing the _decodeScratch member.
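  The produce→commit loop this enables can be sketched as follows. This is a hypothetical driver, not part of the binding: `stepAll` and the `maxSteps` parameter are invented for illustration, while the produce()/isStop/commit([branch, token][]) shapes follow the tuple API described above.

  ```javascript
  // Hypothetical sketch: drive N branches in lockstep through a BranchStore.
  // Each step commits one token per live branch, so the whole step costs a
  // single batched decode instead of N separate dispatches.
  function stepAll(store, branches, maxSteps) {
    for (let step = 0; step < maxSteps; step++) {
      const live = branches
        .map(b => [b, b.produce()])      // sample next token per branch (CPU)
        .filter(([, p]) => !p.isStop);   // drop branches that hit end-of-generation
      if (!live.length) return step;     // every branch stopped early
      // One commit = one batched decode for all live branches.
      store.commit(live.map(([b, p]) => [b, p.token]));
    }
    return maxSteps;
  }
  ```

  Branches that stop early simply fall out of the commit list, which is the independent per-branch EOG behavior the tests exercise.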
Copilot AI review requested due to automatic review settings February 14, 2026 04:09

Copilot AI left a comment


Pull request overview

Adopts the updated BranchStore API from liblloyal to enable high-throughput batched multi-branch decoding, and exposes per-branch logits snapshots to JS consumers.

Changes:

  • Added BranchStore JS wrapper + exports, plus new internal N-API bindings for batched commit/prefill.
  • Exposed Branch.getLogits() via a new native _branchGetLogits() method.
  • Added an integration test suite covering common BranchStore workflows.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.

Files changed:
  • test/integration.js: Adds end-to-end integration coverage for BranchStore workflows and per-branch logits snapshots.
  • src/SessionContext.hpp: Declares new internal native methods for branch logits access and store operations.
  • src/SessionContext.cpp: Implements _branchGetLogits, _storeCommit, _storePrefill, and updates decode API usage.
  • lib/index.js: Exports BranchStore from the public JS entrypoint.
  • lib/index.d.ts: Adds typings/docs for _branchGetLogits, _storeCommit/_storePrefill, Branch.getLogits(), and BranchStore.
  • lib/BranchStore.js: Introduces the JS BranchStore wrapper over the internal native store API.
  • lib/Branch.js: Adds Branch.getLogits() wrapper method.
  • liblloyal: Updates the submodule to a commit that provides the new BranchStore-native functionality.


Comment on lines +2543 to +2550
// Accept tokens into sampler penalty windows (CPU, per-branch)
for (uint32_t i = 0; i < n; i++) {
  lloyal::branch::accept_token(items[i].handle, items[i].token, &_branchStore);
}

// Batched decode: one token per branch, single llama_decode dispatch
_branchStore.decode_each(items);


Copilot AI Feb 14, 2026


In _storeCommit, accept_token mutates per-branch sampler/penalty state before the decode happens. If _branchStore.decode_each(items) errors/throws or fails internally, the commit becomes partial (penalty windows advanced, KV/logits not), leaving branches in an inconsistent state. Consider performing decode_each first and only calling accept_token after successful decode, or wrapping decode in error handling and ensuring state is not mutated unless the commit fully succeeds.

Suggested change
- // Accept tokens into sampler penalty windows (CPU, per-branch)
- for (uint32_t i = 0; i < n; i++) {
-   lloyal::branch::accept_token(items[i].handle, items[i].token, &_branchStore);
- }
- // Batched decode: one token per branch, single llama_decode dispatch
- _branchStore.decode_each(items);
+ // Batched decode: one token per branch, single llama_decode dispatch.
+ // Perform decode first so that if it throws/fails, sampler/penalty state
+ // is not mutated and branches remain consistent.
+ _branchStore.decode_each(items);
+ // Accept tokens into sampler penalty windows (CPU, per-branch)
+ for (uint32_t i = 0; i < n; i++) {
+   lloyal::branch::accept_token(items[i].handle, items[i].token, &_branchStore);
+ }

Comment on lines +1046 to +1062
for (let step = 0; step < 10; step++) {
  const live = branches.map(b => [b, b.produce()])
    .filter(([, p]) => !p.isStop);
  if (!live.length) break;
  store.commit(live.map(([b, p]) => [b, p.token]));
}

// All branches should have valid perplexity (metrics tracked through _storeCommit)
const ppls = branches.map(b => b.perplexity);
console.log(`  best-of-N perplexities: [${ppls.map(p => p.toFixed(2)).join(', ')}]`);
assert(ppls.every(p => isFinite(p) && p >= 1.0),
  `best-of-N: all perplexities valid [${ppls.map(p => p.toFixed(2))}]`);

// Greedy forks from same root → same perplexity (sanity check)
assert(Math.abs(ppls[0] - ppls[1]) < 0.01,
  `best-of-N: greedy forks have equal perplexity`);


Copilot AI Feb 14, 2026


These BranchStore integration tests assume the model won’t immediately produce an EOG/stop token after the prompt (e.g., if live.length becomes 0 at step 0). In that case, later assertions about perplexity validity/equality can become flaky across models/variants. To make the tests robust, consider explicitly steering away from EOG for the first token(s), or detect early-stop and skip/adjust the assertions when no commits occurred.

Suggested change
- for (let step = 0; step < 10; step++) {
-   const live = branches.map(b => [b, b.produce()])
-     .filter(([, p]) => !p.isStop);
-   if (!live.length) break;
-   store.commit(live.map(([b, p]) => [b, p.token]));
- }
- // All branches should have valid perplexity (metrics tracked through _storeCommit)
- const ppls = branches.map(b => b.perplexity);
- console.log(`  best-of-N perplexities: [${ppls.map(p => p.toFixed(2)).join(', ')}]`);
- assert(ppls.every(p => isFinite(p) && p >= 1.0),
-   `best-of-N: all perplexities valid [${ppls.map(p => p.toFixed(2))}]`);
- // Greedy forks from same root → same perplexity (sanity check)
- assert(Math.abs(ppls[0] - ppls[1]) < 0.01,
-   `best-of-N: greedy forks have equal perplexity`);
+ let commitCount = 0;
+ for (let step = 0; step < 10; step++) {
+   const live = branches.map(b => [b, b.produce()])
+     .filter(([, p]) => !p.isStop);
+   if (!live.length) break;
+   store.commit(live.map(([b, p]) => [b, p.token]));
+   commitCount += 1;
+ }
+ // All branches should have valid perplexity (metrics tracked through _storeCommit)
+ const ppls = branches.map(b => b.perplexity);
+ console.log(`  best-of-N perplexities: [${ppls.map(p => p.toFixed(2)).join(', ')}]`);
+ if (commitCount === 0) {
+   console.log('  best-of-N skipped: model stopped immediately (no tokens committed)');
+ } else {
+   assert(ppls.every(p => isFinite(p) && p >= 1.0),
+     `best-of-N: all perplexities valid [${ppls.map(p => p.toFixed(2))}]`);
+   // Greedy forks from same root → same perplexity (sanity check)
+   assert(Math.abs(ppls[0] - ppls[1]) < 0.01,
+     `best-of-N: greedy forks have equal perplexity`);
+ }

Comment on lines +12 to +20
commit(entries) {
  const handles = [], tokens = [];
  for (const [branch, token] of entries) {
    if (branch.disposed) throw new Error('BranchStore.commit: branch is disposed');
    handles.push(branch.handle);
    tokens.push(token);
  }
  this._ctx._storeCommit(handles, tokens);
}

Copilot AI Feb 14, 2026


BranchStore.commit()/prefill() are public JS entrypoints but currently rely on destructuring and downstream native validation for most input-shape errors. If a caller passes a non-iterable, malformed tuple, or wrong token types, the thrown error may be confusing. Consider adding lightweight upfront validation (e.g., Array.isArray(entries), tuple length checks, numeric token checks, and Array.isArray(tokens) for prefill) and throwing a TypeError with a clear message.
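One way to realize the suggested validation is a small helper run before any branch state is touched, so malformed input fails with a clear TypeError instead of a confusing downstream error. `validateEntries` is a hypothetical helper sketched here for illustration; it is not part of the actual binding.

```javascript
// Hypothetical upfront validation for BranchStore.commit()/prefill():
// reject non-arrays, malformed tuples, and non-integer tokens with a
// descriptive TypeError before any native call or state mutation.
function validateEntries(entries, method) {
  if (!Array.isArray(entries)) {
    throw new TypeError(`BranchStore.${method}: entries must be an array of [branch, token] tuples`);
  }
  for (const entry of entries) {
    if (!Array.isArray(entry) || entry.length !== 2) {
      throw new TypeError(`BranchStore.${method}: each entry must be a [branch, token] tuple`);
    }
    const [, token] = entry;
    if (!Number.isInteger(token)) {
      throw new TypeError(`BranchStore.${method}: token must be an integer, got ${typeof token}`);
    }
  }
}
```

For prefill, the same shape check would accept an array of tokens in the second tuple slot instead of a single integer.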

const float* logits = lloyal::branch::get_logits(handle, &_branchStore);

if (!logits) {
  throw Napi::Error::New(env, "_branchGetLogits: no logits captured");

Copilot AI Feb 14, 2026


The error message _branchGetLogits: no logits captured doesn’t tell the caller how to resolve it. Since the intended fix is actionable, consider extending the message to indicate which operation captures logits (e.g., captureLogits() or decodeAndCapture*() / a store commit that captures), so consumers can correct usage quickly.

Suggested change
- throw Napi::Error::New(env, "_branchGetLogits: no logits captured");
+ throw Napi::Error::New(env,
+   "_branchGetLogits: no logits captured; call a logits-capturing operation "
+   "(e.g. captureLogits, decodeAndCapture*, or a store commit that captures logits) "
+   "before calling _branchGetLogits");

@lloyal-research lloyal-research merged commit 5b04fd5 into main Feb 14, 2026
2 of 7 checks passed
@lloyal-research lloyal-research deleted the feat/throughput branch February 20, 2026 03:09