feat(branch): async API surface + AsyncWorker promotion #18
Merged
lloyal-research merged 2 commits into main on Feb 18, 2026
Breaking change: Branch and BranchStore methods that touch GPU decode
or KV mutation are now async.
Three-tier strategy:
- Sync: pure CPU ops (produce, sample, accept, accessors, steer)
- Async wrapper: JS async over sync native (fork, prune, pruneSubtree,
retainOnly) — KV metadata ops where worker overhead > operation cost
- True AsyncWorker: GPU decode moved to libuv thread pool
(branchDecodeAndCaptureOne, branchDecodeAndCaptureBatch, storeCommit,
storePrefill, decodeAndCapture, jsonSchemaToGrammar)
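
The three tiers can be illustrated with a minimal sketch. Everything here is hypothetical: `native` is a stand-in object rather than the real addon, `BranchSketch` is not the actual `Branch` class, and the method bodies only mimic the shape of each tier (sync return, `async` wrapper over a sync call, and a Promise that settles later, as an AsyncWorker-backed call would).

```javascript
// Hypothetical stand-in for the native addon (assumption, not the real binding).
const native = {
  // Pure CPU work: returns a value immediately.
  sample: (handle) => 42,
  // Cheap KV metadata update: synchronous on the native side.
  prune: (handle) => {},
  // Stand-in for an AsyncWorker-backed call: settles on a later tick.
  decodeAndCapture: (handle, token) =>
    new Promise((resolve) => setImmediate(resolve)),
};

class BranchSketch {
  constructor(handle) {
    this._handle = handle;
    this.position = 0;
    this.pruned = false;
  }

  // Tier 1 (sync): no Promise, the caller gets the value directly.
  sample() {
    return native.sample(this._handle);
  }

  // Tier 2 (async wrapper): the native call is synchronous, but the method
  // returns a Promise so callers use the same `await` style everywhere.
  async prune() {
    native.prune(this._handle);
    this.pruned = true;
  }

  // Tier 3 (true async): the native side returns a Promise that settles
  // once the off-thread work completes.
  async commit(token) {
    await native.decodeAndCapture(this._handle, token);
    this.position += 1;
  }
}

async function demo() {
  const branch = new BranchSketch(1);
  const token = branch.sample(); // sync
  await branch.commit(token);    // true async
  await branch.prune();          // async wrapper
  return branch;
}

demo().then((b) => console.log(b.position, b.pruned));
```

From the caller's side tiers 2 and 3 look identical; the difference is only whether the work leaves the JS thread.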
Branch is now an async iterable — for await (const { token, text } of branch)
generates until EOG with commit-before-yield semantics. produce/commit
remains available for multi-branch coordination.
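
A rough sketch of what commit-before-yield means for a consumer, assuming a scripted token list in place of a model. `IterableBranchSketch`, its `isStop` field, and the `'EOG'` marker string are illustrative names only, not the library's API.

```javascript
// Hypothetical branch: produce() is sync and commit() is async, matching the
// tiers described above. The token list is scripted instead of model output.
class IterableBranchSketch {
  constructor(tokens) {
    this._pending = [...tokens];
    this.committed = [];
  }

  // Sync: peek at the next token without mutating state.
  produce() {
    const token = this._pending[0];
    const isStop = token === undefined || token === 'EOG';
    return { token, text: String(token), isStop };
  }

  // Async: stands in for the GPU decode that advances the branch.
  async commit(token) {
    await new Promise((resolve) => setImmediate(resolve));
    this._pending.shift();
    this.committed.push(token);
  }

  async *[Symbol.asyncIterator]() {
    while (true) {
      const { token, text, isStop } = this.produce();
      if (isStop) return;       // end of generation: stop without committing
      await this.commit(token); // commit first...
      yield { token, text };    // ...then hand the token to the consumer
    }
  }
}

async function collect(branch) {
  const seen = [];
  for await (const { token, text } of branch) {
    // Record whether the token was already committed when it was yielded.
    seen.push({ token, text, alreadyCommitted: branch.committed.includes(token) });
  }
  return seen;
}

collect(new IterableBranchSketch(['a', 'b', 'EOG'])).then((seen) => console.log(seen));
```

Because the commit happens before the yield, a consumer that breaks out of the loop early still leaves the branch in a state consistent with every token it has seen.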
No liblloyal changes. Callers serialize with await (same contract as
existing DecodeWorker). Removed _decodeMutex.
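
To make the serialization contract concrete, here is a small sketch with a fake decode call that counts how many operations are in flight. `makeTrackedDecode`, `runSerialized`, and `runOverlapped` are hypothetical helpers written for this illustration; the real native calls are not involved.

```javascript
// Builds a fake async "decode" plus counters for concurrent calls (illustration only).
function makeTrackedDecode() {
  const stats = { inFlight: 0, maxInFlight: 0 };
  const decode = () => {
    stats.inFlight += 1;
    stats.maxInFlight = Math.max(stats.maxInFlight, stats.inFlight);
    return new Promise((resolve) =>
      setImmediate(() => {
        stats.inFlight -= 1;
        resolve();
      })
    );
  };
  return { decode, stats };
}

// Contract-respecting caller: each call is awaited before the next starts.
async function runSerialized(decode, n) {
  for (let i = 0; i < n; i++) {
    await decode();
  }
}

// Contract-violating caller: all calls are started before any is awaited.
function runOverlapped(decode, n) {
  return Promise.all(Array.from({ length: n }, () => decode()));
}

async function demoSerialization() {
  const a = makeTrackedDecode();
  await runSerialized(a.decode, 3);
  const b = makeTrackedDecode();
  await runOverlapped(b.decode, 3);
  return { serializedMax: a.stats.maxInFlight, overlappedMax: b.stats.maxInFlight };
}

demoSerialization().then((result) => console.log(result));
```

With no native mutex, nothing stops the overlapped pattern from reaching the decode path concurrently, so awaiting each call in turn is the caller's responsibility.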
Pull request overview
This PR converts the Branch and BranchStore APIs from synchronous to asynchronous, introducing a three-tier architecture:
- Sync tier: Pure CPU operations (produce, sample, accept, accessors, steer)
- Async wrapper tier: JavaScript async methods over synchronous native calls (fork, prune, pruneSubtree, retainOnly)
- True AsyncWorker tier: GPU decode operations moved to libuv thread pool (commit, prefill, decodeAndCapture, jsonSchemaToGrammar)
The PR also introduces Branch as an async iterable, enabling simple for await loops for token generation with commit-before-yield semantics. The mutex was removed as operations are now serialized through JavaScript's await contract rather than native locking.
Changes:
- Converted GPU decode and KV mutation methods to return Promises, offloading work to libuv thread pool via AsyncWorker classes
- Made Branch an async iterable with commit-before-yield semantics
- Removed _decodeMutex from SessionContext, relying on JavaScript-level serialization
- Added comprehensive test coverage for async rejection, empty inputs, JSON schema conversion, disposal during async operations, and async iteration
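
As an illustration of what an async rejection test can look like (not the code in test/integration.js, which is not shown here), the sketch below uses a hypothetical `commitSketch` that rejects on invalid input and a small `expectRejection` helper. The validation rule is invented for the example.

```javascript
// Hypothetical async method: rejects on bad input, standing in for a native
// call whose failure surfaces as a rejected Promise.
async function commitSketch(token) {
  if (!Number.isInteger(token) || token < 0) {
    throw new RangeError(`invalid token: ${token}`);
  }
  return token;
}

// Resolves to the error if the promise rejects; throws if it resolves.
async function expectRejection(promise) {
  try {
    await promise;
  } catch (err) {
    return err;
  }
  throw new Error('expected promise to reject, but it resolved');
}

async function testAsyncRejection() {
  const err = await expectRejection(commitSketch(-1));
  if (!(err instanceof RangeError)) {
    throw new Error(`unexpected error type: ${err}`);
  }
  return err.message;
}

testAsyncRejection().then((message) => console.log('rejected with:', message));
```

The key point is that a failure now arrives as a rejected Promise rather than a synchronous throw, so tests need to await the call inside the `try` block.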
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/SessionContext.cpp | Added 6 AsyncWorker classes for GPU decode operations; converted decodeAndCapture, branch operations, and store operations to return Promises; removed mutex |
| src/SessionContext.hpp | Removed _decodeMutex member and include; updated comment for decodeAndCapture |
| lib/Branch.js | Converted fork, prune, pruneSubtree, commit, prefill, decodeAndCaptureOne to async; implemented Symbol.asyncIterator; updated documentation |
| lib/BranchStore.js | Converted commit, prefill, retainOnly to async methods |
| lib/index.d.ts | Updated type signatures for all async methods; added Symbol.asyncIterator type; updated JSDoc examples |
| test/integration.js | Added await keywords throughout; added 5 new test functions for async features (rejection, empty inputs, JSON schema, disposal, iteration) |
| examples/*.mjs | Updated all examples to use await with Branch/BranchStore methods; simplified chat example using async iterator |
| README.md | Updated examples to show async API usage; added async iterator example |