feat(chat): new chat api #14

Merged
lloyal-research merged 13 commits into main from feat/new-chat-api on Feb 12, 2026

Conversation

@lloyal-research
Contributor

Integrated the new liblloyal::chat_in and liblloyal::chat_out APIs for parity with llama.cpp chat edge cases.

Copilot AI review requested due to automatic review settings February 10, 2026 15:30

Copilot AI left a comment


Pull request overview

Integrates liblloyal’s new chat_in/chat_out APIs to provide format-aware chat prompt generation and structured output parsing (tools/reasoning), aligning behavior with llama.cpp chat edge cases.

Changes:

  • Replaced chat_template usage with chat_in formatting and added chat_out parsing via new parseChatOutput() binding.
  • Expanded TypeScript typings to expose format-awareness metadata, parsing options/results, and related enums/types.
  • Added integration tests for chat in/out round-tripping and new warm-vs-cold parity + semantic recall checks; updated CI matrix model.

Reviewed changes

Copilot reviewed 7 out of 9 changed files in this pull request and generated 3 comments.

Summary per file:

  • test/matrix.json: Switches the CI chat model entry to Ministral for updated chat-template coverage.
  • test/integration.js: Adds new integration tests for chat_in/chat_out and warm-continuation correctness.
  • src/SessionContext.hpp: Declares the new parseChatOutput N-API method.
  • src/SessionContext.cpp: Implements chat_in::format plumbing, exposes extended format metadata, and adds the chat_out::parse binding.
  • liblloyal: Updates the submodule to a revision containing chat_in/chat_out.
  • lib/index.d.ts: Adds format-aware chat result fields, parsing APIs/types, and richer docs.
  • examples/chat/chat.mjs: Formatting only: normalizes quoting/style (no behavior change shown).
  • README.md: Updates the listed CI model/template entry.
  • .gitignore: Adds an ignore rule for docs/_internal.


Comment on lines +492 to +503
assert(warmStr === coldStr,
  `Warm==Cold parity: ${warmGen2.length} tokens match`);

if (warmStr !== coldStr) {
  // Diagnostic: show first divergence point
  for (let i = 0; i < Math.max(warmGen2.length, coldGen2.length); i++) {
    if (warmGen2[i] !== coldGen2[i]) {
      console.log(`  First divergence at position ${i}: warm=${warmGen2[i]} cold=${coldGen2[i]}`);
      break;
    }
  }
}

Copilot AI Feb 10, 2026


The diagnostic block is unreachable on mismatch because assert() throws before the if (warmStr !== coldStr) runs, and the assert message is misleading (it claims tokens match even when they don’t). Move the divergence logging before throwing, and make the failure message include at least the first differing index (or the two token sequences) so failures are actionable.

Suggested change:

  let firstDiffIdx = -1;
  if (warmStr !== coldStr) {
    // Diagnostic: show first divergence point
    for (let i = 0; i < Math.max(warmGen2.length, coldGen2.length); i++) {
      if (warmGen2[i] !== coldGen2[i]) {
        firstDiffIdx = i;
        console.log(`  First divergence at position ${i}: warm=${warmGen2[i]} cold=${coldGen2[i]}`);
        break;
      }
    }
  }
  assert(
    warmStr === coldStr,
    firstDiffIdx >= 0
      ? `Warm==Cold parity failed at token ${firstDiffIdx}: warm=${warmGen2[firstDiffIdx]} cold=${coldGen2[firstDiffIdx]}`
      : `Warm==Cold parity failed: warm sequence=${warmStr} cold sequence=${coldStr}`
  );

Comment on lines +543 to +560
async function warmTurn(messages, lastText, userContent) {
  messages.push({ role: 'user', content: userContent });
  const { prompt: fullPrompt } = await ctx.formatChat(JSON.stringify(messages));
  const delta = fullPrompt.slice(lastText.length);
  const deltaToks = await ctx.tokenize(delta);
  branch.prefill(deltaToks);

  const gen = [];
  for (let i = 0; i < GEN_TOKENS; i++) {
    const { token, isStop } = branch.produce();
    if (isStop) break;
    branch.commit(token);
    gen.push(token);
  }
  const assistantText = await ctx.detokenize(gen);
  messages.push({ role: 'assistant', content: assistantText });
  return { text: assistantText, lastText: fullPrompt + assistantText };
}

Copilot AI Feb 10, 2026


warmTurn() closes over branch, but branch is declared later with var, relying on hoisting and call-order for correctness. This is fragile and harder to read/debug; declare let branch; before warmTurn and assign it before any call sites, or pass branch explicitly as a parameter to warmTurn() (and prefer const/let over var).
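The hazard described here can be reproduced in isolation. The sketch below uses illustrative names (`makeTurn`, `handle`), not the PR's code:

```javascript
// With `var`, the binding is hoisted to the top of the function and holds
// `undefined` until the assignment runs, so a closure that dereferences it
// fails at call time, not declaration time. `let` without assignment has
// the same use-before-assign hazard; passing the dependency as a parameter
// avoids it entirely.
function makeTurn() {
  function turn() { return handle.id; } // closes over `handle`, declared below
  var handle;                           // hoisted: `handle` is undefined here

  let early;
  try {
    turn();                             // dereferences undefined -> TypeError
  } catch (e) {
    early = e.constructor.name;
  }

  handle = { id: 42 };                  // assigned before the intended call site
  return { early, late: turn() };
}

console.log(makeTurn()); // { early: 'TypeError', late: 42 }
```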

const promptToks = await ctx.tokenize(prompt);
await ctx.decode(promptToks, 0, 0);

var branch = Branch.create(ctx, 0, promptToks.length, { temperature: 0 });

Copilot AI Feb 10, 2026


warmTurn() closes over branch, but branch is declared later with var, relying on hoisting and call-order for correctness. This is fragile and harder to read/debug; declare let branch; before warmTurn and assign it before any call sites, or pass branch explicitly as a parameter to warmTurn() (and prefer const/let over var).

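The warm-continuation trick in `warmTurn()` (prefilling only the suffix of the newly formatted prompt) can be sketched as a pure string operation. Hypothetical helper, not from the PR:

```javascript
// Sketch of the delta computation in warmTurn(): only the suffix of the
// freshly formatted prompt that was not already decoded gets tokenized and
// prefilled, so the KV cache for the shared prefix is reused. If the chat
// template re-renders history differently, the prefix no longer matches
// and a warm continuation would corrupt state, so we fail loudly instead.
function promptDelta(fullPrompt, lastText) {
  if (!fullPrompt.startsWith(lastText)) {
    throw new Error('Formatted prompt no longer extends the decoded text');
  }
  return fullPrompt.slice(lastText.length);
}

console.log(promptDelta('<s>user: hi</s>assistant:', '<s>user: hi</s>')); // 'assistant:'
```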

Copilot AI left a comment


Pull request overview

Copilot reviewed 11 out of 13 changed files in this pull request and generated 6 comments.



Comment on lines 143 to 145
* Returns: Promise<{ prompt: string, stopTokens: string[] }>
*/
Napi::Value formatChat(const Napi::CallbackInfo& info);

Copilot AI Feb 11, 2026


The formatChat doc comment is now outdated: the implementation returns many additional fields (format, grammar, grammarLazy, reasoningFormat, etc.), not just { prompt, stopTokens }. Please update this comment to reflect the extended return shape, and add a brief doc comment describing parseChatOutput’s arguments/return fields since it’s a new public binding.

Suggested change:

 * Returns: Promise<object> resolving to an object containing:
 *   - prompt: string
 *   - stopTokens: number[]
 *   - format?: string
 *   - grammar?: string
 *   - grammarLazy?: () => string
 *   - reasoningFormat?: string
 *   - and other model-specific formatting/grammar metadata
 */
Napi::Value formatChat(const Napi::CallbackInfo& info);

/**
 * Parse raw model chat output using the metadata returned by formatChat.
 * Args: info (NAPI call info; typically includes the raw completion text and
 *       associated formatting/grammar metadata).
 * Returns: JavaScript object representing the parsed chat response, including
 *          assistant content and related metadata.
 */


  if (info.Length() < 1 || !info[0].IsString()) {
-   throw Napi::TypeError::New(env, "Expected (messagesJson: string[, templateOverride: string])");
+   throw Napi::TypeError::New(env, "Expected (messagesJson: string[, options: object])");

Copilot AI Feb 11, 2026


The updated error message no longer reflects the actual supported signature: the implementation still supports a string second argument for backward compatibility (formatChat(messagesJson, templateOverride: string)). Consider updating the message to mention both accepted forms so callers can recover quickly from argument errors.

Suggested change:

  throw Napi::TypeError::New(
      env,
      "Expected (messagesJson: string[, options: object]) or (messagesJson: string, templateOverride: string)");

Comment on lines +1170 to +1176
// Second argument: options object (or string for backward compat)
if (info.Length() >= 2) {
  if (info[1].IsString()) {
    // Backward compat: formatChat(messagesJson, templateOverride)
    inputs.template_override = info[1].As<Napi::String>().Utf8Value();
  } else if (info[1].IsObject()) {
    Napi::Object opts = info[1].As<Napi::Object>();

Copilot AI Feb 11, 2026


If a second argument is provided but is neither a string nor an object (e.g. null, number, boolean), it is silently ignored. That can hide calling bugs and lead to surprising behavior. It’d be better to throw a TypeError when info[1] is present but not one of the supported types.
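The strict-validation behavior the comment asks for can be sketched as a JS-side mirror of the binding's argument handling. All names here (`normalizeFormatChatArgs`, the returned shape) are hypothetical, not liblloyal's API:

```javascript
// Hypothetical JS-side mirror of the binding's argument check: a second
// argument must be a string (template override) or a plain object (options);
// anything else (null, number, boolean) throws instead of being silently
// ignored, so calling bugs surface immediately.
function normalizeFormatChatArgs(messagesJson, second) {
  if (typeof messagesJson !== 'string') {
    throw new TypeError('Expected messagesJson: string');
  }
  if (second === undefined) {
    return { templateOverride: null, options: {} };
  }
  if (typeof second === 'string') {
    // Backward compat: formatChat(messagesJson, templateOverride)
    return { templateOverride: second, options: {} };
  }
  if (second !== null && typeof second === 'object') {
    return { templateOverride: null, options: second };
  }
  throw new TypeError('Expected options to be a string or object');
}
```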

  }
  if (opts.Has("addGenerationPrompt") && opts.Get("addGenerationPrompt").IsBoolean()) {
    inputs.add_generation_prompt = opts.Get("addGenerationPrompt").As<Napi::Boolean>().Value();
  }

Copilot AI Feb 11, 2026


If a second argument is provided but is neither a string nor an object (e.g. null, number, boolean), it is silently ignored. That can hide calling bugs and lead to surprising behavior. It’d be better to throw a TypeError when info[1] is present but not one of the supported types.

Suggested change:

    } else {
      throw Napi::TypeError::New(env, "Expected options to be a string or object");
    }
Comment on lines +1190 to +1192
if (opts.Has("reasoningFormat") && opts.Get("reasoningFormat").IsString()) {
inputs.reasoning_format = opts.Get("reasoningFormat").As<Napi::String>().Utf8Value();
}

Copilot AI Feb 11, 2026


FormatChatOptions.reasoningFormat is modeled as a string in the binding, but you also expose a numeric ReasoningFormat enum in the TS types. A common user expectation will be to pass ReasoningFormat.AUTO (number) into formatChat, but this implementation will ignore it because it only accepts strings. Consider accepting both number and string inputs here (or adjusting the TS surface to avoid offering an enum that isn’t accepted on input).
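Accepting both forms could look like the sketch below. The enum values and the `normalizeReasoningFormat` helper are illustrative, not liblloyal's actual numbering or API:

```javascript
// Hypothetical normalizer: accept either the numeric ReasoningFormat enum
// value or its string name, and hand the native binding a string either way.
// Enum members and numbering here are illustrative only.
const ReasoningFormat = { NONE: 0, AUTO: 1, DEEPSEEK: 2 };

// Reverse map: numeric enum value -> lowercase string name.
const names = Object.fromEntries(
  Object.entries(ReasoningFormat).map(([k, v]) => [v, k.toLowerCase()])
);

function normalizeReasoningFormat(value) {
  if (typeof value === 'string') return value;
  if (typeof value === 'number' && value in names) return names[value];
  throw new TypeError(`Unknown reasoningFormat: ${value}`);
}

console.log(normalizeReasoningFormat(ReasoningFormat.AUTO)); // 'auto'
console.log(normalizeReasoningFormat('deepseek'));           // 'deepseek'
```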

Comment on lines +450 to +459
const turn3 = await warmTurn(branch, 'Do you remember my name?');
console.log(` Turn 3 (name recall): "${turn3.trim().slice(0, 80)}"`);
const nameRecalled = turn3.toLowerCase().includes('lloyal');
assert(nameRecalled, `Name recall: ${nameRecalled ? 'found "Lloyal"' : 'MISSING "Lloyal" in: ' + turn3.trim().slice(0, 120)}`);

// Turn 4 (WARM): recall food
const turn4 = await warmTurn(branch, 'Do you remember my favourite food?');
console.log(` Turn 4 (food recall): "${turn4.trim().slice(0, 80)}"`);
const foodRecalled = turn4.toLowerCase().includes('pizza');
assert(foodRecalled, `Food recall: ${foodRecalled ? 'found "pizza"' : 'MISSING "pizza" in: ' + turn4.trim().slice(0, 120)}`);

Copilot AI Feb 11, 2026


These assertions make the integration test brittle: even at temperature: 0, a model can “remember” without repeating the exact string (e.g., “Yes, you told me earlier” or paraphrasing), causing false failures across models/templates. To reduce flakiness, consider tightening the prompts to require an exact minimal answer (e.g., “Reply with only the name.” / “Reply with only the food.”) or relaxing the checks to allow common variants while still proving recall.

Suggested change:

  const turn3 = await warmTurn(branch, 'Do you remember my name? Reply with only my name.');
  console.log(`  Turn 3 (name recall): "${turn3.trim().slice(0, 80)}"`);
  const nameAnswer = turn3.trim().toLowerCase();
  const nameRecalled = nameAnswer === 'lloyal';
  assert(nameRecalled, `Name recall: expected "Lloyal", got "${nameAnswer}"`);

  // Turn 4 (WARM): recall food
  const turn4 = await warmTurn(branch, 'Do you remember my favourite food? Reply with only my favourite food.');
  console.log(`  Turn 4 (food recall): "${turn4.trim().slice(0, 80)}"`);
  const foodAnswer = turn4.trim().toLowerCase();
  const foodRecalled = foodAnswer === 'pizza';
  assert(foodRecalled, `Food recall: expected "pizza", got "${foodAnswer}"`);
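The other direction the comment mentions, relaxing the check to allow common variants while still proving recall, could look like this sketch (`recalled` is a hypothetical helper, not in the PR):

```javascript
// Hypothetical recall check that tolerates paraphrase: pass if the reply
// contains the expected token anywhere, or matches one of a small set of
// caller-supplied variant phrasings. Case-insensitive on both sides.
function recalled(reply, expected, variants = []) {
  const text = reply.trim().toLowerCase();
  if (text.includes(expected.toLowerCase())) return true;
  return variants.some((v) => text.includes(v.toLowerCase()));
}

console.log(recalled('Yes, your name is Lloyal!', 'Lloyal')); // true
console.log(recalled('I cannot recall that.', 'pizza'));      // false
```

Either tightening the prompt or relaxing the check works; the key is that the assertion and the prompt agree on how exact the answer must be.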

@lloyal-research lloyal-research merged commit 125f106 into main Feb 12, 2026
6 of 7 checks passed
@lloyal-research lloyal-research deleted the feat/new-chat-api branch February 20, 2026 03:10