Add Function Calling support and OpenAI Style Conversion by saivishwak · Pull Request #908 · utilityai/llama-cpp-rs

saivishwak · 2026-01-22T16:46:06Z

Added full llama.cpp function‑calling support with OpenAI‑style chat compatibility, matching server behavior end‑to‑end.
Exposed granular raw bindings in llama-cpp-sys-2 (chat templates, parsers, grammar triggers, preserved tokens, streaming diffs).
Implemented safe Rust wrappers in llama-cpp-2, including OpenAI chat message types and streaming delta parsing.
Fixed grammar sampler crashes by aligning sampling flow with llama.cpp (avoid double‑accept).
Added a new example, examples/tools.rs, showing tool calling.
Added a new example, examples/openai_stream.rs, showing OpenAI conversion with streaming.
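
The double-accept fix can be pictured with a toy model. In llama.cpp, `llama_sampler_sample` already accepts the token it selects, so accepting that token a second time advances a stateful sampler such as a grammar twice. The sketch below is only an illustration under that assumption: `ToyGrammar` and `generate_once` are hypothetical names, not types or functions from llama-cpp-2.

```rust
/// Hypothetical stand-in for a grammar-constrained sampler: it only allows
/// the tokens in `expected`, in order. Not part of llama-cpp-2.
struct ToyGrammar {
    expected: Vec<u32>,
    pos: usize,
}

impl ToyGrammar {
    fn new(expected: Vec<u32>) -> Self {
        Self { expected, pos: 0 }
    }

    /// Advance the grammar state by one token; fails if the token is not
    /// the one allowed next.
    fn accept(&mut self, token: u32) -> Result<(), String> {
        match self.expected.get(self.pos) {
            Some(&want) if want == token => {
                self.pos += 1;
                Ok(())
            }
            Some(&want) => Err(format!("expected token {want}, got {token}")),
            None => Err(format!("grammar finished, got {token}")),
        }
    }

    /// Pick the next allowed token and accept it internally, mirroring how
    /// llama.cpp's `llama_sampler_sample` accepts the token it selects.
    fn sample(&mut self) -> Option<u32> {
        let token = *self.expected.get(self.pos)?;
        self.accept(token).ok()?;
        Some(token)
    }
}

/// Correct flow: `sample` is the only call that advances the grammar.
fn generate_once(grammar: &mut ToyGrammar) -> Vec<u32> {
    let mut out = Vec::new();
    while let Some(token) = grammar.sample() {
        out.push(token);
    }
    out
}
```

Calling `accept` again on a token that `sample` just returned fails in this model, because the grammar has already moved past it. That is the kind of state mismatch a double accept produces.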

saivishwak · 2026-01-22T16:51:56Z

Streaming Example with Tools

cargo run --features cuda --release --example openai_stream -- hf-model TheBloke/Llama-2-7B-GGUF llama-2-7b.Q4_K_M.gguf

Output

Streaming deltas:
{"tool_calls":[{"function":{"arguments":"","name":"get_weather"},"id":"cyrkrUxsxp2KRhcU5y0UP5yunfiFwK1f","index":0,"type":"function"}]}
{"tool_calls":[{"function":{"arguments":"{"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\""},"index":0}]}
{"tool_calls":[{"function":{"arguments":"city"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\":"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\""},"index":0}]}
{"tool_calls":[{"function":{"arguments":"Par"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"is"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\","},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\""},"index":0}]}
{"tool_calls":[{"function":{"arguments":"unit"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\":"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\""},"index":0}]}
{"tool_calls":[{"function":{"arguments":"c"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\""},"index":0}]}
{"tool_calls":[{"function":{"arguments":"}"},"index":0}]}

Final message:
{
  "content": null,
  "role": "assistant",
  "tool_calls": [
    {
      "function": {
        "arguments": "{\"city\":\"Paris\",\"unit\":\"c\"}",
        "name": "get_weather"
      },
      "id": "6jYW4BsvCRuduDwghsKLzgiyTrWsv49Q",
      "type": "function"
    }
  ]
}
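
The relationship between the streamed deltas and the final message can be sketched as a small merge step: the first delta carries the call's `id` and `name`, and every later delta appends a fragment to `arguments`. The types below are hypothetical and reduced to the fields visible in the output above; they are not the delta types exposed by llama-cpp-2.

```rust
/// One streamed tool-call fragment, reduced to the fields visible in the
/// output above. Hypothetical type, not the crate's own delta type.
#[derive(Debug, Clone, Default)]
struct ToolCallDelta {
    index: usize,
    id: Option<String>,
    name: Option<String>,
    arguments: String,
}

/// A tool call assembled from its fragments.
#[derive(Debug, Clone, Default, PartialEq)]
struct ToolCall {
    id: String,
    name: String,
    arguments: String,
}

/// Merge deltas in arrival order: `id` and `name` are taken from whichever
/// fragment carries them, and `arguments` fragments are concatenated.
fn merge_deltas(deltas: &[ToolCallDelta]) -> Vec<ToolCall> {
    let mut calls: Vec<ToolCall> = Vec::new();
    for delta in deltas {
        // Grow the list so that `delta.index` is a valid slot.
        if calls.len() <= delta.index {
            calls.resize(delta.index + 1, ToolCall::default());
        }
        let call = &mut calls[delta.index];
        if let Some(id) = &delta.id {
            call.id = id.clone();
        }
        if let Some(name) = &delta.name {
            call.name = name.clone();
        }
        call.arguments.push_str(&delta.arguments);
    }
    calls
}
```

Feeding the sixteen deltas shown above through `merge_deltas` yields one call named `get_weather` whose `arguments` string is the JSON object from the final message.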

MarcusDunn · 2026-01-23T18:57:33Z

I'll try to get around to reviewing this this weekend. Thanks for the PR.

saivishwak · 2026-01-23T20:59:00Z

I'll look into the failures and conflicts. I'm also planning some small refactoring.

Thanks

saivishwak · 2026-01-24T15:45:55Z

Refactored with a minimal API policy and also added an OpenAI-style Chat Completions server example.

curl --location 'localhost:8080/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "model": "dummy-model",
    "messages": [
        {
            "role": "system",
            "content": "You are an helpful assistant Tess"
        },
        {
            "role": "user",
            "content": "What is your name?"
        }
    ]
}'

Output

{"choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hi there! My name is Tess, and I'm here to assist you with any questions or tasks you may have. I'm a helpful assistant, and I'm always happy to lend a hand. What can I help you with today?","role":"assistant"}}],"created":1769269333,"id":"chatcmpl-1769269333","model":"dummy-model","object":"chat.completion","usage":{"completion_tokens":48,"prompt_tokens":26,"total_tokens":74}}

saivishwak · 2026-02-03T11:24:45Z

Hey @MarcusDunn,

Hope you are doing great.

I wanted to know when this PR can be reviewed and included in the next version.

Thanks

MarcusDunn · 2026-02-03T15:03:16Z

llama-cpp-2/src/lib.rs

+        llama_cpp_sys_2::llama_rs_json_schema_to_grammar(schema_cstr.as_ptr(), false, &mut out)
+    };
+
+    let result = (|| {


Is this a lambda for a reason? Can it just be a block?

No strong reason to use an IIFE here; a block would work. The only intent was to scope the Result while ensuring the FFI free happens. I can switch to a block to avoid IIFE confusion.
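
The trade-off between the two forms can be sketched as follows. Inside an immediately invoked closure, `?` returns from the closure only, so cleanup placed after it still runs. In a plain block, `?` would return from the enclosing function and skip that cleanup, so a block version needs an explicit early exit such as a labeled `break` (stable since Rust 1.65). Everything here is hypothetical: `RawOutput` and `free_output` stand in for an FFI buffer and its free function, and none of it is taken from the PR's code.

```rust
use std::cell::Cell;
use std::num::ParseIntError;

/// Hypothetical stand-in for an FFI-allocated buffer that must be freed by
/// hand; `freed` records whether the free call ran.
struct RawOutput<'a> {
    text: &'a str,
    freed: &'a Cell<bool>,
}

/// Stand-in for the C-side free function.
fn free_output(out: &RawOutput<'_>) {
    out.freed.set(true);
}

/// Closure form: `?` returns from the closure only, so the free below
/// always runs.
fn parse_with_closure(out: RawOutput<'_>) -> Result<u32, ParseIntError> {
    let result = (|| -> Result<u32, ParseIntError> {
        let value = out.text.trim().parse::<u32>()?;
        Ok(value * 2)
    })();
    free_output(&out);
    result
}

/// Block form: a labeled block with `break` keeps the same guarantee
/// without a closure. A bare `?` inside the block would return from the
/// whole function and skip the free.
fn parse_with_block(out: RawOutput<'_>) -> Result<u32, ParseIntError> {
    let result = 'parse: {
        let value = match out.text.trim().parse::<u32>() {
            Ok(v) => v,
            Err(e) => break 'parse Err(e),
        };
        Ok(value * 2)
    };
    free_output(&out);
    result
}
```

Both functions free the buffer on the success path and on the error path. A guard type that frees in its `Drop` implementation is another common way to get the same guarantee.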

MarcusDunn · 2026-02-03T15:04:27Z

llama-cpp-2/src/openai.rs

+            )
+        };
+
+        let result = (|| {


Again, unsure about the IIFE.

Same as above; I can change it to a block.

MarcusDunn · 2026-02-03T15:06:08Z

llama-cpp-2/src/sampling.rs


+    /// Try accepting a token from the sampler. Returns an error if the sampler throws.
+    pub fn try_accept(&mut self, token: LlamaToken) -> Result<(), SamplerAcceptError> {
+        let rc = unsafe { llama_cpp_sys_2::llama_rs_sampler_accept(self.sampler, token.0) };


Is this reference counted? If not, please change the name.

The name rc stood for ResultCheck. Yes, it seems to conflict with Rust nomenclature. I will change it to sampler_result instead.
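
The pattern under discussion, turning a C-style status code into a `Result` under a descriptive variable name, might look like the sketch below. It is an illustration only: `AcceptError` and `fake_sampler_accept` are hypothetical, the crate's real `SamplerAcceptError` may have a different shape, and the convention that 0 means success is an assumption.

```rust
use std::fmt;

/// Hypothetical error type; the crate's real `SamplerAcceptError` may differ.
#[derive(Debug, PartialEq, Eq)]
struct AcceptError {
    code: i32,
}

impl fmt::Display for AcceptError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "sampler accept failed with status {}", self.code)
    }
}

impl std::error::Error for AcceptError {}

/// Stand-in for the FFI call: returns 0 on success, non-zero on failure.
/// The 0-means-success convention is an assumption for this sketch.
fn fake_sampler_accept(token: i32) -> i32 {
    if token >= 0 { 0 } else { -1 }
}

/// Translate the C-style status into a `Result`, using a name that cannot
/// be mistaken for Rust's reference-counted `Rc`.
fn try_accept(token: i32) -> Result<(), AcceptError> {
    let sampler_result = fake_sampler_accept(token);
    if sampler_result == 0 {
        Ok(())
    } else {
        Err(AcceptError { code: sampler_result })
    }
}
```

A caller can then use `try_accept(token)?` and handle the failure as an ordinary Rust error instead of inspecting an integer.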

llama-cpp-sys-2/wrapper_common.cpp

MarcusDunn

I'm not qualified to review the cpp unfortunately. See comments. Otherwise looks good.

MarcusDunn · 2026-02-03T15:17:13Z

Reviewed. If tests pass, I'll merge.

saivishwak · 2026-02-03T16:39:52Z

Hi @MarcusDunn ,

I'm checking the review comments on the C++ and will get back to you.

Thanks

MarcusDunn · 2026-02-03T16:52:33Z

As long as Linux and macOS pass, I'm happy. Ideally Windows stays functional, but that's very much on a best-effort basis.

saivishwak · 2026-02-03T16:54:38Z

Okay, sorry for the back and forth. I've pushed fixes for the review comments.

Add Function Calling support and OpenAI Style Conversion

f587e71

saivishwak mentioned this pull request Jan 22, 2026

Incomplete Rust Bindings for llama.cpp Tool Calling Support #864

Open

saivishwak mentioned this pull request Jan 22, 2026

[FEATURE]: Add OpenAI Style Message format and Tool Calling functionality to LlamaCpp backend liquidos-ai/AutoAgents#108

Closed

Add json_schema in template

e69abf5

Refactor and add server example

4d79a28

saivishwak and others added 2 commits January 24, 2026 21:17

Merge branch 'main' into feature/add_tool_support

ef47ac7

Refactor to rename openai methods with suffix

8c8d25d

MarcusDunn reviewed Feb 3, 2026

llama-cpp-sys-2/wrapper_common.cpp

MarcusDunn approved these changes Feb 3, 2026

saivishwak added 2 commits February 3, 2026 21:59

Fix cargo fmt

a01ac85

Try windows build fix

54814d9

Fix review comments

883d8d6

MarcusDunn merged commit 4c197b7 into utilityai:main Feb 3, 2026
4 of 5 checks passed
