Add Function Calling support and OpenAI Style Conversion #908

Merged
MarcusDunn merged 8 commits into utilityai:main from liquidos-ai:feature/add_tool_support
Feb 3, 2026

Conversation

@saivishwak
Contributor

@saivishwak saivishwak commented Jan 22, 2026

  • Added full llama.cpp function‑calling support with OpenAI‑style chat compatibility, matching server behavior end‑to‑end.
  • Exposed granular raw bindings in llama-cpp-sys-2 (chat templates, parsers, grammar triggers, preserved tokens, streaming diffs).
  • Implemented safe Rust wrappers in llama-cpp-2, including OpenAI chat message types and streaming delta parsing.
  • Fixed grammar sampler crashes by aligning sampling flow with llama.cpp (avoid double‑accept).
  • Added new examples/tools.rs showing tool calling.
  • Added new examples/openai_stream.rs showing OpenAI conversion with streaming.
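The double-accept fix mentioned above can be pictured with a toy model. The sketch below is not llama.cpp's sampler and not this crate's API: `ToyGrammar`, `sample`, and `accept` are hypothetical names. It assumes only that the sampling call already advances the grammar state, in which case a second explicit accept of the same token is rejected.

```rust
/// Toy grammar that allows exactly one token sequence.
/// (Hypothetical type for illustration only.)
struct ToyGrammar {
    expected: Vec<u32>,
    pos: usize,
}

impl ToyGrammar {
    fn new(expected: Vec<u32>) -> Self {
        ToyGrammar { expected, pos: 0 }
    }

    /// Advance the grammar by one token. Fails if `token` is not the
    /// next token the grammar expects.
    fn accept(&mut self, token: u32) -> Result<(), String> {
        match self.expected.get(self.pos) {
            Some(&next) if next == token => {
                self.pos += 1;
                Ok(())
            }
            Some(&next) => Err(format!("expected {}, got {}", next, token)),
            None => Err(format!("grammar complete, got {}", token)),
        }
    }

    /// Pick the next allowed token and accept it as part of sampling.
    /// Returns `None` once the sequence is exhausted.
    fn sample(&mut self) -> Option<u32> {
        let token = *self.expected.get(self.pos)?;
        self.accept(token).ok()?;
        Some(token)
    }
}
```

In this model, calling `accept(t)` right after `sample()` returned `t` yields an error, because the grammar has already moved past `t`. Whether the real crash had exactly this shape is not shown in the PR.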

@saivishwak
Contributor Author

Streaming Example with Tools

cargo run --features cuda --release --example openai_stream -- hf-model TheBloke/Llama-2-7B-GGUF llama-2-7b.Q4_K_M.gguf

Output

Streaming deltas:
{"tool_calls":[{"function":{"arguments":"","name":"get_weather"},"id":"cyrkrUxsxp2KRhcU5y0UP5yunfiFwK1f","index":0,"type":"function"}]}
{"tool_calls":[{"function":{"arguments":"{"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\""},"index":0}]}
{"tool_calls":[{"function":{"arguments":"city"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\":"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\""},"index":0}]}
{"tool_calls":[{"function":{"arguments":"Par"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"is"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\","},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\""},"index":0}]}
{"tool_calls":[{"function":{"arguments":"unit"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\":"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\""},"index":0}]}
{"tool_calls":[{"function":{"arguments":"c"},"index":0}]}
{"tool_calls":[{"function":{"arguments":"\""},"index":0}]}
{"tool_calls":[{"function":{"arguments":"}"},"index":0}]}

Final message:
{
  "content": null,
  "role": "assistant",
  "tool_calls": [
    {
      "function": {
        "arguments": "{\"city\":\"Paris\",\"unit\":\"c\"}",
        "name": "get_weather"
      },
      "id": "6jYW4BsvCRuduDwghsKLzgiyTrWsv49Q",
      "type": "function"
    }
  ]
}
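The relationship between the streamed deltas and the final message can be sketched as a fold over fragments: `id` and `name` arrive on the first fragment, and the `arguments` strings are concatenated in order. This is a minimal standalone sketch; `ToolCallDelta`, `ToolCall`, `accumulate`, and `arg` are hypothetical names, not the types this PR adds to llama-cpp-2.

```rust
/// One streamed tool-call fragment, reduced to the fields used here.
/// (Hypothetical type for illustration; not the crate's API.)
#[derive(Debug, Clone, PartialEq)]
struct ToolCallDelta {
    index: usize,
    id: Option<String>,
    name: Option<String>,
    arguments: String,
}

/// A tool call assembled from all fragments that share an index.
#[derive(Debug, Default, Clone, PartialEq)]
struct ToolCall {
    id: String,
    name: String,
    arguments: String,
}

/// Fold fragments into complete tool calls, keyed by `index`.
fn accumulate(deltas: &[ToolCallDelta]) -> Vec<ToolCall> {
    let mut calls: Vec<ToolCall> = Vec::new();
    for delta in deltas {
        // Grow the list so that `delta.index` is a valid slot.
        if calls.len() <= delta.index {
            calls.resize(delta.index + 1, ToolCall::default());
        }
        let call = &mut calls[delta.index];
        // `id` and `name` are only present on some fragments.
        if let Some(id) = &delta.id {
            call.id = id.clone();
        }
        if let Some(name) = &delta.name {
            call.name = name.clone();
        }
        // Argument text is appended in arrival order.
        call.arguments.push_str(&delta.arguments);
    }
    calls
}

/// Convenience constructor for an arguments-only fragment.
fn arg(index: usize, text: &str) -> ToolCallDelta {
    ToolCallDelta {
        index,
        id: None,
        name: None,
        arguments: text.to_string(),
    }
}
```

Feeding the argument fragments from the output above through `accumulate` reproduces the `arguments` string of the final message. The accumulated string is only valid JSON once the last fragment has arrived, so parsing it is best deferred until the stream ends.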

@MarcusDunn
Contributor

I'll try to get around to reviewing this this weekend. Thanks for the PR.

@saivishwak
Contributor Author

I'll look into the failures and conflicts. I'm also planning some small refactoring.

Thanks

@saivishwak
Contributor Author

saivishwak commented Jan 24, 2026

Refactored with a minimal API policy and also added an OpenAI-style Chat Completions server example.

curl --location 'localhost:8080/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
    "model": "dummy-model",
    "messages": [
        {
            "role": "system",
            "content": "You are an helpful assistant Tess"
        },
        {
            "role": "user",
            "content": "What is your name?"
        }
    ]
}'

---- OUTPUT ---

{"choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hi there! My name is Tess, and I'm here to assist you with any questions or tasks you may have. I'm a helpful assistant, and I'm always happy to lend a hand. What can I help you with today?","role":"assistant"}}],"created":1769269333,"id":"chatcmpl-1769269333","model":"dummy-model","object":"chat.completion","usage":{"completion_tokens":48,"prompt_tokens":26,"total_tokens":74}}

@saivishwak
Contributor Author

Hey @MarcusDunn,

Hope you are doing great.

Wanted to know when we can review this PR and get it into the next version.

Thanks

llama_cpp_sys_2::llama_rs_json_schema_to_grammar(schema_cstr.as_ptr(), false, &mut out)
};

let result = (|| {
Contributor

Is this a lambda for a reason? Can it just be a block?

Contributor Author

@saivishwak saivishwak Feb 3, 2026

There's no strong reason to use an IIFE here; a block would work. The only intent was to scope the Result while ensuring the FFI free happens. I can switch to a block to avoid the IIFE confusion.
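The trade-off discussed here can be sketched as follows. Inside a closure, `?` returns from the closure only, so the cleanup after it always runs. Inside a plain block, `?` would return from the enclosing function and skip the cleanup, so the block form below uses combinators instead. `Buffer`, `free_buffer`, and both parse functions are hypothetical stand-ins, not code from this PR.

```rust
/// Stand-in for a buffer that must be released through an FFI call.
/// (Hypothetical type; `freed` makes the cleanup observable.)
struct Buffer {
    bytes: Vec<u8>,
    freed: bool,
}

/// Hypothetical stand-in for the FFI free function.
fn free_buffer(buf: &mut Buffer) {
    buf.freed = true;
}

/// Closure (IIFE) form: `?` exits the closure, not the function,
/// so `free_buffer` runs on both the success and the error path.
fn parse_with_closure(buf: &mut Buffer) -> Result<u32, String> {
    let result = (|| -> Result<u32, String> {
        let text = std::str::from_utf8(&buf.bytes).map_err(|e| e.to_string())?;
        let value = text.trim().parse::<u32>().map_err(|e| e.to_string())?;
        Ok(value)
    })();
    free_buffer(buf);
    result
}

/// Block form: no `?` inside the block, so control always reaches
/// `free_buffer`. Errors are threaded through with `and_then`.
fn parse_with_block(buf: &mut Buffer) -> Result<u32, String> {
    let result = {
        std::str::from_utf8(&buf.bytes)
            .map_err(|e| e.to_string())
            .and_then(|text| text.trim().parse::<u32>().map_err(|e| e.to_string()))
    };
    free_buffer(buf);
    result
}
```

Both forms free the buffer on every path. A drop guard (a type whose `Drop` impl calls the free function) would be a third option, at the cost of an extra type.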

)
};

let result = (|| {
Contributor

Again, unsure about the IIFE.

Contributor Author

Same as above; I can change it to a block.


/// Try accepting a token from the sampler. Returns an error if the sampler throws.
pub fn try_accept(&mut self, token: LlamaToken) -> Result<(), SamplerAcceptError> {
let rc = unsafe { llama_cpp_sys_2::llama_rs_sampler_accept(self.sampler, token.0) };
Contributor

Is this reference counted? If not, please change the name.

Contributor Author

The name rc stood for ResultCheck. Yes, it is confusing given Rust nomenclature. I will change it to sampler_result instead.
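The pattern under review, turning a C-style status code into a Rust `Result` with a descriptively named local, can be sketched in isolation. This is an illustration under stated assumptions: `fake_sampler_accept` and `AcceptError` are hypothetical stand-ins, and the convention that 0 means success is assumed rather than taken from the PR.

```rust
use std::fmt;

/// Hypothetical error type carrying the raw status code.
#[derive(Debug, PartialEq)]
struct AcceptError {
    code: i32,
}

impl fmt::Display for AcceptError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "sampler accept failed with status {}", self.code)
    }
}

impl std::error::Error for AcceptError {}

/// Hypothetical stand-in for the C shim. Assumed convention:
/// 0 on success, non-zero on failure.
fn fake_sampler_accept(token: i32) -> i32 {
    if token >= 0 {
        0
    } else {
        -1
    }
}

/// Wrap the status code in a `Result`. The local is named
/// `sampler_result` so it cannot be mistaken for `std::rc::Rc`.
fn try_accept(token: i32) -> Result<(), AcceptError> {
    let sampler_result = fake_sampler_accept(token);
    if sampler_result == 0 {
        Ok(())
    } else {
        Err(AcceptError { code: sampler_result })
    }
}
```

Keeping the raw code inside the error lets callers log or match on it without the wrapper having to enumerate every failure mode up front.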

Contributor

@MarcusDunn MarcusDunn left a comment

Unfortunately, I'm not qualified to review the C++. See comments; otherwise looks good.

@MarcusDunn
Contributor

> Hey @MarcusDunn,
>
> Hope you are doing great.
>
> Wanted to know when we can review this PR and get it into the next version.
>
> Thanks

Reviewed. If tests pass, I'll merge.

@saivishwak
Contributor Author

Hi @MarcusDunn,

I'm checking the review comments on the C++ and will get back to you.

Thanks

@MarcusDunn
Contributor

As long as Linux and Mac pass, I'm happy. Ideally Windows stays functional, but it's very much on a best-effort basis.

@saivishwak
Contributor Author

> As long as Linux and Mac pass, I'm happy. Ideally Windows stays functional, but it's very much on a best-effort basis.

Okay, sorry for the back and forth. I've pushed fixes for the review comments.

@MarcusDunn MarcusDunn merged commit 4c197b7 into utilityai:main Feb 3, 2026
4 of 5 checks passed