-
Notifications
You must be signed in to change notification settings - Fork 80
feat: Add vLLM tool parsing support in completions #646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Hi @gitlost-murali! Thanks for the PR! I am wondering do you have a real world example you could test this on? (and add to Test Plan) |
daniellepintz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like unit tests are failing as well
ca2799b to
883494e
Compare
|
Hi @daniellepintz , Thanks! I fixed the tests. Specifically, I added an end-end integration test which reflects multi-turn chat usage. In unit tests, I use mocked responses to test the actual parsing functionality. W.r.t test plan, I added integration tests with real world use case and mentioned how to run it along with the expected flow. Is that what is expected? Happy to change the test plan section |
…te test accordingly.
…when tool parser is enabled
…zer stub and mocked responses
883494e to
2a6face
Compare
|
Hi @gitlost-murali, thanks for the PR. Unfortunately, similar to the other PRs, we prefer to not make changes to generator.py until #669 is resolved :/ So let's check back in at that point |
Description
This pull request adds support for parsing and extracting structured tool calls from model outputs in the generator, making it possible to handle tool-augmented chat completions. It introduces a configurable tool parser, updates the
Completiondata model to include tool call information, and adds comprehensive unit tests for both tool parsing and non-tool parsing scenarios.What this does
When users send requests with tool parsing enabled like this:
The completion object will now have:
completion.tool_calls- List of parsed tool calls from the model outputcompletion.has_tool_calls- Boolean property to check if any tool calls existcompletion.content- The text content excluding tool call tagsTo enable this, set tool_call_parser="hermes" when creating the policy.
Configuration
To enable tool parsing, add
tool_call_parserto your policy configuration in the YAML file:Test Plan
Unit Tests
tests/unit_tests/test_generator.py)Integration Tests (Real-World Example)
Qwen/Qwen3-0.6B):pytest tests/integration_tests/test_tool_parsing.py -v -sTests:
test_tool_parsing_multi_turn- End-to-end tool calling workflow:calculator(equation="123 + 456")withhermesparsertest_content_without_tool_calls- Verifies non-tool requests:tool_calls == []andcontent == text