
Conversation

@gitlost-murali (Contributor) commented Dec 14, 2025

Description

This pull request adds support for parsing and extracting structured tool calls from model outputs in the generator, making it possible to handle tool-augmented chat completions. It introduces a configurable tool parser, updates the Completion data model to include tool call information, and adds comprehensive unit tests for both tool parsing and non-tool parsing scenarios.

What this does

When a user sends a request with tool parsing enabled, like this:

formatted_request = tokenizer.apply_chat_template(
    as_chat,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)

response = await policy.generate.route(formatted_request)
completion = response[0]
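
Here, tools is a list of function schemas in the OpenAI-style format that apply_chat_template accepts; the calculator entry below is a hypothetical example for illustration, not part of this PR:

# Hypothetical tool schema for illustration (OpenAI-style function-calling format).
tools = [
    {
        "type": "function",
        "function": {
            "name": "calculator",
            "description": "Evaluate a simple arithmetic expression.",
            "parameters": {
                "type": "object",
                "properties": {
                    "equation": {
                        "type": "string",
                        "description": "Expression such as '123 + 456'.",
                    }
                },
                "required": ["equation"],
            },
        },
    }
]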

The completion object will now have:

  • completion.tool_calls - List of parsed tool calls from the model output

  • completion.has_tool_calls - Boolean property to check if any tool calls exist

  • completion.content - The text content excluding tool call tags

To enable this, set tool_call_parser="hermes" when creating the policy.
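
Downstream code can then branch on these fields. A minimal sketch (the print statements are illustrative only, and the exact shape of each tool call object is whatever the configured parser emits):

completion = response[0]

if completion.has_tool_calls:
    # The model asked for one or more tools; hand each parsed call to your executor.
    for tool_call in completion.tool_calls:
        print("tool call requested:", tool_call)
else:
    # No tool call tags were found, so content is just the plain text reply.
    print(completion.content)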

Configuration

To enable tool parsing, add tool_call_parser to your policy configuration in the YAML file:

# Policy configuration
policy:
  engine_args:
    model: ${model}
    tensor_parallel_size: 2
  sampling_params:
    n: ${group_size}
    max_tokens: ${max_res_tokens}
  tool_call_parser: "hermes"  # Enable tool call parsing (optional)
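
For context, the Hermes-style format wraps a JSON object in <tool_call> tags in the raw model output. Below is a minimal, self-contained extraction sketch; the regex-based helper is a hypothetical stand-in for illustration, not the parser this PR wires in via tool_call_parser:

import json
import re

# Hypothetical stand-in for the configured parser: pull <tool_call>...</tool_call>
# blocks out of the raw generation and return (tool_calls, remaining_content).
TOOL_CALL_PATTERN = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def extract_hermes_tool_calls(text: str):
    tool_calls = [json.loads(match) for match in TOOL_CALL_PATTERN.findall(text)]
    content = TOOL_CALL_PATTERN.sub("", text).strip()
    return tool_calls, content

raw = 'Sure.\n<tool_call>\n{"name": "calculator", "arguments": {"equation": "123 + 456"}}\n</tool_call>'
calls, content = extract_hermes_tool_calls(raw)
# calls   -> [{"name": "calculator", "arguments": {"equation": "123 + 456"}}]
# content -> "Sure."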

Test Plan

Unit Tests

  • Added unit tests covering both scenarios, with and without tool parsing, verifying that tool call information is correctly extracted and populated (tests/unit_tests/test_generator.py).

Integration Tests (Real-World Example)

  • Added integration tests that validate the full tool-calling workflow with an actual model (Qwen/Qwen3-0.6B):

pytest tests/integration_tests/test_tool_parsing.py -v -s

Tests:

  1. test_tool_parsing_multi_turn - End-to-end tool calling workflow (a condensed sketch follows this list):

    • User asks "Calculate 123 + 456"
    • Model generates tool call
    • Tool call is extracted by the hermes parser: calculator(equation="123 + 456")
    • Calculator executes, result fed back to model
    • Model returns final answer containing "579"
  2. test_content_without_tool_calls - Verifies non-tool requests:

    • User asks "What is the capital of France?"
    • Confirms tool_calls == [] and content == text
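
A condensed sketch of that multi-turn loop, assuming the same policy, tokenizer, and tools objects as in the description above; the run_calculator helper and the tool-result message structure are illustrative, not the literal test code:

# Condensed, hypothetical sketch of the multi-turn flow (not the literal test).
async def tool_parsing_multi_turn_sketch(policy, tokenizer, tools):
    def run_calculator(equation: str) -> str:
        # Stand-in tool implementation for illustration only.
        return str(eval(equation))

    messages = [{"role": "user", "content": "Calculate 123 + 456"}]

    # Turn 1: the model should emit a hermes-format tool call.
    prompt = tokenizer.apply_chat_template(
        messages, tools=tools, tokenize=False, add_generation_prompt=True
    )
    completion = (await policy.generate.route(prompt))[0]
    assert completion.has_tool_calls  # e.g. calculator(equation="123 + 456")

    # Turn 2: feed the tool result back and ask for the final answer.
    # (The exact message structure for tool results depends on the chat template.)
    messages.append({"role": "assistant", "content": completion.content})
    messages.append({"role": "tool", "content": run_calculator("123 + 456")})
    prompt = tokenizer.apply_chat_template(
        messages, tools=tools, tokenize=False, add_generation_prompt=True
    )
    final = (await policy.generate.route(prompt))[0]
    assert "579" in final.content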

@meta-cla bot added the CLA Signed label (managed by the Meta Open Source bot) on Dec 14, 2025
@daniellepintz (Contributor)

Hi @gitlost-murali! Thanks for the PR! I am wondering, do you have a real-world example you could test this on? (and add to the Test Plan)

@daniellepintz (Contributor) left a comment:

Looks like unit tests are failing as well.

@gitlost-murali (Contributor, Author)

Hi @daniellepintz, thanks! I fixed the tests.

Specifically, I added an end-to-end integration test that reflects multi-turn chat usage. The unit tests use mocked responses to exercise the actual parsing functionality.

Regarding the test plan, I added integration tests with a real-world use case and documented how to run them along with the expected flow. Is that what you had in mind? Happy to change the test plan section.

@daniellepintz (Contributor)

Hi @gitlost-murali, thanks for the PR. Unfortunately, as with the other PRs, we prefer not to make changes to generator.py until #669 is resolved :/ So let's check back in at that point.
