next_turn_messages.pop() in _separate_tool_calls corrupts conversation history when multiple MCP tool calls require approval

*This bug was identified by Claude.*

## Summary

In `_separate_tool_calls()`, `next_turn_messages.pop()` is called inside the loop over tool calls (lines 721 and 725 of `streaming.py`). When the model returns multiple MCP tool calls that require approval in a single turn, `.pop()` fires once per tool call — but only the first pop removes the intended assistant message. Subsequent pops silently destroy unrelated messages from the conversation history.

With 2 tool calls needing approval, the user message is eaten. With 3+, the system message is also removed (or `IndexError` if the list is exhausted).

## Affected code

https://github.com/llamastack/llama-stack/blob/main/src/llama_stack/providers/inline/responses/builtin/responses/streaming.py#L711-L725

```python
for tool_call in choice.message.tool_calls:
    # ... other branches ...
    else:
        if self._approval_required(tool_call.function.name):
            approval_response = self.ctx.approval_response(...)
            if approval_response:
                if approval_response.approve:
                    non_function_tool_calls.append(tool_call)
                else:
                    next_turn_messages.pop()   # <-- called per tool call
            else:
                approvals.append(tool_call)
                next_turn_messages.pop()       # <-- called per tool call
```

## Reproduction

This script mirrors the exact control flow from `_separate_tool_calls()` (lines 674-729), using simplified stand-ins for the Pydantic models. It simulates a model response containing 3 MCP tool calls that all require approval with no pre-approval response provided:

```python
from dataclasses import dataclass


@dataclass
class ToolCallFunction:
    name: str
    arguments: str = "{}"


@dataclass
class ToolCall:
    id: str
    function: ToolCallFunction


@dataclass
class AssistantMessage:
    content: str | None
    tool_calls: list[ToolCall] | None

    def __repr__(self):
        names = [tc.function.name for tc in (self.tool_calls or [])]
        return f"AssistantMessage(tool_calls={names})"


@dataclass
class ChoiceMessage:
    content: str | None = None
    tool_calls: list[ToolCall] | None = None


@dataclass
class Choice:
    message: ChoiceMessage


def _approval_required(tool_name, mcp_tool_to_server):
    return tool_name in mcp_tool_to_server


def _separate_tool_calls_BUGGY(choices, messages, mcp_tool_to_server):
    """Mirrors the current (buggy) code in streaming.py."""
    approvals = []
    next_turn_messages = messages.copy()

    for choice in choices:
        next_turn_messages.append(
            AssistantMessage(content=choice.message.content, tool_calls=choice.message.tool_calls)
        )

        if choice.message.tool_calls:
            for tool_call in choice.message.tool_calls:
                if _approval_required(tool_call.function.name, mcp_tool_to_server):
                    approval_response = None  # no pre-approval
                    if approval_response:
                        pass
                    else:
                        approvals.append(tool_call)
                        next_turn_messages.pop()  # BUG: called per tool call

    return approvals, next_turn_messages


messages = [
    "SystemMessage('You are a helpful assistant')",
    "UserMessage('What is the weather in NYC, London, and Tokyo?')",
]

tool_calls = [
    ToolCall(id="call_1", function=ToolCallFunction(name="get_weather", arguments='{"city":"NYC"}')),
    ToolCall(id="call_2", function=ToolCallFunction(name="get_weather", arguments='{"city":"London"}')),
    ToolCall(id="call_3", function=ToolCallFunction(name="get_weather", arguments='{"city":"Tokyo"}')),
]

choices = [Choice(message=ChoiceMessage(content=None, tool_calls=tool_calls))]
mcp_tools = {"get_weather": "mcp-server-config"}

approvals, result = _separate_tool_calls_BUGGY(choices, messages, mcp_tools)

print(f"Input messages:  {messages}")
print(f"Output messages: {result}")
print(f"Approvals:       {len(approvals)}")
```

Output:

```
Input messages:  ["SystemMessage('You are a helpful assistant')", "UserMessage('What is the weather in NYC, London, and Tokyo?')"]
Output messages: []
Approvals:       3

PROBLEM: started with 2 messages, ended with 0.
The user message and system message were destroyed.
```

- pop # 1 removes the `AssistantMessage` (correct — this is what the code intended)
- pop # 2 removes the `UserMessage` (history corruption)
- pop # 3 removes the `SystemMessage` (entire conversation history wiped)

## Impact

- **Single tool call needing approval**: works by accident (one pop, correct target)
- **Two tool calls**: silently destroys the user's most recent message — subsequent turns lose context
- **Three+ tool calls**: can crash with `IndexError` or wipe the entire conversation history

This compounds with #5287 (now fixed by #5288), which caused `ApprovalFilter.never` lists to be ignored, making *every* MCP tool require approval and dramatically increasing the likelihood of hitting this multi-pop path.

## Expected behavior

The assistant message should be removed from `next_turn_messages` **at most once** per choice, regardless of how many tool calls within that choice require approval. A flag should track whether removal is needed and execute a single `.pop()` after the inner loop completes.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

next_turn_messages.pop() in _separate_tool_calls corrupts conversation history when multiple MCP tool calls require approval #5301

Summary

Affected code

Reproduction

Impact

Expected behavior

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

next_turn_messages.pop() in _separate_tool_calls corrupts conversation history when multiple MCP tool calls require approval #5301

Description

Summary

Affected code

Reproduction

Impact

Expected behavior

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions