feat(3/n): agent resume_turn #1194
Conversation
        request.session_id, request.turn_id
    )
    tool_execution_step = ToolExecutionStep(
        step_id=(in_progress_tool_call_step.step_id if in_progress_tool_call_step else str(uuid.uuid4())),
shouldn't you error out if there is no step found?
Agree it's a bit confusing (I had to play around with the react_agent app). Let me add a comment here.
We do not error out here because in the case of ReActAgent (with a custom tool parser), the server does not output a tool_execution step_start, so we don't have the step. However, we should still allow the turn to be resumed with the ToolCallResponse in this case, because the server outputs a message (no ToolCall) --> the parser parses it into a ToolCall --> the client executes the ToolCall --> resume turn.
@yanxi0830 hm yeah this is confusing and broke my mental model.
in that case, the server wouldn't have sent a step_start event either
yeah, in this case (of custom tool parsers), the server wouldn't have sent a step_start event, but we would still like to log the ToolExecutionStep in resume, as a client tool call did indeed happen.
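To make the fallback in the diff above concrete, here is a minimal sketch of the behaviour being discussed. The helper name `make_tool_execution_step` and the dict return shape are illustrative, not the actual llama-stack API; `in_progress_tool_call_step` is the name from the diff.

```python
import uuid
from datetime import datetime


def make_tool_execution_step(in_progress_tool_call_step, tool_responses):
    # Normal tool-calling path: the server emitted a step_start, so reuse
    # its step_id. Custom tool parser path (e.g. ReActAgent): no step_start
    # was ever sent, so mint a fresh id instead of erroring out -- the
    # client tool call did happen and should still be logged as a step.
    step_id = (
        in_progress_tool_call_step.step_id
        if in_progress_tool_call_step
        else str(uuid.uuid4())
    )
    return {
        "step_id": step_id,
        "tool_responses": tool_responses,
        "started_at": datetime.now(),
    }
```

This is why erroring out on a missing step would break resume for agents that parse tool calls client-side.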
llama_stack/providers/inline/agents/meta_reference/agent_instance.py
    # get the steps from the turn id
    steps = []
    if len(turns) > 0:
shouldn't this be true always?
Yes, but just in case the kvstore gets destroyed.
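An illustrative sketch of the defensive lookup under discussion: for a valid `turn_id` the turns list should always be non-empty, but the guard keeps resume from crashing if the kvstore was wiped. The function name and the dict-based turn shape are assumptions for this example.

```python
def get_turn_steps(turns, turn_id):
    # Defensive: turns should be non-empty for any valid turn_id, but if
    # the kvstore was destroyed we fall back to an empty step list rather
    # than raising.
    steps = []
    if len(turns) > 0:
        for turn in turns:
            if turn.get("turn_id") == turn_id:
                steps = turn.get("steps", [])
    return steps
```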
        tool_responses=[],
        started_at=datetime.now(),
    ),
)
I think we also need to save the n_iter of inference so that we respect self.agent_config.max_infer_iters, which, by the way, we don't currently respect when a custom tool is used.
llamastack/llama-stack-client-python#158
The max_infer_iters is still used here to track the number of times we call resume.
We should make it so that the total number of inference calls doesn't exceed max_infer_iters. Currently we could have max_infer_iters^2 inference calls in the worst case (each resume could run up to max_infer_iters inferences).
Ah yeah, this will be max_infer_iters^2; I think that's the current behaviour when creating a second turn instead of resuming, too.
Will need to think a bit about how we can keep track of the same max_infer_iters across the client & server boundary in a follow-up. E.g. we need to introduce extra params to communicate the number of iters between client and server, and whether we should use this max_infer_iters on the client side.
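One possible shape for that follow-up, sketched under assumptions: carry an iteration counter in the resume request so each resume works against the remaining budget rather than a fresh one. `n_iter_so_far` and `remaining_iters` are hypothetical names, not part of the current llama-stack API.

```python
def remaining_iters(max_infer_iters: int, n_iter_so_far: int) -> int:
    # Without carrying n_iter_so_far across the client/server boundary,
    # every resume starts with a fresh budget of max_infer_iters, giving
    # max_infer_iters**2 inferences in the worst case. Passing the count
    # back on resume caps the total at max_infer_iters.
    return max(0, max_infer_iters - n_iter_so_far)
```

The server would include `n_iter_so_far` in the turn state it returns, and the client would echo it back on resume so the budget is shared rather than reset.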
SG for following up on this. Thanks!
hardikjshah
left a comment
looks good, but left a couple of comments (could be follow-ups)
llama_stack/providers/inline/agents/meta_reference/agent_instance.py
    )

    output_message = None
    async for chunk in self.run(
This logic seems to overlap significantly with the impl of create_and_execute_turn, right?
Maybe as a follow-up, but it would be good to consolidate the code so that the logic of running a turn is in one place.
Yes, doing that for 0.1.5 (so that the existing create_and_execute_turn is fully verified and we can deprecate the allow_resume_turn flag): #1212
## What does this PR do?

Client changes.

- #157
- Server change: llamastack/llama-stack#1194

## Test Plan

- See llamastack/llama-stack#1194
- llama-stack-apps

<img width="1080" alt="image" src="https://github.com/user-attachments/assets/fb4cf70d-1c3d-423d-ac75-432c2a3505d7" />

Output Before: we have 2 `turn_id`s with 2 turns.
Output After: we have 1 `turn_id`; the final turn has all 3 steps.

Telemetry:

