Description
Originally posted by @yanxi0830 on 2025-02-05 22:16 UTC
Category: Ideas
Status: Closed on 2025-02-13 18:29 UTC
-- moving over discussions from #955
Problem
We want to standardize the steps necessary for users to build a ReACT agent that can dynamically interleave generating thoughts with taking task-specific actions.
The current agent orchestration loop requires ad hoc logic for intercepting agent outputs and parsing output messages to fit a ReACT framework (example). This proposes changes to the LlamaStack client SDKs and server APIs for better ergonomics when building a ReACT agent.
Proposed Solution
We want to have the flexibility to configure custom prompts and custom output parsers in agent loop execution.
- Introduce the notion of an OutputParser for parsing outputs from ReACT prompting into a ToolCall agent output.
Our current agent loop with custom tool calls will loop and call tools until there are no more tool responses. In the ReACT framework, an action output typically maps to a tool call. We can reuse the agent loop, adding parsing logic right after the agent output to populate the "action" into a ToolCall to enable ReACT.
- Client Agent SDK [RFC] Client Agent SDK OutputParser llama-stack-client-python#121
- We need to incorporate similar output parsing on the server for ReACT with built-in tools.
- For further generalization: RFC for high-level concept categorization of what defines an Agent Type.
Current Agent Types Summary
- An `Agent` instance is defined by an `AgentConfig`
- An `Agent` instance can be categorized into several classes:
  - Vanilla
    - keeps track of conversation loop history
  - RAG
    - has access to the "builtin::rag" tool
    - we currently force-retrieve context first by explicitly calling the memory tool
  - ToolCalling
    - can be configured to use "builtin::websearch" / "builtin::code_interpreter" / "builtin::wolfram_alpha" / "builtin::filesystem" / etc.
  - ReACT
    - requires a custom output parser to execute "action" outputs as tool calls
| Agent Type | Agent Config Template | System Prompt | Output Parser | Orchestration | Note |
| --- | --- | --- | --- | --- | --- |
| Vanilla | raw `Agent` | instruction | | Conversation Loop | |
| Tool Calling | `toolgroups=["builtin::websearch", "builtin::code_interpreter"]` | default tool prompt + instruction | `decode_assistant_message_from_content` | Loop until there are no more tool calls; pass each tool response as the next turn (built-in and custom tools differ) | |
| RAG | `toolgroups=(builtin::rag, args: {vector_db_ids})`, `force_retrieval=?` | default tool prompt + instruction | | Retrieve context from the RAG tool before calling generation | We should add the ability to force retrieval, plus auto retrieval via model tool calling |
| ReACT | `instructions=react_prompt`, `output_parser=react_output_parser` | ReACT prompting (thought-action-answer) | Parse `action` / `action_input` into `ToolCall` as part of the Agent Response | Loop until there are no more tool calls; pass each tool response as the next turn | |
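The ReACT orchestration column above ("loop until there are no more tool calls, pass each tool response as the next turn") can be sketched as a plain loop. Here `generate`, `execute_tool`, and `output_parser` are hypothetical stand-ins for the model call, the tool runtime, and the proposed OutputParser; the message shape is illustrative, not the SDK's actual types.

```python
def run_react_turn(user_message, generate, execute_tool, output_parser, max_steps=5):
    """Loop until the parser finds no more tool calls, feeding each
    tool response back to the model as the next turn."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        completion = generate(messages)
        messages.append({"role": "assistant", "content": completion})
        tool_call = output_parser(completion)
        if tool_call is None:
            return completion  # final answer: stop the loop
        result = execute_tool(tool_call)
        # Pass the tool response as the next turn, mirroring the
        # current custom-tool agent loop.
        messages.append({"role": "tool", "content": result})
    return messages[-1]["content"]
```

This keeps the existing loop shape intact; the only ReACT-specific step is the `output_parser` call that converts the free-text "action" into a tool invocation.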
Proof of Concept Implementation
- llama-stack-client-python: [RFC] Client Agent SDK OutputParser llama-stack-client-python#121
- llama-stack-apps: feat: ReACT agent example llama-stack-apps#166
- llama-stack: [RFC] Agent Categorization + ReACT Agent OutputParser #955
- llama-models: [RFC] response output type meta-llama/llama-models#272
Migrated from discussion #975: https://github.com/llamastack/llama-stack/discussions/975