A multi-agent Retrieval-Augmented Generation (RAG) system with Agent2Agent communication, built with LangGraph. The pipeline uses three specialized agents (Researcher, Summarizer, and Answer) that work in sequence to answer user questions from a document corpus.
- Researcher Agent: Retrieves relevant document chunks from a vector store (Chroma) using semantic search.
- Summarizer Agent: Summarizes the retrieved documents into a concise context.
- Answer Agent: Generates a final answer to the user's question using the summarized context.
Communication between agents follows a standardized Agent2Agent (A2A) protocol for request/response messages.
```
User Query → [Retrieve] → [Summarize] → [Answer] → Final Answer
                 ↓             ↓            ↓
             Chroma DB      Summary    LLM Answer
```
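The linear flow above can be sketched without LangGraph as three functions chained over a shared state dict. This is a simplification of the actual `StateGraph` wiring; the retrieval and LLM calls are stubbed out with placeholders:

```python
def retrieve(state: dict) -> dict:
    # Researcher node: in the real system this queries the Chroma vector store.
    state["documents"] = [f"chunk about {state['question']}"]
    return state

def summarize(state: dict) -> dict:
    # Summarizer node: in the real system this calls the LLM to condense the chunks.
    state["summary"] = " ".join(state["documents"])
    return state

def answer(state: dict) -> dict:
    # Answer node: in the real system this prompts the LLM with the summary.
    state["answer"] = f"Based on context: {state['summary']}"
    return state

state = {"question": "what is ML?"}
for node in (retrieve, summarize, answer):  # same order as the graph edges
    state = node(state)
```

Each node only reads and writes fields of the shared state, which is exactly what makes the sequence easy to express as a LangGraph `StateGraph`.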
- State: Shared workflow state is defined in `state.py` (`MultiAgentState`).
- Workflow: Orchestrated in `main.py` via a LangGraph `StateGraph` with nodes: `retrieve` → `summarize` → `answer`.
- Agents: Implemented in `agent.py` with a common `BaseAgent` and A2A message handling from `a2a_protocol.py`.
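The shared state can be sketched as a `TypedDict` along these lines (field names beyond `question` are assumptions inferred from the workflow; see `state.py` for the real definition):

```python
from typing import List, TypedDict

class MultiAgentState(TypedDict, total=False):
    """Shared state passed between the retrieve, summarize, and answer nodes."""
    question: str         # the user's query
    documents: List[str]  # chunks returned by the Researcher agent
    summary: str          # condensed context from the Summarizer agent
    answer: str           # final response from the Answer agent
    status: str           # workflow status, e.g. "complete"

# Each node reads the fields it needs and writes its own output back.
state: MultiAgentState = {"question": "What is machine learning?"}
state["documents"] = ["Machine learning is a subfield of AI..."]
```

Using `total=False` lets the state start with only the question and accumulate fields as each node runs.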
multi_agent_rag/
├── README.md # This file
├── main.py # Entry point, workflow definition, RAG setup
├── agent.py # ResearcherAgent, SummarizerAgent, AnswerAgent
├── state.py # MultiAgentState (TypedDict)
├── a2a_protocol.py # AgentMessage format and create_message()
├── documents/ # Source .txt documents for RAG
│ ├── artificial_intelligence.txt
│ ├── natural_language_processing.txt
│ ├── robotics.txt
│ ├── data_science.txt
│ ├── neural_networks.txt
│ ├── computer_vision.txt
│ └── machine_learning.txt
└── chroma_db/ # Persisted vector store (created at runtime)
- Python 3.x
- Ollama running locally with the `llama3` model (and optionally `nomic-embed-text` for embeddings)
From the project root, install dependencies:
```
pip install -r requirements.txt
```

Key dependencies: `langgraph`, `langchain`, `langchain-ollama`, `chromadb`, `langchain-community`, `langchain-text-splitters`, `python-dotenv`.
- Install dependencies (see above).
- Ensure Ollama is running and you have pulled `llama3` (chat model) and `nomic-embed-text` (embeddings).
- Optional: add a `.env` file in the project root if you need to configure environment variables.
From the project root (`RAG_Learning`):

```
python -m multi_agent_rag.main
```

Or from inside `multi_agent_rag/`:

```
python main.py
```

On startup, the system:

- Loads `.txt` files from `multi_agent_rag/documents/`.
- Splits them into chunks (500 chars, 50 overlap) and builds a Chroma vector store in `multi_agent_rag/chroma_db/`.
- Starts an interactive loop.
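The chunking step (500-character chunks with 50-character overlap) presumably uses a LangChain text splitter, but the sliding-window idea itself can be illustrated with a plain-Python sketch:

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Naive fixed-window splitter: each chunk starts (chunk_size - overlap)
    characters after the previous one, so adjacent chunks share `overlap` chars."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# 1200 chars with a step of 450 → chunks starting at 0, 450, and 900.
chunks = split_text("x" * 1200)
```

The real splitter additionally tries to break on sentence and paragraph boundaries rather than at fixed character offsets; the overlap exists so that a fact straddling a boundary appears intact in at least one chunk.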
Example:

```
Your question: What is machine learning?

Answer:
[... model's answer based on retrieved and summarized context ...]

Workflow Status: complete
Documents Retrieved: 5
```
Type `quit`, `exit`, or `q` to exit.
Messages between agents use the format in `a2a_protocol.py`:

- Fields: `from_agent`, `to_agent`, `message_type` (request/response/notification/error), `task`, `data`, `timestamp`, `message_id`, `correlation_id`.
- Helpers: `create_message()` builds a properly formatted `AgentMessage`.
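A minimal sketch of what `create_message()` might produce, using only the fields listed above (the actual class in `a2a_protocol.py` may differ in details):

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Optional

@dataclass
class AgentMessage:
    from_agent: str
    to_agent: str
    message_type: str  # "request" | "response" | "notification" | "error"
    task: str
    data: Dict[str, Any]
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
    message_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    correlation_id: Optional[str] = None  # links a response back to its request

def create_message(from_agent: str, to_agent: str, message_type: str,
                   task: str, data: Dict[str, Any],
                   correlation_id: Optional[str] = None) -> AgentMessage:
    """Build a properly formatted AgentMessage; id and timestamp are filled in."""
    return AgentMessage(from_agent, to_agent, message_type, task, data,
                        correlation_id=correlation_id)

msg = create_message("researcher", "summarizer", "request",
                     "summarize", {"documents": ["chunk 1", "chunk 2"]})
```

Generating `message_id` and `timestamp` automatically keeps every message uniquely traceable, while `correlation_id` lets a response reference the `message_id` of the request it answers.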
Agents extend `BaseAgent` and implement `process_message()` to handle incoming A2A messages, using `send_message()` to reply.
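The agent pattern can be sketched roughly as follows (method signatures and the trivial summarization logic are assumptions for illustration; see `agent.py` for the real implementation, which calls the LLM):

```python
from typing import Optional

class BaseAgent:
    def __init__(self, name: str):
        self.name = name

    def send_message(self, to_agent: str, message_type: str, task: str,
                     data: dict, correlation_id: Optional[str] = None) -> dict:
        # Build an A2A-formatted message addressed to another agent.
        return {"from_agent": self.name, "to_agent": to_agent,
                "message_type": message_type, "task": task, "data": data,
                "correlation_id": correlation_id}

    def process_message(self, message: dict) -> dict:
        raise NotImplementedError  # each concrete agent handles its own tasks

class SummarizerAgent(BaseAgent):
    def process_message(self, message: dict) -> dict:
        # In the real agent this step prompts the LLM; here we just join chunks.
        docs = message["data"]["documents"]
        summary = " ".join(docs)
        return self.send_message(message["from_agent"], "response", "summary",
                                 {"summary": summary},
                                 correlation_id=message.get("message_id"))

agent = SummarizerAgent("summarizer")
reply = agent.process_message({"from_agent": "researcher", "message_id": "m1",
                               "data": {"documents": ["AI is broad.", "ML learns."]}})
```

Replying via `send_message()` with the request's `message_id` as the `correlation_id` is what lets the caller match responses to the requests it sent.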
- Documents directory: default is `./multi_agent_rag/documents`. If missing, a sample document is created.
- Chroma persist path: `./multi_agent_rag/chroma_db`.
- LLM: `ChatOllama(model="llama3", temperature=0)` in `main.py`.
- Embeddings: `OllamaEmbeddings(model="nomic-embed-text")`.
- Retrieval: top `k=5` chunks per query; chunk size 500, overlap 50.
You can change these in `main.py` and `agent.py` as needed.
Part of the RAG Learning project. Use and modify as needed for learning and experimentation.