A research platform where multiple specialized AI agents work together to conduct thorough research, verify facts, synthesize information, and generate well-structured reports with proper citations. Think of it as having a team of research assistants, each with their own expertise, collaborating on your research questions.
I built this platform to demonstrate how different agentic design patterns can work together. Here's what makes it tick:
- Multi-Agent Architecture: Specialized agents (Researcher, Fact-Checker, Synthesizer, Evaluator) work together
- RAG System: Document retrieval and knowledge base integration
- Tool Use: Web search, PDF parsing, citation extraction
- Guardrails: Fact-checking and verification
- Evaluation: Quality scoring and assessment
- Routing: Intelligent research strategy determination
- Memory: Session management and context retention
- Multi-Source Research: Combines web search and document retrieval
- Fact-Checking: Automatic verification and cross-referencing
- Report Generation: Structured, well-formatted research reports
- Quality Evaluation: Multi-dimensional quality scoring
- Citation Management: Automatic citation tracking and formatting
- Session Management: Save and continue research sessions
- Document Management: Upload and manage PDF documents
- Streamlit Web UI: Easy-to-use interface
- Document Upload: Drag-and-drop PDF uploads
- Research History: View and manage past research sessions
- Export Options: Export reports and citations
- Python 3.8 or higher
- Gemini API key
- (Optional) Tavily API key or Serper API key for web search
Getting started is pretty straightforward:
- Navigate to the project directory:
  cd research_platform

- Install the dependencies:

  pip install -r requirements.txt

- Set up your environment variables:

  cp .env.example .env

  Then edit .env and add your API keys. You'll need at least the Gemini key:

  GEMINI_API_KEY=your_gemini_api_key_here
  TAVILY_API_KEY=your_tavily_api_key_here   # Optional - for web search
  SERPER_API_KEY=your_serper_api_key_here   # Optional - alternative to Tavily

- Start the application:

  streamlit run frontend/streamlit_app.py

  The app will automatically open in your browser at http://localhost:8501. If it doesn't, just navigate there manually.
- Enter a research query in the Research page
- Choose options: Enable/disable web search and document search
- Click "Start Research" and wait for the agents to complete their work
- Review the report with quality scores and citations
- Export the report or citations as needed
- Go to Documents page
- Upload PDF files using the file uploader
- Click "Load Documents into RAG System"
- Documents are now available for research queries
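Under the hood, loading a document into a RAG system typically means splitting its text into overlapping chunks before embedding. Here's a minimal sketch of that step; the function name, chunk sizes, and overlap are illustrative assumptions, not the platform's actual API:

```python
from typing import List

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> List[str]:
    """Split text into overlapping chunks so retrieval can return focused passages."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Each chunk would then be embedded and added to the FAISS index.
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.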
- Go to History page to view past research sessions
- Load a previous session to review or continue research
- Delete sessions you no longer need
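Sessions live on disk under data/sessions/. A plausible persistence scheme (the JSON format, file naming, and field names here are assumptions for illustration, not the platform's real schema) looks like this:

```python
import json
from pathlib import Path

def save_session(session_dir: Path, session_id: str, record: dict) -> Path:
    """Write one research session as a JSON file named after its id."""
    session_dir.mkdir(parents=True, exist_ok=True)
    path = session_dir / f"{session_id}.json"
    path.write_text(json.dumps(record, indent=2))
    return path

def load_session(session_dir: Path, session_id: str) -> dict:
    """Read a previously saved session back into a dict."""
    return json.loads((session_dir / f"{session_id}.json").read_text())
```

One file per session keeps deletes trivial: removing a session is just removing its file.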
User Query
↓
Router → Determines research strategy
↓
Researcher Agent → Gathers information (Web + RAG)
↓
Fact-Checker Agent → Verifies facts and cross-references
↓
Synthesizer Agent → Creates structured report
↓
Evaluator Agent → Assesses quality
↓
Final Report + Citations + Quality Scores
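The flow above can be sketched as a simple sequential pipeline where each stage reads the accumulated state and adds its own output. The lambda stubs below stand in for the real LLM-backed agents:

```python
def run_pipeline(query, stages):
    """Pass the evolving state through each agent stage in order."""
    state = {"query": query}
    for name, agent in stages:
        state[name] = agent(state)
    return state

# Stub agents standing in for the real LLM-backed implementations.
stages = [
    ("strategy", lambda s: "web+rag"),                          # Router
    ("findings", lambda s: f"findings on {s['query']}"),        # Researcher
    ("verified", lambda s: s["findings"]),                      # Fact-Checker
    ("report",   lambda s: f"Report: {s['verified']}"),         # Synthesizer
    ("scores",   lambda s: {"accuracy": 0.9}),                  # Evaluator
]

result = run_pipeline("example question", stages)
```

Because each agent only sees the shared state dict, stages can be added, removed, or reordered without changing the others.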
- Backend: Agent implementations, core systems, tools
- Frontend: Streamlit UI and components
- Data: Document storage, sessions, vector store
research_platform/
├── backend/
│ ├── agents/ # Agent implementations
│ ├── core/ # Core systems (RAG, Router, Memory, Citations)
│ ├── tools/ # Tools (Web Search, PDF Parser, Citation Extractor)
│ ├── orchestrator.py # Main orchestration logic
│ └── models.py # Data models
├── frontend/
│ ├── components/ # UI components
│ ├── streamlit_app.py # Main Streamlit app
│ └── utils.py # UI utilities
├── data/ # Data storage
│ ├── documents/ # Uploaded documents
│ ├── sessions/ # Research sessions
│ └── vectorstore/ # FAISS vector store
└── requirements.txt # Python dependencies
- GEMINI_API_KEY: Required. Your Gemini API key for LLM access
- TAVILY_API_KEY: Optional. For web search via Tavily API
- SERPER_API_KEY: Optional. Alternative to Tavily for web search
- GEMINI_MODEL: Optional. The Gemini model name to use
- GEMINI_TEMPERATURE: Optional. Default: 0.7
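Reading these variables at startup might look like the following sketch; the default model name is an illustrative assumption, so check the orchestrator for the real defaults:

```python
import os

def load_config():
    """Collect platform settings from the environment, with safe fallbacks."""
    api_key = os.getenv("GEMINI_API_KEY")
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY is required")
    return {
        "gemini_api_key": api_key,
        "tavily_api_key": os.getenv("TAVILY_API_KEY"),               # optional
        "serper_api_key": os.getenv("SERPER_API_KEY"),               # optional
        "model": os.getenv("GEMINI_MODEL", "gemini-1.5-flash"),      # assumed default
        "temperature": float(os.getenv("GEMINI_TEMPERATURE", "0.7")),
    }
```

Failing fast on the missing required key produces a clearer error than a failed API call later.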
You can customize the model used by editing the orchestrator initialization in streamlit_app.py:

orchestrator = ResearchOrchestrator(
    model_name="gemini-1.5-pro",  # or a faster model such as "gemini-1.5-flash"
    temperature=0.7
)

You can also use the orchestrator programmatically:
from backend.orchestrator import ResearchOrchestrator
# Initialize
orchestrator = ResearchOrchestrator()
# Load documents (optional)
orchestrator.load_documents(["Document text 1", "Document text 2"])
# Conduct research
result = orchestrator.research(
query="Your research question",
use_web_search=True,
use_rag=True
)
# Access results
print(result["report"])
print(result["quality_scores"])
print(result["citations"])

If you run into issues, here are some common problems and how to fix them:
"Gemini API key is required"
- Make sure your .env file exists in the project root
- Double-check that GEMINI_API_KEY is set correctly (no extra spaces or quotes)
- Restart the Streamlit app after changing the .env file
"Web search not working"
- Web search is completely optional - the platform works fine with just documents
- If you want web search, you'll need either a Tavily or Serper API key
- You can get a Tavily key from https://tavily.com or Serper from https://serper.dev
"PDF parsing failed"
- Make sure the PDF isn't password-protected
- Try a different PDF file to see if it's file-specific
- Very large PDFs may take longer to parse or time out
"Research takes too long"
- This is normal - the platform runs multiple agents sequentially, each making its own LLM calls
- Complex queries naturally take longer than simple ones
- Check your Gemini API rate limits if it's consistently slow
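If rate limits are the bottleneck, a small retry-with-exponential-backoff wrapper around the LLM call can smooth things out. This is a generic sketch, not part of the platform; the exception type worth catching depends on the client library you use:

```python
import time

def with_retries(fn, max_attempts=4, base_delay=1.0):
    """Call fn, retrying with exponential backoff on failure."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:  # narrow this to the client's rate-limit exception
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

You would wrap the agent's LLM call, e.g. with_retries(lambda: llm.invoke(prompt)).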
This is a portfolio project I built to showcase agentic AI patterns in action. If you find it useful or want to improve it, feel free to:
- Report any issues you encounter
- Suggest improvements or new features
- Fork it and make it your own
- LangChain: Agent framework and orchestration
- Gemini API: LLM access
- Streamlit: Web UI
- FAISS: Vector storage for RAG
- Tavily/Serper: Web search APIs
- Python 3.8+: Core language
- Academic Research: Conduct thorough research with fact-checking
- Business Intelligence: Research market trends and competitors
- Content Creation: Research-backed content generation
- Fact-Checking: Verify information with multiple sources
- Report Generation: Create well-structured research reports
This project is built for educational and portfolio purposes. Use it as a learning resource or starting point for your own projects.