An AI-powered research agent that autonomously researches any topic and produces comprehensive reports with citations.
- Plans - Generates targeted research questions
- Searches - Queries the web for information (Tavily API)
- Analyzes - Identifies gaps and searches again
- Synthesizes - Produces a structured report with citations
Input: "quantum computing applications in finance"
Output:
- 44 findings extracted
- 33 unique sources cited
- 3 research iterations
- Comprehensive report with executive summary, detailed sections, and key takeaways
    User Query
         │
         ▼
┌─────────────────┐
│  Plan Research  │ → Generate 4-6 targeted questions
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Research Loop  │ → Search → Extract → Analyze Gaps
└────────┬────────┘   (repeats up to 3 iterations)
         │
         ▼
┌─────────────────┐
│   Synthesize    │ → Combine findings into report
└────────┬────────┘
         │
         ▼
  Markdown Report
 (with citations)
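
The flow in the diagram can be sketched as a small orchestration function. This is a minimal illustration, not the code in `src/agent/researcher.py`: the name `run_research` and the four callables passed into it are hypothetical stand-ins for the planning, search/extraction, gap-analysis, and synthesis steps.

```python
from typing import Callable, List

MAX_ITERATIONS = 3  # matches "repeats up to 3 iterations" in the diagram


def run_research(
    topic: str,
    plan: Callable[[str], List[str]],
    search_and_extract: Callable[[str], List[dict]],
    find_gaps: Callable[[str, List[dict]], List[str]],
    synthesize: Callable[[str, List[dict]], str],
) -> str:
    """Plan, loop over search/extract/gap analysis, then synthesize.

    All four callables are hypothetical; they stand in for LLM and search calls.
    """
    queries = plan(topic)
    findings: List[dict] = []
    for _ in range(MAX_ITERATIONS):
        for query in queries:
            findings.extend(search_and_extract(query))
        # Gap analysis returns follow-up queries; an empty list ends the loop early.
        queries = find_gaps(topic, findings)
        if not queries:
            break
    return synthesize(topic, findings)
```

Passing the steps in as callables keeps the loop testable without network access, which suits a framework-free design.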
| Component | Choice | Why |
|---|---|---|
| LLM | Claude (Anthropic) | Planning, extraction, synthesis |
| Search | Tavily API | Web search optimized for AI agents |
| Compute | AWS Lambda | Serverless, pay-per-use |
| UI | Gradio | Local web interface |
| Framework | Pure Python | No LangChain/LlamaIndex dependencies |
# Clone
git clone https://github.com/woodstocksoftware/research-agent.git
cd research-agent
# Environment
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# API Keys
export ANTHROPIC_API_KEY="your-key"
export TAVILY_API_KEY="your-key"
# Run with Gradio UI
python app.py
# Open http://localhost:7860

# Build
sam build --template infrastructure/template.yaml
# Deploy
sam deploy --resolve-s3 \
--stack-name research-agent \
--capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM \
--parameter-overrides \
AnthropicApiKey=$ANTHROPIC_API_KEY \
TavilyApiKey=$TAVILY_API_KEY

Local (Gradio UI):
python app.py
# Open http://localhost:7860

AWS (CLI helper):
./research.sh "your research topic"

AWS (Direct Lambda):
aws lambda invoke \
--function-name research-agent-dev \
--cli-read-timeout 300 \
--cli-binary-format raw-in-base64-out \
--payload file://payload.json \
response.json

Note: Research takes 2-3 minutes. API Gateway has a 29-second timeout, so use direct Lambda invocation or the CLI helper for best results.
research-agent/
├── app.py # Gradio UI
├── research.sh # CLI helper for AWS
├── infrastructure/
│ └── template.yaml # SAM/CloudFormation template
├── src/
│ ├── agent/
│ │ └── researcher.py # Core agent logic
│ ├── tools/
│ │ └── search.py # Tavily search wrapper
│ └── lambda/
│ └── handler.py # AWS Lambda handler
└── requirements.txt
The agent generates 4-6 specific research questions covering different aspects of the topic (definition, current state, key players, challenges, trends).
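
One way to get a reliable list of questions from the model is to ask for a JSON array and validate the reply. The sketch below assumes that approach; `build_planning_prompt` and `parse_questions` are hypothetical helpers and the prompt wording is illustrative, not the repository's actual prompt.

```python
import json
from typing import List

# Aspects listed in the planning description above.
ASPECTS = ["definition", "current state", "key players", "challenges", "trends"]


def build_planning_prompt(topic: str) -> str:
    """Build an illustrative planning prompt (wording is an assumption)."""
    aspects = ", ".join(ASPECTS)
    return (
        f"Generate 4-6 specific research questions about: {topic}\n"
        f"Cover different aspects ({aspects}).\n"
        "Respond with a JSON array of strings and nothing else."
    )


def parse_questions(raw: str, minimum: int = 4, maximum: int = 6) -> List[str]:
    """Pull the JSON array out of a model reply and validate the count."""
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in model reply")
    items = json.loads(raw[start : end + 1])
    questions = [q.strip() for q in items if isinstance(q, str) and q.strip()]
    if len(questions) < minimum:
        raise ValueError(f"expected at least {minimum} questions, got {len(questions)}")
    return questions[:maximum]  # drop extras beyond the upper bound
```

Slicing between the first `[` and the last `]` tolerates a reply that wraps the array in extra text, and failing loudly on too few questions lets the caller retry the planning call.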
For each question:
- Search the web using Tavily
- Extract key findings with source attribution
- Analyze gaps in coverage
- Generate new queries if gaps exist
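
Source attribution is easier to keep intact if every finding carries its URL from the moment it is extracted. The sketch below assumes a hypothetical `Finding` record and uses a simple count-based heuristic as a stand-in for gap analysis; the actual agent may well delegate that judgment to the LLM.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Finding:
    question: str    # research question this finding answers
    claim: str       # the extracted statement
    source_url: str  # where the claim came from


def unique_sources(findings: List[Finding]) -> List[str]:
    """Source URLs in first-seen order, without duplicates."""
    return list(dict.fromkeys(f.source_url for f in findings))


def under_covered(
    questions: List[str], findings: List[Finding], min_findings: int = 2
) -> List[str]:
    """Questions with fewer than min_findings findings.

    A heuristic stand-in for gap analysis; the threshold is an assumption.
    """
    counts: Dict[str, int] = {q: 0 for q in questions}
    for f in findings:
        if f.question in counts:
            counts[f.question] += 1
    return [q for q in questions if counts[q] < min_findings]
```

The questions returned by `under_covered` would be the natural seeds for the next round of search queries.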
Finally, the agent combines all findings into a structured report:
- Executive summary
- Organized sections with headers
- Inline citations with links
- Key takeaways
- Source list
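
The report layout above can be illustrated with a plain formatting function. This is a sketch under assumptions: in the agent the prose is presumably written by the LLM, while `render_report` here is a hypothetical helper that only shows how inline citation numbers and the source list can stay consistent.

```python
from typing import Dict, List, Tuple


def render_report(
    title: str,
    summary: str,
    sections: List[Tuple[str, List[Tuple[str, str]]]],
    takeaways: List[str],
) -> str:
    """Render markdown; each section is (heading, [(claim, source_url), ...])."""
    numbers: Dict[str, int] = {}  # source URL -> citation number, in first-use order
    lines = [f"# {title}", "", "## Executive Summary", "", summary, ""]
    for heading, claims in sections:
        lines += [f"## {heading}", ""]
        for claim, url in claims:
            # Reuse the existing number when a source is cited again.
            n = numbers.setdefault(url, len(numbers) + 1)
            lines.append(f"- {claim} [[{n}]]({url})")
        lines.append("")
    lines += ["## Key Takeaways", ""]
    lines += [f"- {t}" for t in takeaways]
    lines += ["", "## Sources", ""]
    lines += [f"{n}. {url}" for url, n in numbers.items()]
    return "\n".join(lines)
```

Numbering sources on first use means a URL cited in several sections appears once in the source list.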
This agent is built with pure Python to demonstrate understanding of:
- Agent loops and control flow
- Tool integration patterns
- Prompt engineering for structured outputs
- Multi-step reasoning
No magic. Just code you can read, understand, and modify.
| Service | Cost |
|---|---|
| Claude API | ~$0.10-0.20 per research run |
| Tavily API | Free tier: 1,000 searches/month |
| AWS Lambda | ~$0.01 per research run |
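
For budgeting, the per-run figures in the table can be turned into a rough monthly range. The function below is a back-of-the-envelope sketch: `estimate_monthly_cost` is a hypothetical helper, and it assumes search usage stays within the Tavily free tier, so Tavily contributes nothing to the total.

```python
from typing import Tuple


def estimate_monthly_cost(runs: int) -> Tuple[float, float]:
    """Return (low, high) USD for a month of research runs, using the table's figures."""
    claude_low, claude_high = 0.10, 0.20  # Claude API, per run
    lambda_per_run = 0.01                 # AWS Lambda, per run
    # Assumption: Tavily usage stays inside the free tier, so it adds no cost.
    low = runs * (claude_low + lambda_per_run)
    high = runs * (claude_high + lambda_per_run)
    return round(low, 2), round(high, 2)
```

At 100 runs a month this gives a range of roughly $11 to $21.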
MIT
Built by Jim Williams | GitHub