"Like a research assistant who can look up papers, take notes, and compare findings — rather than just answering one question."
A regular LLM call is a single prompt → single response. You hand the model some text and it writes back. That's it.
An AI agent is different: it has access to tools — functions it can call to look things up, compute things, or take actions — and it decides dynamically which tools to use based on each new sub-goal.
Think of the difference between:
- 🤖 Simple LLM: You ask "what does Paper A say about transformers?" and the model guesses from its training data.
- 🕵️ Research Agent: You ask the same question, and the agent looks it up, reads the relevant sections, possibly compares them to Paper B, and synthesises an answer with citations.
| Simple RAG | Research Agent |
|---|---|
| Embed documents → vector DB | Same |
| User query → nearest chunks → LLM answer | Agent plans which tools to call |
| Single retrieval step | Multi-step: search → summarise → compare |
| No memory between steps | Observations from each step feed the next |
| Good for Q&A | Good for synthesis, comparison, gap analysis |
In simple RAG, the pipeline is fixed: retrieve then answer. In an agent, the LLM itself decides the pipeline at runtime.
ReAct = Reason + Act. The agent alternates between thinking and doing:
Thought : I need to find papers about attention mechanisms.
Action : search_papers
Input : attention mechanism self-attention
Observation: [Result 1] Paper: "Attention Is All You Need" …
Thought : I found the relevant paper. Now I'll get its full summary.
Action : summarize_paper
Input : Attention Is All You Need
Observation: Title: Attention Is All You Need, Authors: Vaswani et al. …
Thought : I have enough to answer the question.
Final Answer: The paper "Attention Is All You Need" introduced …
Each Observation is the tool's output, appended to the agent's context. The agent re-reads the growing context at each step to decide what to do next.
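Under the hood this is just a loop. Here is a minimal, self-contained sketch of it — the two tools and the `call_llm` stand-in are toy stubs scripted to mirror the trace above; the project's real loop lives in `agent.py` and talks to an actual LLM:

```python
import re

def search_papers(query: str) -> str:
    # Stub tool: the real search_tool does FAISS semantic search over chunked PDFs.
    return '[Result 1] Paper: "Attention Is All You Need" ...'

def summarize_paper(title: str) -> str:
    # Stub tool: the real summary_tool looks up the parsed PaperMetadata.
    return "Title: Attention Is All You Need, Authors: Vaswani et al. ..."

TOOLS = {"search_papers": search_papers, "summarize_paper": summarize_paper}

# Scripted stand-in for the model; the real agent sends the growing context to the LLM each step.
_SCRIPT = iter([
    "Thought: I need to find papers about attention mechanisms.\nAction: search_papers\nInput: attention mechanism",
    "Thought: Found the paper. Now get its summary.\nAction: summarize_paper\nInput: Attention Is All You Need",
    "Thought: I have enough to answer.\nFinal Answer: The paper introduced the Transformer architecture.",
])

def call_llm(context: str) -> str:
    return next(_SCRIPT)

def react_loop(question: str, max_iterations: int = 8) -> str:
    context = f"Question: {question}"
    for _ in range(max_iterations):              # safety cap against infinite loops
        step = call_llm(context)                 # "Thought: ...\nAction: ...\nInput: ..."
        context += "\n" + step
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        action = re.search(r"Action\s*:\s*(\w+)", step)
        tool_input = re.search(r"Input\s*:\s*(.+)", step)
        if action and action.group(1) in TOOLS:
            # The tool's output becomes an Observation appended to the context.
            context += "\nObservation: " + TOOLS[action.group(1)](tool_input.group(1) if tool_input else "")
    return "Stopped: iteration limit reached."

print(react_loop("What did the paper on attention introduce?"))
```

Because the next Thought/Action comes from the LLM rather than from hard-coded control flow, swapping in new tools changes the agent's behaviour without changing the loop.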
┌─────────────────────────────────────────────────────┐
│ main.py │
│ (CLI: --query / --report / --interactive) │
└──────────────────────┬──────────────────────────────┘
│
┌────────────▼────────────┐
│ Research Agent │ ← agent.py
│ (ReAct loop + LLM) │
└──┬──────────┬───────────┘
│ │
┌─────────▼──┐ ┌───▼────────────┐ ┌─────────────────┐
│search_tool │ │ summary_tool │ │ compare_tool │
│(FAISS │ │ (PaperMetadata │ │ (LLM comparison│
│ semantic │ │ lookup) │ │ of two papers)│
│ search) │ └───────┬────────┘ └────────┬────────┘
└─────┬──────┘ │ │
│ ┌───────▼────────────────────▼──────┐
│ │ PaperMetadata objects │
│ │ (from paper_parser.py) │
│ └───────────────────────────────────┘
┌─────▼──────┐
│ FAISS index│ ← paper_indexer.py
│ (chunked │
│ PDFs + │
│ metadata) │
└────────────┘
Gap Analysis (--report):
paper_metadata → gap_analyzer.py → LLM synthesis → report_generator.py → .md file
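Roughly, that path collects every paper's parsed metadata, sends one synthesis prompt to the LLM, and writes the result to Markdown. A sketch under those assumptions — the function name, prompt wording, and `PaperMetadata` attribute access are illustrative, not the repo's exact API:

```python
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment (.env)

def run_gap_analysis(papers, topic: str, output: str = "reports/gap_analysis.md") -> None:
    # 1. Flatten each paper's LLM-parsed metadata into one context block.
    summaries = "\n\n".join(
        f"Title: {p.title}\nMethodology: {p.methodology}\n"
        f"Findings: {p.findings}\nLimitations: {p.limitations}"
        for p in papers
    )
    # 2. A single synthesis call over all papers at once, which is why very large
    #    collections can exceed the model's context window.
    prompt = (
        f"Across these papers on '{topic}', identify: common themes, contradictions, "
        f"missing experiments, missing populations, methodological gaps, and suggested "
        f"next steps. Format the answer as Markdown sections.\n\n{summaries}"
    )
    resp = client.chat.completions.create(model="gpt-4", messages=[{"role": "user", "content": prompt}])
    # 3. Write the Markdown report to disk (report_generator.py's job in the real pipeline).
    Path(output).parent.mkdir(parents=True, exist_ok=True)
    Path(output).write_text(resp.choices[0].message.content)
```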
# 1. Clone / navigate to the project
cd research-agent
# 2. Create and activate a virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Configure environment
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
# 5. Add research papers
# Copy your .pdf files into data/papers/

Place any number of .pdf files into data/papers/. The pipeline will:
- Extract text and LLM-parse metadata (title, authors, abstract, methodology, findings, limitations).
- Chunk the full text and embed it into a FAISS vector index.
- Make both the metadata and the full text available to the agent's tools.
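A rough sketch of what that indexing step amounts to — pypdf, the embedding model, and the flat index here are illustrative choices, not necessarily what `paper_parser.py` and `paper_indexer.py` actually use:

```python
import numpy as np
import faiss
from pypdf import PdfReader
from openai import OpenAI

client = OpenAI()

def build_index(pdf_paths: list[str], chunk_size: int = 1000):
    # 1. Extract selectable text from each PDF (scanned PDFs without OCR yield little or nothing).
    chunks: list[str] = []
    for path in pdf_paths:
        text = "".join(page.extract_text() or "" for page in PdfReader(path).pages)
        chunks += [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    # 2. Embed every chunk.
    resp = client.embeddings.create(model="text-embedding-3-small", input=chunks)
    vectors = np.array([d.embedding for d in resp.data], dtype="float32")
    # 3. Put the vectors into a FAISS index; the search tool later embeds the query
    #    the same way and asks the index for the nearest chunks.
    index = faiss.IndexFlatL2(vectors.shape[1])
    index.add(vectors)
    return index, chunks
```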
Tips:
- Use papers that are topically related for better gap analysis.
- 3–10 papers is the sweet spot. More than 20 may hit the LLM's context limit during gap analysis.
- Scanned PDFs without OCR will produce empty or garbled text — use PDFs with selectable text.
# Ask a single question and exit
python main.py --query "What methodologies are used across these papers?"
# Start an interactive Q&A session
python main.py --interactive
# Generate a gap analysis report
python main.py --topic "transformer models" --report
# All options
python main.py --papers-dir data/papers \
--topic "BERT fine-tuning" \
--model gpt-4 \
--report \
  --output reports/bert_gaps.md

These questions showcase the agent's multi-step reasoning:
"What methodologies are used across these papers?"
"Which papers agree on X, and which contradict each other?"
"What are the main gaps in this research area?"
"Summarise the paper on [topic] and compare it to [other paper]."
"Which paper has the strongest experimental design?"
"What datasets are most commonly used?"
"Are there any contradictions between the papers' findings?"
The gap analysis report has six sections:
| Section | What it means |
|---|---|
| Common Themes | Topics / findings that appear in multiple papers — the consensus view |
| Contradictions | Where papers disagree — potential areas of ongoing debate |
| Missing Experiments | Experiments that logically follow from the existing work but haven't been done |
| Missing Populations | Groups, languages, contexts, or demographics not yet studied |
| Methodological Gaps | Approaches not used in any paper (e.g., "no longitudinal study exists") |
| Suggested Next Steps | Concrete research directions derived from all of the above |
⚠️ Always verify the output. LLMs can hallucinate contradictions or invent plausible-sounding but non-existent gaps. Treat the gap analysis as a first draft to refine with domain expertise.
- LLMs can hallucinate citations — the agent might confidently say "Paper X found Y" when it did not. Always check claims against the original PDF.
- Gap analysis may miss domain-specific context — a gap that is obvious to a domain expert ("nobody used technique Z") requires domain knowledge the LLM may not have.
- Works best with 3–10 papers on the same topic — fewer papers means less to synthesise; more papers risks exceeding the context window during gap analysis.
- PDF extraction quality varies — scanned PDFs, multi-column layouts, and heavy use of figures degrade text extraction. The LLM falls back gracefully, but metadata may be incomplete.
- The agent may loop or over-call tools — the `max_iterations=8` safety cap prevents infinite loops but may cut off complex multi-paper comparisons.
- Create `src/tools/my_tool.py` with a `create_my_tool(…) -> Tool` function.
- Import and instantiate it in `src/agent.py` inside `create_research_agent`.
- Add it to the `tools` list passed to `initialize_agent`.
The agent will automatically start using the new tool based on its description — no other changes needed.
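For instance, a hypothetical `my_tool` might look like the sketch below, assuming the LangChain-style `Tool` wrapper that `initialize_agent` implies; the metadata lookup inside `_run` is purely illustrative:

```python
# src/tools/my_tool.py — illustrative sketch, not the repo's exact code.
from langchain.tools import Tool

def create_my_tool(paper_metadata: dict) -> Tool:
    """paper_metadata: title -> parsed fields, e.g. as produced by paper_parser.py."""

    def _run(paper_title: str) -> str:
        meta = paper_metadata.get(paper_title)
        if meta is None:
            return f"No indexed paper matches '{paper_title}'."
        return f"{meta['title']} ({', '.join(meta['authors'])})"

    return Tool(
        name="my_tool",
        func=_run,
        # This description is what the LLM reads when deciding whether to call the tool,
        # so state exactly what input it expects and what it returns.
        description="Return the title and authors of an indexed paper. Input: the paper's exact title.",
    )
```

The description string does most of the work: the agent chooses tools purely from their names and descriptions, so be specific about the expected input format.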
- `cite_tool` — generate a BibTeX entry for a paper from its metadata.
- `timeline_tool` — order papers chronologically and show how the field evolved.
- `keyword_tool` — extract and rank keywords across all papers.
- `arxiv_tool` — search arXiv for papers related to the indexed collection.