Automate the literature review, all locally!
A local-first, agentic AI research system that automates literature review, identifies research gaps, and scaffolds novel research papers, all powered by local LLMs.
AutoResearcher is a complete research automation pipeline that transforms a research domain into publication-ready paper scaffolds through a multi-agent workflow. Unlike traditional literature review tools, AutoResearcher doesn't just find papers; it analyzes them, extracts insights, identifies novelty gaps, and generates coherent research directions grounded in actual literature.
- Research Setup - Accepts the required research domain, optional focus areas and the related research parameters from the user
- Domain Mapping - Analyzes research domains and identifies subfields, methodologies, and trends
- Literature Scouting - Intelligently queries arXiv and discovers relevant papers
- Paper Analysis - Extracts claims, limitations, and key insights from the PDFs
- Novelty Detection and Synthesis - Identifies research gaps and generates possible novel research directions with detailed justifications
- Paper Scaffolding - Creates a complete paper outline with title, abstract, and section structures
- 100% Local - All of the AI inference runs on your machine via Ollama (no API keys, no data sharing)
- Multi-Agent Architecture - 9 specialized AI agents orchestrated via Streamlit or n8n workflow
- Literature-Grounded - Every claim and citation is traceable to actual papers
- Interactive UI - An all-black dark-mode themed Streamlit interface
- Production-Ready - FastAPI backend with error handling and logging
- Purpose: Analyzes research domain and maps the landscape
- Input: Domain name + constraints
- Output: Subfields, methodologies, trends, problem classes
- LLM: deepseek-r1:7b
- Purpose: Generates optimal search queries for arXiv
- Input: Domain map
- Output: List of search queries
- LLM: qwen3:4b
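As a rough illustration of how an agent such as the query generator could talk to a local model, the sketch below posts a prompt to Ollama's `/api/generate` endpoint on its default port (11434) and cleans the reply into a list of queries. The prompt wording and the helper names `build_payload`, `parse_queries`, and `generate_queries` are assumptions made for illustration, not the project's actual code.

```python
import json
import re
import urllib.request

# Ollama's default local endpoint for single-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str) -> bytes:
    """Encode a non-streaming generation request as JSON bytes."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")


def parse_queries(raw: str, limit: int = 5) -> list[str]:
    """Turn a one-query-per-line reply into a clean, de-duplicated list."""
    queries: list[str] = []
    for line in raw.splitlines():
        # Strip leading list markers such as "- ", "* " or "1. ".
        cleaned = re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", line).strip()
        if cleaned and cleaned not in queries:
            queries.append(cleaned)
    return queries[:limit]


def generate_queries(domain_map: dict, model: str = "qwen3:4b") -> list[str]:
    """Ask the local model for arXiv search queries (hypothetical prompt)."""
    prompt = (
        "Suggest up to 5 arXiv search queries, one per line, for this domain map:\n"
        + json.dumps(domain_map)
    )
    request = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    # With "stream": False, the generated text is returned in the "response" field.
    return parse_queries(body["response"])
```

`generate_queries` needs a running Ollama instance with `qwen3:4b` pulled; the two helper functions can be exercised without one.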
- Purpose: Discovers relevant papers via arXiv API
- Input: Search queries + max papers
- Output: Paper metadata (title, authors, abstract, arXiv ID)
- External API: arXiv
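The paper discovery step can be pictured with a minimal sketch against the public arXiv API, which returns an Atom feed. The function names `build_query_url`, `parse_feed`, and `search_arxiv`, the `all:` search prefix, and the output dictionary keys are assumptions chosen to mirror the metadata listed above; the project's own service may be structured differently.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = {"atom": "http://www.w3.org/2005/Atom"}  # namespace used by the Atom feed


def build_query_url(query: str, max_results: int = 10) -> str:
    """Build a search URL that matches the query against all fields."""
    params = {"search_query": f"all:{query}", "start": 0, "max_results": max_results}
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def _text(element: ET.Element, tag: str) -> str:
    """Return the child's text with runs of whitespace collapsed."""
    return " ".join(element.findtext(tag, default="", namespaces=ATOM).split())


def parse_feed(xml_text: str) -> list[dict]:
    """Extract title, authors, abstract, and arXiv ID from an Atom feed."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.findall("atom:entry", ATOM):
        abs_url = _text(entry, "atom:id")  # e.g. http://arxiv.org/abs/<id>
        papers.append({
            "arxiv_id": abs_url.rsplit("/abs/", 1)[-1],  # keeps any version suffix
            "title": _text(entry, "atom:title"),
            "authors": [_text(a, "atom:name") for a in entry.findall("atom:author", ATOM)],
            "abstract": _text(entry, "atom:summary"),
        })
    return papers


def search_arxiv(query: str, max_results: int = 10) -> list[dict]:
    """Fetch and parse search results (requires network access)."""
    with urllib.request.urlopen(build_query_url(query, max_results), timeout=30) as resp:
        return parse_feed(resp.read().decode("utf-8"))
```

Keeping URL construction and feed parsing separate from the network call makes both easy to check offline.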
- Purpose: Downloads and processes PDFs
- Input: arXiv IDs
- Output: Extracted text from PDFs
- Libraries: PyMuPDF, pdfplumber
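One plausible way to combine the two PDF libraries is to try PyMuPDF first and fall back to pdfplumber when the first attempt fails or yields very little text. The sketch below shows that idea; the fallback order, the `min_chars` threshold, and all function names are assumptions for illustration rather than the project's confirmed behavior.

```python
from typing import Callable, Iterable


def extract_with_pymupdf(path: str) -> str:
    """Extract text page by page with PyMuPDF (imported lazily)."""
    import fitz  # PyMuPDF

    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def extract_with_pdfplumber(path: str) -> str:
    """Extract text page by page with pdfplumber (imported lazily)."""
    import pdfplumber

    with pdfplumber.open(path) as pdf:
        # extract_text() can return None for pages without a text layer.
        return "\n".join(page.extract_text() or "" for page in pdf.pages)


def extract_text(
    path: str,
    extractors: Iterable[Callable[[str], str]] = (extract_with_pymupdf, extract_with_pdfplumber),
    min_chars: int = 200,  # assumed threshold for "enough text"
) -> str:
    """Return the first sufficiently long extraction, else the longest one seen."""
    best = ""
    for extractor in extractors:
        try:
            text = extractor(path).strip()
        except Exception:
            continue  # parser failed or is not installed; try the next one
        if len(text) >= min_chars:
            return text
        best = max(best, text, key=len)
    return best
```

Passing the extractors in as callables keeps the fallback logic independent of either library, so it can be tested with stand-in functions.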
- Purpose: Extracts key claims from papers
- Input: Paper text
- Output: List of claims per paper
- LLM: qwen3:4b
- Purpose: Identifies limitations and future work
- Input: Paper text
- Output: List of limitations per paper
- LLM: qwen3:4b
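Both extraction agents above produce a list of strings per paper, so the model's free-form reply has to be turned into structured data at some point. A minimal sketch of that step follows, assuming the model is prompted to answer with a JSON array of strings and that it may wrap its reasoning in `<think>` tags, as some reasoning-oriented models do. The function name `parse_string_list` is hypothetical.

```python
import json
import re


def parse_string_list(reply: str) -> list[str]:
    """Pull a JSON array of strings out of a model reply, or return []."""
    # Drop any <think>...</think> reasoning block before looking for the answer.
    reply = re.sub(r"<think>.*?</think>", "", reply, flags=re.DOTALL)
    # Take the widest bracketed span; surrounding prose is ignored.
    match = re.search(r"\[.*\]", reply, flags=re.DOTALL)
    if not match:
        return []
    try:
        items = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    # Keep only non-empty strings, trimmed of surrounding whitespace.
    return [item.strip() for item in items if isinstance(item, str) and item.strip()]
```

Returning an empty list on malformed output lets the caller decide whether to retry the prompt or skip the paper.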
- Purpose: Identifies research gaps and underexplored areas
- Input: All claims + limitations
- Output: Novelty gaps with explanations
- LLM: deepseek-r1:7b
- Purpose: Generates possible novel research directions
- Input: Domain map + novelty gaps
- Output: Research directions with titles, descriptions, methodologies
- LLM: deepseek-r1:7b
- Purpose: Creates paper outline from selected direction
- Input: Research direction + papers
- Output: Title, abstract, contributions, section outline
- LLM: deepseek-r1:7b
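Since each agent consumes the output of earlier ones, the nine agents can be thought of as a sequential pipeline over shared state. The sketch below illustrates that wiring with two stub agents standing in for the real ones. `PipelineState`, `run_pipeline`, and the stub bodies are assumptions for illustration; the project's actual orchestration (Streamlit or n8n calling the FastAPI backend) is not shown here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class PipelineState:
    """Shared state that accumulates each agent's output."""
    domain: str
    data: dict = field(default_factory=dict)
    completed: List[str] = field(default_factory=list)


# An agent reads the state and returns the new keys it contributes.
Agent = Callable[[PipelineState], dict]


def run_pipeline(state: PipelineState, agents: List[Tuple[str, Agent]]) -> PipelineState:
    """Run the agents in order, merging each result into the shared state."""
    for name, agent in agents:
        state.data.update(agent(state))
        state.completed.append(name)
    return state


# Stub agents: real ones would call the LLM, arXiv, or the PDF processor.
def domain_mapper(state: PipelineState) -> dict:
    return {"domain_map": {"subfields": [f"{state.domain} (stub subfield)"]}}


def query_generator(state: PipelineState) -> dict:
    return {"queries": list(state.data["domain_map"]["subfields"])}


AGENTS: List[Tuple[str, Agent]] = [
    ("Domain Mapper", domain_mapper),
    ("Query Generator", query_generator),
]

if __name__ == "__main__":
    final = run_pipeline(PipelineState("graph learning"), AGENTS)
    print(final.completed, final.data["queries"])
```

The remaining seven agents would be appended to `AGENTS` in the same way, each reading the keys written by its predecessors.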
┌─────────────────┐
│  Streamlit UI   │ ← User Input (Domain + Constraints)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  n8n Workflow   │ ← Orchestration Layer (Supportable)
└────────┬────────┘
         │
         ▼
┌──────────────────────────────────────────────────┐
│                 FastAPI Backend                  │
│ ┌──────────────────────────────────────────────┐ │
│ │              9 Research Agents               │ │
│ ├──────────────────────────────────────────────┤ │
│ │ 1. Domain Mapper                             │ │
│ │ 2. Query Generator                           │ │
│ │ 3. Literature Scout                          │ │
│ │ 4. Paper Ingestion                           │ │
│ │ 5. Claim Extractor                           │ │
│ │ 6. Limitation Extractor                      │ │
│ │ 7. Novelty Detector                          │ │
│ │ 8. Direction Synthesizer                     │ │
│ │ 9. Scaffold Generator                        │ │
│ └──────────────────────────────────────────────┘ │
│ ┌──────────────────────────────────────────────┐ │
│ │                Core Services                 │ │
│ ├──────────────────────────────────────────────┤ │
│ │ • LLM Service (Ollama)                       │ │
│ │ • Vector Store (Embeddings)                  │ │
│ │ • arXiv Service                              │ │
│ │ • PDF Processor                              │ │
│ │ • Semantic Scholar (Supportable)             │ │
│ └──────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────┘
         │
         ▼
┌─────────────────┐
│  Ollama (LLMs)  │ ← deepseek-r1:7b + qwen3:4b
└─────────────────┘
- FastAPI - High-performance Python API framework
- Python 3.9+ - Core language
- Pydantic - Data validation and schemas
- Ollama - Local LLM inference engine
  - deepseek-r1:7b - Reasoning model for complex analysis
  - qwen3:4b - Fast model for simple tasks
- sentence-transformers - Semantic embeddings
- NLTK - Text processing
- PyMuPDF - PDF text extraction
- pdfplumber - Alternative PDF parser
- BeautifulSoup4 - HTML parsing (Semantic Scholar)
- n8n - Workflow automation (runs via local npx, or similar setup)
- Streamlit - Interactive UI with custom CSS
- DiskCache - Result caching
- FAISS/Chroma - Vector store for embeddings
- arXiv API - Paper search and metadata
- Semantic Scholar - Enhanced metadata
- 100% Local: All LLM inference runs on your machine
- No Cloud: No data sent to external APIs (except arXiv, and Semantic Scholar if enabled, for paper search and metadata)
- No Tracking: Zero telemetry or analytics
- Ethical Citations: Every claim grounded in actual papers
- No Fabrication: System never invents citations or experiments
- Make sure Python 3.9+ is installed and, if you plan to use it for orchestration, that n8n is set up on your system.
- Install Ollama and set up deepseek-r1:7b and qwen3:4b.
- Clone this repository on your local machine.
- Set up a Python virtual environment and install the required dependencies:
python -m venv venv
source venv/bin/activate # macOS/Linux
# OR
venv\Scripts\activate # Windows
pip install -r requirements.txt
- Configure the environment:
# Copy environment template
cp .env.example .env
# Edit .env with your settings (if needed)
nano .env
- Run the system in three separate terminal windows, all running at the same time:
# Terminal 1 : Start n8n, import workflow
npx n8n
# Opens on: http://localhost:5678
# Terminal 2 : Start FastAPI Backend
cd python_agents
source venv/bin/activate # or venv\Scripts\activate on Windows
uvicorn main:app --reload --host 0.0.0.0 --port 8000
# API: http://localhost:8000
# Terminal 3 : Start Streamlit UI
cd streamlit_app
source venv/bin/activate # or venv\Scripts\activate on Windows
streamlit run app.py
# UI: http://localhost:8501 - Open this and start using the system
AutoResearcher is a research assistance agent. It:
- Does NOT guarantee novelty,
- Does NOT replace human judgment,
- Does NOT write complete papers.
Always verify outputs and conduct proper literature review.
Contributions are welcome!
Distributed under the MIT License.








