DistillAPI is a modern, full-stack AI application that transforms dense academic research papers into structured, actionable insights. By leveraging LangGraph to orchestrate complex LLM workflows and Next.js for a sleek, dashboard-style interface, DistillAPI extracts core methodologies and key findings and generates interactive study questions in seconds.
- 📄 Smart PDF Ingestion: Asynchronously parses and chunks large PDF documents using LangChain.
- 🤖 State-Based AI Agent: Uses LangGraph to orchestrate a multi-step summarization and Q&A pipeline.
- 🎯 Guaranteed Structured Outputs: Enforces strict JSON schemas using Pydantic, ensuring the LLM always returns the exact format required by the UI.
- 🛡️ Defensive UI Rendering: Built with Next.js and Shadcn, featuring a resilient frontend that handles missing data gracefully and renders markdown natively.
- ⚡ Async Backend: FastAPI implementation with `asyncio` thread delegation to keep the event loop unblocked, allowing high-concurrency document processing.
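The async pattern behind the last bullet can be sketched as follows; the function names and the simulated parse are illustrative assumptions, not the project's actual code:

```python
import asyncio
import time

def parse_pdf(path: str) -> str:
    """Stand-in for a blocking PDF parse (the real app would use pypdf)."""
    time.sleep(0.1)  # simulate CPU/IO-bound work
    return f"text from {path}"

async def handle_upload(path: str) -> str:
    # Delegate the blocking call to a worker thread so the event loop
    # stays free to serve other requests concurrently.
    return await asyncio.to_thread(parse_pdf, path)

print(asyncio.run(handle_upload("paper.pdf")))  # → text from paper.pdf
```

`asyncio.to_thread` (Python 3.9+) is the stdlib way to offload blocking work; inside FastAPI route handlers it keeps uploads from stalling other requests.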
### Backend
- Framework: FastAPI
- AI Orchestration: LangGraph & LangChain
- LLM Provider: OpenRouter (`hunter-alpha`)
- Data Validation: Pydantic

### Frontend
- Framework: Next.js 15+ (App Router)
- Styling: Tailwind CSS
- UI Components: Shadcn UI (Radix Primitives)
- Markdown: `react-markdown`
- Icons: Lucide React
Follow these instructions to get a copy of the project up and running on your local machine.
- Node.js (v18+)
- Python (3.9+)
- An OpenRouter API Key.
Open a terminal and navigate to your backend directory:

```bash
# Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

# Install required Python packages
pip install fastapi uvicorn langchain langchain-openai langchain-community pypdf python-multipart pydantic python-dotenv
```

Create a `.env` file in the same directory as `main.py` and add your API key:

```
OPENAI_API_KEY="your-api-key-here"
```
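`python-dotenv` reads this file into the process environment at startup; its effect can be sketched with a minimal stdlib parser (this assumes simple `KEY="value"` lines with no `export` statements or multiline values):

```python
import os

def load_env(path: str = ".env") -> None:
    # Minimal stand-in for python-dotenv's load_dotenv(): parse KEY=VALUE
    # lines and put them in os.environ without overriding existing values.
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip().strip('"'))
```

In the real app, `load_dotenv()` is called once near the top of `main.py` before the LLM client is constructed.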
Start the backend server:

```bash
python main.py
# Server will start on http://localhost:8000
```
Open a new terminal window and navigate to your frontend directory (`docuparse-web`):

```bash
# Install dependencies
npm install

# Install markdown renderer (if you haven't already)
npm install react-markdown
```
Start the development server:

```bash
npm run dev
# Frontend will start on http://localhost:3000
```
The backend utilizes a directed graph (StateGraph) to process documents predictably:
1. `load_pdf` node: Reads the uploaded temporary PDF file and extracts raw text.
2. `chunk_text` node: Splits the text into manageable 4000-character chunks with overlap to preserve context.
3. `summarize` node: Ingests the chunks and uses `.with_structured_output(StructuredSummary)` to force the LLM to extract 7 specific academic data points (Core Problem, Methodology, etc.).
4. `generate_qa` node: Reads the structured summary state and generates 3 conceptual study questions and answers using strict Pydantic schemas.
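Stripped of the LangGraph wiring, the state flow through these nodes can be sketched framework-free; the `State` shape, the 200-character overlap, and the stub node bodies are illustrative assumptions (the real `summarize` and `generate_qa` nodes call the LLM):

```python
from typing import List, TypedDict

class State(TypedDict, total=False):
    raw_text: str
    chunks: List[str]
    summary: dict
    qa: List[dict]

def chunk_text(state: State, size: int = 4000, overlap: int = 200) -> State:
    # Fixed-size character chunks with overlap, as in the chunk_text node
    # (the overlap value is an assumption; the text only says "with overlap").
    text, chunks, start = state["raw_text"], [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    state["chunks"] = chunks
    return state

def summarize(state: State) -> State:
    # Placeholder for the structured-output LLM call.
    state["summary"] = {"core_problem": "...", "methodology": "..."}
    return state

def generate_qa(state: State) -> State:
    # Placeholder for the Q&A-generation LLM call.
    state["qa"] = [{"q": "...", "a": "..."} for _ in range(3)]
    return state

# In essence the StateGraph is a directed pipeline: each node reads the
# shared state dict and extends it before passing it on.
state: State = {"raw_text": "x" * 10_000}
for node in (chunk_text, summarize, generate_qa):
    state = node(state)
print(len(state["chunks"]))  # → 3
```

In the actual backend, LangGraph's `StateGraph` adds compile-time edges, persistence, and error handling on top of this basic flow.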
- Map-Reduce Summarization: Upgrade the `summarize_node` to process entire 50+ page PDFs by summarizing individual chunks and combining them, rather than just reading the first few pages.
- Streaming Responses: Implement Server-Sent Events (SSE) to stream the summary to the UI token by token as the LLM generates it.
- Export to PDF/Markdown: Add a button to download the generated insights.
- Citation Tracking: Map extracted methodologies back to specific page numbers in the original PDF.
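The map-reduce idea in the first roadmap item amounts to summarizing each chunk independently, then summarizing the combined partial summaries; here `llm_summarize` is a hypothetical stand-in for the real LLM call:

```python
def llm_summarize(text: str) -> str:
    # Stand-in for an LLM summarization call; truncation simulates
    # compressing the input to a shorter summary.
    return text[:40]

def map_reduce_summarize(chunks: list[str]) -> str:
    # Map step: summarize each chunk independently (these calls are
    # embarrassingly parallel and could run concurrently).
    partials = [llm_summarize(c) for c in chunks]
    # Reduce step: combine the partial summaries and summarize the result.
    # A real implementation may need to recurse if the combined text still
    # exceeds the model's context window.
    return llm_summarize("\n".join(partials))
```

This is the same shape as LangChain's map-reduce summarization chains, just written out explicitly.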
This project is licensed under the MIT License - see the LICENSE file for details.