Generate a day-by-day study plan from a textbook PDF using a strict, 5-stage pipeline. Output is JSON, Markdown, and CSV.
PDF Input
↓
[Stage 1] TOC Extraction → Extract only TOC pages from PDF
↓
[Stage 2] TOC Parsing → Convert to chapters/topics (LLM-powered)
↓
[Stage 3] Study Time Estimation → Compute normalized topic weights
↓
[Stage 4] Schedule Builder → Allocate time (5-min increments), fix rounding
↓
[Stage 5] Schedule Optimizer → Distribute across days, balance load
↓
Schedule Output (JSON / CSV / Markdown)
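Stages 3 and 4 can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `normalizeWeights` and `allocateMinutes`, the `score` field, and the largest-remainder method used to fix rounding are all assumptions.

```javascript
// Stage 3 (sketch): turn raw topic scores into weights that sum to 1.
// `score` is a hypothetical per-topic effort estimate.
function normalizeWeights(topics) {
  const total = topics.reduce((sum, t) => sum + t.score, 0);
  if (total <= 0) throw new Error("topic scores must sum to a positive number");
  return topics.map((t) => ({ name: t.name, weight: t.score / total }));
}

// Stage 4 (sketch): allocate totalMinutes in 5-minute blocks.
// Each topic first gets the floor of its exact share; the blocks lost to
// rounding are then handed out to the largest fractional remainders, so the
// allocation always adds up to the full number of blocks.
function allocateMinutes(weighted, totalMinutes, blockSize = 5) {
  if (weighted.length === 0) return [];
  // Minutes that do not fill a whole block are dropped in this sketch.
  const totalBlocks = Math.floor(totalMinutes / blockSize);
  const rows = weighted.map((t, index) => {
    const exact = t.weight * totalBlocks;
    const blocks = Math.floor(exact);
    return { index, name: t.name, blocks, remainder: exact - blocks };
  });
  const leftover = totalBlocks - rows.reduce((sum, r) => sum + r.blocks, 0);
  // Largest remainder first; ties keep the original topic order.
  const byRemainder = [...rows].sort(
    (a, b) => b.remainder - a.remainder || a.index - b.index
  );
  for (let i = 0; i < leftover; i++) {
    byRemainder[i % byRemainder.length].blocks += 1;
  }
  return rows.map((r) => ({ name: r.name, minutes: r.blocks * blockSize }));
}

// Usage with made-up topics and the 1200-minute budget from the CLI example.
const weights = normalizeWeights([
  { name: "Kinematics", score: 3 },
  { name: "Dynamics", score: 2 },
  { name: "Energy", score: 1 },
]);
const allocation = allocateMinutes(weights, 1200);
```

The largest-remainder step is one common way to keep the total exact after rounding; the real pipeline may fix rounding differently.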
The ADK layer uses three collaborating agents plus a coordinator:
- Ingestion Agent: registers files and resolves file references
- Processing Agent: runs the 5-stage pipeline for a file_ref
- Output Agent: presents or exports the schedule
The coordinator routes each request to these agents in order.
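The routing order can be pictured with a small sketch. It does not use the ADK API: each agent is modeled as a plain synchronous function, and the function names, the `ref:` prefix, and the state fields other than `file_ref` are hypothetical. Real agents would most likely be asynchronous.

```javascript
// Ingestion (sketch): register the file and attach a file reference.
function ingestionAgent(state) {
  return { ...state, file_ref: `ref:${state.pdfFilePath}` };
}

// Processing (sketch): stands in for the 5-stage pipeline run on a file_ref.
function processingAgent(state) {
  if (!state.file_ref) throw new Error("processing requires a file_ref");
  return { ...state, schedule: { file_ref: state.file_ref, days: [] } };
}

// Output (sketch): record which export formats were produced.
function outputAgent(state) {
  if (!state.schedule) throw new Error("output requires a schedule");
  return { ...state, exported: ["json", "md", "csv"] };
}

// Coordinator (sketch): pass the request through the agents in a fixed order.
function coordinate(request, agents = [ingestionAgent, processingAgent, outputAgent]) {
  return agents.reduce((state, agent) => agent(state), request);
}

const result = coordinate({ pdfFilePath: "/path/to/textbook.pdf" });
```

Because each step depends on the previous one's output, running the Processing Agent before ingestion fails fast in this sketch instead of producing an empty schedule.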
- TOC extraction
- TOC parsing (LLM)
- Study time estimation
- Schedule building (5-minute blocks)
- Schedule optimization
See ARCHITECTURE.md for details.
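As a rough illustration of the optimization stage, the sketch below spreads topic minutes over a number of days so that daily loads differ by at most one 5-minute block. The function name `distributeAcrossDays` and the even-spread strategy are assumptions; the sketch ignores the optional daily-minutes target and expects topic minutes that are already multiples of 5.

```javascript
// Sketch: fill days in order, splitting a topic across days when needed.
function distributeAcrossDays(topics, dayCount, blockSize = 5) {
  if (dayCount <= 0) throw new Error("dayCount must be positive");
  // Work in whole blocks so every session stays a multiple of blockSize.
  const queue = topics.map((t) => ({
    name: t.name,
    blocks: Math.round(t.minutes / blockSize),
  }));
  const totalBlocks = queue.reduce((sum, t) => sum + t.blocks, 0);
  const base = Math.floor(totalBlocks / dayCount);
  const extra = totalBlocks % dayCount; // the first `extra` days get one more block
  const days = [];
  let head = 0; // index of the topic currently being scheduled
  for (let d = 0; d < dayCount; d++) {
    let capacity = base + (d < extra ? 1 : 0);
    const sessions = [];
    while (capacity > 0 && head < queue.length) {
      const topic = queue[head];
      const used = Math.min(capacity, topic.blocks);
      if (used > 0) sessions.push({ name: topic.name, minutes: used * blockSize });
      capacity -= used;
      topic.blocks -= used;
      if (topic.blocks === 0) head++;
    }
    days.push({ day: d + 1, sessions });
  }
  return days;
}

// Usage with made-up topics: 60 minutes over 3 days gives 20 minutes per day.
const plan = distributeAcrossDays(
  [
    { name: "A", minutes: 30 },
    { name: "B", minutes: 20 },
    { name: "C", minutes: 10 },
  ],
  3
);
```

Keeping topics in their original order preserves the textbook's sequence, at the cost of sometimes splitting a topic across two days.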
- Node.js 18+
- Python 3.10+
- Gemini API key
npm install
pip install -r requirements.txt
Set your Gemini API key:
# Windows PowerShell
$env:GEMINI_API_KEY = "your-api-key-here"
# Or add to .env
GEMINI_API_KEY=your-api-key-here
node index.js <pdf-path> <start-date> <end-date> <total-minutes> [daily-minutes]
# Example
node index.js physics-textbook.pdf 2025-01-15 2025-02-28 1200 120
cat input.json | node json-cli.js
Example input.json:
{
  "pdfFilePath": "/path/to/textbook.pdf",
  "startDate": "2025-01-15",
  "endDate": "2025-02-28",
  "totalStudyMinutes": 1200,
  "dailyMinutesTarget": 120,
  "courseName": "Physics",
  "pdfDisplayName": "Physics Textbook.pdf"
}
- Markdown: human-readable schedule with daily summaries
- CSV: spreadsheet-ready schedule
- JSON: full structured data
Default output folder (auto-save):
outputs/<pdf-stem>.md
outputs/<pdf-stem>.csv
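The default file names can be derived from the PDF path along these lines. This is a sketch under assumptions: the helper name `defaultOutputPaths` is hypothetical, and it takes the stem from the PDF file path, whereas the actual tool may use a different source for the stem.

```javascript
// Sketch: build the auto-save paths from the PDF's file name without extension.
function defaultOutputPaths(pdfFilePath, outputDir = "outputs") {
  // Accept both POSIX and Windows separators.
  const fileName = pdfFilePath.split(/[\\/]/).pop();
  const stem = fileName.replace(/\.pdf$/i, "");
  return {
    markdown: `${outputDir}/${stem}.md`,
    csv: `${outputDir}/${stem}.csv`,
  };
}

const paths = defaultOutputPaths("/path/to/textbook.pdf");
```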
node test-pipeline.js