Docent — Human-AI Symbiotic Loop from Research to Understanding

Docent is an open-source, browser-based platform that connects document understanding, research curation, and educational delivery into a single conversational pipeline. Its AI persona Sage engages users in a symbiotic loop — an iterative cycle of bidirectional questioning between human and AI — spanning the full arc from research to learning.

Given a research paper, technical report, or free-form topic, Sage curates knowledge through vision-based document analysis and web-grounded research, synthesizes structured slide presentations with custom SVG diagrams and extracted figures, delivers narrated lectures, and supports post-presentation Q&A for comprehension reinforcement.

The Symbiotic Loop

Docent treats the research-to-learning pipeline not as a batch process but as an interactive dialogue across five stages:

Research & Comprehension — Vision-based PDF analysis (every page rendered as high-res image for LLM context) or web search curation with optional deep thinking / extended reasoning
Structured Synthesis — Slide presentations with custom SVG diagrams, precisely cropped PDF figures, configurable layouts, and speaker notes — streamed incrementally as the LLM generates them
Narrated Delivery — Dual TTS pipeline (browser-native Web Speech API + optional Google Gemini neural TTS) with cross-slide prefetch buffering and auto-advance
Conversational Refinement — Users shape presentations through iterative dialogue, editing plans, adjusting content, and requesting changes
Interactive Assessment — Bidirectional Q&A where Sage probes understanding and users ask questions, propose corrections, or debate interpretations

Two governance concerns are asymmetrically assigned: hallucination prevention is the AI's structural responsibility (ensuring content fidelity before delivery), while assessment of understanding is the AI's pedagogical role (verifying that learning has occurred).

Key Features

Interaction & Research

Conversational refinement — Edit and reshape presentations through natural dialogue
Web search — Sage curates knowledge from current web sources
Deep thinking / extended reasoning — Allocate up to 40,000 reasoning tokens for complex analysis
Post-generation Q&A — Ask Sage anything about the presented content, grounded in full slide context

Content Generation

Custom SVG diagrams — 10 diagram types: flowcharts, timelines, comparison tables, network diagrams, layered architecture, Venn/overlap, bar charts, annotated schematics, matrix/heatmap, and cycle diagrams
Vision-guided PDF figure extraction — LLM-predicted normalized bounding-box coordinates with automatic whitespace trimming
AI image generation — Generate images via Gemini/Nano Banana through OpenRouter
Real photo search — Find and embed relevant photographs

Architecture & Flexibility

Multi-model BYOK — Bring your own key: Claude Sonnet 4, GPT-4o, Gemini 2.5 Flash, Llama 3.3, DeepSeek V3, Qwen 2.5, and more via OpenRouter
Streaming generation — First slide appears within seconds; slides render incrementally as the LLM produces them
Browser-only / zero install — Runs entirely in the browser; deploy on Vercel or run locally
Multi-language — Generate presentations and narration in any language
Open source — MIT License

Presentation & Delivery

Speaker notes + dual TTS — Browser-native (free) and Google Gemini neural quality narration
PPTX export — Download as PowerPoint
HTML export — Print-to-PDF or standalone HTML
Session management — Save and restore conversations with full state (PDF, slides, chat history, narrative arc)
Memory system — Sage remembers key insights across sessions via [NOTE:] tags

Quick Start

Prerequisites

Node.js 18+
An OpenRouter API key

Installation

git clone https://github.com/symbiont-ai/docent.git
cd docent
npm install

Development

npm run dev

Open http://localhost:3000 and enter your OpenRouter API key in Settings.

Production Build

npm run build
npm start

Deploy on Vercel

Optionally set OPENROUTER_API_KEY as an environment variable for a shared hosted instance.

Environment Variables

Variable	Required	Description
`OPENROUTER_API_KEY`	No	Default API key for the hosted instance. Users can also enter their own key in Settings.
`GOOGLE_API_KEY`	No	Google API key for Gemini neural TTS (optional; users can enter their own in Settings).

Technology Stack

Component	Technology	Purpose
Framework	Next.js / React 19	Application shell, routing, API routes
Language	TypeScript	Type-safe development
PDF Processing	pdfjs-dist	PDF parsing, rendering, figure cropping
PPTX Export	PptxGenJS	PowerPoint file generation
Storage	idb-keyval / IndexedDB	Browser-local session persistence
Charts	Recharts	Optional data visualization
Speech (free)	Web Speech API	Browser-native TTS narration
Speech (neural)	Google Gemini API	High-quality neural TTS
LLM Gateway	OpenRouter API	Multi-model LLM access

Architecture

src/
  app/            # Next.js App Router (layout, page, API routes)
  components/     # React components (AppShell, ChatPanel, SlideViewer, etc.)
  hooks/          # Custom React hooks (useChat, usePresentation, usePDF, useTTS, useSessions, useMemory)
  lib/            # Utilities (API client, storage, presentation parsing, figure extraction, export)
  types/          # TypeScript type definitions

Six React hooks compose the client-side state:

Hook	Responsibility
`useChat`	Message flow, file uploads, API calls, generation coordinator
`usePresentation`	Slide state, navigation, figure resolution
`usePDF`	PDF loading, page rendering, thumbnails, figure cropping
`useTTS`	Dual speech synthesis, voice selection, chunked playback, prefetch buffering
`useMemory`	Cross-session notes via `[NOTE:]` tags
`useSessions`	Complete session serialization to/from IndexedDB

License

MIT License. See LICENSE for details.

Citation

If you use Docent in your research, please cite:

@article{docent2026,
  title={Docent: Human--AI Symbiotic Loop from Research to Understanding},
  author={Symbiont-AI Cognitive Labs},
  year={2026},
  url={https://github.com/symbiont-ai/docent}
}

Built by Symbiont-AI Cognitive Labs

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
src		src
.env.example		.env.example
.gitignore		.gitignore
.vercelignore		.vercelignore
LICENSE		LICENSE
README.md		README.md
next.config.js		next.config.js
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Docent — Human-AI Symbiotic Loop from Research to Understanding

The Symbiotic Loop

Key Features

Interaction & Research

Content Generation

Architecture & Flexibility

Presentation & Delivery

Quick Start

Prerequisites

Installation

Development

Production Build

Deploy on Vercel

Environment Variables

Technology Stack

Architecture

License

Citation

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors 2

Languages

Folders and files

Latest commit

History

Repository files navigation

Docent — Human-AI Symbiotic Loop from Research to Understanding

The Symbiotic Loop

Key Features

Interaction & Research

Content Generation

Architecture & Flexibility

Presentation & Delivery

Quick Start

Prerequisites

Installation

Development

Production Build

Deploy on Vercel

Environment Variables

Technology Stack

Architecture

License

Citation

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors 2

Languages

Packages