Transform documents into knowledge graphs with AI
Get started with Graphora in minutes, not hours. Choose your preferred option:
Visit demo.graphora.io - no signup required. Upload a document and see the knowledge graph extraction in action.
Open our quickstart notebook and run it in your browser:
Extract knowledge graphs from the command line:
# Install
pip install "graphora[cli]"
# Extract
graphora extract document.pdf --output graph.json
That's it! No database setup, no LLM keys required to get started.
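To get a first look at the extracted file from Python, a schema-agnostic summary is a safe starting point. The sketch below makes no assumption about the layout of graph.json beyond it being valid JSON; the helper names `summarize` and `summarize_file` are illustrative and not part of the graphora package.

```python
import json
from pathlib import Path


def summarize(data: object) -> dict[str, int]:
    """Map each top-level key that holds a list or object to its item count."""
    if not isinstance(data, dict):
        return {}
    return {
        key: len(value)
        for key, value in data.items()
        if isinstance(value, (list, dict))
    }


def summarize_file(path: str) -> dict[str, int]:
    """Load a JSON file (e.g. the graph.json written above) and summarize it."""
    return summarize(json.loads(Path(path).read_text(encoding="utf-8")))
```

Calling `summarize_file("graph.json")` shows which top-level collections the file contains and how large each one is, which is usually enough to decide what to inspect next.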
| Feature | Graphora | LangChain GraphTransformer | Microsoft GraphRAG |
|---|---|---|---|
| Zero-config start | ✅ Yes | ❌ No | |
| Auto schema inference | ✅ Yes | ❌ No | ❌ No |
| Quality validation | ✅ Yes | ❌ No | ❌ No |
| Human review workflow | ✅ Yes | ❌ No | ❌ No |
| Visual schema builder | ✅ Yes | ❌ No | ❌ No |
| Schema chat copilot | ✅ Yes | ❌ No | ❌ No |
| Entity deduplication | ✅ Yes (Splink) | ✅ Yes |
- AI-powered extraction: Advanced LLM-driven entity and relationship extraction from unstructured documents
- Multi-format support: Process PDFs, Word docs, text files, and more
- Visual schema builder: Design your ontology with an intuitive drag-and-drop interface
- Schema chat copilot: Natural language conversations with streaming responses to refine your schema
- Auto schema inference: Let AI suggest schemas from your documents
- Entity deduplication: Powered by Splink for accurate entity resolution
- Human-in-the-loop: Review and refine extractions before final graph integration
- Quality validation: Built-in validation to ensure extraction completeness and accuracy
- Flexible storage: In-memory mode for quick starts, Neo4j for production
- Real-time tracking: Monitor preprocessing and extraction progress
- Scalable architecture: Microservice design optimized for document intelligence
Want to run Graphora on your infrastructure? Here's how to get started.
- Python 3.11 or higher
- uv package manager
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone and setup
git clone https://github.com/graphora/graphora-api.git
cd graphora-api
uv sync
# Start development server
make dev
The API will be available at:
- API: http://localhost:8000
- Documentation: http://localhost:8000/api/v1/docs
- OpenAPI Spec: http://localhost:8000/api/v1/openapi.json
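Once the server is up, the OpenAPI document is a quick way to see which endpoints exist. The following is a minimal sketch using only the standard library; it assumes the server is reachable at the URL listed above, and the function names are illustrative rather than part of Graphora.

```python
import json
from urllib.request import urlopen

# URL taken from the list above; adjust it if the server runs elsewhere.
OPENAPI_URL = "http://localhost:8000/api/v1/openapi.json"

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_operations(spec: dict) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs from an OpenAPI document, sorted by path."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method in item:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                operations.append((method.upper(), path))
    return sorted(operations, key=lambda op: (op[1], op[0]))


def fetch_spec(url: str = OPENAPI_URL) -> dict:
    """Download and parse the OpenAPI document (needs a running server)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)
```

With the development server running, `list_operations(fetch_spec())` returns every method and path the API advertises.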
Perfect for local development and testing:
# Set environment variables
export AUTH_BYPASS_ENABLED=true
export STORAGE_TYPE=memory
export LLM_PROVIDER=openai
export OPENAI_API_KEY=your-key-here
# Start server
make dev
No database setup required. The system will use in-memory storage and auto-infer schemas.
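A small preflight check can catch a missing variable before the server starts. This is a sketch, not part of Graphora: it assumes the four variables exported above are all required for this mode, and `quickstart_problems` is a hypothetical helper name.

```python
# Values mirror the exports above; the API key has no fixed expected value.
QUICKSTART_SETTINGS = {
    "AUTH_BYPASS_ENABLED": "true",
    "STORAGE_TYPE": "memory",
    "LLM_PROVIDER": "openai",
    "OPENAI_API_KEY": None,  # any non-empty value is accepted
}


def quickstart_problems(env: dict[str, str]) -> list[str]:
    """Describe each setting that is unset or differs from the values above."""
    problems = []
    for name, expected in QUICKSTART_SETTINGS.items():
        actual = env.get(name, "")
        if not actual:
            problems.append(f"{name} is not set")
        elif expected is not None and actual != expected:
            problems.append(f"{name} is {actual!r}, expected {expected!r}")
    return problems
```

Passing `dict(os.environ)` checks the current shell; an empty list means the environment matches the quick-start configuration.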
For production deployments:
# Start all services (Postgres, Neo4j, Prefect, Redis)
make compose-up
# Run migrations
make migrate
# Start server
make dev
See Local Development Guide for detailed setup instructions.
Setup:
- make install: sync Python dependencies via uv
- make install-dev: install with dev dependencies
Development:
- make dev: start development server with auto-reload
- make start: start production server
Local Services:
- make compose-up: start Postgres, Neo4j, Prefect, and Redis containers
- make compose-down: stop local services
- make compose-logs: tail logs from local services
- make compose-status: show status of local services
Testing:
- make test: run all tests (emits coverage stats and writes coverage.xml)
- make test-unit: run unit tests only
- make test-integration: run integration tests only
- make test-cov: run tests with HTML coverage report
Code Quality:
- make lint: run Ruff + Black checks
- make lint-fix: run Ruff + Black with auto-fix
- make format: apply Black formatting
- make typecheck: run mypy type checking
- make deadcode: run Vulture to surface unused definitions
- make pre-commit: run all pre-commit checks (lint, test, deadcode)
Database:
- make migrate: run database migrations
- make dev-reset-postgres: delete local Postgres data
- make dev-reset-neo4j: delete local Neo4j data
- make dev-reset-redis: delete local Redis data
Other:
- make openapi-snapshot: regenerate tests/snapshots/openapi.json
- make clean: remove build artifacts and cache files
- make help: show all available commands
All API requests must include a Clerk-issued bearer token:
Authorization: Bearer <token>
Configure the backend with Clerk credentials via .env:
- CLERK_JWKS_URL
- CLERK_ISSUER
- CLERK_AUDIENCE
- CLERK_API_KEY (if server-to-server calls are required)
Clients no longer send the legacy user-id header; the backend derives the user from the JWT subject claim.
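When debugging authentication it helps to see which subject a token carries. The sketch below builds the header described above and decodes a JWT payload without verifying its signature, so it is suitable for local inspection only; the backend still performs the real verification. Both function names are illustrative.

```python
import base64
import json


def auth_headers(token: str) -> dict[str, str]:
    """Build the Authorization header every API request must carry."""
    return {"Authorization": f"Bearer {token}"}


def unverified_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT checking the signature (debugging only)."""
    try:
        _header, payload, _signature = token.split(".")
    except ValueError as exc:
        raise ValueError("expected a three-part JWT") from exc
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

`unverified_claims(token)["sub"]` is the value the backend uses to identify the user, which makes it easy to confirm a token belongs to the account you expect.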
Call the API from CI jobs or data pipelines without extra backend code. Mint short-lived Clerk JWTs on demand:
1. Create (or reuse) a Clerk user that represents the pipeline and add a JWT template (e.g. graphora_pipeline) whose aud value matches CLERK_AUDIENCE.
2. When the pipeline starts, create a token via Clerk's backend API:
   curl -X POST "https://api.clerk.com/v1/users/<USER_ID>/tokens/graphora_pipeline" \
     -H "Authorization: Bearer $CLERK_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{"expires_in_seconds": 3600}'
3. Export the returned token before invoking the client:
   export GRAPHORA_AUTH_TOKEN="<clerk-jwt-from-step-2>"
   python pipeline.py
4. Repeat the minting step whenever the token expires. Keep TTLs short and rotate the Clerk API key like any other secret.
The graphora Python package automatically reads GRAPHORA_AUTH_TOKEN, so no application changes are required.
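For pipelines that are already written in Python, the token-minting call above can be issued without shelling out to curl. This is a minimal standard-library sketch that mirrors the curl command's URL, headers, and body. The source does not say which response field holds the JWT, so `mint_token` returns Clerk's JSON response unchanged; both function names are hypothetical.

```python
import json
import os
from urllib.request import Request, urlopen

CLERK_API = "https://api.clerk.com/v1"


def build_token_request(
    user_id: str, template: str, api_key: str, ttl_seconds: int = 3600
) -> Request:
    """Build the same POST request as the curl command above."""
    body = json.dumps({"expires_in_seconds": ttl_seconds}).encode("utf-8")
    return Request(
        f"{CLERK_API}/users/{user_id}/tokens/{template}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def mint_token(user_id: str, template: str = "graphora_pipeline") -> dict:
    """Send the request and return Clerk's JSON response as-is."""
    request = build_token_request(user_id, template, os.environ["CLERK_API_KEY"])
    with urlopen(request, timeout=10) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the call easy to unit-test without network access or a real API key.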
app/
├── agents/ # AI agents for workflow and feedback
├── api/ # API endpoints (REST and GraphQL)
├── services/ # Core business logic services
├── schemas/ # Pydantic models and schemas
├── utils/ # Utility functions and helpers
└── main.py # Application entry point
- Preprocessing Service
  - Handles multi-step document preprocessing
  - Provides real-time status updates
  - Implements robust error handling
- Extraction Service
  - Manages entity and relationship extraction
  - Integrates with LLM providers for intelligent processing
  - Handles temporary graph creation
- Graph Service
  - Manages Neo4j database operations
  - Handles subgraph creation and updates
  - Processes user feedback
- Document Upload
  POST /api/v1/documents/upload
  Content-Type: multipart/form-data
- Submit Feedback
  POST /api/v1/feedback/{document_id}
  Content-Type: application/json
See the API Documentation for the complete endpoint reference.
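The two endpoints above can be called from plain Python. The sketch below only constructs the requests, using the paths and content types listed above and the local development base URL. Several details are assumptions not confirmed by this document: the multipart field name `file`, the generic part content type, and the shape of the feedback payload. Check the API Documentation for the real schema.

```python
import json
import uuid
from urllib.request import Request

BASE_URL = "http://localhost:8000/api/v1"


def upload_request(
    filename: str, content: bytes, token: str, field: str = "file"
) -> Request:
    """Build a multipart/form-data upload; the field name is an assumption."""
    boundary = uuid.uuid4().hex
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        content,
        f"--{boundary}--".encode(),
        b"",
    ])
    return Request(
        f"{BASE_URL}/documents/upload",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def feedback_request(document_id: str, payload: dict, token: str) -> Request:
    """Build a JSON feedback request; the payload shape is hypothetical."""
    return Request(
        f"{BASE_URL}/feedback/{document_id}",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Either request can be sent with `urllib.request.urlopen(request)` once the server is running and a valid token is available.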
- Contributing Guide - How to contribute
- Repository Guidelines - Quick contributor reference
- Local Development Guide - Spin up dependencies and run the API locally
- Security Policy - How to report security issues
- Support - How to get help
- Trademark Policy - Trademark usage guidelines
- Frontend: graphora/graphora-fe
- Python Client: graphora/graphora-client
We welcome contributions! Please see our Contributing Guide for details.
- Read the Code of Conduct
- Sign the Contributor License Agreement
- Check out good first issues
This project is licensed under the MIT License.
- ✅ Use freely in personal and commercial projects
- ✅ Modify and distribute with or without source code
- ✅ Use in closed-source and SaaS products
See LICENSE for full terms.
- Enterprise Support: SLA-backed support for production deployments
- Consulting: Custom integrations, training, architecture design
- Commercial Licensing: Closed-source and SaaS deployments
- Database Vendor Partnerships: OEM licensing for database companies
Contact: support@graphora.io
- GitHub Discussions: Ask questions, share ideas
- Discord: Coming soon
- Twitter: Coming soon
Please report security vulnerabilities to support@graphora.io
See SECURITY.md for details.
Made with ❤️ by Arivan Labs