xr843/fojin

FoJin 佛津

The World's Encyclopedic Buddhist Digital Text Platform

450+ sources. 30 languages. 30 countries. One search.

Aggregating the world's Buddhist digital heritage — from the Chinese Tripitaka to Sanskrit manuscripts, Pali suttas to Tibetan texts — with full-text reading, AI-powered Q&A, knowledge graph, and multi-language parallel reading.

Live Demo  ·  Chinese Documentation  ·  Discord  ·  Report Bug

Why FoJin?

Buddhist texts are scattered across hundreds of databases worldwide — CBETA, SuttaCentral, BDRC, SAT, 84000, GRETIL, and many more. Each has different interfaces, languages, and data formats. Researchers spend more time finding texts than reading them.

FoJin solves this. It aggregates 450+ sources into a single, searchable platform with features no other tool provides:

| What you need | How FoJin helps |
| --- | --- |
| Find a sutra across databases | Multi-dimensional search over a local index covering 450+ sources |
| Read the full text online | 4,488 fascicles available for online reading |
| Compare translations | Parallel reading in 30 languages, side by side |
| Look up Buddhist terms | 6 dictionaries, 237K entries (Chinese/Sanskrit/Pali/English) |
| Explore relationships | Knowledge graph with 9,600+ entities and 3,800+ relations |
| View original manuscripts | IIIF manuscript viewer connected to BDRC and more |
| Ask questions about texts | AI Q&A ("XiaoJin") grounded in 11M characters of canonical text |

Quick Start

git clone https://github.com/xr843/fojin.git
cd fojin
cp .env.example .env
docker compose up -d

Then open http://localhost:3000 in a browser.

The API docs are served at http://localhost:8000/docs
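
Once the containers are up, a quick way to confirm that the backend is reachable is to fetch its OpenAPI schema and list the routes. The sketch below uses only the Python standard library. It assumes the backend keeps FastAPI's default schema path (`/openapi.json`), which this README does not confirm; the helper names are illustrative.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Dict, List, Optional

API_BASE = "http://localhost:8000"  # backend address from the Quick Start
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def openapi_url(base: str = API_BASE) -> str:
    # Assumption: FastAPI's default schema location has not been changed.
    return base.rstrip("/") + "/openapi.json"


def list_routes(schema: Dict[str, Any]) -> List[str]:
    """Return 'METHOD /path' strings from an OpenAPI schema dict."""
    routes = []
    for path, item in sorted(schema.get("paths", {}).items()):
        for key in sorted(item):
            if key in HTTP_METHODS:  # skip non-operation keys such as "parameters"
                routes.append(f"{key.upper()} {path}")
    return routes


def fetch_schema(base: str = API_BASE, timeout: float = 5.0) -> Optional[Dict[str, Any]]:
    """Return the schema, or None if the backend is not reachable yet."""
    try:
        with urllib.request.urlopen(openapi_url(base), timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None


if __name__ == "__main__":
    schema = fetch_schema()
    if schema is None:
        print("Backend not reachable; check `docker compose ps`.")
    else:
        print("\n".join(list_routes(schema)))
```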

Features

Multi-Dimensional Search

Search across Buddhist canons by title, translator, catalog number, or full-text keyword. Powered by Elasticsearch with the ICU tokenizer for multi-language support.

[Screenshot: search results for the Avatamsaka Sutra]
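
To make the search description concrete, here is a minimal sketch of what an ICU-analyzed index and a combined keyword/filter query could look like as Elasticsearch request bodies. The `icu_tokenizer` and `icu_folding` names come from Elasticsearch's ICU analysis plugin. The field names (`title`, `translator`, `catalog_no`, `content`), the analyzer name, and the length limit are assumptions for illustration, not FoJin's actual schema.

```python
from typing import Any, Dict, List, Optional

MAX_QUERY_LEN = 200  # hypothetical limit; the real value is not stated in this README


def index_body() -> Dict[str, Any]:
    """Index settings and mappings using an ICU-based custom analyzer."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    "buddhist_text": {  # hypothetical analyzer name
                        "type": "custom",
                        "tokenizer": "icu_tokenizer",
                        "filter": ["icu_folding"],
                    }
                }
            }
        },
        "mappings": {
            "properties": {
                "title": {"type": "text", "analyzer": "buddhist_text"},
                "translator": {"type": "keyword"},
                "catalog_no": {"type": "keyword"},
                "content": {"type": "text", "analyzer": "buddhist_text"},
            }
        },
    }


def search_body(
    keyword: str,
    translator: Optional[str] = None,
    catalog_no: Optional[str] = None,
    size: int = 10,
) -> Dict[str, Any]:
    """Full-text match on title/content, narrowed by optional exact filters."""
    if not keyword or len(keyword) > MAX_QUERY_LEN:
        raise ValueError("keyword must be non-empty and within the length limit")
    filters: List[Dict[str, Any]] = []
    if translator:
        filters.append({"term": {"translator": translator}})
    if catalog_no:
        filters.append({"term": {"catalog_no": catalog_no}})
    return {
        "size": size,
        "query": {
            "bool": {
                # Title matches are boosted over matches in the body text.
                "must": [{"multi_match": {"query": keyword, "fields": ["title^3", "content"]}}],
                "filter": filters,
            }
        },
    }
```

Both functions only build dictionaries, so they can be passed to any Elasticsearch client or sent as JSON over HTTP.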

Full-Text Reading

Read 4,488 fascicles of Buddhist texts online. Navigate by volume, scroll through content, and jump between related texts.

Parallel Reading (30 Languages)

Compare translations side by side: Classical Chinese, Sanskrit, Pali, Tibetan, English, Japanese, Korean, Gandhari, and 22 more languages.

Dictionary Lookup

6 authoritative dictionaries with 237,593 entries:

  • DDB (Digital Dictionary of Buddhism)
  • SuttaCentral Glossary (Pali)
  • NCPED (New Concise Pali-English Dictionary)
  • NTI (Nan Tien Institute Buddhist Dictionary)
  • Edgerton BHS (Buddhist Hybrid Sanskrit Dictionary)
  • Monier-Williams (Sanskrit-English Dictionary)

Knowledge Graph

9,600+ entities (persons, monasteries, texts, schools) and 3,800+ relationships, visualized as an interactive force-directed graph. Click any node to explore connections.
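
As a rough illustration of the data behind such a graph, the sketch below models typed entities and labeled relations and returns the neighbors of a node, which is roughly what a click on a node needs. The class, the predicate names, and the three sample records are illustrative assumptions, not FoJin's actual schema or data.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Entity:
    id: str
    kind: str  # e.g. "person", "monastery", "text", "school"
    name: str


class KnowledgeGraph:
    def __init__(self) -> None:
        self.entities: Dict[str, Entity] = {}
        # entity id -> list of (predicate, neighbor id)
        self._edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.id] = entity

    def add_relation(self, source_id: str, predicate: str, target_id: str) -> None:
        if source_id not in self.entities or target_id not in self.entities:
            raise KeyError("both endpoints must be added before relating them")
        self._edges[source_id].append((predicate, target_id))
        # Store the reverse direction too, so lookups work from either node.
        self._edges[target_id].append((f"inverse:{predicate}", source_id))

    def neighbors(self, entity_id: str) -> List[Tuple[str, Entity]]:
        return [(p, self.entities[t]) for p, t in self._edges.get(entity_id, [])]


def build_sample() -> KnowledgeGraph:
    """Three illustrative records; not taken from FoJin's database."""
    g = KnowledgeGraph()
    g.add_entity(Entity("xuanzang", "person", "Xuanzang"))
    g.add_entity(Entity("kuiji", "person", "Kuiji"))
    g.add_entity(Entity("heart_sutra", "text", "Heart Sutra"))
    g.add_relation("kuiji", "student_of", "xuanzang")
    g.add_relation("xuanzang", "translated", "heart_sutra")
    return g


if __name__ == "__main__":
    for predicate, entity in build_sample().neighbors("xuanzang"):
        print(predicate, "->", entity.name)
```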

AI Q&A — "XiaoJin"

Ask questions in natural language. XiaoJin answers based on canonical Buddhist texts (38 core sutras, ~11M characters) using RAG (Retrieval-Augmented Generation). Every answer includes citations to the source text.

[Screenshot: AI Q&A answering a question about Xuanzang's disciples]
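
The retrieve-then-cite flow can be sketched in a few lines. In FoJin the RAG pipeline runs in Dify, and this README does not describe how it merges results, so the sketch below substitutes a common technique, reciprocal rank fusion (RRF), to combine a vector ranking with a keyword ranking, and then builds a prompt whose passages are numbered so the answer can cite them. All function names and the prompt wording are assumptions.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of passage ids into one list.

    Each passage scores sum(1 / (k + rank)) over the lists it appears in.
    """
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken by id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def build_prompt(
    question: str,
    passages: Dict[str, str],
    ranked_ids: List[str],
    top_n: int = 3,
) -> str:
    """Number the top passages so the model can cite them as [n]."""
    lines = ["Answer using only the numbered passages and cite them as [n].", ""]
    for n, doc_id in enumerate(ranked_ids[:top_n], start=1):
        lines.append(f"[{n}] ({doc_id}) {passages[doc_id]}")
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]   # placeholder ids from a vector search
    keyword_hits = ["b", "d"]       # placeholder ids from a keyword search
    print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
```

A passage found by both retrievers ("b" in the placeholder lists) rises to the top, which is the main reason to fuse rankings rather than concatenate them.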

Manuscript Viewer

Browse digitized manuscripts and rare editions from BDRC and other institutions via the IIIF (International Image Interoperability Framework) protocol.
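
For readers new to IIIF, the Image API addresses every image through a fixed URL pattern: `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The sketch below builds such URLs. The server address and identifier in the usage example are placeholders, not real BDRC endpoints.

```python
from urllib.parse import quote


def iiif_image_url(
    base: str,
    identifier: str,
    region: str = "full",
    size: str = "max",
    rotation: str = "0",
    quality: str = "default",
    fmt: str = "jpg",
) -> str:
    """Build an IIIF Image API request URL.

    The identifier is percent-encoded in full (including slashes),
    as the Image API requires.
    """
    ident = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


def iiif_info_url(base: str, identifier: str) -> str:
    """URL of the image's info.json, which describes sizes and tiles."""
    return f"{base.rstrip('/')}/{quote(identifier, safe='')}/info.json"


if __name__ == "__main__":
    base = "https://iiif.example.org/image"  # placeholder server
    print(iiif_image_url(base, "vol1/page 3"))
    # A thumbnail that fits within 200x200 pixels, keeping the aspect ratio:
    print(iiif_image_url(base, "vol1/page 3", size="!200,200"))
```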

Data Sources

450+ data sources from 30 countries

FoJin aggregates data from major Buddhist digital projects worldwide:

| Source | Content | Languages |
| --- | --- | --- |
| CBETA | Chinese Buddhist Canon | Classical Chinese |
| SuttaCentral | Early Buddhist Texts | Pali, Chinese, English |
| 84000 | Tibetan Buddhist Canon | Tibetan, English, Sanskrit |
| BDRC | Tibetan manuscripts (IIIF) | Tibetan |
| SAT | Taisho Tripitaka | Chinese, Japanese |
| GRETIL | Sanskrit e-texts | Sanskrit |
| DSBC | Digital Sanskrit Buddhist Canon | Sanskrit |
| Gandhari.org | Gandhari manuscripts | Gandhari |
| VRI Tipitaka | Pali Canon (Chattha Sangayana) | Pali |
| Korean Tripitaka | Goryeo Tripitaka | Chinese, Korean |

+ 398 more...

Tech Stack

| Layer | Technology |
| --- | --- |
| Frontend | React 18, TypeScript, Vite, Ant Design 5, Zustand, TanStack Query |
| Backend | FastAPI, SQLAlchemy (async), Pydantic v2 |
| Database | PostgreSQL 15 + pgvector + pg_trgm |
| Search | Elasticsearch 8 (ICU tokenizer) |
| Cache | Redis 7 |
| AI | Dify + RAG (vector + keyword dual retrieval) |
| Deploy | Docker Compose, Nginx (gzip_static, security headers) |
| CI | GitHub Actions |

Architecture

                    +-----------+
                    |  Nginx    |  (gzip, security headers, static cache)
                    +-----+-----+
                          |
              +-----------+-----------+
              |                       |
        +-----+-----+          +-----+-----+
        |  React 18  |          |  FastAPI   |
        |  (Vite)    |          |  (async)   |
        +------------+          +-----+------+
                                      |
                    +---------+-------+---------+
                    |         |       |         |
              +-----+   +----+--+ +--+---+ +---+----+
              | PG 15 |  | ES 8  | |Redis | | Dify   |
              |pgvector|  | ICU  | |cache | | RAG/AI |
              +--------+  +------+ +------+ +--------+

Development

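# Backend (run from the repository root)
cd backend
python -m venv .venv && source .venv/bin/activate
pip install -r requirements-dev.txt
alembic upgrade head             # apply database migrations
uvicorn app.main:app --reload    # dev server with auto-reload

# Frontend (in a second terminal, from the repository root)
cd frontend
npm install
npm run dev

# Tests (from the repository root)
cd backend && pytest tests/ -q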

Security

  • Non-root containers (backend: app, frontend: nginx)
  • Multi-stage Docker builds (no build tools in production)
  • Internal services bound to 127.0.0.1 only
  • Memory/CPU limits per container
  • CSP, X-Frame-Options, X-Content-Type-Options headers
  • Query length limits on all search parameters
  • JWT with 8h expiry; production requires a strong secret
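
To illustrate the last item, here is a minimal sketch of an HS256 token with an 8-hour lifetime and a secret-strength check, written with the standard library only so that it runs anywhere. The README does not say which JWT library FoJin uses or how it defines a "strong" secret. The 32-byte minimum and all function names are assumptions, and production code would normally rely on a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional

TOKEN_TTL_SECONDS = 8 * 60 * 60  # 8h expiry, as listed above
MIN_SECRET_BYTES = 32            # assumption: what counts as "strong" here


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(subject: str, secret: bytes, now: Optional[float] = None) -> str:
    if len(secret) < MIN_SECRET_BYTES:
        raise ValueError("secret is too short for production use")
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "iat": issued, "exp": issued + TOKEN_TTL_SECONDS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> Dict[str, Any]:
    """Return the payload, or raise ValueError if the token is invalid or expired."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if current >= payload["exp"]:
        raise ValueError("token expired")
    return payload
```

The optional `now` parameter exists only to make expiry testable without waiting eight hours.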

Contributing

Contributions are welcome! Whether it's adding a new data source, improving search, fixing bugs, or translating the UI — we'd love your help.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feat/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feat/amazing-feature)
  5. Open a Pull Request

See CONTRIBUTING.md for detailed guidelines.

Roadmap

  • Citation export (BibTeX, RIS, APA)
  • Mobile-responsive reader
  • Public REST API with rate limiting
  • User annotations
  • Community-contributed data sources
  • Internationalization (i18n) — Japanese, Korean, Thai, Vietnamese UI
  • OCR pipeline for scanned texts
  • Embedding-based semantic search across all texts
  • Collaborative annotation sharing
  • API documentation and developer portal
  • Integration with Zotero and reference managers

License

Apache License 2.0 — applies to FoJin source code only. Third-party data sources retain their own licenses (CC BY-NC-SA, CC0, CC BY-NC-ND, etc.). See NOTICE for details.

Acknowledgments

FoJin is built on the generous work of the global Buddhist digital humanities community. Special thanks to:

  • CBETA — Chinese Buddhist Electronic Text Association
  • SuttaCentral — Early Buddhist Texts
  • BDRC — Buddhist Digital Resource Center
  • 84000 — Translating the Words of the Buddha
  • SAT — SAT Daizokyo Text Database
  • All other data source providers listed in the Sources page

If FoJin is useful for your research, please consider giving it a star!

Discussions  ·  Issues  ·  Contributing  ·  contact@fojin.app

Made with care for the Buddhist studies community.