A Kotlin Multiplatform library + JVM tooling to build and query a unified SQLite database of Jewish texts (Sefaria + Otzaria), with Lucene-based full-text search.
SeforimLibrary is a comprehensive solution for working with Jewish religious texts from the Otzaria database. The project converts the original Otzaria database into a modern SQLite database with full-text search capabilities using FTS5, making it efficient to search through large volumes of text.
The library is structured as a set of modules that can be imported via Maven:
- core: Contains data models representing entities like books, authors, categories, and lines of text
- dao: Provides database access objects and repositories for interacting with the SQLite database
- otzariasqlite: Otzaria import/enrichment tooling (append into an existing DB)
- catalog: Precomputed catalog builder (catalog.pb)
- searchindex: Lucene index builders (text + lookup)
- packaging: Release/bundling tooling (release info + .tar.zst bundle)
- sefariasqlite: One-step Sefaria export → SQLite importer
Generation tooling modules are grouped under generator/ on disk (Gradle module names stay :sefariasqlite, :otzariasqlite, etc.).
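One way such a layout is commonly wired in Gradle is to keep the short project path and redirect its directory in the settings script. The fragment below is only a sketch of that technique: the sub-directory names under generator/ are assumptions, and the project's actual settings file may differ.

```kotlin
// settings.gradle.kts (illustrative sketch, not the project's actual file)
// Assumption: each tooling module lives in generator/<module-name>.
include(":sefariasqlite", ":otzariasqlite")

// Keep the short Gradle path (:sefariasqlite) while pointing Gradle
// at the module's real location on disk.
project(":sefariasqlite").projectDir = file("generator/sefariasqlite")
project(":otzariasqlite").projectDir = file("generator/otzariasqlite")
```

With this kind of mapping, task paths such as :sefariasqlite:generateSefariaSqlite stay unchanged even though the sources sit under generator/.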
Features:
- Convert Otzaria database to SQLite format
- Efficient full-text search using SQLite's FTS5
- Hierarchical category and book organization
- Table of contents navigation for books
- Support for links between related texts
- Comprehensive data model for Jewish religious texts
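To illustrate the hierarchical category organization, here is a minimal, self-contained sketch of a depth-first walk over a category tree. The `Category` class and the sample data are hypothetical stand-ins, and the `childrenOf` lambda is a placeholder for a lookup such as `getCategoryChildren(parentId)`; the library's real types and signatures may differ.

```kotlin
// Hypothetical stand-in for the library's category model.
data class Category(val id: Long, val title: String)

/**
 * Depth-first walk over a category tree.
 * `roots` and `childrenOf` are placeholders for calls such as
 * getRootCategories() and getCategoryChildren(parentId).
 * Returns each category paired with its depth (roots are depth 0).
 */
fun walkCategories(
    roots: List<Category>,
    childrenOf: (Long) -> List<Category>,
): List<Pair<Int, Category>> {
    val out = mutableListOf<Pair<Int, Category>>()
    fun visit(category: Category, depth: Int) {
        out += depth to category
        childrenOf(category.id).forEach { visit(it, depth + 1) }
    }
    roots.forEach { visit(it, 0) }
    return out
}

fun main() {
    // In-memory sample data, made up for illustration only.
    val children = mapOf(
        1L to listOf(Category(2, "Torah"), Category(3, "Prophets")),
        2L to listOf(Category(4, "Genesis")),
    )
    val flat = walkCategories(listOf(Category(1, "Tanakh"))) { children[it].orEmpty() }
    // Print an indented outline, two spaces per level.
    flat.forEach { (depth, c) -> println("  ".repeat(depth) + c.title) }
}
```

Passing the lookups in as parameters keeps the traversal independent of the database layer, so the same function can be exercised against in-memory data or a real repository.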
SeforimLibrary is typically consumed from the SeforimApp/ project in the parent repo.
Requirements:
- JDK 11 or higher
- Kotlin 1.9.0 or higher
- SQLite 3.35.0 or higher (for FTS5 support)
// Initialize the database
val dbPath = "path/to/your/database.db"
val driver = JdbcSqliteDriver(url = "jdbc:sqlite:$dbPath")
val repository = SeforimRepository(dbPath, driver)

// Search in all books
val searchResults = repository.search("your search query", limit = 20, offset = 0)
// Search in a specific book
val bookSearchResults = repository.searchInBook(bookId, "your search query")
// Search by author
val authorSearchResults = repository.searchByAuthor("author name", "your search query")

// Get root categories
val rootCategories = repository.getRootCategories()
// Get subcategories
val subcategories = repository.getCategoryChildren(parentId)
// Get books in a category
val books = repository.getBooksByCategory(categoryId)

// Get book details
val book = repository.getBook(bookId)
// Get lines of text
val lines = repository.getLines(bookId, startIndex, endIndex)
// Get table of contents
val toc = repository.getBookToc(bookId)

./gradlew generateSeforimDb
Outputs (by default): build/seforim.db + build/catalog.pb

./gradlew packageSeforimBundle

# 1) Base DB from Sefaria
./gradlew :sefariasqlite:generateSefariaSqlite
# 2) Append Otzaria (lines + links)
./gradlew :otzariasqlite:appendOtzaria
# 3) Build precomputed catalog
./gradlew :catalog:buildCatalog
# 4) Build Lucene indexes (creates build/seforim.db.lucene + build/seforim.db.lookup.lucene)
./gradlew :searchindex:buildLuceneIndexDefault
# 5) Download lexical.db next to the DB (used by the app; auto-run by packaging)
./gradlew :packaging:downloadLexicalDb
# 6) Package everything into a single .tar.zst (plus split parts)
./gradlew :packaging:packageArtifacts

- core: Contains data models and extensions
  - models: Data classes representing entities in the database
  - extensions: Utility extensions for working with the models
- dao: Database access layer
  - repository: Repository classes for accessing the database
  - extensions: Extensions for converting between database and model objects
  - sqldelight: SQL queries and database schema
- generator/: Grouping folder for JVM generation tooling modules
- otzariasqlite: Otzaria enrichment tools
  - DatabaseGenerator: Converts Otzaria sources into SQLite rows
  - GenerateLines / GenerateLinks: phase tasks (see Gradle tasks)
- catalog: Precomputed catalog tools
  - BuildCatalog: builds catalog.pb from a SQLite DB
- searchindex: Lucene indexing tools
  - LuceneTextIndexWriter / LuceneLookupIndexWriter: JVM Lucene writers
  - BuildLuceneIndex: CLI entrypoint used by Gradle tasks
- packaging: Release/bundling tools
  - WriteReleaseInfo / PackageArtifacts: bundle .tar.zst for distribution
- sefariasqlite: Sefaria direct importer
  - SefariaDirectImporter / GenerateSefariaSqlite: Sefaria export → SQLite
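
The tooling modules above correspond to the step-by-step Gradle tasks listed in this README. As a rough sketch, those tasks could be run in order from Kotlin, stopping at the first failure. The helper names (`gradleCommand`, `runPipeline`) and the `./gradlew` wrapper path are assumptions for illustration, not part of the project.

```kotlin
import java.io.File

// Task paths copied from the step-by-step pipeline in this README, in order.
val pipelineTasks = listOf(
    ":sefariasqlite:generateSefariaSqlite",
    ":otzariasqlite:appendOtzaria",
    ":catalog:buildCatalog",
    ":searchindex:buildLuceneIndexDefault",
    ":packaging:downloadLexicalDb",
    ":packaging:packageArtifacts",
)

/** Builds the command line for one task; the wrapper path is an assumption. */
fun gradleCommand(task: String, wrapper: String = "./gradlew"): List<String> =
    listOf(wrapper, task)

/**
 * Runs each task in order inside `projectDir` and stops at the first failure.
 * Returns the task that failed, or null if every task exited with code 0.
 */
fun runPipeline(projectDir: File, tasks: List<String> = pipelineTasks): String? {
    for (task in tasks) {
        val exitCode = ProcessBuilder(gradleCommand(task))
            .directory(projectDir)
            .inheritIO() // stream Gradle output to the current console
            .start()
            .waitFor()
        if (exitCode != 0) return task
    }
    return null
}

fun main() {
    val failed = runPipeline(File("."))
    println(failed?.let { "Stopped at $it" } ?: "Pipeline finished")
}
```

Running the steps one process at a time makes it easy to see which stage failed; invoking the aggregate Gradle task is simpler when that granularity is not needed.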