CLI-first personal knowledge capture and retrieval for human+agent workflows. Save URLs, ingest content, annotate, tag, search, and brief -- all from the terminal.

- Runtime: Node.js >= 22 (ESM-only, `"type": "module"`)
- Language: TypeScript 5.8, strict mode, `ES2022` target, `NodeNext` module resolution
- Database: SQLite via `better-sqlite3` with WAL mode, FTS5 for full-text search
- CLI framework: Commander
- Tests: Node built-in test runner (`node:test`) with `node:assert/strict`
- Build: `tsc` to `dist/`, dev runner via `tsx`

src/
  cli/index.ts -- Commander entry point, all commands registered here
  adapters/ -- Source-specific fetch+parse (article, youtube, x, pdf, bluesky, linkedin)
  db/database.ts -- SQLite connection, migration runner
  repositories/ -- One class per table, raw SQL queries, typed inputs/outputs
  services/ -- Business logic orchestrating repositories + adapters
  lib/ -- Shared types, error handling, ID generation, URL utils, output formatting
db/migrations/ -- Sequential .sql migration files (001_init.sql, etc.)
test/
  unit/ -- Adapter and utility tests (mock fetch, no DB)
  integration/ -- Full-stack tests using withTempDb helper (real SQLite)
  helpers/temp-db.ts -- Creates ephemeral DB via LINKLEDGER_DB_PATH env var

Layering: CLI -> Services -> Repositories -> Database. Adapters are called by services (ingest worker). Services receive a ServiceContext containing all repositories and the DB handle.
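
The layering can be pictured with a minimal sketch. Only the `ServiceContext` name and the direction of the dependencies come from the description above; `ItemRepository`, `ItemService`, `createContext`, `runSaveCommand`, and the `Map` standing in for the SQLite handle are hypothetical stand-ins, not the project's actual code.

```typescript
// Hypothetical sketch of CLI -> Services -> Repositories -> Database.
// A Map stands in for the SQLite handle so the example runs on its own.

interface Item {
  id: string;
  canonicalUrl: string;
}

type FakeDb = Map<string, Item>;

// Repository: the only layer that touches storage.
class ItemRepository {
  private readonly db: FakeDb;

  constructor(db: FakeDb) {
    this.db = db;
  }

  insert(item: Item): void {
    this.db.set(item.id, item);
  }

  findByUrl(canonicalUrl: string): Item | undefined {
    return Array.from(this.db.values()).find((item) => item.canonicalUrl === canonicalUrl);
  }
}

// ServiceContext: bundles every repository plus the DB handle.
interface ServiceContext {
  db: FakeDb;
  items: ItemRepository;
}

function createContext(): ServiceContext {
  const db: FakeDb = new Map();
  return { db, items: new ItemRepository(db) };
}

// Service: business logic; talks to repositories, never to storage directly.
class ItemService {
  private readonly ctx: ServiceContext;

  constructor(ctx: ServiceContext) {
    this.ctx = ctx;
  }

  save(canonicalUrl: string): Item {
    const existing = this.ctx.items.findByUrl(canonicalUrl);
    if (existing) return existing; // saving the same URL twice returns the same item
    const item: Item = { id: `item_${this.ctx.db.size + 1}`, canonicalUrl };
    this.ctx.items.insert(item);
    return item;
  }
}

// CLI layer: receives parsed arguments and delegates to a service.
function runSaveCommand(ctx: ServiceContext, url: string): string {
  const item = new ItemService(ctx).save(url);
  return JSON.stringify({ ok: true, data: item });
}
```

Passing one context object keeps constructors small and lets integration tests build the whole stack from a single temporary database.
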

- All imports use explicit `.js` extensions (NodeNext resolution)
- IDs are deterministic where possible (`itemIdFromCanonicalUrl`) or randomish (`createRandomishId`)
- Dates are ISO 8601 strings, generated via `nowIso()`
- SQL uses named parameters (`@param`) with `better-sqlite3` bindings
- Transactions use `db.transaction()` for multi-statement atomicity
- Errors use `AppError(code, message, retryable)` -- retryable flag controls job retry behavior
- JSON output uses `{ ok: true, data }` / `{ ok: false, error }` envelope pattern
- Tests mock `globalThis.fetch` directly (no libraries), restore in `finally`
- Test concurrency is disabled (`--test-concurrency=1`) because tests share process-level env vars
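
The error and envelope conventions fit together roughly as sketched below. The `AppError(code, message, retryable)` signature and the `{ ok, data }` / `{ ok, error }` shapes come from the list above; the field layout inside `error`, the `toEnvelope` helper, and the `"UNKNOWN"` code are assumptions made for illustration.

```typescript
// Hypothetical shapes: the real AppError and envelope types may differ.

class AppError extends Error {
  readonly code: string;
  readonly retryable: boolean;

  constructor(code: string, message: string, retryable: boolean) {
    super(message);
    this.name = "AppError";
    this.code = code;
    this.retryable = retryable;
  }
}

type Envelope<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string; retryable: boolean } };

// Run an operation and wrap its result or failure in the envelope.
function toEnvelope<T>(run: () => T): Envelope<T> {
  try {
    return { ok: true, data: run() };
  } catch (err) {
    if (err instanceof AppError) {
      return {
        ok: false,
        error: { code: err.code, message: err.message, retryable: err.retryable },
      };
    }
    // Unknown failures are treated as non-retryable in this sketch.
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: { code: "UNKNOWN", message, retryable: false } };
  }
}
```

A caller that prints JSON can then serialize the result of `toEnvelope(...)` directly, and a job runner can read `error.retryable` to decide whether to schedule another attempt.
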

- All user input MUST go through named parameters (`@param`) -- never interpolate into SQL strings
- Dynamic IN clauses use indexed params (`@tag0`, `@tag1`, ...) built from arrays
- FTS5 queries go through `toFtsQuery()` which sanitizes tokens before MATCH
- Watch for SQL injection in any new query code
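
Both rules can be sketched as small pure functions. `buildInClause` is a hypothetical helper name, and this `toFtsQuery` is a guess at the behavior (keep letter and digit runs, quote each token), not the project's actual implementation.

```typescript
// Hypothetical helpers: the real toFtsQuery() and query builders may differ.

type NamedParams = Record<string, string>;

// Build "@tag0, @tag1, ..." placeholders plus the matching bindings object.
function buildInClause(
  prefix: string,
  values: string[],
): { sql: string; params: NamedParams } {
  const params: NamedParams = {};
  const placeholders = values.map((value, i) => {
    params[`${prefix}${i}`] = value; // binding keys carry no "@" prefix
    return `@${prefix}${i}`;
  });
  return { sql: placeholders.join(", "), params };
}

// Sanitize free text for FTS5 MATCH: keep only letter/digit runs and wrap
// each one in double quotes so operators such as OR or NOT become plain terms.
function toFtsQuery(input: string): string {
  const tokens = input.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? [];
  return tokens.map((token) => `"${token}"`).join(" ");
}
```

Only the placeholder list is spliced into the statement text, for example `WHERE tag IN (${clause.sql})`, while the values travel separately in `clause.params`. An empty input array yields an empty placeholder list, so a caller would likely skip the clause in that case.
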

- Every adapter implements `SourceAdapter` interface: `supports()`, `detectType()`, `fetchAndParse()`
- `fetchAndParse` must return `AdapterParseResult` with metadata, chunks, checksum, fetchedAt
- Non-retryable parse failures use `AppError(..., false)`, transient network errors use `AppError(..., true)`
- New adapters need a corresponding unit test with mocked fetch
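
A toy adapter shows how these requirements might look in code. The method names and the four result fields come from the list above; the parameter types, the `metadata` shape, the error codes, and `plainTextAdapter` itself are assumptions for illustration only.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shapes inferred from the list above; the real interfaces may differ.
interface AdapterParseResult {
  metadata: { title: string };
  chunks: string[];
  checksum: string;
  fetchedAt: string;
}

interface SourceAdapter {
  supports(url: URL): boolean;
  detectType(url: URL): string;
  fetchAndParse(url: URL): Promise<AdapterParseResult>;
}

class AppError extends Error {
  readonly code: string;
  readonly retryable: boolean;

  constructor(code: string, message: string, retryable: boolean) {
    super(message);
    this.code = code;
    this.retryable = retryable;
  }
}

// Toy adapter for plain-text URLs.
const plainTextAdapter: SourceAdapter = {
  supports: (url) => url.pathname.endsWith(".txt"),
  detectType: () => "article",
  async fetchAndParse(url) {
    let response: Response;
    try {
      response = await fetch(url);
    } catch {
      // Transient network failure: worth retrying.
      throw new AppError("FETCH_FAILED", `network error for ${url.href}`, true);
    }
    const body = await response.text();
    if (body.trim() === "") {
      // Parse failure: retrying would not help.
      throw new AppError("PARSE_FAILED", "empty document", false);
    }
    // Split on blank lines; each paragraph becomes one chunk.
    const chunks = body
      .split(/\n{2,}/)
      .map((chunk) => chunk.trim())
      .filter((chunk) => chunk.length > 0);
    return {
      metadata: { title: (chunks[0] ?? "").slice(0, 80) },
      chunks,
      checksum: createHash("sha256").update(body).digest("hex"),
      fetchedAt: new Date().toISOString(),
    };
  },
};
```

Because the adapter only reaches the network through the global `fetch`, a unit test can swap `globalThis.fetch` for a stub and exercise both the success path and each error path without real requests.
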

- The `search_fts` virtual table indexes `title`, `chunk_text`, `annotation_text` with `item_id UNINDEXED`
- BM25 weights are `(2.5, 1.0, 2.0)` for title, chunk, annotation columns
- Ranking combines BM25 + pinned boost - low-confidence penalty
- Any changes to FTS schema require a new numbered migration
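
The ranking formula can be illustrated with a small scoring function. Only the shape of the formula comes from the list above; the constant values, the field names, and the assumption that the BM25 term has already been normalized so that higher means more relevant are all hypothetical. (SQLite's own `bm25()` returns numerically smaller values for better matches, so a real query would need to account for the sign.)

```typescript
// Hypothetical constants and field names; only the formula's shape
// (BM25 + pinned boost - low-confidence penalty) is taken from the text.

interface SearchHit {
  itemId: string;
  bm25: number; // assumed normalized so that higher is better
  pinned: boolean;
  lowConfidence: boolean;
}

const PINNED_BOOST = 1.0; // assumed value
const LOW_CONFIDENCE_PENALTY = 0.5; // assumed value

function rankScore(hit: SearchHit): number {
  const boost = hit.pinned ? PINNED_BOOST : 0;
  const penalty = hit.lowConfidence ? LOW_CONFIDENCE_PENALTY : 0;
  return hit.bm25 + boost - penalty;
}

// Return a new array sorted from best to worst score.
function rankHits(hits: SearchHit[]): SearchHit[] {
  return [...hits].sort((a, b) => rankScore(b) - rankScore(a));
}
```

With these assumed constants, a pinned item scoring 1.5 on BM25 (total 2.5) outranks an unpinned item scoring 2.0, while a low-confidence item scoring 2.2 drops to 1.7.
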

- Foreign keys are enforced (`PRAGMA foreign_keys = ON`)
- Items have a unique constraint on `canonical_url`
- Tags have a unique constraint on `(item_id, tag, actor)`
- Content chunks have a unique constraint on `(item_id, chunk_index)`
- Migrations are sequential and idempotent (use `IF NOT EXISTS`)
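
How a runner might pick which migrations to apply can be sketched as follows. The `NNN_name.sql` naming follows the `001_init.sql` example; `pendingMigrations`, the idea of tracking applied filenames in a set, and the column list in the sample SQL are assumptions rather than the project's actual runner or schema.

```typescript
// Hypothetical helper: selects migration files that still need to run.

function pendingMigrations(files: string[], applied: Set<string>): string[] {
  return files
    .filter((file) => /^\d{3}_.+\.sql$/.test(file)) // ignore stray files
    .sort() // zero-padded prefixes sort lexicographically in numeric order
    .filter((file) => !applied.has(file));
}

// Example of an idempotent migration body (assumed columns).
// Re-running it is harmless because of IF NOT EXISTS.
const exampleMigrationSql = `
CREATE TABLE IF NOT EXISTS items (
  id TEXT PRIMARY KEY,
  canonical_url TEXT NOT NULL UNIQUE
);
`;
```

Sorting by filename is enough here because the numeric prefix is zero-padded to a fixed width.
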

- Unit tests go in `test/unit/`, integration tests in `test/integration/`
- Integration tests must use `withTempDb()` for isolation
- Always restore global state (`globalThis.fetch`, env vars) in `finally` blocks
- Run `npm test` and `npm run typecheck` before approving
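
The "restore in `finally`" rule can be wrapped in small helpers. `withMockedFetch` and `withEnv` are hypothetical names used only for this sketch; the project's tests may instead inline the same try/finally pattern.

```typescript
// Hypothetical helpers illustrating the "restore in finally" rule.

type FetchLike = typeof globalThis.fetch;

// Swap globalThis.fetch for the duration of `run`, then put the original back.
async function withMockedFetch<T>(mock: FetchLike, run: () => Promise<T>): Promise<T> {
  const original = globalThis.fetch;
  globalThis.fetch = mock;
  try {
    return await run();
  } finally {
    globalThis.fetch = original; // restored even if `run` throws
  }
}

// Set an env var for the duration of `run`, then restore its previous state.
async function withEnv<T>(name: string, value: string, run: () => Promise<T>): Promise<T> {
  const previous = process.env[name];
  process.env[name] = value;
  try {
    return await run();
  } finally {
    if (previous === undefined) delete process.env[name];
    else process.env[name] = previous;
  }
}

// Example: the code under test sees the stubbed response.
async function demo(): Promise<string> {
  return withMockedFetch(
    async () => new Response("hello"),
    async () => (await fetch("https://example.com")).text(),
  );
}
```

Both helpers mutate process-wide state, which is why tests that use this pattern cannot safely run concurrently in the same process.
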