feat(t3code): SQLite crash resilience — WAL checkpoint, backup, integrity check by lukstafi · Pull Request #96 · lukstafi/ludics

lukstafi · 2026-03-28T15:40:53Z

Summary

Addresses the 2026-03-26 state.sqlite corruption incident. Three defensive measures added to src/t3code/server.ts (ludics-side only, no upstream t3code changes):

WAL checkpoint on clean stop: stopServer() opens the DB via bun:sqlite and runs PRAGMA wal_checkpoint(TRUNCATE) before sending SIGTERM, flushing all WAL frames to the main file. Falls back to PASSIVE if an exclusive lock is unavailable. Errors are non-fatal — the stop always proceeds.
Backup before start: ensureServer() copies state.sqlite (+ WAL/SHM side-files) to .bak, rotating the previous .bak → .bak.1 (last 2 backups kept, ~360 MB for a 180 MB DB).
Integrity check + auto-recovery on start: PRAGMA integrity_check runs before each startup. On failure: (1) try sqlite3 .recover piped into a fresh DB and verify with another integrity_check, swapping it in if good; (2) fall back to .bak if .recover fails. Corrupt DB is preserved as .corrupt.
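
For illustration, the TRUNCATE-then-PASSIVE fallback from the first item could look roughly like the sketch below. This is not the code from src/t3code/server.ts: checkpointBeforeStop, runCheckpoint, and the outcome labels are hypothetical names, and the driver call is injected so the sketch stays independent of bun:sqlite. The row shape (busy, log, checkpointed) is what SQLite's wal_checkpoint pragma reports, with busy set to 1 when a TRUNCATE checkpoint was blocked.

```typescript
// Row returned by PRAGMA wal_checkpoint(...); busy === 1 means the
// checkpoint was blocked from completing.
interface CheckpointRow { busy: number; log: number; checkpointed: number }

type CheckpointMode = "TRUNCATE" | "PASSIVE";
type CheckpointOutcome = "truncated" | "passive" | "skipped";

// Hypothetical helper. With bun:sqlite, runCheckpoint might be wired as
//   (mode) => db.query(`PRAGMA wal_checkpoint(${mode})`).get() as CheckpointRow
// (an assumption, not taken from the PR).
function checkpointBeforeStop(
  runCheckpoint: (mode: CheckpointMode) => CheckpointRow,
  log: (msg: string) => void = console.error,
): CheckpointOutcome {
  try {
    const row = runCheckpoint("TRUNCATE");
    if (row.busy === 0) return "truncated";
    log("wal_checkpoint(TRUNCATE) was blocked; retrying with PASSIVE");
  } catch (err) {
    log(`wal_checkpoint(TRUNCATE) failed: ${String(err)}`);
  }
  try {
    runCheckpoint("PASSIVE");
    return "passive";
  } catch (err) {
    // Non-fatal by design: the caller still proceeds to send SIGTERM.
    log(`wal_checkpoint(PASSIVE) failed: ${String(err)}`);
    return "skipped";
  }
}
```

The function never throws, which matches the stated intent that a failed checkpoint must not block the stop.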

Test plan

Normal start/stop cycle works with no visible changes
ludics t3code stop — verify state.sqlite-wal is truncated to 0 bytes after stop
Start server 3 times — verify state.sqlite.bak and state.sqlite.bak.1 are present, no .bak.2
Corrupt a test DB copy, restart — verify auto-recovery kicks in and .corrupt file is left behind
bun run typecheck passes ✅ | bun run build produces binary ✅

Closes task-393bb51c.

🤖 Generated with Claude Code

Proposal for task-010fa0f1 covering extraction of TASK_ID_RE, resolveTaskFile helper, reuse of updateFrontmatterField, and unification of duplicated priorityValue() across dashboard.ts and flow.ts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Covers two features: clickable retrospective links for done tasks, and a time-based filter (All/30d/7d/24h) using task file mtime. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…, deprecated fields Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…up, integrity check Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ntegrity check

On unclean shutdown, t3code's WAL-mode SQLite DB can be left in an inconsistent state. Add three defensive measures to src/t3code/server.ts:

1. WAL checkpoint on clean stop: before SIGTERM, open the DB via bun:sqlite and run PRAGMA wal_checkpoint(TRUNCATE), falling back to PASSIVE if an exclusive lock is unavailable. Errors are non-fatal.
2. Backup before start: copy state.sqlite (+ WAL/SHM side-files) to state.sqlite.bak with a rotating .bak → .bak.1 (keeping last 2 copies, ~360 MB total for a 180 MB DB).
3. Integrity check with auto-recovery: run PRAGMA integrity_check on startup; if it fails, attempt sqlite3 .recover piped into a fresh DB, verify the result with another integrity_check, then swap it in. If .recover fails or produces a bad DB, fall back to the .bak file. Corrupt DB is preserved as .corrupt for inspection.

All paths are logged to stderr.

Closes task-393bb51c.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 198f88d25b

chatgpt-codex-connector · 2026-03-28T15:44:49Z

src/t3code/server.ts

+    backupDb(dbPath);
+    checkAndRecoverDb(dbPath);


Run integrity check before taking startup backup

Taking a fresh backup before checkAndRecoverDb can overwrite the only known-good snapshot with a corrupt database. On a startup where state.sqlite is already damaged, backupDb rotates the previous .bak to .bak.1 and writes the corrupt file into .bak; if .recover then fails, the restore path reads .bak and can “recover” to the same broken state instead of the last healthy one. This defeats the intended corruption fallback and can make the data loss permanent.
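
A minimal sketch of the ordering this comment asks for, under stated assumptions: prepareDbForStart and the StartupSteps interface are hypothetical, and checkAndRecoverDb is assumed to return true when the DB is healthy afterwards (the PR does not show its return type). The steps are injected so the ordering can be exercised without a real database.

```typescript
// Hypothetical shape for the two startup steps named in the diff.
interface StartupSteps {
  // Assumed to return true when the DB passes integrity_check,
  // either directly or after a successful recovery.
  checkAndRecoverDb: (dbPath: string) => boolean;
  backupDb: (dbPath: string) => void;
}

// Check first, back up second: a corrupt DB never reaches .bak.
function prepareDbForStart(dbPath: string, steps: StartupSteps): boolean {
  const healthy = steps.checkAndRecoverDb(dbPath);
  if (healthy) {
    steps.backupDb(dbPath);
  }
  return healthy;
}
```

With this order, .bak is only ever written from a database that has just passed its check.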

chatgpt-codex-connector · 2026-03-28T15:44:49Z

src/t3code/server.ts

+  const bakPath = dbPath + ".bak";
+  if (existsSync(bakPath)) {


Fall back to .bak.1 during backup restore

The recovery branch only attempts state.sqlite.bak even though backup rotation keeps state.sqlite.bak.1. In the exact corruption scenario this feature targets, .bak may be the newly-copied bad snapshot while .bak.1 is the last good copy, so skipping .bak.1 can cause avoidable restore failure and startup with corrupted state.
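
One way to read this suggestion, sketched with hypothetical names: walk the backups from newest to oldest and restore the first one that both exists and passes an integrity check. pickRestorableBackup and its two injected predicates are assumptions made for illustration; in real code they would wrap existsSync and a PRAGMA integrity_check call.

```typescript
// Returns the newest backup that is present and healthy, or null if none is.
function pickRestorableBackup(
  dbPath: string,
  exists: (path: string) => boolean,
  passesIntegrityCheck: (path: string) => boolean,
): string | null {
  // Newest first, matching the .bak → .bak.1 rotation described in the PR.
  const candidates = [dbPath + ".bak", dbPath + ".bak.1"];
  for (const candidate of candidates) {
    if (exists(candidate) && passesIntegrityCheck(candidate)) {
      return candidate;
    }
  }
  return null;
}
```

Returning null leaves the decision about starting with no usable backup to the caller.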

…ak.1

Addresses review finding: backupDb() was running before checkAndRecoverDb(), so a corrupt DB would be written into .bak. If .recover then failed, the backup-restore fallback would restore the same corrupt state.

Three fixes:

1. Reorder in ensureServer(): checkAndRecoverDb() runs first; backupDb() only runs if the DB is healthy (including after a successful auto-recovery). .bak is now always a copy of a known-good database.
2. checkAndRecoverDb() backup fallback: verify each backup's integrity_check before restoring; skip to .bak.1 if .bak is also corrupt.
3. backupDb() rotation: WAL/SHM side-files (.bak-wal, .bak-shm) are now renamed to (.bak.1-wal, .bak.1-shm) during rotation so that .bak.1 remains self-consistent for restore.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
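
The rotation in the third fix can be pictured as an ordered list of file operations. The sketch below only computes that plan; backupRotationPlan, FileOp, and the injected exists predicate are hypothetical, and the actual backupDb() may be structured quite differently.

```typescript
type FileOp = { kind: "rename" | "copy"; from: string; to: string };

// Main file first, then the WAL and SHM side-files.
const SIDE_SUFFIXES = ["", "-wal", "-shm"] as const;

// Builds the operations for one rotation: move .bak (and its side-files)
// to .bak.1, then copy the live DB (and its side-files) to .bak.
function backupRotationPlan(
  dbPath: string,
  exists: (path: string) => boolean,
): FileOp[] {
  const ops: FileOp[] = [];
  const bak = dbPath + ".bak";
  const bak1 = dbPath + ".bak.1";
  if (exists(bak)) {
    for (const suffix of SIDE_SUFFIXES) {
      if (exists(bak + suffix)) {
        ops.push({ kind: "rename", from: bak + suffix, to: bak1 + suffix });
      }
    }
  }
  for (const suffix of SIDE_SUFFIXES) {
    if (exists(dbPath + suffix)) {
      ops.push({ kind: "copy", from: dbPath + suffix, to: bak + suffix });
    }
  }
  return ops;
}
```

The plan does not cover stale side-files left next to an older .bak.1; a real implementation would likely need to remove those as well so each backup stays self-consistent.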

lukstafi and others added 5 commits March 28, 2026 16:20

Add proposal for dashboard tasks.html retrospective links + time filter

81f7f7a

Add proposal for PR #69 followups: lifecycle tests, stop-hook env var…

ccec7dd

Add proposal for t3code SQLite crash resilience: WAL checkpoint, back…

2b19010

chatgpt-codex-connector bot reviewed Mar 28, 2026

lukstafi merged commit 628f96f into main Mar 28, 2026
1 check passed

lukstafi merged 6 commits into main from ludics/task-393bb51c-s2/root
