Fix export/import: runtime bugs, add vectors, compression, validation #111

Merged: gvonness-apolitical merged 3 commits into main from fix/export-import-overhaul on Feb 16, 2026

@gvonness-apolitical
Contributor

Summary

  • Fix 7 runtime bugs in archive.ts: column/table name mismatches vs schema (source_id → source_chunk_id, type → edge_type, weight → initial_weight, cluster_members → chunk_clusters, missing id/created_at/link_count on edges, missing session_id/project_path on chunks)
  • Add vector embedding export/import so semantic search works after restore (with --no-vectors opt-out)
  • Add full cluster data: centroid, exemplar IDs, distances, membership hash (previously empty shells)
  • Add gzip compression — auto-detected on import, always-on for export
  • Add validation and dry-run — version check, count verification, dangling edge detection, --dry-run flag
  • Wire CLI flags that were implemented but never passed through: --projects, --redact-paths, --redact-code, --no-vectors, --dry-run
  • Fix edge filtering — both endpoints must be in the export (no dangling refs)
  • Add ExportResult/ImportResult return types with formatted CLI summary
  • Bump archive version to 1.1 (backward-compatible with 1.0)
  • Replace 19 interface-only tests with 27 integration tests using real in-memory databases
  • Fix docs — remove phantom --format/--replace flags, fix magic bytes description (CST\0 not Causantic\0)
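
The "auto-detected on import, always-on for export" behaviour for gzip can be illustrated with a minimal sketch. This is not the code from archive.ts. The helper names `isGzip`, `readArchivePayload`, and `writeArchivePayload` are hypothetical, and the sketch ignores encryption entirely, since this description does not say how compression and encryption are ordered. It relies only on the fact that every gzip stream begins with the bytes 0x1f 0x8b.

```typescript
import { gunzipSync, gzipSync } from "node:zlib";

// Gzip streams always start with the two magic bytes 0x1f 0x8b (RFC 1952).
function isGzip(buf: Uint8Array): boolean {
  return buf.length >= 2 && buf[0] === 0x1f && buf[1] === 0x8b;
}

// Hypothetical export-side helper: compression is always applied.
function writeArchivePayload(json: string): Buffer {
  return gzipSync(Buffer.from(json, "utf8"));
}

// Hypothetical import-side helper: decompress only when the gzip header
// is present, so uncompressed archives still load.
function readArchivePayload(raw: Buffer): string {
  const bytes = isGzip(raw) ? gunzipSync(raw) : raw;
  return bytes.toString("utf8");
}
```

Sniffing the header rather than trusting a file extension or a flag means an importer written this way accepts both compressed and plain archives without the user having to say which one they have.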

Test plan

  • npm run build — no type errors
  • npm test — 82 files, 1707 tests pass (27 new archive integration tests)
  • Manual: npx causantic export --output /tmp/test.json --no-encrypt — verify JSON contains vectors and full cluster data
  • Manual: npx causantic import /tmp/test.json — verify data restored, recall returns results
  • Manual: npx causantic import /tmp/test.json --dry-run — verify summary printed, DB unchanged

archive.ts had 7 column/table name mismatches vs the actual schema
(source_id→source_chunk_id, type→edge_type, weight→initial_weight,
cluster_members→chunk_clusters, etc.) that would crash at runtime.
All 19 tests were interface-only so never caught this.

Changes:
- Fix all SQL column/table names to match schema.sql
- Add vector embedding export/import (semantic search works after import)
- Add full cluster data: centroid, exemplar IDs, distances, membership hash
- Add gzip compression (auto-detected on import)
- Add validateArchive() with version, count, and referential integrity checks
- Add dry-run import option
- Add ExportResult/ImportResult return types with summary
- Filter edges to require both endpoints in export (no dangling refs)
- Wire CLI flags: --projects, --redact-paths, --redact-code, --no-vectors, --dry-run
- Print formatted summary after export/import operations
- Bump archive version to 1.1 (backward-compatible with 1.0)
- Replace 19 interface-only tests with 27 integration tests using real DBs
- Fix docs: remove phantom --format/--replace flags, fix magic bytes (CST\0)
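
The edge filtering and referential integrity check listed above come down to one set-membership test applied in two directions. The sketch below is illustrative only: the row shapes are reduced to the fields needed, `target_chunk_id` is an assumed column name (only `source_chunk_id` is confirmed here), and `filterEdgesForExport` and `findDanglingEdges` are hypothetical names, not the functions in archive.ts.

```typescript
// Hypothetical minimal row shapes; the real archive types carry more fields.
interface ChunkRow {
  id: string;
}

interface EdgeRow {
  id: string;
  source_chunk_id: string;
  target_chunk_id: string; // assumed name, mirroring source_chunk_id
}

// Export side: keep an edge only if BOTH endpoints are exported chunks.
function filterEdgesForExport(chunks: ChunkRow[], edges: EdgeRow[]): EdgeRow[] {
  const chunkIds = new Set(chunks.map((c) => c.id));
  return edges.filter(
    (e) => chunkIds.has(e.source_chunk_id) && chunkIds.has(e.target_chunk_id),
  );
}

// Import side: report edges that reference a chunk missing from the archive.
function findDanglingEdges(chunks: ChunkRow[], edges: EdgeRow[]): EdgeRow[] {
  const chunkIds = new Set(chunks.map((c) => c.id));
  return edges.filter(
    (e) => !chunkIds.has(e.source_chunk_id) || !chunkIds.has(e.target_chunk_id),
  );
}
```

Under these assumptions, an archive written through the export filter yields an empty result from the import-side check, so any dangling edge found on import points to an older or hand-edited archive.
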

Remove unused imports (gzipSync, ExportResult, ImportResult) and
unused destructured variable.

gvonness-apolitical merged commit 892b62e into main on Feb 16, 2026
3 checks passed
gvonness-apolitical deleted the fix/export-import-overhaul branch on February 16, 2026 22:26