Recall Quality & Project Inspection — Full Impl Complete #56

@ashu17706

Context

Testing smriti ingest file followed by smriti recall revealed five critical gaps in recall quality and project inspection.

Problems Fixed

  1. Full-doc retrieval broken: Markdown files get split on \n\n+ (runs of blank lines) into paragraph-per-message rows, so recall returns individual fragments rather than the whole document.
  2. No project inspection: smriti projects only lists id - path. No session counts, tags, decisions, or agent breakdown per project.
  3. No tag discovery: smriti categories shows taxonomy tree with no usage counts. No way to ask "what tags are used in project X?"
  4. No project-scoped status: smriti status only reports global stats and has no --project scoping.
  5. No recall tests: tag-based recall, category-filtered search, full-doc retrieval — all untested.

Solution Implemented

Phase 1: Full Document Storage Fix

  • Added "document" format mode to src/ingest/parsers/generic.ts
  • Stores entire file as single user message (no paragraph splitting)
  • Title derives from first # heading, falls back to filename
  • --whole flag added to smriti ingest file command
  • Warning emitted for .md files ingested without --whole
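
The difference between the old paragraph splitting and the new "document" mode can be sketched as follows. This is a minimal illustration, not the code in src/ingest/parsers/generic.ts: the names ParsedMessage, parseParagraphs, parseDocument, deriveTitle, and shouldWarnAboutSplit are hypothetical, and the title rule simply follows the description above (first # heading, else the filename).

```typescript
// Hypothetical types and names; the real parser's interfaces are not shown in this issue.
interface ParsedMessage {
  role: "user";
  content: string;
}

interface ParsedDocument {
  title: string;
  messages: ParsedMessage[];
}

// Previous behavior as described in the issue: one message per paragraph,
// where paragraphs are separated by runs of blank lines.
function parseParagraphs(text: string): ParsedMessage[] {
  return text
    .split(/\n\n+/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0)
    .map((content) => ({ role: "user" as const, content }));
}

// Title comes from the first level-1 heading, else the filename without its extension.
function deriveTitle(text: string, filePath: string): string {
  const heading = text.match(/^#[ \t]+(.+)$/m);
  if (heading) return heading[1].trim();
  const base = filePath.split("/").pop() ?? filePath;
  return base.replace(/\.[^.]+$/, "");
}

// "document" mode: the whole file becomes a single user message.
function parseDocument(text: string, filePath: string): ParsedDocument {
  return {
    title: deriveTitle(text, filePath),
    messages: [{ role: "user", content: text }],
  };
}

// Assumption: the warning applies to any path ending in .md when --whole is absent.
function shouldWarnAboutSplit(filePath: string, whole: boolean): boolean {
  return !whole && /\.md$/i.test(filePath);
}
```

With this shape, recall over a document ingested in "document" mode has exactly one row to return, which is the property the full-doc tests check.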

Phase 2: Project Inspection

  • smriti projects <id> command returns detailed report:
    • Session counts by agent
    • Message statistics
    • Tag usage breakdown with counts
    • Decision session count (tagged decision/*)
    • 5 most recent sessions with categories
  • Human-readable (formatProjectReport) and JSON output
  • Supports --tags, --decisions filters
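
The report dimensions listed above could be aggregated roughly as in the sketch below. It works on an in-memory array instead of SQLite, so it is only a stand-in for getProjectReport() in src/db.ts: the ReportSession and ProjectReportSketch shapes, the buildProjectReport name, and the rule that a decision session has at least one tag starting with decision/ are all assumptions.

```typescript
// Hypothetical row shape; the real schema in src/db.ts is not shown in this issue.
interface ReportSession {
  id: string;
  project: string;
  agent: string;
  messageCount: number;
  tags: string[];
  updatedAt: string; // ISO 8601 UTC timestamp, so string order matches time order
}

interface ProjectReportSketch {
  project: string;
  sessionsByAgent: Record<string, number>;
  totalMessages: number;
  tagCounts: Record<string, number>;
  decisionSessions: number;
  recentSessions: { id: string; tags: string[] }[];
}

function buildProjectReport(rows: ReportSession[], project: string): ProjectReportSketch {
  const scoped = rows.filter((r) => r.project === project);
  const sessionsByAgent: Record<string, number> = {};
  const tagCounts: Record<string, number> = {};
  let totalMessages = 0;
  let decisionSessions = 0;

  for (const row of scoped) {
    sessionsByAgent[row.agent] = (sessionsByAgent[row.agent] ?? 0) + 1;
    totalMessages += row.messageCount;
    for (const tag of row.tags) {
      tagCounts[tag] = (tagCounts[tag] ?? 0) + 1;
    }
    // Assumption: a "decision session" carries at least one tag under decision/.
    if (row.tags.some((t) => t.startsWith("decision/"))) decisionSessions += 1;
  }

  // Five most recent sessions, newest first.
  const recentSessions = [...scoped]
    .sort((a, b) => (a.updatedAt < b.updatedAt ? 1 : a.updatedAt > b.updatedAt ? -1 : 0))
    .slice(0, 5)
    .map(({ id, tags }) => ({ id, tags }));

  return { project, sessionsByAgent, totalMessages, tagCounts, decisionSessions, recentSessions };
}
```

Returning a plain object keeps the JSON output trivial (serialize the object as-is) and leaves the human-readable layout to a separate formatter, which matches the split between src/db.ts and src/format.ts in the file list.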

Phase 3: Tag Discovery

  • smriti tags command shows tag usage globally or per project
  • --project <id> scopes to project
  • --available shows full category tree (like categories)
  • Human-readable (formatTagUsage) and JSON output
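
Global versus project-scoped tag usage comes down to an optional filter in front of a count. The sketch below shows that idea on in-memory data; countTagUsage, TaggedSession, and TagUsageRow are hypothetical names, and ordering the result by descending count is an assumption rather than something this issue specifies.

```typescript
// Hypothetical shapes for illustration only.
interface TaggedSession {
  project: string;
  tags: string[];
}

interface TagUsageRow {
  tag: string;
  count: number;
}

// Counts tag occurrences across sessions; when `project` is given,
// only sessions from that project contribute.
function countTagUsage(sessions: TaggedSession[], project?: string): TagUsageRow[] {
  const counts: Record<string, number> = {};
  for (const session of sessions) {
    if (project !== undefined && session.project !== project) continue;
    for (const tag of session.tags) {
      counts[tag] = (counts[tag] ?? 0) + 1;
    }
  }
  // Assumption: most-used tags first, ties broken alphabetically.
  return Object.keys(counts)
    .map((tag) => ({ tag, count: counts[tag] }))
    .sort((a, b) => b.count - a.count || (a.tag < b.tag ? -1 : a.tag > b.tag ? 1 : 0));
}
```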

Phase 4: Scoped Status

  • smriti status --project <id> filters all stats to project
  • Agent, project, and category counts are all restricted to the given project
  • Header shows "Status for project: X"
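
One way to picture the scoping is to filter first and tally afterwards, so every count automatically reflects the selected project. The sketch below assumes a flat StatusSession shape and a summarizeStatus helper, both hypothetical; the unscoped header text "Status" is also an assumption, and only the "Status for project: X" wording comes from this issue.

```typescript
// Hypothetical session shape for illustration only.
interface StatusSession {
  project: string;
  agent: string;
  category: string;
}

interface StatusSummary {
  header: string;
  sessions: number;
  byAgent: Record<string, number>;
  byProject: Record<string, number>;
  byCategory: Record<string, number>;
}

// Counts how many times each value occurs.
function tally(values: string[]): Record<string, number> {
  const out: Record<string, number> = {};
  for (const v of values) out[v] = (out[v] ?? 0) + 1;
  return out;
}

function summarizeStatus(all: StatusSession[], project?: string): StatusSummary {
  // Filter once up front so every tally below is scoped consistently.
  const scoped = project === undefined ? all : all.filter((s) => s.project === project);
  return {
    // Assumption: the global header is plain "Status".
    header: project === undefined ? "Status" : `Status for project: ${project}`,
    sessions: scoped.length,
    byAgent: tally(scoped.map((s) => s.agent)),
    byProject: tally(scoped.map((s) => s.project)),
    byCategory: tally(scoped.map((s) => s.category)),
  };
}
```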

Phase 5: Comprehensive Test Suite

  • test/recall.test.ts with 29 test scenarios:
    • Full-document retrieval (single message per .md file)
    • Tag usage queries (global and project-scoped)
    • Project reports (all dimensions)
    • Multi-filter list operations (category + project + agent)
    • Integration tests validating all retrieval paths
  • All tests pass with in-memory SQLite DB (no external dependencies)
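
For the multi-filter list scenarios, the combination of category, project, and agent filters can be expressed as one parameterized query. The sketch below only builds the SQL string and its parameters, so it runs without a database; buildListQuery, ListFilters, and the table and column names (sessions, session_tags, category, project, agent) are hypothetical and may not match the actual schema.

```typescript
// Hypothetical filter shape; each field is optional and filters combine with AND.
interface ListFilters {
  category?: string;
  project?: string;
  agent?: string;
}

interface BuiltQuery {
  sql: string;
  params: string[];
}

function buildListQuery(filters: ListFilters): BuiltQuery {
  const clauses: string[] = [];
  const params: string[] = [];

  // Values are always passed as bound parameters, never interpolated into the SQL.
  if (filters.category !== undefined) {
    clauses.push("t.category = ?");
    params.push(filters.category);
  }
  if (filters.project !== undefined) {
    clauses.push("s.project = ?");
    params.push(filters.project);
  }
  if (filters.agent !== undefined) {
    clauses.push("s.agent = ?");
    params.push(filters.agent);
  }

  const where = clauses.length > 0 ? ` WHERE ${clauses.join(" AND ")}` : "";
  // LEFT JOIN keeps untagged sessions visible when no category filter is set.
  const sql =
    "SELECT DISTINCT s.id FROM sessions s" +
    " LEFT JOIN session_tags t ON t.session_id = s.id" +
    where;
  return { sql, params };
}
```

A test can then assert on the returned sql and params directly, or run the query against an in-memory database seeded with a few sessions.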

Files Changed

| File | Change |
| --- | --- |
| src/ingest/parsers/generic.ts | Add "document" format mode |
| src/ingest/index.ts | Pass --whole option to parser |
| src/db.ts | Add getTagUsage() and getProjectReport() functions |
| src/index.ts | Add --whole flag, tags command, projects <id> inspection, status --project |
| src/format.ts | Add formatProjectReport() and formatTagUsage() formatters |
| test/recall.test.ts | New comprehensive test suite (29 tests) |

Verification

# Test full-doc storage
smriti ingest file /tmp/test.md --whole --title "Test Doc" --project smriti

# Test project inspection  
smriti projects smriti                         # human-readable report
smriti projects smriti --json                  # machine-readable
smriti projects smriti --decisions             # filter to decisions only

# Test tag discovery
smriti tags                                    # global usage
smriti tags --project smriti                   # project-scoped

# Test scoped status
smriti status --project smriti

# Run test suite
bun test test/recall.test.ts                   # 29/29 passing

Impact

  • ✅ Full markdown documents can now be stored and retrieved whole
  • ✅ Project managers can inspect project health through session, message, tag, and decision metrics
  • ✅ Researchers can discover tag patterns across projects
  • ✅ All retrieval paths have comprehensive test coverage
  • ✅ No breaking changes to existing APIs

Known Limitations

  • Categories tree is not ranked by usage (possible future enhancement)
  • Project-scoped status doesn't break down by file operation type (possible future enhancement)

Labels: enhancement (New feature or request)
