fix: use relative paths for index keys to support shared worktree indexes #5

Merged
giancarloerra merged 1 commit into giancarloerra:main from cstuncsik:fix/worktree-relative-paths
Mar 16, 2026

Conversation

@cstuncsik
Contributor

@cstuncsik cstuncsik commented Mar 16, 2026

Problem

The SOCRATICODE_PROJECT_ID env var (added in #2) allows multiple directories to share a single Qdrant collection, which is essential for git worktree workflows. However, sharing the collection alone isn't enough — the indexer internally keys everything by absolute path:

  • File hash map (change detection): keyed by absolute path (e.g. /Users/me/n8n/packages/cli/src/foo.ts)
  • Chunk IDs: generated from absolute path via chunkId(filePath, startLine)
  • deleteFileChunks(): filters Qdrant by the filePath payload field (absolute)

This means when a worktree at /tmp/worktree-xyz/ shares the same collection as /Users/me/n8n/:

  1. Hash lookup misses — every file appears "changed" → full re-index every time
  2. Chunk IDs differ — creates duplicate chunks instead of updating existing ones
  3. Stale chunk deletion fails — can't find old chunks to clean up (wrong path in filter)

The SOCRATICODE_PROJECT_ID feature effectively becomes useless for its primary use case.
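The failure modes above can be reproduced in a few lines. This is only a sketch: `absoluteChunkId` is a hypothetical stand-in (SHA-256 over `path:startLine`), not the project's actual hashing scheme, and the two roots are the example paths from the description.

```typescript
import { createHash } from "node:crypto";
import * as path from "node:path";

// Hypothetical stand-in for an ID derived from an absolute path.
// The real chunkId() may use a different hash or format.
function absoluteChunkId(filePath: string, startLine: number): string {
  return createHash("sha256")
    .update(`${filePath}:${startLine}`)
    .digest("hex")
    .slice(0, 32);
}

const mainRoot = "/Users/me/n8n";
const worktreeRoot = "/tmp/worktree-xyz";
const relativePath = "packages/cli/src/foo.ts";

// The same file, seen from two checkouts that share one collection.
const fileInMain = path.posix.join(mainRoot, relativePath);
const fileInWorktree = path.posix.join(worktreeRoot, relativePath);

// Hash map as it would look after indexing from the main checkout.
const hashes = new Map<string, string>([[fileInMain, "content-hash-abc"]]);

// 1. The lookup from the worktree misses, so the file looks "changed".
const lookupMisses = !hashes.has(fileInWorktree);

// 2. The chunk IDs differ, so an upsert creates a duplicate chunk.
const idsDiffer =
  absoluteChunkId(fileInMain, 1) !== absoluteChunkId(fileInWorktree, 1);

console.log({ lookupMisses, idsDiffer });
```

Both flags come out `true`, which is exactly the behavior that makes the shared collection re-index and duplicate everything.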

Solution

Switch all internal keying from absolute paths to relative paths. The relativePath is already computed everywhere (via glob("**/*", { absolute: false })), just not used as the canonical key.
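As an illustration of why relative keys are stable, here is a minimal sketch. The `chunkId` body (SHA-256 over `relativePath:startLine`) and the `toRelative` helper are assumptions made for the example; the PR states that the indexer already gets `relativePath` from the glob call, so it would not need `path.relative` at all.

```typescript
import { createHash } from "node:crypto";
import * as path from "node:path";

// Hypothetical chunkId(): hashes the project-relative path plus start line.
// The real function's hash and ID format are not shown in this PR.
function chunkId(relativePath: string, startLine: number): string {
  return createHash("sha256")
    .update(`${relativePath}:${startLine}`)
    .digest("hex")
    .slice(0, 32);
}

// Illustrative helper only: derives the relative path from a root.
function toRelative(projectRoot: string, absolutePath: string): string {
  return path.posix.relative(projectRoot, absolutePath);
}

const relFromMain = toRelative(
  "/Users/me/n8n",
  "/Users/me/n8n/packages/cli/src/foo.ts",
);
const relFromWorktree = toRelative(
  "/tmp/worktree-xyz",
  "/tmp/worktree-xyz/packages/cli/src/foo.ts",
);

// Same relative path, therefore same ID, regardless of the checkout root.
const idFromMain = chunkId(relFromMain, 10);
const idFromWorktree = chunkId(relFromWorktree, 10);

console.log(idFromMain === idFromWorktree);
```

Because the root prefix never enters the hash input, any checkout of the same repository produces the same ID for the same file and start line.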

Changes

src/services/indexer.ts:

  • chunkId() — hash on relativePath instead of absolute filePath, producing stable IDs across worktrees
  • indexProject() — hash map get/set/has all use relativePath; deleted file detection uses relative path set
  • updateProjectIndex() — same pattern of changes
  • All chunking functions (chunkByAstRegions, chunkByLines, chunkByCharacters, chunkFileContent) — pass relativePath to chunkId()
  • Automatic migration: getProjectHashes() detects absolute-path keys in existing indexes and transparently converts them to relative paths on load — no manual re-index needed
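The deleted-file detection mentioned for `indexProject()` can be pictured as a set difference over relative paths. This is a sketch under assumptions: `findDeletedFiles` and its parameters are hypothetical names, not functions from `src/services/indexer.ts`.

```typescript
// Hypothetical helper: returns the relative paths that exist in the stored
// hash map but no longer appear in the current directory scan.
function findDeletedFiles(
  storedHashes: Map<string, string>,
  currentRelativePaths: string[],
): string[] {
  const current = new Set(currentRelativePaths);
  return Array.from(storedHashes.keys()).filter((key) => !current.has(key));
}

// Hash map saved by a previous run (possibly from another worktree).
const storedHashes = new Map<string, string>([
  ["packages/cli/src/foo.ts", "hash-1"],
  ["packages/cli/src/removed.ts", "hash-2"],
]);

// Result of scanning the current checkout with relative paths.
const scanned = ["packages/cli/src/foo.ts", "packages/cli/src/new.ts"];

const deleted = findDeletedFiles(storedHashes, scanned);
console.log(deleted);
```

With relative keys on both sides, the comparison gives the same answer from any worktree; with absolute keys, every stored entry would look deleted when scanned from a different root.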

src/services/qdrant.ts:

  • deleteFileChunks() — filter on relativePath Qdrant field instead of filePath
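For reference, a Qdrant payload filter that matches on the `relativePath` field has the shape below. This is a sketch of the filter object only: the interfaces and `fileChunksFilter` are hypothetical names, and how `deleteFileChunks()` passes the filter to the Qdrant client is not shown in this PR.

```typescript
// Minimal local types for a Qdrant "must match value" filter.
interface FieldMatchCondition {
  key: string;
  match: { value: string };
}

interface PayloadFilter {
  must: FieldMatchCondition[];
}

// Hypothetical helper: selects every chunk whose payload field
// `relativePath` equals the given project-relative path.
function fileChunksFilter(relativePath: string): PayloadFilter {
  return {
    must: [{ key: "relativePath", match: { value: relativePath } }],
  };
}

const filter = fileChunksFilter("packages/cli/src/foo.ts");
console.log(JSON.stringify(filter));
```

Since the payload value no longer contains a checkout-specific prefix, the same filter finds the file's chunks whichever worktree wrote them.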

Absolute paths are now only used for actual file I/O (fsp.stat(), fsp.readFile()).

Backward compatibility

Existing indexes with absolute-path hash keys are automatically migrated to relative paths when loaded. The migration is transparent and one-time — after the first checkpoint/save, the hash map is persisted with relative keys. No manual re-index required.
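A migration of this kind could look roughly like the following. It is a sketch, not the code in `getProjectHashes()`: `migrateHashKeys` is a hypothetical name, and it assumes the stored absolute keys were written from the same project root that is loading them.

```typescript
import * as path from "node:path";

// Hypothetical migration: rewrites absolute-path keys as paths relative to
// projectRoot and leaves keys that are already relative untouched.
// Assumption: absolute keys live under projectRoot.
function migrateHashKeys(
  hashes: Map<string, string>,
  projectRoot: string,
): Map<string, string> {
  const migrated = new Map<string, string>();
  hashes.forEach((hash, key) => {
    const relativeKey = path.posix.isAbsolute(key)
      ? path.posix.relative(projectRoot, key)
      : key;
    migrated.set(relativeKey, hash);
  });
  return migrated;
}

// An index written before this change (absolute key) mixed with a new key.
const legacy = new Map<string, string>([
  ["/Users/me/n8n/packages/cli/src/foo.ts", "hash-1"],
  ["packages/cli/src/bar.ts", "hash-2"],
]);

const migrated = migrateHashKeys(legacy, "/Users/me/n8n");
console.log(Array.from(migrated.keys()));
```

Running the function on an already migrated map changes nothing, which is what makes the conversion safe to apply on every load until the relative keys are persisted.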

Test plan

  • All 639 existing tests pass
  • Added unit test verifying chunkId() produces identical IDs regardless of absolute path prefix
  • Updated integration test for deleteFileChunks() to use relative path
  • Manually verified: indexed from worktree path A, searched successfully from path B using shared SOCRATICODE_PROJECT_ID

@cstuncsik cstuncsik force-pushed the fix/worktree-relative-paths branch from 9ee8f46 to c03d19c Compare March 16, 2026 07:37
…exes

When SOCRATICODE_PROJECT_ID is set to share a Qdrant collection across
git worktrees, the indexer still used absolute paths as keys in the file
hash map, chunk IDs, and for deleting file chunks. This caused every
worktree to see all files as "new" and trigger a full re-index, defeating
the purpose of the shared project ID feature.

Switch all internal keying from absolute paths to relative paths:
- chunkId() now hashes on relativePath for stable IDs across worktrees
- File hash map (change detection) keyed by relativePath
- deleteFileChunks() filters on relativePath Qdrant field
- Deleted file detection uses relative path sets

Absolute paths are now only used for actual file I/O (stat, readFile).
@cstuncsik cstuncsik force-pushed the fix/worktree-relative-paths branch from c03d19c to 505fbd7 Compare March 16, 2026 07:39
@giancarloerra giancarloerra merged commit 51bec99 into giancarloerra:main Mar 16, 2026
4 checks passed
@giancarloerra giancarloerra self-assigned this Mar 16, 2026
@giancarloerra
Owner

Thanks, cstuncsik, once again, clean implementation and very good idea, much appreciated!

@cstuncsik
Contributor Author

You're welcome

Thank you for your project, I really like it

I also wanted to solve the same problem a couple of months ago and made a proof of concept, but then I just forgot about it:

https://github.com/cstuncsik/local-code-llm-mcp

@giancarloerra
Owner

I think it has somehow remained a problem, even though there are a lot of products around trying to solve it. Interesting that you had also started to solve it... well, thanks for contributing your ideas; worktree support was an important missing piece!
