Files: src/graphs/docs.ts, src/lib/parsers/docs.ts, src/lib/parsers/codeblock.ts
Stores markdown document structure as a graph of heading-based chunks with cross-file links.
| Field | Type | Description |
|---|---|---|
| `id` | `string` | Unique node ID (see format below) |
| `fileId` | `string` | Source file path, relative to `projectDir` |
| `title` | `string` | Heading text, or filename for the root chunk |
| `content` | `string` | Full text of the section (heading stripped) |
| `level` | `number` | 1 = file root, 2-6 = heading depth |
| `links` | `string[]` | fileIds of linked files |
| `embedding` | `number[]` | L2-normalized vector; `[]` until embedded |
| `fileEmbedding` | `number[]` | File-level embedding (root nodes only) |
| `language` | `string` | Code block language (code block chunks only) |
| `symbols` | `string[]` | Extracted symbols (code block chunks only) |
| `mtime` | `number` | File `mtimeMs` at index time |
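
As a reading aid, the fields above could be written as a TypeScript interface. This is only a sketch: the name `DocNode`, the choice of which fields are optional, and the sample values are assumptions, not the project's actual type definitions.

```typescript
// Hypothetical shape of a docs graph node, mirroring the field table.
// Optional markers are an assumption based on the "only" notes in the table.
interface DocNode {
  id: string;               // unique node ID
  fileId: string;           // source file, relative to projectDir
  title: string;            // heading text, or filename for the root chunk
  content: string;          // section text with the heading stripped
  level: number;            // 1 = file root, 2-6 = heading depth
  links: string[];          // fileIds of linked files
  embedding: number[];      // L2-normalized vector; [] until embedded
  fileEmbedding?: number[]; // root nodes only
  language?: string;        // code block chunks only
  symbols?: string[];       // code block chunks only
  mtime: number;            // file mtimeMs at index time
}

// Illustrative root node for a file that has not been embedded yet.
const rootNode: DocNode = {
  id: "docs/auth.md",
  fileId: "docs/auth.md",
  title: "Authentication",
  content: "Overview of the auth flow.",
  level: 1,
  links: ["docs/tokens.md"],
  embedding: [],
  fileEmbedding: [],
  mtime: 1700000000000,
};

// A root node is identified by level 1; its ID is the fileId itself.
const isRoot = (node: DocNode): boolean => node.level === 1;
```
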
- File root: `"docs/auth.md"` (the fileId itself)
- Section: `"docs/auth.md::JWT Tokens"` (fileId + heading text)
- Duplicate heading: `"docs/auth.md::Notes::2"` (dedup suffix)
- Code block: `"docs/auth.md::JWT Tokens::code-1"` (parent section + index)
| Type | Description |
|---|---|
| Sibling | chunk → next chunk within the same file (sequential order) |
| Cross-file | chunk → root node of linked file (from markdown links) |
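
The two edge types could be derived from an ordered chunk list roughly as follows. This is a sketch under assumptions: the `Edge` type, the `buildEdges` helper, and the `type` labels are hypothetical names used for illustration. It relies only on the facts that chunks are stored in sequential order and that a file's root node ID is its fileId.

```typescript
// Hypothetical edge record; the real graph storage may differ.
interface Edge {
  from: string;
  to: string;
  type: "sibling" | "cross-file";
}

// chunks: the chunks of ONE file, in document order.
function buildEdges(chunks: { id: string; links: string[] }[]): Edge[] {
  const edges: Edge[] = [];
  chunks.forEach((chunk, i) => {
    // Sibling edge: chunk -> next chunk within the same file.
    const next = chunks[i + 1];
    if (next) edges.push({ from: chunk.id, to: next.id, type: "sibling" });
    // Cross-file edge: chunk -> root node of each linked file.
    // A root node's ID is the fileId itself, so the link target is used directly.
    for (const fileId of chunk.links) {
      edges.push({ from: chunk.id, to: fileId, type: "cross-file" });
    }
  });
  return edges;
}
```
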
File: src/lib/parsers/docs.ts
`parseFile(content, absolutePath, projectDir, chunkDepth)`:
- `#` headings are treated as the file title (level 1 root chunk)
- Headings at depth <= `chunkDepth` (default 4) create chunk boundaries
- Deeper headings are folded into the parent chunk's content
- Duplicate heading titles within a file get `::2`, `::3` suffixes
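
The chunking rules above can be sketched as a small line-based splitter. This is not the real `parseFile`: the function name `chunkMarkdown`, its signature, and the `Chunk` type are assumptions. It also simplifies by using the last path segment of the fileId as the fallback root title and by letting any `#` heading overwrite that title.

```typescript
interface Chunk {
  id: string;
  title: string;
  level: number;
  content: string;
}

// Hypothetical sketch of heading-based chunking with dedup suffixes.
function chunkMarkdown(markdown: string, fileId: string, chunkDepth = 4): Chunk[] {
  const fileName = fileId.split("/").pop() ?? fileId;
  const root: Chunk = { id: fileId, title: fileName, level: 1, content: "" };
  const chunks: Chunk[] = [root];
  const seen = new Map<string, number>(); // heading title -> occurrences so far
  let current = root;
  let inFence = false;

  for (const line of markdown.split("\n")) {
    // Lines inside fenced code blocks are never treated as headings.
    if (/^\s*(```|~~~)/.test(line)) inFence = !inFence;
    const m = inFence ? null : /^(#{1,6})\s+(.+)$/.exec(line);
    if (m === null) {
      current.content += line + "\n";
      continue;
    }
    const depth = m[1].length;
    const title = m[2].trim();
    if (depth === 1) {
      root.title = title; // "#" heading names the root chunk
    } else if (depth <= chunkDepth) {
      // New chunk boundary; repeated titles get ::2, ::3, ...
      const n = (seen.get(title) ?? 0) + 1;
      seen.set(title, n);
      const base = `${fileId}::${title}`;
      current = { id: n === 1 ? base : `${base}::${n}`, title, level: depth, content: "" };
      chunks.push(current);
    } else {
      current.content += line + "\n"; // deeper heading folded into parent
    }
  }
  return chunks;
}
```

For example, a file with two `## Notes` headings would yield the IDs `docs/auth.md::Notes` and `docs/auth.md::Notes::2`, and a `#####` heading would stay inside the content of the chunk that precedes it.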
Recognizes:
- Markdown links: `[text](./relative/path.md)`, resolved relative to the file
- Wiki links: `[[page name]]` or `[[page name|alias]]`, searched within `projectDir`
- External links (`https://`, etc.) are ignored
- Only links to files that exist on disk are recorded
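
The recognition step might look like the following regex-based sketch. The helper name `extractLinks` and its return shape are assumptions, and the parser's real patterns may differ. Path resolution and the on-disk existence check are deliberately left out, so this only collects candidate targets.

```typescript
interface ExtractedLinks {
  markdown: string[]; // relative targets from [text](target)
  wiki: string[];     // page names from [[page]] or [[page|alias]]
}

// Hypothetical sketch: collect link targets, skipping external URLs.
function extractLinks(content: string): ExtractedLinks {
  const markdown: string[] = [];
  const wiki: string[] = [];
  let m: RegExpExecArray | null;

  const mdLink = /\[[^\]]*\]\(([^)\s]+)\)/g;
  while ((m = mdLink.exec(content)) !== null) {
    const target = m[1].split("#")[0]; // drop any #anchor
    // Anything with a URL scheme (https:, mailto:, ...) counts as external.
    const isExternal = /^[a-z][a-z0-9+.-]*:/i.test(target);
    if (target !== "" && !isExternal) markdown.push(target);
  }

  const wikiLink = /\[\[([^\]|]+)(?:\|[^\]]*)?\]\]/g;
  while ((m = wikiLink.exec(content)) !== null) {
    wiki.push(m[1].trim()); // the alias after "|" is discarded
  }
  return { markdown, wiki };
}
```
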
File: src/lib/parsers/codeblock.ts
Fenced code blocks (```lang ... ```) in markdown are extracted as child chunks:
- Each code block becomes a child chunk with `language` and `symbols` fields
- TS/JS/TSX/JSX blocks: parsed with tree-sitter to extract top-level symbol names
- Other languages or parse failures: `symbols = []`
- Untagged blocks: `language = undefined`
- Code block chunk IDs: `"fileId::Section::code-1"` (level = parent level + 1)
| Method | Description |
|---|---|
| `listFiles()` | List all indexed markdown files with title and chunk count |
| `getToc(fileId)` | Table of contents (heading hierarchy) for a file |
| `getNode(nodeId)` | Full content of a specific chunk |
| `search(query, opts)` | Hybrid search with BFS expansion |
| `searchFiles(query, opts)` | File-level semantic search (by path + title) |
| `findExamples(symbol, opts)` | Find code blocks containing a symbol |
| `searchSnippets(query, opts)` | Search over code blocks |
| `listSnippets(opts)` | List code blocks with filters |
| `explainSymbol(symbol, opts)` | Code block + surrounding text for a symbol |
| Method | Description |
|---|---|
| `updateFile(chunks, fileId, mtime)` | Replace a file's nodes and edges |
| `removeFile(fileId)` | Remove all nodes for a file |
Root nodes (level 1) have a fileEmbedding field — embedded from the file path + h1 title. Used by docs_search_files for file-level semantic search (simple cosine similarity, no BFS).
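
File-level ranking by cosine similarity can be sketched as below. The names `RootNode` and `rankFiles` and the `topK` parameter are hypothetical. The sketch assumes file embeddings are L2-normalized like the node embeddings, in which case cosine similarity reduces to a dot product, and it skips roots that have not been embedded yet.

```typescript
interface RootNode {
  fileId: string;
  fileEmbedding: number[];
}

// Dot product; equals cosine similarity when both vectors are L2-normalized.
function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Hypothetical sketch: score every root node against the query vector
// and return the best matches. No graph expansion is involved.
function rankFiles(query: number[], roots: RootNode[], topK = 5) {
  return roots
    // Ignore empty or mismatched embeddings (not yet embedded, or other model).
    .filter((r) => query.length > 0 && r.fileEmbedding.length === query.length)
    .map((r) => ({ fileId: r.fileId, score: dot(query, r.fileEmbedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```
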
Stored as docs.json in the graphMemory directory. Includes embedding model fingerprint for automatic re-index on model change.