diff --git a/.git-ai/lancedb.tar.gz b/.git-ai/lancedb.tar.gz
index 502d000..849977b 100644
--- a/.git-ai/lancedb.tar.gz
+++ b/.git-ai/lancedb.tar.gz
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
-oid sha256:f29f4d74339b62dfd1d0fdf759cbbe9cab3963ac212e0ddfa582c11ccac7fb6f
-size 73846
+oid sha256:ace72b3025698c44f30a2c2813b8fc80af0d54fa2e70f5ae028bf9ead7bacb6e
+size 84376
diff --git a/.gitignore b/.gitignore
index 7ced680..c1d1a9a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -8,6 +8,9 @@ dist/
.git-ai/ast-graph.sqlite
.git-ai/ast-graph.export.json
.git-ai/._*
+.git-ai/dsr/dsr-index.sqlite
+.git-ai/dsr/dsr-index.export.json
.tmp-hook-test/
.vscode/
.trae/documents/
+.trae/
diff --git a/.trae/rules/git-ai-mcp/RULE.md b/.trae/rules/git-ai-mcp/RULE.md
deleted file mode 100644
index 328cf96..0000000
--- a/.trae/rules/git-ai-mcp/RULE.md
+++ /dev/null
@@ -1,18 +0,0 @@
----
-name: "git-ai-mcp"
-description: "约束 Agent 使用 git-ai MCP 的行为(先索引再检索、证据引用、路径安全)。当 Agent 需要检视仓库代码或输出修改建议时使用。"
----
-
-# Git-AI MCP(Rule)
-
-## 必须遵守
-- 先确保仓库已绑定:优先调用 `set_repo`,再用 `get_repo` 校验。
-- 符号/语义检索无结果或明显过期时,先 `index_repo({ overwrite: true })` 再重试检索。
-- 任何实现结论必须给出证据:至少提供命中 `file` 与行范围,并基于 `read_file` 的实际内容得出结论。
-- 不允许读取仓库外路径:`read_file.file` 必须是仓库根目录下的相对路径,不能使用 `../`。
-- 任何新增设计/实现必须评估存储成本与性能:避免把大体量索引/构建产物集中写入主仓库,优先使用引用/按需加载/缓存,并明确缓存位置与清理策略。
-
-## 推荐策略
-- 优先 `search_symbols`(更稳定),再使用 `semantic_search` 做补充。
-- 先小范围读文件(200 行内),定位到关键函数后再二次读取更精确行段。
-- 需要共享索引或走 hooks 流程时,优先 `pack_index` 生成 `.git-ai/lancedb.tar.gz`,并提示用户将归档文件提交。
diff --git a/.trae/skills/git-ai-mcp/SKILL.md b/.trae/skills/git-ai-mcp/SKILL.md
deleted file mode 100644
index 01757a7..0000000
--- a/.trae/skills/git-ai-mcp/SKILL.md
+++ /dev/null
@@ -1,75 +0,0 @@
----
-name: "git-ai-mcp"
-description: "通过 git-ai 的 MCP 工具检视/检索代码仓。用户要“找实现/定位符号/语义搜索/读取文件/重建索引/MCP 相关问题”时调用。"
----
-
-# Git-AI MCP(Agent 使用模版)
-
-## 目标
-用最少 token 完成“从问题 → 命中点 → 读代码 → 给结论/改动建议”的闭环:
-1) 先用索引工具命中位置(符号/语义)
-2) 再精读关键文件片段(按行读取)
-3) 必要时重建/打包索引,保证检索结果可靠
-
-## 开始前:绑定仓库
-当对话里给了明确仓库路径,或你不确定 MCP 进程 cwd 是否在仓库目录:
-
-1. `set_repo({ path: "/ABS/PATH/TO/REPO" })`
-2. `get_repo({})` 校验 repoRoot 是否正确
-
-后续调用尽量不传 `path`(保持默认仓库一致),除非你需要切换仓库。
-
-## 索引保障(很关键)
-当出现以下情况之一,先重建索引:
-- `search_symbols` / `semantic_search` 没结果或明显过时
-- 用户刚改了大量文件/刚切分支/刚合并
-
-调用:
-- `index_repo({ overwrite: true, dim: 256 })`
-- 如需共享索引:`pack_index({ lfs: false })`
-
-## 检视套路(推荐顺序)
-
-### 1) 符号定位(最稳)
-当用户提到函数/类/文件名/模块名:
-- `search_symbols({ query: "FooBar", limit: 50 })`
-- `search_symbols({ query: "get*repo", mode: "wildcard", case_insensitive: true, limit: 20 })`
-- `search_symbols({ query: "^get.*repo$", mode: "regex", case_insensitive: true, limit: 20 })`
-- 混合仓库建议指定语言避免噪声:`search_symbols({ query: "deleteDictDataByIds", lang: "java", limit: 50 })`
-
-输出 rows 后,选最可能的 1-3 个命中点继续读代码:
-- `read_file({ file: "src/xxx.ts", start_line: 1, end_line: 220 })`
-- 若需要更小范围:根据 start_line/end_line 二次读取
-
-### 2) 语义检索(问法更自然)
-当用户描述行为(“在哪里初始化 DB / 哪里处理 auth / 错误如何返回”):
-- `semantic_search({ query: "where do we ...", topk: 5 })`
-- 混合仓库建议指定语言避免噪声:`semantic_search({ query: "删除字典数据", lang: "java", topk: 5 })`
-
-语义检索返回的是摘要行(file/kind/signature),仍需要用 `read_file` 打开文件确认真实实现。
-
-### 3) 文件浏览
-当你需要找入口文件、配置文件、或按模式定位:
-- `list_files({ pattern: "src/**/*.{ts,tsx,js,jsx}", limit: 500 })`
-- `list_files({ pattern: "**/*mcp*", limit: 200 })`
-
-### 4) AST 图查询(递归/关系类问题)
-当你需要回答“包含关系/继承关系/子节点列表/递归查询”等问题:
-- `ast_graph_find({ prefix: "Foo", limit: 20 })`
-- `ast_graph_children({ id: "src/mcp/server.ts", as_file: true })`
-- “引用/调用链”优先用高阶接口(比写 CozoScript 更稳定):
-  - `ast_graph_refs({ name: "greet", limit: 200 })`
-  - `ast_graph_callers({ name: "greet", limit: 200 })`
-  - `ast_graph_callees({ name: "hello", limit: 200 })`
-  - `ast_graph_chain({ name: "greet", direction: "upstream", max_depth: 3, lang: "java" })`
-- `ast_graph_query({ query: "", params: {...} })`
-
-## 输出要求(给用户的答复)
-- 先给结论,再给证据(文件 + 行范围)
-- 引用代码位置用 IDE 可点链接(file://...#Lx-Ly)
-- 若需要改代码:给出最小改动集、避免引入新依赖
-
-## 常见坑
-- MCP 的 `semantic_search` 依赖 `.git-ai/lancedb`:没索引就没结果
-- 修改索引后建议 `pack_index`,并把 `.git-ai/lancedb.tar.gz` 提交(如果团队要共享)
-- `read_file` 只能读仓库内相对路径,不允许 `../` 越界
diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
index 08bc02d..e8b6fdb 100644
--- a/DEVELOPMENT.md
+++ b/DEVELOPMENT.md
@@ -76,7 +76,7 @@ git-ai ai hooks status
git-ai ai serve
```

-If the host cannot guarantee the working directory points to the repository directory, you can let the Agent call `set_repo({path: ...})` first, or pass `path` in tool parameters.
+MCP tools require `path` in every tool call to select the target repository (atomic, no implicit defaults).

## Publishing Notes (npm)
- Ensure `npm run build` has generated `dist/**`
diff --git a/DEVELOPMENT.zh-CN.md b/DEVELOPMENT.zh-CN.md
index 4e3c288..327b3ab 100644
--- a/DEVELOPMENT.zh-CN.md
+++ b/DEVELOPMENT.zh-CN.md
@@ -76,7 +76,7 @@ git-ai ai hooks status
git-ai ai serve
```

-如果宿主无法保证工作目录指向仓库目录,可以先让 Agent 调用 `set_repo({path: ...})`,或在工具参数里传 `path`。
+MCP tools 的 `path` 为必传:每次工具调用都必须显式传 `path` 参数来选择目标仓库(调用原子化,无隐式默认)。

## 发布注意事项(npm)
- 确保 `npm run build` 已生成 `dist/**`
diff --git a/README.md b/README.md
index f41fdad..924da8d 100644
--- a/README.md
+++ b/README.md
@@ -57,11 +57,22 @@ git-ai ai index --overwrite
git-ai ai query Indexer --limit 10
git-ai ai semantic "semantic search" --topk 5
git-ai ai graph find GitAIV2MCPServer
+git-ai ai dsr context --json
+git-ai ai dsr generate HEAD
+git-ai ai dsr rebuild-index
+git-ai ai dsr query symbol-evolution GitAIV2MCPServer --limit 200 --json
git-ai ai pack
git-ai ai unpack
git-ai ai serve
```

+## DSR (Deterministic Semantic Record)
+
+DSR is a per-commit, immutable, deterministic semantic artifact:
+
+- Canonical files: `.git-ai/dsr/<commit>.json`
+- Databases are rebuildable caches derived from DSR + Git (never the other way around)
+
## MCP Server (stdio)

`git-ai` provides an MCP-based stdio Server for Agents/Clients to call as tools:
@@ -98,18 +109,18 @@ git-ai ai serve
```

Note:
-- `git-ai ai serve` defaults to using the current directory as the repository location (similar to git usage).
-- If the host cannot guarantee that the MCP process working directory (cwd) points to the repository directory, it is recommended that the Agent execute `set_repo({path: "/ABS/PATH/TO/REPO"})` before the first call, or pass the `path` parameter in every tool call.
+- `git-ai ai serve` only starts the MCP stdio server.
+- MCP tools require `path` in every tool call to select the target repository (atomic, no implicit defaults).

-## Agent Skills / Rules (Trae)
+## Agent Templates

This repository provides reusable Skill/Rule templates for Agents:
-- Skill: [./.trae/skills/git-ai-mcp/SKILL.md](./.trae/skills/git-ai-mcp/SKILL.md)
-- Rule: [./.trae/rules/git-ai-mcp/RULE.md](./.trae/rules/git-ai-mcp/RULE.md)
+- Skill: [templates/agents/common/skills/git-ai-mcp/SKILL.md](./templates/agents/common/skills/git-ai-mcp/SKILL.md)
+- Rule: [templates/agents/common/rules/git-ai-mcp/RULE.md](./templates/agents/common/rules/git-ai-mcp/RULE.md)

Usage:
-- After opening this repository in Trae, the Agent will automatically load Skills under `.trae/skills/**`.
-- When you need to add constraints to the Agent, put the Rule content into your Agent configuration/system rules (or directly reference `.trae/rules/**` in this repository as a source).
+- Install the templates into your target repository (default: `.agents/`).
+- For Trae compatibility, you can install into `.trae/` with `--agent trae`.
One-click install into another repository:

@@ -117,7 +128,8 @@
cd /path/to/your-repo
git-ai ai agent install
git-ai ai agent install --overwrite
-git-ai ai agent install --to /custom/location/.trae
+git-ai ai agent install --to /custom/location/.agents
+git-ai ai agent install --agent trae
```

## Git hooks (Rebuild index before commit, verify pack before push, auto unpack on checkout)

```bash
git-ai ai hooks install
git-ai ai hooks status
```

Explanation:
-- `pre-commit`: Automatically `index --overwrite` + `pack`, and add `.git-ai/meta.json` and `.git-ai/lancedb.tar.gz` to the staging area.
+- `pre-commit`: Automatically `index --incremental --staged` + `pack`, and add `.git-ai/meta.json` and `.git-ai/lancedb.tar.gz` to the staging area.
- `pre-push`: `pack` again, if the archive changes, block the push and prompt to submit the archive file first.
- `post-checkout` / `post-merge`: If `.git-ai/lancedb.tar.gz` exists, automatically `unpack`.
diff --git a/README.zh-CN.md b/README.zh-CN.md
index 6d00c1d..5081310 100644
--- a/README.zh-CN.md
+++ b/README.zh-CN.md
@@ -57,11 +57,22 @@ git-ai ai index --overwrite
git-ai ai query Indexer --limit 10
git-ai ai semantic "semantic search" --topk 5
git-ai ai graph find GitAIV2MCPServer
+git-ai ai dsr context --json
+git-ai ai dsr generate HEAD
+git-ai ai dsr rebuild-index
+git-ai ai dsr query symbol-evolution GitAIV2MCPServer --limit 200 --json
git-ai ai pack
git-ai ai unpack
git-ai ai serve
```

+## DSR(Deterministic Semantic Record)
+
+DSR 是按提交(per-commit)、不可变、确定性的语义工件:
+
+- 规范文件:`.git-ai/dsr/<commit>.json`
+- 数据库/索引仅为可删缓存,必须可由 DSR + Git 重建(永远不反向推断 Git DAG)
+
## MCP Server(stdio)

`git-ai` 提供一个基于 MCP 的 stdio Server,供 Agent/客户端以工具方式调用:
@@ -98,18 +109,14 @@ git-ai ai serve
```

说明:
-- `git-ai ai serve` 默认使用当前目录作为仓库定位起点(类似 git 的用法)。
-- 若宿主无法保证 MCP 进程的工作目录(cwd)指向仓库目录,推荐由 Agent 在首次调用前先执行一次 `set_repo({path: "/ABS/PATH/TO/REPO"})`,或在每次 tool 调用里传 `path` 参数。
-
-## Agent Skills / Rules(Trae)
+- `git-ai ai serve` 仅负责启动 MCP stdio server。
+- MCP tools 的 `path` 为必传:每次 tool 调用都必须显式传 `path` 参数来选择目标仓库(调用原子化,无隐式默认)。

-本仓库提供了 Agent 可直接复用的 Skill/Rule 模版:
-- Skill: [./.trae/skills/git-ai-mcp/SKILL.md](./.trae/skills/git-ai-mcp/SKILL.md)
-- Rule: [./.trae/rules/git-ai-mcp/RULE.md](./.trae/rules/git-ai-mcp/RULE.md)
+## Agent 模版(skills/rules)

-使用方式:
-- 在 Trae 中打开本仓库后,Agent 会自动加载 `.trae/skills/**` 下的 Skill。
-- 需要给 Agent 加约束时,把 Rule 内容放到你的 Agent 配置/系统规则中(也可以直接引用本仓库的 `.trae/rules/**` 作为规范来源)。
+本仓库提供了可复用的 Skill/Rule 模版:
+- Skill: [templates/agents/common/skills/git-ai-mcp/SKILL.md](./templates/agents/common/skills/git-ai-mcp/SKILL.md)
+- Rule: [templates/agents/common/rules/git-ai-mcp/RULE.md](./templates/agents/common/rules/git-ai-mcp/RULE.md)

一键安装到其它仓库:

@@ -117,7 +124,8 @@
cd /path/to/your-repo
git-ai ai agent install
git-ai ai agent install --overwrite
-git-ai ai agent install --to /custom/location/.trae
+git-ai ai agent install --to /custom/location/.agents
+git-ai ai agent install --agent trae
```

## Git hooks(提交前重建索引,push 前打包校验,checkout 自动解包)

```bash
git-ai ai hooks install
git-ai ai hooks status
```

说明:
-- `pre-commit`:自动 `index --overwrite` + `pack`,并把 `.git-ai/meta.json` 与 `.git-ai/lancedb.tar.gz` 加入暂存区。
+- `pre-commit`:自动 `index --incremental --staged` + `pack`,并把 `.git-ai/meta.json` 与 `.git-ai/lancedb.tar.gz` 加入暂存区(索引内容以 staged 为准)。
- `pre-push`:再次 `pack`,若归档发生变化则阻止 push,提示先提交归档文件。
- `post-checkout` / `post-merge`:若存在 `.git-ai/lancedb.tar.gz` 则自动 `unpack`。
diff --git a/assets/hooks/pre-commit b/assets/hooks/pre-commit
index 69e2b9d..70ae097 100644
--- a/assets/hooks/pre-commit
+++ b/assets/hooks/pre-commit
@@ -11,7 +11,7 @@
if ! command -v git-ai >/dev/null 2>&1; then
exit 0
fi
-git-ai ai index --overwrite -p "$ROOT" >/dev/null
+git-ai ai index --incremental --staged -p "$ROOT" >/dev/null
git-ai ai pack -p "$ROOT" >/dev/null
git add .git-ai/meta.json .git-ai/lancedb.tar.gz || true
diff --git a/docs/README.md b/docs/README.md
index 3346e76..daadad6 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -11,6 +11,7 @@ This collects all documentation for `git-ai`.
### Core Goals
- Store structured code repository indexes under `.git-ai/`, shareable via archive `.git-ai/lancedb.tar.gz`.
- Enable Agents to hit symbols/snippets via MCP tools at low cost, then read files as needed.
+- Persist per-commit semantic change as DSR (immutable, deterministic), and rebuild caches from it.

### Important Directories
- `.git-ai/meta.json`: Index metadata (locally generated, usually not committed).
@@ -18,6 +19,8 @@ This collects all documentation for `git-ai`.
- `.git-ai/lancedb.tar.gz`: Archived index (can be committed/tracked via git-lfs).
- `.git-ai/ast-graph.sqlite`: AST graph database (CozoDB).
- `.git-ai/ast-graph.export.json`: AST graph export snapshot (for non-SQLite backend cross-process reuse).
+- `.git-ai/dsr/<commit>.json`: Per-commit DSR (canonical artifact, immutable).
+- `.git-ai/dsr/dsr-index.sqlite`: DSR query accelerator (rebuildable cache from DSR + Git).

## Directory

@@ -29,6 +32,7 @@
- [MCP Server Integration](./zh-CN/mcp.md) (Chinese)
- [Manifest Workspace Support](./zh-CN/manifests.md) (Chinese)
- [Troubleshooting](./zh-CN/troubleshooting.md) (Chinese)
+- [DSR (Deterministic Semantic Record)](./zh-CN/dsr.md) (Chinese)

### Advanced & Principles
- [Advanced: Index Archiving & LFS](./zh-CN/advanced.md) (Chinese)
diff --git a/docs/zh-CN/DESIGN.md b/docs/zh-CN/DESIGN.md
index 78bf5af..98286a9 100644
--- a/docs/zh-CN/DESIGN.md
+++ b/docs/zh-CN/DESIGN.md
@@ -5,6 +5,7 @@
- 引入 SQ8(8-bit 标量量化)降低向量存储体积。
- 引入内容哈希去重:相同内容只存一份向量,多处引用仅存引用关系。
- 索引口径:仅针对当前 checkout 的 HEAD 工作区;历史版本由 Git 负责管理(通过 checkout 获得对应索引快照)。
+- 提供按提交的 DSR(Deterministic Semantic Record)作为语义工件:按提交、不可变、确定性;数据库仅作可重建缓存。

## 2. 存储布局
索引产物放在仓库根目录:
@@ -15,6 +16,10 @@
- `ast-graph.export.json`:AST 图导出快照(仅在非 SQLite 后端时用于跨进程复用)
- `meta.json`:索引元信息(维度、编码、构建时间等)

+DSR 产物(按提交,规范工件):
+- `.git-ai/dsr/<commit>.json`:单提交 DSR(不可变、确定性)
+- `.git-ai/dsr/dsr-index.sqlite`:DSR 查询加速索引(可删缓存,可由 DSR + Git 重建)
+
## 3. 数据模型(两张表)

### 3.1 chunks(去重后的内容向量表)
@@ -77,3 +82,21 @@
- `pre-push`:再次打包并校验归档未发生变化;若变化则阻止 push,提示先提交归档文件。
- `post-checkout` / `post-merge`:若存在 `.git-ai/lancedb.tar.gz`,自动解包到 `.git-ai/lancedb/`。
- 安装方式:在仓库中执行 `git-ai ai hooks install`(写入 .githooks/* 并设置 core.hooksPath=.githooks)。
+
+## 8. DSR(Deterministic Semantic Record)
+
+DSR 用于固化“每个提交的语义变化”,并严格遵守:
+
+- Git DAG 是历史/分支的唯一权威来源(DSR 只 enrich 节点,不定义边)
+- 一次提交 → 一份 DSR 文件(`.git-ai/dsr/<commit>.json`)
+- DSR 一旦生成不可覆盖;若已存在且内容不同应停止并报错
+- 任何数据库/索引必须可由 DSR + Git 重建(缓存可删)
+
+CLI 入口:
+
+- `git-ai ai dsr context`
+- `git-ai ai dsr generate <commit>`
+- `git-ai ai dsr rebuild-index`
+- `git-ai ai dsr query symbol-evolution <symbol>`
+
+更详细说明见:[DSR 文档](./dsr.md)
diff --git a/docs/zh-CN/README.md b/docs/zh-CN/README.md
index 6c3f31a..f8f93bc 100644
--- a/docs/zh-CN/README.md
+++ b/docs/zh-CN/README.md
@@ -13,6 +13,7 @@
### 核心目标
- 把代码仓的结构化索引放在 `.git-ai/` 下,并可通过归档文件 `.git-ai/lancedb.tar.gz` 分享
- 让 Agent 通过 MCP tools 低成本命中符号/片段,再按需读取文件
+- 把每个提交的语义变化固化为 DSR(按提交、不可变、确定性),并可据此重建缓存

### 重要目录
- `.git-ai/meta.json`:索引元数据(本地生成,通常不提交)
@@ -20,6 +21,8 @@
- `.git-ai/lancedb.tar.gz`:归档后的索引(可提交/可用 git-lfs 追踪)
- `.git-ai/ast-graph.sqlite`:AST 图数据库(CozoDB)
- `.git-ai/ast-graph.export.json`:AST 图导出快照(用于非 SQLite 后端跨进程复用)
+- `.git-ai/dsr/<commit>.json`:单提交 DSR(规范工件,按提交、不可变)
+- `.git-ai/dsr/dsr-index.sqlite`:DSR 查询加速索引(可删缓存,可由 DSR + Git 重建)

## 目录

@@ -31,6 +34,7 @@
- [MCP Server 接入](./mcp.md)
- [Manifest Workspace 支持](./manifests.md)
- [排障](./troubleshooting.md)
+- [DSR(Deterministic Semantic Record)](./dsr.md)

### 进阶与原理
- [进阶:索引归档与 LFS](./advanced.md)
diff --git a/docs/zh-CN/architecture_explained.md b/docs/zh-CN/architecture_explained.md
index cbc0e54..dd51c72 100644
--- a/docs/zh-CN/architecture_explained.md
+++ b/docs/zh-CN/architecture_explained.md
@@ -104,3 +104,13 @@
`git-ai` 的架构本质上是在**检索效果**与**工程成本**之间寻找的一个极致平衡点。

通过 **Tree-sitter + Hash Embedding + LanceDB + CozoDB** 的组合,我们在不引入任何重型依赖的前提下,实现了对代码库的**语义级(Vector)**和**结构级(Graph)**的双重索引。这种架构特别适合作为 AI Agent 的“代码知识外脑”,为其提供精准、快速的上下文检索能力。
+
+## 6. 按提交语义工件(DSR)
+
+除了面向当前 checkout 的 `.git-ai/` 索引缓存外,`git-ai` 还提供 DSR(Deterministic Semantic Record)作为 **按提交(per-commit)** 的语义工件:
+
+- 每个 commit 对应一份 DSR 文件:`.git-ai/dsr/<commit>.json`
+- DSR 必须不可变且确定性;数据库索引仅作可删缓存(可由 DSR + Git 重建)
+- 历史遍历必须从 Git DAG 出发,DSR 只 enrich 节点,不定义边
+
+详见:[DSR 文档](./dsr.md)
diff --git a/docs/zh-CN/cli.md b/docs/zh-CN/cli.md
index 5d03148..e0d76b0 100644
--- a/docs/zh-CN/cli.md
+++ b/docs/zh-CN/cli.md
@@ -15,12 +15,17 @@ git-ai push -u origin main
```bash
git-ai ai status
git-ai ai index --overwrite
+git-ai ai index --incremental --staged
git-ai ai query "search text" --limit 20
git-ai ai query "get*repo" --mode wildcard --case-insensitive --limit 20
git-ai ai semantic "semantic query" --topk 10
git-ai ai graph find "Foo"
git-ai ai graph children src/mcp/server.ts --as-file
git-ai ai graph query "?[name, kind] := *ast_symbol{ref_id, file, name, kind, signature, start_line, end_line}" --params "{}"
+git-ai ai dsr context --json
+git-ai ai dsr generate HEAD
+git-ai ai dsr rebuild-index
+git-ai ai dsr query symbol-evolution "GitAIV2MCPServer" --limit 200 --json
git-ai ai pack
git-ai ai unpack
git-ai ai agent install
git-ai ai serve
```
@@ -33,15 +38,27 @@
- `ai status --json` 可输出机器可读 JSON。
- `ai index` 的进度条输出到 stderr,stdout 保持为 JSON(避免破坏管道解析)。

-## Trae 一键安装(skills/rules)
+## DSR(按提交、不可变、确定性)

-将本仓库内置的 Trae Agent 模板(skills/rules)复制到目标仓库的 `.trae/` 目录,便于在 Trae 中直接加载。
+DSR 命令入口为 `git-ai ai dsr ...`,产物位于 `.git-ai/dsr/`。
+
+- `dsr context`:发现 repo root / HEAD / branch,并检测 DSR 目录状态
+- `dsr generate <commit>`:为单个提交生成 DSR(存在且不同会报错,不会覆盖)
+- `dsr rebuild-index`:从 DSR 重建可删的查询加速索引
+- `dsr query symbol-evolution <symbol>`:只读查询;先遍历 Git DAG,再读取 DSR 附着语义;缺失 DSR 会停止并报错
+
+## Agent 一键安装(skills/rules)
+
+将本仓库内置的 Agent 模板(skills/rules)复制到目标仓库的 `.agents/` 目录,便于主流 code agent 识别与加载。

```bash
cd /path/to/your-repo
git-ai ai agent install
git-ai ai agent install --overwrite
-git-ai ai agent install --to /custom/location/.trae
+git-ai ai agent install --to /custom/location/.agents
+
+# 可选:安装到 Trae 的 .trae 目录
+git-ai ai agent install --agent trae
```

## RepoMap(全局鸟瞰,可选)
diff --git a/docs/zh-CN/dsr.md b/docs/zh-CN/dsr.md
new file mode 100644
index 0000000..70b96a5
--- /dev/null
+++ b/docs/zh-CN/dsr.md
@@ -0,0 +1,91 @@
+# DSR(Deterministic Semantic Record)
+
+DSR 是一个 **按提交(per-commit)** 的、**不可变(immutable)**、**确定性(deterministic)** 的语义工件:每个 Git commit 对应一份 DSR 文件。DSR 只负责“丰富提交节点的语义”,**永远不定义 Git DAG 的边**。
+
+## 设计约束(不可违反)
+
+- Git commit DAG 是历史与分支的唯一权威来源
+- DSR 按提交生成:一个 commit → 一个 DSR 文件
+- DSR 一旦生成不可修改;若发现冲突,视为系统错误并停止
+- DSR 是规范工件(canonical artifact);数据库/索引仅是可重建缓存(rebuildable cache)
+- 绝不从语义数据推断 Git 拓扑(父子/分支/合并结构)
+
+冲突优先级:
+
+Git > DSR > Database > Heuristics
+
+缺失数据处理:
+
+- 缓存缺失:从 DSR + Git 重建
+- DSR 缺失:报告并停止(不要推断)
+
+## 存储布局
+
+DSR 相关产物位于仓库根目录:
+
+- `.git-ai/dsr/<commit>.json`:单提交 DSR(规范工件)
+- `.git-ai/dsr/dsr-index.sqlite`:DSR 查询加速索引(可删缓存)
+- `.git-ai/dsr/dsr-index.export.json`:非 SQLite 后端时的导出快照(用于跨进程复用)
+
+## DSR Schema(v1)
+
+必填字段:
+
+- `commit_hash`
+- `affected_symbols`
+- `ast_operations`
+- `semantic_change_type`
+
+可选字段:
+
+- `summary`(默认使用 commit subject)
+- `risk_level`
+
+禁止字段(避免编码拓扑/分支信息):
+
+- parent commits / branch names / merge topology
+
+## CLI 命令
+
+### Phase 0:上下文发现
+
+```bash
+git-ai ai dsr context --json
+```
+
+产物(JSON)包含:
+
+- `repo_root`
+- `commit_hash`(HEAD commit)
+- `branch` / `detached`
+- `dsr_directory_state`(.git-ai 与 dsr 目录存在性/文件数)
+
+### Phase 2:为单个提交生成 DSR
+
+```bash
+git-ai ai dsr generate <commit>
+```
+
+- `<commit>` 支持任何可解析为 commit 的 rev(例如 `HEAD`、sha、tag)
+- 生成路径固定为 `.git-ai/dsr/<commit>.json`
+- 若文件已存在且内容不同会报错并停止(保证不可变性)
+
+### Phase 3:从 DSR 重建缓存索引
+
+```bash
+git-ai ai dsr rebuild-index
+```
+
+该索引用于加速查询,语义事实不应只存在于数据库中。
+
+### Phase 6:只读查询(Git DAG 先行)
+
+```bash
+git-ai ai dsr query symbol-evolution <symbol> --limit 200 --json
+```
+
+行为要点:
+
+- 先按 `git rev-list --topo-order` 遍历 DAG
+- 每个 commit 再读取对应 DSR 进行语义附着
+- 遇到缺失 DSR 的 commit 会立刻停止并返回错误(不推断)
diff --git a/docs/zh-CN/hooks.md b/docs/zh-CN/hooks.md
index a4294f7..8c89cc2 100644
--- a/docs/zh-CN/hooks.md
+++ b/docs/zh-CN/hooks.md @@ -9,7 +9,6 @@ git-ai ai hooks status ``` ## 行为 -- `pre-commit`:自动 `index --overwrite` + `pack`,并把 `.git-ai/meta.json`、`.git-ai/lancedb.tar.gz` 加入暂存区 +- `pre-commit`:自动 `index --incremental --staged` + `pack`,并把 `.git-ai/meta.json`、`.git-ai/lancedb.tar.gz` 加入暂存区(索引内容以 staged 为准) - `pre-push`:再次 `pack`,若归档发生变化则阻止 push,提示先提交归档文件 - `post-checkout` / `post-merge`:若存在 `.git-ai/lancedb.tar.gz` 则自动 `unpack` - diff --git a/docs/zh-CN/mcp.md b/docs/zh-CN/mcp.md index a925d52..b7731d6 100644 --- a/docs/zh-CN/mcp.md +++ b/docs/zh-CN/mcp.md @@ -1,10 +1,10 @@ # MCP Server 接入 -`git-ai` 提供了一个基于 MCP (Model Context Protocol) 的 stdio Server,供 Agent (如 Claude Desktop, Trae 等) 调用,赋予 Agent “理解代码库”的能力。 +`git-ai` 提供了一个基于 MCP (Model Context Protocol) 的 stdio Server,供 Agent (如 Claude Desktop, Trae 等) 调用,赋予 Agent "理解代码库"的能力。 ## 启动 -在目标仓库目录执行: +在任意目录执行: ```bash git-ai ai serve @@ -15,43 +15,43 @@ git-ai ai serve ## 工具列表 ### 仓库管理 -- `get_repo({ path? })`:返回当前默认仓库根目录(调试用) -- `set_repo({ path })`:设置默认仓库路径,避免依赖进程工作目录 +- `get_repo({ path })`:返回指定 `path` 对应的仓库根目录与扫描根目录(调试用) ### 索引管理 -- `check_index({ path? })`:检查索引结构是否与当前版本一致(不一致需重建索引) -- `pack_index({ path?, lfs? })`:打包索引为 `.git-ai/lancedb.tar.gz`(可选启用 git-lfs track) -- `unpack_index({ path? })`:解包索引归档 +- `check_index({ path })`:检查索引结构是否与当前版本一致(不一致需重建索引) +- `rebuild_index({ path, dim?, overwrite? })`:重建全量索引(写入 `.git-ai/`;Risk: high) +- `pack_index({ path, lfs? })`:打包索引为 `.git-ai/lancedb.tar.gz`(可选启用 git-lfs track) +- `unpack_index({ path })`:解包索引归档 ### 检索 -- `search_symbols({ query, mode?, case_insensitive?, max_candidates?, limit?, lang?, path?, with_repo_map?, repo_map_max_files?, repo_map_max_symbols?, wiki_dir? })`:符号检索(lang: auto/all/java/ts;可选附带 repo_map) -- `semantic_search({ query, topk?, lang?, path?, with_repo_map?, repo_map_max_files?, repo_map_max_symbols?, wiki_dir? })`:基于 LanceDB + SQ8 的语义检索(lang: auto/all/java/ts;可选附带 repo_map) -- `repo_map({ path?, max_files?, max_symbols?, wiki_dir? 
})`:生成 repo map(重要文件/符号排名、引导 Wiki 阅读) -- `ast_graph_find({ prefix, limit?, lang?, path? })`:按名字前缀查找符号定义(大小写不敏感;lang: auto/all/java/ts) -- `ast_graph_children({ id, as_file?, path? })`:列出包含关系的直接子节点(文件→顶层符号、类→方法等) -- `ast_graph_refs({ name, limit?, lang?, path? })`:按名字查引用位置(call/new/type;lang: auto/all/java/ts) -- `ast_graph_callers({ name, limit?, lang?, path? })`:按名字查调用者(callee name;lang: auto/all/java/ts) -- `ast_graph_callees({ name, limit?, lang?, path? })`:按名字查被调用者(caller name;lang: auto/all/java/ts) -- `ast_graph_chain({ name, direction?, max_depth?, limit?, lang?, path? })`:按名字查调用链路(upstream/downstream,最大深度;lang: auto/all/java/ts) -- `ast_graph_query({ query, params?, path? })`:对 AST 图数据库执行 CozoScript 查询(进阶) +- `search_symbols({ path, query, mode?, case_insensitive?, max_candidates?, limit?, lang?, with_repo_map?, repo_map_max_files?, repo_map_max_symbols?, wiki_dir? })`:符号检索(lang: auto/all/java/ts;可选附带 repo_map) +- `semantic_search({ path, query, topk?, lang?, with_repo_map?, repo_map_max_files?, repo_map_max_symbols?, wiki_dir? })`:基于 LanceDB + SQ8 的语义检索(lang: auto/all/java/ts;可选附带 repo_map) +- `repo_map({ path, max_files?, max_symbols?, wiki_dir? })`:生成 repo map(重要文件/符号排名、引导 Wiki 阅读) +- `ast_graph_find({ path, prefix, limit?, lang? })`:按名字前缀查找符号定义(大小写不敏感;lang: auto/all/java/ts) +- `ast_graph_children({ path, id, as_file? })`:列出包含关系的直接子节点(文件→顶层符号、类→方法等) +- `ast_graph_refs({ path, name, limit?, lang? })`:按名字查引用位置(call/new/type;lang: auto/all/java/ts) +- `ast_graph_callers({ path, name, limit?, lang? })`:按名字查调用者(callee name;lang: auto/all/java/ts) +- `ast_graph_callees({ path, name, limit?, lang? })`:按名字查被调用者(caller name;lang: auto/all/java/ts) +- `ast_graph_chain({ path, name, direction?, max_depth?, limit?, lang? })`:按名字查调用链路(upstream/downstream,最大深度;lang: auto/all/java/ts) +- `ast_graph_query({ path, query, params? })`:对 AST 图数据库执行 CozoScript 查询(进阶) ### 文件读取 -- `list_files({ path?, pattern?, limit? 
})`:按 glob 列文件(默认忽略 node_modules, .git 等) -- `read_file({ path?, file, start_line?, end_line? })`:按行读取文件片段 +- `list_files({ path, pattern?, limit? })`:按 glob 列文件(默认忽略 node_modules, .git 等) +- `read_file({ path, file, start_line?, end_line? })`:按行读取文件片段 ## AST 图查询示例 列出指定文件里的顶层符号(推荐:无需手动算 file_id): ```js -ast_graph_children({ id: "src/mcp/server.ts", as_file: true }) +ast_graph_children({ path: "/ABS/PATH/TO/REPO", id: "src/mcp/server.ts", as_file: true }) ``` 查询某个方法/函数的调用者(推荐:用 callers/callees/chain,不用手写 CozoScript): ```js -ast_graph_callers({ name: "greet", limit: 50 }) -ast_graph_chain({ name: "greet", direction: "upstream", max_depth: 3 }) +ast_graph_callers({ path: "/ABS/PATH/TO/REPO", name: "greet", limit: 50 }) +ast_graph_chain({ path: "/ABS/PATH/TO/REPO", name: "greet", direction: "upstream", max_depth: 3 }) ``` 列出指定文件里的顶层符号(进阶:直接写 CozoScript,需要 file_id): @@ -64,73 +64,97 @@ ast_graph_chain({ name: "greet", direction: "upstream", max_depth: 3 }) ``` ## 推荐调用方式(让 Agent 自动传对路径) -- 第一次调用先 `set_repo({path: "/ABS/PATH/TO/REPO"})` -- 后续工具调用不传 `path`(走默认仓库) +- MCP tools 的 `path` 为必传:每次工具调用都必须显式传 `path: "/ABS/PATH/TO/REPO"`(保证调用原子性) ## RepoMap 使用建议 -repo map 用于给 Agent 一个“全局鸟瞰 + 导航入口”(重要文件/符号 + Wiki 关联),建议作为分析前置步骤: +repo map 用于给 Agent 一个"全局鸟瞰 + 导航入口"(重要文件/符号 + Wiki 关联),建议作为分析前置步骤: ```js -repo_map({ max_files: 20, max_symbols: 5 }) +repo_map({ path: "/ABS/PATH/TO/REPO", max_files: 20, max_symbols: 5 }) ``` 如果你希望在一次检索结果里顺带附加 repo map(默认关闭,避免输出膨胀): ```js -search_symbols({ query: "Foo", limit: 20, with_repo_map: true, repo_map_max_files: 20, repo_map_max_symbols: 5 }) -semantic_search({ query: "where is auth handled", topk: 5, with_repo_map: true }) +search_symbols({ path: "/ABS/PATH/TO/REPO", query: "Foo", limit: 20, with_repo_map: true, repo_map_max_files: 20, repo_map_max_symbols: 5 }) +semantic_search({ path: "/ABS/PATH/TO/REPO", query: "where is auth handled", topk: 5, with_repo_map: true }) ``` ## Agent Skills / Rules -本仓库提供了 Agent 可直接复用的 Skill/Rule 模版,旨在让 Agent 
能够遵循最佳实践来使用上述工具。 +本仓库提供了 Agent 可直接复用的 Skill/Rule 模板,旨在让 Agent 能够遵循最佳实践来使用上述工具。 -- Skill 源码:[../.trae/skills/git-ai-mcp/SKILL.md](../.trae/skills/git-ai-mcp/SKILL.md) -- Rule 源码:[../.trae/rules/git-ai-mcp/RULE.md](../.trae/rules/git-ai-mcp/RULE.md) +### YAML 格式模板 -### Skill 使用模版 +- **Skill**: [`templates/agents/common/skills/git-ai/skill.yaml`](../../templates/agents/common/skills/git-ai/skill.yaml) - 指导 Agent 如何使用 git-ai 的 Git-native 语义体系(包含 DSR 约束)与 MCP 工具 + - 包含:触发条件、工作流步骤、工具定义、输出要求、常见陷阱 + +- **Rule**: [`templates/agents/common/rules/git-ai.yaml`](../../templates/agents/common/rules/git-ai.yaml) - 约束 Agent 使用 git-ai MCP 的行为 + - 包含:必须遵守的规则、推荐策略、禁止事项、Git Hooks 规则、Manifest Workspace 规则 -#### 目标 -用最少 token 完成“从问题 → 命中点 → 读代码 → 给结论/改动建议”的闭环: -1) 先用索引工具命中位置(符号/语义) -2) 再精读关键文件片段(按行读取) -3) 必要时重建/打包索引,保证检索结果可靠 +### Markdown 模版(便于直接阅读/复制) -#### 索引保障(很关键) -当出现以下情况之一,先重建索引: -- `search_symbols` / `semantic_search` 没结果或明显过时 -- 用户刚改了大量文件/刚切分支/刚合并 +- **Skill**: [`templates/agents/common/skills/git-ai-mcp/SKILL.md`](../../templates/agents/common/skills/git-ai-mcp/SKILL.md) +- **Rule**: [`templates/agents/common/rules/git-ai-mcp/RULE.md`](../../templates/agents/common/rules/git-ai-mcp/RULE.md) -建议: -- 用 `check_index({})` 判断索引结构是否兼容 -- 用 CLI 重建索引:`git-ai ai index --overwrite` -- 如需共享索引:`pack_index({ lfs: false })` +### 安装到 Trae -#### 检视套路(推荐顺序) +将本仓库的 Skills 和 Rules 安装到当前项目的 `.agents` 目录(默认): -**1) 符号定位(最稳)** -当用户提到函数/类/文件名/模块名: -- `search_symbols({ query: "FooBar", limit: 50 })` -- `search_symbols({ query: "get*repo", mode: "wildcard", case_insensitive: true, limit: 20 })` +```bash +cd /path/to/your-repo +git-ai ai agent install +git-ai ai agent install --overwrite +git-ai ai agent install --to /custom/location/.agents +``` + +如果你希望安装到 Trae 的 `.trae` 目录: + +```bash +git-ai ai agent install --agent trae +``` + +### Skill 工作流概览 + +根据 `skill.yaml`,推荐的工作流程: + +1. **绑定仓库** (`bind_repo`) - 确保仓库已绑定 +2. **确保索引新鲜** (`ensure_index`) - 必要时重建索引 +3. 
**定位符号** (`locate_symbols`) - 使用 `search_symbols` 精确查找 +4. **语义搜索** (`semantic_search`) - 自然语言描述搜索 +5. **浏览文件** (`browse_files`) - 使用 `list_files` 查找文件 +6. **AST 查询** (`ast_query`) - 递归/关系类查询 +7. **读取代码** (`read_code`) - 使用 `read_file` 读取关键片段 + +### Rule 约束概览 + +根据 `rule.yaml`,Agent 必须遵守: + +- **bind_repo_first**: 先绑定仓库再操作 +- **index_before_search**: 搜索无结果时先重建索引 +- **evidence_based_conclusion**: 结论必须有证据(文件+行号) +- **path_safety**: 禁止读取仓库外路径 +- **storage_cost_evaluation**: 评估存储成本 + +禁止事项包括: +- 默认使用外部 embedding 服务 +- 直接提交 `.git-ai/lancedb/` 目录 +- 使用 `../` 读取外部文件 +- 假设索引是最新的而不检查 -输出 rows 后,选最可能的 1-3 个命中点继续读代码: -- `read_file({ file: "src/xxx.ts", start_line: 1, end_line: 220 })` +## DSR 与 MCP 的关系 -**2) 语义检索(问法更自然)** -当用户描述行为(“在哪里初始化 DB / 哪里处理 auth / 错误如何返回”): -- `semantic_search({ query: "where do we ...", topk: 5 })` +- MCP tools 主要覆盖“索引(.git-ai)构建与检索”,用于让 Agent 低成本定位证据 +- DSR 是“按提交的语义工件(.git-ai/dsr)”,用于语义历史/演化类查询与可重建缓存 +- 任何历史遍历都必须从 Git DAG 出发(DSR 只 enrich 节点,不定义边) -**3) 文件浏览** -当你需要找入口文件、配置文件、或按模式定位: -- `list_files({ pattern: "src/**/*.{ts,tsx,js,jsx}", limit: 500 })` +DSR 相关命令见:[DSR 文档](./dsr.md) +## 输出要求 -**4) AST 图谱(结构化问题优先)** -当你要回答“这个文件里有哪些顶层定义 / 这个类有哪些方法 / 继承结构怎么查”等问题: -- `ast_graph_find({ prefix: "GitAI", limit: 20 })` -- `ast_graph_children({ id: "src/mcp/server.ts", as_file: true })` -- 复杂关系(extends/implements/递归)再退回:`ast_graph_query({ query: "", params: {...} })` +Agent 使用 git-ai MCP 工具时应遵循: -#### 输出要求 -- 先给结论,再给证据(文件 + 行范围) -- 引用代码位置用 IDE 可点链接(file://...#Lx-Ly) +1. **先给结论,再给证据** - 先总结发现,再提供详细位置 +2. **使用 IDE 可点击链接** - 格式:`file:///path/to/file#L10-L20` +3. **最小改动原则** - 建议修改时避免引入新依赖 +4. 
**证据必须基于 read_file** - 不要基于假设或猜测 diff --git a/docs/zh-CN/quickstart.md b/docs/zh-CN/quickstart.md index 7a3fd05..75f54ae 100644 --- a/docs/zh-CN/quickstart.md +++ b/docs/zh-CN/quickstart.md @@ -14,6 +14,12 @@ git-ai ai status git-ai ai index --overwrite ``` +后续如果你只想对变更文件做快速更新(例如配合 git hooks 在提交前重建索引),可以使用增量模式: + +```bash +git-ai ai index --incremental --staged +``` + ## 语义检索/符号检索 ```bash diff --git a/docs/zh-CN/troubleshooting.md b/docs/zh-CN/troubleshooting.md index a302161..d4b3433 100644 --- a/docs/zh-CN/troubleshooting.md +++ b/docs/zh-CN/troubleshooting.md @@ -4,8 +4,9 @@ stdio server 正常行为是“等待客户端连接”。如果你在终端直接运行,看起来像卡住是正常的。 ## search_symbols/semantic_search 查不到结果 -- 先在仓库执行:`git-ai ai index --overwrite` -- 如果你是通过 MCP 客户端启动且 cwd 不在仓库目录:先 `set_repo({path: ...})` +- 先在仓库执行:`git-ai ai index --overwrite`(或仅更新变更:`git-ai ai index --incremental --staged`) +- 如果你是通过 MCP 客户端启动且 cwd 不在仓库目录: + - MCP tools 的 `path` 为必传:每次 MCP tool 调用都显式传 `path: "/ABS/PATH/TO/REPO"`(保证调用原子性) ## Windows / Linux 安装失败 - Node 版本需 >= 18,且架构为 x64/arm64(LanceDB N-API 预编译包支持的范围) diff --git a/package.json b/package.json index 4c8f9e2..56a76bf 100644 --- a/package.json +++ b/package.json @@ -18,7 +18,7 @@ "dist/**", "docs/**", "assets/**", - ".trae/**", + "templates/**", "README.md" ], "keywords": [], diff --git a/src/commands/ai.ts b/src/commands/ai.ts index 9982959..bacc963 100644 --- a/src/commands/ai.ts +++ b/src/commands/ai.ts @@ -10,12 +10,14 @@ import { graphCommand } from './graph'; import { checkIndexCommand } from './checkIndex'; import { statusCommand } from './status'; import { agentCommand } from './trae'; +import { dsrCommand } from './dsr'; export const aiCommand = new Command('ai') .description('AI features (indexing, search, hooks, MCP)') .addCommand(indexCommand) .addCommand(checkIndexCommand) .addCommand(statusCommand) + .addCommand(dsrCommand) .addCommand(queryCommand) .addCommand(semanticCommand) .addCommand(graphCommand) diff --git a/src/commands/dsr.ts b/src/commands/dsr.ts new 
file mode 100644 index 0000000..ac3056a --- /dev/null +++ b/src/commands/dsr.ts @@ -0,0 +1,154 @@ +import { Command } from 'commander'; +import path from 'path'; +import { detectRepoGitContext } from '../core/dsr/gitContext'; +import { generateDsrForCommit } from '../core/dsr/generate'; +import { materializeDsrIndex } from '../core/dsr/indexMaterialize'; +import { symbolEvolution } from '../core/dsr/query'; +import { getDsrDirectoryState } from '../core/dsr/state'; + +export const dsrCommand = new Command('dsr') + .description('Deterministic Semantic Record (per-commit, immutable, Git-addressable)'); + +dsrCommand + .command('context') + .description('Discover repository root, HEAD, branch, and DSR directory state') + .option('-p, --path ', 'Path inside the repository', '.') + .option('--json', 'Output machine-readable JSON', false) + .action(async (options) => { + const start = path.resolve(options.path); + const ctx = await detectRepoGitContext(start); + const state = await getDsrDirectoryState(ctx.repo_root); + const out = { + commit_hash: ctx.head_commit, + repo_root: ctx.repo_root, + branch: ctx.branch, + detached: ctx.detached, + dsr_directory_state: state, + }; + if (options.json) { + console.log(JSON.stringify(out, null, 2)); + process.exit(0); + } + const lines: string[] = []; + lines.push(`repo: ${out.repo_root}`); + lines.push(`head: ${out.commit_hash}`); + lines.push(`branch: ${out.detached ? '(detached)' : out.branch}`); + lines.push(`dsrCacheRoot: ${out.dsr_directory_state.cache_root} (${out.dsr_directory_state.cache_root_exists ? 'exists' : 'missing'})`); + lines.push(`dsrDir: ${out.dsr_directory_state.dsr_dir} (${out.dsr_directory_state.dsr_dir_exists ? 
'exists' : 'missing'})`); + lines.push(`dsrFiles: ${String(out.dsr_directory_state.dsr_file_count)}`); + console.log(lines.join('\n')); + process.exit(0); + }); + +dsrCommand + .command('generate') + .description('Generate DSR for exactly one commit') + .argument('<commit>', 'Commit hash (any rev that resolves to a commit)') + .option('-p, --path <path>', 'Path inside the repository', '.') + .option('--json', 'Output machine-readable JSON', false) + .action(async (commit: string, options) => { + const start = path.resolve(options.path); + const ctx = await detectRepoGitContext(start); + const res = await generateDsrForCommit(ctx.repo_root, String(commit)); + const out = { + commit_hash: res.dsr.commit_hash, + file_path: res.file_path, + existed: res.existed, + counts: { + affected_symbols: res.dsr.affected_symbols.length, + ast_operations: res.dsr.ast_operations.length, + }, + semantic_change_type: res.dsr.semantic_change_type, + risk_level: res.dsr.risk_level, + }; + if (options.json) { + console.log(JSON.stringify(out, null, 2)); + process.exit(0); + } + const lines: string[] = []; + lines.push(`commit: ${out.commit_hash}`); + lines.push(`dsr: ${out.file_path}`); + lines.push(`status: ${out.existed ? 'exists' : 'generated'}`); + lines.push(`ops: ${String(out.counts.ast_operations)}`); + lines.push(`affected_symbols: ${String(out.counts.affected_symbols)}`); + lines.push(`semantic_change_type: ${out.semantic_change_type}`); + lines.push(`risk_level: ${out.risk_level ??
'unknown'}`); + console.log(lines.join('\n')); + process.exit(0); + }); + +dsrCommand + .command('rebuild-index') + .description('Rebuild performance-oriented DSR index from DSR files') + .option('-p, --path <path>', 'Path inside the repository', '.') + .option('--json', 'Output machine-readable JSON', false) + .action(async (options) => { + const start = path.resolve(options.path); + const ctx = await detectRepoGitContext(start); + const res = await materializeDsrIndex(ctx.repo_root); + if (options.json) { + console.log(JSON.stringify({ repo_root: ctx.repo_root, ...res }, null, 2)); + process.exit(res.enabled ? 0 : 2); + } + if (!res.enabled) { + console.error(res.skippedReason ?? 'rebuild-index skipped'); + process.exit(2); + } + const lines: string[] = []; + lines.push(`repo: ${ctx.repo_root}`); + lines.push(`engine: ${res.engine}`); + if (res.dbPath) lines.push(`db: ${res.dbPath}`); + if (res.exportPath) lines.push(`export: ${res.exportPath}`); + if (res.counts) { + lines.push(`commits: ${String(res.counts.commits)}`); + lines.push(`affected_symbols: ${String(res.counts.affected_symbols)}`); + lines.push(`ast_operations: ${String(res.counts.ast_operations)}`); + } + console.log(lines.join('\n')); + process.exit(0); + }); + +const queryCommand = new Command('query').description('Read-only semantic queries over Git DAG + DSR'); + +queryCommand + .command('symbol-evolution') + .description('List commits where a symbol changed (requires DSR per traversed commit)') + .argument('<symbol>', 'Symbol name') + .option('-p, --path <path>', 'Path inside the repository', '.') + .option('--all', 'Traverse all refs (default: from HEAD)', false) + .option('--start <commit>', 'Start commit (default: HEAD)') + .option('--limit <n>', 'Max commits to traverse', (v) => Number(v), 200) + .option('--contains', 'Match by substring instead of exact match', false) + .option('--json', 'Output machine-readable JSON', false) + .action(async (symbol: string, options) => { + const startDir = path.resolve(options.path); +
const ctx = await detectRepoGitContext(startDir); + const res = await symbolEvolution(ctx.repo_root, String(symbol), { + all: Boolean(options.all), + start: options.start ? String(options.start) : undefined, + limit: Number(options.limit), + contains: Boolean(options.contains), + }); + if (options.json) { + console.log(JSON.stringify({ repo_root: ctx.repo_root, symbol, ...res }, null, 2)); + process.exit(res.ok ? 0 : 2); + } + if (!res.ok) { + console.error(`missing DSR for commit: ${res.missing_dsrs?.[0] ?? 'unknown'}`); + process.exit(2); + } + const hits = res.hits ?? []; + const lines: string[] = []; + lines.push(`repo: ${ctx.repo_root}`); + lines.push(`symbol: ${symbol}`); + lines.push(`hits: ${String(hits.length)}`); + for (const h of hits.slice(0, 50)) { + const opKinds = Array.from(new Set(h.operations.map((o) => o.op))).sort().join(','); + lines.push(`${h.commit_hash} ${h.semantic_change_type} ${h.risk_level ?? ''} ops=${String(h.operations.length)} kinds=${opKinds} ${h.summary ?? ''}`.trim()); + } + if (hits.length > 50) lines.push(`... 
(${hits.length - 50} more)`); + console.log(lines.join('\n')); + process.exit(0); + }); + +dsrCommand.addCommand(queryCommand); diff --git a/src/commands/index.ts index 3e9b144..84864a7 100644 --- a/src/commands/index.ts +++ b/src/commands/index.ts @@ -1,22 +1,32 @@ import { Command } from 'commander'; import path from 'path'; +import fs from 'fs-extra'; import { inferScanRoot, resolveGitRoot } from '../core/git'; import { IndexerV2 } from '../core/indexer'; import { createLogger } from '../core/log'; +import { getStagedNameStatus, getWorktreeNameStatus } from '../core/gitDiff'; +import { IncrementalIndexerV2 } from '../core/indexerIncremental'; export const indexCommand = new Command('index') .description('Build LanceDB+SQ8 index for the current repository (HEAD working tree)') .option('-p, --path <path>', 'Path inside the repository', '.') .option('-d, --dim <n>', 'Embedding dimension', '256') .option('--overwrite', 'Overwrite existing tables', false) + .option('--incremental', 'Incremental indexing (only changed files)', false) + .option('--staged', 'Read changed file contents from Git index (staged)', false) .action(async (options) => { const log = createLogger({ component: 'cli', cmd: 'ai index' }); const startedAt = Date.now(); try { const repoRoot = await resolveGitRoot(path.resolve(options.path)); const scanRoot = inferScanRoot(repoRoot); - const dim = Number(options.dim); + const requestedDim = Number(options.dim); const overwrite = Boolean(options.overwrite); + const incremental = Boolean((options as any).incremental ?? false); + const staged = Boolean((options as any).staged ?? false); + const metaPath = path.join(repoRoot, '.git-ai', 'meta.json'); + const meta = await fs.readJSON(metaPath).catch(() => null); + const dim = typeof meta?.dim === 'number' ?
meta.dim : requestedDim; const isTTY = Boolean(process.stderr.isTTY) && !process.env.CI; let renderedInTTY = false; let finishedInTTY = false; @@ -49,11 +59,24 @@ export const indexCommand = new Command('index') } }; - const indexer = new IndexerV2({ repoRoot, scanRoot, dim, overwrite, onProgress: renderProgress }); - await indexer.run(); + if (incremental) { + const changes = staged ? await getStagedNameStatus(repoRoot) : await getWorktreeNameStatus(repoRoot); + const indexer = new IncrementalIndexerV2({ + repoRoot, + scanRoot, + dim, + source: staged ? 'staged' : 'worktree', + changes, + onProgress: renderProgress, + }); + await indexer.run(); + } else { + const indexer = new IndexerV2({ repoRoot, scanRoot, dim, overwrite, onProgress: renderProgress }); + await indexer.run(); + } if (renderedInTTY && !finishedInTTY) process.stderr.write('\n'); log.info('index_repo', { ok: true, repoRoot, scanRoot, dim, overwrite, duration_ms: Date.now() - startedAt }); - console.log(JSON.stringify({ ok: true, repoRoot, scanRoot, dim, overwrite }, null, 2)); + console.log(JSON.stringify({ ok: true, repoRoot, scanRoot, dim, overwrite, incremental, staged }, null, 2)); } catch (e) { log.error('index_repo', { ok: false, duration_ms: Date.now() - startedAt, err: e instanceof Error ? { name: e.name, message: e.message, stack: e.stack } : { message: String(e) } }); process.exit(1); diff --git a/src/commands/serve.ts index 4003de1..ce436a2 100644 --- a/src/commands/serve.ts +++ b/src/commands/serve.ts @@ -1,13 +1,11 @@ import { Command } from 'commander'; -import path from 'path'; import { GitAIV2MCPServer } from '../mcp/server'; export const serveCommand = new Command('serve') - .description('Start MCP server (stdio) using current directory by default') - .option('-p, --path <path>', 'Repository path (defaults to current directory)', '.') + .description('Start MCP server (stdio).
Repository is specified by path in each tool call.') .option('--disable-mcp-log', 'Disable MCP access logging') .action(async (options) => { - const server = new GitAIV2MCPServer(path.resolve(options.path), { + const server = new GitAIV2MCPServer(process.cwd(), { disableAccessLog: !!options.disableMcpLog, }); await server.start(); diff --git a/src/commands/trae.ts index 9668ebb..39b714b 100644 --- a/src/commands/trae.ts +++ b/src/commands/trae.ts @@ -35,28 +35,31 @@ export const agentCommand = new Command('agent') .alias('trae') .addCommand( new Command('install') - .description('Install skills/rules templates (default: <repoRoot>/.trae)') + .description('Install skills/rules templates (default: <repoRoot>/.agents)') + .option('-p, --path <path>', 'Path inside the repository', '.') - .option('--to <dir>', 'Destination .trae directory (overrides --path)', '') + .option('--to <dir>', 'Destination directory (overrides default)', '') + .option('--agent <name>', 'Template layout: agents|trae', 'agents') .option('--overwrite', 'Overwrite existing files', false) .action(async (options) => { const repoRoot = await resolveGitRoot(path.resolve(options.path)); - const destTraeDir = String(options.to ?? '').trim() ? path.resolve(String(options.to)) : path.join(repoRoot, '.trae'); + const agent = String((options as any).agent ?? 'agents').trim().toLowerCase(); + const defaultDirName = agent === 'trae' ? '.trae' : '.agents'; + const destDir = String(options.to ?? '').trim() ? path.resolve(String(options.to)) : path.join(repoRoot, defaultDirName); + const overwrite = Boolean(options.overwrite ??
false); const packageRoot = await findPackageRoot(__dirname); - const srcTraeDir = path.join(packageRoot, '.trae'); - const srcSkillsDir = path.join(srcTraeDir, 'skills'); - const srcRulesDir = path.join(srcTraeDir, 'rules'); + const srcTemplateDir = path.join(packageRoot, 'templates', 'agents', 'common'); + const srcSkillsDir = path.join(srcTemplateDir, 'skills'); + const srcRulesDir = path.join(srcTemplateDir, 'rules'); if (!await fs.pathExists(srcSkillsDir) || !await fs.pathExists(srcRulesDir)) { - console.log(JSON.stringify({ ok: false, repoRoot, error: 'template_missing', srcTraeDir }, null, 2)); + console.log(JSON.stringify({ ok: false, repoRoot, error: 'template_missing', srcTemplateDir }, null, 2)); process.exitCode = 2; return; } - const dstSkillsDir = path.join(destTraeDir, 'skills'); - const dstRulesDir = path.join(destTraeDir, 'rules'); - await fs.ensureDir(destTraeDir); + const dstSkillsDir = path.join(destDir, 'skills'); + const dstRulesDir = path.join(destDir, 'rules'); + await fs.ensureDir(destDir); await fs.copy(srcSkillsDir, dstSkillsDir, { overwrite }); await fs.copy(srcRulesDir, dstRulesDir, { overwrite }); @@ -64,6 +67,6 @@ export const agentCommand = new Command('agent') skills: await listDirNames(dstSkillsDir), rules: await listDirNames(dstRulesDir), }; - console.log(JSON.stringify({ ok: true, repoRoot, destTraeDir, overwrite, installed }, null, 2)); + console.log(JSON.stringify({ ok: true, repoRoot, agent, destDir, overwrite, installed }, null, 2)); }) ); diff --git a/src/core/astGraph.ts b/src/core/astGraph.ts index 3149771..36b1b78 100644 --- a/src/core/astGraph.ts +++ b/src/core/astGraph.ts @@ -1,5 +1,7 @@ import fs from 'fs-extra'; import { openRepoCozoDb, repoAstGraphExportPath } from './cozo'; +import { sha256Hex } from './crypto'; +import { toPosixPath } from './paths'; export interface AstGraphData { files: Array<[string, string, string]>; @@ -27,38 +29,43 @@ export interface WriteAstGraphResult { skippedReason?: string; } -export 
async function writeAstGraphToCozo(repoRoot: string, data: AstGraphData): Promise<WriteAstGraphResult> { +export type WriteAstGraphMode = 'replace' | 'put'; + +export async function writeAstGraphToCozo(repoRoot: string, data: AstGraphData, opts?: { mode?: WriteAstGraphMode }): Promise<WriteAstGraphResult> { const db = await openRepoCozoDb(repoRoot); if (!db) return { enabled: false, skippedReason: 'Cozo backend not available (see .git-ai/cozo.error.json)' }; + const mode: WriteAstGraphMode = opts?.mode ?? 'replace'; + const op = mode === 'put' ? ':put' : ':replace'; + const script = ` { ?[file_id, file, lang] <- $files - :replace ast_file { file_id: String => file: String, lang: String } + ${op} ast_file { file_id: String => file: String, lang: String } } { ?[ref_id, file, lang, name, kind, signature, start_line, end_line] <- $symbols - :replace ast_symbol { ref_id: String => file: String, lang: String, name: String, kind: String, signature: String, start_line: Int, end_line: Int } + ${op} ast_symbol { ref_id: String => file: String, lang: String, name: String, kind: String, signature: String, start_line: Int, end_line: Int } } { ?[parent_id, child_id] <- $contains - :replace ast_contains { parent_id: String, child_id: String } + ${op} ast_contains { parent_id: String, child_id: String } } { ?[sub_id, super_name] <- $extends_name - :replace ast_extends_name { sub_id: String, super_name: String } + ${op} ast_extends_name { sub_id: String, super_name: String } } { ?[sub_id, iface_name] <- $implements_name - :replace ast_implements_name { sub_id: String, iface_name: String } + ${op} ast_implements_name { sub_id: String, iface_name: String } } { ?[from_id, from_lang, name, ref_kind, file, line, col] <- $refs_name - :replace ast_ref_name { from_id: String, from_lang: String, name: String, ref_kind: String, file: String, line: Int, col: Int } + ${op} ast_ref_name { from_id: String, from_lang: String, name: String, ref_kind: String, file: String, line: Int, col: Int } } { ?[caller_id, caller_lang, callee_name,
file, line, col] <- $calls_name - :replace ast_call_name { caller_id: String, caller_lang: String, callee_name: String, file: String, line: Int, col: Int } + ${op} ast_call_name { caller_id: String, caller_lang: String, callee_name: String, file: String, line: Int, col: Int } } `; @@ -84,3 +91,64 @@ export async function writeAstGraphToCozo(repoRoot: string, data: AstGraphData): }, }; } + +export async function removeFileFromAstGraph(repoRoot: string, file: string): Promise<WriteAstGraphResult> { + const db = await openRepoCozoDb(repoRoot); + if (!db) return { enabled: false, skippedReason: 'Cozo backend not available (see .git-ai/cozo.error.json)' }; + + const filePosix = toPosixPath(file); + const file_id = sha256Hex(`file:${filePosix}`); + + const script = ` +syms[ref_id] := *ast_symbol{ref_id, file}, file == $file +{ + ?[ref_id] := syms[ref_id] + :rm ast_symbol { ref_id } +} +{ + ?[parent_id, child_id] := *ast_contains{parent_id, child_id}, syms[child_id] + :rm ast_contains { parent_id, child_id } +} +{ + ?[parent_id, child_id] := *ast_contains{parent_id, child_id}, syms[parent_id] + :rm ast_contains { parent_id, child_id } +} +{ + ?[parent_id, child_id] := *ast_contains{parent_id, child_id}, parent_id == $file_id + :rm ast_contains { parent_id, child_id } +} +{ + ?[sub_id, super_name] := *ast_extends_name{sub_id, super_name}, syms[sub_id] + :rm ast_extends_name { sub_id, super_name } +} +{ + ?[sub_id, iface_name] := *ast_implements_name{sub_id, iface_name}, syms[sub_id] + :rm ast_implements_name { sub_id, iface_name } +} +{ + ?[from_id, from_lang, name, ref_kind, file, line, col] := *ast_ref_name{from_id, from_lang, name, ref_kind, file, line, col}, file == $file + :rm ast_ref_name { from_id, from_lang, name, ref_kind, file, line, col } +} +{ + ?[caller_id, caller_lang, callee_name, file, line, col] := *ast_call_name{caller_id, caller_lang, callee_name, file, line, col}, file == $file + :rm ast_call_name { caller_id, caller_lang, callee_name, file, line, col } +} +{ + ?[file_id]
<- [[$file_id]] + :rm ast_file { file_id } +} +`; + + await db.run(script, { file: filePosix, file_id } as any); + if (db.engine !== 'sqlite' && db.exportRelations) { + const exported = await db.exportRelations(['ast_file', 'ast_symbol', 'ast_contains', 'ast_extends_name', 'ast_implements_name', 'ast_ref_name', 'ast_call_name']); + await fs.writeJSON(repoAstGraphExportPath(repoRoot), exported, { spaces: 2 }); + } + if (db.close) await db.close(); + + return { + enabled: true, + engine: db.engine, + dbPath: db.dbPath, + }; +} diff --git a/src/core/cozo.ts index b6c9273..dd6a826 100644 --- a/src/core/cozo.ts +++ b/src/core/cozo.ts @@ -31,6 +31,16 @@ await client.importRelations(data); } +async function tryImportFromExportPath(exportPath: string | null | undefined, client: CozoClient): Promise<void> { + if (!exportPath) return; + if (!client.importRelations) return; + if (client.engine === 'sqlite') return; + if (!await fs.pathExists(exportPath)) return; + const data = await fs.readJSON(exportPath).catch(() => null); + if (!data) return; + await client.importRelations(data); +} + async function openCozoNode(repoRoot: string): Promise<CozoClient> { let mod: any; try { @@ -136,6 +146,107 @@ return client; } +export async function openCozoDbAtPath(dbPath: string, exportPath?: string): Promise<CozoClient | null> { + const errors: string[] = []; + try { + const moduleName: string = 'cozo-node'; + const mod = await import(moduleName); + const CozoDb = mod?.CozoDb ?? mod?.default?.CozoDb ?? mod?.default ??
mod; + if (typeof CozoDb !== 'function') throw new Error('cozo-node loaded but CozoDb export is missing'); + await fs.ensureDir(path.dirname(dbPath)); + + let db: any; + let engine: CozoClient['engine'] = 'mem'; + try { + db = new CozoDb('sqlite', dbPath); + engine = 'sqlite'; + } catch (e1) { + try { + db = new CozoDb({ engine: 'sqlite', path: dbPath }); + engine = 'sqlite'; + } catch { + db = new CozoDb(); + engine = 'mem'; + } + } + + const client: CozoClient = { + backend: 'cozo-node', + engine, + dbPath: engine === 'sqlite' ? dbPath : undefined, + run: async (script: string, params?: Record<string, unknown>) => db.run(script, params ?? {}), + exportRelations: typeof db.exportRelations === 'function' ? async (rels: string[]) => db.exportRelations(rels) : undefined, + importRelations: typeof db.importRelations === 'function' ? async (data: any) => db.importRelations(data) : undefined, + close: typeof db.close === 'function' ? async () => { await db.close(); } : undefined, + }; + await tryImportFromExportPath(exportPath, client); + return client; + } catch (e: any) { + errors.push(String(e?.message ?? e)); + } + + try { + const moduleName: string = 'cozo-lib-wasm'; + const mod = await import(moduleName); + const init = mod?.default; + const CozoDb = mod?.CozoDb; + if (typeof init !== 'function' || typeof CozoDb?.new !== 'function') { + throw new Error('cozo-lib-wasm loaded but exports are not compatible'); + } + + if (!cozoWasmInit) cozoWasmInit = Promise.resolve(init()).then(() => {}); + await cozoWasmInit; + + const db: any = CozoDb.new(); + const run = async (script: string, params?: Record<string, unknown>) => { + const out = db.run(String(script), JSON.stringify(params ??
{})); + try { + return JSON.parse(String(out)); + } catch { + return out; + } + }; + + const exportRelations = async (relations: string[]) => { + if (typeof db.export_relations !== 'function') return null; + const out = db.export_relations(JSON.stringify(relations)); + try { + return JSON.parse(String(out)); + } catch { + return out; + } + }; + + const importRelations = async (data: any) => { + if (typeof db.import_relations !== 'function') return null; + const out = db.import_relations(JSON.stringify(data)); + try { + return JSON.parse(String(out)); + } catch { + return out; + } + }; + + const client: CozoClient = { + backend: 'cozo-wasm', + engine: 'mem', + run, + exportRelations, + importRelations, + close: typeof db.free === 'function' ? async () => { db.free(); } : undefined, + }; + + await tryImportFromExportPath(exportPath, client); + return client; + } catch (e: any) { + errors.push(String(e?.message ?? e)); + } + + await fs.ensureDir(path.dirname(dbPath)); + await fs.writeJSON(path.join(path.dirname(dbPath), 'cozo.error.json'), { errors }, { spaces: 2 }).catch(() => {}); + return null; +} + export async function openRepoCozoDb(repoRoot: string): Promise<CozoClient | null> { const errors: string[] = []; try { diff --git a/src/core/dsr/generate.ts new file mode 100644 index 0000000..796e18a --- /dev/null +++ b/src/core/dsr/generate.ts @@ -0,0 +1,332 @@ +import fs from 'fs-extra'; +import path from 'path'; +import { sha256Hex } from '../crypto'; +import { toPosixPath } from '../paths'; +import { assertCommitExists, getCommitParents, getCommitSubject, getNameStatusBetween, gitShowFile, resolveCommitHash } from './gitContext'; +import { dsrDirectory, dsrFilePath } from './paths'; +import { SnapshotCodeParser } from './snapshotParser'; +import { DeterministicSemanticRecord, DsrAstOperation, DsrOperationKind, DsrRiskLevel, DsrSemanticChangeType, DsrSymbolDescriptor } from './types'; + +export interface GenerateDsrResult { + dsr:
DeterministicSemanticRecord; + file_path: string; + existed: boolean; +} + +function normalizeFilePath(p: string): string { + return toPosixPath(p); +} + +function symbolContainerKey(s: { container?: { kind: string; name: string } }): string { + if (!s.container) return ''; + return `${s.container.kind}:${s.container.name}`; +} + +function symbolKeyFull(file: string, s: { kind: string; name: string; signature: string; container?: { kind: string; name: string } }): string { + return `${file}|${symbolContainerKey(s)}|${s.kind}|${s.name}|${s.signature}`; +} + +function symbolKeyNoSig(file: string, s: { kind: string; name: string; container?: { kind: string; name: string } }): string { + return `${file}|${symbolContainerKey(s)}|${s.kind}|${s.name}`; +} + +function clampLine(n: number, min: number, max: number): number { + if (!Number.isFinite(n)) return min; + if (n < min) return min; + if (n > max) return max; + return n; +} + +function computeRangeHash(content: string, startLine: number, endLine: number): string { + const lines = content.split('\n'); + const maxLine = Math.max(1, lines.length); + const s = clampLine(startLine, 1, maxLine); + const e = clampLine(endLine, 1, maxLine); + const from = Math.min(s, e); + const to = Math.max(s, e); + const slice = lines.slice(from - 1, to).join('\n'); + return sha256Hex(slice); +} + +interface SymbolSnap { + desc: DsrSymbolDescriptor; + content_hash: string; +} + +function toDescriptor(file: string, s: any): DsrSymbolDescriptor { + const out: DsrSymbolDescriptor = { + file, + kind: String(s.kind), + name: String(s.name), + signature: String(s.signature ?? ''), + start_line: Number(s.startLine ?? 0), + end_line: Number(s.endLine ?? 0), + }; + if (s.container?.name) { + out.container = { + kind: String(s.container.kind), + name: String(s.container.name), + signature: String(s.container.signature ?? 
''), + }; + } + return out; +} + +function riskFromOps(ops: DsrAstOperation[]): DsrRiskLevel { + let max: DsrRiskLevel = 'low'; + for (const op of ops) { + if (op.op === 'delete' || op.op === 'rename') return 'high'; + if (op.op === 'modify') max = 'medium'; + } + return max; +} + +function semanticTypeFromOps(ops: DsrAstOperation[]): DsrSemanticChangeType { + if (ops.length === 0) return 'no-op'; + const kinds = new Set(ops.map((o) => o.op)); + if (kinds.size === 1) { + const only = Array.from(kinds)[0]; + if (only === 'add') return 'additive'; + if (only === 'modify') return 'modification'; + if (only === 'delete') return 'deletion'; + if (only === 'rename') return 'rename'; + } + return 'mixed'; +} + +function stableSortDescriptor(a: DsrSymbolDescriptor, b: DsrSymbolDescriptor): number { + const ak = `${a.file}|${a.kind}|${a.name}|${a.signature}|${a.container?.kind ?? ''}|${a.container?.name ?? ''}`; + const bk = `${b.file}|${b.kind}|${b.name}|${b.signature}|${b.container?.kind ?? ''}|${b.container?.name ?? ''}`; + return ak.localeCompare(bk); +} + +function stableSortOp(a: DsrAstOperation, b: DsrAstOperation): number { + const ak = `${a.op}|${a.symbol.file}|${a.symbol.kind}|${a.symbol.name}|${a.symbol.signature}|${a.previous?.name ?? ''}|${a.previous?.signature ?? ''}|${a.content_hash}`; + const bk = `${b.op}|${b.symbol.file}|${b.symbol.kind}|${b.symbol.name}|${b.symbol.signature}|${b.previous?.name ?? ''}|${b.previous?.signature ?? 
''}|${b.content_hash}`; + return ak.localeCompare(bk); +} + +function canonDsr(dsr: DeterministicSemanticRecord): DeterministicSemanticRecord { + const affected = [...dsr.affected_symbols].sort(stableSortDescriptor); + const ops = [...dsr.ast_operations].sort(stableSortOp); + const out: DeterministicSemanticRecord = { + commit_hash: dsr.commit_hash, + affected_symbols: affected, + ast_operations: ops, + semantic_change_type: dsr.semantic_change_type, + }; + if (dsr.summary) out.summary = dsr.summary; + if (dsr.risk_level) out.risk_level = dsr.risk_level; + return out; +} + +function stringifyDsr(dsr: DeterministicSemanticRecord): string { + return JSON.stringify(canonDsr(dsr), null, 2) + '\n'; +} + +export async function generateDsrForCommit(repoRoot: string, commitHash: string): Promise<GenerateDsrResult> { + const resolvedCommit = await resolveCommitHash(repoRoot, commitHash); + await assertCommitExists(repoRoot, resolvedCommit); + const parents = await getCommitParents(repoRoot, resolvedCommit); + const parent = parents.length > 0 ?
parents[0] : null; + + const changes = await getNameStatusBetween(repoRoot, parent, resolvedCommit); + const parser = new SnapshotCodeParser(); + + const beforeSnaps: SymbolSnap[] = []; + const afterSnaps: SymbolSnap[] = []; + + for (const ch of changes) { + const status = String(ch.status); + const file = normalizeFilePath(String(ch.path)); + const includeBefore = status !== 'A'; + const includeAfter = status !== 'D'; + + if (includeBefore && parent) { + const beforeContent = await gitShowFile(repoRoot, parent, file); + if (beforeContent != null) { + const parsed = parser.parseContent(file, beforeContent); + for (const s of parsed.symbols) { + const desc = toDescriptor(file, s); + const content_hash = computeRangeHash(beforeContent, desc.start_line, desc.end_line); + beforeSnaps.push({ desc, content_hash }); + } + } + } + + if (includeAfter) { + const afterContent = await gitShowFile(repoRoot, resolvedCommit, file); + if (afterContent != null) { + const parsed = parser.parseContent(file, afterContent); + for (const s of parsed.symbols) { + const desc = toDescriptor(file, s); + const content_hash = computeRangeHash(afterContent, desc.start_line, desc.end_line); + afterSnaps.push({ desc, content_hash }); + } + } + } + } + + const beforeByFull = new Map<string, SymbolSnap[]>(); + const afterByFull = new Map<string, SymbolSnap[]>(); + const beforeByNoSig = new Map<string, SymbolSnap[]>(); + const afterByNoSig = new Map<string, SymbolSnap[]>(); + + for (const s of beforeSnaps) { + const file = s.desc.file; + const kFull = symbolKeyFull(file, s.desc); + const kNoSig = symbolKeyNoSig(file, s.desc); + beforeByFull.set(kFull, [...(beforeByFull.get(kFull) ?? []), s]); + beforeByNoSig.set(kNoSig, [...(beforeByNoSig.get(kNoSig) ?? []), s]); + } + + for (const s of afterSnaps) { + const file = s.desc.file; + const kFull = symbolKeyFull(file, s.desc); + const kNoSig = symbolKeyNoSig(file, s.desc); + afterByFull.set(kFull, [...(afterByFull.get(kFull) ?? []), s]); + afterByNoSig.set(kNoSig, [...(afterByNoSig.get(kNoSig) ??
[]), s]); + } + + const usedBefore = new Set<SymbolSnap>(); + const usedAfter = new Set<SymbolSnap>(); + + const ops: DsrAstOperation[] = []; + + for (const [kFull, bList] of beforeByFull.entries()) { + const aList = afterByFull.get(kFull) ?? []; + if (aList.length === 0) continue; + const pairs = Math.min(bList.length, aList.length); + for (let i = 0; i < pairs; i++) { + const b = bList[i]; + const a = aList[i]; + usedBefore.add(b); + usedAfter.add(a); + if (b.content_hash !== a.content_hash) { + ops.push({ + op: 'modify', + symbol: a.desc, + previous: { name: b.desc.name, signature: b.desc.signature }, + content_hash: a.content_hash, + }); + } + } + } + + const remainingBefore = beforeSnaps.filter((s) => !usedBefore.has(s)); + const remainingAfter = afterSnaps.filter((s) => !usedAfter.has(s)); + + const remainingAfterByNoSig = new Map<string, SymbolSnap[]>(); + for (const a of remainingAfter) { + const k = symbolKeyNoSig(a.desc.file, a.desc); + remainingAfterByNoSig.set(k, [...(remainingAfterByNoSig.get(k) ?? []), a]); + } + + for (const b of remainingBefore) { + const k = symbolKeyNoSig(b.desc.file, b.desc); + const candidates = remainingAfterByNoSig.get(k) ?? []; + if (candidates.length !== 1) continue; + const a = candidates[0]; + if (usedAfter.has(a)) continue; + usedBefore.add(b); + usedAfter.add(a); + if (b.content_hash !== a.content_hash || b.desc.signature !== a.desc.signature) { + ops.push({ + op: 'modify', + symbol: a.desc, + previous: { name: b.desc.name, signature: b.desc.signature }, + content_hash: a.content_hash, + }); + } + } + + const remBefore2 = beforeSnaps.filter((s) => !usedBefore.has(s)); + const remAfter2 = afterSnaps.filter((s) => !usedAfter.has(s)); + + const afterByHash = new Map<string, SymbolSnap[]>(); + for (const a of remAfter2) { + const k = `${a.desc.file}|${symbolContainerKey(a.desc)}|${a.desc.kind}|${a.content_hash}`; + afterByHash.set(k, [...(afterByHash.get(k) ??
[]), a]); + } + + for (const b of remBefore2) { + const k = `${b.desc.file}|${symbolContainerKey(b.desc)}|${b.desc.kind}|${b.content_hash}`; + const candidates = afterByHash.get(k) ?? []; + if (candidates.length !== 1) continue; + const a = candidates[0]; + if (usedAfter.has(a)) continue; + usedBefore.add(b); + usedAfter.add(a); + if (b.desc.name !== a.desc.name || b.desc.signature !== a.desc.signature) { + ops.push({ + op: 'rename', + symbol: a.desc, + previous: { name: b.desc.name, signature: b.desc.signature }, + content_hash: a.content_hash, + }); + } else if (b.content_hash !== a.content_hash) { + ops.push({ + op: 'modify', + symbol: a.desc, + previous: { name: b.desc.name, signature: b.desc.signature }, + content_hash: a.content_hash, + }); + } + } + + for (const a of afterSnaps) { + if (usedAfter.has(a)) continue; + ops.push({ + op: 'add', + symbol: a.desc, + content_hash: a.content_hash, + }); + } + + for (const b of beforeSnaps) { + if (usedBefore.has(b)) continue; + ops.push({ + op: 'delete', + symbol: b.desc, + content_hash: b.content_hash, + }); + } + + const affectedMap = new Map<string, DsrSymbolDescriptor>(); + for (const op of ops) { + const s = op.symbol; + const k = symbolKeyFull(s.file, s); + affectedMap.set(k, s); + } + + const subject = await getCommitSubject(repoRoot, resolvedCommit).catch(() => ''); + const risk_level = riskFromOps(ops); + const semantic_change_type = semanticTypeFromOps(ops); + + const dsr: DeterministicSemanticRecord = canonDsr({ + commit_hash: resolvedCommit, + affected_symbols: Array.from(affectedMap.values()), + ast_operations: ops, + semantic_change_type, + summary: subject || undefined, + risk_level, + }); + + const dir = dsrDirectory(repoRoot); + const file_path = dsrFilePath(repoRoot, resolvedCommit); + await fs.ensureDir(dir); + + const rendered = stringifyDsr(dsr); + if (await fs.pathExists(file_path)) { + const existing = await fs.readFile(file_path, 'utf-8').catch(() => ''); + if (existing.trimEnd() !== rendered.trimEnd()) { + throw new
Error(`DSR already exists but differs: ${file_path}`); + } + return { dsr, file_path, existed: true }; + } + + const tmp = path.join(dir, `${resolvedCommit}.json.tmp-${process.pid}-${Date.now()}`); + await fs.writeFile(tmp, rendered, 'utf-8'); + await fs.move(tmp, file_path, { overwrite: false }); + return { dsr, file_path, existed: false }; +} diff --git a/src/core/dsr/gitContext.ts new file mode 100644 index 0000000..2df3858 --- /dev/null +++ b/src/core/dsr/gitContext.ts @@ -0,0 +1,78 @@ +import simpleGit from 'simple-git'; + +export interface RepoGitContext { + repo_root: string; + head_commit: string; + branch: string | null; + detached: boolean; +} + +export async function detectRepoGitContext(startDir: string): Promise<RepoGitContext> { + const git = simpleGit(startDir); + const repo_root = (await git.raw(['rev-parse', '--show-toplevel'])).trim(); + const head_commit = (await simpleGit(repo_root).raw(['rev-parse', 'HEAD'])).trim(); + const branchRaw = (await simpleGit(repo_root).raw(['rev-parse', '--abbrev-ref', 'HEAD'])).trim(); + const detached = branchRaw === 'HEAD'; + return { + repo_root, + head_commit, + branch: detached ?
null : branchRaw, + detached, + }; +} + +export async function getCommitParents(repoRoot: string, commitHash: string): Promise { + const git = simpleGit(repoRoot); + const out = (await git.raw(['show', '-s', '--format=%P', commitHash])).trim(); + if (!out) return []; + return out.split(/\s+/).map((s) => s.trim()).filter(Boolean); +} + +export async function getCommitSubject(repoRoot: string, commitHash: string): Promise { + const git = simpleGit(repoRoot); + return (await git.raw(['show', '-s', '--format=%s', commitHash])).trim(); +} + +export async function assertCommitExists(repoRoot: string, commitHash: string): Promise { + const git = simpleGit(repoRoot); + await git.raw(['cat-file', '-e', `${commitHash}^{commit}`]); +} + +export async function resolveCommitHash(repoRoot: string, rev: string): Promise { + const git = simpleGit(repoRoot); + return (await git.raw(['rev-parse', `${rev}^{commit}`])).trim(); +} + +export interface NameStatusRow { + status: string; + path: string; +} + +export async function getNameStatusBetween(repoRoot: string, parent: string | null, commit: string): Promise { + const git = simpleGit(repoRoot); + const lines = parent + ? (await git.raw(['diff', '--name-status', '--no-renames', parent, commit])).trim().split('\n') + : (await git.raw(['diff-tree', '--root', '--no-commit-id', '--name-status', '-r', commit])).trim().split('\n'); + + const rows: NameStatusRow[] = []; + for (const line of lines) { + const trimmed = line.trim(); + if (!trimmed) continue; + const parts = trimmed.split('\t'); + const status = (parts[0] ?? '').trim(); + const p = (parts[1] ?? 
'').trim(); + if (!status || !p) continue; + rows.push({ status, path: p }); + } + rows.sort((a, b) => (a.status + '\t' + a.path).localeCompare(b.status + '\t' + b.path)); + return rows; +} + +export async function gitShowFile(repoRoot: string, commitHash: string, filePath: string): Promise { + const git = simpleGit(repoRoot); + try { + return await git.raw(['show', `${commitHash}:${filePath}`]); + } catch { + return null; + } +} diff --git a/src/core/dsr/indexMaterialize.ts b/src/core/dsr/indexMaterialize.ts new file mode 100644 index 0000000..977fb90 --- /dev/null +++ b/src/core/dsr/indexMaterialize.ts @@ -0,0 +1,122 @@ +import fs from 'fs-extra'; +import path from 'path'; +import { openCozoDbAtPath } from '../cozo'; +import { dsrDirectory, dsrIndexDbPath, dsrIndexExportPath } from './paths'; +import { DeterministicSemanticRecord } from './types'; + +export interface DsrIndexMaterializeResult { + enabled: boolean; + engine?: 'sqlite' | 'mem'; + dbPath?: string; + exportPath?: string; + counts?: { + commits: number; + affected_symbols: number; + ast_operations: number; + }; + skippedReason?: string; +} + +export async function materializeDsrIndex(repoRoot: string): Promise { + const dsrDir = dsrDirectory(repoRoot); + if (!await fs.pathExists(dsrDir)) { + return { enabled: false, skippedReason: `DSR directory missing: ${dsrDir}` }; + } + + const files = (await fs.readdir(dsrDir).catch(() => [])) + .filter((f) => f.endsWith('.json')) + .filter((f) => !f.endsWith('.export.json')) + .sort((a, b) => a.localeCompare(b)); + + const dsrs: DeterministicSemanticRecord[] = []; + for (const f of files) { + const full = path.join(dsrDir, f); + const data = await fs.readJSON(full).catch(() => null); + if (!data || typeof data !== 'object') continue; + if (typeof (data as any).commit_hash !== 'string') continue; + dsrs.push(data as any); + } + + const dbPath = dsrIndexDbPath(repoRoot); + const exportPath = dsrIndexExportPath(repoRoot); + const db = await openCozoDbAtPath(dbPath, 
exportPath); + if (!db) return { enabled: false, skippedReason: 'Cozo backend not available (see cozo.error.json next to dsr-index.sqlite)' }; + + const commits: Array<[string, string, string, string]> = []; + const affected: Array<[string, string, string, string, string, string, string, string]> = []; + const ops: Array<[string, string, string, string, string, string, string, string, string]> = []; + + for (const r of dsrs) { + const commit = String(r.commit_hash); + commits.push([ + commit, + String(r.semantic_change_type ?? ''), + String(r.risk_level ?? ''), + String(r.summary ?? ''), + ]); + + for (const s of Array.isArray(r.affected_symbols) ? r.affected_symbols : []) { + affected.push([ + commit, + String((s as any).file ?? ''), + String((s as any).kind ?? ''), + String((s as any).name ?? ''), + String((s as any).signature ?? ''), + String((s as any).container?.kind ?? ''), + String((s as any).container?.name ?? ''), + String((s as any).container?.signature ?? ''), + ]); + } + + for (const o of Array.isArray(r.ast_operations) ? r.ast_operations : []) { + const sym = (o as any).symbol ?? {}; + ops.push([ + commit, + String((o as any).op ?? ''), + String(sym.file ?? ''), + String(sym.kind ?? ''), + String(sym.name ?? ''), + String(sym.signature ?? ''), + String((o as any).previous?.name ?? ''), + String((o as any).previous?.signature ?? ''), + String((o as any).content_hash ?? 
''), + ]); + } + } + + const script = ` +{ + ?[commit_hash, semantic_change_type, risk_level, summary] <- $commits + :replace dsr_commit { commit_hash: String => semantic_change_type: String, risk_level: String, summary: String } +} +{ + ?[commit_hash, file, kind, name, signature, container_kind, container_name, container_signature] <- $affected + :replace dsr_affected_symbol { commit_hash: String, file: String, kind: String, name: String, signature: String, container_kind: String, container_name: String, container_signature: String } +} +{ + ?[commit_hash, op, file, kind, name, signature, prev_name, prev_signature, content_hash] <- $ops + :replace dsr_ast_operation { commit_hash: String, op: String, file: String, kind: String, name: String, signature: String, prev_name: String, prev_signature: String, content_hash: String } +} +`; + + await db.run(script, { commits, affected, ops } as any); + + if (db.engine !== 'sqlite' && db.exportRelations) { + const exported = await db.exportRelations(['dsr_commit', 'dsr_affected_symbol', 'dsr_ast_operation']); + await fs.ensureDir(path.dirname(exportPath)); + await fs.writeJSON(exportPath, exported, { spaces: 2 }); + } + + if (db.close) await db.close(); + return { + enabled: true, + engine: db.engine, + dbPath: db.dbPath, + exportPath: db.engine !== 'sqlite' ? 
exportPath : undefined, + counts: { + commits: commits.length, + affected_symbols: affected.length, + ast_operations: ops.length, + }, + }; +} diff --git a/src/core/dsr/paths.ts b/src/core/dsr/paths.ts new file mode 100644 index 0000000..3474c7d --- /dev/null +++ b/src/core/dsr/paths.ts @@ -0,0 +1,21 @@ +import path from 'path'; + +export function dsrCacheRoot(repoRoot: string): string { + return path.join(repoRoot, '.git-ai'); +} + +export function dsrDirectory(repoRoot: string): string { + return path.join(dsrCacheRoot(repoRoot), 'dsr'); +} + +export function dsrFilePath(repoRoot: string, commitHash: string): string { + return path.join(dsrDirectory(repoRoot), `${commitHash}.json`); +} + +export function dsrIndexDbPath(repoRoot: string): string { + return path.join(dsrDirectory(repoRoot), 'dsr-index.sqlite'); +} + +export function dsrIndexExportPath(repoRoot: string): string { + return path.join(dsrDirectory(repoRoot), 'dsr-index.export.json'); +} diff --git a/src/core/dsr/query.ts b/src/core/dsr/query.ts new file mode 100644 index 0000000..9ca6ccf --- /dev/null +++ b/src/core/dsr/query.ts @@ -0,0 +1,92 @@ +import fs from 'fs-extra'; +import simpleGit from 'simple-git'; +import { dsrFilePath } from './paths'; +import { DeterministicSemanticRecord } from './types'; + +export interface SymbolEvolutionOptions { + start?: string; + all?: boolean; + limit?: number; + contains?: boolean; +} + +export interface SymbolEvolutionHit { + commit_hash: string; + semantic_change_type: string; + risk_level?: string; + summary?: string; + operations: Array<{ + op: string; + file: string; + kind: string; + name: string; + signature: string; + previous_name?: string; + previous_signature?: string; + content_hash: string; + }>; +} + +export async function listCommitsTopological(repoRoot: string, opts: SymbolEvolutionOptions): Promise<string[]> { + const git = simpleGit(repoRoot); + const args: string[] = ['rev-list', '--topo-order']; + if (opts.limit && opts.limit > 0) args.push('-n', 
String(opts.limit)); + if (opts.all) args.push('--all'); + else args.push(String(opts.start ?? 'HEAD')); + const out = (await git.raw(args)).trim(); + if (!out) return []; + return out.split('\n').map((l) => l.trim()).filter(Boolean); +} + +export async function symbolEvolution(repoRoot: string, symbol: string, opts: SymbolEvolutionOptions): Promise<{ + ok: boolean; + hits?: SymbolEvolutionHit[]; + missing_dsrs?: string[]; +}> { + const commits = await listCommitsTopological(repoRoot, opts); + const missing_dsrs: string[] = []; + const hits: SymbolEvolutionHit[] = []; + const needle = String(symbol ?? '').trim(); + if (!needle) return { ok: true, hits: [] }; + + const matches = (name: string) => { + if (opts.contains) return name.includes(needle); + return name === needle; + }; + + for (const c of commits) { + const p = dsrFilePath(repoRoot, c); + if (!await fs.pathExists(p)) { + missing_dsrs.push(c); + break; + } + const rec = await fs.readJSON(p).catch(() => null) as DeterministicSemanticRecord | null; + if (!rec) continue; + const ops = Array.isArray(rec.ast_operations) ? rec.ast_operations : []; + const matchedOps = ops + .filter((o: any) => matches(String(o?.symbol?.name ?? '')) || matches(String(o?.previous?.name ?? ''))) + .map((o: any) => ({ + op: String(o?.op ?? ''), + file: String(o?.symbol?.file ?? ''), + kind: String(o?.symbol?.kind ?? ''), + name: String(o?.symbol?.name ?? ''), + signature: String(o?.symbol?.signature ?? ''), + previous_name: o?.previous?.name ? String(o.previous.name) : undefined, + previous_signature: o?.previous?.signature ? String(o.previous.signature) : undefined, + content_hash: String(o?.content_hash ?? ''), + })) + .sort((a, b) => `${a.op}|${a.file}|${a.kind}|${a.name}|${a.signature}|${a.previous_name ?? ''}`.localeCompare(`${b.op}|${b.file}|${b.kind}|${b.name}|${b.signature}|${b.previous_name ?? 
''}`)); + + if (matchedOps.length === 0) continue; + hits.push({ + commit_hash: String(rec.commit_hash), + semantic_change_type: String(rec.semantic_change_type ?? ''), + risk_level: rec.risk_level, + summary: rec.summary, + operations: matchedOps, + }); + } + + if (missing_dsrs.length > 0) return { ok: false, missing_dsrs }; + return { ok: true, hits }; +} diff --git a/src/core/dsr/snapshotParser.ts b/src/core/dsr/snapshotParser.ts new file mode 100644 index 0000000..982c281 --- /dev/null +++ b/src/core/dsr/snapshotParser.ts @@ -0,0 +1,61 @@ +import Parser from 'tree-sitter'; +import { ParseResult, SymbolInfo } from '../types'; +import { LanguageAdapter } from '../parser/adapter'; +import { TypeScriptAdapter } from '../parser/typescript'; +import { JavaAdapter } from '../parser/java'; +import { CAdapter } from '../parser/c'; +import { GoAdapter } from '../parser/go'; +import { PythonAdapter } from '../parser/python'; +import { RustAdapter } from '../parser/rust'; + +export interface ParsedSymbolSnapshot { + symbol: SymbolInfo; + content_hash: string; +} + +export class SnapshotCodeParser { + private parser: Parser; + private adapters: LanguageAdapter[]; + + constructor() { + this.parser = new Parser(); + this.adapters = [ + new TypeScriptAdapter(false), + new TypeScriptAdapter(true), + new JavaAdapter(), + new CAdapter(), + new GoAdapter(), + new PythonAdapter(), + new RustAdapter(), + ]; + } + + parseContent(filePath: string, content: string): ParseResult { + const adapter = this.pickAdapter(filePath); + if (!adapter) return { symbols: [], refs: [] }; + try { + this.parser.setLanguage(adapter.getTreeSitterLanguage()); + const tree = this.parser.parse(content); + return adapter.extractSymbolsAndRefs(tree.rootNode); + } catch (e: any) { + const msg = String(e?.message ?? 
e); + if (msg.includes('Invalid language object')) return { symbols: [], refs: [] }; + if (!msg.includes('Invalid argument')) return { symbols: [], refs: [] }; + try { + const tree = this.parser.parse(content, undefined, { bufferSize: 1024 * 1024 }); + return adapter.extractSymbolsAndRefs(tree.rootNode); + } catch { + return { symbols: [], refs: [] }; + } + } + } + + private pickAdapter(filePath: string): LanguageAdapter | null { + for (const adapter of this.adapters) { + for (const ext of adapter.getSupportedFileExtensions()) { + if (filePath.endsWith(ext)) return adapter; + } + } + return null; + } +} diff --git a/src/core/dsr/state.ts b/src/core/dsr/state.ts new file mode 100644 index 0000000..4869079 --- /dev/null +++ b/src/core/dsr/state.ts @@ -0,0 +1,30 @@ +import fs from 'fs-extra'; +import path from 'path'; +import { dsrCacheRoot, dsrDirectory } from './paths'; + +export interface DsrDirectoryState { + cache_root: string; + cache_root_exists: boolean; + dsr_dir: string; + dsr_dir_exists: boolean; + dsr_file_count: number; +} + +export async function getDsrDirectoryState(repoRoot: string): Promise<DsrDirectoryState> { + const cache_root = dsrCacheRoot(repoRoot); + const dsr_dir = dsrDirectory(repoRoot); + const cache_root_exists = await fs.pathExists(cache_root); + const dsr_dir_exists = await fs.pathExists(dsr_dir); + let dsr_file_count = 0; + if (dsr_dir_exists) { + const entries = await fs.readdir(dsr_dir).catch(() => []); + dsr_file_count = entries.filter((e) => e.endsWith('.json') && !e.endsWith('.export.json')).length; + } + return { + cache_root: path.resolve(cache_root), + cache_root_exists, + dsr_dir: path.resolve(dsr_dir), + dsr_dir_exists, + dsr_file_count, + }; +} diff --git a/src/core/dsr/types.ts b/src/core/dsr/types.ts new file mode 100644 index 0000000..4f9780a --- /dev/null +++ b/src/core/dsr/types.ts @@ -0,0 +1,38 @@ +export type DsrRiskLevel = 'low' | 'medium' | 'high'; + +export type DsrOperationKind = 'add' | 'modify' | 'delete' | 'rename'; + +export 
interface DsrSymbolDescriptor { + file: string; + kind: string; + name: string; + signature: string; + start_line: number; + end_line: number; + container?: { + kind: string; + name: string; + signature: string; + }; +} + +export interface DsrAstOperation { + op: DsrOperationKind; + symbol: DsrSymbolDescriptor; + previous?: { + name: string; + signature: string; + }; + content_hash: string; +} + +export type DsrSemanticChangeType = 'no-op' | 'additive' | 'modification' | 'deletion' | 'rename' | 'mixed'; + +export interface DeterministicSemanticRecord { + commit_hash: string; + affected_symbols: DsrSymbolDescriptor[]; + ast_operations: DsrAstOperation[]; + semantic_change_type: DsrSemanticChangeType; + summary?: string; + risk_level?: DsrRiskLevel; +} diff --git a/src/core/gitDiff.ts b/src/core/gitDiff.ts new file mode 100644 index 0000000..f5aba56 --- /dev/null +++ b/src/core/gitDiff.ts @@ -0,0 +1,81 @@ +import simpleGit from 'simple-git'; + +export type GitNameStatus = 'A' | 'M' | 'D' | 'R'; + +export interface GitDiffPathChange { + status: GitNameStatus; + path: string; + oldPath?: string; +} + +function splitZ(raw: string): string[] { + return raw.split('\0').map((s) => s.trim()).filter(Boolean); +} + +export async function getStagedNameStatus(repoRoot: string): Promise<GitDiffPathChange[]> { + const git = simpleGit(repoRoot); + const raw = await git.raw(['diff', '--cached', '--name-status', '-z', '--find-renames']); + const parts = splitZ(raw); + const out: GitDiffPathChange[] = []; + + for (let i = 0; i < parts.length; i++) { + const entry = parts[i] ?? ''; + if (!entry) continue; + + const tabIdx = entry.indexOf('\t'); + const statusRaw = tabIdx >= 0 ? entry.slice(0, tabIdx) : entry; + const rest = tabIdx >= 0 ? entry.slice(tabIdx + 1) : ''; + const statusLetter = statusRaw[0] ?? ''; + + if (statusLetter === 'R') { + const oldPath = rest; + const newPath = parts[i + 1] ?? 
''; + i += 1; + if (oldPath && newPath) out.push({ status: 'R', oldPath, path: newPath }); + continue; + } + + if (statusLetter === 'A' || statusLetter === 'M' || statusLetter === 'D') { + const p = rest; + if (p) out.push({ status: statusLetter, path: p }); + continue; + } + } + + out.sort((a, b) => `${a.status}\t${a.oldPath ?? ''}\t${a.path}`.localeCompare(`${b.status}\t${b.oldPath ?? ''}\t${b.path}`)); + return out; +} + +export async function getWorktreeNameStatus(repoRoot: string): Promise<GitDiffPathChange[]> { + const git = simpleGit(repoRoot); + const raw = await git.raw(['diff', '--name-status', '-z', '--find-renames']); + const parts = splitZ(raw); + const out: GitDiffPathChange[] = []; + + for (let i = 0; i < parts.length; i++) { + const entry = parts[i] ?? ''; + if (!entry) continue; + + const tabIdx = entry.indexOf('\t'); + const statusRaw = tabIdx >= 0 ? entry.slice(0, tabIdx) : entry; + const rest = tabIdx >= 0 ? entry.slice(tabIdx + 1) : ''; + const statusLetter = statusRaw[0] ?? ''; + + if (statusLetter === 'R') { + const oldPath = rest; + const newPath = parts[i + 1] ?? ''; + i += 1; + if (oldPath && newPath) out.push({ status: 'R', oldPath, path: newPath }); + continue; + } + + if (statusLetter === 'A' || statusLetter === 'M' || statusLetter === 'D') { + const p = rest; + if (p) out.push({ status: statusLetter, path: p }); + continue; + } + } + + out.sort((a, b) => `${a.status}\t${a.oldPath ?? ''}\t${a.path}`.localeCompare(`${b.status}\t${b.oldPath ?? 
''}\t${b.path}`)); + return out; +} diff --git a/src/core/indexerIncremental.ts b/src/core/indexerIncremental.ts new file mode 100644 index 0000000..e1fd29c --- /dev/null +++ b/src/core/indexerIncremental.ts @@ -0,0 +1,313 @@ +import fs from 'fs-extra'; +import path from 'path'; +import simpleGit from 'simple-git'; +import { sha256Hex } from './crypto'; +import { defaultDbDir, IndexLang, openTablesByLang, ALL_INDEX_LANGS } from './lancedb'; +import { hashEmbedding } from './embedding'; +import { quantizeSQ8 } from './sq8'; +import { toPosixPath } from './paths'; +import { removeFileFromAstGraph, writeAstGraphToCozo } from './astGraph'; +import { ChunkRow, RefRow } from './types'; +import { GitDiffPathChange } from './gitDiff'; +import { SnapshotCodeParser } from './dsr/snapshotParser'; + +export interface IncrementalIndexOptions { + repoRoot: string; + scanRoot?: string; + dim: number; + source: 'worktree' | 'staged'; + changes: GitDiffPathChange[]; + onProgress?: (p: { totalFiles: number; processedFiles: number; currentFile?: string }) => void; +} + +function buildChunkText(file: string, symbol: { name: string; kind: string; signature: string }): string { + return `file:${file}\nkind:${symbol.kind}\nname:${symbol.name}\nsignature:${symbol.signature}`; +} + +function inferIndexLang(file: string): IndexLang { + if (file.endsWith('.java')) return 'java'; + if (file.endsWith('.c') || file.endsWith('.h')) return 'c'; + if (file.endsWith('.go')) return 'go'; + if (file.endsWith('.py')) return 'python'; + if (file.endsWith('.rs')) return 'rust'; + return 'ts'; +} + +function isIndexableFile(file: string): boolean { + return /\.(ts|tsx|js|jsx|java|c|h|go|py|rs)$/i.test(file); +} + +function escapeQuotes(s: string): string { + return s.replace(/'/g, "''"); +} + +async function readStagedFile(repoRoot: string, filePosix: string): Promise<string | null> { + const git = simpleGit(repoRoot); + try { + return await git.raw(['show', `:${filePosix}`]); + } catch { + return null; + } +} + +async 
function readWorktreeFile(scanRoot: string, filePosix: string): Promise<string | null> { + const abs = path.join(scanRoot, filePosix); + try { + return await fs.readFile(abs, 'utf-8'); + } catch { + return null; + } +} + +async function deleteRefsForFile(byLang: Partial<Record<IndexLang, any>>, filePosix: string): Promise<void> { + const safe = escapeQuotes(filePosix); + const pred = `file == '${safe}'`; + const langs = Object.keys(byLang) as IndexLang[]; + for (const lang of langs) { + const t = byLang[lang]; + if (!t) continue; + if (typeof t.refs.delete === 'function') { + await t.refs.delete(pred); + } + } +} + +export class IncrementalIndexerV2 { + private repoRoot: string; + private scanRoot: string; + private dim: number; + private source: IncrementalIndexOptions['source']; + private changes: GitDiffPathChange[]; + private onProgress?: IncrementalIndexOptions['onProgress']; + private parser: SnapshotCodeParser; + + constructor(options: IncrementalIndexOptions) { + this.repoRoot = path.resolve(options.repoRoot); + this.scanRoot = path.resolve(options.scanRoot ?? 
options.repoRoot); + this.dim = options.dim; + this.source = options.source; + this.changes = options.changes; + this.onProgress = options.onProgress; + this.parser = new SnapshotCodeParser(); + } + + async run(): Promise<{ processed: number; addedByLang: Record<string, { chunksAdded: number; refsAdded: number }> }> { + const gitAiDir = path.join(this.repoRoot, '.git-ai'); + await fs.ensureDir(gitAiDir); + const dbDir = defaultDbDir(this.repoRoot); + + const { byLang } = await openTablesByLang({ + dbDir, + dim: this.dim, + mode: 'create_if_missing', + languages: ALL_INDEX_LANGS, + }); + + const chunkRowsByLang: Partial<Record<IndexLang, any[]>> = {}; + const refRowsByLang: Partial<Record<IndexLang, any[]>> = {}; + const candidateChunksByLang: Partial<Record<IndexLang, Map<string, string>>> = {}; + const astFiles: Array<[string, string, string]> = []; + const astSymbols: Array<[string, string, string, string, string, string, number, number]> = []; + const astContains: Array<[string, string]> = []; + const astExtendsName: Array<[string, string]> = []; + const astImplementsName: Array<[string, string]> = []; + const astRefsName: Array<[string, string, string, string, string, number, number]> = []; + const astCallsName: Array<[string, string, string, string, number, number]> = []; + + const neededHashByLang: Partial<Record<IndexLang, Set<string>>> = {}; + + const totalFiles = this.changes.length; + this.onProgress?.({ totalFiles, processedFiles: 0 }); + + let processed = 0; + for (const ch of this.changes) { + processed++; + const filePosix = toPosixPath(ch.path); + this.onProgress?.({ totalFiles, processedFiles: processed, currentFile: filePosix }); + + if (ch.status === 'R' && ch.oldPath) { + const oldFile = toPosixPath(ch.oldPath); + await deleteRefsForFile(byLang as any, oldFile); + await removeFileFromAstGraph(this.repoRoot, oldFile); + } + if (ch.status === 'D') { + await deleteRefsForFile(byLang as any, filePosix); + await removeFileFromAstGraph(this.repoRoot, filePosix); + continue; + } + + await deleteRefsForFile(byLang as any, filePosix); + await removeFileFromAstGraph(this.repoRoot, filePosix); + + if 
(!isIndexableFile(filePosix)) continue; + + const lang = inferIndexLang(filePosix); + if (!chunkRowsByLang[lang]) chunkRowsByLang[lang] = []; + if (!refRowsByLang[lang]) refRowsByLang[lang] = []; + if (!candidateChunksByLang[lang]) candidateChunksByLang[lang] = new Map<string, string>(); + if (!neededHashByLang[lang]) neededHashByLang[lang] = new Set<string>(); + + const content = this.source === 'staged' + ? await readStagedFile(this.repoRoot, filePosix) + : await readWorktreeFile(this.scanRoot, filePosix); + if (content == null) continue; + + const parsed = this.parser.parseContent(filePosix, content); + const symbols = parsed.symbols; + const fileRefs = parsed.refs; + const fileId = sha256Hex(`file:${filePosix}`); + astFiles.push([fileId, filePosix, lang]); + + const callableScopes: Array<{ refId: string; startLine: number; endLine: number }> = []; + for (const s of symbols) { + const text = buildChunkText(filePosix, s); + const contentHash = sha256Hex(text); + const refId = sha256Hex(`${filePosix}:${s.name}:${s.kind}:${s.startLine}:${s.endLine}:${contentHash}`); + + astSymbols.push([refId, filePosix, lang, s.name, s.kind, s.signature, s.startLine, s.endLine]); + if (s.kind === 'function' || s.kind === 'method') { + callableScopes.push({ refId, startLine: s.startLine, endLine: s.endLine }); + } + + let parentId = fileId; + if (s.container) { + const cText = buildChunkText(filePosix, s.container); + const cHash = sha256Hex(cText); + parentId = sha256Hex(`${filePosix}:${s.container.name}:${s.container.kind}:${s.container.startLine}:${s.container.endLine}:${cHash}`); + } + astContains.push([parentId, refId]); + + if (s.kind === 'class') { + if (s.extends) for (const superName of s.extends) astExtendsName.push([refId, superName]); + if (s.implements) for (const ifaceName of s.implements) astImplementsName.push([refId, ifaceName]); + } + + neededHashByLang[lang]!.add(contentHash); + candidateChunksByLang[lang]!.set(contentHash, text); + + const refRow: RefRow = { + ref_id: refId, 
content_hash: contentHash, + file: filePosix, + symbol: s.name, + kind: s.kind, + signature: s.signature, + start_line: s.startLine, + end_line: s.endLine, + }; + refRowsByLang[lang]!.push(refRow as any); + } + + const pickScope = (line: number): string => { + let best: { refId: string; span: number } | null = null; + for (const s of callableScopes) { + if (line < s.startLine || line > s.endLine) continue; + const span = s.endLine - s.startLine; + if (!best || span < best.span) best = { refId: s.refId, span }; + } + return best ? best.refId : fileId; + }; + + for (const r of fileRefs) { + const fromId = pickScope(r.line); + astRefsName.push([fromId, lang, r.name, r.refKind, filePosix, r.line, r.column]); + if (r.refKind === 'call' || r.refKind === 'new') { + astCallsName.push([fromId, lang, r.name, filePosix, r.line, r.column]); + } + } + } + + const existingChunkIdsByLang: Partial<Record<IndexLang, Set<string>>> = {}; + for (const lang of Object.keys(neededHashByLang) as IndexLang[]) { + const t = (byLang as any)[lang]; + if (!t) continue; + const needed = Array.from(neededHashByLang[lang] ?? []); + const existing = new Set<string>(); + for (let i = 0; i < needed.length; i += 400) { + const chunk = needed.slice(i, i + 400); + if (chunk.length === 0) continue; + const pred = `content_hash IN (${chunk.map((h) => `'${escapeQuotes(h)}'`).join(',')})`; + const rows = await t.chunks.query().where(pred).select(['content_hash']).limit(chunk.length).toArray(); + for (const row of rows as any[]) { + const id = String(row.content_hash ?? ''); + if (id) existing.add(id); + } + } + existingChunkIdsByLang[lang] = existing; + } + + for (const lang of Object.keys(candidateChunksByLang) as IndexLang[]) { + const t = (byLang as any)[lang]; + if (!t) continue; + const existing = existingChunkIdsByLang[lang] ?? 
new Set<string>(); + const chunkRows: ChunkRow[] = []; + const candidates = candidateChunksByLang[lang]!; + for (const [contentHash, text] of candidates.entries()) { + if (!contentHash || !text) continue; + if (existing.has(contentHash)) continue; + const vec = hashEmbedding(text, { dim: this.dim }); + const q = quantizeSQ8(vec); + chunkRows.push({ + content_hash: contentHash, + text, + dim: q.dim, + scale: q.scale, + qvec_b64: Buffer.from(q.q).toString('base64'), + }); + existing.add(contentHash); + } + chunkRowsByLang[lang] = chunkRows as any[]; + } + + const addedByLang: Record<string, { chunksAdded: number; refsAdded: number }> = {}; + for (const lang of ALL_INDEX_LANGS) { + const t = byLang[lang]; + if (!t) continue; + const chunkRows = chunkRowsByLang[lang] ?? []; + const refRows = refRowsByLang[lang] ?? []; + if (chunkRows.length > 0) await t.chunks.add(chunkRows); + if (refRows.length > 0) await t.refs.add(refRows); + if (chunkRows.length > 0 || refRows.length > 0) { + addedByLang[lang] = { chunksAdded: chunkRows.length, refsAdded: refRows.length }; + } + } + + const astGraph = await writeAstGraphToCozo(this.repoRoot, { + files: astFiles, + symbols: astSymbols, + contains: astContains, + extends_name: astExtendsName, + implements_name: astImplementsName, + refs_name: astRefsName, + calls_name: astCallsName, + }, { mode: 'put' }); + + const metaPath = path.join(gitAiDir, 'meta.json'); + const prev = await fs.readJSON(metaPath).catch(() => null); + const meta = { + ...(prev && typeof prev === 'object' ? prev : {}), + version: '2.1', + index_schema_version: 3, + dim: this.dim, + dbDir: path.relative(this.repoRoot, dbDir), + scanRoot: path.relative(this.repoRoot, this.scanRoot), + languages: ALL_INDEX_LANGS, + byLang: addedByLang, + astGraph: astGraph.enabled + ? { + backend: 'cozo', + engine: astGraph.engine, + dbPath: astGraph.dbPath ? 
path.relative(this.repoRoot, astGraph.dbPath) : undefined, + counts: astGraph.counts, + } + : { + backend: 'cozo', + enabled: false, + skippedReason: astGraph.skippedReason, + }, + }; + await fs.writeJSON(metaPath, meta, { spaces: 2 }); + + return { processed: this.changes.length, addedByLang }; + } +} diff --git a/src/mcp/server.ts b/src/mcp/server.ts index 230ad95..70985d2 100644 --- a/src/mcp/server.ts +++ b/src/mcp/server.ts @@ -10,6 +10,7 @@ import { packLanceDb, unpackLanceDb } from '../core/archive'; import { defaultDbDir, openTablesByLang } from '../core/lancedb'; import { ensureLfsTracking } from '../core/lfs'; import { buildQueryVector, scoreAgainst } from '../core/search'; +import { IndexerV2 } from '../core/indexer'; import { queryManifestWorkspace } from '../core/workspace'; import { buildCallChainDownstreamByNameQuery, buildCallChainUpstreamByNameQuery, buildCalleesByNameQuery, buildCallersByNameQuery, buildChildrenQuery, buildFindReferencesQuery, buildFindSymbolsQuery, runAstGraphQuery } from '../core/astGraphQuery'; import { buildCoarseWhere, filterAndRankSymbolRows, inferSymbolSearchMode, pickCoarseToken } from '../core/symbolSearch'; @@ -86,12 +87,18 @@ export class GitAIV2MCPServer { tools: [ { name: 'get_repo', - description: 'Get current default repository root for this MCP server', - inputSchema: { type: 'object', properties: {} }, + description: 'Resolve repository root and scan root for a given path. Risk: low (read-only).', + inputSchema: { + type: 'object', + properties: { + path: { type: 'string', description: 'Repository root path' }, + }, + required: ['path'], + }, }, { name: 'search_symbols', - description: 'Search symbols and return file locations (substring/prefix/wildcard/regex/fuzzy)', + description: 'Search symbols and return file locations (substring/prefix/wildcard/regex/fuzzy). 
Risk: low (read-only).',
         inputSchema: {
           type: 'object',
           properties: {
@@ -100,24 +107,24 @@ export class GitAIV2MCPServer {
             case_insensitive: { type: 'boolean', default: false },
             max_candidates: { type: 'number', default: 1000 },
             lang: { type: 'string', enum: ['auto', 'all', 'java', 'ts'], default: 'auto' },
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
             limit: { type: 'number', default: 50 },
             with_repo_map: { type: 'boolean', default: false },
             repo_map_max_files: { type: 'number', default: 20 },
             repo_map_max_symbols: { type: 'number', default: 5 },
             wiki_dir: { type: 'string', description: 'Wiki dir relative to repo root (optional)' },
           },
-          required: ['query'],
+          required: ['path', 'query'],
         },
       },
       {
         name: 'semantic_search',
-        description: 'Semantic search using SQ8 vectors stored in LanceDB (brute-force)',
+        description: 'Semantic search using SQ8 vectors stored in LanceDB (brute-force). Risk: low (read-only).',
         inputSchema: {
           type: 'object',
           properties: {
             query: { type: 'string' },
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
             topk: { type: 'number', default: 10 },
             lang: { type: 'string', enum: ['auto', 'all', 'java', 'ts'], default: 'auto' },
             with_repo_map: { type: 'boolean', default: false },
@@ -125,175 +132,182 @@ export class GitAIV2MCPServer {
             repo_map_max_symbols: { type: 'number', default: 5 },
             wiki_dir: { type: 'string', description: 'Wiki dir relative to repo root (optional)' },
           },
-          required: ['query'],
+          required: ['path', 'query'],
         },
       },
       {
         name: 'repo_map',
-        description: 'Generate a lightweight repository map (ranked files + top symbols + wiki links)',
+        description: 'Generate a lightweight repository map (ranked files + top symbols + wiki links). Risk: low (read-only).',
         inputSchema: {
           type: 'object',
           properties: {
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
             max_files: { type: 'number', default: 20 },
             max_symbols: { type: 'number', default: 5 },
             wiki_dir: { type: 'string', description: 'Wiki dir relative to repo root (optional)' },
           },
+          required: ['path'],
         },
       },
       {
         name: 'check_index',
-        description: 'Check whether the repository index structure matches current expected schema',
+        description: 'Check whether the repository index structure matches current expected schema. Risk: low (read-only).',
+        inputSchema: {
+          type: 'object',
+          properties: {
+            path: { type: 'string', description: 'Repository root path' },
+          },
+          required: ['path'],
+        },
+      },
+      {
+        name: 'rebuild_index',
+        description: 'Rebuild full repository index under .git-ai (LanceDB + AST graph). Risk: high (writes .git-ai; can be slow).',
         inputSchema: {
           type: 'object',
           properties: {
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
+            dim: { type: 'number', default: 256 },
+            overwrite: { type: 'boolean', default: true },
           },
+          required: ['path'],
         },
       },
       {
         name: 'pack_index',
-        description: 'Pack .git-ai/lancedb into .git-ai/lancedb.tar.gz',
+        description: 'Pack .git-ai/lancedb into .git-ai/lancedb.tar.gz. Risk: medium (writes archive; may touch git-lfs config).',
         inputSchema: {
           type: 'object',
           properties: {
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
             lfs: { type: 'boolean', default: false, description: 'Run git lfs track for .git-ai/lancedb.tar.gz' },
           },
+          required: ['path'],
         },
       },
       {
         name: 'unpack_index',
-        description: 'Unpack .git-ai/lancedb.tar.gz into .git-ai/lancedb',
+        description: 'Unpack .git-ai/lancedb.tar.gz into .git-ai/lancedb. Risk: medium (writes .git-ai/lancedb).',
         inputSchema: {
           type: 'object',
           properties: {
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
           },
+          required: ['path'],
         },
       },
       {
         name: 'list_files',
-        description: 'List repository files by glob pattern',
+        description: 'List repository files by glob pattern. Risk: low (read-only).',
         inputSchema: {
           type: 'object',
           properties: {
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
             pattern: { type: 'string', default: '**/*' },
             limit: { type: 'number', default: 500 },
           },
+          required: ['path'],
         },
       },
       {
         name: 'read_file',
-        description: 'Read a repository file with optional line range',
+        description: 'Read a repository file with optional line range. Risk: low (read-only).',
         inputSchema: {
           type: 'object',
           properties: {
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
             file: { type: 'string', description: 'File path relative to repo root' },
             start_line: { type: 'number', default: 1 },
             end_line: { type: 'number', default: 200 },
           },
-          required: ['file'],
-        },
-      },
-      {
-        name: 'set_repo',
-        description: 'Set default repository path for subsequent tool calls',
-        inputSchema: {
-          type: 'object',
-          properties: {
-            path: { type: 'string' },
-          },
-          required: ['path'],
+          required: ['path', 'file'],
         },
       },
       {
         name: 'ast_graph_query',
-        description: 'Run a CozoScript query against the AST graph database (advanced)',
+        description: 'Run a CozoScript query against the AST graph database (advanced). Risk: low (read-only).',
         inputSchema: {
           type: 'object',
           properties: {
             query: { type: 'string' },
             params: { type: 'object', default: {} },
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
           },
-          required: ['query'],
+          required: ['path', 'query'],
         },
       },
       {
         name: 'ast_graph_find',
-        description: 'Find symbols by name prefix (case-insensitive) using the AST graph',
+        description: 'Find symbols by name prefix (case-insensitive) using the AST graph. Risk: low (read-only).',
         inputSchema: {
           type: 'object',
           properties: {
             prefix: { type: 'string' },
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
             limit: { type: 'number', default: 50 },
             lang: { type: 'string', enum: ['auto', 'all', 'java', 'ts'], default: 'auto' },
           },
-          required: ['prefix'],
+          required: ['path', 'prefix'],
         },
       },
       {
         name: 'ast_graph_children',
-        description: 'List direct children in the AST containment graph (file -> top-level symbols, class -> methods)',
+        description: 'List direct children in the AST containment graph (file -> top-level symbols, class -> methods). Risk: low (read-only).',
         inputSchema: {
           type: 'object',
           properties: {
             id: { type: 'string', description: 'Parent id (ref_id or file_id; or file path when as_file=true)' },
             as_file: { type: 'boolean', default: false },
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
           },
-          required: ['id'],
+          required: ['path', 'id'],
         },
       },
       {
         name: 'ast_graph_refs',
-        description: 'Find reference locations by name (calls/new/type)',
+        description: 'Find reference locations by name (calls/new/type). Risk: low (read-only).',
         inputSchema: {
           type: 'object',
           properties: {
             name: { type: 'string' },
             limit: { type: 'number', default: 200 },
             lang: { type: 'string', enum: ['auto', 'all', 'java', 'ts'], default: 'auto' },
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
           },
-          required: ['name'],
+          required: ['path', 'name'],
         },
       },
       {
         name: 'ast_graph_callers',
-        description: 'Find callers by callee name',
+        description: 'Find callers by callee name. Risk: low (read-only).',
         inputSchema: {
           type: 'object',
           properties: {
             name: { type: 'string' },
             limit: { type: 'number', default: 200 },
             lang: { type: 'string', enum: ['auto', 'all', 'java', 'ts'], default: 'auto' },
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
           },
-          required: ['name'],
+          required: ['path', 'name'],
         },
       },
       {
         name: 'ast_graph_callees',
-        description: 'Find callees by caller name',
+        description: 'Find callees by caller name. Risk: low (read-only).',
         inputSchema: {
           type: 'object',
           properties: {
             name: { type: 'string' },
             limit: { type: 'number', default: 200 },
             lang: { type: 'string', enum: ['auto', 'all', 'java', 'ts'], default: 'auto' },
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
           },
-          required: ['name'],
+          required: ['path', 'name'],
         },
       },
       {
         name: 'ast_graph_chain',
-        description: 'Compute call chain by symbol name (heuristic, name-based)',
+        description: 'Compute call chain by symbol name (heuristic, name-based). Risk: low (read-only).',
         inputSchema: {
           type: 'object',
           properties: {
@@ -303,9 +317,9 @@ export class GitAIV2MCPServer {
             limit: { type: 'number', default: 500 },
             min_name_len: { type: 'number', default: 1 },
             lang: { type: 'string', enum: ['auto', 'all', 'java', 'ts'], default: 'auto' },
-            path: { type: 'string', description: 'Repository path (optional)' },
+            path: { type: 'string', description: 'Repository root path' },
           },
-          required: ['name'],
+          required: ['path', 'name'],
         },
       },
     ],
@@ -320,6 +334,9 @@ export class GitAIV2MCPServer {
     const startedAt = Date.now();
     const response = await (async () => {
+      if (typeof callPath !== 'string' || callPath.trim() === '') {
+        throw new Error('Missing required argument: path');
+      }
       if (name === 'get_repo') {
         const ctx = await this.openRepoContext(callPath);
@@ -330,17 +347,6 @@ export class GitAIV2MCPServer {
         };
       }
-      if (name === 'set_repo') {
-        const p = String((args as any).path ?? '');
-        this.startDir = path.resolve(p);
-        const ctx = await this.openRepoContext(this.startDir);
-        const repoRoot = ctx.repoRoot;
-        const scanRoot = ctx.scanRoot;
-        return {
-          content: [{ type: 'text', text: JSON.stringify({ ok: true, startDir: this.startDir, repoRoot, scanRoot }, null, 2) }],
-        };
-      }
-
       if (name === 'check_index') {
         const repoRoot = await this.resolveRepoRoot(callPath);
         const res = await checkIndex(repoRoot);
@@ -350,6 +356,18 @@ export class GitAIV2MCPServer {
         };
       }
+      if (name === 'rebuild_index') {
+        const { repoRoot, scanRoot, meta } = await this.openRepoContext(callPath);
+        const overwrite = Boolean((args as any).overwrite ?? true);
+        const dimOpt = Number((args as any).dim ?? 256);
+        const dim = typeof meta?.dim === 'number' ? meta.dim : dimOpt;
+        const indexer = new IndexerV2({ repoRoot, scanRoot, dim, overwrite });
+        await indexer.run();
+        return {
+          content: [{ type: 'text', text: JSON.stringify({ ok: true, repoRoot, scanRoot, dim, overwrite }, null, 2) }],
+        };
+      }
+
       if (name === 'pack_index') {
         const repoRoot = await this.resolveRepoRoot(callPath);
         const packed = await packLanceDb(repoRoot);
diff --git a/.trae/documents/Fix EISDIR error and enable multi-language indexing.md b/templates/agents/common/documents/Fix EISDIR error and enable multi-language indexing.md
similarity index 100%
rename from .trae/documents/Fix EISDIR error and enable multi-language indexing.md
rename to templates/agents/common/documents/Fix EISDIR error and enable multi-language indexing.md
diff --git a/.trae/documents/Fix git-ai index error in CodaGraph directory.md b/templates/agents/common/documents/Fix git-ai index error in CodaGraph directory.md
similarity index 100%
rename from .trae/documents/Fix git-ai index error in CodaGraph directory.md
rename to templates/agents/common/documents/Fix git-ai index error in CodaGraph directory.md
diff --git a/templates/agents/common/rules/git-ai-mcp/RULE.md b/templates/agents/common/rules/git-ai-mcp/RULE.md
new file mode 100644
index 0000000..c8b3ef3
--- /dev/null
+++ b/templates/agents/common/rules/git-ai-mcp/RULE.md
@@ -0,0 +1,15 @@
+# git-ai-mcp
+
+Use `git-ai` as the single entry point for indexing and semantic queries.
+
+## Hard rules
+
+- Determine history and branches only from Git (never from semantic artifacts)
+- Treat `.git-ai/dsr/.json` as immutable canonical artifacts
+- Treat databases as rebuildable caches (derivable from DSR + Git)
+- If DSR is missing for a commit needed by a query, report and stop (do not infer)
+
+## Practical defaults
+
+- Prefer read-only operations unless explicitly asked to modify the repository
+- For repository understanding, use MCP search tools first, then read code with line ranges
diff --git a/templates/agents/common/rules/git-ai.yaml b/templates/agents/common/rules/git-ai.yaml
new file mode 100644
index 0000000..9593199
--- /dev/null
+++ b/templates/agents/common/rules/git-ai.yaml
@@ -0,0 +1,165 @@
+name: trae-dsr
+version: "1.1.2"
+description: |
+  Non-negotiable constraints for a Git-native, decentralized semantic indexing system.
+  Enforces DSR determinism/immutability, canonical-vs-cache boundaries, and Git DAG authority.
+
+global_invariants:
+  - "Git commit DAG is authoritative for history and branches."
+  - "DSR is per-commit, immutable, deterministic."
+  - "DSR files are canonical artifacts; databases are rebuildable caches."
+  - "Never infer Git topology from semantic data."
+
+precedence:
+  conflict_resolution_order:
+    - git
+    - dsr
+    - database
+    - heuristics
+
+canonical_artifacts:
+  dsr:
+    directory: ".git-ai/dsr"
+    file_pattern: ".json"
+    rules:
+      - "One commit → one DSR file."
+      - "DSR file must be immutable once generated."
+
+cli_entrypoints:
+  dsr:
+    - "git-ai ai dsr context --json"
+    - "git-ai ai dsr generate "
+    - "git-ai ai dsr rebuild-index"
+    - "git-ai ai dsr query symbol-evolution --json"
+
+dsr_schema:
+  required_fields:
+    - commit_hash
+    - affected_symbols
+    - ast_operations
+    - semantic_change_type
+  optional_fields:
+    - summary
+    - risk_level
+  forbidden_fields:
+    - parent_commits
+    - branch_names
+    - timestamps_beyond_commit_time
+    - merge_topology
+
+must_follow:
+  - id: git_is_authority
+    severity: error
+    rule: |
+      Use Git commands to determine repo root, HEAD, branch, and history.
+      Never infer branches/parents/topology from semantic artifacts.
+
+  - id: do_not_parse_git_internals
+    severity: error
+    rule: |
+      Do not parse .git internals manually.
+      Use git rev-parse / git show / git log / git diff.
+
+  - id: per_commit_scope_only
+    severity: error
+    rule: |
+      For a given commit, analyze only the snapshot defined by that commit.
+      Do not inspect future commits.
+
+  - id: dsr_determinism
+    severity: error
+    rule: |
+      DSR generation must be deterministic: same inputs produce identical DSR output.
+      No randomness, no time-based or environment-based variance.
+
+  - id: dsr_immutability
+    severity: error
+    rule: |
+      Once written, a DSR file must not be modified.
+      If regeneration is necessary, treat it as a system error and stop.
+
+  - id: cache_is_derivable
+    severity: error
+    rule: |
+      Databases and indexes are caches only.
+      Every semantic fact stored in any database must be derivable from DSR + Git.
+
+  - id: read_only_query
+    severity: error
+    rule: |
+      Query execution is read-only with respect to DSR.
+      DSR enriches commit nodes; it never defines DAG edges.
+
+if_data_missing:
+  - "Rebuild caches from DSR."
+  - "If DSR missing, report and stop (do NOT infer)."
+
+prohibited:
+  - action: "Inferring Git parents/branches/topology from semantic data"
+    reason: "Git DAG is authoritative; semantics may only enrich nodes"
+
+  - action: "Writing semantic facts only into a database"
+    reason: "Databases must be derivable caches; DSR is canonical"
+
+  - action: "Including parent commits or branch names in DSR"
+    reason: "DSR must not encode merge topology or branch structure"
+
+  - action: "Mutating an existing DSR file"
+    reason: "DSR must be immutable once generated"
+
+phases:
+  phase_0:
+    name: "Context Discovery"
+    must_produce:
+      - commit_hash
+      - repo_root
+      - dsr_directory_state
+    tools:
+      - "git rev-parse"
+      - "git show"
+      - "git log"
+      - "filesystem inspection"
+
+  phase_1:
+    name: "Commit Semantic Extraction"
+    tools:
+      - "git diff "
+      - "language parser → AST"
+      - "AST diff engine"
+      - "semantic rules engine"
+    must_produce:
+      - ast_diff
+      - symbol_level_operations
+
+  phase_2:
+    name: "DSR Generation"
+    output_path: ".git-ai/dsr/.json"
+
+  phase_3:
+    name: "Cache / Index Materialization"
+    tools:
+      - "SQLite / CozoDB"
+      - "LanceDB"
+
+  phase_4:
+    name: "Versioning & Distribution"
+    tools:
+      - "Git LFS"
+      - "pre-commit / pre-push hooks"
+    must_hold:
+      - "DSR files are Git-visible."
+      - "No central service required."
+
+  phase_5:
+    name: "Clone / Pull / Checkout Handling"
+    rules:
+      - "Never persist checkout-specific state into DSR."
+      - "Git DAG traversal always comes first."
+
+  phase_6:
+    name: "Query Execution"
+    supported_queries:
+      - semantic_blame
+      - semantic_diff
+      - symbol_evolution
+      - semantic_search
diff --git a/templates/agents/common/skills/git-ai-mcp/SKILL.md b/templates/agents/common/skills/git-ai-mcp/SKILL.md
new file mode 100644
index 0000000..22b98aa
--- /dev/null
+++ b/templates/agents/common/skills/git-ai-mcp/SKILL.md
@@ -0,0 +1,25 @@
+# git-ai-mcp
+
+This skill helps an agent use `git-ai` safely for:
+
+- Repository indexing/search via MCP (`git-ai ai serve`)
+- Per-commit semantics via DSR (Deterministic Semantic Record)
+
+## Key invariants
+
+- Git commit DAG is authoritative for history/branches
+- DSR is per-commit, immutable, deterministic
+- DSR files are canonical artifacts; databases are rebuildable caches
+- Never infer Git topology from semantic data
+
+## Recommended workflow
+
+1. (Optional) Build repository index (checkout-local cache):
+   - `git-ai ai index --overwrite`
+2. Start MCP server for symbol/semantic/graph tools:
+   - `git-ai ai serve`
+3. For history/evolution questions, rely on DSR (commit-addressable artifacts):
+   - Inspect: `git-ai ai dsr context --json`
+   - Generate for one commit: `git-ai ai dsr generate `
+   - Rebuild cache index: `git-ai ai dsr rebuild-index`
+   - Query (read-only, Git DAG first): `git-ai ai dsr query symbol-evolution --json`
diff --git a/templates/agents/common/skills/git-ai/skill.yaml b/templates/agents/common/skills/git-ai/skill.yaml
new file mode 100644
index 0000000..ae8a564
--- /dev/null
+++ b/templates/agents/common/skills/git-ai/skill.yaml
@@ -0,0 +1,168 @@
+name: trae-dsr
+version: "1.1.2"
+role: |
+  You are Trae, an autonomous code agent implementing a Git-native, decentralized semantic indexing system.
+
+global_invariants:
+  - "Git commit DAG is authoritative for history and branches."
+  - "DSR is per-commit, immutable, deterministic."
+  - "DSR files are canonical artifacts; databases are rebuildable caches."
+  - "Never infer Git topology from semantic data."
+
+fail_safe_rules:
+  conflict_precedence:
+    - git
+    - dsr
+    - database
+    - heuristics
+  missing_data_behavior:
+    - "Rebuild caches from DSR."
+    - "If DSR missing, report and stop (do NOT infer)."
+
+design_philosophy: |
+  Semantic understanding is a first-class, commit-addressable artifact,
+  but Git remains the sole authority over history.
+
+cli_entrypoints:
+  dsr:
+    - "git-ai ai dsr context --json"
+    - "git-ai ai dsr generate "
+    - "git-ai ai dsr rebuild-index"
+    - "git-ai ai dsr query symbol-evolution --json"
+
+triggers:
+  - pattern: "semantic blame|semantic diff|symbol evolution|semantic search"
+  - pattern: "generate dsr|dsr generation|deterministic semantic record"
+  - pattern: "index materialization|rebuild cache|materialize index"
+  - pattern: "checkout handling|clone handling|pull handling"
+
+supported_queries:
+  - semantic_blame
+  - semantic_diff
+  - symbol_evolution
+  - semantic_search
+
+phases:
+  - phase: 0
+    name: "Context Discovery"
+    task:
+      - "Identify repository root."
+      - "Detect current HEAD commit and branch via Git (not via files)."
+      - "Detect presence of .git-ai and existing DSR files."
+    tools:
+      - "git rev-parse"
+      - "git show"
+      - "git log"
+      - "filesystem inspection"
+    artifacts:
+      - commit_hash
+      - repo_root
+      - dsr_directory_state
+    rules:
+      - "Do NOT assume linear history."
+      - "Do NOT parse Git internals manually."
+      - "Never infer Git topology from semantic data."
+
+  - phase: 1
+    name: "Commit Semantic Extraction"
+    task:
+      - "Given a specific commit, derive its semantic meaning."
+    tools:
+      - "git diff "
+      - "language parser → AST"
+      - "AST diff engine"
+      - "semantic rules engine"
+    artifacts:
+      - ast_diff
+      - symbol_level_operations
+    rules:
+      - "Only analyze the snapshot defined by the commit."
+      - "Do NOT inspect future commits."
+      - "Determinism required: same input → same output."
+      - "If the commit has multiple parents, do not infer merge topology from semantics."
+
+  - phase: 2
+    name: "DSR Generation (Canonical)"
+    task:
+      - "Generate a Deterministic Semantic Record for exactly one commit."
+    tools:
+      - "JSON / MsgPack serializer"
+      - "content hashing (optional)"
+    artifact_path: ".git-ai/dsr/.json"
+    required_fields:
+      - commit_hash
+      - affected_symbols
+      - ast_operations
+      - semantic_change_type
+      - summary
+      - risk_level
+    forbidden_fields:
+      - parent_commits
+      - branch_names
+      - timestamps_beyond_commit_time
+      - merge_topology
+    rules:
+      - "One commit → one DSR file."
+      - "File must be immutable once generated."
+
+  - phase: 3
+    name: "Cache / Index Materialization"
+    task:
+      - "Update local performance-oriented stores."
+    tools:
+      - "SQLite / CozoDB (AST graph, symbol DAG)"
+      - "LanceDB (embeddings)"
+    artifacts:
+      - ast_db
+      - symbol_graph_db
+      - embedding_index
+    rules:
+      - "Databases must be derivable from DSR + Git."
+      - "No semantic fact may exist only inside a database."
+
+  - phase: 4
+    name: "Versioning & Distribution"
+    task:
+      - "Ensure semantic artifacts are distributed via Git."
+    tools:
+      - "Git LFS"
+      - "pre-commit / pre-push hooks"
+    artifacts:
+      - lfs_tracked_dsr_files
+      - compressed_cache_snapshots_optional
+    rules:
+      - "DSR files must be Git-visible."
+      - "Distribution must not require a central service."
+
+  - phase: 5
+    name: "Clone / Pull / Checkout Handling"
+    task:
+      - "Handle repository state changes safely."
+    tools:
+      - "git clone"
+      - "git pull"
+      - "git checkout"
+      - "filesystem checks"
+    behavior:
+      on_clone:
+        - "Fetch DSR files via Git LFS."
+        - "If caches missing → rebuild from DSR."
+      on_checkout:
+        - "Query Git DAG for history."
+        - "Attach DSR data to commits dynamically."
+    rules:
+      - "Never persist checkout-specific state into DSR."
+      - "Git DAG traversal always comes first."
+
+  - phase: 6
+    name: "Query Execution (Read-Only)"
+    task:
+      - "Answer semantic queries."
+    tools:
+      - "Git DAG traversal"
+      - "DSR filtering"
+      - "AST / symbol DAG aggregation"
+      - "embedding similarity (optional)"
+    rules:
+      - "History traversal must start from Git DAG."
+      - "DSR only enriches nodes; never defines edges."
diff --git a/test/e2e.test.js b/test/e2e.test.js
index e13d9c6..0fbce27 100644
--- a/test/e2e.test.js
+++ b/test/e2e.test.js
@@ -86,8 +86,8 @@ test('git-ai works in Spring Boot and Vue repos', async () => {
   runOk('node', [CLI, 'ai', 'agent', 'install'], repo);
   assert.ok(runOk('node', [CLI, 'ai', 'agent', 'install', '--overwrite'], repo).status === 0);
   {
-    const skill = await fs.readFile(path.join(repo, '.trae', 'skills', 'git-ai-mcp', 'SKILL.md'), 'utf-8');
-    const rule = await fs.readFile(path.join(repo, '.trae', 'rules', 'git-ai-mcp', 'RULE.md'), 'utf-8');
+    const skill = await fs.readFile(path.join(repo, '.agents', 'skills', 'git-ai-mcp', 'SKILL.md'), 'utf-8');
+    const rule = await fs.readFile(path.join(repo, '.agents', 'rules', 'git-ai-mcp', 'RULE.md'), 'utf-8');
     assert.ok(skill.includes('git-ai-mcp'));
     assert.ok(rule.includes('git-ai-mcp'));
   }
diff --git a/test/mcp.smoke.test.js b/test/mcp.smoke.test.js
index 8372528..351cf75 100644
--- a/test/mcp.smoke.test.js
+++ b/test/mcp.smoke.test.js
@@ -37,7 +37,7 @@ async function createRepo(baseDir, name, files) {
   return repoDir;
 }
 
-test('mcp server exposes set_repo and supports path arg', async () => {
+test('mcp server supports atomic tool calls via path arg', async () => {
   const { Client } = await import('@modelcontextprotocol/sdk/client/index.js');
   const { StdioClientTransport } = await import('@modelcontextprotocol/sdk/client/stdio.js');
@@ -82,7 +81,6 @@ test('mcp server exposes set_repo and supports path arg', async () => {
   assert.ok(toolNames.has('search_symbols'));
   assert.ok(toolNames.has('semantic_search'));
   assert.ok(toolNames.has('repo_map'));
-  assert.ok(toolNames.has('set_repo'));
   assert.ok(toolNames.has('get_repo'));
   assert.ok(toolNames.has('check_index'));
   assert.ok(toolNames.has('pack_index'));
@@ -98,7 +97,7 @@ test('mcp server exposes set_repo and supports path arg', async () => {
   assert.ok(toolNames.has('ast_graph_chain'));
 
   {
-    const call = await client.callTool({ name: 'set_repo', arguments: { path: repoDir } });
+    const call = await client.callTool({ name: 'get_repo', arguments: { path: repoDir } });
     const text = String(call?.content?.[0]?.text ?? '');
     const parsed = text ? JSON.parse(text) : null;
     assert.equal(parsed.ok, true);
@@ -106,15 +105,7 @@ test('mcp server exposes set_repo and supports path arg', async () => {
   }
 
   {
-    const call = await client.callTool({ name: 'get_repo', arguments: {} });
-    const text = String(call?.content?.[0]?.text ?? '');
-    const parsed = text ? JSON.parse(text) : null;
-    assert.equal(parsed.ok, true);
-    assert.equal(await fs.realpath(parsed.repoRoot), repoRootReal);
-  }
-
-  {
-    const call = await client.callTool({ name: 'check_index', arguments: {} });
+    const call = await client.callTool({ name: 'check_index', arguments: { path: repoDir } });
     const text = String(call?.content?.[0]?.text ?? '');
     const parsed = text ? JSON.parse(text) : null;
     assert.equal(parsed.ok, true);
@@ -125,6 +116,7 @@ test('mcp server exposes set_repo and supports path arg', async () => {
     const call = await client.callTool({
       name: 'search_symbols',
       arguments: {
+        path: repoDir,
         query: 'hello',
         mode: 'substring',
         case_insensitive: true,
@@ -141,6 +133,7 @@ test('mcp server exposes set_repo and supports path arg', async () => {
     const call = await client.callTool({
       name: 'search_symbols',
      arguments: {
+        path: repoDir,
        query: 'hello',
        mode: 'substring',
        case_insensitive: true,
@@ -158,7 +151,7 @@ test('mcp server exposes set_repo and supports path arg', async () => {
   }
 
   {
-    const call = await client.callTool({ name: 'semantic_search', arguments: { query: 'hello world', topk: 3 } });
+    const call = await client.callTool({ name: 'semantic_search', arguments: { path: repoDir, query: 'hello world', topk: 3 } });
     const text = String(call?.content?.[0]?.text ?? '');
     const parsed = text ? JSON.parse(text) : null;
     assert.ok(parsed && Array.isArray(parsed.rows));
@@ -166,7 +159,7 @@ test('mcp server exposes set_repo and supports path arg', async () => {
   }
 
   {
-    const call = await client.callTool({ name: 'repo_map', arguments: { max_files: 5, max_symbols: 2 } });
+    const call = await client.callTool({ name: 'repo_map', arguments: { path: repoDir, max_files: 5, max_symbols: 2 } });
     const text = String(call?.content?.[0]?.text ?? '');
     const parsed = text ? JSON.parse(text) : null;
     assert.ok(parsed && parsed.repo_map && parsed.repo_map.enabled === true);
@@ -175,7 +168,7 @@
   }
 
   {
-    const call = await client.callTool({ name: 'list_files', arguments: { pattern: 'src/**/*', limit: 50 } });
+    const call = await client.callTool({ name: 'list_files', arguments: { path: repoDir, pattern: 'src/**/*', limit: 50 } });
     const text = String(call?.content?.[0]?.text ?? '');
     const parsed = text ? JSON.parse(text) : null;
     assert.ok(parsed && Array.isArray(parsed.files));
@@ -183,7 +176,7 @@
   }
 
   {
-    const call = await client.callTool({ name: 'read_file', arguments: { file: 'src/foo.ts', start_line: 1, end_line: 20 } });
+    const call = await client.callTool({ name: 'read_file', arguments: { path: repoDir, file: 'src/foo.ts', start_line: 1, end_line: 20 } });
     const text = String(call?.content?.[0]?.text ?? '');
     const parsed = text ? JSON.parse(text) : null;
     assert.ok(parsed && typeof parsed.text === 'string');
@@ -191,7 +184,7 @@
   }
 
   {
-    const call = await client.callTool({ name: 'ast_graph_query', arguments: { query: "?[file] := *ast_symbol{ref_id, file, lang, name: 'Foo', kind, signature, start_line, end_line}" } });
+    const call = await client.callTool({ name: 'ast_graph_query', arguments: { path: repoDir, query: "?[file] := *ast_symbol{ref_id, file, lang, name: 'Foo', kind, signature, start_line, end_line}" } });
     const text = String(call?.content?.[0]?.text ?? '');
     const parsed = text ? JSON.parse(text) : null;
     assert.ok(parsed && parsed.result && Array.isArray(parsed.result.rows));
@@ -199,7 +192,7 @@
   }
 
   {
-    const call = await client.callTool({ name: 'ast_graph_find', arguments: { prefix: 'Fo', limit: 10 } });
+    const call = await client.callTool({ name: 'ast_graph_find', arguments: { path: repoDir, prefix: 'Fo', limit: 10 } });
     const text = String(call?.content?.[0]?.text ?? '');
     const parsed = text ? JSON.parse(text) : null;
     assert.ok(parsed && parsed.result && Array.isArray(parsed.result.rows));
@@ -207,7 +200,7 @@
   }
 
   {
-    const call = await client.callTool({ name: 'ast_graph_children', arguments: { id: 'src/foo.ts', as_file: true } });
+    const call = await client.callTool({ name: 'ast_graph_children', arguments: { path: repoDir, id: 'src/foo.ts', as_file: true } });
     const text = String(call?.content?.[0]?.text ?? '');
     const parsed = text ? JSON.parse(text) : null;
     assert.ok(parsed && parsed.result && Array.isArray(parsed.result.rows));
@@ -215,7 +208,7 @@
   }
 
   {
-    const call = await client.callTool({ name: 'ast_graph_refs', arguments: { name: 'helloWorld', limit: 50 } });
+    const call = await client.callTool({ name: 'ast_graph_refs', arguments: { path: repoDir, name: 'helloWorld', limit: 50 } });
     const text = String(call?.content?.[0]?.text ?? '');
     const parsed = text ? JSON.parse(text) : null;
     assert.ok(parsed && parsed.result && Array.isArray(parsed.result.rows));
@@ -223,7 +216,7 @@
   }
 
   {
-    const call = await client.callTool({ name: 'ast_graph_callers', arguments: { name: 'helloWorld', limit: 50 } });
+    const call = await client.callTool({ name: 'ast_graph_callers', arguments: { path: repoDir, name: 'helloWorld', limit: 50 } });
     const text = String(call?.content?.[0]?.text ?? '');
     const parsed = text ? JSON.parse(text) : null;
     assert.ok(parsed && parsed.result && Array.isArray(parsed.result.rows));
@@ -231,7 +224,7 @@
   }
 
   {
-    const call = await client.callTool({ name: 'ast_graph_chain', arguments: { name: 'run', direction: 'downstream', max_depth: 2, limit: 200 } });
+    const call = await client.callTool({ name: 'ast_graph_chain', arguments: { path: repoDir, name: 'run', direction: 'downstream', max_depth: 2, limit: 200 } });
     const text = String(call?.content?.[0]?.text ?? '');
     const parsed = text ? JSON.parse(text) : null;
     assert.ok(parsed && parsed.result && Array.isArray(parsed.result.rows));
@@ -239,7 +232,7 @@
   }
 
   {
-    const call = await client.callTool({ name: 'pack_index', arguments: { lfs: true } });
+    const call = await client.callTool({ name: 'pack_index', arguments: { path: repoDir, lfs: true } });
     const text = String(call?.content?.[0]?.text ?? '');
     const parsed = text ? JSON.parse(text) : null;
     assert.equal(parsed.ok, true);
@@ -249,7 +242,7 @@
   {
     await fs.rm(path.join(repoDir, '.git-ai', 'lancedb'), { recursive: true, force: true });
-    const call = await client.callTool({ name: 'unpack_index', arguments: {} });
+    const call = await client.callTool({ name: 'unpack_index', arguments: { path: repoDir } });
     const text = String(call?.content?.[0]?.text ?? '');
     const parsed = text ? JSON.parse(text) : null;
     assert.equal(parsed.ok, true);
@@ -260,33 +253,3 @@ test('mcp server exposes set_repo and supports path arg', async () => {
     await transport.close();
   }
 });
-
-test('mcp server supports --path on serve', async () => {
-  const { Client } = await import('@modelcontextprotocol/sdk/client/index.js');
-  const { StdioClientTransport } = await import('@modelcontextprotocol/sdk/client/stdio.js');
-
-  const tmp = await fs.mkdtemp(path.join(os.tmpdir(), 'git-ai-mcp-'));
-  const repoDir = await createRepo(tmp, 'repo', {
-    'src/foo.ts': 'export function foo() { return 1; }\n',
-  });
-  const repoRootReal = await fs.realpath(repoDir);
-
-  const transport = new StdioClientTransport({
-    command: 'node',
-    args: [CLI, 'ai', 'serve', '--path', repoDir],
-    stderr: 'ignore',
-  });
-
-  const client = new Client({ name: 'git-ai-test', version: '0.0.0' }, { capabilities: {} });
-
-  try {
-    await client.connect(transport);
-    const call = await client.callTool({ name: 'get_repo', arguments: {} });
-    const text = String(call?.content?.[0]?.text ?? '');
-    const parsed = text ? JSON.parse(text) : null;
-    assert.equal(parsed.ok, true);
-    assert.equal(await fs.realpath(parsed.repoRoot), repoRootReal);
-  } finally {
-    await transport.close();
-  }
-});
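The smoke tests above now pass an explicit `path` on every tool call because the server rejects any call without one. A minimal sketch of that per-call guard as a standalone helper — the name `requirePath` is hypothetical; the patched `server.ts` inlines the equivalent check on `callPath`:

```javascript
// Sketch of the argument guard the server applies before dispatching any tool:
// every tool call must carry a non-empty string `path` (the repository root).
function requirePath(args) {
  const p = args && args.path;
  if (typeof p !== 'string' || p.trim() === '') {
    // Same error text as the check added in server.ts.
    throw new Error('Missing required argument: path');
  }
  return p;
}

// Callers always name the repository explicitly, e.g.:
// requirePath({ path: '/abs/repo', query: 'hello' }) → '/abs/repo'
```

Centralizing this check (instead of relying on `set_repo` session state) is what makes each tool call atomic: two clients can query different repositories through the same server process without racing on a shared default path.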