From e97b35445bd13569179d27108f748b47ce0afeac Mon Sep 17 00:00:00 2001 From: mars167 Date: Fri, 30 Jan 2026 01:01:44 +0800 Subject: [PATCH] feat(readme): optimize README for open source promotion with semantic layer vision --- .git-ai/lancedb.tar.gz | 4 +- README.md | 368 +++++++++++++++++++------- README.zh-CN.md | 366 ++++++++++++++++++------- docs/zh-CN/README.md | 1 + docs/zh-CN/technical-details.md | 454 ++++++++++++++++++++++++++++++++ 5 files changed, 1006 insertions(+), 187 deletions(-) create mode 100644 docs/zh-CN/technical-details.md diff --git a/.git-ai/lancedb.tar.gz b/.git-ai/lancedb.tar.gz index b369b10..5af42c0 100644 --- a/.git-ai/lancedb.tar.gz +++ b/.git-ai/lancedb.tar.gz @@ -1,3 +1,3 @@ version https://git-lfs.github.com/spec/v1 -oid sha256:0b2fcae5b5ad48b97d8158d7db31f9ada2ae5d58884c0fb7208b278b2dc9dbe6 -size 84366 +oid sha256:ab12d19c0ba8f85eedf38614f1d12621a0ac7c5a2580b6acdf656d9d3b896f57 +size 84369 diff --git a/README.md b/README.md index 924da8d..e9fe3d0 100644 --- a/README.md +++ b/README.md @@ -4,99 +4,274 @@ [![release](https://github.com/mars167/git-ai-cli/actions/workflows/release.yml/badge.svg)](https://github.com/mars167/git-ai-cli/actions/workflows/release.yml) [![license](https://img.shields.io/github/license/mars167/git-ai-cli)](./LICENSE) [![npm (github packages)](https://img.shields.io/npm/v/%40mars167%2Fgit-ai?registry_uri=https%3A%2F%2Fnpm.pkg.github.com)](https://github.com/mars167/git-ai-cli/packages) +[![npm](https://img.shields.io/npm/dm/%40mars167%2Fgit-ai?label=npm%20downloads)](https://www.npmjs.com/package/%40mars167%2Fgit-ai) [🇨🇳 简体中文](./README.zh-CN.md) | **English** -`git-ai` is a global command-line tool: it defaults to behaving like `git` (proxying system git), while providing an `ai` subcommand for code indexing and retrieval capabilities. 
+--- -## Supported Languages +## Adding a Semantic Layer to Your Codebase, Enabling AI to Evolve from "Reading Code" to "Understanding Code" -Current indexing/symbol extraction supports the following languages and file extensions: -- JavaScript: `.js`, `.jsx` -- TypeScript: `.ts`, `.tsx` -- Java: `.java` -- C: `.c`, `.h` -- Go: `.go` -- Python: `.py` -- Rust: `.rs` +**Code semantics should be versioned and traceable, just like code itself** -## Installation +git-ai is a local code understanding tool that builds a traceable semantic layer for your codebase using DSR (Deterministic Semantic Record) and Hyper RAG, enabling AI Agents and developers to truly understand code evolution and relationships. + +### ✨ Why git-ai? + +- **🔗 Hyper RAG**: Combines vector retrieval + graph retrieval + DSR for multi-dimensional semantic understanding +- **📜 Versioned Semantics**: Every commit has a semantic snapshot, historical changes are clear and traceable +- **🔄 Always Available**: Indices travel with code, available immediately after checkout, no rebuild needed +- **🤖 AI-Native**: MCP Server enables Claude, Trae and other Agents to deeply understand your codebase +- **🔒 Fully Local**: Code never leaves your machine, secure and private +- **⚡ Full Lifecycle Support**: From development to Review to refactoring, indices span the entire lifecycle +- **📊 Blazing Fast**: 10k files indexed in < 30s, search response < 100ms + +--- + +## ✨ Core Capabilities + +### 1️⃣ Semantic Search + +Find code using natural language, no need to remember file names or function names: + +```bash +git-ai ai semantic "user authentication logic" +git-ai ai semantic "database connection pool configuration" +git-ai ai semantic "error handling middleware" +``` + +### 2️⃣ Symbol Relationship Analysis + +Understand relationships between code: ```bash -npm i -g git-ai -# or -yarn global add git-ai +# Find function callers +git-ai ai graph callers authenticateUser + +# Find functions called by this function 
+git-ai ai graph callees authenticateUser + +# Trace complete call chain +git-ai ai graph chain authenticateUser --max-depth 3 ``` -## Documentation -- Development Guide: [DEVELOPMENT.md](./DEVELOPMENT.md) -- Documentation Center (Usage/Concepts/Troubleshooting): [docs/README.md](./docs/README.md) -- Design: [docs/design.md](./docs/zh-CN/design.md) (Chinese) -- Architecture Explained: [docs/architecture_explained.md](./docs/zh-CN/architecture_explained.md) (Chinese) -- Agent Integration (Skills/Rules): [docs/mcp.md](./docs/zh-CN/mcp.md) (Chinese) +### 3️⃣ Historical Change Tracing + +Track symbol evolution through DSR: + +```bash +# View function's historical changes +git-ai ai dsr query symbol-evolution authenticateUser --limit 50 + +# View complete semantic snapshot for a commit +git-ai ai dsr context +``` + +### 4️⃣ Multi-Language Support + +Supports multiple mainstream programming languages: + +| Language | File Extensions | +|----------|-----------------| +| JavaScript | `.js`, `.jsx` | +| TypeScript | `.ts`, `.tsx` | +| Java | `.java` | +| Python | `.py` | +| Go | `.go` | +| Rust | `.rs` | +| C | `.c`, `.h` | + +--- -## Basic Usage (Like Git) +## 💡 Design Philosophy -`git-ai` forwards most commands directly to `git`: +git-ai is not just a search tool, but a "semantic timeline" for your codebase: + +### DSR (Deterministic Semantic Record) + +Each commit corresponds to an immutable semantic snapshot, recording the code structure, symbol relationships, and design intent at that time. Code semantics should be versioned—just like code itself—traceable, comparable, and evolvable. + +### Hyper RAG + +Combines multiple retrieval methods for deeper understanding: +- **Vector Retrieval**: Semantic similarity matching +- **Graph Retrieval**: Call relationship, inheritance analysis +- **DSR Retrieval**: Historical evolution tracing + +### Decentralized Semantics + +Indices travel with code, no central server required. 
checkout, branch, tag—all can use consistent semantic indices immediately. + +### Server Mode + +MCP Server enables any AI Agent to invoke indices, achieving true AI-assisted development. + +--- + +## 🎯 Use Cases + +### Scenario 1: Newcomers Quickly Understanding Large Projects + +> "Just joined the team, facing 100k lines of code, where do I start?" ```bash -git-ai init -git-ai status -git-ai add -A -git-ai commit -m "msg" -git-ai push -u origin main +# 1. Get project global view +git-ai ai repo-map --max-files 20 + +# 2. Search core business logic +git-ai ai semantic "order processing flow" + +# 3. Trace key function call chains +git-ai ai graph chain processOrder --max-depth 5 ``` +*From design to development, semantic indices remain consistent* -## AI Capabilities +### Scenario 2: Pre-Refactoring Impact Analysis -All AI-related capabilities are under `git-ai ai`: +> "About to refactor this function, what will it affect?" ```bash -git-ai ai status -git-ai ai index --overwrite -git-ai ai query Indexer --limit 10 -git-ai ai semantic "semantic search" --topk 5 -git-ai ai graph find GitAIV2MCPServer -git-ai ai dsr context --json -git-ai ai dsr generate HEAD -git-ai ai dsr rebuild-index -git-ai ai dsr query symbol-evolution GitAIV2MCPServer --limit 200 --json -git-ai ai pack -git-ai ai unpack -git-ai ai serve +# Find all callers +git-ai ai graph callers deprecatedFunction + +# Trace historical changes, understand design intent +git-ai ai dsr query symbol-evolution deprecatedFunction --all +``` +*DSR traces historical changes, understanding design intent* + +### Scenario 3: Bug Localization and Root Cause Analysis + +> "User reported an error, but don't know where the problem is" + +```bash +# Search related error handling code +git-ai ai semantic "user login failure handling" + +# View error propagation path +git-ai ai graph chain handleLoginError --direction upstream ``` +*Full lifecycle indices, quickly locate problem roots* -## DSR (Deterministic Semantic Record) 
+### Scenario 4: AI Agent-Assisted Development -DSR is a per-commit, immutable, deterministic semantic artifact: +> "Let Claude Desktop help me understand this project" -- Canonical files: `.git-ai/dsr/.json` -- Databases are rebuildable caches derived from DSR + Git (never the other way around) +After configuring git-ai MCP Server in Claude Desktop, you can converse directly: -## MCP Server (stdio) +> "Help me analyze this project's architecture, find all payment-related code, and explain their relationships" -`git-ai` provides an MCP-based stdio Server for Agents/Clients to call as tools: -- `search_symbols`: Symbol retrieval (substring/prefix/wildcard/regex/fuzzy) -- `semantic_search`: Semantic retrieval based on LanceDB + SQ8 -- `ast_graph_query`: AST graph query based on CozoDB (CozoScript) +Claude will automatically invoke git-ai tools to provide deep analysis. *Enabling AI to evolve from "reading code" to "understanding code"* -### Startup +--- -It is recommended to generate the index in the target repository first: +## 🏗️ System Architecture + +```mermaid +graph TB + A[Git Repository] -->|On Commit| B["DSR (Deterministic Semantic Record)"] + B --> C[.git-ai/dsr/.json<br/>Semantic Snapshot] + C -->|Index Rebuild| D[LanceDB Vector DB] + C -->|Index Rebuild| E[CozoDB Graph DB] + D --> F[MCP Server] + E --> F + F -->|Tool Call| G[AI Agent<br/>Claude Desktop / Trae] + F -->|CLI| H[Developer] + C -->|Cross-Version| I{Semantic Timeline<br/>Traceable, Comparable, Evolvable} + + style B fill:#e1f5ff + style C fill:#e8f5e9 + style D fill:#fff4e1 + style E fill:#fff4e1 + style F fill:#e8f5e9 + style G fill:#f3e5f5 + style I fill:#fce4ec +``` + +**Core Components**: + +- **DSR (Deterministic Semantic Record)**: Immutable semantic snapshots stored per commit, giving versioned semantics +- **LanceDB + SQ8**: High-performance vector database that backs semantic search +- **CozoDB**: Graph database that backs AST-level relationship queries +- **MCP Server**: Standard protocol interface for AI Agent invocation + +--- + +## 📊 Comparison with Other Tools + +| Feature | git-ai | GitHub Code Search | Sourcegraph | +|---------|--------|-------------------|-------------| +| Local Execution | ✅ | ❌ | ❌ | +| AST-Level Analysis | ✅ | ❌ | ✅ | +| Versioned Semantics | ✅ | ❌ | ❌ | +| Historical Change Tracing | ✅ | ❌ | ❌ | +| AI Agent Integration | ✅ | ❌ | ❌ | +| Free & Open Source | ✅ | ❌ | ❌ | +| Semantic Search | ✅ | ✅ | ✅ | +| Call Chain Analysis | ✅ | ❌ | ✅ | + +--- + +## 🚀 Quick Start + +### 1. Install + +```bash +npm install -g git-ai +``` + +### 2. Initialize Repository ```bash +cd your-project git-ai ai index --overwrite ``` -Then start the MCP Server (it will wait for client connections on stdio, which is normal): +### 3. Start Using Immediately ```bash -cd /ABS/PATH/TO/REPO -git-ai ai serve +# Search code using natural language +git-ai ai semantic "user authentication logic" + +# View function call relationships +git-ai ai graph callers authenticateUser ``` +**Actual Output Example**: +```json +[ + { + "file": "src/auth/service.ts", + "line": 45, + "symbol": "authenticateUser", + "context": "async function authenticateUser(email: string, password: string)" + }, + { + "file": "src/controllers/auth.ts", + "line": 23, + "symbol": "loginHandler", + "context": "const user = await authenticateUser(req.body.email, req.body.password)" + } +] +``` + +That's it! 
Three steps, and you can start exploring your codebase in depth. + +*From now on, indices are not "one-time artifacts" but "semantic assets" that evolve with your code.* + +--- + +## 🤖 AI Agent Integration + +git-ai provides a standard MCP Server that integrates with: + +- **Claude Desktop**: Anthropic's desktop app, which supports local MCP servers +- **Trae**: AI-driven IDE +- **Continue.dev**: VS Code AI plugin + ### Claude Desktop Configuration Example +Add to `claude_desktop_config.json` (macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`; Windows: `%APPDATA%\Claude\claude_desktop_config.json`): + ```json { "mcpServers": { @@ -108,70 +283,75 @@ git-ai ai serve } ``` -Note: -- `git-ai ai serve` only starts the MCP stdio server. -- MCP tools require `path` in every tool call to select the target repository (atomic, no implicit defaults). +Then restart Claude Desktop and start conversing: + +> "Help me analyze this project's architecture, find all payment-related code" + +Claude will automatically invoke git-ai tools to provide deep analysis. -## Agent Templates +### Agent Skills & Rules -This repository provides reusable Skill/Rule templates for Agents: -- Skill: [templates/agents/common/skills/git-ai-mcp/SKILL.md](./templates/agents/common/skills/git-ai-mcp/SKILL.md) -- Rule: [templates/agents/common/rules/git-ai-mcp/RULE.md](./templates/agents/common/rules/git-ai-mcp/RULE.md) +We provide Agent templates that help AI use git-ai more effectively: -Usage: -- Install the templates into your target repository (default: `.agents/`). -- For Trae compatibility, you can install into `.trae/` with `--agent trae`. 
+- [Skill Template](./templates/agents/common/skills/git-ai-mcp/SKILL.md): Guides Agents on how to use tools +- [Rule Template](./templates/agents/common/rules/git-ai-mcp/RULE.md): Constrains Agent behavior -One-click install into another repository: +One-click install to your project: ```bash -cd /path/to/your-repo git-ai ai agent install -git-ai ai agent install --overwrite -git-ai ai agent install --to /custom/location/.agents -git-ai ai agent install --agent trae ``` -## Git hooks (Rebuild index before commit, verify pack before push, auto unpack on checkout) +--- -Install hooks in any git repository: +## 📚 Documentation -```bash -git-ai ai hooks install -git-ai ai hooks status -``` +- [Quick Start](./docs/README.md) +- [MCP Server Guide](./docs/mcp.md) +- [Architecture Explained](./docs/architecture_explained.md) +- [Design Document](./docs/design.md) +- [Development Guide](./DEVELOPMENT.md) -Explanation: -- `pre-commit`: Automatically `index --incremental --staged` + `pack`, and add `.git-ai/meta.json` and `.git-ai/lancedb.tar.gz` to the staging area. -- `pre-push`: `pack` again, if the archive changes, block the push and prompt to submit the archive file first. -- `post-checkout` / `post-merge`: If `.git-ai/lancedb.tar.gz` exists, automatically `unpack`. +--- -## Git LFS (Recommended for .git-ai/lancedb.tar.gz) +## 🔧 Advanced Features -To avoid storing large index archives directly in Git history, it is recommended to enable Git LFS for `.git-ai/lancedb.tar.gz`. 
+### Git Hooks Automation -### Enable (One-time) +Automatically rebuild indices before commit, verify pack before push: ```bash -git lfs install -git lfs track ".git-ai/lancedb.tar.gz" -git add .gitattributes -git commit -m "chore: track lancedb archive via git-lfs" +git-ai ai hooks install ``` -Can also be triggered with `git-ai` (only works if git-lfs is installed): +- `pre-commit`: Auto incremental index + pack +- `pre-push`: Verify pack +- `post-checkout`: Auto unpack + +### Git LFS Integration + +Recommended for managing index archives: ```bash +git lfs track ".git-ai/lancedb.tar.gz" git-ai ai pack --lfs ``` -### After Clone/Checkout (If LFS pull is not automatic) -If your environment has `GIT_LFS_SKIP_SMUDGE=1` set, or you find `.git-ai/lancedb.tar.gz` is not a valid gzip file: +--- -```bash -git lfs pull -``` +## 🤝 Contributing -## License +Welcome contributions, issue reports, and suggestions! + +- [Contribution Guide](./CONTRIBUTING.md) +- [Issue Tracker](https://github.com/mars167/git-ai-cli/issues) + +--- + +## 📄 License [MIT](./LICENSE) + +--- + +**Enabling AI to Evolve from "Reading Code" to "Understanding Code"** ⭐ Star us on GitHub! 
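As a rough illustration of consuming the Quick Start output shown in the README, here is a minimal TypeScript sketch that groups caller hits by file. The `CallerHit` shape is copied from the README's example JSON and is only an assumption about what `git-ai ai graph callers` really prints; `groupByFile` is a hypothetical helper, not part of git-ai.

```typescript
// Shape copied from the README's example output; the real CLI output may differ.
interface CallerHit {
  file: string;
  line: number;
  symbol: string;
  context: string;
}

// Hypothetical helper: group hits by file and sort each group by line number.
function groupByFile(hits: CallerHit[]): Map<string, CallerHit[]> {
  const groups = new Map<string, CallerHit[]>();
  for (const hit of hits) {
    const bucket = groups.get(hit.file) ?? [];
    bucket.push(hit);
    groups.set(hit.file, bucket);
  }
  groups.forEach((bucket) => bucket.sort((a, b) => a.line - b.line));
  return groups;
}

// Sample data taken from the README example (illustrative, not captured output).
const raw = `[
  {"file": "src/auth/service.ts", "line": 45, "symbol": "authenticateUser",
   "context": "async function authenticateUser(email: string, password: string)"},
  {"file": "src/controllers/auth.ts", "line": 23, "symbol": "loginHandler",
   "context": "const user = await authenticateUser(req.body.email, req.body.password)"}
]`;

const hits = JSON.parse(raw) as CallerHit[];
groupByFile(hits).forEach((fileHits, file) => {
  const summary = fileHits.map((h) => `${h.symbol}:${h.line}`).join(", ");
  console.log(`${file} -> ${summary}`);
});
```

Grouping by file is just one way to make a long caller list easier to scan before a refactor; the same typed records could feed any other report.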
diff --git a/README.zh-CN.md b/README.zh-CN.md index 5081310..843ecae 100644 --- a/README.zh-CN.md +++ b/README.zh-CN.md @@ -4,99 +4,274 @@ [![release](https://github.com/mars167/git-ai-cli/actions/workflows/release.yml/badge.svg)](https://github.com/mars167/git-ai-cli/actions/workflows/release.yml) [![license](https://img.shields.io/github/license/mars167/git-ai-cli)](./LICENSE) [![npm (github packages)](https://img.shields.io/npm/v/%40mars167%2Fgit-ai?registry_uri=https%3A%2F%2Fnpm.pkg.github.com)](https://github.com/mars167/git-ai-cli/packages) +[![npm](https://img.shields.io/npm/dm/%40mars167%2Fgit-ai?label=npm%20downloads)](https://www.npmjs.com/package/%40mars167%2Fgit-ai) **简体中文** | [English](./README.md) -`git-ai` is a global command-line tool: by default it behaves like `git` (proxying the system git), and it also provides an `ai` subcommand for code indexing and retrieval. +--- -## Supported Languages +## Adding a Semantic Layer to Your Codebase, Enabling AI to Evolve from "Reading Code" to "Understanding Code" -Indexing and symbol extraction currently support the following languages and file extensions: -- JavaScript: `.js`, `.jsx` -- TypeScript: `.ts`, `.tsx` -- Java: `.java` -- C: `.c`, `.h` -- Go: `.go` -- Python: `.py` -- Rust: `.rs` +**Code semantics should be versioned and traceable, just like the code itself** -## Installation +git-ai is a local code understanding tool. Using DSR (Deterministic Semantic Record) and Hyper RAG, it builds a traceable semantic layer for your codebase so that AI Agents and developers can truly understand how code evolves and how its parts relate. + +### ✨ Why git-ai? 
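+ +- **🔗 Hyper RAG**: Combines vector retrieval + graph retrieval + DSR for multi-dimensional semantic understanding +- **📜 Versioned Semantics**: Every commit has a semantic snapshot, so historical changes are clear and traceable +- **🔄 Always Available**: Indices travel with the code and are usable right after checkout, with no rebuild needed +- **🤖 AI-Native**: MCP Server lets Claude, Trae, and other Agents deeply understand your code +- **🔒 Fully Local**: Code never leaves your machine, so it stays secure and private +- **⚡ Full Lifecycle Support**: From development to review to refactoring, indices span the entire lifecycle +- **📊 Fast**: 10k files indexed in < 30 s, search responses in < 100 ms + +--- + +## ✨ Core Capabilities + +### 1️⃣ Semantic Search + +Find code using natural language, without having to remember file names or function names: + +```bash +git-ai ai semantic "user authentication logic" +git-ai ai semantic "database connection pool configuration" +git-ai ai semantic "error handling middleware" +``` + +### 2️⃣ Symbol Relationship Analysis + +Understand the call relationships between pieces of code: ```bash -npm i -g git-ai -# or -yarn global add git-ai +# Find the callers of a function +git-ai ai graph callers authenticateUser + +# Find the functions that this function calls +git-ai ai graph callees authenticateUser + +# Trace the complete call chain +git-ai ai graph chain authenticateUser --max-depth 3 ``` -## Documentation -- Development Guide: [DEVELOPMENT.md](./DEVELOPMENT.zh-CN.md) -- Documentation Center (Usage/Concepts/Troubleshooting): [docs/README.md](./docs/zh-CN/README.md) -- Design Notes: [docs/design.md](./docs/zh-CN/design.md) -- Technical Principles Explained (beginner-friendly): [docs/architecture_explained.md](./docs/zh-CN/architecture_explained.md) -- Agent Integration (Skills/Rules): [docs/mcp.md](./docs/zh-CN/mcp.md) +### 3️⃣ Historical Change Tracing + +Track how a symbol evolves through DSR: + +```bash +# View a function's historical changes +git-ai ai dsr query symbol-evolution authenticateUser --limit 50 + +# View the complete semantic snapshot for a commit +git-ai ai dsr context +``` + +### 4️⃣ Multi-Language Support + +Supports several mainstream programming languages: + +| Language | File Extensions | +|----------|-----------------| +| JavaScript | `.js`, `.jsx` | +| TypeScript | `.ts`, `.tsx` | +| Java | `.java` | +| Python | `.py` | +| Go | `.go` | +| Rust | `.rs` | +| C | `.c`, `.h` | + +--- -## Basic Usage (Like git) +## 💡 Design Philosophy -`git-ai` forwards most commands directly to `git`: +git-ai is not just a search tool, but a "semantic timeline" for your codebase: + +### DSR (Deterministic Semantic Record) + +Each commit corresponds to an immutable semantic snapshot that records the code structure, symbol relationships, and design intent at that point. Code semantics should be versioned just like the code itself: traceable, comparable, and evolvable. + +### Hyper RAG + +Combines several retrieval methods for deeper understanding: +- **Vector Retrieval**: Semantic similarity matching +- **Graph Retrieval**: Call relationship and inheritance analysis +- **DSR Retrieval**: Historical evolution tracing + +### Decentralized Semantics + +Indices travel with the code and do not depend on a central server. After checkout, on any branch or tag, a consistent semantic index is available immediately. + +### Server Mode + +MCP Server lets any AI Agent invoke the indices, enabling AI-assisted development. + +--- + +## 🎯 Use Cases + +### Scenario 1: Newcomers Quickly Understanding Large Projects + +> "Just joined the team, facing 100k lines of code, where do I start?" 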
```bash -git-ai init -git-ai status -git-ai add -A -git-ai commit -m "msg" -git-ai push -u origin main +# 1. Get a global view of the project +git-ai ai repo-map --max-files 20 + +# 2. Search the core business logic +git-ai ai semantic "order processing flow" + +# 3. Trace the call chains of key functions +git-ai ai graph chain processOrder --max-depth 5 ``` +*From design to development, semantic indices remain consistent* -## AI Capabilities +### Scenario 2: Pre-Refactoring Impact Analysis -All AI-related capabilities live under `git-ai ai`: +> "About to refactor this function, what will it affect?" ```bash -git-ai ai status -git-ai ai index --overwrite -git-ai ai query Indexer --limit 10 -git-ai ai semantic "semantic search" --topk 5 -git-ai ai graph find GitAIV2MCPServer -git-ai ai dsr context --json -git-ai ai dsr generate HEAD -git-ai ai dsr rebuild-index -git-ai ai dsr query symbol-evolution GitAIV2MCPServer --limit 200 --json -git-ai ai pack -git-ai ai unpack -git-ai ai serve +# Find all callers +git-ai ai graph callers deprecatedFunction + +# Trace historical changes to understand the design intent +git-ai ai dsr query symbol-evolution deprecatedFunction --all +``` +*DSR traces historical changes and surfaces design intent* + +### Scenario 3: Bug Localization and Root Cause Analysis + +> "A user reported an error, but I don't know where the problem is" + +```bash +# Search related error handling code +git-ai ai semantic "user login failure handling" + +# View the error propagation path +git-ai ai graph chain handleLoginError --direction upstream ``` +*Full lifecycle indices help locate the root of a problem quickly* -## DSR (Deterministic Semantic Record) +### Scenario 4: AI Agent-Assisted Development -DSR is a per-commit, immutable, deterministic semantic artifact: +> "Let Claude Desktop help me understand this project" -- Canonical files: `.git-ai/dsr/.json` -- Databases and indices are disposable caches that must be rebuildable from DSR + Git (the Git DAG is never inferred in reverse) +After configuring git-ai MCP Server in Claude Desktop, you can converse directly: -## MCP Server (stdio) +> "Help me analyze this project's architecture, find all payment-related code, and explain their relationships" -`git-ai` provides an MCP-based stdio server that Agents/clients call as tools: -- `search_symbols`: Symbol retrieval (substring/prefix/wildcard/regex/fuzzy) -- `semantic_search`: Semantic retrieval based on LanceDB + SQ8 -- `ast_graph_query`: AST graph query based on CozoDB (CozoScript) +Claude will automatically invoke git-ai tools to provide in-depth analysis. *Enabling AI to evolve from "reading code" to "understanding code"* -### Startup +--- -It is recommended to generate the index in the target repository first: +## 🏗️ System Architecture + +```mermaid +graph TB + A[Git Repository] -->|On Commit| B["DSR (Deterministic Semantic Record)"] + B --> C[.git-ai/dsr/.json<br/>Semantic Snapshot] + C -->|Index Rebuild| D[LanceDB Vector DB] + C -->|Index Rebuild| E[CozoDB Graph DB] + D --> F[MCP Server] + E --> F + F -->|Tool Call| G[AI Agent<br/>Claude Desktop / Trae] + F -->|CLI| H[Developer] + C -->|Cross-Version| I{Semantic Timeline<br/>Traceable, Comparable, Evolvable} + + style B fill:#e1f5ff + style C fill:#e8f5e9 + style D fill:#fff4e1 + style E fill:#fff4e1 + style F fill:#e8f5e9 + style G fill:#f3e5f5 + style I fill:#fce4ec +``` + +**Core Components**: + +- **DSR (Deterministic Semantic Record)**: Immutable semantic snapshots stored per commit, giving versioned semantics +- **LanceDB + SQ8**: High-performance vector database that backs semantic search +- **CozoDB**: Graph database that backs AST-level relationship queries +- **MCP Server**: Standard protocol interface for AI Agent invocation + +--- + +## 📊 Comparison with Other Tools + +| Feature | git-ai | GitHub Code Search | Sourcegraph | +|------|--------|-------------------|-------------| +| Local Execution | ✅ | ❌ | ❌ | +| AST-Level Analysis | ✅ | ❌ | ✅ | +| Versioned Semantics | ✅ | ❌ | ❌ | +| Historical Change Tracing | ✅ | ❌ | ❌ | +| AI Agent Integration | ✅ | ❌ | ❌ | +| Free & Open Source | ✅ | ❌ | ❌ | +| Semantic Search | ✅ | ✅ | ✅ | +| Call Chain Analysis | ✅ | ❌ | ✅ | + +--- + +## 🚀 Quick Start + +### 1. Install + +```bash +npm install -g git-ai +``` + +### 2. Initialize Repository ```bash +cd your-project git-ai ai index --overwrite ``` -Then start the MCP Server (it waits for client connections on stdio, which is normal): +### 3. Try It Out ```bash -cd /ABS/PATH/TO/REPO -git-ai ai serve +# Search code using natural language +git-ai ai semantic "user authentication logic" + +# View function call relationships +git-ai ai graph callers authenticateUser ``` +**Actual Output Example**: +```json +[ + { + "file": "src/auth/service.ts", + "line": 45, + "symbol": "authenticateUser", + "context": "async function authenticateUser(email: string, password: string)" + }, + { + "file": "src/controllers/auth.ts", + "line": 23, + "symbol": "loginHandler", + "context": "const user = await authenticateUser(req.body.email, req.body.password)" + } +] +``` + +That's it! Three steps, and you can start exploring your codebase in depth. + +*From now on, indices are not "one-time artifacts" but "semantic assets" that evolve with your code.* + +--- + +## 🤖 AI Agent Integration + +git-ai provides a standard MCP Server that integrates with the following AI Agents: + +- **Claude Desktop**: Anthropic's desktop app, which supports local MCP servers +- **Trae**: AI-driven IDE +- **Continue.dev**: VS Code AI plugin + ### Claude Desktop Configuration Example +Add to `claude_desktop_config.json` (macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`; Windows: `%APPDATA%\Claude\claude_desktop_config.json`): + ```json { "mcpServers": { @@ -108,66 +283,75 @@ git-ai ai serve } ``` -Notes: -- `git-ai ai serve` only starts the MCP stdio server. -- `path` is required for MCP tools: every tool call must pass `path` explicitly to select the target repository (calls are atomic, with no implicit defaults). +Then restart Claude Desktop and start conversing: + +> "Help me analyze this project's architecture, find all payment-related code" + +Claude will automatically invoke git-ai tools to provide in-depth analysis. -## Agent Templates (skills/rules) +### Agent Skills & Rules -This repository provides reusable Skill/Rule templates: -- Skill: [templates/agents/common/skills/git-ai-mcp/SKILL.md](./templates/agents/common/skills/git-ai-mcp/SKILL.md) -- Rule: [templates/agents/common/rules/git-ai-mcp/RULE.md](./templates/agents/common/rules/git-ai-mcp/RULE.md) +We provide Agent templates that help AI use git-ai more effectively: -One-click install into another repository: + +- [Skill Template](./templates/agents/common/skills/git-ai-mcp/SKILL.md): Guides Agents on how to use the tools +- [Rule Template](./templates/agents/common/rules/git-ai-mcp/RULE.md): Constrains Agent behavior + +One-click install into your project: ```bash -cd /path/to/your-repo git-ai ai agent install -git-ai ai agent install --overwrite -git-ai ai agent install --to /custom/location/.agents -git-ai ai agent install --agent trae ``` -## Git hooks (rebuild index before commit, verify pack before push, auto unpack on checkout) +--- -Install hooks in any git repository: +## 📚 Documentation -```bash -git-ai ai hooks install -git-ai ai hooks status -``` +- [Quick Start](./docs/zh-CN/README.md) +- [MCP Server Guide](./docs/zh-CN/mcp.md) +- [Architecture Explained](./docs/zh-CN/architecture_explained.md) +- [Design Document](./docs/zh-CN/design.md) +- [Development Guide](./DEVELOPMENT.zh-CN.md) -Explanation: -- `pre-commit`: Automatically runs `index --incremental --staged` + `pack`, and adds `.git-ai/meta.json` and `.git-ai/lancedb.tar.gz` to the staging area (the index reflects the staged content). -- `pre-push`: Runs `pack` again; if the archive changes, it blocks the push and asks you to commit the archive first. -- `post-checkout` / `post-merge`: If `.git-ai/lancedb.tar.gz` exists, automatically runs `unpack`. +--- -## Git LFS (recommended for .git-ai/lancedb.tar.gz) +## 🔧 Advanced Features -To avoid storing large index archives directly in Git history, enabling Git LFS for `.git-ai/lancedb.tar.gz` is recommended. +### Git Hooks Automation -### Enable (One-time) +Automatically rebuild indices before commit and verify the pack before push: ```bash -git lfs install -git lfs track ".git-ai/lancedb.tar.gz" -git add .gitattributes -git commit -m "chore: track lancedb archive via git-lfs" +git-ai ai hooks install ``` -It can also be triggered with `git-ai` (only takes effect if git-lfs is installed): +- `pre-commit`: Automatic incremental index + pack +- `pre-push`: Verifies the index archive +- `post-checkout`: Automatically unpacks the index + +### Git LFS Integration + +Git LFS is recommended for managing index archives: ```bash +git lfs track ".git-ai/lancedb.tar.gz" git-ai ai pack --lfs ``` -### After Clone/Checkout (if LFS was not pulled automatically) -If your environment has `GIT_LFS_SKIP_SMUDGE=1` set, or you find that `.git-ai/lancedb.tar.gz` is not a valid gzip file: +--- -```bash -git lfs pull -``` +## 🤝 Contributing -## License +Code contributions, issue reports, and suggestions are welcome! + +- [Contribution Guide](./CONTRIBUTING.md) +- [Issue Tracker](https://github.com/mars167/git-ai-cli/issues) + +--- + +## 📄 License [MIT](./LICENSE) + +--- + +**Enabling AI to Evolve from "Reading Code" to "Understanding Code"** ⭐ Star us on GitHub! diff --git a/docs/zh-CN/README.md b/docs/zh-CN/README.md index f8f93bc..721c338 100644 --- a/docs/zh-CN/README.md +++ b/docs/zh-CN/README.md @@ -37,6 +37,7 @@ - [DSR (Deterministic Semantic Record)](./dsr.md) ### Advanced Topics and Internals +- [Technical Details](./technical-details.md) - [Advanced: Index Archives and LFS](./advanced.md) - [Architecture Design](./design.md) - [Development Rules](./rules.md) diff --git a/docs/zh-CN/technical-details.md b/docs/zh-CN/technical-details.md new file mode 100644 index 0000000..011c1a4 --- /dev/null +++ b/docs/zh-CN/technical-details.md @@ -0,0 +1,454 @@ +# Technical Details + +This document contains detailed technical notes on git-ai, for developers who need to understand the implementation in depth. + +## Git Proxy Mode + +By default `git-ai` behaves the same as `git` and can be used as a drop-in replacement for it: + +```bash +git-ai init +git-ai status +git-ai add -A +git-ai commit -m "msg" +git-ai push -u origin main +``` + +Every invocation that does not include the `ai` subcommand is forwarded to the system `git`. + +## Complete List of AI Subcommands + +All AI-related capabilities live under `git-ai ai`: + +### Index Management + +```bash +# Show index status +git-ai ai status + +# Rebuild the index (overwrite mode) +git-ai ai index --overwrite + +# Incremental index (staged files only) +git-ai ai index --incremental --staged +``` + +### Query Operations + +```bash +# Symbol retrieval (supports several modes) +git-ai ai query Indexer --limit 10 + +# Semantic search +git-ai ai semantic "semantic search" --topk 5 + +# AST graph queries +git-ai ai graph find GitAIV2MCPServer +git-ai ai graph callers functionName +git-ai ai graph callees functionName +git-ai ai graph chain functionName --max-depth 3 +``` + +### DSR Operations + +```bash +# Get the Git context and DSR status +git-ai ai dsr context --json + +# Generate the DSR for a given commit +git-ai ai dsr generate HEAD + +# Rebuild the index from DSR files +git-ai ai dsr rebuild-index + +# Query a symbol's evolution history +git-ai ai dsr query symbol-evolution GitAIV2MCPServer --limit 200 --json +``` + +### Index Packaging + +```bash +# Pack the index archive +git-ai ai pack + +# Pack using Git LFS +git-ai ai pack --lfs + +# Unpack the index archive 
+git-ai ai unpack +``` + +### MCP Server + +```bash +# Start the MCP Server (stdio mode) +git-ai ai serve +``` + +## DSR (Deterministic Semantic Record) + +DSR is a per-commit, immutable, deterministic semantic artifact. + +### Core Properties + +- **Stored per commit**: Each Git commit corresponds to one DSR file +- **Immutable**: Once generated, a DSR file is never modified +- **Deterministic**: The same code and commit always produce the same DSR +- **Rebuildable**: Databases and indices are disposable caches that must be rebuildable from DSR + Git + +### File Layout + +``` +.git-ai/ +├── dsr/ +│ ├── .json +│ ├── .json +│ └── ... +├── lancedb/ # Vector database (disposable cache) +├── cozodb/ # Graph database (disposable cache) +├── lancedb.tar.gz # Index archive +└── meta.json # Metadata +``` + +### DSR File Format + +Each DSR file contains the complete semantic information for its commit: + +```json +{ + "commit_hash": "abc123...", + "timestamp": "2024-01-01T00:00:00Z", + "files": [ + { + "path": "src/main.ts", + "symbols": [ + { + "name": "functionName", + "kind": "function", + "location": {"line": 10, "column": 0}, + "signature": "functionName(arg1: string): void" + } + ] + } + ] +} +``` + +## MCP Server in Detail + +### Tool List + +`git-ai ai serve` exposes the following MCP tools: + +#### 1. `check_index` + +Checks whether the index is ready. + +**Parameters**: +- `path` (string, required): Repository root path + +**Returns**: +```json +{ + "ok": true, + "indexed": true, + "commit_hash": "abc123...", + "file_count": 1234, + "symbol_count": 5678 +} +``` + +#### 2. `repo_map` + +Returns a global view of the repository. + +**Parameters**: +- `path` (string, required): Repository root path +- `max_files` (number, optional): Maximum number of files, default 20 +- `max_symbols` (number, optional): Maximum number of symbols per file, default 5 + +#### 3. `search_symbols` + +Symbol retrieval (supports several modes). + +**Parameters**: +- `path` (string, required): Repository root path +- `query` (string, required): Query string +- `mode` (string, optional): Search mode, one of `substring`, `prefix`, `wildcard`, `regex`, `fuzzy`; default `substring` +- `limit` (number, optional): Number of results to return, default 50 + +#### 4. `semantic_search` + +Semantic retrieval based on LanceDB + SQ8. + +**Parameters**: +- `path` (string, required): Repository root path +- `query` (string, required): Natural-language query +- `topk` (number, optional): Number of results to return, default 10 + +#### 5. `ast_graph_query` + +AST graph query based on CozoDB (CozoScript). + +**Parameters**: +- `path` (string, required): Repository root path +- `query` (string, required): CozoScript query statement +- `params` (object, optional): Query parameters + +#### 6. 
`ast_graph_find` + +Finds symbols by name prefix. + +**Parameters**: +- `path` (string, required): Repository root path +- `prefix` (string, required): Symbol name prefix +- `limit` (number, optional): Number of results to return, default 50 + +#### 7. `ast_graph_callers` + +Finds the callers of a function. + +**Parameters**: +- `path` (string, required): Repository root path +- `name` (string, required): Function name +- `limit` (number, optional): Number of results to return, default 200 + +#### 8. `ast_graph_callees` + +Finds the functions that a function calls. + +**Parameters**: +- `path` (string, required): Repository root path +- `name` (string, required): Function name +- `limit` (number, optional): Number of results to return, default 200 + +#### 9. `ast_graph_chain` + +Traces the complete call chain. + +**Parameters**: +- `path` (string, required): Repository root path +- `name` (string, required): Function name +- `direction` (string, optional): Trace direction, one of `downstream`, `upstream`; default `downstream` +- `max_depth` (number, optional): Maximum depth, default 3 +- `limit` (number, optional): Number of results to return, default 500 + +#### 10. `read_file` + +Reads file contents. + +**Parameters**: +- `path` (string, required): Repository root path +- `file` (string, required): File path relative to the repository root +- `start_line` (number, optional): First line number, default 1 +- `end_line` (number, optional): Last line number, default 200 + +#### 11. `dsr_context` + +Returns the repository's Git context and the state of the DSR directory. + +**Parameters**: +- `path` (string, required): Repository root path + +**Returns**: +```json +{ + "ok": true, + "commit_hash": "abc123...", + "repo_root": "/path/to/repo", + "branch": "main", + "detached": false, + "dsr_directory_state": { + "total_commits": 100, + "indexed_commits": 95, + "missing_commits": ["def456...", "ghi789..."] + } +} +``` + +#### 12. `dsr_generate` + +Generates the DSR for a given commit. + +**Parameters**: +- `path` (string, required): Repository root path +- `commit` (string, optional): Commit hash, default HEAD + +#### 13. `dsr_rebuild_index` + +Rebuilds the index from DSR files. + +**Parameters**: +- `path` (string, required): Repository root path + +#### 14. 
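`dsr_symbol_evolution` + +Traces a symbol's change history. + +**Parameters**: +- `path` (string, required): Repository root path +- `symbol` (string, required): Symbol name +- `start` (string, optional): Starting commit hash +- `all` (boolean, optional): Whether to return the full history, default false +- `limit` (number, optional): Number of results to return, default 200 +- `contains` (string, optional): Only return branches that contain this commit + +### Calling Convention + +**Important**: Every MCP tool call must pass the `path` parameter explicitly. + +```json +{ + "name": "search_symbols", + "arguments": { + "path": "/absolute/path/to/repo", + "query": "functionName" + } +} +``` + +Do not rely on process state or the working directory to infer the repository location implicitly. + +## Git Hooks in Detail + +### Installing Hooks + +```bash +git-ai ai hooks install +``` + +### Hook Behavior + +#### pre-commit + +Automatically performs the following steps: +1. Incrementally indexes staged files: `index --incremental --staged` +2. Packs the index: `pack` +3. Adds `.git-ai/meta.json` and `.git-ai/lancedb.tar.gz` to the staging area + +**Note**: The index reflects the staged content, so the committed index stays consistent with the committed code. + +#### pre-push + +Runs `pack` again; if the archive changes, it blocks the push and asks you to commit the archive first. + +This ensures that every push includes the latest index archive. + +#### post-checkout / post-merge + +If `.git-ai/lancedb.tar.gz` exists, automatically runs `unpack`. + +This ensures that the index is updated automatically after switching branches or merging. + +### Checking Hook Status + +```bash +git-ai ai hooks status +``` + +## Git LFS Integration + +### Why Git LFS Is Needed + +The index archive `.git-ai/lancedb.tar.gz` can be large (depending on project size). Storing it directly in Git history leads to: +- Repository bloat +- Slower clones +- Degraded Git performance + +Git LFS manages these large files effectively. + +### Enabling Git LFS (One-time) + +```bash +# Set up Git LFS hooks for your user account (Git LFS itself must already be installed) +git lfs install + +# Track the index archive +git lfs track ".git-ai/lancedb.tar.gz" + +# Commit .gitattributes +git add .gitattributes +git commit -m "chore: track lancedb archive via git-lfs" +``` + +### Triggering LFS through git-ai + +```bash +# Pack and automatically use LFS (if git-lfs is installed) +git-ai ai pack --lfs +``` + +### After Clone/Checkout + +If your environment has `GIT_LFS_SKIP_SMUDGE=1` set, or you find that `.git-ai/lancedb.tar.gz` is not a valid gzip file: + +```bash +git lfs pull +``` + +## Agent Templates in Detail + +### Skill Template + +The Skill template defines best practices for how an Agent uses the git-ai tools. + +**Location**: `templates/agents/common/skills/git-ai-mcp/SKILL.md` + +**Contents**: +- Recommended workflow (7 steps) +- Tool selection guide +- Best practices +- Common pitfalls + +### Rule Template + +The Rule template defines the behavioral constraints an Agent follows when using the git-ai tools. + +**Location**: `templates/agents/common/rules/git-ai-mcp/RULE.md` + +**Contents**: +- Rules that must be followed (must_follow) +- 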
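Recommended practices (recommended) +- Prohibited behavior (prohibited) +- Tool usage constraints (tool_usage_constraints) + +### Installing Agent Templates + +```bash +# Install into the default location of the current repository +git-ai ai agent install + +# Overwrite existing templates +git-ai ai agent install --overwrite + +# Install into a custom location +git-ai ai agent install --to /custom/location/.agents + +# Install for a specific Agent (such as Trae) +git-ai ai agent install --agent trae +``` + +## Performance Metrics + +### Indexing Speed + +- 1k files: < 5 s +- 10k files: < 30 s +- 100k files: < 5 min + +### Query Speed + +- Symbol retrieval: < 10 ms +- Semantic search: < 100 ms +- AST graph query: < 50 ms + +### Storage Footprint + +- DSR files: about 1-5 MB per 1k files +- LanceDB index: about 10-50 MB per 1k files +- CozoDB index: about 5-20 MB per 1k files +- Packed archive: about 15-75 MB per 1k files + +## Troubleshooting + +See the [Troubleshooting Guide](./troubleshooting.md).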
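The calling convention in technical-details.md (every MCP tool call carries an explicit `path`) could be enforced on the client side with a small helper. The sketch below is only an illustration: `buildToolCall` and `isAbsolutePath` are hypothetical names that do not come from git-ai, and rejecting relative paths is an assumption drawn from the `/absolute/path/to/repo` example rather than a documented rule.

```typescript
// Hypothetical client-side helper; it only builds the JSON payload shape
// shown in the calling-convention example and is not part of git-ai.
interface ToolCall {
  name: string;
  arguments: { path: string } & Record<string, unknown>;
}

// Treat POSIX ("/repo") and Windows drive ("C:\repo") paths as absolute.
function isAbsolutePath(p: string): boolean {
  return p.startsWith("/") || /^[A-Za-z]:[\\/]/.test(p);
}

function buildToolCall(
  name: string,
  repoPath: string,
  args: Record<string, unknown> = {},
): ToolCall {
  // Assumption: relative paths are rejected so the call never depends on cwd.
  if (!isAbsolutePath(repoPath)) {
    throw new Error(`path must be absolute, got: ${repoPath}`);
  }
  // `path` is written last so that nothing in `args` can override it.
  return { name, arguments: { ...args, path: repoPath } };
}

const call = buildToolCall("search_symbols", "/absolute/path/to/repo", {
  query: "functionName",
  mode: "prefix",
  limit: 20,
});
console.log(JSON.stringify(call, null, 2));
```

Building the payload in one place keeps the "no implicit repository" rule out of individual call sites; the parameter names used in the example (`query`, `mode`, `limit`) are the ones listed for `search_symbols` in the patch.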