73 changes: 73 additions & 0 deletions README.md
@@ -286,6 +286,79 @@ That's it! 3 steps to get started, immediately begin deep understanding of your

---

## ⚙️ Configuration

### File Filtering

git-ai respects your project's ignore files to control which files are indexed:

#### `.gitignore` - Standard Git Ignore

Files matching patterns in `.gitignore` are excluded from indexing by default.

#### `.aiignore` - AI-Specific Exclusions (Highest Priority)

Create a `.aiignore` file in your repository root to exclude files from git-ai's index without changing what Git tracks:

```bash
# Example .aiignore
test-fixtures/**
*.generated.ts
docs/api-reference/**
```

#### `.git-ai/include.txt` - Force Include (Overrides `.gitignore`)

Sometimes you need to index generated code, or other files listed in `.gitignore`, because they matter for code understanding. Create `.git-ai/include.txt` to force-index specific patterns:

```bash
# Example .git-ai/include.txt
# Include generated API clients
generated/api/**

# Include code from specific ignored directories
vendor/important-lib/**
```

**Priority Order (Highest to Lowest):**
1. `.aiignore` - Explicit exclusions always win
2. `.git-ai/include.txt` - Force-include patterns override `.gitignore`
3. `.gitignore` - Standard Git ignore patterns

Built-in exclusions (`node_modules/`, `.git/`, `.git-ai/`, `.repo/`, `dist/`, `target/`, `build/`, `.gradle/`) are applied before all of the above, so `include.txt` cannot force-index files under those directories.

**Supported Pattern Syntax:**
- `**` - Match any number of directories
- `*` - Match any characters within a single path segment (does not cross `/`)
- `directory/` - Match entire directory (automatically converted to `directory/**`)
- `file.ts` - Match specific file
- Lines starting with `#` are comments
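
The normalization rules above can be sketched as a small parser. This is a minimal illustration that mirrors what this change does for `.git-ai/include.txt`; the names `normalizePattern` and `parsePatternFile` are hypothetical and do not appear in git-ai.

```typescript
// Hypothetical helper: turns one line of a pattern file into a glob pattern,
// or null when the line carries no pattern (blank line or comment).
function normalizePattern(line: string): string | null {
  const trimmed = line.trim();
  if (trimmed.length === 0 || trimmed.startsWith('#')) return null;
  // A leading slash anchors to the repo root; globs are already relative to it.
  const relative = trimmed.startsWith('/') ? trimmed.slice(1) : trimmed;
  // "directory/" is shorthand for everything below that directory.
  return relative.endsWith('/') ? `${relative}**` : relative;
}

// Hypothetical helper: parses the full contents of a pattern file.
function parsePatternFile(contents: string): string[] {
  return contents
    .split('\n')
    .map(normalizePattern)
    .filter((p): p is string => p !== null);
}
```

For example, a file containing a comment, a blank line, `/generated/api/`, and `*.generated.ts` would yield `generated/api/**` and `*.generated.ts`.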

**Example Configuration:**

```bash
# .gitignore
dist/
generated/
*.log

# .git-ai/include.txt
generated/api/**
generated/types/**

# .aiignore (overrides everything)
generated/test-data/**
```

With this configuration:
- ✅ `generated/api/client.ts` - Indexed (included via include.txt)
- ✅ `generated/types/models.ts` - Indexed (included via include.txt)
- ❌ `generated/test-data/mock.ts` - Not indexed (.aiignore takes priority)
- ❌ `dist/bundle.js` - Not indexed (.gitignore, not in include.txt)
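
The outcome listed above follows from checking the three pattern sets in priority order. The sketch below is a simplified model, not git-ai's implementation: it uses a hand-rolled matcher that supports only `*` and `**`, anchors every pattern at the repository root, and leaves out the built-in exclusions and the file-extension filter. `globToRegExp`, `FilterRules`, and `shouldIndex` are hypothetical names.

```typescript
// Simplified glob matcher (assumption: only `*` and `**` are supported).
function globToRegExp(pattern: string): RegExp {
  let source = '';
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (ch === '*') {
      if (pattern.charAt(i + 1) === '*') {
        source += '.*'; // `**` may cross directory separators
        i++; // consume the second '*'
      } else {
        source += '[^/]*'; // `*` stays within one path segment
      }
    } else {
      // Escape characters that are special in regular expressions.
      source += ch.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
    }
  }
  return new RegExp(`^${source}$`);
}

interface FilterRules {
  aiIgnore: string[];
  include: string[];
  gitIgnore: string[];
}

function matchesAny(file: string, patterns: string[]): boolean {
  return patterns.some((p) => globToRegExp(p).test(file));
}

// Priority: .aiignore, then include.txt, then .gitignore.
function shouldIndex(file: string, rules: FilterRules): boolean {
  if (matchesAny(file, rules.aiIgnore)) return false;
  if (matchesAny(file, rules.include)) return true;
  return !matchesAny(file, rules.gitIgnore);
}
```

With the example configuration, `shouldIndex` returns `true` for `generated/api/client.ts` and `generated/types/models.ts`, and `false` for `generated/test-data/mock.ts` and `dist/bundle.js`.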

---

## 🛠️ Troubleshooting

### Windows Installation Issues
73 changes: 73 additions & 0 deletions README.zh-CN.md
@@ -283,6 +283,79 @@ git-ai ai graph callers authenticateUser

---

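## ⚙️ Configuration

### File Filtering

git-ai respects your project's ignore files to control which files are indexed:

#### `.gitignore` - Standard Git Ignore

Files matching patterns in `.gitignore` are excluded from indexing by default.

#### `.aiignore` - AI-Specific Exclusions (Highest Priority)

Create a `.aiignore` file in your repository root to exclude files from git-ai's index without changing what Git tracks:

```bash
# Example .aiignore
test-fixtures/**
*.generated.ts
docs/api-reference/**
```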

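#### `.git-ai/include.txt` - Force Include (Overrides `.gitignore`)

Sometimes you need to index generated code, or other files listed in `.gitignore`, because they matter for code understanding. Create `.git-ai/include.txt` to force-index specific patterns:

```bash
# Example .git-ai/include.txt
# Include generated API clients
generated/api/**

# Include code from specific ignored directories
vendor/important-lib/**
```

**Priority Order (Highest to Lowest):**
1. `.aiignore` - Explicit exclusions always win
2. `.git-ai/include.txt` - Force-include patterns override `.gitignore`
3. `.gitignore` - Standard Git ignore patterns

Built-in exclusions (`node_modules/`, `.git/`, `.git-ai/`, `.repo/`, `dist/`, `target/`, `build/`, `.gradle/`) are applied before all of the above, so `include.txt` cannot force-index files under those directories.

**Supported Pattern Syntax:**
- `**` - Match any number of directories
- `*` - Match any characters within a single path segment (does not cross `/`)
- `directory/` - Match entire directory (automatically converted to `directory/**`)
- `file.ts` - Match specific file
- Lines starting with `#` are comments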

**Example Configuration:**

```bash
# .gitignore
dist/
generated/
*.log

# .git-ai/include.txt
generated/api/**
generated/types/**

# .aiignore (overrides everything)
generated/test-data/**
```

With this configuration:
- ✅ `generated/api/client.ts` - Indexed (included via include.txt)
- ✅ `generated/types/models.ts` - Indexed (included via include.txt)
- ❌ `generated/test-data/mock.ts` - Not indexed (.aiignore takes priority)
- ❌ `dist/bundle.js` - Not indexed (.gitignore, not in include.txt)

---

## 🛠️ Troubleshooting

### Windows Installation Issues
Expand Down
6 changes: 6 additions & 0 deletions package-lock.json

96 changes: 80 additions & 16 deletions src/core/indexer.ts
@@ -9,6 +9,9 @@ import { ChunkRow, RefRow } from './types';
 import { toPosixPath } from './paths';
 import { getCurrentCommitHash } from './git';

+// Supported file extensions for indexing
+const INDEXABLE_EXTENSIONS = 'ts,tsx,js,jsx,java,c,h,go,py,rs,md,mdx,yml,yaml';
+
 export interface IndexOptions {
   repoRoot: string;
   scanRoot?: string;
@@ -36,6 +39,23 @@ async function loadIgnorePatterns(repoRoot: string, fileName: string): Promise<s
     .filter((l): l is string => Boolean(l));
 }

+async function loadIncludePatterns(repoRoot: string): Promise<string[]> {
+  const includePath = path.join(repoRoot, '.git-ai', 'include.txt');
+  if (!await fs.pathExists(includePath)) return [];
+  const raw = await fs.readFile(includePath, 'utf-8');
+  return raw
+    .split('\n')
+    .map(l => l.trim())
+    .map((l) => {
+      if (l.length === 0) return null;
+      if (l.startsWith('#')) return null;
+      const withoutLeadingSlash = l.startsWith('/') ? l.slice(1) : l;
+      if (withoutLeadingSlash.endsWith('/')) return `${withoutLeadingSlash}**`;
+      return withoutLeadingSlash;
+    })
+    .filter((l): l is string => Boolean(l));
+}
+
 function inferIndexLang(file: string): IndexLang {
   if (file.endsWith('.md') || file.endsWith('.mdx')) return 'markdown';
   if (file.endsWith('.yml') || file.endsWith('.yaml')) return 'yaml';
@@ -71,30 +91,74 @@ export class IndexerV2 {

     const aiIgnore = await loadIgnorePatterns(this.repoRoot, '.aiignore');
     const gitIgnore = await loadIgnorePatterns(this.repoRoot, '.gitignore');
-    const files = await glob('**/*.{ts,tsx,js,jsx,java,c,h,go,py,rs,md,mdx,yml,yaml}', {
+    const includePatterns = await loadIncludePatterns(this.repoRoot);
+
+    // Base ignore patterns that are always applied
+    const baseIgnore = [
+      'node_modules/**',
+      '**/node_modules/**',
+      '.git/**',
+      '**/.git/**',
+      '.git-ai/**',
+      '**/.git-ai/**',
+      '.repo/**',
+      '**/.repo/**',
+      'dist/**',
+      '**/dist/**',
+      'target/**',
+      '**/target/**',
+      'build/**',
+      '**/build/**',
+      '.gradle/**',
+      '**/.gradle/**',
+    ];
+
+    // Get files with normal ignore patterns (aiIgnore and gitIgnore)
+    const filesNormal = await glob(`**/*.{${INDEXABLE_EXTENSIONS}}`, {
       cwd: this.scanRoot,
       nodir: true,
       ignore: [
-        'node_modules/**',
-        '**/node_modules/**',
-        '.git/**',
-        '**/.git/**',
-        '.git-ai/**',
-        '**/.git-ai/**',
-        '.repo/**',
-        '**/.repo/**',
-        'dist/**',
-        'target/**',
-        '**/target/**',
-        'build/**',
-        '**/build/**',
-        '.gradle/**',
-        '**/.gradle/**',
+        ...baseIgnore,
         ...aiIgnore,
         ...gitIgnore,
       ],
     });

+    let files = filesNormal;
+
+    // If include patterns exist, also get files matching those patterns (ignoring gitIgnore but respecting aiIgnore)
+    if (includePatterns.length > 0) {
+      // For each include pattern, get files matching it without gitIgnore restrictions
+      const includedFileSets = await Promise.all(
+        includePatterns.map(async (pattern) => {
+          // Ensure pattern covers all file extensions we support
+          let fullPattern = pattern;
+          // If pattern is a directory pattern (e.g., "generated/**"), append file extensions
+          if (pattern.endsWith('**')) {
+            fullPattern = `${pattern}/*.{${INDEXABLE_EXTENSIONS}}`;
+          } else if (!pattern.match(/\.(ts|tsx|js|jsx|java|c|h|go|py|rs|md|mdx|yml|yaml)$/)) {
+            // If pattern doesn't end with a file extension, treat it as a directory
+            fullPattern = `${pattern}/**/*.{${INDEXABLE_EXTENSIONS}}`;
+          }
+
+          return glob(fullPattern, {
+            cwd: this.scanRoot,
+            nodir: true,
+            ignore: [
+              ...baseIgnore,
+              ...aiIgnore,
+              // Note: gitIgnore is NOT applied here
+            ],
+          });
+        })
+      );
+
+      // Flatten and merge with normal files
+      const includedFiles = includedFileSets.flat();
+      const fileSet = new Set([...filesNormal, ...includedFiles]);
+      files = Array.from(fileSet);
+    }
+
     const languages = Array.from(new Set(files.map(inferIndexLang)));
     const { byLang } = await openTablesByLang({
       dbDir,
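
The include-pattern expansion in the hunk above can be isolated as a pure function, which makes the three cases easier to see and to test. This is a sketch under assumptions, not code from the PR: `expandIncludePattern` and `EXTENSION_RE` are hypothetical names, and the extension regex is derived from `INDEXABLE_EXTENSIONS` here rather than written out by hand as it is in the diff.

```typescript
const INDEXABLE_EXTENSIONS = 'ts,tsx,js,jsx,java,c,h,go,py,rs,md,mdx,yml,yaml';

// Assumption: building the regex from the constant keeps the two lists in sync.
const EXTENSION_RE = new RegExp(`\\.(${INDEXABLE_EXTENSIONS.split(',').join('|')})$`);

// Hypothetical helper: maps one include pattern to the glob that is searched.
function expandIncludePattern(pattern: string): string {
  // Directory pattern such as "generated/**": restrict to indexable extensions.
  if (pattern.endsWith('**')) {
    return `${pattern}/*.{${INDEXABLE_EXTENSIONS}}`;
  }
  // No indexable extension: treat the pattern as a directory name.
  if (!EXTENSION_RE.test(pattern)) {
    return `${pattern}/**/*.{${INDEXABLE_EXTENSIONS}}`;
  }
  // Already names an indexable file (or file glob): use as-is.
  return pattern;
}
```

One consequence of the second branch is that a pattern ending in a single `*`, such as `generated/api/*`, is treated as a directory and expands to `generated/api/*/**/*.{...}`, which skips files directly inside `generated/api/`.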