52 changes: 41 additions & 11 deletions src/qmd.ts
@@ -2,7 +2,7 @@
import { Database } from "bun:sqlite";
import { Glob, $ } from "bun";
import { parseArgs } from "util";
import { readFileSync, statSync, existsSync, unlinkSync, writeFileSync, openSync, closeSync, mkdirSync } from "fs";
import { readFileSync, readdirSync, statSync, existsSync, unlinkSync, writeFileSync, openSync, closeSync, mkdirSync } from "fs";
import {
getPwd,
getRealPath,
@@ -1365,20 +1365,50 @@ async function indexFiles(pwd?: string, globPattern: string = DEFAULT_GLOB, coll

progress.indeterminate();
const glob = new Glob(globPattern);
const excludeSet = new Set([...excludeDirs, "node_modules"]);

Copilot AI Feb 18, 2026

The excludeSet is given "node_modules" twice: once through the spread of the excludeDirs array (which already contains "node_modules" at line 1354) and once as an explicit extra element. A Set discards the duplicate, so behavior is unaffected, but the redundancy should be removed, either by dropping the explicit "node_modules" here or by removing it from the excludeDirs array definition.

Suggested change
const excludeSet = new Set([...excludeDirs, "node_modules"]);
const excludeSet = new Set(excludeDirs);
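
A minimal sketch of why the two forms are equivalent, assuming excludeDirs already lists "node_modules". The array contents below are hypothetical, since the real definition at line 1354 is not shown in this diff:

```typescript
// Hypothetical contents; the real excludeDirs array is defined outside this hunk.
const excludeDirs: string[] = ["node_modules", "dist", "build"];

// Form used in the PR: spreads the array and adds "node_modules" again.
const withExplicit = new Set([...excludeDirs, "node_modules"]);

// Suggested form: build the Set directly from the array.
const plain = new Set(excludeDirs);

// A Set stores each value once, so both hold the same members.
console.log(withExplicit.size === plain.size);
console.log([...withExplicit].every((name) => plain.has(name)));
```

If excludeDirs ever stopped listing "node_modules", only the PR's form would still exclude it, so the simplification is safe only while that entry stays in the array.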

const files: string[] = [];
for await (const file of glob.scan({ cwd: resolvedPwd, onlyFiles: true, followSymlinks: true })) {
// Skip node_modules, hidden folders (.*), and other common excludes
const parts = file.split("/");
const shouldSkip = parts.some(part =>
part === "node_modules" ||
part.startsWith(".") ||
excludeDirs.includes(part)
);
if (!shouldSkip) {
files.push(file);

// Walk directories manually to skip excluded dirs BEFORE descending into them.
// Bun's Glob.scan() traverses into node_modules/etc before the filter runs,
// which causes OOM kills on large monorepos with millions of files.
function walkDir(dir: string, relPrefix: string): void {
let entries: ReturnType<typeof readdirSync>;
try {
entries = readdirSync(resolve(resolvedPwd, dir), { withFileTypes: true });
} catch {
return; // Permission denied, broken symlink, etc.
}
for (const entry of entries) {
const name = entry.name;
// Skip excluded directories and hidden directories before descending
if (entry.isDirectory() || entry.isSymbolicLink()) {
if (excludeSet.has(name) || name.startsWith(".")) continue;
// For symlinks, check if they point to a directory
if (entry.isSymbolicLink()) {
try {
const target = statSync(resolve(resolvedPwd, dir, name));
if (!target.isDirectory()) {
// Symlink to a file — check if it matches
const relPath = relPrefix ? `${relPrefix}/${name}` : name;
if (glob.match(relPath)) files.push(relPath);
continue;
}
} catch {
continue; // Broken symlink
}
}
walkDir(dir ? `${dir}/${name}` : name, relPrefix ? `${relPrefix}/${name}` : name);
Comment on lines +1387 to +1400
Copilot AI Feb 18, 2026

The symlink handling can recurse without bound if the directory tree contains circular symlinks. The previous Glob.scan({ followSymlinks: true }) implementation has the same problem, but consider adding cycle detection here by tracking visited real paths (via realpathSync) so that a symlink loop cannot cause a stack overflow. That would make this implementation more robust than the original.
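
One way the suggested cycle detection could look, as a standalone sketch and not the PR's code. The helper name walkWithCycleGuard and the demo directory layout are hypothetical, and the glob matching and exclude list from walkDir are left out to keep the focus on the visited-set check. The demo assumes a platform where creating directory symlinks is permitted:

```typescript
import { mkdirSync, mkdtempSync, readdirSync, realpathSync, statSync, symlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper (not part of this PR): collect relative file paths under
// `root`, never entering a directory whose real path was already visited.
function walkWithCycleGuard(root: string): string[] {
  const files: string[] = [];
  const visited = new Set<string>();

  function listDir(absDir: string) {
    try {
      return readdirSync(absDir, { withFileTypes: true });
    } catch {
      return []; // Permission denied, directory removed mid-walk, etc.
    }
  }

  function walk(absDir: string, relPrefix: string): void {
    let realDir: string;
    try {
      realDir = realpathSync(absDir); // Resolves every symlink in the path
    } catch {
      return; // Broken symlink or unreadable path
    }
    if (visited.has(realDir)) return; // Already walked: cycle or duplicate route
    visited.add(realDir);

    for (const entry of listDir(absDir)) {
      const abs = join(absDir, entry.name);
      const rel = relPrefix ? `${relPrefix}/${entry.name}` : entry.name;
      let isDir = entry.isDirectory();
      let isFile = entry.isFile();
      if (entry.isSymbolicLink()) {
        try {
          const target = statSync(abs); // statSync follows the link
          isDir = target.isDirectory();
          isFile = target.isFile();
        } catch {
          continue; // Broken symlink
        }
      }
      if (isDir) walk(abs, rel);
      else if (isFile) files.push(rel);
    }
  }

  walk(root, "");
  return files;
}

// Demo: a temporary tree in which docs/loop points back at the root.
const demoRoot = mkdtempSync(join(tmpdir(), "walk-demo-"));
mkdirSync(join(demoRoot, "docs"));
writeFileSync(join(demoRoot, "docs", "note.md"), "# note\n");
symlinkSync(demoRoot, join(demoRoot, "docs", "loop"), "dir");
console.log(walkWithCycleGuard(demoRoot));
```

Note the trade-off in this sketch: a global visited set also skips a directory that is reachable by two distinct, non-circular routes, so its files are reported only under the first route encountered. If every route should be reported, tracking only the real paths of the current ancestor chain would detect cycles without deduplicating.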

} else if (entry.isFile()) {
const relPath = relPrefix ? `${relPrefix}/${name}` : name;
if (glob.match(relPath)) {
files.push(relPath);
}
}
}
}

walkDir("", "");

const total = files.length;
if (total === 0) {
progress.clear();