diff --git a/docs/case-studies/issue-173/CASE-STUDY.md b/docs/case-studies/issue-173/CASE-STUDY.md
new file mode 100644
index 0000000..2ce9fb5
--- /dev/null
+++ b/docs/case-studies/issue-173/CASE-STUDY.md
@@ -0,0 +1,232 @@
+# Case Study: Issue #173 - `--model kilo/glm-5-free` Hangs Forever
+
+## Summary
+
+When using `--model kilo/glm-5-free`, the agent hangs indefinitely during provider package installation. The process gets stuck at the `bun add @openrouter/ai-sdk-provider@latest` command.
+
+## Timeline of Events
+
+### Sequence of Events (from verbose logs)
+
+1. **T+0ms**: Agent started with `--model kilo/glm-5-free --verbose`
+2. **T+33ms**: Model parsed: `providerID: "kilo"`, `modelID: "glm-5-free"`
+3. **T+113ms**: Provider state initialization started
+4. **T+137ms**: Provider SDK requested for `kilo` provider
+5. **T+138ms**: Package installation initiated: `@openrouter/ai-sdk-provider@latest`
+6. **T+139ms**: `bun add` command spawned
+7. **∞**: Process hangs - no completion, no error; in the captured log the user interrupts with CTRL+C about 7 seconds after startup (shutdown logged at `13:43:09.013Z`)
+
+### The Hanging Command
+
+```json
+{
+  "type": "log",
+  "level": "info",
+  "timestamp": "2026-02-14T13:43:01.984Z",
+  "service": "bun",
+  "cmd": [
+    "/home/hive/.bun/bin/bun",
+    "add",
+    "--force",
+    "--exact",
+    "--cwd",
+    "/home/hive/.cache/link-assistant-agent",
+    "@openrouter/ai-sdk-provider@latest"
+  ],
+  "cwd": "/home/hive/.cache/link-assistant-agent",
+  "message": "running"
+}
+```
+
+## Root Cause Analysis
+
+### Primary Issue: Missing Timeout in Bun.spawn
+
+The `BunProc.run()` function in `js/src/bun/index.ts` uses `Bun.spawn()` without a `timeout` option:
+
+```typescript
+const result = Bun.spawn([which(), ...cmd], {
+  ...options,
+  stdout: 'pipe',
+  stderr: 'pipe',
+  env: {
+    ...process.env,
+    ...options?.env,
+    BUN_BE_BUN: '1',
+  },
+});
+```
+
+Without a timeout, if `bun add` encounters any of the known hanging issues, the process waits indefinitely.
+
+### Known Bun Package Manager Hang Issues
+
+Based on research, several Bun issues can cause `bun add`/`bun install` to hang:
+
+1. **HTTP 304 Response Handling** ([Issue #5831](https://github.com/oven-sh/bun/issues/5831))
+   - Improper handling of HTTP 304 (Not Modified) responses
+   - IPv6 configuration issues causing connection hangs
+   - Fixes merged in PR #6192 and PR #15511
+
+2. **Failed Dependency Fetch** ([Issue #26341](https://github.com/oven-sh/bun/issues/26341))
+   - When tarball download fails (e.g., 401 Unauthorized), `bun install` hangs
+   - Missing error callback in isolated install mode
+   - Fix merged in PR #26342
+
+3. **Large Package Count** ([Issue #23607](https://github.com/oven-sh/bun/issues/23607))
+   - Security scanner causes hang with 790+ packages
+   - Hang occurs in scanner loading mechanism
+
+4. **Containerized Linux Environments** ([Issue #25624](https://github.com/oven-sh/bun/issues/25624))
+   - `bun install` hangs at "Resolving dependencies"
+   - Issues with Bun's in-memory resolution algorithm
+
+### Contributing Factors
+
+1. **Network Conditions**: The user's environment may have intermittent network issues
+2. **IPv6 Configuration**: IPv6 issues can cause Bun to hang on DNS resolution
+3. **Cache State**: Corrupted or partial cache can trigger hangs
+4. **Missing Timeout**: The `BunProc.run()` function has no timeout mechanism
+
+## Proposed Solutions
+
+### Solution 1: Add Timeout to BunProc.run (Recommended)
+
+Add a timeout option to the `Bun.spawn()` call in `BunProc.run()`:
+
+```typescript
+export async function run(
+  cmd: string[],
+  options?: Bun.SpawnOptions.OptionsObject & { timeout?: number }
+) {
+  const timeout = options?.timeout ?? 120000; // 2 minutes default
+
+  log.info(() => ({
+    message: 'running',
+    cmd: [which(), ...cmd],
+    timeout,
+    ...options,
+  }));
+
+  const result = Bun.spawn([which(), ...cmd], {
+    ...options,
+    stdout: 'pipe',
+    stderr: 'pipe',
+    timeout, // Add timeout support
+    killSignal: 'SIGTERM', // Graceful termination
+    env: {
+      ...process.env,
+      ...options?.env,
+      BUN_BE_BUN: '1',
+    },
+  });
+  // ...
+}
+```
+
+### Solution 2: Pre-bundle the @openrouter/ai-sdk-provider Package
+
+Instead of dynamically installing the package at runtime, pre-install it as a dependency:
+
+```json
+// package.json
+{
+  "dependencies": {
+    "@openrouter/ai-sdk-provider": "^2.2.3"
+  }
+}
+```
+
+This is how the KiloCode and Kilo repositories handle the provider package.
+
+### Solution 3: Use AbortSignal for More Control
+
+```typescript
+const controller = new AbortController();
+const timeoutId = setTimeout(() => controller.abort(), 120000);
+
+const result = Bun.spawn([which(), ...cmd], {
+  signal: controller.signal,
+  // ...
+});
+
+const code = await result.exited;
+clearTimeout(timeoutId);
+```
+
+### Solution 4: Add Retry with Exponential Backoff
+
+If the package installation fails, retry with exponential backoff:
+
+```typescript
+const MAX_RETRIES = 3;
+const BASE_DELAY = 1000;
+
+for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
+  try {
+    await BunProc.run(args, { cwd, timeout: 60000 });
+    break; // Success
+  } catch (e) {
+    if (attempt === MAX_RETRIES) throw e;
+    await delay(BASE_DELAY * Math.pow(2, attempt - 1));
+  }
+}
+```
+
+## Recommended Fix
+
+Implement **Solution 1** with a reasonable timeout (60-120 seconds) for package installation. This prevents indefinite hangs while still allowing enough time for legitimate package installations.
+
+Additionally, consider implementing **Solution 2** for commonly-used provider packages to avoid runtime installation altogether.
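Combining the per-attempt timeout from Solution 1 with the backoff from Solution 4 also bounds how long a caller can wait before a failure surfaces. The sketch below is illustrative only: `backoffDelayMs` and `worstCaseTotalMs` are hypothetical helper names that are not part of the repository, and the constants simply mirror the values shown in Solution 4 (3 attempts, 1000 ms base delay, 60000 ms per-attempt timeout).

```typescript
// Hypothetical helpers for reasoning about the retry schedule.
// The constants mirror Solution 4; they are assumptions, not project settings.
const MAX_RETRIES = 3;
const BASE_DELAY_MS = 1000;
const ATTEMPT_TIMEOUT_MS = 60000;

/** Delay applied after a failed attempt (1-based) before the next one starts. */
function backoffDelayMs(
  attempt: number,
  baseDelayMs: number = BASE_DELAY_MS
): number {
  return baseDelayMs * Math.pow(2, attempt - 1);
}

/** Upper bound on wall-clock time if every attempt runs into its timeout. */
function worstCaseTotalMs(
  maxRetries: number = MAX_RETRIES,
  timeoutMs: number = ATTEMPT_TIMEOUT_MS,
  baseDelayMs: number = BASE_DELAY_MS
): number {
  let total = 0;
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    total += timeoutMs; // the attempt itself, cut off by the timeout
    if (attempt < maxRetries) {
      total += backoffDelayMs(attempt, baseDelayMs); // no delay after the final attempt
    }
  }
  return total;
}
```

Under these assumptions the delays between attempts are 1000 ms and 2000 ms, so three timed-out attempts take at most 183000 ms (about three minutes) before the error is reported. That figure is worth keeping in mind when choosing the per-attempt timeout.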
+
+## References
+
+### Related Issues
+
+- [Bun Issue #5831: bun install hangs sporadically](https://github.com/oven-sh/bun/issues/5831)
+- [Bun Issue #26341: Bun install hangs when failing to fetch](https://github.com/oven-sh/bun/issues/26341)
+- [Bun Issue #23607: bun install hangs with security scanner](https://github.com/oven-sh/bun/issues/23607)
+- [Bun Issue #25624: bun install hangs in containerized Linux](https://github.com/oven-sh/bun/issues/25624)
+
+### Bun Documentation
+
+- [Bun Spawn Documentation](https://bun.sh/docs/runtime/child-process)
+- Timeout option: `timeout: number` (milliseconds)
+- Kill signal: `killSignal: "SIGTERM" | "SIGKILL" | ...`
+
+### KiloCode/Kilo Reference Implementation
+
+The Kilo provider implementation uses:
+- Pre-installed `@openrouter/ai-sdk-provider` package
+- API endpoint: `https://api.kilo.ai/api/openrouter/`
+- Custom headers: `X-KILOCODE-EDITORNAME`, `User-Agent`
+
+## Workarounds
+
+### For Users
+
+1. **Pre-install the package manually**:
+   ```bash
+   bun add @openrouter/ai-sdk-provider
+   ```
+
+2. **Clear Bun cache**:
+   ```bash
+   bun pm cache rm
+   ```
+
+3. **Disable IPv6** (if applicable):
+   ```bash
+   # Linux
+   sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
+   ```
+
+4. **Use a different model** while the issue is being fixed:
+   ```bash
+   echo "hi" | agent --model opencode/grok-code --verbose
+   ```
+
+## Files Affected
+
+- `js/src/bun/index.ts` - Main fix location (add timeout)
+- `js/src/provider/provider.ts` - Provider SDK loading
diff --git a/docs/case-studies/issue-173/issue-data.json b/docs/case-studies/issue-173/issue-data.json
new file mode 100644
index 0000000..6d8cb1d
--- /dev/null
+++ b/docs/case-studies/issue-173/issue-data.json
@@ -0,0 +1 @@
+{"author":{"id":"MDQ6VXNlcjE0MzE5MDQ=","is_bot":false,"login":"konard","name":"Konstantin Diachenko"},"body":"```\nhive@vmi2955137:~$ echo \"hi\" | agent --model kilo/glm-5-free --verbose\n{ \n \"type\": \"status\",\n \"mode\": \"stdin-stream\",\n \"message\": \"Agent CLI in continuous listening mode. Accepts JSON and plain text input.\",\n \"hint\": \"Press CTRL+C to exit. Use --help for options.\",\n \"acceptedFormats\": [\n \"JSON object with \\\"message\\\" field\",\n \"Plain text\"\n ],\n \"options\": {\n \"interactive\": true,\n \"autoMergeQueuedMessages\": true,\n \"alwaysAcceptStdin\": true,\n \"compactJson\": false\n }\n}\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.845Z\",\n \"service\": \"default\",\n \"version\": \"0.13.0\",\n \"command\": \"/home/hive/.bun/bin/bun /home/hive/.bun/install/global/node_modules/@link-assistant/agent/src/index.js --model kilo/glm-5-free --verbose\",\n \"workingDirectory\": \"/home/hive\",\n \"scriptPath\": \"/home/hive/.bun/install/global/node_modules/@link-assistant/agent/src/index.js\",\n \"message\": \"Agent started (continuous mode)\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.846Z\",\n \"service\": \"default\",\n \"directory\": \"/home/hive\",\n \"message\": \"creating instance\"\n}\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.846Z\",\n \"service\": \"project\",\n \"directory\": \"/home/hive\",\n
\"message\": \"fromDirectory\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.878Z\",\n \"service\": \"default\",\n \"rawModel\": \"kilo/glm-5-free\",\n \"providerID\": \"kilo\",\n \"modelID\": \"glm-5-free\",\n \"message\": \"using explicit provider/model\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.901Z\",\n \"service\": \"server\",\n \"method\": \"POST\",\n \"path\": \"/session\",\n \"message\": \"request\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.901Z\",\n \"service\": \"server\",\n \"status\": \"started\",\n \"method\": \"POST\",\n \"path\": \"/session\",\n \"message\": \"request\"\n}\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.905Z\",\n \"service\": \"session\",\n \"id\": \"ses_3a39c05eeffeC3971iD1mpvpy1\",\n \"version\": \"agent-cli-1.0.0\",\n \"projectID\": \"global\",\n \"directory\": \"/home/hive\",\n \"title\": \"New session - 2026-02-14T13:43:01.905Z\",\n \"time\": {\n \"created\": 1771076581905,\n \"updated\": 1771076581905\n },\n \"message\": \"created\"\n}\n\n{ \n \"type\": \"session.created\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.905Z\",\n \"service\": \"bus\",\n \"message\": \"publishing\"\n}\n\n{ \n \"type\": \"session.updated\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.905Z\",\n \"service\": \"bus\",\n \"message\": \"publishing\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.906Z\",\n \"service\": \"server\",\n \"status\": \"completed\",\n \"duration\": 5,\n \"method\": \"POST\",\n \"path\": \"/session\",\n \"message\": \"request\"\n}\n{ \n \"type\": \"*\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.909Z\",\n \"service\": \"bus\",\n \"message\": \"subscribing\"\n}\n\n{ \n \"type\": \"input\",\n \"timestamp\": \"2026-02-14T13:43:01.913Z\",\n \"raw\": 
\"hi\",\n \"parsed\": {\n \"message\": \"hi\"\n },\n \"format\": \"text\"\n}\n{ \n \"type\": \"*\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.914Z\",\n \"service\": \"bus\",\n \"message\": \"subscribing\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.915Z\",\n \"service\": \"server\",\n \"method\": \"POST\",\n \"path\": \"/session/ses_3a39c05eeffeC3971iD1mpvpy1/message\",\n \"message\": \"request\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.915Z\",\n \"service\": \"server\",\n \"status\": \"started\",\n \"method\": \"POST\",\n \"path\": \"/session/ses_3a39c05eeffeC3971iD1mpvpy1/message\",\n \"message\": \"request\"\n}\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.921Z\",\n \"service\": \"server\",\n \"status\": \"completed\",\n \"duration\": 6,\n \"method\": \"POST\",\n \"path\": \"/session/ses_3a39c05eeffeC3971iD1mpvpy1/message\",\n \"message\": \"request\"\n}\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.930Z\",\n \"service\": \"config\",\n \"path\": \"/home/hive/.config/link-assistant-agent/config.json\",\n \"message\": \"loading\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.930Z\",\n \"service\": \"config\",\n \"path\": \"/home/hive/.config/link-assistant-agent/opencode.json\",\n \"message\": \"loading\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.931Z\",\n \"service\": \"config\",\n \"path\": \"/home/hive/.config/link-assistant-agent/opencode.jsonc\",\n \"message\": \"loading\"\n}\n\n{ \n \"type\": \"message.updated\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.946Z\",\n \"service\": \"bus\",\n \"message\": \"publishing\"\n}\n\n{ \n \"type\": \"message.part.updated\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.949Z\",\n \"service\": 
\"bus\",\n \"message\": \"publishing\"\n}\n\n{ \n \"type\": \"session.updated\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.950Z\",\n \"service\": \"bus\",\n \"message\": \"publishing\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.953Z\",\n \"service\": \"session.prompt\",\n \"step\": 0,\n \"sessionID\": \"ses_3a39c05eeffeC3971iD1mpvpy1\",\n \"message\": \"loop\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.956Z\",\n \"service\": \"session.prompt\",\n \"hint\": \"Enable with --generate-title flag or AGENT_GENERATE_TITLE=true\",\n \"message\": \"title generation disabled\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.958Z\",\n \"service\": \"provider\",\n \"status\": \"started\",\n \"message\": \"state\"\n}\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.958Z\",\n \"service\": \"models.dev\",\n \"file\": {},\n \"message\": \"refreshing\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.970Z\",\n \"service\": \"provider\",\n \"message\": \"init\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.981Z\",\n \"service\": \"claude-oauth\",\n \"subscriptionType\": \"max\",\n \"scopes\": [\n \"user:inference\",\n \"user:mcp_servers\",\n \"user:profile\",\n \"user:sessions:claude_code\"\n ],\n \"message\": \"loaded oauth credentials\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.981Z\",\n \"service\": \"provider\",\n \"source\": \"credentials file (max)\",\n \"message\": \"using claude oauth credentials\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.981Z\",\n \"service\": \"provider\",\n \"providerID\": \"opencode\",\n \"message\": \"found\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n 
\"timestamp\": \"2026-02-14T13:43:01.981Z\",\n \"service\": \"provider\",\n \"providerID\": \"kilo\",\n \"message\": \"found\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.982Z\",\n \"service\": \"provider\",\n \"providerID\": \"claude-oauth\",\n \"message\": \"found\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.982Z\",\n \"service\": \"provider\",\n \"status\": \"completed\",\n \"duration\": 24,\n \"message\": \"state\"\n}\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.982Z\",\n \"service\": \"provider\",\n \"providerID\": \"kilo\",\n \"modelID\": \"glm-5-free\",\n \"message\": \"getModel\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.982Z\",\n \"service\": \"provider\",\n \"status\": \"started\",\n \"providerID\": \"kilo\",\n \"message\": \"getSDK\"\n}\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.983Z\",\n \"service\": \"provider\",\n \"providerID\": \"kilo\",\n \"pkg\": \"@openrouter/ai-sdk-provider\",\n \"version\": \"latest\",\n \"message\": \"installing provider package\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.984Z\",\n \"service\": \"bun\",\n \"pkg\": \"@openrouter/ai-sdk-provider\",\n \"version\": \"latest\",\n \"message\": \"installing package using Bun's default registry resolution\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.984Z\",\n \"service\": \"bun\",\n \"cmd\": [\n \"/home/hive/.bun/bin/bun\",\n \"add\",\n \"--force\",\n \"--exact\",\n \"--cwd\",\n \"/home/hive/.cache/link-assistant-agent\",\n \"@openrouter/ai-sdk-provider@latest\"\n ],\n \"cwd\": \"/home/hive/.cache/link-assistant-agent\",\n \"message\": \"running\"\n}\n\n^C{\n \"type\": \"status\",\n \"message\": \"Received SIGINT. 
Shutting down...\"\n}\n{ \n \"type\": \"*\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:09.013Z\",\n \"service\": \"bus\",\n \"message\": \"unsubscribing\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:09.014Z\",\n \"service\": \"default\",\n \"directory\": \"/home/hive\",\n \"message\": \"disposing instance\"\n}\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:09.014Z\",\n \"service\": \"state\",\n \"key\": \"/home/hive\",\n \"message\": \"waiting for state disposal to complete\"\n}\n\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:09.014Z\",\n \"service\": \"state\",\n \"key\": \"/home/hive\",\n \"message\": \"state disposal completed\"\n}\n\nhive@vmi2955137:~$ \n```\n\nIt stuck forever on:\n```\n{ \n \"type\": \"log\",\n \"level\": \"info\",\n \"timestamp\": \"2026-02-14T13:43:01.984Z\",\n \"service\": \"bun\",\n \"cmd\": [\n \"/home/hive/.bun/bin/bun\",\n \"add\",\n \"--force\",\n \"--exact\",\n \"--cwd\",\n \"/home/hive/.cache/link-assistant-agent\",\n \"@openrouter/ai-sdk-provider@latest\"\n ],\n \"cwd\": \"/home/hive/.cache/link-assistant-agent\",\n \"message\": \"running\"\n}\n```\n\nSo I had to CTRL+C it.\n\nDouble check how it is done correctly in https://github.com/Kilo-Org/kilocode or https://github.com/Kilo-Org/kilo\n\nPlease download all logs and data related about the issue to this repository, make sure we compile that data to `./docs/case-studies/issue-{id}` folder, and use it to do deep case study analysis (also make sure to search online for additional facts and data), in which we will reconstruct timeline/sequence of events, find root causes of the problem, and propose possible solutions (including known existing components/libraries, that solve similar problem or can help in solutions).\n\nIf issue related to any other repository/project, where we can report issues on GitHub, please do so. 
Each issue must contain reproducible examples, workarounds and suggestions for fix the issue in code.","comments":[],"createdAt":"2026-02-14T13:45:50Z","labels":[{"id":"LA_kwDOQYTy3M8AAAACQHoi-w","name":"bug","description":"Something isn't working","color":"d73a4a"}],"number":173,"state":"OPEN","title":"`--model kilo/glm-5-free` is still not working","updatedAt":"2026-02-14T13:47:01Z"} diff --git a/js/.changeset/fix-bun-timeout-hang.md b/js/.changeset/fix-bun-timeout-hang.md new file mode 100644 index 0000000..fdc940c --- /dev/null +++ b/js/.changeset/fix-bun-timeout-hang.md @@ -0,0 +1,17 @@ +--- +'@link-assistant/agent': patch +--- + +Fix indefinite hang when using Kilo provider by adding timeout to BunProc.run (#173) + +- Add DEFAULT_TIMEOUT_MS (2 minutes) for subprocess commands +- Add INSTALL_TIMEOUT_MS (60 seconds) for package installation +- Create TimeoutError for better error handling and retry logic +- Add retry logic for timeout errors (up to 3 attempts) +- Add helpful error messages for timeout and recovery scenarios + +This prevents indefinite hangs caused by known Bun package manager issues: + +- HTTP 304 response handling (oven-sh/bun#5831) +- Failed dependency fetch (oven-sh/bun#26341) +- IPv6 configuration issues diff --git a/js/src/bun/index.ts b/js/src/bun/index.ts index 7fcaff1..1d0cf0b 100644 --- a/js/src/bun/index.ts +++ b/js/src/bun/index.ts @@ -13,26 +13,69 @@ export namespace BunProc { // Lock key for serializing package installations to prevent race conditions const INSTALL_LOCK_KEY = 'bun-install'; + // Default timeout for subprocess commands (2 minutes) + // This prevents indefinite hangs from known Bun issues: + // - HTTP 304 response handling (https://github.com/oven-sh/bun/issues/5831) + // - Failed dependency fetch (https://github.com/oven-sh/bun/issues/26341) + // - IPv6 configuration issues + const DEFAULT_TIMEOUT_MS = 120000; + + // Timeout specifically for package installation (60 seconds) + // Package installations should 
complete within this time for typical packages + const INSTALL_TIMEOUT_MS = 60000; + + export const TimeoutError = NamedError.create( + 'BunTimeoutError', + z.object({ + cmd: z.array(z.string()), + timeoutMs: z.number(), + }) + ); + export async function run( cmd: string[], - options?: Bun.SpawnOptions.OptionsObject + options?: Bun.SpawnOptions.OptionsObject & { + timeout?: number; + } ) { + const timeout = options?.timeout ?? DEFAULT_TIMEOUT_MS; + log.info(() => ({ message: 'running', cmd: [which(), ...cmd], - ...options, + timeout, + cwd: options?.cwd, })); + const result = Bun.spawn([which(), ...cmd], { ...options, stdout: 'pipe', stderr: 'pipe', + timeout, // Automatically kills process after timeout + killSignal: 'SIGTERM', // Graceful termination signal env: { ...process.env, ...options?.env, BUN_BE_BUN: '1', }, }); + const code = await result.exited; + + // Check if process was killed due to timeout + if (result.signalCode === 'SIGTERM' && code !== 0) { + log.error(() => ({ + message: 'command timed out', + cmd: [which(), ...cmd], + timeout, + signalCode: result.signalCode, + })); + throw new TimeoutError({ + cmd: [which(), ...cmd], + timeoutMs: timeout, + }); + } + const stdout = result.stdout ? typeof result.stdout === 'number' ? 
result.stdout @@ -84,6 +127,13 @@ export namespace BunProc { ); } + /** + * Check if an error is a timeout error + */ + function isTimeoutError(error: unknown): boolean { + return error instanceof TimeoutError; + } + /** * Wait for a specified duration */ @@ -139,12 +189,13 @@ export namespace BunProc { version, })); - // Retry logic for cache-related errors + // Retry logic for cache-related errors and timeout errors let lastError: Error | undefined; for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) { try { await BunProc.run(args, { cwd: Global.Path.cache, + timeout: INSTALL_TIMEOUT_MS, // Use specific timeout for package installation }); log.info(() => ({ @@ -159,6 +210,7 @@ export namespace BunProc { } catch (e) { const errorMsg = e instanceof Error ? e.message : String(e); const isCacheError = isCacheRelatedError(errorMsg); + const isTimeout = isTimeoutError(e); log.warn(() => ({ message: 'package installation attempt failed', @@ -168,11 +220,15 @@ export namespace BunProc { maxRetries: MAX_RETRIES, error: errorMsg, isCacheError, + isTimeout, })); - if (isCacheError && attempt < MAX_RETRIES) { + // Retry on cache-related errors or timeout errors + if ((isCacheError || isTimeout) && attempt < MAX_RETRIES) { log.info(() => ({ - message: 'retrying installation after cache-related error', + message: isTimeout + ? 'retrying installation after timeout (possible network issue)' + : 'retrying installation after cache-related error', pkg, version, attempt, @@ -184,7 +240,7 @@ export namespace BunProc { continue; } - // Non-cache error or final attempt - log and throw + // Non-retriable error or final attempt - log and throw log.error(() => ({ message: 'package installation failed', pkg, @@ -192,10 +248,11 @@ export namespace BunProc { error: errorMsg, stack: e instanceof Error ? 
e.stack : undefined, possibleCacheCorruption: isCacheError, + timedOut: isTimeout, attempts: attempt, })); - // Provide helpful recovery instructions for cache-related errors + // Provide helpful recovery instructions if (isCacheError) { log.error(() => ({ message: @@ -203,6 +260,15 @@ export namespace BunProc { })); } + if (isTimeout) { + log.error(() => ({ + message: + 'Package installation timed out. This may be due to network issues or Bun hanging. ' + + 'Try: 1) Check network connectivity, 2) Run "bun pm cache rm" to clear cache, ' + + '3) Check for IPv6 issues (try disabling IPv6)', + })); + } + throw new InstallFailedError( { pkg, version, details: errorMsg }, {