link-assistant · konard · Feb 14, 2026
diff --git a/docs/case-studies/issue-169/analysis.md b/docs/case-studies/issue-169/analysis.md
@@ -0,0 +1,191 @@
+# Case Study: Premature Session Termination Due to SSE Stream Corruption (Issue #169)
+
+## Summary
+
+The agent's session terminated after ~5 minutes instead of retrying for the expected 7-day window. Three steps led to the termination:
+
+1. **SSE stream corruption** at the OpenCode Zen gateway level when proxying responses from Moonshot's Kimi K2.5 API
+2. **Vercel AI SDK** emitting `{ type: 'error', error: JSONParseError }` into the stream (does NOT throw — allows stream to continue)
+3. **Agent's processor** (`processor.ts:208`) throwing `value.error` on any stream error event, terminating the session
+
+## Infrastructure Chain
+
+From the log file (`original-log.txt`), the provider was resolved as follows:
+
+```
+Agent (Bun) → OpenCode Zen (opencode.ai/zen/v1) → Moonshot Kimi K2.5 API
+```
+
+Evidence from logs:
+
+```
+[2026-02-14T08:29:06.525Z] "providerID": "opencode",
+[2026-02-14T08:29:06.525Z] "modelID": "kimi-k2.5-free",
+[2026-02-14T08:29:06.525Z] "message": "using explicit provider/model"
+```
+
+```
+[2026-02-14T08:29:06.628Z] "pkg": "@ai-sdk/openai-compatible",
+```
+
+- **Provider ID**: `opencode` (resolved from `--model kimi-k2.5-free` via `resolveShortModelName()` in `provider.ts:1452`)
+- **SDK**: `@ai-sdk/openai-compatible`
+- **Base URL**: `https://opencode.ai/zen/v1` (from models.dev database for the "opencode" provider)
+- **Model ID sent to API**: `kimi-k2.5-free` (from models.dev `opencode.models["kimi-k2.5-free"].id`)
+- **API Key**: `"public"` (free model, no API key needed — see `provider.ts:87`)
+
+### Why "opencode" provider, not "kilo"?
+
+The model `kimi-k2.5-free` exists in **both** the `opencode` and `kilo` providers. The resolution logic in `provider.ts:1450-1458` prefers `opencode` for shared models:
+
+```typescript
+// provider.ts:1450-1458
+// Multiple providers have this model - prefer OpenCode for shared free models
+if (matchingProviders.includes('opencode')) {
+  return { providerID: 'opencode', modelID };
+}
+```
+
+**The Kilo AI Gateway (`api.kilo.ai`) is NOT involved in this incident.** The previous analysis incorrectly stated Kilo was in the chain. The actual gateway is OpenCode Zen (`opencode.ai/zen/v1`).
+
+## Timeline of Events
+
+All timestamps from `original-log.txt`:
+
+| Time (UTC) | Event | Evidence |
+|------------|-------|----------|
+| 08:28:32 | Process started | `solve v1.23.1`, `--model kimi-k2.5-free` |
+| 08:29:06.525 | Provider resolved | `"providerID": "opencode"`, `"modelID": "kimi-k2.5-free"` |
+| 08:29:06.628 | SDK loaded | `"pkg": "@ai-sdk/openai-compatible"` |
+| 08:29:08.662 | Rate limit 429 | `"headerValue": 55852` → retry-after ~15.5 hours |
+| 08:29:08–08:30:31 | Multiple 429s | Correct retry-after handling |
+| 08:33:41.604 | Session 2 started | Same provider: `"providerID": "opencode"` |
+| 08:34:12.210 | **Stream error** | `"name": "AI_JSONParseError"`, `"text": "{\"id\":\"chatcmpl-jQugNdata:..."` |
+| 08:34:12.211 | Error classified | `"name": "UnknownError"` — not retryable |
+| 08:34:12.213 | Tool aborted | `"error": "Tool execution aborted"` (side effect) |
+| 08:34:12.293 | Solve script exit | Misclassified as `UsageLimit` |
+
+## Root Cause Analysis
+
+### Root Cause 1: Malformed SSE Data from OpenCode Zen
+
+The SSE stream returned corrupted data where two SSE chunks were concatenated without proper delimiters. From the log:
+
+```json
+"text": "{\"id\":\"chatcmpl-jQugNdata:{\"id\":\"chatcmpl-iU6vkr3fItZ0Y4rTCmIyAnXO\",\"object\":\"chat.completion.chunk\",\"created\":1771058051,\"model\":\"kimi-k2.5\",\"choices\":[{\"index\":0,\"delta\":{\"role\":\"assistant\",\"content\":\"\"},\"finish_reason\":null}],\"system_fingerprint\":\"fpv0_f7e5c49a\"}"
+```
+
+Breaking this down:
+- `{"id":"chatcmpl-jQugN` — partial first chunk (truncated after the `id` value)
+- `data:` — SSE protocol prefix that should start a new line
+- `{"id":"chatcmpl-iU6vk...` — complete second chunk
+
+Expected (correct SSE format):
+```
+data: {"id":"chatcmpl-jQugN","object":"chat.completion.chunk",...}\n\n
+data: {"id":"chatcmpl-iU6vkr3fItZ0Y4rTCmIyAnXO","object":"chat.completion.chunk",...}\n\n
+```
+
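+The `data:` prefix of the second event is embedded inside the first event's truncated JSON payload, which means an SSE event boundary was lost before the stream reached the agent. The gateway is the most likely culprit, but the logs alone cannot rule out the Moonshot API.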
+
+Similar issues reported in other projects:
+- [OpenCode #7692](https://github.com/anomalyco/opencode/issues/7692): "JSON Parse Error with Zhipu GLM-4.7"
+- [OpenCode #10967](https://github.com/anomalyco/opencode/issues/10967): "Error Writing Large Files with Kimi K2.5"
+- [sglang #8613](https://github.com/sgl-project/sglang/issues/8613): "Kimi-K2 model outputs incomplete content during multi-turn streaming"
+
+### Root Cause 2: Agent's Processor Throws on All Stream Error Events
+
+The Vercel AI SDK handles this error correctly — it:
+1. Catches JSON parse failure in `safeParseJSON()` (returns `{ success: false }`)
+2. In `openai-compatible-chat-language-model.ts:417-420`, enqueues `{ type: 'error', error: chunk.error }` and **returns** — the stream continues processing subsequent chunks
+
+However, the agent's `processor.ts:207-208` throws on **any** stream error event:
+
+```typescript
+case 'error':
+  throw value.error;
+```
+
+This converts a recoverable stream event into a fatal session error. The error then flows to `fromError()` in `message-v2.ts`, where `AI_JSONParseError` (which extends `Error`) falls through to `NamedError.Unknown` — which is not retryable.
+
+### Root Cause 3: Solve Script Error Misclassification
+
+The external solve script detected "Tool execution aborted" (a side effect — in-flight tool calls are marked "aborted" when the stream errors out) and misclassified it as `UsageLimit`, preventing any further retry at the script level.
+
+## Comparison with Other CLI Agents
+
+| Agent | JSON parse error in SSE | Stream continues? | Source |
+|-------|------------------------|-------------------|--------|
+| **OpenAI Codex** (Rust) | `debug!("Failed to parse SSE event"); continue;` | Yes — skip & continue | `codex-api/src/sse/responses.rs:373-379` |
+| **Gemini CLI** | `throw e;` in `@google/genai` SDK | No — stream terminates | `@google/genai` `Stream.fromSSEResponse()` |
+| **Qwen Code** | SDK JSONL: `return null` (skip). OpenAI path: no safe parse | Partial — only SDK JSONL mode | `sdk-typescript/src/utils/jsonLines.ts` |
+| **OpenCode** (upstream) | Falls to `NamedError.Unknown` | No — session terminates | `session/message-v2.ts:fromError()` |
+| **Vercel AI SDK** | `safeParseJSON()` → `{ type: 'error' }` event | **Yes** — stream continues | `openai-compatible-chat-language-model.ts:417-420` |
+
+**Key insight**: The Vercel AI SDK already handles this gracefully — it emits an error event and continues. The problem is that consumers (this agent and upstream OpenCode) throw on that error event instead of handling it.
+
+OpenAI Codex is the gold standard: skip the corrupted event, log it at debug level, and keep processing the stream.
+
+## Root Cause Summary
+
+| Layer | Responsible Party | Issue |
+|-------|------------------|-------|
+| SSE Stream | OpenCode Zen gateway / Moonshot Kimi K2.5 | Corrupted SSE chunks (concatenated without delimiters) |
+| SSE Parsing | Vercel AI SDK | Handles correctly — emits error event, stream continues |
+| **Error Handling** | **Agent `processor.ts`** | **Throws on all error events — should skip parse errors** |
+| Error Classification | Agent `message-v2.ts` | Falls to `NamedError.Unknown` (not retryable) — moot if we skip |
+| Process Management | Solve script | Misclassifies "Tool execution aborted" as `UsageLimit` |
+
+## Solution: Skip-and-Continue (Codex Approach)
+
+Instead of throwing on stream parse errors, log a warning and continue processing:
+
+```typescript
+// In processor.ts, case 'error' (JSONParseError is exported by '@ai-sdk/provider'):
+case 'error':
+  // Check if this is a stream parse error (malformed SSE from gateway)
+  // These are recoverable — the AI SDK continues the stream after emitting this event
+  // Skip and continue, like OpenAI Codex does
+  if (JSONParseError.isInstance(value.error)) {
+    log.warn(() => ({
+      message: 'skipping malformed SSE event (stream parse error)',
+      errorName: value.error?.name,
+      errorMessage: value.error?.message?.substring(0, 200),
+    }));
+    continue;
+  }
+  throw value.error;
+```
+
+This approach:
+- **Does NOT retry** — the error is not retryable, it's skippable
+- **Does NOT terminate** — the stream continues processing valid chunks
+- **Logs a warning** — visibility into corrupted events for monitoring
+- **Matches Codex pattern** — proven approach in production
+
+## Upstream Issues
+
+### 1. OpenCode Zen / Moonshot Kimi K2.5
+
+**Root cause**: The SSE stream produces corrupted chunks where event boundaries are not properly delimited. This appears to happen at the gateway level (OpenCode Zen at `opencode.ai/zen/v1`) when proxying Kimi K2.5 responses.
+
+### 2. Vercel AI SDK (`vercel/ai`)
+
+**Enhancement request**: While the SDK correctly handles parse errors in the stream transform (enqueues error event, continues), it would be beneficial to:
+- Add `isRetryable` property to `AI_JSONParseError`
+- Provide a configurable `onStreamParseError` callback in provider settings
+- Document the recommended pattern for handling `{ type: 'error' }` events in `fullStream`
+
+### 3. OpenCode (sst/opencode — upstream)
+
+**Same gap**: The upstream OpenCode project has the same `case 'error': throw value.error;` pattern in its `processor.ts`, causing identical session termination on stream parse errors.
+
+## Conclusion
+
+The premature session termination was caused by the following chain of events:
+1. **Origin**: Kimi K2.5 or OpenCode Zen gateway produced corrupted SSE chunks
+2. **SDK**: Vercel AI SDK correctly handled it — emitted error event, continued stream
+3. **Agent**: Threw on the error event, terminating the session
+4. **Process**: Solve script misclassified the downstream effect
+
+The fix is to adopt the OpenAI Codex approach: **skip corrupted SSE events and continue processing the stream**. A single bad chunk should never terminate an entire session.
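The skip-and-continue handling recommended in `analysis.md` can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the agent's actual `processor.ts`: the `StreamEvent` type, `isStreamParseError()` (a stand-in for `JSONParseError.isInstance()`), `consumeEvents()`, and `makeParseError()` are hypothetical names, and the loop is synchronous for brevity where the real one would use `for await` over the SDK stream.

```typescript
// Hypothetical event shape, loosely modelled on the `{ type: 'error', error }`
// stream events described in analysis.md. These are not the AI SDK's real types.
type StreamEvent =
  | { type: 'text-delta'; text: string }
  | { type: 'error'; error: unknown };

// Stand-in for `JSONParseError.isInstance()`: matches on the error name only.
function isStreamParseError(error: unknown): error is Error {
  return error instanceof Error && error.name === 'AI_JSONParseError';
}

interface ConsumeResult {
  text: string; // concatenated text from all valid events
  skipped: string[]; // truncated messages of the parse errors that were skipped
}

function consumeEvents(events: Iterable<StreamEvent>): ConsumeResult {
  const result: ConsumeResult = { text: '', skipped: [] };
  for (const event of events) {
    switch (event.type) {
      case 'text-delta':
        result.text += event.text;
        break;
      case 'error':
        if (isStreamParseError(event.error)) {
          // Recoverable: record it and move on to the next event.
          result.skipped.push(event.error.message.substring(0, 200));
          continue; // `continue` targets the enclosing `for` loop, not the switch
        }
        throw event.error; // Any other error still terminates the session.
    }
  }
  return result;
}

// Builds an error that the stand-in predicate above recognises.
function makeParseError(message: string): Error {
  const error = new Error(message);
  error.name = 'AI_JSONParseError';
  return error;
}

// One corrupted event in the middle of otherwise valid output.
const demo = consumeEvents([
  { type: 'text-delta', text: 'Hel' },
  { type: 'error', error: makeParseError('JSON parsing failed') },
  { type: 'text-delta', text: 'lo' },
]);
```

Here `demo.text` keeps the text from both valid events and `demo.skipped` holds one entry, while an error with any other name is still rethrown.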
diff --git a/docs/case-studies/issue-169/cli-agents-comparison.md b/docs/case-studies/issue-169/cli-agents-comparison.md
@@ -0,0 +1,99 @@
+# CLI Agent Comparison: SSE Stream Parse Error Handling
+
+Analysis of how major CLI agents handle SSE stream parse errors (like `AI_JSONParseError`),
+conducted as part of the investigation for [issue #169](https://github.com/link-assistant/agent/issues/169).
+
+## Summary Table
+
+| Feature | OpenAI Codex | Gemini CLI | Qwen Code | OpenCode | This Agent (fix) |
+|---------|-------------|------------|-----------|----------|-----------------|
+| **Language** | Rust | TypeScript | TypeScript | TypeScript | TypeScript |
+| **SSE Parser** | `eventsource_stream` crate | `@google/genai` SDK `SSEDecoder` | Delegates to `openai` npm | `eventsource-parser` (via AI SDK) | `eventsource-parser` (via AI SDK) |
+| **JSON parse error in SSE** | Skip & continue | throw in SDK (gap) | SDK JSONL: skip. OpenAI path: no safe parse | Falls to UnknownError (gap) | **Skip & continue (Codex approach)** |
+| **Stream continues?** | Yes | No | Partial (SDK JSONL only) | No | **Yes** |
+
+## Detailed Analysis
+
+### OpenAI Codex CLI — Best Practice
+
+**Approach**: Skip-and-continue for individual bad events.
+
+**Key code** (`codex-api/src/sse/responses.rs:373-379`):
+```rust
+let event: ResponsesStreamEvent = match serde_json::from_str(&sse.data) {
+    Ok(event) => event,
+    Err(e) => {
+        debug!("Failed to parse SSE event: {e}, data: {}", &sse.data);
+        continue;  // Skip and continue processing the stream
+    }
+};
+```
+
+**Design philosophy**:
+- Individual SSE events that fail JSON parsing are **skipped** (logged at debug level)
+- SSE framing errors (protocol-level) terminate the stream → triggers stream retry
+- A single corrupted chunk should never terminate an entire session
+
+### Gemini CLI
+
+**Approach**: Delegates to `@google/genai` SDK which throws on JSON parse errors.
+
+**Key finding**: The `@google/genai` SDK's `Stream.fromSSEResponse()` catches JSON parse errors,
+logs them, and **re-throws** — the error propagates up and terminates the stream.
+`SyntaxError` from `JSON.parse` is neither an `InvalidStreamError` nor a retryable error.
+
+**Gap**: Same as OpenCode — JSON parse errors during SSE consumption terminate the stream.
+
+### Qwen Code
+
+**Approach**: Two different paths with different behavior.
+
+**SDK JSONL transport** (`packages/sdk-typescript/src/utils/jsonLines.ts`):
+```typescript
+export function parseJsonLineSafe(line, context) {
+  try {
+    return JSON.parse(line);
+  } catch (error) {
+    logger.warn('Failed to parse JSON line, skipping:', line.substring(0, 100));
+    return null;  // Caller skips
+  }
+}
+```
+
+**OpenAI-compatible streaming**: Delegates to `openai` npm package. No safe parse for SSE events.
+
+**Gap**: The safe parse only protects the SDK's own JSON Lines mode, not the OpenAI-compatible path.
+
+### OpenCode (sst/opencode) — Upstream
+
+**Approach**: AI SDK error classification; throws on all stream error events.
+
+**Key finding**: `processor.ts` has `case 'error': throw value.error;` — identical to our bug.
+`AI_JSONParseError` falls through to `NamedError.Unknown` in `fromError()` — not retried.
+
+**Critical nuance**: The Vercel AI SDK actually emits the error as a stream event and
+**continues the stream transform**. It is the consumer (`processor.ts`) that terminates
+the session by throwing on the error event.
+
+### This Agent — Fix Applied
+
+**Approach**: Skip-and-continue (Codex approach).
+
+```typescript
+case 'error':
+  if (JSONParseError.isInstance(value.error)) {
+    log.warn(() => ({
+      message: 'skipping malformed SSE event (stream parse error)',
+      errorName: value.error?.name,
+      errorMessage: value.error?.message?.substring(0, 200),
+    }));
+    continue;  // Skip and continue, like Codex
+  }
+  throw value.error;  // Other errors still terminate
+```
+
+**Why skip, not retry**:
+- The Vercel AI SDK continues the stream after emitting the error event
+- Subsequent chunks may be valid — no need to restart the entire stream
+- Retrying would lose all progress made before the corrupted event
+- This matches the OpenAI Codex pattern — proven in production
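The producer-side behaviour that `cli-agents-comparison.md` attributes to the Vercel AI SDK (emit an error event for a bad payload, then keep going) can be illustrated with a small standalone parser. This is a sketch under stated assumptions, not the SDK's code: `ParsedEvent`, `parseSseBody()`, and the sample body with its synthetic `chatcmpl-aaa` / `chatcmpl-bbb` / `chatcmpl-ccc` IDs are hypothetical.

```typescript
type ParsedEvent =
  | { type: 'chunk'; value: unknown }
  | { type: 'error'; error: Error; raw: string };

// Split a raw SSE body into events (separated by a blank line) and parse each
// `data:` payload. A payload that is not valid JSON becomes an error event
// instead of an exception, so later events are still delivered.
function parseSseBody(body: string): ParsedEvent[] {
  const events: ParsedEvent[] = [];
  for (const block of body.split(/\r?\n\r?\n/)) {
    const payload = block
      .split(/\r?\n/)
      .filter((line) => line.startsWith('data:'))
      .map((line) => line.slice('data:'.length).replace(/^ /, ''))
      .join('\n');
    // Skip empty blocks and the OpenAI-style end-of-stream sentinel.
    if (payload === '' || payload === '[DONE]') continue;
    try {
      events.push({ type: 'chunk', value: JSON.parse(payload) });
    } catch (cause) {
      const error = cause instanceof Error ? cause : new Error(String(cause));
      events.push({ type: 'error', error, raw: payload });
    }
  }
  return events;
}

// Synthetic body: the first event reproduces the shape of the logged
// corruption (a `data:` marker embedded in a truncated payload).
const sampleBody = [
  'data: {"id":"chatcmpl-aaadata:{"id":"chatcmpl-bbb","choices":[]}',
  '',
  'data: {"id":"chatcmpl-ccc","choices":[]}',
  '',
  'data: [DONE]',
  '',
  '',
].join('\n');

const parsed = parseSseBody(sampleBody);
```

With this input, `parsed` contains an error event for the corrupted payload followed by a normal chunk for the next event. That is why skipping is enough: nothing after the bad event is lost.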
diff --git a/docs/case-studies/issue-169/filed-issues.md b/docs/case-studies/issue-169/filed-issues.md
@@ -0,0 +1,46 @@
+# Filed Issues for Case Study #169
+
+Issues filed as part of the investigation into premature session termination
+due to SSE stream corruption and unhandled `AI_JSONParseError`.
+
+## Filed Issues
+
+1. **Vercel AI SDK** — [vercel/ai#12595](https://github.com/vercel/ai/issues/12595)
+   - Title: AI_JSONParseError should support retry for mid-stream parse errors
+   - Requests `isRetryable` property, `onStreamParseError` callback, or stream-level retry config
+   - **Update**: The SDK already handles this correctly (emits error event, stream continues).
+     The real issue is that consumers throw on error events. Updated draft recommends
+     documenting the skip-and-continue pattern.
+
+2. **OpenCode (upstream)** — [anomalyco/opencode#13579](https://github.com/anomalyco/opencode/issues/13579)
+   - Title: AI_JSONParseError during SSE streaming is not retried (falls to NamedError.Unknown)
+   - Reports the same bug in the upstream project
+   - **Update**: The fix should be skip-and-continue (like Codex), not retry.
+     `processor.ts` should detect `JSONParseError` in `case 'error'` and `continue`
+     instead of `throw value.error`.
+
+3. **Kilo AI Gateway** — ~~[Kilo-Org/kilocode#5875](https://github.com/Kilo-Org/kilocode/issues/5875)~~
+   - **Correction**: Kilo Gateway was NOT involved in this incident.
+     The actual gateway was OpenCode Zen (`opencode.ai/zen/v1`).
+     The `kimi-k2.5-free` model resolved to `opencode` provider, not `kilo`.
+
+## Corrected Provider Chain
+
+```
+Agent (Bun) → OpenCode Zen (opencode.ai/zen/v1) → Moonshot Kimi K2.5 API
+```
+
+Evidence from logs:
+```
+[2026-02-14T08:29:06.525Z] "providerID": "opencode"
+[2026-02-14T08:29:06.525Z] "modelID": "kimi-k2.5-free"
+```
+
+The OpenCode provider's base URL is `https://opencode.ai/zen/v1` (from models.dev database).
+
+## Related Existing Issues
+
+- [anomalyco/opencode#7692](https://github.com/anomalyco/opencode/issues/7692): JSON Parse Error with GLM-4.7 stream chunks concatenated
+- [anomalyco/opencode#10967](https://github.com/anomalyco/opencode/issues/10967): Error Writing Large Files with Kimi K2.5
+- [vercel/ai#4099](https://github.com/vercel/ai/issues/4099): streamText error handling
+- [sgl-project/sglang#8613](https://github.com/sgl-project/sglang/issues/8613): Kimi-K2 incomplete content during streaming
diff --git a/docs/case-studies/issue-169/kilo-gateway-issue-draft.md b/docs/case-studies/issue-169/kilo-gateway-issue-draft.md
@@ -0,0 +1,29 @@
+# Note: Kilo AI Gateway was NOT involved in this incident
+
+The original analysis incorrectly attributed the SSE corruption to Kilo AI Gateway (`api.kilo.ai`).
+
+After re-analysis of the logs, the actual provider chain was:
+
+```
+Agent → OpenCode Zen (opencode.ai/zen/v1) → Moonshot Kimi K2.5
+```
+
+The `kimi-k2.5-free` model was resolved to the `opencode` provider (not `kilo`) via `resolveShortModelName()` in `provider.ts:1452`, which prefers OpenCode for shared models.
+
+The Kilo Gateway issue draft is preserved here for reference, but the SSE corruption issue should be reported to OpenCode Zen / Moonshot instead.
+
+---
+
+## Original Draft (for reference — DO NOT file to Kilo)
+
+### SSE stream corruption when proxying Kimi K2.5
+
+The SSE stream returned corrupted data where two events were concatenated:
+
+```json
+{
+  "text": "{\"id\":\"chatcmpl-jQugNdata:{\"id\":\"chatcmpl-iU6vkr3fItZ0Y4rTCmIyAnXO\",...}"
+}
+```
+
+This issue was observed through OpenCode Zen, not Kilo Gateway.
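When reporting this corruption to the gateway, it may help to confirm mechanically that a failing payload has the "embedded `data:` marker" shape shown in the draft. The following is a rough diagnostic sketch, not part of the agent: `describeConcatenation()`, its return shape, and the synthetic payload are hypothetical, and the heuristic assumes the second event starts with a JSON object.

```typescript
interface ConcatenationReport {
  truncatedFirst: string; // what survived of the first event
  second: string; // text following the embedded `data:` marker
  secondIsValidJson: boolean;
}

// Returns a report when the payload fails JSON.parse and contains a `data:`
// marker immediately followed by the start of a JSON object; otherwise null.
function describeConcatenation(payload: string): ConcatenationReport | null {
  try {
    JSON.parse(payload);
    return null; // Valid JSON: nothing to report, even if it mentions "data:".
  } catch {
    // Fall through to the heuristic below.
  }
  const match = /data:\s*(?=\{)/.exec(payload);
  if (match === null) return null;
  const second = payload.slice(match.index + match[0].length);
  let secondIsValidJson = true;
  try {
    JSON.parse(second);
  } catch {
    secondIsValidJson = false;
  }
  return {
    truncatedFirst: payload.slice(0, match.index),
    second,
    secondIsValidJson,
  };
}

// Synthetic payload with the same shape as the logged one.
const report = describeConcatenation(
  '{"id":"chatcmpl-aaadata:{"id":"chatcmpl-bbb","choices":[]}',
);
```

For this payload the report separates the truncated first fragment from a second fragment that parses on its own, which is the evidence a gateway maintainer would need to locate the lost event boundary.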