
fix: use native fetch for Ollama embedding to ensure AbortController works#362

Closed
jlin53882 wants to merge 2 commits into CortexReach:master from jlin53882:fix/ollama-native-fetch-abort

Conversation

@jlin53882 (Contributor) commented Mar 26, 2026

Problem

When using Ollama as the embedding provider, the `embedQuery` timeout (`EMBED_TIMEOUT_MS` = 10s) does not reliably abort stalled Ollama HTTP requests. This causes the gateway-level `autoRecallTimeoutMs` (120s) to fire instead.

Evidence from gateway logs:
```
20:48:46 auto-recall query truncated from 1233 to 1000 chars
[120 seconds of silence]
20:50:46 auto-recall timed out after 120000ms
```

CPU ~20%, Ollama CPU ~0% — signature of a hanging HTTP connection.

Root Cause

`embedder.ts` uses the OpenAI SDK to call Ollama. The SDK's HTTP client in Node.js does not reliably abort the underlying TCP connection when `AbortController.abort()` is called. Ollama keeps processing and the socket hangs until the 120s gateway timeout fires.

Fix

Use **native `fetch`** for Ollama endpoints. Node.js 18+ native `fetch` correctly respects `AbortController`: the TCP connection is properly closed when the signal fires.
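The abort behavior is easy to check in isolation. Below is a minimal sketch (my own, not the PR's code; the endpoint path and model name are only illustrative) that starts a local server which accepts the TCP connection but never responds, then aborts a native `fetch` against it:

```javascript
// Sketch (not the PR's code): verify that Node 18+ native fetch honors
// AbortController against a server that accepts the connection but never
// replies -- the same stall pattern described in this PR.
import http from "node:http";

// Server that reads the request but intentionally never calls res.end().
const server = http.createServer(() => {});
await new Promise((resolve) => server.listen(0, "127.0.0.1", resolve));
const { port } = server.address();

const controller = new AbortController();
setTimeout(() => controller.abort(), 200).unref(); // stand-in for EMBED_TIMEOUT_MS

const started = Date.now();
let abortedCleanly = false;
try {
  await fetch(`http://127.0.0.1:${port}/api/embeddings`, {
    method: "POST",
    body: JSON.stringify({ model: "mxbai-embed-large", prompt: "probe" }),
    signal: controller.signal,
  });
} catch (err) {
  // Native fetch rejects with an AbortError and tears down the socket.
  abortedCleanly = err.name === "AbortError";
}
const elapsedMs = Date.now() - started;
server.closeAllConnections?.();
server.close();

console.log(`aborted=${abortedCleanly} after ${elapsedMs}ms`);
```

On Node 18+ the promise rejects within a few milliseconds of the signal firing, which is consistent with the ~208–215ms abort times reported in the PR description.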

Added

  • `isOllamaProvider()`: detects `localhost:11434` / `127.0.0.1:11434` / `/ollama` URLs
  • `embedWithNativeFetch()`: calls Ollama via native `fetch` with proper signal handling
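A plausible shape of the endpoint check (hypothetical; the real `isOllamaProvider()` in `embedder.ts` may differ in detail):

```javascript
// Hypothetical sketch of the URL detection described above; the real
// isOllamaProvider() in embedder.ts may differ.
function isOllamaProvider(baseUrl) {
  let url;
  try {
    url = new URL(baseUrl);
  } catch {
    return false; // not a parseable URL -> not Ollama
  }
  const isDefaultOllamaHost =
    url.host === "localhost:11434" || url.host === "127.0.0.1:11434";
  return isDefaultOllamaHost || url.pathname.includes("/ollama");
}

console.log(isOllamaProvider("http://127.0.0.1:11434/v1")); // true
console.log(isOllamaProvider("http://localhost:11434")); // true
console.log(isOllamaProvider("https://example.com/ollama/v1")); // true
console.log(isOllamaProvider("https://api.openai.com/v1")); // false
```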

Modified

`embedWithRetry()` now routes Ollama URLs through `embedWithNativeFetch()` instead of the OpenAI SDK.
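The routing change can be sketched as follows. The helpers here are stubs and the names simply mirror the PR description, not the actual `embedder.ts` source; retry logic is omitted:

```javascript
// Hypothetical routing sketch with stubbed helpers; the real
// embedWithRetry() also handles retries and error wrapping.
const isOllamaProvider = (baseUrl) =>
  /(?:localhost|127\.0\.0\.1):11434|\/ollama(?:\/|$)/.test(baseUrl);

async function embedWithNativeFetch(baseUrl, input) {
  return { path: "native-fetch", baseUrl, input }; // stub: real code calls fetch()
}

async function embedWithOpenAiSdk(baseUrl, input) {
  return { path: "openai-sdk", baseUrl, input }; // stub: real code uses the SDK
}

async function embedWithRetry(baseUrl, input) {
  // Ollama endpoints bypass the OpenAI SDK so that AbortController
  // actually tears down the TCP connection on timeout.
  return isOllamaProvider(baseUrl)
    ? embedWithNativeFetch(baseUrl, input)
    : embedWithOpenAiSdk(baseUrl, input);
}

console.log((await embedWithRetry("http://127.0.0.1:11434/v1", "q")).path); // native-fetch
console.log((await embedWithRetry("https://api.openai.com/v1", "q")).path); // openai-sdk
```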

Test

  • Added \ estOllamaAbortWithNativeFetch\ (Test 8) to \cjk-recursion-regression.test.mjs\
  • Added \ est/pr354-standalone.mjs\ for quick verification
  • Added \ est/pr354-30iter.mjs\ for stress testing

30 iterations — all passed ✅

Abort time consistent at ~208–215ms (signal fires at 200ms).

Fixes #361

…works

Root cause: OpenAI SDK HTTP client does not reliably abort Ollama
TCP connections when AbortController.abort() fires in Node.js. This
causes stalled sockets that hang until the gateway-level 120s timeout.

Fix: Add isOllamaProvider() to detect localhost:11434 endpoints, and
embedWithNativeFetch() using Node.js 18+ native fetch instead of the
OpenAI SDK. Native fetch properly closes TCP connections on abort.

Added Test 8 (testOllamaAbortWithNativeFetch) to cjk-recursion-regression
test suite. Also added standalone test (pr354-standalone.mjs) and
30-iteration stress test (pr354-30iter.mjs).

Fixes CortexReach#361.

@AliceLJY (Collaborator)

Hey @jlin53882, the cli-smoke check is failing on this PR. Could you take a look and push a fix? The version-sync check passes, so it's just the smoke test that needs attention. Thanks!

@AliceLJY (Collaborator)

CI Failure Diagnosis

What's failing

The cli-smoke job fails on Test 8 (testOllamaAbortWithNativeFetch) in test/cjk-recursion-regression.test.mjs:

```
Test 8: Ollama endpoint uses native fetch and abort propagates correctly (PR354 fix)
AssertionError [ERR_ASSERTION]: Error should come from Ollama native fetch path, got:
  Embedding provider unreachable (fetch failed). Verify the endpoint is reachable
  at http://127.0.0.1:11434/v1 and that model "mxbai-embed-large" is available.
```

Why it's failing

The test calls embedder.embedPassage("ollama abort test probe") against http://127.0.0.1:11434/v1 — but there is no Ollama running in CI. The native fetch correctly fails with a connection error, but then embedSingle() catches it and wraps it through formatEmbeddingProviderError() (lines ~769 and ~896 in embedder.ts), which produces:

Embedding provider unreachable (fetch failed). Verify the endpoint is reachable at ...

The assertion at line 282 expects the error message to match:

```js
/ollama embedding failed|404|Failed to generate embedding from Ollama/i.test(errorCaught.message)
```

None of those patterns match "Embedding provider unreachable". The error from embedWithNativeFetch() ("Ollama embedding failed: ...") gets re-wrapped by the outer embedSingle() error handling before reaching the test.

The fix

In test/cjk-recursion-regression.test.mjs, update the assertion regex at line ~280-283 to also match the formatEmbeddingProviderError wrapper output. Change:

```js
assert.ok(
  /ollama embedding failed|404|Failed to generate embedding from Ollama/i.test(errorCaught.message),
  "Error should come from Ollama native fetch path, got: " + errorCaught.message
);
```

to:

```js
assert.ok(
  /ollama embedding failed|404|Failed to generate embedding from Ollama|Embedding provider unreachable/i.test(errorCaught.message),
  "Error should come from Ollama native fetch path, got: " + errorCaught.message
);
```

This accounts for the fact that in CI (no Ollama running), the connection-refused error from native fetch gets wrapped by formatEmbeddingProviderError into "Embedding provider unreachable (...)", which is the correct behavior — it proves the Ollama native fetch path is being used, the error just has a different shape than a 404.

Note on the standalone test files

test/pr354-standalone.mjs and test/pr354-30iter.mjs also assume a real Ollama is running at 127.0.0.1:11434. They aren't in the npm test script so they don't break CI, but they would fail in any environment without Ollama. Consider adding a skip condition or removing them to avoid confusion.
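One hedged way to implement the suggested skip condition (a sketch only; the probe endpoint and timeout are assumptions, not the project's code) is to check whether Ollama is reachable before running the standalone tests:

```javascript
// Sketch of a skip guard for the standalone tests: probe the (assumed)
// default Ollama endpoint and bail out gracefully when it is unreachable.
const OLLAMA_URL = "http://127.0.0.1:11434";

async function ollamaReachable(timeoutMs = 1000) {
  try {
    // Ollama's root endpoint answers plain GET requests when running.
    const res = await fetch(OLLAMA_URL, { signal: AbortSignal.timeout(timeoutMs) });
    return res.ok;
  } catch {
    return false; // connection refused, DNS failure, or timeout
  }
}

const reachable = await ollamaReachable();
if (!reachable) {
  console.log(`SKIP: no Ollama at ${OLLAMA_URL}; skipping abort tests`);
} else {
  console.log("Ollama reachable; running abort tests");
}
```

This keeps the files runnable locally while making them a no-op in CI environments without Ollama.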

@jlin53882 (Contributor, Author)

Replaced by PR #383: rebased, test dependencies cleaned up, and @AliceLJY's assertion fix applied.

@jlin53882 (Contributor, Author)

Thanks for the detailed diagnosis @AliceLJY!

The follow-up has been addressed in a clean rebase at PR #383.

Please take a look at PR #383 when you have time. Thanks!

@jlin53882 (Contributor, Author)

PR Chain & Related Issues

This PR (#362) is part of a chain of work. Here's the full relationship:

Core Fix: Ollama Native Fetch

Discovered During Analysis

While investigating CI failures on #383, two additional issues were found and addressed:

Other Related PRs (recent merges)

Recommended Merge Order

  1. Merge #385 ("fix test: update hardcoded seed date to stay within 14-day age limit") first; it fixes a pre-existing CI failure
  2. Merge #383 ("fix: use native fetch for Ollama embedding to ensure AbortController works"), the main Ollama fix; its CI should pass once #385 is in

@AliceLJY — the cli-smoke failure on #383 is caused by the pre-existing reflection-bypass-hook test failure (#384/#385). Once #385 is merged, please re-trigger CI on #383. Thanks!

@rwmjhb (Collaborator) commented Mar 28, 2026

Superseded by #383.

@rwmjhb rwmjhb closed this Mar 28, 2026

Development

Successfully merging this pull request may close these issues.

Bug: Ollama embedding AbortController doesn't abort — causes 120s gateway timeout