
fix: add NVIDIA NIM provider profile for input_type embedding field #268

Open

vicjayjay wants to merge 6 commits into CortexReach:master from vicjayjay:fix/nvidia-nim-provider-profile

Conversation


@vicjayjay vicjayjay commented Mar 19, 2026

Summary

Supersedes #219. Implements the fix properly using the provider profile architecture from #216.

  • Adds "nvidia" to EmbeddingProviderProfile type
  • Detects NVIDIA NIM via *.nvidia.com base URLs, nvidia/* model prefixes, and nv-embed* model names
  • Sets taskField: "input_type" with value mapping (retrieval.query → query, retrieval.passage → passage)
  • Supports encoding_format: "float" and forwards dimensions for NVIDIA dynamic models
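The bullets above can be sketched as a profile object. This is an assumption inferred from the PR description — the actual field names and types in src/embedder.ts may differ:

```typescript
// Hypothetical shape of a provider profile entry, inferred from the PR
// description; the real definitions in src/embedder.ts may differ.
interface EmbeddingProviderCapabilities {
  taskField: string | null;            // request field carrying the task hint
  taskValues: Record<string, string>;  // config task → provider value mapping
  encodingFormat: string | null;
  dimensionsField: string | null;      // request field for the vector size
}

const nvidiaProfile: EmbeddingProviderCapabilities = {
  taskField: "input_type",             // NVIDIA NIM rejects `task`
  taskValues: {
    "retrieval.query": "query",
    "retrieval.passage": "passage",
  },
  encodingFormat: "float",
  dimensionsField: "dimensions",       // forwarded for dynamic models
};
```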

Changes

  • src/embedder.ts: Added the NVIDIA provider profile, kept dimensions forwarding for dynamic models, and simplified the NVIDIA baseURL detection regex to /\.nvidia\.com/i
  • test/nvidia-nim-provider-profile.test.mjs: 6 automated tests covering all maintainer requirements:
    • NVIDIA requests send input_type (not task)
    • retrieval.query/retrieval.passage are mapped correctly
    • NVIDIA is detected from an .nvidia.com baseURL
    • Non-NVIDIA providers (Jina) still send task
    • Generic providers send neither task nor input_type

Test plan

  • All 6 NVIDIA profile tests pass (node --test test/nvidia-nim-provider-profile.test.mjs)
  • Existing embedder tests pass (node test/embedder-error-hints.test.mjs)
  • CI pipeline

🤖 Generated with Claude Code


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ad8c5ddd19

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

src/embedder.ts Outdated
"query": "query",
"passage": "passage",
},
dimensionsField: null,


P1 Badge Keep dimensions forwarding for dynamic NVIDIA models

For the new nvidia profile, setting dimensionsField to null means buildPayload() never forwards embedding.dimensions (src/embedder.ts:646-647), even though the constructor still treats that config as the expected vector size (src/embedder.ts:470). NVIDIA's OpenAI-compatible docs explicitly use a dimensions parameter for dynamic models such as nvidia/llama-3.2-nv-embedqa-1b-v2; with a config like dimensions: 384, this change now requests the model's default 2048-d output and immediately fails with Embedding dimension mismatch, or creates the wrong-sized table.
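The forwarding logic at issue can be sketched as follows — a simplified stand-in for buildPayload(), not the actual src/embedder.ts code:

```typescript
// Simplified sketch of the forwarding behavior the reviewer describes;
// the real buildPayload() in src/embedder.ts is more involved.
type Caps = { dimensionsField: string | null };

function buildPayload(
  caps: Caps,
  input: string[],
  model: string,
  dimensions?: number,
): Record<string, unknown> {
  const payload: Record<string, unknown> = { model, input };
  // With dimensionsField: null the configured size is silently dropped,
  // so a dynamic model returns its default width (e.g. 2048 instead of
  // the configured 384) and the downstream size check fails.
  if (caps.dimensionsField && dimensions != null) {
    payload[caps.dimensionsField] = dimensions;
  }
  return payload;
}
```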

Useful? React with 👍 / 👎.

Collaborator

@rwmjhb rwmjhb left a comment


Nice work — the provider profile approach following #216 is clean and extensible. Two small things:

Suggestion: update PR description

The summary still says "disables dimensionsField", but the second commit intentionally changed it to "dimensions" for dynamic models. Worth syncing the description so it doesn't confuse future readers.

Nits:

  • Detection regex /\.nvidia\.com|integrate\.api\.nvidia\.com/i — the second alternative is already matched by the first, could simplify to /\.nvidia\.com/i
  • Consider adding a test that detects NVIDIA via a .nvidia.com baseURL (currently all tests use model name patterns)
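The first nit is easy to verify directly — the shorter pattern already matches the host the dropped alternative targeted (sample URLs are illustrative):

```typescript
// /\.nvidia\.com/i alone covers the host the removed alternative matched.
const simplified = /\.nvidia\.com/i;

const samples = [
  "https://integrate.api.nvidia.com/v1", // matched by the removed branch too
  "https://ai.api.nvidia.com/v1",
];

for (const url of samples) {
  if (!simplified.test(url)) throw new Error(`expected match: ${url}`);
}
```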

Author

@rwmjhb synced this up.

I updated the PR description so it now reflects that NVIDIA dynamic models still forward dimensions, simplified the detection regex to /\.nvidia\.com/i, and added a .nvidia.com baseURL detection test.

The follow-up commits are:

  • 7871fa0 Add NVIDIA detection test and update imports
  • 1507994 Refactor NVIDIA compatibility check regex

I also re-checked the earlier dimensions regression and didn't find another leftover behavior bug beyond the description drift and the two nits you called out.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 15079944d9


src/embedder.ts Outdated
return "voyage-compatible";
}

if (/\.nvidia\.com/i.test(base) || /^nvidia\//i.test(model) || /^nv-embed/i.test(model)) {


P2 Badge Restrict the NVIDIA profile to models that accept input_type

This detection is broader than the capability it enables. Any *.nvidia.com endpoint, nvidia/* prefix, or nv-embed* model now gets the nvidia profile, and buildPayload() will therefore add input_type whenever embedding.taskQuery or taskPassage is set. That also sweeps in NVIDIA-hosted embeddings such as BAAI/bge-m3, snowflake/arctic-embed-l, nvidia/nv-embed-v1, and nvidia/nv-embedcode-7b-v1, whose model docs describe plain text (or task-specific instructions) rather than the retriever-style query/passage contract. Those configs previously behaved as generic OpenAI-compatible embeddings, so this heuristic can turn valid requests into 400s or wrong embeddings for non-retriever models.


Author

Follow-up commits are pushed and the PR is ready for re-check. There are currently 3 workflow runs awaiting maintainer approval on this PR before they can execute: CI, Claude Code Review, and Claude Code.

@AliceLJY
Collaborator

I've looked at the follow-up commits; the changes look fine.

One thing to sync up on, though: we released v1.1.0-beta.10 yesterday, and main has fairly large changes (OpenClaw 2026.3+ hook adaptation plus several fixes). Please rebase onto the latest main, confirm there are no conflicts, and once CI passes we'll merge.

vicjayjay and others added 4 commits March 24, 2026 15:21
NVIDIA NIM rejects the `task` field and requires `input_type` instead.
This adds a proper "nvidia" provider profile following the architecture
introduced in CortexReach#216, rather than hardcoding URL checks in buildPayload.

Detection: matches *.nvidia.com base URLs, nvidia/* model prefixes,
and nv-embed* model names.

Capabilities: sends input_type instead of task, maps retrieval.query →
query and retrieval.passage → passage, supports encoding_format: float.

Includes 5 automated tests covering:
- NVIDIA sends input_type (not task)
- retrieval.passage → passage value mapping
- nvidia/ model prefix detection
- Jina still sends task field
- Generic providers send neither

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
NVIDIA NIM's OpenAI-compatible API supports a `dimensions` parameter
for dynamic models like nvidia/llama-3.2-nv-embedqa-1b-v2. Setting
dimensionsField to null prevented buildPayload() from forwarding the
configured dimensions, causing dimension mismatch errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added a test case to detect NVIDIA from a .nvidia.com baseURL and modified import to include formatEmbeddingProviderError.
Simplifies the NVIDIA provider baseURL detection pattern now that .nvidia.com already covers integrate.api.nvidia.com.
@vicjayjay vicjayjay force-pushed the fix/nvidia-nim-provider-profile branch from 1507994 to 3f26c60 on March 24, 2026 at 22:21
@vicjayjay
Author

Rebased onto latest master (5c477c1). One trivial conflict in getProviderLabel() — master added an Azure OpenAI label line, our branch added the NVIDIA NIM line. Kept both.

Commits are clean on top of v1.1.0-beta.10. Waiting for maintainer CI approval to confirm green.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3f26c60be2


src/embedder.ts Outdated
Comment on lines +245 to +248
return "voyage-compatible";
}

if (/\.nvidia\.com/i.test(base) || /^nvidia\//i.test(model) || /^nv-embed/i.test(model)) {


P2 Badge Prefer NVIDIA baseURL over model-prefix profile matches

In detectEmbeddingProviderProfile, the new NVIDIA check runs after the jina- and voyage* model-prefix checks, so a .nvidia.com endpoint using a model name like voyage-... or jina-... will be classified as Voyage/Jina instead of NVIDIA. In that configuration, buildPayload() sends the wrong provider-specific fields (output_dimension/task semantics) rather than NVIDIA's dimensions/input_type, which can cause request rejections or silently incorrect task behavior on NIM.


@AliceLJY
Collaborator

Thanks for the rebase, commits look clean on top of beta.10.

Before we merge, there are two things I'd like addressed — embedding provider detection is load-bearing infrastructure and we need to be careful here.

1. Detection scope is too broad

Right now any *.nvidia.com endpoint gets the nvidia profile, which means buildPayload() will inject input_type into every request hitting that host. NVIDIA also serves non-retriever models on .nvidia.com (e.g. NV-CLIP, future multimodal endpoints) that don't expect input_type — this could silently break those requests or return 400s that are very hard for users to diagnose.

Ask: Tighten the detection. Either:

  • Narrow the host match to known retriever-specific subdomains/paths, or
  • Keep the broad host match but only inject input_type when taskQuery/taskPassage is explicitly configured (so non-retriever users who don't set those fields are unaffected)

2. Detection ordering — host vs model prefix precedence

The NVIDIA check currently runs after Jina/Voyage model-prefix checks. So a .nvidia.com endpoint serving a model named voyage-xxx or jina-xxx would be classified as Voyage/Jina instead of NVIDIA, sending wrong provider-specific fields (task/output_dimension instead of input_type/dimensions).

If the intent is "NVIDIA host semantics win", the .nvidia.com baseURL check should come before model-prefix heuristics. If model prefix should win, document why.

Ask: Either move the baseURL portion of the NVIDIA check above the model-prefix providers, or add a comment explaining the intended precedence.

3. Negative test coverage

The current 6 tests are all happy-path. Please add:

  • .nvidia.com baseURL + conflicting model prefix (e.g. jina-xxx) → verify correct profile wins
  • .nvidia.com baseURL + no taskQuery/taskPassage configured → verify no input_type is injected (if you go with option B in point 1)

None of these are hard to fix and the core implementation is solid. Just want to make sure we don't ship edge cases that are painful to debug for users. CI approval is pending — I'll approve the workflow runs so they can start in parallel while you work on the above.

Address PR review feedback from @AliceLJY:

1. Detection ordering: split host-based detection (runs first) from
   model-prefix fallback. .nvidia.com host now takes precedence over
   jina-/voyage model prefixes, preventing misclassification of models
   like jina-xxx hosted on NVIDIA endpoints.

2. Detection scope safety: the broad .nvidia.com match is safe because
   buildPayload() only injects input_type when the user explicitly
   configures taskQuery/taskPassage. Non-retriever models (NV-CLIP etc.)
   are unaffected. Added comment documenting this guard.

3. Negative tests:
   - .nvidia.com + jina- model prefix → NVIDIA wins
   - NVIDIA profile without taskQuery/taskPassage → no input_type

All 8 NVIDIA tests + existing embedder tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vicjayjay
Author

@AliceLJY Addressed all three points in e6e705d:

1. Detection scope — Already safe by design. buildPayload() only injects input_type when both the profile defines taskField AND the user explicitly configures taskQuery/taskPassage (line 658: if (this._capabilities.taskField && task)). Users hosting NV-CLIP or other non-retriever models on .nvidia.com who don't set those fields are unaffected. Added a comment documenting this guard.

2. Detection ordering — Split detection into two phases:

  • Host-based checks first (.nvidia.com, api.jina.ai, api.voyageai.com) — endpoint owner semantics win
  • Model-prefix fallback second (jina-, voyage, nvidia/, nv-embed) — only when host didn't match

A .nvidia.com endpoint serving a jina-xxx model now correctly gets the NVIDIA profile.

3. Negative tests — Added 2 new tests (8 total, all passing):

  • .nvidia.com baseURL + jina-embeddings-v3 model → NVIDIA wins (verifies precedence)
  • nvidia/nv-clip-v1 model without taskQuery/taskPassage → no input_type injected (verifies safety)

Existing embedder-error-hints tests also pass — no regressions.
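The two-phase detection and the input_type guard described above can be sketched roughly as follows. detectProfile and maybeTaskField are illustrative names mirroring the quoted snippets and the PR discussion, not the actual source:

```typescript
// Sketch of host-first detection as of this round of review. Real code differs.
type Profile = "nvidia" | "jina" | "voyage-compatible" | "generic";

function detectProfile(base: string, model: string): Profile {
  // Phase 1: host-based checks — endpoint owner semantics win.
  if (/api\.jina\.ai/i.test(base)) return "jina";
  if (/api\.voyageai\.com/i.test(base)) return "voyage-compatible";
  if (/\.nvidia\.com/i.test(base)) return "nvidia";

  // Phase 2: model-prefix fallback — only when no host matched.
  if (/^jina-/i.test(model)) return "jina";
  if (/^voyage/i.test(model)) return "voyage-compatible";
  if (/^nvidia\//i.test(model) || /^nv-embed/i.test(model)) return "nvidia";

  return "generic";
}

// Safety guard: input_type is only injected when a task hint is configured,
// so non-retriever models (e.g. NV-CLIP) with no taskQuery/taskPassage set
// still send a plain OpenAI-compatible payload.
function maybeTaskField(profile: Profile, task?: string): Record<string, string> {
  if (profile === "nvidia" && task) return { input_type: task };
  return {};
}
```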


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e6e705de1b


src/embedder.ts Outdated
}
if (/api\.jina\.ai/i.test(base)) return "jina";
if (/api\.voyageai\.com/i.test(base)) return "voyage-compatible";
if (/\.nvidia\.com/i.test(base)) return "nvidia";


P2 Badge Parse baseURL host before NVIDIA profile matching

The new NVIDIA detection checks /\.nvidia\.com/i against the entire baseURL string, so any proxy URL that merely contains .nvidia.com in its path or query can be misclassified as nvidia even when the actual host is different. In that case buildPayload() will apply NVIDIA-specific request semantics (input_type mapping and NVIDIA profile behavior) to non-NVIDIA backends, which can cause request failures when task hints are configured. Matching should be done on the parsed hostname (or with a strict host-bound regex) rather than the raw URL string.


Address Codex review: matching /\.nvidia\.com/i against the full
baseURL string can misclassify proxy URLs that contain .nvidia.com
in their path or query. Parse the hostname via URL() and match with
endsWith() for all host-based provider checks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
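A minimal sketch of the hostname-based check this commit describes (isNvidiaHost is a hypothetical name; the real code folds this into the profile detection):

```typescript
// Hostname-based matching: parse the URL so ".nvidia.com" appearing in a
// path or query string cannot trigger the NVIDIA profile.
function isNvidiaHost(baseURL: string): boolean {
  let hostname: string;
  try {
    hostname = new URL(baseURL).hostname.toLowerCase();
  } catch {
    return false; // not an absolute URL; leave it to other heuristics
  }
  return hostname === "nvidia.com" || hostname.endsWith(".nvidia.com");
}
```

Note that endsWith(".nvidia.com") (with the leading dot) also rejects lookalike hosts such as fakenvidia.com, which a bare substring check would accept.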
@AliceLJY
Collaborator

@rwmjhb could you take a look at this PR? The author has finished two rounds of fixes; rwmjhb and I have both reviewed it already. Please give the final confirmation on whether it can be merged 👀

@rwmjhb
Collaborator

rwmjhb commented Mar 26, 2026

detectEmbeddingProviderProfile() now treats any .nvidia.com host as nvidia, and buildPayload() will then send input_type whenever taskQuery / taskPassage is configured. NVIDIA's OpenAI-compatible embedding docs split behavior by model family here: NV-Embed models use input_type, but GTE/GTR models do not accept that parameter.

Because our config/docs already encourage taskQuery / taskPassage, this can break valid .nvidia.com deployments that are not NV-Embed. Can we narrow the NVIDIA-specific payload path to the model families that actually support input_type, instead of keying off host alone?

Docs: https://docs.nvidia.com/nim/nemo-retriever/text-embedding/1.13.0/use-the-api-openai.html
