From f2d1d52752f752bbb1c875e75bf1f784852368c6 Mon Sep 17 00:00:00 2001
From: fzowl
Date: Wed, 12 Nov 2025 19:09:56 +0100
Subject: [PATCH 1/3] Adding VoyageAI embeddings documentation

---
 en/rag/embedding.html           | 145 +++++++++++++++++++++++
 en/reference/rag/embedding.html | 200 ++++++++++++++++++++++++++++++++
 2 files changed, 345 insertions(+)

diff --git a/en/rag/embedding.html b/en/rag/embedding.html
index e21b1a6a66..4e58c8aa9d 100644
--- a/en/rag/embedding.html
+++ b/en/rag/embedding.html
@@ -490,6 +490,151 @@

SPLADE ranking

+

VoyageAI Embedder

+ +

An embedder that uses the VoyageAI embedding API to generate embeddings for semantic search. It calls the VoyageAI API service, so it does not require local model files or ONNX inference.

+ +
{% highlight xml %}
+
+    
+        <model>voyage-3.5</model>
+        <api-key-secret-ref>voyage_api_key</api-key-secret-ref>
+    
+
+{% endhighlight %}
+ + + +

Add your VoyageAI API key to the secret store:

+
+vespa secret add voyage_api_key --value "pa-xxxxx..."
+
+ +

See the reference for all configuration parameters including caching, retry logic, and performance tuning.

+ +

VoyageAI embedder models

+

VoyageAI offers several embedding models optimized for different use cases. The resulting tensor type can be float, or bfloat16 to reduce storage.
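
As a rough illustration of the storage trade-off between the two cell types, the sketch below computes the raw size of one embedding vector. It assumes 4 bytes per float cell and 2 bytes per bfloat16 cell and ignores any index or attribute overhead; the helper name vector_bytes is hypothetical.

```python
# Bytes per tensor cell (assumption: raw cell size only, no index overhead).
BYTES_PER_CELL = {"float": 4, "bfloat16": 2}


def vector_bytes(dimensions: int, cell_type: str = "float") -> int:
    """Return the raw size in bytes of one dense embedding vector."""
    return dimensions * BYTES_PER_CELL[cell_type]


if __name__ == "__main__":
    for cell_type in ("float", "bfloat16"):
        size = vector_bytes(1024, cell_type)
        print(f"tensor<{cell_type}>(x[1024]): {size} bytes per document")
```

For a 1024-dimensional model this gives 4096 bytes per vector with float and 2048 bytes with bfloat16, so the smaller cell type halves the raw vector storage.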

+ +

Latest general-purpose models (recommended):

+ + +

Previous generation general-purpose models:

+ + +

Specialized models:

+ + +

Input type detection

+

VoyageAI models distinguish between query and document embeddings for improved retrieval quality. The embedder automatically detects the context and sets the appropriate input type: query for query-time embeddings and document for indexing.

+ + +

You can disable auto-detection and set a fixed input type:

+
{% highlight xml %}
+
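+    <model>voyage-3.5</model>
+    <api-key-secret-ref>voyage_api_key</api-key-secret-ref>
+    <auto-detect-input-type>false</auto-detect-input-type>
+    <default-input-type>query</default-input-type>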
+
+{% endhighlight %}
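
The selection rule described in this section can be sketched in a few lines. This is only an illustration of the documented behavior, not the embedder's implementation: the function name select_input_type and its arguments are hypothetical, and the two keyword arguments mirror the auto-detect-input-type and default-input-type parameters.

```python
def select_input_type(is_query_context: bool,
                      auto_detect: bool = True,
                      default_input_type: str = "document") -> str:
    """Pick the VoyageAI input type for one embedding call.

    is_query_context: True when embedding at query time, False when indexing.
    auto_detect: mirrors auto-detect-input-type (documented default: true).
    default_input_type: mirrors default-input-type (documented default: document).
    """
    if default_input_type not in ("query", "document"):
        raise ValueError("default_input_type must be 'query' or 'document'")
    if auto_detect:
        # Auto-detection: the calling context decides.
        return "query" if is_query_context else "document"
    # Auto-detection disabled: always use the fixed type.
    return default_input_type


if __name__ == "__main__":
    print(select_input_type(is_query_context=True))
    print(select_input_type(is_query_context=False, auto_detect=False,
                            default_input_type="query"))
```

With auto-detection disabled, as in the configuration above, every call uses the fixed type regardless of whether it happens at query time or during indexing.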
+ +

VoyageAI performance features

+

The VoyageAI embedder includes several performance optimizations:

+ + +

Example with performance tuning:

+
{% highlight xml %}
+
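+    <model>voyage-3.5</model>
+    <api-key-secret-ref>voyage_api_key</api-key-secret-ref>
+    <cache-size>10000</cache-size>
+    <pool-size>20</pool-size>
+    <normalize>true</normalize>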
+
+{% endhighlight %}
+ +

Usage example

+

Complete example showing document indexing and query-time embedding:

+ +

Schema definition:

+
+schema doc {
+    document doc {
+        field text type string {
+            indexing: summary | index
+        }
+    }
+
+    field embedding type tensor<float>(x[1024]) {
+        indexing: input text | embed voyage | attribute | index
+        attribute {
+            distance-metric: angular
+        }
+    }
+
+    rank-profile semantic {
+        inputs {
+            query(q) tensor<float>(x[1024])
+        }
+        first-phase {
+            expression: closeness(field, embedding)
+        }
+    }
+}
+
+ +

Query with embedding:

+
{% highlight bash %}
+vespa query \
+  'yql=select * from doc where {targetHits:10}nearestNeighbor(embedding,q)' \
+  'input.query(q)=embed(voyage, "machine learning tutorials")' \
+  'ranking=semantic'
+{% endhighlight %}
+ +

When normalize is set to true, use distance-metric: prenormalized-angular for more efficient similarity computation.
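
The reason a prenormalized metric can be cheaper is that, for unit-length vectors, the dot product already equals the cosine similarity, so the norms need not be recomputed per comparison. The sketch below demonstrates this identity in plain Python; the helper names are hypothetical and the code does not reflect how Vespa computes distances internally.

```python
import math


def l2_normalize(vec):
    """Scale vec to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


if __name__ == "__main__":
    query, doc = [3.0, 4.0], [1.0, 2.0]
    # After normalization, the dot product alone gives the cosine similarity.
    fast = dot(l2_normalize(query), l2_normalize(doc))
    slow = cosine(query, doc)
    print(math.isclose(fast, slow))
```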

+ +

Embedder performance

Embedding inference can be resource-intensive for larger embedding models. Factors that impact performance:

diff --git a/en/reference/rag/embedding.html b/en/reference/rag/embedding.html
index 0bf9d40a98..7c24a1d232 100644
--- a/en/reference/rag/embedding.html
+++ b/en/reference/rag/embedding.html
@@ -478,6 +478,206 @@

splade embedder reference config

+

VoyageAI Embedder

+

An embedder that uses the VoyageAI API to generate embeddings optimized for semantic search. It is API-based: it calls the VoyageAI service and does not require local model files or ONNX inference.

+

The VoyageAI embedder is configured in services.xml, within the container tag:

+
{% highlight xml %}
+
+    
+        <model>voyage-3.5</model>
+        <api-key-secret-ref>voyage_api_key</api-key-secret-ref>
+    
+
+{% endhighlight %}
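
For orientation, the sketch below builds the kind of HTTP request an API-based embedder would send to the default endpoint. It is an illustration under assumptions, not the embedder's actual implementation: the helper name build_embedding_request is hypothetical, and the payload fields (input, model, input_type, truncation) follow VoyageAI's public REST API rather than anything confirmed about Vespa's internals. The request is only constructed, never sent.

```python
import json
import urllib.request

# Default endpoint as documented for the embedder's endpoint parameter.
DEFAULT_ENDPOINT = "https://api.voyageai.com/v1/embeddings"


def build_embedding_request(api_key, texts, model="voyage-3.5",
                            input_type="document", truncation=True,
                            endpoint=DEFAULT_ENDPOINT):
    """Build (but do not send) a POST request for a batch of texts."""
    payload = {
        "input": list(texts),
        "model": model,
        "input_type": input_type,   # "query" or "document"
        "truncation": truncation,   # truncate over-long input instead of failing
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_embedding_request("pa-xxxxx", ["machine learning tutorials"],
                                  input_type="query")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Keeping the API key in a header supplied at call time matches the intent of api-key-secret-ref: the key lives in the secret store, not in the configuration file.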
+ +

Secret Management

+

The VoyageAI API key must be stored in Vespa's secret store for secure management:

+
+vespa secret add voyage_api_key --value "pa-xxxxx..."
+
+

The api-key-secret-ref parameter references the secret name. Rotated secrets are refreshed automatically, without an application restart.

+ +

VoyageAI embedder reference config

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Name | Occurrence | Description | Type | Default
api-key-secret-ref | One | Required. Reference to the secret in Vespa's secret store containing the VoyageAI API key. | string | N/A
model | One | The VoyageAI model to use. Available models: | string | voyage-3.5
  • voyage-3.5 (1024 dims) - Latest general-purpose model (recommended)
  • voyage-3.5-lite (1024 dims by default) - Newest lite model, lower cost
  • voyage-3 (1024 dims) - Previous generation, high quality
  • voyage-3-lite (512 dims) - Previous generation, cost-efficient
  • voyage-code-3 (1024 dims) - Code search optimization
  • voyage-finance-2 (1024 dims) - Financial documents
  • voyage-law-2 (1024 dims) - Legal documents
  • voyage-multilingual-2 (1024 dims) - Multilingual support
endpoint | One | VoyageAI API endpoint URL. Can be overridden for custom proxies or regional endpoints. | string | https://api.voyageai.com/v1/embeddings
timeout | One | Request timeout in milliseconds. Also serves as the bound for retry attempts: retries stop when total elapsed time would exceed this timeout. Minimum value: 1000ms. | numeric | 30000
max-retries | One | Maximum number of retry attempts for failed requests. Used as a safety limit in addition to the timeout-based retry bound. | numeric | 10
default-input-type | One | Default input type when auto-detection is disabled. Valid values: query or document. VoyageAI models use different optimizations for queries vs documents. | enum | document
auto-detect-input-type | One | Whether to automatically detect input type based on context. When enabled, uses query type for query-time embeddings and document type for indexing. | boolean | true
normalize | One | Whether to apply L2 normalization to embeddings. When enabled, all embedding vectors are normalized to unit length. Use with prenormalized-angular distance-metric for efficient similarity computation. | boolean | false
truncate | One | Whether to truncate input text exceeding model limits. When enabled, text is automatically truncated. When disabled, requests whose text is too long fail. | boolean | true
pool-size | One | HTTP connection pool size. Higher values improve throughput for concurrent requests but use more resources. | numeric | 5
cache-size | One | LRU cache size for storing recent embeddings. Reduces duplicate API calls. Set to 0 to disable caching. Cache key includes text, input type, and embedder ID. | numeric | 1000
+ +
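
The interaction between timeout and max-retries described in the table can be sketched as a retry loop with two independent stop conditions. This is a minimal sketch under assumptions: the exponential backoff, the base_delay_ms argument, the broad exception handling, and the name call_with_retries are all hypothetical, since the reference only states that retries stop at the timeout bound or at the retry limit, whichever comes first.

```python
import time


def call_with_retries(request_fn, timeout_ms=30000, max_retries=10,
                      base_delay_ms=100, sleep=time.sleep,
                      clock=time.monotonic):
    """Call request_fn until it succeeds, the deadline nears, or retries run out.

    timeout_ms and max_retries mirror the documented parameters; the
    backoff schedule is an assumption made for illustration.
    """
    deadline = clock() + timeout_ms / 1000.0
    failures = 0
    while True:
        try:
            return request_fn()
        except Exception:
            # Assumed schedule: base, 2x base, 4x base, ...
            delay = base_delay_ms * (2 ** failures) / 1000.0
            failures += 1
            out_of_retries = failures > max_retries
            out_of_time = clock() + delay > deadline
            if out_of_retries or out_of_time:
                raise  # give up and surface the last error
            sleep(delay)


if __name__ == "__main__":
    attempts = {"n": 0}

    def flaky():
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise RuntimeError("transient failure")
        return "ok"

    print(call_with_retries(flaky, base_delay_ms=1), attempts["n"])
```

The sleep and clock arguments exist only so the loop can be exercised without real waiting; a production client would also distinguish retryable errors (such as rate limiting) from permanent ones.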

Example Configurations

+ +

Basic configuration (recommended):

+
{% highlight xml %}
+
+    <model>voyage-3.5</model>
+    <api-key-secret-ref>voyage_api_key</api-key-secret-ref>
+
+{% endhighlight %}
+ +

High-performance configuration:

+
{% highlight xml %}
+
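+    <model>voyage-3.5</model>
+    <api-key-secret-ref>voyage_api_key</api-key-secret-ref>
+    <cache-size>10000</cache-size>
+    <pool-size>20</pool-size>
+    <timeout>60000</timeout>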
+
+{% endhighlight %}
+ +

Fast and cost-efficient configuration:

+
{% highlight xml %}
+
+    <model>voyage-3.5-lite</model>
+    <api-key-secret-ref>voyage_api_key</api-key-secret-ref>
+    <cache-size>10000</cache-size>
+
+{% endhighlight %}
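
The cache-size parameter in the reference table describes an LRU cache whose key includes the text, the input type, and the embedder ID. A minimal sketch of that idea follows; the class name EmbeddingCache and its method are hypothetical, and the real embedder's eviction and concurrency behavior may differ.

```python
from collections import OrderedDict


class EmbeddingCache:
    """Small LRU cache keyed by (embedder_id, input_type, text)."""

    def __init__(self, max_size=1000):
        self.max_size = max_size          # 0 disables caching, as documented
        self._entries = OrderedDict()

    def get_or_compute(self, embedder_id, input_type, text, compute):
        if self.max_size <= 0:
            return compute(text)
        key = (embedder_id, input_type, text)
        if key in self._entries:
            self._entries.move_to_end(key)    # mark as most recently used
            return self._entries[key]
        value = compute(text)
        self._entries[key] = value
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict least recently used
        return value


if __name__ == "__main__":
    api_calls = []

    def fake_embed(text):
        api_calls.append(text)
        return [float(len(text))]

    cache = EmbeddingCache(max_size=2)
    cache.get_or_compute("voyage", "query", "hello", fake_embed)
    cache.get_or_compute("voyage", "query", "hello", fake_embed)      # cache hit
    cache.get_or_compute("voyage", "document", "hello", fake_embed)   # different key
    print(len(api_calls))
```

Including the input type in the key matters because the same text embedded as a query and as a document yields different vectors, so the two must not share a cache entry.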
+ +

Query-optimized configuration:

+
{% highlight xml %}
+
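+    <model>voyage-3.5</model>
+    <api-key-secret-ref>voyage_api_key</api-key-secret-ref>
+    <default-input-type>query</default-input-type>
+    <auto-detect-input-type>false</auto-detect-input-type>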
+    true
+
+{% endhighlight %}
+ +

Code search configuration:

+
{% highlight xml %}
+
+    <model>voyage-code-3</model>
+    <api-key-secret-ref>voyage_api_key</api-key-secret-ref>
+    true
+
+{% endhighlight %}
+ +

Cost and Performance Optimization

+

The VoyageAI embedder includes several features to reduce API costs and improve performance:

+ + +

For detailed performance monitoring, the embedder emits standard Vespa embedder metrics (see Container Metrics). Monitor API usage and costs through the VoyageAI dashboard.

+ + +

Huggingface tokenizer embedder

The Huggingface tokenizer embedder is configured in services.xml,

From e8e64717a9fdd0281b0f1af7e558f9f21656c7c5 Mon Sep 17 00:00:00 2001
From: fzowl
Date: Mon, 17 Nov 2025 19:26:43 +0100
Subject: [PATCH 2/3] Adding VoyageAI embeddings documentation

---
 en/rag/embedding.html           |  9 ++++-----
 en/reference/rag/embedding.html | 21 ++++++---------------
 2 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/en/rag/embedding.html b/en/rag/embedding.html
index 4e58c8aa9d..f6a85daf05 100644
--- a/en/rag/embedding.html
+++ b/en/rag/embedding.html
@@ -576,9 +576,9 @@

Input type detection

VoyageAI performance features

The VoyageAI embedder includes several performance optimizations:

@@ -587,8 +587,7 @@

VoyageAI performance features

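     <model>voyage-3.5</model>
     <api-key-secret-ref>voyage_api_key</api-key-secret-ref>
-    <cache-size>10000</cache-size>
-    <pool-size>20</pool-size>
+    <max-idle-connections>20</max-idle-connections>
     <normalize>true</normalize>
 {% endhighlight %}
diff --git a/en/reference/rag/embedding.html b/en/reference/rag/embedding.html
index 7c24a1d232..fdba70712f 100644
--- a/en/reference/rag/embedding.html
+++ b/en/reference/rag/embedding.html
@@ -597,19 +597,12 @@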

VoyageAI embedder reference config
 true
-pool-size
+max-idle-connections
 One
-HTTP connection pool size. Higher values improve throughput for concurrent requests but use more resources.
+Maximum number of idle HTTP connections to keep in the connection pool. Higher values improve throughput for concurrent requests but use more resources.
 numeric
 5
-
-cache-size
-One
-LRU cache size for storing recent embeddings. Reduces duplicate API calls. Set to 0 to disable caching. Cache key includes text, input type, and embedder ID.
-numeric
-1000
-
@@ -621,8 +621,7 @@

Example Configurations

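     <model>voyage-3.5</model>
     <api-key-secret-ref>voyage_api_key</api-key-secret-ref>
-    <cache-size>10000</cache-size>
-    <pool-size>20</pool-size>
+    <max-idle-connections>20</max-idle-connections>
     <timeout>60000</timeout>
 {% endhighlight %}
@@ -631,7 +631,6 @@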

Example Configurations

     <model>voyage-3.5-lite</model>
     <api-key-secret-ref>voyage_api_key</api-key-secret-ref>
-    <cache-size>10000</cache-size>
 {% endhighlight %}
@@ -657,9 +657,9 @@

Example Configurations

Cost and Performance Optimization

The VoyageAI embedder includes several features to reduce API costs and improve performance:

From 3f34645206f61c28cf09a004ff439ac217cd4540 Mon Sep 17 00:00:00 2001
From: fzowl
Date: Mon, 22 Dec 2025 15:20:49 +0100
Subject: [PATCH 3/3] Adding VoyageAI embeddings documentation

---
 en/rag/embedding.html           | 12 ++++++++++++
 en/reference/rag/embedding.html |  2 ++
 2 files changed, 14 insertions(+)

diff --git a/en/rag/embedding.html b/en/rag/embedding.html
index f6a85daf05..21fccad8c0 100644
--- a/en/rag/embedding.html
+++ b/en/rag/embedding.html
@@ -555,6 +555,18 @@

VoyageAI embedder models

  • voyage-multilingual-2 produces tensor<float>(x[1024]) - supports 100+ languages
  • +

    Contextual model:

    + + +

    Multimodal model (preview):

    + +

    Input type detection

    VoyageAI models distinguish between query and document embeddings for improved retrieval quality. The embedder automatically detects the context and sets the appropriate input type:

diff --git a/en/reference/rag/embedding.html b/en/reference/rag/embedding.html
index fdba70712f..3dcfdba557 100644
--- a/en/reference/rag/embedding.html
+++ b/en/reference/rag/embedding.html
@@ -542,6 +542,8 @@

VoyageAI embedder reference config
   • voyage-finance-2 (1024 dims) - Financial documents
   • voyage-law-2 (1024 dims) - Legal documents
   • voyage-multilingual-2 (1024 dims) - Multilingual support
+  • voyage-context-3 (1024 dims, configurable: 256/512/1024/2048) - Contextualized document chunk embeddings
+  • voyage-multimodal-3.5 (1024 dims, configurable: 256/512/1024/2048) - Multimodal embeddings (text, images, video) [preview]
 string