feat: add embeddings API endpoint with nomic-embed-text model #130
- Add /v1/embeddings endpoint to tinfoil-proxy (Go) with handleEmbeddings handler
- Enable nomic-embed-text model in tinfoil-proxy model configs
- Add proxy_embeddings handler in Rust with encryption middleware
- Add nomic-embed-text to proxy router config and models list
- Include billing/usage tracking for embeddings (prompt_tokens only)
- Follows OpenAI embeddings API format (768 dimensions)

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Walkthrough
This PR introduces embeddings API support across the proxy stack. It adds a new "nomic-embed-text" model to the configuration, implements an embeddings endpoint in the Rust service with request validation, billing integration, and response encryption, and extends the Go proxy with an embeddings handler and types.
Sequence Diagram(s)
sequenceDiagram
participant Client
participant RustService as Rust Service
participant ProxyRouter as Proxy Router
participant GoProxy as Go Proxy<br/>(Tinfoil)
participant Provider as External<br/>Provider
Client->>RustService: POST /v1/embeddings<br/>(EmbeddingRequest)
Note over RustService: Decrypt request<br/>Validate input (non-empty)
alt Input invalid
RustService-->>Client: Error response
else Proceed
RustService->>RustService: Check billing<br/>(guest user gate)
alt Guest billing blocked
RustService-->>Client: Billing error
else Proceed
RustService->>ProxyRouter: Resolve model route<br/>(nomic-embed-text)
ProxyRouter-->>RustService: Go Proxy endpoint
RustService->>GoProxy: Forward embeddings request
GoProxy->>Provider: POST /v1/embeddings
alt Timeout or non-200 response
Provider-->>GoProxy: Error
GoProxy-->>RustService: Error response
RustService-->>Client: Error (encrypted)
else Success
Provider-->>GoProxy: EmbeddingResponse
rect rgb(220, 240, 230)
Note over GoProxy: Parse embeddings<br/>Convert to Go types
end
GoProxy-->>RustService: Parsed response
rect rgb(240, 230, 220)
Note over RustService: Publish usage event<br/>(prompt_tokens)<br/>Encrypt response
end
RustService-->>Client: Encrypted response
end
end
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
Greptile Summary
Adds text embeddings support via a new
Confidence Score: 5/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Client
participant Rust API (opensecret)
participant Go Proxy (tinfoil-proxy)
participant Tinfoil Backend
participant SQS (Billing)
Client->>Rust API (opensecret): POST /v1/embeddings (encrypted)
Rust API (opensecret)->>Rust API (opensecret): Decrypt request
Rust API (opensecret)->>Rust API (opensecret): Check guest user billing
Rust API (opensecret)->>Rust API (opensecret): Validate input (non-empty)
Rust API (opensecret)->>Rust API (opensecret): Get model route config
Rust API (opensecret)->>Go Proxy (tinfoil-proxy): POST /v1/embeddings
Go Proxy (tinfoil-proxy)->>Go Proxy (tinfoil-proxy): Parse & validate input
Go Proxy (tinfoil-proxy)->>Tinfoil Backend: Embeddings.New()
Tinfoil Backend-->>Go Proxy (tinfoil-proxy): EmbeddingResponse
Go Proxy (tinfoil-proxy)-->>Rust API (opensecret): JSON response with usage
Rust API (opensecret)->>SQS (Billing): Publish usage event (async)
Rust API (opensecret)->>Rust API (opensecret): Encrypt response
Rust API (opensecret)-->>Client: Encrypted embeddings (768 dims)
Actionable comments posted: 0
🧹 Nitpick comments (2)
tinfoil-proxy/main.go (1)
1106-1120: Silent dropping of non-string array items may cause unexpected behavior.
When the input is an array, non-string items are silently filtered out. If a client accidentally sends ["hello", 123, "world"], only ["hello", "world"] would be processed without any indication. Consider logging a warning or returning an error for invalid array items.
🔎 Proposed enhancement
     case []interface{}:
         for _, item := range v {
             if str, ok := item.(string); ok {
                 inputs = append(inputs, str)
    +        } else {
    +            log.Printf("Warning: non-string item in input array ignored")
             }
         }
src/web/openai.rs (1)
1566-1567: Parameter naming inconsistency: _auth_method is actually used.
The parameter _auth_method has an underscore prefix, which conventionally indicates an unused parameter, but it is used on line 1704 for billing context. Consider removing the underscore prefix for clarity.
🔎 Proposed fix
     async fn proxy_embeddings(
         State(state): State<Arc<AppState>>,
         _headers: HeaderMap,
         axum::Extension(session_id): axum::Extension<Uuid>,
         axum::Extension(user): axum::Extension<User>,
    -    axum::Extension(_auth_method): axum::Extension<AuthMethod>,
    +    axum::Extension(auth_method): axum::Extension<AuthMethod>,
         axum::Extension(embedding_request): axum::Extension<EmbeddingRequest>,
     ) -> Result<Json<EncryptedResponse<Value>>, ApiError> {
And on line 1704:
     let billing_context =
    -    BillingContext::new(_auth_method, embedding_request.model.clone());
    +    BillingContext::new(auth_method, embedding_request.model.clone());
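The first nitpick offers two options for non-string array items: log a warning or return an error. The proposed diff covers the logging option. The following is a minimal sketch of the stricter alternative, assuming the input has already been decoded from JSON into an interface{} value. The function name normalizeInput and the error wording are hypothetical and are not taken from tinfoil-proxy/main.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalizeInput (hypothetical helper) converts the decoded "input" field of
// an embeddings request into a slice of strings. Instead of silently skipping
// bad items, it rejects the request and reports the offending index.
func normalizeInput(input interface{}) ([]string, error) {
	switch v := input.(type) {
	case string:
		return []string{v}, nil
	case []interface{}:
		inputs := make([]string, 0, len(v))
		for i, item := range v {
			str, ok := item.(string)
			if !ok {
				return nil, fmt.Errorf("input[%d]: expected string, got %T", i, item)
			}
			inputs = append(inputs, str)
		}
		return inputs, nil
	default:
		return nil, fmt.Errorf("input: expected string or array of strings, got %T", input)
	}
}

func main() {
	var body struct {
		Input interface{} `json:"input"`
	}
	// The payload from the review comment, with a stray number in the array.
	if err := json.Unmarshal([]byte(`{"input": ["hello", 123, "world"]}`), &body); err != nil {
		panic(err)
	}
	_, err := normalizeInput(body.Input)
	fmt.Println(err) // input[1]: expected string, got float64
}
```

Returning an error makes the malformed request visible to the client, whereas a log line is only visible to operators. Which behavior fits this proxy is a judgment call for the maintainers.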
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (3)
- pcrDev.json is excluded by !pcrDev.json
- pcrProd.json is excluded by !pcrProd.json
- tinfoil-proxy/dist/tinfoil-proxy is excluded by !**/dist/**
📒 Files selected for processing (5)
- pcrDevHistory.json
- pcrProdHistory.json
- src/proxy_config.rs
- src/web/openai.rs
- tinfoil-proxy/main.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-26T16:05:09.950Z
Learnt from: AnthonyRonning
Repo: OpenSecretCloud/opensecret PR: 95
File: tinfoil-proxy/main.go:55-59
Timestamp: 2025-08-26T16:05:09.950Z
Learning: The qwen3-coder-480b model is available in Tinfoil's model catalog and can be used in the tinfoil-proxy service.
Applied to files:
src/proxy_config.rs
🧬 Code graph analysis (1)
src/web/openai.rs (1)
tinfoil-proxy/main.go (1)
EmbeddingRequest (171-177)
⏰ Context from checks skipped due to timeout of 100000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Greptile Review
- GitHub Check: Development Reproducible Build
🔇 Additional comments (12)
pcrProdHistory.json (1)
470-477: LGTM!
The new PCR history entry follows the established format with valid PCR0, PCR1, PCR2 values, timestamp, and signature. The structure is consistent with existing entries.
pcrDevHistory.json (1)
470-477: LGTM!
The dev environment PCR history entry is properly formatted and follows the established pattern.
tinfoil-proxy/main.go (3)
43-47: LGTM!
The nomic-embed-text model is correctly added to the model configurations with an appropriate description and active status.
171-195: LGTM!
The embedding types are well-structured and align with the OpenAI embeddings API specification. Using interface{} for Input correctly handles both string and array inputs, and the EmbeddingUsage struct appropriately omits CompletionTokens since embeddings only consume prompt tokens.
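To make the comment above concrete, here is a sketch of what such types might look like. The type names come from the PR description. The field names and JSON tags are assumptions based on the public OpenAI embeddings format, not copied from tinfoil-proxy/main.go, and the real structs may carry additional fields. decodeRequest is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// EmbeddingRequest: Input is interface{} so that both a single string and an
// array of strings decode without a custom unmarshaller. (Assumed fields.)
type EmbeddingRequest struct {
	Input interface{} `json:"input"`
	Model string      `json:"model"`
}

// EmbeddingData holds one vector and its position in the input list.
type EmbeddingData struct {
	Object    string    `json:"object"`
	Embedding []float64 `json:"embedding"`
	Index     int       `json:"index"`
}

// EmbeddingUsage has no completion-token field: embeddings only consume
// prompt tokens.
type EmbeddingUsage struct {
	PromptTokens int `json:"prompt_tokens"`
	TotalTokens  int `json:"total_tokens"`
}

// EmbeddingResponse is the list envelope around the returned vectors.
type EmbeddingResponse struct {
	Object string          `json:"object"`
	Data   []EmbeddingData `json:"data"`
	Model  string          `json:"model"`
	Usage  EmbeddingUsage  `json:"usage"`
}

// decodeRequest (hypothetical helper) parses a raw JSON request body.
func decodeRequest(raw []byte) (EmbeddingRequest, error) {
	var req EmbeddingRequest
	err := json.Unmarshal(raw, &req)
	return req, err
}

func main() {
	bodies := []string{
		`{"model":"nomic-embed-text","input":"hello"}`,
		`{"model":"nomic-embed-text","input":["hello","world"]}`,
	}
	for _, raw := range bodies {
		req, err := decodeRequest([]byte(raw))
		if err != nil {
			panic(err)
		}
		// Prints the dynamic type: string, then []interface {}.
		fmt.Printf("%T\n", req.Input)
	}
}
```

The trade-off of interface{} is that the handler must type-switch on Input before use, which is where the array-item handling discussed in the nitpick comes in.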
1256-1259: LGTM!
The embeddings route is correctly registered following the established pattern for other endpoints.
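The registration code itself is not shown on this page. As an illustration only, the sketch below registers a POST-only /v1/embeddings route using the standard library's net/http. The router actually used by tinfoil-proxy may differ, and the handler body here is a placeholder that returns an empty list rather than calling any backend. newMux and statusFor are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// handleEmbeddings is a placeholder handler: it enforces POST, decodes the
// body, and echoes the model back with an empty data list.
func handleEmbeddings(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	var req struct {
		Model string      `json:"model"`
		Input interface{} `json:"input"`
	}
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "invalid JSON body", http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(map[string]interface{}{
		"object": "list",
		"model":  req.Model,
		"data":   []interface{}{},
	})
}

// newMux registers the embeddings route alongside any other endpoints.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/embeddings", handleEmbeddings)
	return mux
}

// statusFor sends one in-memory request and returns the HTTP status code.
func statusFor(method, body string) int {
	req := httptest.NewRequest(method, "/v1/embeddings", strings.NewReader(body))
	rec := httptest.NewRecorder()
	newMux().ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("POST", `{"model":"nomic-embed-text","input":"hi"}`)) // 200
	fmt.Println(statusFor("GET", ""))                                            // 405
}
```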
src/proxy_config.rs (2)
143-143: LGTM!
The nomic-embed-text model is correctly added to the Tinfoil-only routes without fallback, consistent with how other Tinfoil-exclusive models are configured.
185-185: LGTM!
The model is properly included in the user-facing models list when Tinfoil is configured.
src/web/openai.rs (5)
115-131: LGTM!
The EmbeddingRequest struct is well-designed with appropriate serde attributes. Using serde_json::Value for input allows flexible handling of both string and array inputs, and the default model is correctly set to "nomic-embed-text".
221-227: LGTM!
The embeddings route is correctly registered with the decrypt_request middleware, following the established pattern for other endpoints in this router.
1571-1608: LGTM!
The guest user billing checks and input validation are thorough and consistent with other endpoints. The validation correctly handles empty strings, empty arrays, and invalid input types.
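The Rust validation code is not reproduced on this page. The three cases named above (empty string, empty array, invalid type) can be sketched as follows, transliterated into Go for consistency with the other examples here. The function name, the error values, and the decision not to trim whitespace are all assumptions. The actual handler in src/web/openai.rs operates on serde_json::Value and may differ in detail.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical error values; the real handler returns an ApiError.
var (
	errEmptyInput   = errors.New("input must not be empty")
	errInvalidInput = errors.New("input must be a string or an array")
)

// validateInput (hypothetical helper) rejects the three cases named in the
// review: an empty string, an empty array, and any other JSON type
// (number, object, bool, null).
func validateInput(input interface{}) error {
	switch v := input.(type) {
	case string:
		if v == "" {
			return errEmptyInput
		}
	case []interface{}:
		if len(v) == 0 {
			return errEmptyInput
		}
	default:
		return errInvalidInput
	}
	return nil
}

func main() {
	cases := []interface{}{"hello", "", []interface{}{}, []interface{}{"a"}, 42.0, nil}
	for _, c := range cases {
		fmt.Printf("%#v -> %v\n", c, validateInput(c))
	}
}
```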
1695-1718: LGTM!
The billing handling for embeddings correctly accounts for prompt tokens only (with completion_tokens: 0), which is appropriate since embeddings don't generate completion tokens. The usage event is properly published when prompt_tokens > 0.
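The rule described above is small enough to sketch directly: build a usage event with zero completion tokens, and skip publishing when no prompt tokens were reported. The UsageEvent type and embeddingUsageEvent function below are hypothetical and written in Go for illustration. The real code is Rust and builds its event around the BillingContext shown in the nitpick.

```go
package main

import "fmt"

// UsageEvent is a hypothetical stand-in for the billing event payload.
type UsageEvent struct {
	Model            string
	PromptTokens     int
	CompletionTokens int
}

// embeddingUsageEvent returns the event to publish and true, or a zero event
// and false when there is nothing to bill (prompt tokens not above zero).
func embeddingUsageEvent(model string, promptTokens int) (UsageEvent, bool) {
	if promptTokens <= 0 {
		return UsageEvent{}, false
	}
	return UsageEvent{
		Model:            model,
		PromptTokens:     promptTokens,
		CompletionTokens: 0, // embeddings never produce completion tokens
	}, true
}

func main() {
	if ev, ok := embeddingUsageEvent("nomic-embed-text", 12); ok {
		fmt.Printf("publish: %+v\n", ev)
	}
	if _, ok := embeddingUsageEvent("nomic-embed-text", 0); !ok {
		fmt.Println("skip: no prompt tokens reported")
	}
}
```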
1636-1651: Embeddings endpoint limitation is intentional: nomic-embed-text is Tinfoil-only with no configured fallback provider.
Unlike chat completions, which support primary + fallback cycling, nomic-embed-text is configured exclusively for the Tinfoil provider (line 143 in proxy_config.rs with fallbacks: vec![]). The proxy_embeddings function correctly attempts only the primary provider since no fallback is available. This design is intentional and requires no changes.
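The effect of an empty fallback list can be illustrated with a small sketch. The ModelRoute type, the providersToTry function, and the provider name "tinfoil" are hypothetical and written in Go for illustration. The actual route configuration lives in src/proxy_config.rs and is not reproduced here.

```go
package main

import "fmt"

// ModelRoute is a hypothetical stand-in for the per-model route config.
type ModelRoute struct {
	Primary   string
	Fallbacks []string
}

// providersToTry lists providers in attempt order: primary first, then any
// fallbacks. With no fallbacks configured, only the primary is returned.
func providersToTry(r ModelRoute) []string {
	order := make([]string, 0, 1+len(r.Fallbacks))
	order = append(order, r.Primary)
	order = append(order, r.Fallbacks...)
	return order
}

func main() {
	// Tinfoil-only route, mirroring fallbacks: vec![] in the Rust config.
	embed := ModelRoute{Primary: "tinfoil", Fallbacks: nil}
	fmt.Println(providersToTry(embed)) // [tinfoil]
}
```

With this shape, a handler that only ever attempts the first entry behaves identically to one that cycles through the whole list, which is why the single-provider path in proxy_embeddings needs no change.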
Summary
Adds support for text embeddings via the OpenAI-compatible /v1/embeddings endpoint using Tinfoil's nomic-embed-text model (768 dimensions).
Changes
tinfoil-proxy (Go)
- nomic-embed-text model in model configs
- EmbeddingRequest, EmbeddingResponse, EmbeddingData, EmbeddingUsage types
- handleEmbeddings handler with proper input handling (single string or array)
- /v1/embeddings POST route
opensecret (Rust)
- EmbeddingRequest struct in src/web/openai.rs
- proxy_embeddings handler with:
- nomic-embed-text to proxy router config and models list in src/proxy_config.rs
- /v1/embeddings route
API Format
Follows standard OpenAI embeddings API:
Testing
Summary by CodeRabbit
New Features