@pingSubhajit
Contributor

Summary

This release adds configurable embedding batching and concurrency controls for improved ingestion performance, along with comprehensive type safety improvements across the codebase.

Changes

Features

  • Embedding batching and concurrency: The ingest pipeline now uses embedMany() for batch text embeddings when supported by the provider, with configurable concurrency limits for both text and image embeddings
  • New embeddingProcessing config: Configure concurrency (default: 4) and batchSize (default: 32) via unrag.config.ts under defaults.embedding or engine.embeddingProcessing
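The batching and concurrency behavior described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `EmbeddingProcessingOptions`, `EmbedMany`, `chunk`, and `embedAll` are hypothetical names, and only the option names `concurrency` and `batchSize` and their defaults (4 and 32) come from the description. Texts are split into batches of at most `batchSize`, and at most `concurrency` batch calls are in flight at once.

```typescript
// Hypothetical shape; option names and defaults are taken from the PR text,
// everything else is an assumption made for illustration.
interface EmbeddingProcessingOptions {
  concurrency: number; // maximum number of batch calls in flight
  batchSize: number; // maximum number of texts per batch call
}

const DEFAULTS: EmbeddingProcessingOptions = { concurrency: 4, batchSize: 32 };

// Stand-in for a provider's batch embedding call (one vector per input text).
type EmbedMany = (texts: string[]) => Promise<number[][]>;

// Split a list into consecutive slices of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

async function embedAll(
  texts: string[],
  embedMany: EmbedMany,
  options: Partial<EmbeddingProcessingOptions> = {},
): Promise<number[][]> {
  const { concurrency, batchSize } = { ...DEFAULTS, ...options };
  const batches = chunk(texts, Math.max(1, batchSize));
  const results: number[][][] = new Array(batches.length);
  let next = 0;

  // Each worker claims the next unprocessed batch until none remain.
  async function worker(): Promise<void> {
    while (next < batches.length) {
      const index = next++;
      results[index] = await embedMany(batches[index]);
    }
  }

  const workerCount = Math.min(Math.max(1, concurrency), batches.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));

  // Flatten per-batch results back into input order.
  const flat: number[][] = [];
  for (const batch of results) flat.push(...batch);
  return flat;
}
```

Storing each batch result by its index keeps the output aligned with the input order even when batches finish out of order.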

Type Safety Improvements

  • Replaced every type assertion to any with a properly typed interface for external SDKs:
    • Typed interfaces for all 12 embedding providers (OpenAI, Azure, Bedrock, Cohere, Google, Mistral, Ollama, OpenRouter, Together, Vertex, Voyage, AI Gateway)
    • Structural types for Google Drive API (DriveFile, DriveClient, AuthClient)
    • Typed interfaces for extractors (pdfjs, audio transcription, video processing)
    • Proper QueryRow interface for Drizzle store
  • New AssetMetadataFields interface and hasAssetMetadata() type guard for asset-derived chunk metadata
  • Moved the mergeDeep utility into a dedicated module with proper generic type signatures
  • The requireOptional() helper is now typed and requires explicit type parameters
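As a rough illustration of the type-guard approach mentioned above, a guard such as hasAssetMetadata() might look like the sketch below. The field names `assetId` and `assetKind` are assumptions made for this example; the description does not list the actual members of AssetMetadataFields.

```typescript
// Hypothetical fields; the real AssetMetadataFields interface may differ.
interface AssetMetadataFields {
  assetId: string;
  assetKind: string;
}

// Narrows unknown chunk metadata to an object carrying the asset fields,
// so callers can read them without casting to any.
function hasAssetMetadata(
  metadata: unknown,
): metadata is AssetMetadataFields & Record<string, unknown> {
  if (typeof metadata !== "object" || metadata === null) return false;
  const record = metadata as Record<string, unknown>;
  return (
    typeof record["assetId"] === "string" &&
    typeof record["assetKind"] === "string"
  );
}

// Usage: after the guard passes, the fields are typed as strings.
function describeChunk(metadata: unknown): string {
  return hasAssetMetadata(metadata)
    ? `${metadata.assetKind}:${metadata.assetId}`
    : "no asset metadata";
}
```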

Internal

  • Added core-embedding-batching.test.ts test suite for batching behavior
  • Extracted deep-merge.ts utility with isRecord() type guard
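For readers unfamiliar with the pattern, a deep-merge utility built on an isRecord() guard could be sketched as below. This is an assumption about the general technique, not the contents of deep-merge.ts: the signatures, the array handling (arrays are replaced, not concatenated), and the `T & U` return type are illustrative choices.

```typescript
// Treats any non-null, non-array object as a record (illustrative choice).
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merges `override` into a copy of `base`.
// Nested records are merged key by key; all other values are replaced.
function mergeDeep<
  T extends Record<string, unknown>,
  U extends Record<string, unknown>,
>(base: T, override: U): T & U {
  const out: Record<string, unknown> = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const existing = out[key];
    out[key] =
      isRecord(existing) && isRecord(value)
        ? mergeDeep(existing, value)
        : value;
  }
  // The intersection type approximates the merged shape.
  return out as T & U;
}

// Usage: layering a partial override onto defaults (illustrative values).
const defaults = { embedding: { concurrency: 4, batchSize: 32 } };
const merged = mergeDeep(defaults, { embedding: { batchSize: 64 } });
```

Because each nested record is copied before merging, the `defaults` object in the usage example is left unmodified.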

Related PRs

@pingSubhajit pingSubhajit self-assigned this Dec 28, 2025
@pingSubhajit pingSubhajit merged commit 419103f into main Dec 28, 2025
3 checks passed
@pingSubhajit pingSubhajit deleted the release/v0.2.7 branch December 28, 2025 07:13