@bkataru commented Jan 12, 2026

feat(rag): Pure-Rust Vector Store & RAG Pipeline (DIR-24)

Summary

Implements a complete RAG (Retrieval Augmented Generation) system with a pure-Rust vector database, eliminating external service dependencies. This release brings A.R.E.S to v0.3.0.

New Crate: ares-vector

  • Pure-Rust embedded vector database with HNSW indexing
  • Memory-mapped persistence via memmap2
  • Multiple distance metrics (Cosine, Euclidean, Dot Product)
  • No external service dependencies (Qdrant/Milvus not required)
  • Published to crates.io as ares-vector@0.1.1
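
For readers unfamiliar with the three metrics listed above, here is a minimal standalone sketch of how they are usually defined. It is not taken from the ares-vector source; the function names (`dot`, `euclidean`, `cosine_similarity`) and the zero-vector convention are assumptions made for illustration.

```rust
/// Dot product of two equal-length vectors (hypothetical helper, not the crate's API).
fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Euclidean (L2) distance: square root of the summed squared differences.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Cosine similarity: dot product divided by the product of the norms.
/// Assumption: returns 0.0 when either vector has zero length.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let norm_a = dot(a, a).sqrt();
    let norm_b = dot(b, b).sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot(a, b) / (norm_a * norm_b)
}
```

Cosine similarity ignores vector magnitude, which is why it is a common default for text embeddings; dot product and cosine agree when embeddings are already normalized to unit length.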

RAG Pipeline

  • Embeddings: Multi-model support (BGE, MiniLM, Nomic, Qwen3, GTE) via FastEmbed/Candle
  • Chunking: Word, character, and semantic strategies with configurable overlap
  • Search: Semantic, BM25, fuzzy, and hybrid strategies
  • Reranking: Cross-encoder reranking (MiniLM-L6-v2, BGE)
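
To illustrate what word-based chunking with configurable overlap means, the sketch below splits text into windows of `chunk_size` words, each sharing `overlap` words with the previous window. The function name `chunk_words` and its parameters are hypothetical and may not match the chunker added in this PR.

```rust
/// Split `text` into chunks of at most `chunk_size` words, where consecutive
/// chunks share `overlap` words. Hypothetical sketch, not the PR's chunker.
fn chunk_words(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    assert!(overlap < chunk_size, "overlap must be smaller than chunk_size");

    let words: Vec<&str> = text.split_whitespace().collect();
    let step = chunk_size - overlap; // how far the window advances each time
    let mut chunks = Vec::new();
    let mut start = 0;

    while start < words.len() {
        let end = (start + chunk_size).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break; // the last window reached the end of the text
        }
        start += step;
    }
    chunks
}
```

With `chunk_size = 3` and `overlap = 1`, the text `a b c d e f g` yields the chunks `a b c`, `c d e`, and `e f g`. The overlap keeps context that straddles a chunk boundary retrievable from either side.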

New API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/rag/ingest | POST | Document ingestion with chunking |
| /api/rag/search | POST | Multi-strategy search |
| /api/rag/collections | GET | List collections |
| /api/rag/collections/{name} | DELETE | Delete collection |

Configuration

New [rag] section in ares.toml with full configuration options.

Breaking Changes

None - all new functionality is additive behind the ares-vector feature flag.

Documentation

  • Updated README with RAG section and feature flags
  • Added docs/DIR-24_RAG_IMPLEMENTATION_PLAN.md
  • Added docs/FUTURE_ENHANCEMENTS.md for deferred features
  • Updated PROJECT_STATUS.md with Iteration 5

Test Coverage

  • 130 library tests passing
  • CI updated to test ares-vector on all platforms (Ubuntu, Windows, macOS)

What's Deferred

  • AI-native protocols (ACP, AG-UI, ANP, A2A) - pending standardization
  • GPU acceleration for embeddings
  • Embedding cache

Closes DIR-24

- Add ares-vector crate: pure-Rust vector DB with HNSW indexing
  - Memory-mapped persistence via memmap2
  - Multiple distance metrics (Cosine, Euclidean, Dot Product)
  - Thread-safe with parking_lot RwLocks
  - No external service dependencies (Qdrant/Milvus not required)

- Add comprehensive RAG pipeline:
  - Document ingestion with chunking (word/semantic/character)
  - Multi-model embedding service (BGE, MiniLM, Nomic, Qwen3, GTE)
  - Multi-strategy search (semantic, BM25, fuzzy, hybrid)
  - Cross-encoder reranking (MiniLM-L6-v2, BGE Reranker)

- Add RAG API endpoints:
  - POST /api/rag/ingest - Document ingestion
  - POST /api/rag/search - Multi-strategy search
  - GET /api/rag/collections - List collections
  - DELETE /api/rag/collections/{name} - Delete collection

- Add VectorStore trait abstraction for pluggable backends
- Add RagConfig to ares.toml with full configuration options
- Update CI workflow to test ares-vector on all platforms
- Update documentation (README, PROJECT_STATUS, CHANGELOG)

Tests: 130 library tests passing
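
As a rough illustration of what a pluggable vector store abstraction can look like, the sketch below defines a small trait and a brute-force in-memory backend. Everything here is hypothetical: the method names, the synchronous signatures, the `SearchHit` and `InMemoryStore` types, and the dot-product scoring are assumptions for illustration and are not the `VectorStore` trait defined in src/db/vectorstore.rs.

```rust
use std::collections::HashMap;

/// One search result (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct SearchHit {
    id: String,
    score: f32,
}

/// Hypothetical backend-agnostic interface; the real trait may differ.
trait VectorStore {
    fn upsert(&mut self, collection: &str, id: &str, vector: Vec<f32>);
    fn search(&self, collection: &str, query: &[f32], top_k: usize) -> Vec<SearchHit>;
    fn list_collections(&self) -> Vec<String>;
    fn delete_collection(&mut self, collection: &str) -> bool;
}

/// Brute-force backend: scans every vector instead of using an HNSW index.
#[derive(Default)]
struct InMemoryStore {
    collections: HashMap<String, HashMap<String, Vec<f32>>>,
}

impl VectorStore for InMemoryStore {
    fn upsert(&mut self, collection: &str, id: &str, vector: Vec<f32>) {
        self.collections
            .entry(collection.to_string())
            .or_default()
            .insert(id.to_string(), vector);
    }

    fn search(&self, collection: &str, query: &[f32], top_k: usize) -> Vec<SearchHit> {
        let Some(points) = self.collections.get(collection) else {
            return Vec::new(); // unknown collection: no hits
        };
        let mut hits: Vec<SearchHit> = points
            .iter()
            .filter(|(_, v)| v.len() == query.len()) // skip mismatched dimensions
            .map(|(id, v)| SearchHit {
                id: id.clone(),
                // Dot-product score; higher means more similar.
                score: v.iter().zip(query).map(|(a, b)| a * b).sum(),
            })
            .collect();
        hits.sort_by(|a, b| b.score.total_cmp(&a.score)); // best first
        hits.truncate(top_k);
        hits
    }

    fn list_collections(&self) -> Vec<String> {
        let mut names: Vec<String> = self.collections.keys().cloned().collect();
        names.sort();
        names
    }

    fn delete_collection(&mut self, collection: &str) -> bool {
        self.collections.remove(collection).is_some()
    }
}
```

Code written against the trait rather than a concrete type can switch between an embedded backend and another implementation without changes at the call sites, which is the point of a pluggable backend.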
@bkataru self-assigned this Jan 12, 2026
Copilot AI review requested due to automatic review settings January 12, 2026 22:36
@bkataru added the documentation and enhancement labels Jan 12, 2026
@bkataru linked an issue Jan 12, 2026 that may be closed by this pull request
Copilot AI left a comment

Pull request overview

This PR implements a comprehensive RAG (Retrieval Augmented Generation) system with a pure-Rust vector database, bringing A.R.E.S to v0.3.0. The implementation adds local-first vector storage with HNSW indexing, multi-model embedding support, multiple search strategies, and reranking capabilities—all without requiring external service dependencies.

Changes:

  • Pure-Rust ares-vector crate with HNSW indexing and memory-mapped persistence
  • Multi-strategy search engine (semantic, BM25, fuzzy, hybrid with RRF fusion)
  • Comprehensive embedding service supporting 38+ models via FastEmbed/Candle
  • Document chunking with word/semantic/character strategies
  • Cross-encoder reranking for improved relevance
  • New RAG API endpoints for ingestion, search, and collection management
  • 130+ new tests including live integration tests with real models
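
For context on the RRF fusion mentioned above: Reciprocal Rank Fusion combines several ranked lists by giving each document a score of 1 / (k + rank) per list and summing across lists. The sketch below shows the standard unweighted form; the function name `rrf_fuse`, the constant k = 60 used in the note, and the tie-breaking rule are assumptions, and the hybrid search in src/rag/search.rs may weight or combine results differently.

```rust
use std::collections::HashMap;

/// Fuse ranked lists of document ids with Reciprocal Rank Fusion.
/// Each list is ordered best-first; `k` dampens the influence of top ranks.
/// Hypothetical sketch, not the PR's implementation.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for ranking in rankings {
        for (rank, id) in ranking.iter().enumerate() {
            // enumerate() is 0-based, so add 1 to get the 1-based rank.
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + rank as f64 + 1.0);
        }
    }

    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(id, score)| (id.to_string(), score))
        .collect();

    // Highest fused score first; ties are broken by id for a stable order.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Fusing a semantic ranking `[d1, d2, d3]` with a BM25 ranking `[d3, d1, d4]` at k = 60 orders the results d1, d3, d2, d4: documents that appear in both lists outrank those that appear in only one. RRF uses ranks rather than raw scores, so it needs no normalization between BM25 and cosine scales.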

Reviewed changes

Copilot reviewed 33 out of 34 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/rag_live_tests.rs | Live integration tests with real embedding models (ignored by default) |
| src/utils/toml_config.rs | Expanded RAG configuration with hybrid weights and reranking options |
| src/types/mod.rs | New RAG API request/response types with OpenAPI schemas |
| src/rag/search.rs | Multi-strategy search engine (BM25, fuzzy, hybrid with RRF) |
| src/rag/reranker.rs | Cross-encoder reranking service with 4 model options |
| src/rag/embeddings.rs | Comprehensive embedding service with 38+ models |
| src/rag/chunker.rs | Text chunking with semantic boundary awareness |
| src/db/vectorstore.rs | Abstract vector store trait with multi-provider support |
| src/db/lancedb.rs | LanceDB vector store implementation (888 lines) |
| src/db/ares_vector.rs | Pure-Rust vector store using ares-vector crate |
| src/api/routes.rs | New RAG API routes for ingestion and search |
| src/api/handlers/rag.rs | RAG endpoint handlers with global service initialization |
| docs/PROJECT_STATUS.md | Updated with Iteration 5 completion summary |
| docs/FUTURE_ENHANCEMENTS.md | Documented deferred features (GPU, caching, protocols) |
| docs/DIR-24_RAG_IMPLEMENTATION_PLAN.md | Comprehensive implementation plan with research findings |
| crates/ares-vector/* | Pure-Rust vector database crate (separate package) |

- Fix rustdoc warnings for Vec<String> by wrapping in backticks
- Fix bare URL warnings by using angle brackets or backticks
- Add empty [workspace] table to ui/Cargo.toml to prevent trunk workspace confusion
@bkataru merged commit 411e638 into main Jan 13, 2026
43 of 44 checks passed

Linked issue: AI-native protocols, vector DBs, RAG, and search strategies