feat(rag): Pure-Rust Vector Store & RAG Pipeline (DIR-24) #3

bkataru · 2026-01-12T22:36:24Z

Summary

Implements a complete RAG (Retrieval Augmented Generation) system with a pure-Rust vector database, eliminating external service dependencies. This release brings A.R.E.S to v0.3.0.

New Crate: `ares-vector`

Pure-Rust embedded vector database with HNSW indexing
Memory-mapped persistence via memmap2
Multiple distance metrics (Cosine, Euclidean, Dot Product)
No external service dependencies (Qdrant/Milvus not required)
Published to crates.io as ares-vector@0.1.1
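
For reviewers unfamiliar with the three metrics listed above, here is a minimal, dependency-free sketch of how they are commonly defined. The `Metric` enum and `distance` function are illustrative names, not the `ares-vector` API, and the "smaller means closer" convention is an assumption made for this example.

```rust
/// Distance metrics named in the PR description.
/// Hypothetical names for illustration; not the ares-vector API.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Metric {
    Cosine,
    Euclidean,
    DotProduct,
}

/// Plain dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Returns a score where smaller means closer, for every metric.
/// (Assumed convention: cosine becomes 1 - similarity, dot product is negated.)
pub fn distance(metric: Metric, a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    match metric {
        Metric::Cosine => {
            let norm = dot(a, a).sqrt() * dot(b, b).sqrt();
            // A zero vector has no direction; treat it as maximally dissimilar.
            if norm == 0.0 {
                1.0
            } else {
                1.0 - dot(a, b) / norm
            }
        }
        Metric::Euclidean => a
            .iter()
            .zip(b)
            .map(|(x, y)| (x - y).powi(2))
            .sum::<f32>()
            .sqrt(),
        Metric::DotProduct => -dot(a, b),
    }
}
```

Cosine ignores vector magnitude, so it suits embeddings that are not normalized; for unit-length embeddings, ranking by cosine and by dot product gives the same order.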

RAG Pipeline

Embeddings: Multi-model support (BGE, MiniLM, Nomic, Qwen3, GTE) via FastEmbed/Candle
Chunking: Word, character, and semantic strategies with configurable overlap
Search: Semantic, BM25, fuzzy, and hybrid strategies
Reranking: Cross-encoder reranking (MiniLM-L6-v2, BGE)
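
To make "word chunking with configurable overlap" concrete, the sketch below shows one common way to do it with a sliding window. `chunk_words` and its parameters are hypothetical names chosen for this example; the chunker in this PR may differ in API and in details such as boundary handling.

```rust
/// Splits `text` into chunks of at most `chunk_size` words. Each chunk after
/// the first repeats the last `overlap` words of the previous chunk.
/// Hypothetical helper for illustration; not the chunker shipped in this PR.
pub fn chunk_words(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    assert!(overlap < chunk_size, "overlap must be smaller than chunk_size");

    let words: Vec<&str> = text.split_whitespace().collect();
    // The window advances by this many words each iteration.
    let step = chunk_size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;

    while start < words.len() {
        let end = (start + chunk_size).min(words.len());
        chunks.push(words[start..end].join(" "));
        // Stop once the window has reached the final word, so no trailing
        // chunk is emitted that consists only of overlap.
        if end == words.len() {
            break;
        }
        start += step;
    }
    chunks
}
```

With `chunk_size = 4` and `overlap = 2`, the text `a b c d e f g` yields `a b c d`, `c d e f`, and `e f g`. The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.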

New API Endpoints

Endpoint	Method	Description
`/api/rag/ingest`	POST	Document ingestion with chunking
`/api/rag/search`	POST	Multi-strategy search
`/api/rag/collections`	GET	List collections
`/api/rag/collections/{name}`	DELETE	Delete collection

Configuration

New [rag] section in ares.toml with full configuration options.

Breaking Changes

None. All new functionality is additive and gated behind the `ares-vector` feature flag.

Documentation

Updated README with RAG section and feature flags
Added docs/DIR-24_RAG_IMPLEMENTATION_PLAN.md
Added docs/FUTURE_ENHANCEMENTS.md for deferred features
Updated PROJECT_STATUS.md with Iteration 5

Test Coverage

130 library tests passing
CI updated to test ares-vector on all platforms (Ubuntu, Windows, macOS)

What's Deferred

AI-native protocols (ACP, AG-UI, ANP, A2A) - pending standardization
GPU acceleration for embeddings
Embedding cache

Closes DIR-24

- Add ares-vector crate: pure-Rust vector DB with HNSW indexing
  - Memory-mapped persistence via memmap2
  - Multiple distance metrics (Cosine, Euclidean, Dot Product)
  - Thread-safe with parking_lot RwLocks
  - No external dependencies (Qdrant/Milvus not required)
- Add comprehensive RAG pipeline:
  - Document ingestion with chunking (word/semantic/character)
  - Multi-model embedding service (BGE, MiniLM, Nomic, Qwen3, GTE)
  - Multi-strategy search (semantic, BM25, fuzzy, hybrid)
  - Cross-encoder reranking (MiniLM-L6-v2, BGE Reranker)
- Add RAG API endpoints:
  - POST /api/rag/ingest - Document ingestion
  - POST /api/rag/search - Multi-strategy search
  - GET /api/rag/collections - List collections
  - DELETE /api/rag/collections/{name} - Delete collection
- Add VectorStore trait abstraction for pluggable backends
- Add RagConfig to ares.toml with full configuration options
- Update CI workflow to test ares-vector on all platforms
- Update documentation (README, PROJECT_STATUS, CHANGELOG)

Tests: 130 library tests passing

Copilot

Pull request overview

This PR implements a comprehensive RAG (Retrieval Augmented Generation) system with a pure-Rust vector database, bringing A.R.E.S to v0.3.0. The implementation adds local-first vector storage with HNSW indexing, multi-model embedding support, multiple search strategies, and reranking capabilities—all without requiring external service dependencies.

Changes:

Pure-Rust ares-vector crate with HNSW indexing and memory-mapped persistence
Multi-strategy search engine (semantic, BM25, fuzzy, hybrid with RRF fusion)
Comprehensive embedding service supporting 38+ models via FastEmbed/Candle
Document chunking with word/semantic/character strategies
Cross-encoder reranking for improved relevance
New RAG API endpoints for ingestion, search, and collection management
130+ new tests including live integration tests with real models
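
The hybrid strategy above is described as using RRF (Reciprocal Rank Fusion). As a rough illustration of the technique, and not of the code in `src/rag/search.rs`, the sketch below fuses ranked ID lists with the standard formula score(d) = Σ 1 / (k + rank). The function name `rrf_fuse` is hypothetical, and the constant k (often set to 60 in the literature) is passed in as a parameter because the value used by this PR is not stated.

```rust
use std::collections::HashMap;

/// Fuses several ranked lists of document IDs with Reciprocal Rank Fusion.
/// Each list contributes 1 / (k + rank) per document, with ranks starting at 1.
/// Hypothetical helper for illustration; not the API added in this PR.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            // `i` is zero-based, so the rank is `i + 1`.
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + (i + 1) as f64);
        }
    }

    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(id, score)| (id.to_string(), score))
        .collect();
    // Highest score first; ties are broken by ID so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

RRF only looks at ranks, so it can combine BM25 scores and cosine similarities without normalizing them to a common scale. A document that ranks reasonably well in both lists tends to beat one that tops a single list and is absent from the other.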

Reviewed changes

Copilot reviewed 33 out of 34 changed files in this pull request and generated no comments.

File	Description
`tests/rag_live_tests.rs`	Live integration tests with real embedding models (ignored by default)
`src/utils/toml_config.rs`	Expanded RAG configuration with hybrid weights and reranking options
`src/types/mod.rs`	New RAG API request/response types with OpenAPI schemas
`src/rag/search.rs`	Multi-strategy search engine (BM25, fuzzy, hybrid with RRF)
`src/rag/reranker.rs`	Cross-encoder reranking service with 4 model options
`src/rag/embeddings.rs`	Comprehensive embedding service with 38+ models
`src/rag/chunker.rs`	Text chunking with semantic boundary awareness
`src/db/vectorstore.rs`	Abstract vector store trait with multi-provider support
`src/db/lancedb.rs`	LanceDB vector store implementation (888 lines)
`src/db/ares_vector.rs`	Pure-Rust vector store using ares-vector crate
`src/api/routes.rs`	New RAG API routes for ingestion and search
`src/api/handlers/rag.rs`	RAG endpoint handlers with global service initialization
`docs/PROJECT_STATUS.md`	Updated with Iteration 5 completion summary
`docs/FUTURE_ENHANCEMENTS.md`	Documented deferred features (GPU, caching, protocols)
`docs/DIR-24_RAG_IMPLEMENTATION_PLAN.md`	Comprehensive implementation plan with research findings
`crates/ares-vector/*`	Pure-Rust vector database crate (separate package)
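
The table lists an abstract vector store trait with several backends behind it. The sketch below illustrates that general pattern with a deliberately simplified, synchronous trait and a brute-force in-memory backend. Every name here (`VectorStore` methods, `InMemoryStore`, `cosine_similarity`) is an assumption made for the example; the real trait in `src/db/vectorstore.rs` is not shown in this PR page and may well be async with different signatures.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a pluggable vector store trait.
pub trait VectorStore {
    /// Inserts or replaces the vector stored under `id` in `collection`.
    fn upsert(&mut self, collection: &str, id: &str, vector: Vec<f32>);
    /// Returns up to `top_k` (id, cosine similarity) pairs, best match first.
    fn search(&self, collection: &str, query: &[f32], top_k: usize) -> Vec<(String, f32)>;
    /// Returns collection names in sorted order.
    fn list_collections(&self) -> Vec<String>;
    /// Returns true if the collection existed and was removed.
    fn delete_collection(&mut self, collection: &str) -> bool;
}

/// Brute-force backend: scans every vector on each search (no HNSW index).
#[derive(Default)]
pub struct InMemoryStore {
    collections: HashMap<String, HashMap<String, Vec<f32>>>,
}

fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = a.iter().map(|x| x * x).sum::<f32>().sqrt()
        * b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        0.0
    } else {
        dot / norm
    }
}

impl VectorStore for InMemoryStore {
    fn upsert(&mut self, collection: &str, id: &str, vector: Vec<f32>) {
        self.collections
            .entry(collection.to_string())
            .or_default()
            .insert(id.to_string(), vector);
    }

    fn search(&self, collection: &str, query: &[f32], top_k: usize) -> Vec<(String, f32)> {
        // An unknown collection yields no hits rather than an error.
        let Some(points) = self.collections.get(collection) else {
            return Vec::new();
        };
        let mut hits: Vec<(String, f32)> = points
            .iter()
            .map(|(id, v)| (id.clone(), cosine_similarity(query, v)))
            .collect();
        hits.sort_by(|a, b| b.1.total_cmp(&a.1));
        hits.truncate(top_k);
        hits
    }

    fn list_collections(&self) -> Vec<String> {
        let mut names: Vec<String> = self.collections.keys().cloned().collect();
        names.sort();
        names
    }

    fn delete_collection(&mut self, collection: &str) -> bool {
        self.collections.remove(collection).is_some()
    }
}
```

Callers that depend only on the trait can swap this backend for an indexed one without changing call sites, which is the point of the abstraction; the methods loosely mirror the ingest, search, list, and delete endpoints listed in the PR description.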

bkataru added 5 commits January 13, 2026 03:53

chore: bump version to 0.3.0

8fc7f2a

fix: remove unused imports in ares-vector

084abd8

fix: add version to ares-vector workspace dependency for crates.io

43fcc15

fix: add AsyncReadExt/AsyncWriteExt imports in bincode_persistence module

e8dc5fb

bkataru self-assigned this Jan 12, 2026

Copilot AI review requested due to automatic review settings January 12, 2026 22:36

bkataru added the documentation and enhancement labels Jan 12, 2026

bkataru linked an issue Jan 12, 2026 that may be closed by this pull request

AI-native protocols, vector DBs, RAG, and search strategies #2

Closed

bkataru requested a review from suprabhatrapolu January 12, 2026 22:39

Copilot started reviewing on behalf of bkataru January 12, 2026 22:41

Copilot AI reviewed Jan 12, 2026

bkataru added 3 commits January 13, 2026 04:17

fix(ci): use CI-safe features to avoid CUDA/protoc issues

b7db2e3

fix(ci): resolve CUDA/clippy/format issues for CI

d27ae49

fix(ci): resolve documentation and UI build failures

d920f20

- Fix rustdoc warnings for Vec<String> by wrapping in backticks
- Fix bare URL warnings by using angle brackets or backticks
- Add empty [workspace] table to ui/Cargo.toml to prevent trunk workspace confusion

bkataru merged commit 411e638 into main Jan 13, 2026
43 of 44 checks passed
