π A Learning-Focused RAG Framework in Rust
β οΈ Learning Project Disclaimer: Cheungfun is a personal learning project designed to explore and practice RAG (Retrieval-Augmented Generation) architecture design in Rust. While functionally complete, it's still in development and not recommended for production use.π Learning Goals:
- Deep dive into Rust's advanced features and best practices
- Explore RAG system architecture design and implementation patterns
- Practice LlamaIndex design philosophy and interface patterns
- Provide learning and reference code examples
Cheungfun is a high-performance RAG (Retrieval-Augmented Generation) framework built in Rust, inspired by LlamaIndex. It features modular design, streaming processing architecture, and performance optimization capabilities, primarily designed for learning and exploring modern RAG system implementations.
- π Performance Exploration: Exploring Rust's zero-cost abstractions and memory safety features
- SIMD-accelerated vector operations experiments
- HNSW approximate nearest neighbor search implementation
- Memory management optimization practices
- π§ Modular Design: Learning separation of concerns and scalable architecture design
- π Streaming Processing: Experimenting with large-scale streaming indexing and querying
- π» Advanced Code Indexing: Tree-sitter AST-based code processing
- Extract functions, classes, imports, comments, and complexity metrics
- Code-aware splitting that maintains syntax boundaries
- Support for Rust, Python, JavaScript, TypeScript, Java, C#, C/C++, Go
- π‘οΈ Type Safety: Leveraging Rust's type system for runtime safety guarantees
- π Unified Interface: Adopting LlamaIndex's Transform interface design pattern
- β‘ Async-First: High-performance async operations built on tokio
- π Learning-Oriented: Complete examples and documentation for learning reference
Performance optimization achievements during the learning process:
| Feature | Performance | Learning Outcome |
|---|---|---|
| SIMD Vector Operations | 30.17x speedup | Learned SIMD optimization techniques |
| Vector Search (HNSW) | 378+ QPS | Understanding of approximate nearest neighbor algorithms |
| Memory Optimization | Significant improvement | Mastered Rust memory management |
| Indexing Throughput | Streaming processing | Practiced async programming patterns |
Note: These data come from learning experiments and do not represent production environment performance guarantees.
cheungfun/
βββ cheungfun-core/ # Core traits and data structures
βββ cheungfun-indexing/ # Data loading and indexing with unified Transform interface
βββ cheungfun-query/ # Query processing and response generation
βββ cheungfun-agents/ # Intelligent agents and tool calling (MCP integration)
βββ cheungfun-integrations/ # External service integrations (FastEmbed, Qdrant, etc.)
βββ cheungfun-multimodal/ # Multi-modal processing (text, images, audio, video)
βββ examples/ # Learning examples and usage demonstrations
Recently completed a major architectural refactoring, adopting a unified Transform interface consistent with LlamaIndex:
- Unified Interface: All processing components implement the same
Transformtrait - Type Safety: Using
TransformInputenum for compile-time type checking - Pipeline Simplification: Unified processing flow, easier to compose and extend
Add to your Cargo.toml:
[dependencies]
cheungfun = "0.1.0"
siumai = "0.9.1" # LLM integration
tokio = { version = "1.0", features = ["full"] }Choose features suitable for learning:
# Default: stable and secure
cheungfun = "0.1.0"
# Learning experiments (includes all features)
cheungfun = { version = "0.1.0", features = ["performance"] }
# Full feature set
cheungfun = { version = "0.1.0", features = ["full"] }use cheungfun::prelude::*;
use cheungfun_core::traits::{Transform, TransformInput};
use siumai::prelude::*;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// 1. Configure embedding model
let embedder = SiumaiEmbedder::new("openai", "text-embedding-3-small", "your-api-key").await?;
// 2. Set up vector storage
let vector_store = InMemoryVectorStore::new(384, DistanceMetric::Cosine);
// 3. Build indexing pipeline with unified interface
let indexing_pipeline = DefaultIndexingPipeline::builder()
.with_loader(Arc::new(DirectoryLoader::new("./docs")?))
.with_transformer(Arc::new(SentenceSplitter::from_defaults(1000, 200)?)) // Unified interface
.with_transformer(Arc::new(MetadataExtractor::new())) // Unified interface
.build()?;
// 4. Run indexing
let stats = indexing_pipeline.run().await?;
println!("Indexing complete: {} documents, {} nodes", stats.documents_processed, stats.nodes_created);
// 5. Configure LLM client
let llm_client = Siumai::builder()
.openai()
.api_key("your-api-key")
.model("gpt-4")
.build()
.await?;
// 6. Build query engine
let query_engine = DefaultQueryPipeline::builder()
.with_retriever(Arc::new(VectorRetriever::new(vector_store, embedder)))
.with_synthesizer(Arc::new(SimpleResponseSynthesizer::new(llm_client)))
.build()?;
// 7. Execute query
let response = query_engine.query("What is the main content of the documents?").await?;
println!("Answer: {}", response.content);
Ok(())
}use cheungfun_core::traits::{Transform, TransformInput};
use cheungfun_indexing::node_parser::text::SentenceSplitter;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Create document splitter
let splitter = SentenceSplitter::from_defaults(300, 75)?;
// Use unified interface to process documents
let input = TransformInput::Documents(documents);
let nodes = splitter.transform(input).await?;
// Polymorphic processing example
let transforms: Vec<Box<dyn Transform>> = vec![
Box::new(SentenceSplitter::from_defaults(200, 40)?),
Box::new(TokenTextSplitter::from_defaults(180, 35)?),
];
for transform in transforms {
let nodes = transform.transform(input.clone()).await?;
println!("Transform {}: {} nodes", transform.name(), nodes.len());
}
Ok(())
}use cheungfun::prelude::*;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Use HNSW for high-performance vector search
let vector_store = HnswVectorStore::new(384, DistanceMetric::Cosine);
vector_store.initialize_index(10000)?; // Pre-allocate for 10k vectors
// Use optimized memory store for better performance
let optimized_store = OptimizedInMemoryVectorStore::new(384, DistanceMetric::Cosine);
// SIMD-accelerated vector operations
#[cfg(feature = "simd")]
{
let simd_ops = SimdVectorOps::new();
if simd_ops.is_simd_available() {
println!("SIMD acceleration enabled: {}", simd_ops.get_capabilities());
}
}
Ok(())
}Cheungfun provides granular control over features and dependencies:
| Feature | Description | Use Case |
|---|---|---|
default |
Stable, optimized memory operations | Development, testing |
simd |
SIMD-accelerated vector operations | High-performance computing |
hnsw |
HNSW approximate nearest neighbor | Large-scale vector search |
performance |
All performance optimizations | Production deployments |
candle |
Candle ML framework integration | Local embeddings |
qdrant |
Qdrant vector database | Distributed vector storage |
fastembed |
FastEmbed integration | Quick embedding setup |
full |
All features enabled | Maximum functionality |
- Architecture Guide - System design and development guide
- Performance Report - Detailed benchmarks and optimizations
- API Documentation - Complete API reference
- Examples - Practical usage examples
- β Core Foundation: Complete modular architecture with 6 main crates
- β Performance Optimization: SIMD acceleration, HNSW search, memory optimization
- β Advanced Indexing: Tree-sitter AST parsing for 9+ programming languages
- β Query Processing: Hybrid search, query transformations, reranking algorithms
- β Agent Framework: Complete MCP integration with workflow orchestration
- β Multimodal Support: Text, image, audio, video processing capabilities
- β External Integrations: FastEmbed, Qdrant, API embedders with caching
- β Unified Interface: LlamaIndex-style Transform interface for all components
We welcome contributions of all kinds! Please see our Contributing Guide for details.
# Clone the repository
git clone https://github.com/YumchaLabs/cheungfun.git
cd cheungfun
# Build with default features
cargo build
# Build with performance features
cargo build --features performance
# Run tests
cargo test
# Run performance benchmarks
cargo test --features performance --test performance_integration_test
# Run examples
cargo run --example basic_usage# Test SIMD acceleration
cargo test --features simd test_simd_performance -- --nocapture
# Test vector store performance
cargo test --features "hnsw,simd" test_vector_store_performance -- --nocapture
# Full performance suite
cargo test --features performance --test performance_integration_test -- --nocaptureThis project is dual-licensed under:
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT License (LICENSE-MIT)
- LlamaIndex - Inspiration for the design philosophy
- Swiftide - Reference implementation in Rust ecosystem
- Siumai - Unified LLM interface library
- SimSIMD - High-performance SIMD operations
- HNSW-RS - Rust HNSW implementation
- GitHub Issues: Bug reports and feature requests
- Discussions: Community discussions
- Documentation: API docs and guides
Cheungfun is a personal learning project, primarily used for:
- π¦ Learning Rust: Exploring Rust's advanced features and best practices
- ποΈ Architecture Design: Practicing modern RAG system architectural patterns
- π Knowledge Sharing: Providing learning and reference code examples
- π¬ Technical Experimentation: Trying new algorithms and optimization techniques
While functionally complete, it's not recommended for production use. If you're interested in RAG systems and Rust development, welcome to learn and reference!
Made with β€οΈ for learning and exploration