Ariyan-Pro / RAG-Latency-Optimization
CPU-optimized RAG pipeline that cuts latency 2.7× (247 ms → 92 ms). Implements caching, filtering, and quantization for production use. Ships with FastAPI, Docker, benchmarks, and investor materials. The engineering showcase that sells itself.
Topics: docker caching dockerfile sales-engineering sqlite showcase embeddings low-latency production-ready demonstration semantic-search faiss fastapi retrieval-augmented-generation cpu-only rag-optimization ai-ml-performance-tuning becnhmarking
Updated Jan 24, 2026
Python
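
The description names caching as one of the latency techniques but does not show how the repository implements it. As a rough illustration only, the sketch below shows one common form: an LRU cache in front of the query-embedding step, so a repeated query skips the model call. The names `EmbeddingCache` and `fake_embed` are hypothetical and not taken from the repository; `fake_embed` is a deterministic stand-in for a real embedding model.

```python
import hashlib
from collections import OrderedDict


def fake_embed(text):
    """Hypothetical stand-in for a real embedding model (assumption, not the repo's code)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to floats in [0, 1] to mimic a small vector.
    return [b / 255.0 for b in digest[:8]]


class EmbeddingCache:
    """LRU cache for query embeddings, keyed by the normalized query text."""

    def __init__(self, embed_fn, max_size=1024):
        self.embed_fn = embed_fn
        self.max_size = max_size
        self._store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, text):
        # Normalize case and whitespace so trivially different queries share an entry.
        # Note that the normalized text is also what gets embedded.
        key = " ".join(text.lower().split())
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._store[key]
        self.misses += 1
        vector = self.embed_fn(key)
        self._store[key] = vector
        if len(self._store) > self.max_size:
            self._store.popitem(last=False)  # evict the least recently used entry
        return vector
```

Wrapping the embedder as `cache = EmbeddingCache(fake_embed)` and calling `cache.get(query)` on every request means only the first occurrence of a query pays the embedding cost. Whether this matches the repository's caching layer (the topics suggest SQLite may be involved) is not confirmed by the listing.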