Enterprise Knowledge Assistant

A production-ready Enterprise Knowledge Assistant using advanced RAG (Retrieval-Augmented Generation) architecture. The system ingests internal company documents (PDFs, emails, Confluence pages, Google Docs) and provides accurate, cited answers to employee questions.

πŸ—οΈ Architecture

Core Workflow

  1. Query Construction: Natural language → optimized database queries
  2. Query Translation: HyDE, multi-query, decomposition techniques
  3. Routing: Determine optimal retrieval path (vector/relational/graph)
  4. Indexing: Semantic chunking, multi-representation indexing
  5. Retrieval: Vector search + re-ranking + active retrieval
  6. Generation: LLM synthesis with Self-RAG capabilities
  7. Feedback Loop: Quality assessment and iterative improvement
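The stages above can be sketched as a single orchestration function. This is a minimal illustration of the control flow only; the stage functions are hypothetical placeholders injected as callables, not the project's actual modules:

```python
from dataclasses import dataclass, field

@dataclass
class RAGResult:
    answer: str
    citations: list = field(default_factory=list)

def run_pipeline(question, retrieve, generate, max_iterations=2):
    """Orchestrate retrieval and generation with a simple feedback loop."""
    context = retrieve(question)
    result = generate(question, context)
    # Feedback loop: retry retrieval if the answer lacks supporting citations.
    for _ in range(max_iterations - 1):
        if result.citations:
            break
        context = retrieve(question)
        result = generate(question, context)
    return result
```

Injecting `retrieve` and `generate` keeps the loop testable with stubs before any real vector store or LLM is wired in.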

πŸ› οΈ Tech Stack

Backend

  • Framework: FastAPI with Uvicorn ASGI server
  • Runtime: Python 3.12 with UV package manager
  • Data Validation: Pydantic v2 models
  • Database ORM: SQLAlchemy 2.0 (async)
  • Task Queue: Celery with Redis broker
  • Vector DB: Qdrant with HNSW indexing
  • Relational DB: PostgreSQL 15+
  • Cache: Redis

Frontend

  • Framework: Next.js 14 with App Router
  • Styling: Tailwind CSS
  • State Management: Zustand
  • Data Fetching: React Query (TanStack Query)

AI/ML

  • Primary LLM: OpenAI GPT-3.5-Turbo
  • Fallback LLM: GPT-4 for complex queries
  • Embedding Model: SentenceTransformers all-MiniLM-L6-v2 (384-dim)
  • RAG Framework: LangChain for orchestration
  • Observability: LangSmith for tracing/monitoring
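all-MiniLM-L6-v2 maps each text to a 384-dimensional vector, and retrieval ranks chunks by cosine similarity against the query vector. A minimal sketch of that comparison, using toy low-dimensional vectors rather than real model output:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunk_vecs, k=3):
    """Return indices of the k chunks most similar to the query."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]
```

In production this brute-force scan is replaced by Qdrant's HNSW index, which approximates the same ranking at scale.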

🚀 Quick Start

Prerequisites

  • Docker and Docker Compose
  • Python 3.12+
  • Node.js 20+
  • UV package manager (pip install uv)
  • OpenAI API key

1. Clone and Setup

git clone <repository-url>
cd Enterprise-RAG-System

2. Environment Configuration

Copy .env.example to .env and configure:

cp .env.example .env

Edit .env with your settings:

OPENAI_API_KEY=sk-your-key-here
LANGSMITH_API_KEY=ls-your-key-here  # Optional
POSTGRES_URL=postgresql+asyncpg://raguser:ragpass@localhost:5432/ragdb
QDRANT_URL=http://localhost:6333
REDIS_URL=redis://localhost:6379

3. Start Services with Docker Compose

docker-compose up -d

This will start:

  • PostgreSQL (port 5432)
  • Qdrant (ports 6333, 6334)
  • Redis (port 6379)
  • Backend API (port 8000)
  • Celery worker

4. Setup Backend (if running locally)

cd backend
uv pip install -e .

5. Setup Frontend (if running locally)

cd frontend
npm install
npm run dev

Frontend will be available at http://localhost:3000

6. Access the Application

Open http://localhost:3000 for the chat interface. FastAPI serves interactive API docs at http://localhost:8000/docs.

πŸ“ Project Structure

Enterprise-RAG-System/
├── backend/
│   ├── src/
│   │   ├── api/              # FastAPI routes and models
│   │   ├── core/             # Configuration
│   │   ├── services/         # Business logic
│   │   │   ├── document/     # Document processing
│   │   │   ├── embeddings/   # Embedding generation
│   │   │   ├── vector/       # Qdrant operations
│   │   │   ├── retrieval/    # Retrieval logic
│   │   │   ├── generation/   # LLM integration
│   │   │   └── query/        # Query optimization
│   │   ├── database/         # SQLAlchemy models
│   │   └── utils/            # Utilities
│   ├── pyproject.toml
│   └── Dockerfile
├── frontend/
│   ├── app/                  # Next.js app router
│   ├── components/           # React components
│   ├── lib/                  # Utilities and API client
│   └── package.json
├── docker-compose.yml
├── .env.example
└── README.md

🔧 Advanced RAG Features

1. Query Processing

  • HyDE (Hypothetical Document Embeddings): Generate hypothetical answers to improve retrieval
  • Multi-Query Generation: Create 3-5 query variations for better recall
  • Query Decomposition: Break complex questions into sub-queries
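In HyDE, the LLM first drafts a hypothetical answer, and that draft (not the raw question) is embedded and used for vector search. A sketch with the LLM, embedder, and search passed in as callables; the function and parameter names are illustrative, not this project's API:

```python
def hyde_search(question, llm, embed, vector_search, n_results=5):
    """HyDE: embed a hypothetical answer instead of the raw question.

    A hypothetical answer tends to sit closer in embedding space to real
    answer passages than the question itself does, improving recall.
    """
    hypothetical = llm(f"Write a short passage that answers: {question}")
    query_vector = embed(hypothetical)
    return vector_search(query_vector, limit=n_results)
```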

2. Retrieval Enhancement

  • RAG-Fusion: Combine results from multiple query variations
  • Cross-Encoder Re-ranking: Use cross-encoder/ms-marco-MiniLM-L-6-v2 for result refinement
  • Hierarchical Retrieval: Summary → detail retrieval pattern
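RAG-Fusion typically merges the ranked lists produced by each query variation using reciprocal rank fusion (RRF), where a document's fused score is the sum of 1/(k + rank) across every list it appears in. A self-contained sketch; k=60 is the conventional RRF constant, not a project setting:

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge ranked lists of document ids; higher fused score ranks first.

    Documents that appear near the top of several lists accumulate the
    largest scores, rewarding agreement across query variations.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```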

3. Generation Optimization

  • Citation Management: Automatic source attribution
  • Confidence Scoring: Estimate answer reliability
  • Streaming Responses: Real-time answer generation
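Citation management can be as simple as carrying source metadata alongside each retrieved chunk and appending numbered references to the generated answer. A sketch; the chunk dictionary shape here is an assumption, not the project's actual schema:

```python
def attach_citations(answer, chunks):
    """Append numbered source references for the chunks behind an answer."""
    lines = [answer, "", "Sources:"]
    for i, chunk in enumerate(chunks, start=1):
        page = chunk.get("page", "?")  # page may be absent for emails etc.
        lines.append(f"[{i}] {chunk['source']} (p. {page})")
    return "\n".join(lines)
```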

📑 API Endpoints

Chat

  • POST /api/chat - Send a chat message
  • POST /api/chat/stream - Stream chat response

Documents

  • GET /api/documents - List all documents
  • POST /api/documents - Upload a document
  • GET /api/documents/{id} - Get document details
  • DELETE /api/documents/{id} - Delete a document

Health

  • GET /api/health - Health check
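A sketch of constructing a request for the chat endpoint. The `message` field name is an assumption about the request schema; check the live FastAPI docs at /docs for the actual body before relying on it:

```python
import json

API_BASE = "http://localhost:8000"  # backend port from the Docker Compose setup

def build_chat_request(message):
    """Return the URL and JSON body for POST /api/chat.

    The `message` field is hypothetical; verify against the real schema.
    """
    return f"{API_BASE}/api/chat", json.dumps({"message": message})
```

Send the result with any HTTP client, e.g. `requests.post(url, data=body, headers={"Content-Type": "application/json"})`.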

🧪 Development

Running Tests

cd backend
pytest

Code Formatting

cd backend
black src/
ruff check src/

Database Migrations

cd backend
alembic revision --autogenerate -m "description"
alembic upgrade head

📊 Monitoring

  • LangSmith: LLM tracing and monitoring (if configured)
  • Prometheus: System metrics (to be configured)
  • Grafana: Dashboards (to be configured)

🔒 Security

  • Environment variables for sensitive data
  • JWT authentication (to be implemented)
  • CORS configuration
  • Rate limiting (to be implemented)

📈 Performance Targets

  • Latency: < 3 seconds for end-to-end response
  • Accuracy: High answer correctness (RAGAS evaluation)
  • Uptime: 99.9% availability target

🚧 Roadmap

Phase 1: MVP ✅

  • Basic document upload and chunking
  • Simple embedding with SentenceTransformers
  • Qdrant setup and basic vector search
  • FastAPI endpoints for chat and documents
  • Next.js basic chat interface

Phase 2: Core RAG (In Progress)

  • Advanced chunking strategies
  • Query optimization (HyDE implementation)
  • Re-ranking with cross-encoders
  • Improved prompt engineering
  • Basic citation management

Phase 3: Advanced Features

  • Multi-query and RAG-Fusion
  • Self-RAG capabilities
  • Semantic routing
  • Active retrieval mechanisms
  • Comprehensive monitoring

Phase 4: Production Ready

  • Scalability improvements
  • Security hardening
  • Performance optimization
  • Comprehensive testing
  • Deployment automation

πŸ“ License

See LICENSE file for details.

🤝 Contributing

Contributions are welcome! Please open an issue or submit a pull request.

📧 Support

For issues and questions, please open a GitHub issue.

About

A personal, observable RAG system for structured knowledge
