🚀 IA Workspace - Enterprise RAG System

Enterprise-grade RAG (Retrieval-Augmented Generation) system with specialized AI agents, built on FastAPI, LangChain, Qdrant, and React.

A production-ready AI workspace featuring intelligent document processing, semantic search, and four specialized agents for research, analysis, writing, and code generation.

📸 Screenshots

Dashboard

Real-time metrics and analytics overview

Document Management

Multi-format document upload and processing

Interactive Chat

Chat interface with specialized AI agents

Agent Catalog

Specialized agents with custom tools

Analytics

Detailed metrics and performance monitoring

✨ Features

🎯 Core Capabilities

📄 Intelligent Document Processing
- Multi-format support (PDF, DOCX, TXT, MD, CSV)
- Smart chunking strategies (recursive, semantic, markdown-aware, code-aware)
- Automatic metadata extraction and preservation
- Asynchronous processing with progress tracking
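As a rough illustration of the "recursive" chunking strategy listed above, the sketch below splits text on the coarsest separator available and only falls back to finer separators for pieces that are still too long. The function name `recursive_chunk`, the separator order, and the `max_chars` default are assumptions made for this example; the project's actual chunker (likely built on LangChain's text splitters) may behave differently.

```python
from typing import List, Sequence


def recursive_chunk(
    text: str,
    max_chars: int = 200,
    separators: Sequence[str] = ("\n\n", "\n", ". ", " "),
) -> List[str]:
    """Split text into chunks of at most max_chars characters.

    Hypothetical helper: tries paragraph breaks first, then lines,
    then sentences, then words, and finally fixed-width slices.
    """
    text = text.strip()
    if len(text) <= max_chars:
        return [text] if text else []

    for index, sep in enumerate(separators):
        if sep not in text:
            continue
        chunks: List[str] = []
        current = ""
        for piece in text.split(sep):
            candidate = f"{current}{sep}{piece}" if current else piece
            if len(candidate) <= max_chars:
                # Keep packing pieces into the current chunk while it fits.
                current = candidate
                continue
            if current:
                chunks.append(current)
            if len(piece) > max_chars:
                # The piece alone is too long: retry with finer separators.
                chunks.extend(
                    recursive_chunk(piece, max_chars, separators[index + 1:])
                )
                current = ""
            else:
                current = piece
        if current:
            chunks.append(current)
        # Note: the separator is dropped at chunk boundaries.
        return [c.strip() for c in chunks if c.strip()]

    # No separator matched: fall back to fixed-width slices.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

Splitting coarse-to-fine keeps paragraphs and sentences intact where possible, which tends to give embeddings more coherent context than fixed-width slicing alone.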

🤖 Specialized AI Agents
- Research Agent - Expert at finding and synthesizing information
- Analysis Agent - Data analysis and insight generation
- Writer Agent - Professional content creation
- Code Agent - Code generation and technical documentation

🔍 Advanced Search
- Semantic search with Qdrant vector database
- Hybrid search capabilities (dense + sparse)
- Metadata filtering and reranking
- Multi-query search strategies
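This README does not say how dense and sparse results are merged for hybrid search. One common option is reciprocal rank fusion (RRF), sketched below under that assumption. The function name, the document IDs, and the constant `k = 60` are illustrative and not taken from the project.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in;
    documents ranked highly by several retrievers rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a dense and a sparse retriever.
dense_hits = ["a", "b", "c"]
sparse_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

RRF uses only ranks, so it needs no score normalization between the dense and sparse retrievers.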

💬 Interactive Chat Interface
- Real-time streaming responses
- WebSocket support for live updates
- Agent selection for specialized tasks
- Markdown rendering with syntax highlighting

📊 Analytics & Monitoring
- Real-time performance metrics
- Usage tracking and statistics
- Structured logging with context
- Health checks and observability

🔌 Model Context Protocol (MCP)
- Standard protocol for AI tool integration
- 5 specialized tools exposed via MCP
- Easy integration with external systems

🏗️ Architecture

┌─────────────────────────────────────────────────────────────┐
│                        Frontend (React)                      │
│  Dashboard | Documents | Chat | Agents | Analytics          │
└─────────────────┬───────────────────────────────────────────┘
                  │
                  │ REST API / WebSocket
                  ▼
┌─────────────────────────────────────────────────────────────┐
│                     Backend (FastAPI)                        │
│  ┌────────────┐  ┌────────────┐  ┌─────────────────────┐  │
│  │  Document  │  │   Agent    │  │   MCP Server        │  │
│  │ Processor  │  │Orchestrator│  │   (Tools)           │  │
│  └────────────┘  └────────────┘  └─────────────────────┘  │
└─────────┬───────────────┬───────────────────────────────────┘
          │               │
          │               │
  ┌───────▼──────┐  ┌────▼──────────────────────────────┐
  │   Qdrant     │  │    OpenAI API                     │
  │  (Vectors)   │  │  (Embeddings & Agents)            │
  └──────────────┘  └───────────────────────────────────┘

Tech Stack

Backend

FastAPI - Modern, high-performance web framework
LangChain - LLM orchestration and agent framework
Qdrant - High-performance vector database
Pydantic - Data validation and settings management
Structlog - Structured logging for observability
PyPDF2, python-docx - Document parsing

Frontend

React 18 - Modern UI library
TypeScript - Type-safe JavaScript
Tailwind CSS - Utility-first CSS framework
Zustand - Lightweight state management
React Query - Server state management
React Markdown - Markdown rendering

Infrastructure

Docker & Docker Compose - Containerization
Nginx - Reverse proxy and static file serving
Redis (optional) - Caching layer

🚀 Quick Start

Prerequisites

Python 3.11+
Node.js 18+
Docker & Docker Compose
OpenAI API Key

Installation

1. Clone the repository

git clone git@github.com:kaninstein/workspaceai.git
cd workspaceai

2. Configure environment variables

cp .env.example .env

Edit .env and add your credentials:

# Required
OPENAI_API_KEY=sk-your-openai-key-here
SECRET_KEY=your-secret-key-for-jwt

# Optional (defaults provided)
QDRANT_HOST=localhost
QDRANT_PORT=6333
DATABASE_URL=sqlite:///./ia_workspace.db
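
To show how these variables might be consumed, here is a minimal standard-library sketch that fails fast when a required value is missing and applies the defaults listed above. The `Settings` class and `load_settings` function are hypothetical names for this example; the backend itself uses Pydantic for settings management, so its real loader will look different.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container mirroring the .env keys above."""
    openai_api_key: str
    secret_key: str
    qdrant_host: str = "localhost"
    qdrant_port: int = 6333
    database_url: str = "sqlite:///./ia_workspace.db"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, validating required keys."""
    required = ("OPENAI_API_KEY", "SECRET_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        secret_key=env["SECRET_KEY"],
        qdrant_host=env.get("QDRANT_HOST", "localhost"),
        qdrant_port=int(env.get("QDRANT_PORT", "6333")),
        database_url=env.get("DATABASE_URL", "sqlite:///./ia_workspace.db"),
    )
```

Passing the environment as a parameter keeps the loader easy to test with a plain dictionary.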

3. Start with Docker Compose (Recommended)

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down

4. Or run locally for development

Backend:

cd backend

# Install dependencies
pip install -r requirements.txt

# Start Qdrant (in a separate terminal)
docker run -p 6333:6333 qdrant/qdrant

# Run backend
uvicorn app.main:app --reload --port 8000

Frontend:

cd frontend

# Install dependencies
npm install

# Run development server
npm run dev

5. Access the application

Frontend: http://localhost:5173
API Docs: http://localhost:8000/docs
Qdrant Dashboard: http://localhost:6333/dashboard

💼 Real-World Use Cases

This system solves real business problems across multiple industries. Here's how companies can leverage it:

🏢 1. Enterprise Customer Support

Problem: Support teams waste time searching through documentation, SOPs, and past tickets.

Solution:

Documents: Upload KB articles, product manuals, troubleshooting guides, FAQs
Agent: Research Agent
Query: "How do I reset a user's password in the enterprise dashboard?"
Result: Instant answers with citations from indexed documentation

Business Impact:

70% faster ticket resolution
Consistent answers across support team
New hires productive in days, not weeks

⚖️ 2. Legal Document Analysis

Problem: Law firms spend hours reviewing contracts, case files, and legal precedents.

Solution:

Documents: Upload contracts, case law, compliance documents
Agent: Analysis Agent
Query: "Identify all liability clauses in these vendor contracts and compare terms"
Result: Structured analysis with risk assessment and comparisons

Business Impact:

80% reduction in document review time
Identify hidden risks automatically
Scale legal review without hiring more associates

🔬 3. Research & Development

Problem: R&D teams struggle to synthesize insights from research papers, patents, and test results.

Solution:

Documents: Upload research papers, patents, lab reports, technical specs
Agent: Research Agent
Query: "What battery technologies show promise for EVs based on recent papers?"
Result: Synthesized insights from multiple sources with references

Business Impact:

Accelerate literature review from weeks to hours
Surface hidden connections across research
Make data-driven R&D decisions

💻 4. Internal Developer Documentation

Problem: Engineers waste time searching wikis, Confluence pages, and outdated docs.

Solution:

Documents: Upload API docs, architecture diagrams, runbooks, README files
Agent: Code Agent
Query: "How do I implement OAuth2 authentication in our microservices?"
Result: Step-by-step guide with code examples from your own docs

Business Impact:

50% reduction in "How do I...?" questions
Onboard developers 3x faster
Keep tribal knowledge accessible

📊 5. Financial Analysis & Compliance

Problem: Finance teams manually review hundreds of invoices, reports, and regulatory documents.

Solution:

Documents: Upload financial reports, invoices, audit trails, regulations
Agent: Analysis Agent
Query: "Analyze Q4 expense reports and flag anomalies over $10K"
Result: Automated analysis with anomaly detection and compliance checks

Business Impact:

Detect fraud and errors automatically
Ensure regulatory compliance
Close books 60% faster

👥 6. HR & Recruitment

Problem: HR teams manually screen resumes and answer repetitive policy questions.

Solution:

Documents: Upload resumes, job descriptions, company policies, benefits docs
Agent: Research + Analysis Agents
Query: "Find candidates with 5+ years Python and AWS experience"
Result: Ranked candidates with match scores and reasoning

Business Impact:

Screen 100 resumes in minutes
Reduce bias in initial screening
Instant answers to employee policy questions

✍️ 7. Content Marketing & SEO

Problem: Marketing teams struggle to maintain brand consistency and optimize content.

Solution:

Documents: Upload brand guidelines, competitor content, keyword research, past campaigns
Agent: Writer Agent
Query: "Write a blog post about AI automation following our brand voice"
Result: On-brand content using insights from uploaded materials

Business Impact:

10x content production speed
Consistent brand voice across channels
SEO-optimized content using your own data

🏥 8. Healthcare & Medical Records

Problem: Doctors spend hours reviewing patient histories and medical literature.

Solution:

Documents: Upload patient records, medical journals, treatment protocols
Agent: Research Agent
Query: "Summarize this patient's cardiac history and recommend screening tests"
Result: Comprehensive summary with evidence-based recommendations

Business Impact:

More time with patients, less with paperwork
Evidence-based treatment decisions
Reduce medical errors

🏭 9. Manufacturing & Quality Control

Problem: QA teams manually inspect reports and cross-reference specifications.

Solution:

Documents: Upload quality reports, specifications, inspection logs, SOPs
Agent: Analysis Agent
Query: "Identify defect patterns in last month's production reports"
Result: Root cause analysis with trend identification

Business Impact:

Predict quality issues before they escalate
Reduce defect rates by 40%
Optimize production processes

📚 10. Education & Training

Problem: Trainers manually create materials and answer repetitive questions.

Solution:

Documents: Upload textbooks, course materials, assessments, student questions
Agent: Writer + Research Agents
Query: "Create a quiz on Chapter 5 with explanations for answers"
Result: Automated assessment generation with detailed feedback

Business Impact:

Personalized learning at scale
Instant answers to student questions 24/7
Reduce trainer workload by 50%

📖 Quick Start Guide

Basic Workflow

1. Upload Documents
- Navigate to the Documents page
- Drag and drop files (PDF, DOCX, TXT, MD, CSV - max 10MB)
- Documents are automatically processed and indexed

2. Chat with Agents
- Go to the Chat page
- Select the right agent for your task:
  - Research - Finding and synthesizing information
  - Analysis - Data analysis and insights
  - Writer - Content creation and summarization
  - Code - Technical documentation and code help
- Ask questions and get contextual answers from your documents

3. Monitor Performance
- Visit the Analytics page to track:
  - Document count and storage usage
  - Query statistics and response times
  - Agent performance metrics

🏗️ Project Structure

workspaceai/
├── backend/
│   ├── app/
│   │   ├── agents/              # AI agent implementations
│   │   ├── api/routes/          # API endpoints
│   │   ├── core/                # Config, logging, security
│   │   ├── mcp/                 # MCP server
│   │   ├── models/              # Pydantic schemas
│   │   ├── services/            # Business logic
│   │   └── utils/               # Utilities
│   ├── tests/                   # Unit tests
│   ├── requirements.txt
│   └── Dockerfile
├── frontend/
│   ├── src/
│   │   ├── components/          # React components
│   │   ├── pages/               # Page components
│   │   ├── lib/                 # API client, utilities
│   │   ├── types/               # TypeScript types
│   │   └── App.tsx
│   ├── package.json
│   └── Dockerfile
├── docs/
│   └── screenshots/             # Application screenshots
├── docker-compose.yml
├── .env.example
├── Makefile
└── README.md

🔧 API Examples

Upload Document

curl -X POST "http://localhost:8000/api/v1/documents/upload" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@document.pdf"
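
For scripts that cannot shell out to curl, the same upload can be approximated with the Python standard library. The sketch below builds the multipart body by hand. The helper name `build_upload_request` and the generic `application/octet-stream` part type are assumptions for this example; the endpoint path and the `file` field name come from the curl command above.

```python
import uuid
from urllib.request import Request


def build_upload_request(
    filename: str,
    content: bytes,
    base_url: str = "http://localhost:8000",
) -> Request:
    """Build a multipart/form-data POST carrying one file in the 'file' field."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (
            'Content-Disposition: form-data; name="file"; '
            f'filename="{filename}"\r\n'
        ).encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return Request(
        f"{base_url}/api/v1/documents/upload",
        data=body,
        # The boundary must be declared in the header so the server can parse the body.
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


# Building the request does not touch the network; pass it to
# urllib.request.urlopen() once the backend is running.
request = build_upload_request("document.pdf", b"%PDF-1.4 example bytes")
```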

Chat Request

curl -X POST "http://localhost:8000/api/v1/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Summarize the key points from the uploaded documents",
    "agent_type": "research"
  }'
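
The same chat call can be made from Python. This is a sketch using only the standard library. `build_chat_request` and `send` are hypothetical helper names, and treating the response as a single JSON document is an assumption, since the chat interface also supports streaming. The `research` and `analysis` agent identifiers appear in the examples in this README; `writer` and `code` are guesses based on the agent names.

```python
import json
from typing import Any, Dict
from urllib.request import Request, urlopen


def build_chat_request(
    message: str,
    agent_type: str = "research",
    base_url: str = "http://localhost:8000",
) -> Request:
    """Build the JSON POST request shown in the curl example above."""
    payload = json.dumps({"message": message, "agent_type": agent_type})
    return Request(
        f"{base_url}/api/v1/chat",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: Request, timeout: float = 30.0) -> Dict[str, Any]:
    """Send the request and decode the body as JSON (assumed response format)."""
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


chat_request = build_chat_request(
    "Summarize the key points from the uploaded documents"
)
# With the backend running: reply = send(chat_request)
```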

Invoke Specific Agent

curl -X POST "http://localhost:8000/api/v1/agents/analysis/invoke" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Analyze the data patterns in the CSV files",
    "context": {}
  }'

Get Metrics

curl -X GET "http://localhost:8000/api/v1/analytics/metrics"

🧪 Testing

Backend Tests

cd backend

# Run all tests
pytest tests/ -v

# With coverage report
pytest tests/ -v --cov=app --cov-report=html

# Run specific test
pytest tests/test_agents.py -v

Frontend Tests

cd frontend
npm test
npm run test:coverage

Using Make Commands

make test        # Run all tests (backend + frontend)
make lint        # Run linters
make format      # Format code
make clean       # Clean caches

🔒 Security Features

JWT-based authentication (configurable)
CORS protection
Rate limiting with slowapi
Input validation with Pydantic
Environment-based configuration
Secure file upload handling
SQL injection prevention
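
As an illustration of what "secure file upload handling" can involve, the sketch below checks an upload against the supported formats and the 10MB limit mentioned elsewhere in this README, and strips directory components from the client-supplied name. `validate_upload` is a hypothetical helper, not the backend's actual code, and reading 10MB as 10 * 1024 * 1024 bytes is an assumption.

```python
from pathlib import PurePosixPath

# Formats listed under Intelligent Document Processing.
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".csv"}
# Assumed interpretation of the documented 10MB limit.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def validate_upload(filename: str, size_bytes: int) -> str:
    """Return a safe base filename or raise ValueError if the upload is rejected."""
    # Normalize Windows separators, then drop any directory components
    # so names like "../../etc/passwd.txt" cannot escape the upload folder.
    name = PurePosixPath(filename.replace("\\", "/")).name
    if not name or name in {".", ".."}:
        raise ValueError("empty or invalid filename")
    suffix = PurePosixPath(name).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes <= 0:
        raise ValueError("file is empty")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 10MB upload limit")
    return name
```

An extension check alone does not prove a file's real type; a production service would typically also inspect the content before parsing it.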

🚀 Deployment

Docker Production Deployment

# Build and deploy
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d

# Scale services
docker-compose up -d --scale backend=3

# View logs
docker-compose logs -f backend

Environment Variables for Production

ENVIRONMENT=production
LOG_LEVEL=INFO
DATABASE_URL=postgresql://user:pass@host:5432/db
REDIS_URL=redis://redis:6379/0
ENABLE_REDIS_CACHE=true

See DEPLOYMENT_GUIDE.md for detailed deployment instructions.

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

1. Fork the repository
2. Create your feature branch (git checkout -b feature/AmazingFeature)
3. Commit your changes (git commit -m 'Add some AmazingFeature')
4. Push to the branch (git push origin feature/AmazingFeature)
5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

LangChain - For the amazing LLM orchestration framework
Qdrant - For the high-performance vector database
FastAPI - For the modern Python web framework
OpenAI - For powerful language models and embeddings

📧 Contact

Project Link: https://github.com/kaninstein/workspaceai

🗺️ Roadmap

Built with ❤️ for the AI community
