Enterprise-grade RAG system with 4 specialized AI agents (Research, Analysis, Writer, Code). Built with FastAPI, LangChain, Qdrant vector database, and React. Features intelligent document processing, semantic search, and contextual AI responses with MCP integration.

kaninstein/workspaceai

🚀 IA Workspace - Enterprise RAG System

Enterprise-grade RAG (Retrieval-Augmented Generation) system with specialized AI agents, built with modern technologies and best practices.

A production-ready AI workspace featuring intelligent document processing, semantic search, and four specialized agents for research, analysis, writing, and code generation.

📸 Screenshots

Dashboard

Real-time metrics and analytics overview

Document Management

Multi-format document upload and processing

Interactive Chat

Chat interface with specialized AI agents

Agent Catalog

Specialized agents with custom tools

Analytics

Detailed metrics and performance monitoring


✨ Features

🎯 Core Capabilities

  • 📄 Intelligent Document Processing

    • Multi-format support (PDF, DOCX, TXT, MD, CSV)
    • Smart chunking strategies (recursive, semantic, markdown-aware, code-aware)
    • Automatic metadata extraction and preservation
    • Asynchronous processing with progress tracking
  • 🤖 Specialized AI Agents

    • Research Agent - Expert at finding and synthesizing information
    • Analysis Agent - Data analysis and insight generation
    • Writer Agent - Professional content creation
    • Code Agent - Code generation and technical documentation
  • 🔍 Advanced Search

    • Semantic search with Qdrant vector database
    • Hybrid search capabilities (dense + sparse)
    • Metadata filtering and reranking
    • Multi-query search strategies
  • 💬 Interactive Chat Interface

    • Real-time streaming responses
    • WebSocket support for live updates
    • Agent selection for specialized tasks
    • Markdown rendering with syntax highlighting
  • 📊 Analytics & Monitoring

    • Real-time performance metrics
    • Usage tracking and statistics
    • Structured logging with context
    • Health checks and observability
  • 🔌 Model Context Protocol (MCP)

    • Standard protocol for AI tool integration
    • 5 specialized tools exposed via MCP
    • Easy integration with external systems
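To make the "recursive" chunking strategy listed above more concrete, here is a minimal sketch of the general idea: split on coarse separators first, fall back to finer ones only when a piece is still too long, then pack the pieces back into chunks. The function names, the separator order, and the 200-character limit are assumptions for illustration; the project's actual chunker may work differently (for example, it may add overlap between chunks).

```python
from typing import List, Sequence


def split_recursive(text: str, max_chars: int, separators: Sequence[str]) -> List[str]:
    """Split text into pieces of at most max_chars, trying coarse separators first."""
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    if not separators:
        # No separator left to try: fall back to a hard cut.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    sep, finer = separators[0], separators[1:]
    pieces: List[str] = []
    for part in text.split(sep):
        pieces.extend(split_recursive(part, max_chars, finer))
    return pieces


def merge_pieces(pieces: List[str], max_chars: int) -> List[str]:
    """Greedily pack small pieces into chunks no longer than max_chars."""
    chunks: List[str] = []
    current = ""
    for piece in pieces:
        candidate = piece if not current else current + " " + piece
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


def chunk_text(text: str, max_chars: int = 200) -> List[str]:
    """Hypothetical entry point: paragraphs, then lines, then words."""
    pieces = split_recursive(text, max_chars, ("\n\n", "\n", " "))
    return merge_pieces(pieces, max_chars)


if __name__ == "__main__":
    for chunk in chunk_text("alpha beta gamma\n\ndelta epsilon", max_chars=12):
        print(repr(chunk))
```

This sketch normalizes whitespace (separators are replaced by single spaces when pieces are merged) and may merge text across paragraph boundaries, which a production chunker would likely avoid.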

🏗️ Architecture

┌─────────────────────────────────────────────────────────────┐
│                      Frontend (React)                       │
│  Dashboard | Documents | Chat | Agents | Analytics          │
└─────────────────┬───────────────────────────────────────────┘
                  │
                  │ REST API / WebSocket
                  ▼
┌─────────────────────────────────────────────────────────────┐
│                      Backend (FastAPI)                      │
│  ┌────────────┐  ┌────────────┐  ┌─────────────────────┐    │
│  │  Document  │  │   Agent    │  │   MCP Server        │    │
│  │ Processor  │  │Orchestrator│  │   (Tools)           │    │
│  └────────────┘  └────────────┘  └─────────────────────┘    │
└─────────┬───────────────┬───────────────────────────────────┘
          │               │
          │               │
  ┌───────▼──────┐  ┌─────▼─────────────────────────────┐
  │   Qdrant     │  │    OpenAI API                     │
  │  (Vectors)   │  │  (Embeddings & Agents)            │
  └──────────────┘  └───────────────────────────────────┘
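As a rough illustration of the Agent Orchestrator box in the diagram, the sketch below routes a query to one of four handlers by agent type. Everything here is hypothetical: the handler stubs, the `route` function, and the `writer` and `code` type names are assumptions (only `research` and `analysis` appear in this README's API examples), and the real orchestrator is built on LangChain rather than plain functions.

```python
from typing import Callable, Dict

AgentHandler = Callable[[str], str]


def make_stub(name: str) -> AgentHandler:
    """Build a placeholder handler; a real agent would call an LLM with retrieved context."""
    def handler(query: str) -> str:
        return f"[{name}] would answer: {query}"
    return handler


# Assumed agent type names, one per specialized agent.
AGENTS: Dict[str, AgentHandler] = {
    name: make_stub(name) for name in ("research", "analysis", "writer", "code")
}


def route(agent_type: str, query: str) -> str:
    """Dispatch a query to the handler registered for agent_type."""
    try:
        handler = AGENTS[agent_type]
    except KeyError:
        raise ValueError(
            f"unknown agent_type {agent_type!r}; expected one of {sorted(AGENTS)}"
        ) from None
    return handler(query)


if __name__ == "__main__":
    print(route("research", "Summarize the onboarding guide"))
```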

Tech Stack

Backend

  • FastAPI - Modern, high-performance web framework
  • LangChain - LLM orchestration and agent framework
  • Qdrant - High-performance vector database
  • Pydantic - Data validation and settings management
  • Structlog - Structured logging for observability
  • PyPDF2, python-docx - Document parsing

Frontend

  • React 18 - Modern UI library
  • TypeScript - Type-safe JavaScript
  • Tailwind CSS - Utility-first CSS framework
  • Zustand - Lightweight state management
  • React Query - Server state management
  • React Markdown - Markdown rendering

Infrastructure

  • Docker & Docker Compose - Containerization
  • Nginx - Reverse proxy and static file serving
  • Redis (optional) - Caching layer

🚀 Quick Start

Prerequisites

  • Python 3.11+
  • Node.js 18+
  • Docker & Docker Compose
  • OpenAI API Key

Installation

1. Clone the repository

git clone git@github.com:kaninstein/workspaceai.git
cd workspaceai

2. Configure environment variables

cp .env.example .env

Edit .env and add your credentials:

# Required
OPENAI_API_KEY=sk-your-openai-key-here
SECRET_KEY=your-secret-key-for-jwt

# Optional (defaults provided)
QDRANT_HOST=localhost
QDRANT_PORT=6333
DATABASE_URL=sqlite:///./ia_workspace.db
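For readers who want to see how these variables might be consumed, here is a minimal sketch that reads them with the standard library and applies the defaults shown above. The `Settings` class and `load_settings` function are hypothetical names; the project lists Pydantic for settings management, so its actual configuration code in `app/core` will differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container mirroring the .env keys above."""
    openai_api_key: str
    secret_key: str
    qdrant_host: str = "localhost"
    qdrant_port: int = 6333
    database_url: str = "sqlite:///./ia_workspace.db"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, failing early if a required key is missing."""
    missing = [name for name in ("OPENAI_API_KEY", "SECRET_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        secret_key=env["SECRET_KEY"],
        qdrant_host=env.get("QDRANT_HOST", "localhost"),
        qdrant_port=int(env.get("QDRANT_PORT", "6333")),
        database_url=env.get("DATABASE_URL", "sqlite:///./ia_workspace.db"),
    )


if __name__ == "__main__":
    demo = load_settings({"OPENAI_API_KEY": "sk-example", "SECRET_KEY": "change-me"})
    print(demo.qdrant_host, demo.qdrant_port)
```

Failing at startup when a required key is absent tends to be easier to debug than a later authentication error from a downstream service.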

3. Start with Docker Compose (Recommended)

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down

4. Or run locally for development

Backend:

cd backend

# Install dependencies
pip install -r requirements.txt

# Start Qdrant (in separate terminal)
docker run -p 6333:6333 qdrant/qdrant

# Run backend
uvicorn app.main:app --reload --port 8000

Frontend:

cd frontend

# Install dependencies
npm install

# Run development server
npm run dev

5. Access the application


💼 Real-World Use Cases

The scenarios below show how teams in different industries can apply the system:

🏢 1. Enterprise Customer Support

Problem: Support teams waste time searching through documentation, SOPs, and past tickets.

Solution:

Documents: Upload KB articles, product manuals, troubleshooting guides, FAQs
Agent: Research Agent
Query: "How do I reset a user's password in the enterprise dashboard?"
Result: Instant answers with citations from indexed documentation

Business Impact:

  • 70% faster ticket resolution
  • Consistent answers across support team
  • New hires productive in days, not weeks

⚖️ 2. Legal Document Analysis

Problem: Law firms spend hours reviewing contracts, case files, and legal precedents.

Solution:

Documents: Upload contracts, case law, compliance documents
Agent: Analysis Agent
Query: "Identify all liability clauses in these vendor contracts and compare terms"
Result: Structured analysis with risk assessment and comparisons

Business Impact:

  • 80% reduction in document review time
  • Identify hidden risks automatically
  • Scale legal review without hiring more associates

🔬 3. Research & Development

Problem: R&D teams struggle to synthesize insights from research papers, patents, and test results.

Solution:

Documents: Upload research papers, patents, lab reports, technical specs
Agent: Research Agent
Query: "What battery technologies show promise for EVs based on recent papers?"
Result: Synthesized insights from multiple sources with references

Business Impact:

  • Accelerate literature review from weeks to hours
  • Surface hidden connections across research
  • Make data-driven R&D decisions

💻 4. Internal Developer Documentation

Problem: Engineers waste time searching wikis, Confluence pages, and outdated docs.

Solution:

Documents: Upload API docs, architecture diagrams, runbooks, README files
Agent: Code Agent
Query: "How do I implement OAuth2 authentication in our microservices?"
Result: Step-by-step guide with code examples from your own docs

Business Impact:

  • 50% reduction in "How do I...?" questions
  • Onboard developers 3x faster
  • Keep tribal knowledge accessible

📊 5. Financial Analysis & Compliance

Problem: Finance teams manually review hundreds of invoices, reports, and regulatory documents.

Solution:

Documents: Upload financial reports, invoices, audit trails, regulations
Agent: Analysis Agent
Query: "Analyze Q4 expense reports and flag anomalies over $10K"
Result: Automated analysis with anomaly detection and compliance checks

Business Impact:

  • Detect fraud and errors automatically
  • Ensure regulatory compliance
  • Close books 60% faster

👥 6. HR & Recruitment

Problem: HR teams manually screen resumes and answer repetitive policy questions.

Solution:

Documents: Upload resumes, job descriptions, company policies, benefits docs
Agent: Research + Analysis Agents
Query: "Find candidates with 5+ years Python and AWS experience"
Result: Ranked candidates with match scores and reasoning

Business Impact:

  • Screen 100 resumes in minutes
  • Reduce bias in initial screening
  • Instant answers to employee policy questions

✍️ 7. Content Marketing & SEO

Problem: Marketing teams struggle to maintain brand consistency and optimize content.

Solution:

Documents: Upload brand guidelines, competitor content, keyword research, past campaigns
Agent: Writer Agent
Query: "Write a blog post about AI automation following our brand voice"
Result: On-brand content using insights from uploaded materials

Business Impact:

  • 10x content production speed
  • Consistent brand voice across channels
  • SEO-optimized content using your own data

🏥 8. Healthcare & Medical Records

Problem: Doctors spend hours reviewing patient histories and medical literature.

Solution:

Documents: Upload patient records, medical journals, treatment protocols
Agent: Research Agent
Query: "Summarize this patient's cardiac history and recommend screening tests"
Result: Comprehensive summary with evidence-based recommendations

Business Impact:

  • More time with patients, less with paperwork
  • Evidence-based treatment decisions
  • Reduce medical errors

🏭 9. Manufacturing & Quality Control

Problem: QA teams manually inspect reports and cross-reference specifications.

Solution:

Documents: Upload quality reports, specifications, inspection logs, SOPs
Agent: Analysis Agent
Query: "Identify defect patterns in last month's production reports"
Result: Root cause analysis with trend identification

Business Impact:

  • Predict quality issues before they escalate
  • Reduce defect rates by 40%
  • Optimize production processes

📚 10. Education & Training

Problem: Trainers manually create materials and answer repetitive questions.

Solution:

Documents: Upload textbooks, course materials, assessments, student questions
Agent: Writer + Research Agents
Query: "Create a quiz on Chapter 5 with explanations for answers"
Result: Automated assessment generation with detailed feedback

Business Impact:

  • Personalized learning at scale
  • Instant answers to student questions 24/7
  • Reduce trainer workload by 50%

📖 Quick Start Guide

Basic Workflow

  1. Upload Documents

    • Navigate to Documents page
    • Drag and drop files (PDF, DOCX, TXT, MD, CSV - max 10MB)
    • Documents are automatically processed and indexed
  2. Chat with Agents

    • Go to Chat page
    • Select the right agent for your task:
      • Research - Finding and synthesizing information
      • Analysis - Data analysis and insights
      • Writer - Content creation and summarization
      • Code - Technical documentation and code help
    • Ask questions and get contextual answers from your documents
  3. Monitor Performance

    • Visit Analytics page to track:
      • Document count and storage usage
      • Query statistics and response times
      • Agent performance metrics
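The upload constraints from step 1 (PDF, DOCX, TXT, MD, CSV, max 10MB) can be checked on the client before sending a file. This is a minimal sketch under stated assumptions: `check_upload` is a hypothetical helper, and "10MB" is interpreted here as 10 × 1024 × 1024 bytes, which may not match the server's exact limit.

```python
from pathlib import Path
from typing import List

ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".csv"}
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed reading of "max 10MB"


def check_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems: List[str] = []
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file too large: {size_bytes} bytes (limit {MAX_UPLOAD_BYTES})")
    return problems


if __name__ == "__main__":
    print(check_upload("report.PDF", 2_000_000))
    print(check_upload("installer.exe", 20_000_000))
```

A client-side check like this only saves a round trip; the server still has to validate every upload itself.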

🏗️ Project Structure

workspaceai/
├── backend/
│   ├── app/
│   │   ├── agents/              # AI agent implementations
│   │   ├── api/routes/          # API endpoints
│   │   ├── core/                # Config, logging, security
│   │   ├── mcp/                 # MCP server
│   │   ├── models/              # Pydantic schemas
│   │   ├── services/            # Business logic
│   │   └── utils/               # Utilities
│   ├── tests/                   # Unit tests
│   ├── requirements.txt
│   └── Dockerfile
├── frontend/
│   ├── src/
│   │   ├── components/          # React components
│   │   ├── pages/               # Page components
│   │   ├── lib/                 # API client, utilities
│   │   ├── types/               # TypeScript types
│   │   └── App.tsx
│   ├── package.json
│   └── Dockerfile
├── docs/
│   └── screenshots/             # Application screenshots
├── docker-compose.yml
├── .env.example
├── Makefile
└── README.md

🔧 API Examples

Upload Document

# -F already sets a multipart/form-data Content-Type (with boundary)
curl -X POST "http://localhost:8000/api/v1/documents/upload" \
  -F "file=@document.pdf"

Chat Request

curl -X POST "http://localhost:8000/api/v1/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Summarize the key points from the uploaded documents",
    "agent_type": "research"
  }'

Invoke Specific Agent

curl -X POST "http://localhost:8000/api/v1/agents/analysis/invoke" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Analyze the data patterns in the CSV files",
    "context": {}
  }'

Get Metrics

curl -X GET "http://localhost:8000/api/v1/analytics/metrics"
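The JSON endpoints above can also be called from Python. The sketch below uses only the standard library and mirrors the curl examples for chat and agent invocation; the helper names are hypothetical, and treating the response body as JSON is an assumption, since this README does not document the response format.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000/api/v1"  # same host and prefix as the curl examples


def build_json_post(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the given API path."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat_request(message: str, agent_type: str = "research") -> urllib.request.Request:
    """Mirror of the 'Chat Request' curl example."""
    return build_json_post("/chat", {"message": message, "agent_type": agent_type})


def invoke_agent_request(agent: str, query: str,
                         context: Optional[dict] = None) -> urllib.request.Request:
    """Mirror of the 'Invoke Specific Agent' curl example."""
    return build_json_post(f"/agents/{agent}/invoke",
                           {"query": query, "context": context or {}})


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a prepared request; assumes the server replies with a JSON object."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = chat_request("Summarize the key points from the uploaded documents")
    print(req.get_method(), req.full_url)
    # With the backend running locally: print(send(req))
```

Separating request construction from sending keeps the payload shape easy to inspect and test without a running backend.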

🧪 Testing

Backend Tests

cd backend

# Run all tests
pytest tests/ -v

# With coverage report
pytest tests/ -v --cov=app --cov-report=html

# Run specific test
pytest tests/test_agents.py -v

Frontend Tests

cd frontend
npm test
npm run test:coverage

Using Make Commands

make test        # Run all tests (backend + frontend)
make lint        # Run linters
make format      # Format code
make clean       # Clean caches

🔒 Security Features

  • JWT-based authentication (configurable)
  • CORS protection
  • Rate limiting with slowapi
  • Input validation with Pydantic
  • Environment-based configuration
  • Secure file upload handling
  • SQL injection prevention

🚀 Deployment

Docker Production Deployment

# Build and deploy
docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d

# Scale services
docker-compose up -d --scale backend=3

# View logs
docker-compose logs -f backend

Environment Variables for Production

ENVIRONMENT=production
LOG_LEVEL=INFO
DATABASE_URL=postgresql://user:pass@host:5432/db
REDIS_URL=redis://redis:6379/0
ENABLE_REDIS_CACHE=true

See DEPLOYMENT_GUIDE.md for detailed deployment instructions.


🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments

  • LangChain - For the amazing LLM orchestration framework
  • Qdrant - For the high-performance vector database
  • FastAPI - For the modern Python web framework
  • OpenAI - For powerful language models and embeddings

📧 Contact

Project Link: https://github.com/kaninstein/workspaceai


🗺️ Roadmap

  • Core RAG functionality with multiple agents
  • Document processing pipeline
  • Real-time chat interface
  • Analytics dashboard
  • MCP protocol integration
  • User authentication and multi-tenancy
  • Document versioning and history
  • Advanced analytics dashboards
  • Export conversations to PDF
  • Custom agent creation UI
  • Integration with more LLM providers
  • Mobile app (React Native)
  • Voice input/output support

Built with ❤️ for the AI community
