Production-Grade Agentic AI System

A production-ready AI agentic system built with FastAPI, LangGraph, and PostgreSQL. This system implements a complete 7-layer architecture for scalable, reliable, and observable AI agent deployments.

Version: 0.1.0
Python: >=3.13,<4.0
Author: Samwel Ngusa (ngusadeep@gmail.com)

πŸ—οΈ Architecture

This system follows a 7-layer production architecture:

  1. API/Router Layer - RESTful endpoints with rate limiting and validation
  2. Service Layer - Business logic abstraction (Database, LLM services)
  3. Core/Infrastructure Layer - Configuration, logging, metrics, middleware
  4. Agent/Orchestration Layer - LangGraph-based agent workflows
  5. Data/Persistence Layer - PostgreSQL with pgvector for embeddings
  6. Schema/Validation Layer - Pydantic models for type safety
  7. Utility/Helper Layer - Reusable utilities and helpers

✨ Features

Core Capabilities

  • LangGraph Agent Workflows - Stateful, multi-step agent execution
  • Long-Term Memory - Vector-based memory storage using Mem0AI and pgvector
  • Tool Integration - Extensible tool system (e.g., DuckDuckGo search)
  • Streaming Responses - Real-time token streaming for better UX

Production Features

  • Structured Logging - Structured logs with bound request/session context
  • Observability - Prometheus metrics and Langfuse tracing
  • Rate Limiting - Endpoint-specific rate limits
  • Authentication & Authorization - JWT-based auth with session management
  • Database Connection Pooling - Optimized for high concurrency
  • Retry Logic - Automatic retry with exponential backoff for LLM calls
  • Model Fallback - Automatic fallback between LLM models
  • Health Checks - Comprehensive health monitoring
  • Input Sanitization - Protection against injection attacks
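The retry behavior above can be sketched in a few lines. The project lists tenacity for this (see Dependencies); the stdlib version below only illustrates the exponential-backoff-with-jitter idea and is not the project's actual implementation.

```python
import random
import time


def retry_with_backoff(fn, max_retries=3, base_delay=0.5, max_delay=8.0):
    """Call fn(), retrying on exception with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the last error
            # Exponential backoff: 0.5s, 1s, 2s, ... capped at max_delay,
            # plus random jitter to avoid thundering-herd retries
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay + random.uniform(0, delay / 2))
```

In the real system, tenacity decorators express the same policy declaratively on the LLM-calling functions.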

📋 Prerequisites

  • Docker & Docker Compose (for containerized deployment)
  • Python 3.13+ (>=3.13,<4.0) - Required for local development
  • Poetry - Python dependency management
  • PostgreSQL 16+ with pgvector extension
  • OpenAI API Key (for LLM capabilities)
  • Langfuse API Keys (optional, for observability)

🚀 Quick Start

Using Docker Compose (Recommended)

  1. Clone the repository

    git clone <repository-url>
    cd ai-agentic-system
  2. Set up environment variables

    cp .env.development.example .env.development
    # Edit .env.development with your API keys and configuration
  3. Start the services

    docker compose up --build
  4. Access the application

    # API available at http://localhost:8000
    # Interactive API docs (FastAPI default): http://localhost:8000/docs

Local Development

  1. Install dependencies

    poetry install
  2. Set up environment variables

    export APP_ENV=development
    export OPENAI_API_KEY=your_key_here
    export POSTGRES_HOST=localhost
    # ... other environment variables
  3. Set up the database

    # Tables are created automatically on startup; no manual migrations are required
  4. Start the application

    poetry run uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

⚙️ Configuration

The application uses environment-based configuration. Key environment variables:

Required

OPENAI_API_KEY=your_openai_api_key
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=postgres
POSTGRES_USER=postgres
POSTGRES_PASSWORD=your_password
JWT_SECRET_KEY=your_jwt_secret_key  # Generate with: scripts/generate_jwt_secret.py
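The referenced scripts/generate_jwt_secret.py is not shown here, but generating a suitable secret typically reduces to Python's stdlib secrets module; a minimal sketch:

```python
import secrets

# A URL-safe random string; 64 bytes of entropy is ample for HMAC-signed JWTs
print(secrets.token_urlsafe(64))
```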

Optional

APP_ENV=development  # development, staging, production
DEFAULT_LLM_MODEL=gpt-4o-mini
MAX_TOKENS=2000
MAX_LLM_CALL_RETRIES=3

# Langfuse (Observability)
LANGFUSE_PUBLIC_KEY=your_key
LANGFUSE_SECRET_KEY=your_secret
LANGFUSE_HOST=https://cloud.langfuse.com

# Rate Limiting
RATE_LIMIT_CHAT=30 per minute
RATE_LIMIT_LOGIN=20 per minute
# ... see app/core/config.py for all options
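Limit strings such as "30 per minute" follow the notation used by slowapi/limits. As an illustration of what they encode, a tiny (hypothetical) parser:

```python
# Window length in seconds for each keyword used in limit strings
_WINDOWS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def parse_rate_limit(spec: str) -> tuple[int, int]:
    """Parse a spec like '30 per minute' into (max_requests, window_seconds)."""
    count, _, unit = spec.split()
    return int(count), _WINDOWS[unit]
```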

📡 API Endpoints

Authentication

Register User

POST /api/v1/auth/register
Content-Type: application/json

{
  "email": "user@example.com",
  "password": "SecurePass123!"
}

Login

POST /api/v1/auth/login
Content-Type: application/x-www-form-urlencoded

username=user@example.com&password=SecurePass123!

Create Session

POST /api/v1/auth/session
Authorization: Bearer <user_token>

Response:
{
  "session_id": "uuid-here",
  "name": "",
  "token": {
    "access_token": "session_token_here",
    "token_type": "bearer",
    "expires_at": "2025-01-26T12:00:00"
  }
}

List Sessions

GET /api/v1/auth/sessions
Authorization: Bearer <user_token>

Chat Endpoints

Standard Chat (Non-Streaming)

POST /api/v1/chatbot/chat
Authorization: Bearer <session_token>
Content-Type: application/json

{
  "messages": [
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ]
}

Streaming Chat

POST /api/v1/chatbot/chat/stream
Authorization: Bearer <session_token>
Content-Type: application/json

{
  "messages": [
    {
      "role": "user",
      "content": "Tell me a story"
    }
  ]
}

# Returns Server-Sent Events (SSE) stream
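A client consumes the SSE stream by reading "data:" lines. The sketch below parses a canned stream; the exact payload shape, and any end-of-stream sentinel such as "[DONE]", is an assumption for illustration, not documented above.

```python
def iter_sse_data(lines):
    """Yield the payload of each 'data:' line from an SSE stream.

    `lines` is any iterable of text or byte lines, e.g. the line iterator
    of an HTTP client response. Blank event-delimiter lines are skipped.
    """
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        if line.startswith("data:"):
            yield line[len("data:"):].strip()


# Example with a canned stream (a real client would read the HTTP response);
# the "[DONE]" sentinel is a common convention, assumed here:
stream = ["data: Hello", "", "data: world", "", "data: [DONE]"]
tokens = [d for d in iter_sse_data(stream) if d != "[DONE]"]
```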

Get Conversation History

GET /api/v1/chatbot/messages
Authorization: Bearer <session_token>

Clear Conversation History

DELETE /api/v1/chatbot/messages
Authorization: Bearer <session_token>

System Endpoints

Health Check

GET /health

Response:
{
  "status": "healthy",
  "components": {
    "api": "healthy",
    "database": "healthy"
  },
  "timestamp": "2025-12-26T12:00:00"
}
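One way a monitor might interpret this payload (assumed semantics: overall health requires every component to be healthy):

```python
def overall_status(health: dict) -> str:
    """Collapse per-component statuses into a single status string."""
    components = health.get("components", {})
    ok = all(v == "healthy" for v in components.values())
    return "healthy" if ok else "degraded"
```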

Metrics (Prometheus)

GET /metrics

🔐 Authentication Workflow

Production Workflow (Recommended)

  1. User Login

    POST /api/v1/auth/login
    → Returns: User Token (subject = user_id)

  2. Create Session (Explicit)

    POST /api/v1/auth/session
    Authorization: Bearer <user_token>
    → Returns: Session Token (subject = session_id UUID)

  3. Use Chat Endpoints

    POST /api/v1/chatbot/chat
    Authorization: Bearer <session_token>
    → Works with explicit session


Benefits of Explicit Session Creation

  • ✅ Clear separation of user auth vs session management
  • ✅ Multiple concurrent chat sessions per user
  • ✅ Better auditability and monitoring
  • ✅ Session-scoped tokens for security
  • ✅ Full control over session lifecycle

Fallback Behavior

If a user token is used directly on chat endpoints, the system will:

  • Reuse the most recent session, or create a new one if none exists
  • Log a warning for monitoring purposes
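The three-step workflow above can be sketched as plain request builders: no network calls, endpoint paths and content types taken from the sections above, helper names purely illustrative.

```python
import json
from urllib.parse import urlencode


def build_login_request(email: str, password: str):
    # Login uses form encoding, per the Login endpoint docs above
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    body = urlencode({"username": email, "password": password})
    return ("POST", "/api/v1/auth/login", headers, body)


def build_session_request(user_token: str):
    # Session creation needs only the user token
    headers = {"Authorization": f"Bearer {user_token}"}
    return ("POST", "/api/v1/auth/session", headers, None)


def build_chat_request(session_token: str, content: str):
    # Chat endpoints expect the session token and a JSON message list
    headers = {
        "Authorization": f"Bearer {session_token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"messages": [{"role": "user", "content": content}]})
    return ("POST", "/api/v1/chatbot/chat", headers, body)
```

Feeding these tuples to any HTTP client (httpx, requests, urllib) reproduces the documented flow.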

🛠️ Development

Project Structure

ai-agentic-system/
├── app/
│   ├── api/v1/           # API endpoints and routers
│   ├── core/             # Core infrastructure (config, logging, middleware)
│   │   └── langgraph/    # Agent orchestration
│   ├── models/           # Database models
│   ├── schemas/          # Pydantic schemas
│   ├── services/         # Business logic services
│   └── utils/            # Utility functions
├── scripts/              # Utility scripts
├── tests/                # Test suite
├── docker-compose.yml    # Docker orchestration
├── Dockerfile            # Application container
└── pyproject.toml        # Python dependencies (Poetry)

Running Tests

# Install test dependencies
poetry install --with test

# Run all tests
poetry run pytest

# Run tests excluding slow markers
poetry run pytest -m "not slow"

Code Quality

This project uses multiple code quality tools:

# Formatting with Black
poetry run black app/

# Import sorting with isort
poetry run isort app/

# Linting with Ruff (fast, modern)
poetry run ruff check app/

# Linting with Flake8
poetry run flake8 app/

# Linting with Pylint
poetry run pylint app/

Development Dependencies:

  • black - Code formatter
  • isort - Import sorter
  • ruff - Fast Python linter (modern replacement for flake8)
  • flake8 - Traditional linting tool
  • djlint - Linter/formatter for HTML & templates

Test Dependencies:

  • pytest - Testing framework
  • httpx - Async HTTP client for testing APIs

📊 Monitoring & Observability

Prometheus Metrics

  • HTTP request metrics (count, duration, status codes)
  • LLM inference duration
  • Database connection pool metrics

Access metrics at: http://localhost:8000/metrics

Structured Logging

All logs use structured logging with context tracking:

  • Request IDs
  • User IDs
  • Session IDs
  • Operation names
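The project uses structlog for this (see Dependencies); the stdlib sketch below only illustrates the underlying idea of binding request-scoped context so every record carries it. The names (log_context, log_event) are illustrative, not the project's API.

```python
import contextvars
import json
import logging

# Context shared across the handling of one request (set by middleware)
log_context = contextvars.ContextVar("log_context", default={})


def log_event(event: str, **fields) -> str:
    """Render one structured log line: event name + bound context + extras."""
    record = {"event": event, **log_context.get(), **fields}
    line = json.dumps(record, sort_keys=True)
    logging.getLogger("app").info(line)
    return line


# Middleware would bind IDs once per incoming request:
log_context.set({"request_id": "req-123", "session_id": "sess-456"})
```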

Langfuse Tracing

LLM calls are automatically traced for:

  • Token usage
  • Latency
  • Cost tracking
  • Response quality

🔒 Security Features

  • JWT Authentication - Secure token-based auth
  • Password Hashing - Bcrypt with strength validation
  • Input Sanitization - XSS and injection prevention
  • Rate Limiting - DoS protection
  • CORS Configuration - Configurable origins
  • SQL Injection Protection - SQLModel ORM
  • Environment-based Secrets - No hardcoded credentials

🐳 Docker Deployment

Build Image

docker build -t ai-agentic-system:latest .

Run with Docker Compose

# Development
docker compose -f docker-compose.yml up

# Production (with environment file)
APP_ENV=production docker compose up

Production Considerations

  • Use environment-specific .env files
  • Set strong JWT_SECRET_KEY
  • Configure proper ALLOWED_ORIGINS
  • Enable Prometheus/Grafana monitoring
  • Set up database backups
  • Configure log aggregation

🔧 Environment-Specific Settings

Development

  • Debug mode enabled
  • Console logging format
  • Relaxed rate limits
  • Hot-reload enabled

Production

  • Debug mode disabled
  • JSON logging format
  • Strict rate limits
  • Production-grade security

Configure via APP_ENV environment variable.

📚 Dependencies

Core Dependencies

Web Framework & Server:

  • fastapi>=0.121.0 - Modern async web framework
  • uvicorn>=0.34.0 - ASGI server
  • asgiref>=3.8.1 - ASGI utilities

LangChain / LangGraph Ecosystem:

  • langchain>=1.0.5 - High-level LLM orchestration
  • langchain-core>=1.0.4 - Core abstractions
  • langchain-openai>=1.0.2 - OpenAI integrations
  • langchain-community>=0.4.1 - Community tools
  • langgraph>=1.0.2 - Graph-based agent workflows
  • langgraph-checkpoint-postgres>=3.0.1 - PostgreSQL checkpointing

Observability:

  • langfuse==3.9.1 - LLM tracing and monitoring
  • structlog>=25.2.0 - Structured logging
  • prometheus-client>=0.19.0 - Prometheus metrics
  • starlette-prometheus>=0.7.0 - Prometheus middleware

Database & Persistence:

  • psycopg2-binary>=2.9.10 - PostgreSQL driver
  • sqlmodel>=0.0.24 - SQLAlchemy + Pydantic ORM
  • mem0ai>=1.0.0 - AI memory management

Authentication & Security:

  • passlib[bcrypt]>=1.7.4 - Password hashing
  • bcrypt>=4.3.0 - Low-level bcrypt
  • python-jose[cryptography]>=3.4.0 - JWT handling
  • email-validator>=2.2.0 - Email validation

Reliability:

  • tenacity>=9.1.2 - Retry logic
  • slowapi>=0.1.9 - Rate limiting

Tools:

  • duckduckgo-search>=3.9.0 - Search integration

See pyproject.toml for complete dependency list with versions.

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Ensure code quality checks pass
  6. Submit a pull request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

👤 Author

Samwel Ngusa
Email: ngusadeep@gmail.com

📊 Project Metadata

  • Name: ai-agentic-system
  • Version: 0.1.0
  • Python Version: >=3.13,<4.0
  • Dependency Manager: Poetry
  • Build Backend: poetry-core

🙏 Acknowledgments

Built following production-grade architectural patterns for agentic AI systems. Inspired by industry best practices for scalable, observable, and reliable AI deployments.

📞 Support

For issues, questions, or contributions, please open an issue on the repository.
