A production-ready AI agentic system built with FastAPI, LangGraph, and PostgreSQL. This system implements a complete 7-layer architecture for scalable, reliable, and observable AI agent deployments.
Version: 0.1.0
Python: >=3.13,<4.0
Author: Samwel Ngusa (ngusadeep@gmail.com)
This system follows a 7-layer production architecture:
- API/Router Layer - RESTful endpoints with rate limiting and validation
- Service Layer - Business logic abstraction (Database, LLM services)
- Core/Infrastructure Layer - Configuration, logging, metrics, middleware
- Agent/Orchestration Layer - LangGraph-based agent workflows
- Data/Persistence Layer - PostgreSQL with pgvector for embeddings
- Schema/Validation Layer - Pydantic models for type safety
- Utility/Helper Layer - Reusable utilities and helpers
- LangGraph Agent Workflows - Stateful, multi-step agent execution
- Long-Term Memory - Vector-based memory storage using Mem0AI and pgvector
- Tool Integration - Extensible tool system (e.g., DuckDuckGo search)
- Streaming Responses - Real-time token streaming for better UX
- Structured Logging - Context-aware, machine-parseable logs with request tracking
- Observability - Prometheus metrics and Langfuse tracing
- Rate Limiting - Endpoint-specific rate limits
- Authentication & Authorization - JWT-based auth with session management
- Database Connection Pooling - Optimized for high concurrency
- Retry Logic - Automatic retry with exponential backoff for LLM calls
- Model Fallback - Automatic fallback between LLM models
- Health Checks - Comprehensive health monitoring
- Input Sanitization - Protection against injection attacks
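The repo implements retry and model fallback with tenacity (see the dependency list below). As a minimal stdlib sketch of the idea — exponential backoff per model, then fallback to the next model in the list — something like the following (function and model names are illustrative, not the repo's API):

```python
import time

def call_with_retry_and_fallback(call, models, max_retries=3,
                                 base_delay=1.0, sleep=time.sleep):
    """Try each model in order; retry transient errors with exponential backoff."""
    last_error = None
    for model in models:
        for attempt in range(max_retries):
            try:
                return call(model)
            except RuntimeError as exc:  # stand-in for a transient LLM error
                last_error = exc
                sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    raise last_error

# Demo: the primary model always fails, the fallback succeeds.
def fake_llm(model):
    if model == "gpt-4o-mini":
        raise RuntimeError("rate limited")
    return f"{model}: ok"

result = call_with_retry_and_fallback(
    fake_llm, ["gpt-4o-mini", "gpt-4o"], sleep=lambda s: None
)
print(result)  # gpt-4o: ok
```

Injecting `sleep` keeps the demo (and tests) fast while preserving the backoff schedule in production use.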
- Docker & Docker Compose (for containerized deployment)
- Python 3.13+ (>=3.13,<4.0) - Required for local development
- Poetry - Python dependency management
- PostgreSQL 16+ with pgvector extension
- OpenAI API Key (for LLM capabilities)
- Langfuse API Keys (optional, for observability)
1. Clone the repository

   ```bash
   git clone <repository-url>
   cd ai-agentic-system
   ```

2. Set up environment variables

   ```bash
   cp .env.development.example .env.development
   # Edit .env.development with your API keys and configuration
   ```

3. Start the services

   ```bash
   docker compose up --build
   ```

4. Access the application
   - API: http://localhost:8000
   - API Docs: http://localhost:8000/docs
   - Health Check: http://localhost:8000/health
1. Install dependencies

   ```bash
   poetry install
   ```

2. Set up environment variables

   ```bash
   export APP_ENV=development
   export OPENAI_API_KEY=your_key_here
   export POSTGRES_HOST=localhost
   # ... other environment variables
   ```

3. Run database migrations (if needed)

   Tables are created automatically on startup.

4. Start the application

   ```bash
   poetry run uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
   ```
The application uses environment-based configuration. Key environment variables:
```env
OPENAI_API_KEY=your_openai_api_key
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=postgres
POSTGRES_USER=postgres
POSTGRES_PASSWORD=your_password
JWT_SECRET_KEY=your_jwt_secret_key  # Generate with: scripts/generate_jwt_secret.py

APP_ENV=development  # development, staging, production
DEFAULT_LLM_MODEL=gpt-4o-mini
MAX_TOKENS=2000
MAX_LLM_CALL_RETRIES=3

# Langfuse (Observability)
LANGFUSE_PUBLIC_KEY=your_key
LANGFUSE_SECRET_KEY=your_secret
LANGFUSE_HOST=https://cloud.langfuse.com

# Rate Limiting
RATE_LIMIT_CHAT="30 per minute"
RATE_LIMIT_LOGIN="20 per minute"

# ... see app/core/config.py for all options
```

POST /api/v1/auth/register
Content-Type: application/json
{
"email": "user@example.com",
"password": "SecurePass123!"
}

POST /api/v1/auth/login
Content-Type: application/x-www-form-urlencoded
username=user@example.com&password=SecurePass123!

POST /api/v1/auth/session
Authorization: Bearer <user_token>
Response:
{
"session_id": "uuid-here",
"name": "",
"token": {
"access_token": "session_token_here",
"token_type": "bearer",
"expires_at": "2025-01-26T12:00:00"
}
}

GET /api/v1/auth/sessions
Authorization: Bearer <user_token>

POST /api/v1/chatbot/chat
Authorization: Bearer <session_token>
Content-Type: application/json
{
"messages": [
{
"role": "user",
"content": "Hello, how are you?"
}
]
}

POST /api/v1/chatbot/chat/stream
Authorization: Bearer <session_token>
Content-Type: application/json
{
"messages": [
{
"role": "user",
"content": "Tell me a story"
}
]
}
# Returns Server-Sent Events (SSE) stream

GET /api/v1/chatbot/messages
Authorization: Bearer <session_token>

DELETE /api/v1/chatbot/messages
Authorization: Bearer <session_token>

GET /health
Response:
{
"status": "healthy",
"components": {
"api": "healthy",
"database": "healthy"
},
"timestamp": "2025-12-26T12:00:00"
}

GET /metrics
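Tying the endpoints above together, here is a stdlib client sketch (the repo itself ships httpx for testing; base URL and flow are taken from the endpoint docs above, helper names are hypothetical):

```python
import json
import urllib.request

BASE = "http://localhost:8000"

def chat_payload(text, role="user"):
    """Body shape expected by POST /api/v1/chatbot/chat."""
    return {"messages": [{"role": role, "content": text}]}

def post_json(path, body, token=None):
    """POST a JSON body, optionally with a Bearer token, and decode the reply."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    req = urllib.request.Request(
        BASE + path, data=json.dumps(body).encode(), headers=headers, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def demo_flow():
    # 1. Register
    post_json("/api/v1/auth/register",
              {"email": "user@example.com", "password": "SecurePass123!"})
    # 2. Log in via /api/v1/auth/login (form-encoded, omitted here) -> user_token
    # 3. session = post_json("/api/v1/auth/session", {}, token=user_token)
    # 4. reply = post_json("/api/v1/chatbot/chat", chat_payload("Hello!"),
    #                      token=session["token"]["access_token"])

print(chat_payload("Hello, how are you?"))
```

`demo_flow` is never invoked here; it only documents the order of calls against a running instance.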
1. User Login

   POST /api/v1/auth/login
   → Returns: User Token (subject = user_id)

2. Create Session (Explicit)

   POST /api/v1/auth/session
   Authorization: Bearer <user_token>
   → Returns: Session Token (subject = session_id UUID)

3. Use Chat Endpoints

   POST /api/v1/chatbot/chat
   Authorization: Bearer <session_token>
   → Works with explicit session
- Clear separation of user auth vs session management
- Multiple concurrent chat sessions per user
- Better auditability and monitoring
- Session-scoped tokens for security
- Full control over session lifecycle
If a user token is used directly on chat endpoints, the system will:
- Reuse the user's most recent session, or create a new one if none exists
- Log a warning for monitoring purposes
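The fallback behavior above can be sketched as a pure function (names and session shape are hypothetical; the real logic lives in the service layer):

```python
import logging
from datetime import datetime

logger = logging.getLogger("session")

def resolve_session(user_id, sessions, create_session):
    """Pick the user's most recent session, or create one; warn either way."""
    own = [s for s in sessions if s["user_id"] == user_id]
    if own:
        chosen = max(own, key=lambda s: s["created_at"])
        logger.warning("user token on chat endpoint; reusing session %s", chosen["id"])
        return chosen
    logger.warning("user token on chat endpoint; creating new session")
    return create_session(user_id)

# Demo: two sessions for user 1; the newer one ("b") wins.
sessions = [
    {"id": "a", "user_id": 1, "created_at": datetime(2025, 1, 1)},
    {"id": "b", "user_id": 1, "created_at": datetime(2025, 1, 2)},
]
picked = resolve_session(1, sessions, lambda uid: {"id": "new", "user_id": uid})
print(picked["id"])  # b
```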
```
ai-agentic-system/
├── app/
│   ├── api/v1/           # API endpoints and routers
│   ├── core/             # Core infrastructure (config, logging, middleware)
│   │   └── langgraph/    # Agent orchestration
│   ├── models/           # Database models
│   ├── schemas/          # Pydantic schemas
│   ├── services/         # Business logic services
│   └── utils/            # Utility functions
├── scripts/              # Utility scripts
├── tests/                # Test suite
├── docker-compose.yml    # Docker orchestration
├── Dockerfile            # Application container
└── pyproject.toml        # Python dependencies (Poetry)
```
```bash
# Install test dependencies
poetry install --with test

# Run all tests
poetry run pytest

# Run tests excluding slow markers
poetry run pytest -m "not slow"
```

This project uses multiple code quality tools:
```bash
# Formatting with Black
poetry run black app/

# Import sorting with isort
poetry run isort app/

# Linting with Ruff (fast, modern)
poetry run ruff check app/

# Linting with Flake8
poetry run flake8 app/

# Linting with Pylint
poetry run pylint app/
```

Development Dependencies:

- `black` - Code formatter
- `isort` - Import sorter
- `ruff` - Fast Python linter (modern replacement for flake8)
- `flake8` - Traditional linting tool
- `djlint` - Linter/formatter for HTML & templates
Test Dependencies:
- `pytest` - Testing framework
- `httpx` - Async HTTP client for testing APIs
- HTTP request metrics (count, duration, status codes)
- LLM inference duration
- Database connection pool metrics
Access metrics at: http://localhost:8000/metrics
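The metrics endpoint is served by prometheus-client; for intuition, the text format it exposes looks roughly like the output of this stdlib formatter (metric and label names are illustrative, not the repo's exact metric set):

```python
def render_counter(name, help_text, value, labels=None):
    """Render one counter in the Prometheus text exposition format."""
    label_str = ""
    if labels:
        inner = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        label_str = "{" + inner + "}"
    return (f"# HELP {name} {help_text}\n"
            f"# TYPE {name} counter\n"
            f"{name}{label_str} {value}\n")

print(render_counter("http_requests_total", "Total HTTP requests", 42,
                     {"method": "POST", "status": "200"}))
```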
All logs use structured logging with context tracking:
- Request IDs
- User IDs
- Session IDs
- Operation names
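The repo uses structlog for this; the same idea in stdlib form — bound context merged into every emitted event as JSON (field names illustrative):

```python
import json

class ContextLogger:
    """Minimal structured logger: bound context is merged into every event."""
    def __init__(self, **context):
        self.context = context

    def bind(self, **more):
        # Returns a new logger with extra context, like structlog's bind()
        return ContextLogger(**{**self.context, **more})

    def info(self, event, **fields):
        # Render one structured log line as JSON
        return json.dumps({"event": event, **self.context, **fields})

log = ContextLogger(request_id="req-123", user_id=7, session_id="s-9")
print(log.info("chat.message", operation="generate", tokens=128))
```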
LLM calls are automatically traced for:
- Token usage
- Latency
- Cost tracking
- Response quality
- JWT Authentication - Secure token-based auth
- Password Hashing - Bcrypt with strength validation
- Input Sanitization - XSS and injection prevention
- Rate Limiting - DoS protection
- CORS Configuration - Configurable origins
- SQL Injection Protection - SQLModel ORM
- Environment-based Secrets - No hardcoded credentials
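The repo leans on passlib/bcrypt for hashing and ships `scripts/generate_jwt_secret.py` for secret generation. A stdlib sketch of the same two primitives, with PBKDF2 standing in for bcrypt and an illustrative iteration count:

```python
import hashlib
import hmac
import os
import secrets

def hash_password(password, salt=None, iterations=100_000):
    """Salted PBKDF2-HMAC-SHA256 (the repo itself uses bcrypt via passlib)."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password, salt, expected, iterations=100_000):
    # Constant-time comparison to avoid timing leaks
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)

def generate_jwt_secret(nbytes=32):
    """URL-safe random secret, suitable for JWT_SECRET_KEY."""
    return secrets.token_urlsafe(nbytes)

salt, stored = hash_password("SecurePass123!")
print(verify_password("SecurePass123!", salt, stored))  # True
print(len(generate_jwt_secret()) >= 32)                 # True
```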
```bash
docker build -t ai-agentic-system:latest .

# Development
docker compose -f docker-compose.yml up

# Production (with environment file)
APP_ENV=production docker compose up
```

- Use environment-specific `.env` files
- Set a strong `JWT_SECRET_KEY`
- Configure proper `ALLOWED_ORIGINS`
- Enable Prometheus/Grafana monitoring
- Set up database backups
- Configure log aggregation
Development (`APP_ENV=development`):

- Debug mode enabled
- Console logging format
- Relaxed rate limits
- Hot-reload enabled

Production (`APP_ENV=production`):

- Debug mode disabled
- JSON logging format
- Strict rate limits
- Production-grade security

Configure via the `APP_ENV` environment variable.
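As a sketch of how `APP_ENV` might select a profile (values mirror the lists above; the actual settings live in `app/core/config.py`):

```python
import os

# Illustrative profiles; the real config object carries many more fields.
PROFILES = {
    "development": {"debug": True,  "log_format": "console", "rate_limits": "relaxed"},
    "staging":     {"debug": False, "log_format": "json",    "rate_limits": "strict"},
    "production":  {"debug": False, "log_format": "json",    "rate_limits": "strict"},
}

def load_profile(env=None):
    """Resolve settings from APP_ENV, defaulting to development."""
    env = env or os.getenv("APP_ENV", "development")
    if env not in PROFILES:
        raise ValueError(f"unknown APP_ENV: {env}")
    return PROFILES[env]

print(load_profile("production")["log_format"])  # json
```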
Web Framework & Server:
- `fastapi>=0.121.0` - Modern async web framework
- `uvicorn>=0.34.0` - ASGI server
- `asgiref>=3.8.1` - ASGI utilities
LangChain / LangGraph Ecosystem:
- `langchain>=1.0.5` - High-level LLM orchestration
- `langchain-core>=1.0.4` - Core abstractions
- `langchain-openai>=1.0.2` - OpenAI integrations
- `langchain-community>=0.4.1` - Community tools
- `langgraph>=1.0.2` - Graph-based agent workflows
- `langgraph-checkpoint-postgres>=3.0.1` - PostgreSQL checkpointing
Observability:
- `langfuse==3.9.1` - LLM tracing and monitoring
- `structlog>=25.2.0` - Structured logging
- `prometheus-client>=0.19.0` - Prometheus metrics
- `starlette-prometheus>=0.7.0` - Prometheus middleware
Database & Persistence:
- `psycopg2-binary>=2.9.10` - PostgreSQL driver
- `sqlmodel>=0.0.24` - SQLAlchemy + Pydantic ORM
- `mem0ai>=1.0.0` - AI memory management
Authentication & Security:
- `passlib[bcrypt]>=1.7.4` - Password hashing
- `bcrypt>=4.3.0` - Low-level bcrypt
- `python-jose[cryptography]>=3.4.0` - JWT handling
- `email-validator>=2.2.0` - Email validation
Reliability:
- `tenacity>=9.1.2` - Retry logic
- `slowapi>=0.1.9` - Rate limiting
Tools:
- `duckduckgo-search>=3.9.0` - Search integration
See pyproject.toml for complete dependency list with versions.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Ensure code quality checks pass
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
Samwel Ngusa
Email: ngusadeep@gmail.com
- Name: ai-agentic-system
- Version: 0.1.0
- Python Version: >=3.13,<4.0
- Dependency Manager: Poetry
- Build Backend: poetry-core
Built following production-grade architectural patterns for agentic AI systems. Inspired by industry best practices for scalable, observable, and reliable AI deployments.
For issues, questions, or contributions, please open an issue on the repository.