VideoGen Messenger is a distributed, cloud-native AI video generation platform designed for high availability, scalability, and performance.
                 ┌─────────────┐
                 │   Clients   │
                 │  (iOS/Web)  │
                 └──────┬──────┘
                        │
                        ▼
     ┌─────────────────────────────────────┐
     │     API Gateway / Load Balancer     │
     │           (AWS ALB + WAF)           │
     └──────┬───────────┬───────────┬──────┘
            │           │           │
            ▼           ▼           ▼
       ┌─────────┐ ┌─────────┐ ┌─────────┐
       │   API   │ │   API   │ │   API   │
       │Instance │ │Instance │ │Instance │
       │  (ECS)  │ │  (ECS)  │ │  (ECS)  │
       └────┬────┘ └────┬────┘ └────┬────┘
            │           │           │
            └───────────┼───────────┘
                        │
         ┌──────────────┼───────────────┐
         │              │               │
         ▼              ▼               ▼
    ┌──────────┐ ┌─────────────┐ ┌─────────────┐
    │   RDS    │ │    Redis    │ │Elasticsearch│
    │(Postgres)│ │(ElastiCache)│ │(OpenSearch) │
    └──────────┘ └──────┬──────┘ └─────────────┘
                        │
                        ▼
                   ┌─────────┐
                   │ BullMQ  │
                   │ Workers │
                   └────┬────┘
                        │
                        ▼
                ┌───────────────┐
                │ AI Providers  │
                │ (Veo/Runway/  │
                │   Minimax)    │
                └───────┬───────┘
                        │
                        ▼
                   ┌─────────┐
                   │   S3    │
                   │(Videos) │
                   └────┬────┘
                        │
                        ▼
                  ┌───────────┐
                  │CloudFront │
                  │   (CDN)   │
                  └───────────┘
Technology: Express.js, Node.js 18+
Responsibilities:
- RESTful API endpoints
- Request validation
- Authentication & Authorization
- Rate limiting
- Error handling
- Response formatting
Key Files:
- backend/api/server.js - Main Express server
- backend/api/routes/ - Route definitions
- backend/api/controllers/ - Business logic
- backend/api/middleware/ - Middleware functions
Scaling: Horizontal auto-scaling based on CPU/memory metrics
Responsibilities:
- Core business logic
- External API integration
- Data processing
- Job orchestration
Services:
- AI provider integration (Google Veo, Runway, Minimax)
- Provider selection logic
- Status tracking
- Content safety checks
- Elasticsearch integration
- Multi-field search with boosting
- Autocomplete/suggestions
- Faceted search
- Result caching
- Video metadata management
- Storage orchestration
- Thumbnail generation
- Video processing
- Real-time trending calculations
- View count aggregation
- Engagement metrics
- Cache management
- BullMQ job management
- Job scheduling
- Priority queue handling
- Worker coordination
Primary Database: PostgreSQL 15+ (AWS RDS)
Schema Design:
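-- uuid_generate_v4() is provided by the uuid-ossp extension
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- Users table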
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
email VARCHAR(255) UNIQUE NOT NULL,
password_hash VARCHAR(255) NOT NULL,
name VARCHAR(255),
api_key VARCHAR(255) UNIQUE,
subscription_tier VARCHAR(50) DEFAULT 'free',
credits INTEGER DEFAULT 100,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
-- Videos table
CREATE TABLE videos (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
user_id UUID REFERENCES users(id),
title VARCHAR(500),
description TEXT,
prompt TEXT NOT NULL,
video_url VARCHAR(1000),
thumbnail_url VARCHAR(1000),
quality VARCHAR(50),
duration INTEGER,
status VARCHAR(50),
provider VARCHAR(50),
metadata JSONB,
views INTEGER DEFAULT 0,
likes INTEGER DEFAULT 0,
shares INTEGER DEFAULT 0,
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
-- Generation jobs table
CREATE TABLE generation_jobs (
id UUID PRIMARY KEY,
user_id UUID REFERENCES users(id),
video_id UUID REFERENCES videos(id),
prompt TEXT NOT NULL,
quality VARCHAR(50),
duration INTEGER,
status VARCHAR(50),
provider VARCHAR(50),
provider_job_id VARCHAR(255),
error TEXT,
progress DECIMAL(3,2),
created_at TIMESTAMP DEFAULT NOW(),
updated_at TIMESTAMP DEFAULT NOW()
);
-- Indexes
CREATE INDEX idx_videos_user_id ON videos(user_id);
CREATE INDEX idx_videos_status ON videos(status);
CREATE INDEX idx_videos_created_at ON videos(created_at DESC);
CREATE INDEX idx_generation_jobs_user_id ON generation_jobs(user_id);
CREATE INDEX idx_generation_jobs_status ON generation_jobs(status);
Connections:
- Connection pooling: Min 2, Max 10
- SSL/TLS enabled
- Read replicas for scaling reads
- Automated backups (7-day retention)
Technology: Redis 7+ (AWS ElastiCache)
Use Cases:
- Session storage
- API response caching
- Job status tracking
- Rate limit counters
- Search result caching
- Trending video cache
Key Patterns:
cache:search:{query_hash} # Search results
generation:job:{job_id} # Job status
trending:videos:24h # Trending cache
ratelimit:{user_id}:{endpoint} # Rate limits
session:{session_id} # User sessions
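Building these keys in one place keeps the patterns consistent across services. The following is a minimal sketch rather than the project's actual helper: the `keys` object and `hashQuery` function are hypothetical names, and lower-casing and trimming the query before hashing is an assumption made so that equivalent searches share one cache entry.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: hash the normalised query text so the key has a
// fixed length. The normalisation step is an assumption, not documented.
function hashQuery(query) {
  const normalized = query.trim().toLowerCase();
  return crypto.createHash('sha256').update(normalized).digest('hex');
}

// One builder per key pattern listed above.
const keys = {
  search: (query) => `cache:search:${hashQuery(query)}`,
  job: (jobId) => `generation:job:${jobId}`,
  trending24h: () => 'trending:videos:24h',
  rateLimit: (userId, endpoint) => `ratelimit:${userId}:${endpoint}`,
  session: (sessionId) => `session:${sessionId}`,
};
```

For example, `keys.job(job.id)` yields the key under which a worker would write status updates for that job.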
Configuration:
- Cluster mode enabled
- Automatic failover
- Multi-AZ deployment
- TLS encryption
Technology: Elasticsearch/OpenSearch 2.9+
Index Structure:
{
"videos": {
"settings": {
"number_of_shards": 2,
"number_of_replicas": 1,
"analysis": {
"analyzer": {
"autocomplete_analyzer": {
"type": "custom",
"tokenizer": "edge_ngram",
"filter": ["lowercase", "asciifolding"]
}
}
}
},
"mappings": {
"properties": {
"video_id": { "type": "keyword" },
"title": {
"type": "text",
"fields": {
"keyword": { "type": "keyword" },
"autocomplete": { "type": "text", "analyzer": "autocomplete_analyzer" }
}
},
"description": { "type": "text" },
"prompt": { "type": "text" },
"tags": { "type": "keyword" },
"quality": { "type": "keyword" },
"duration": { "type": "float" },
"views": { "type": "long" },
"created_at": { "type": "date" }
}
}
}
}
Technology: BullMQ with Redis
Queue Types:
- Generation Queue
  - Priority: High
  - Concurrency: 5 workers
  - Job types: Video generation requests
  - Retry: 3 attempts with exponential backoff
- Processing Queue
  - Priority: Medium
  - Concurrency: 10 workers
  - Job types: Thumbnail generation, encoding
  - Retry: 2 attempts
- Indexing Queue
  - Priority: Low
  - Concurrency: 5 workers
  - Job types: Elasticsearch indexing
  - Retry: 5 attempts
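The retry settings above can be captured as plain data, with the delay schedule computed from it. This is a sketch under stated assumptions: the 1000 ms base delay, the fixed backoff for the Processing and Indexing queues, and the names `QUEUES` and `retryDelayMs` are all illustrative, since the document only specifies attempt counts and that the Generation queue backs off exponentially.

```javascript
// Concurrency and attempt counts come from the queue list above.
// Base delays and the non-exponential backoff types are assumptions.
const QUEUES = {
  generation: { concurrency: 5, attempts: 3, backoff: { type: 'exponential', delayMs: 1000 } },
  processing: { concurrency: 10, attempts: 2, backoff: { type: 'fixed', delayMs: 1000 } },
  indexing: { concurrency: 5, attempts: 5, backoff: { type: 'fixed', delayMs: 1000 } },
};

// Returns the wait before the next attempt, given how many attempts have
// already failed (1 or more), or null once the attempts are exhausted.
function retryDelayMs(queueName, attemptsMade) {
  const { attempts, backoff } = QUEUES[queueName];
  if (attemptsMade >= attempts) return null;
  if (backoff.type === 'exponential') {
    // Doubling schedule: delay, 2 * delay, 4 * delay, ...
    return backoff.delayMs * 2 ** (attemptsMade - 1);
  }
  return backoff.delayMs;
}
```

With these numbers a generation job waits 1 s after its first failure and 2 s after its second, then is marked failed after the third.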
Job Lifecycle:
Created → Queued → Active → Completed/Failed
                     ↓
              Delayed (retry)
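The lifecycle reads naturally as a small state machine. The sketch below encodes the arrows from the diagram as a transition table. The lower-cased state names, the `advance` helper, and the rule that a delayed job returns to the queued state are assumptions for illustration, not the queue library's internals.

```javascript
// Allowed transitions, read off the lifecycle diagram.
const TRANSITIONS = {
  created: ['queued'],
  queued: ['active'],
  active: ['completed', 'failed', 'delayed'],
  delayed: ['queued'], // assumption: a retried job re-enters the queue
  completed: [],
  failed: [],
};

function canTransition(from, to) {
  const next = Object.hasOwn(TRANSITIONS, from) ? TRANSITIONS[from] : [];
  return next.includes(to);
}

// Returns a copy of the job in its new state, or throws if the move
// is not one of the arrows in the diagram.
function advance(job, to) {
  if (!canTransition(job.state, to)) {
    throw new Error(`Illegal transition: ${job.state} -> ${to}`);
  }
  return { ...job, state: to };
}
```

Rejecting illegal moves early makes it harder for a late provider callback to flip a completed job back to active.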
Technology: AWS S3 + CloudFront CDN
Bucket Structure:
videogen-videos-prod/
├── videos/
│   ├── {user_id}/
│   │   └── {video_id}.mp4
├── thumbnails/
│   ├── {user_id}/
│   │   └── {video_id}.jpg
└── temp/
    └── {job_id}.mp4
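Object keys for this layout can be derived from the IDs alone. The following is a hypothetical sketch: `objectKeys` and `assertSafeSegment` are illustrative names, and the character whitelist is an assumption added so that an ID can never introduce extra path segments.

```javascript
const BUCKET = 'videogen-videos-prod';

// Assumption: IDs are UUID-like, so anything outside this set is rejected.
function assertSafeSegment(value, label) {
  if (!/^[A-Za-z0-9_-]+$/.test(String(value))) {
    throw new Error(`Invalid ${label}: ${value}`);
  }
  return value;
}

// One builder per branch of the bucket tree above.
const objectKeys = {
  video: (userId, videoId) =>
    `videos/${assertSafeSegment(userId, 'user_id')}/${assertSafeSegment(videoId, 'video_id')}.mp4`,
  thumbnail: (userId, videoId) =>
    `thumbnails/${assertSafeSegment(userId, 'user_id')}/${assertSafeSegment(videoId, 'video_id')}.jpg`,
  temp: (jobId) => `temp/${assertSafeSegment(jobId, 'job_id')}.mp4`,
};
```

A worker would pair `BUCKET` with `objectKeys.video(userId, videoId)` when uploading the finished file.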
Access Patterns:
- Videos: Served via CloudFront CDN
- Thumbnails: Optimized for fast loading
- Temp files: Auto-expire after 24 hours
Storage Classes:
- Hot videos (< 30 days): S3 Standard
- Warm videos (30-90 days): S3 Intelligent-Tiering
- Cold videos (> 90 days): S3 Glacier
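The tiering rule above maps a video's age to a storage class. A minimal sketch of that mapping follows. The function name is hypothetical and the return values are the S3 `StorageClass` identifiers for the three tiers. In practice such transitions are usually delegated to S3 lifecycle rules rather than application code, so treat this as an illustration of the thresholds only.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Maps video age to the tier described above:
// under 30 days hot, 30 to 90 days warm, over 90 days cold.
function storageClassFor(createdAt, now = new Date()) {
  const ageDays = (now.getTime() - createdAt.getTime()) / DAY_MS;
  if (ageDays < 30) return 'STANDARD';
  if (ageDays <= 90) return 'INTELLIGENT_TIERING';
  return 'GLACIER';
}
```

Note that objects in the Glacier tier are not immediately readable, so cold videos would need a restore step before they can be served again.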
Providers:
- Google Veo 3
  - Quality: HD, 4K
  - Max duration: 8 seconds
  - Best for: High-quality, realistic videos
  - Rate limits: 100 requests/minute
- Runway Gen-3
  - Quality: HD, 4K, Premium
  - Max duration: 10 seconds
  - Best for: Motion control, premium quality
  - Rate limits: 50 requests/minute
- Minimax Video-01
  - Quality: SD, HD
  - Max duration: 6 seconds
  - Best for: Fast, cost-effective generation
  - Rate limits: 200 requests/minute
Provider Selection Logic:
Quality: 4K → Veo3 or Runway
Quality: HD → Runway > Veo3 > Minimax
Quality: SD → Minimax > Veo3 > Runway
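The preference orders above translate into a lookup followed by a filter. This sketch is one possible reading, not the project's selection code: the provider keys, the `isAvailable` callback, and the duration check are assumptions, and the 4K order (Veo 3 first) is a guess because the source lists the two providers without ranking them. The SD order is taken as written, even though only Minimax lists SD among its qualities.

```javascript
// Max durations come from the provider list above.
const PROVIDERS = {
  veo3: { maxDurationSec: 8 },
  runway: { maxDurationSec: 10 },
  minimax: { maxDurationSec: 6 },
};

// Preference order per requested quality, as given in the rules above.
const PREFERENCE = {
  '4K': ['veo3', 'runway'], // assumption: order not specified in the source
  HD: ['runway', 'veo3', 'minimax'],
  SD: ['minimax', 'veo3', 'runway'],
};

// Returns the first preferred provider that can handle the duration and
// is currently available, or null when none qualifies.
function selectProvider(quality, durationSec, isAvailable = () => true) {
  if (!Object.hasOwn(PREFERENCE, quality)) {
    throw new Error(`Unsupported quality: ${quality}`);
  }
  const match = PREFERENCE[quality].find(
    (name) => durationSec <= PROVIDERS[name].maxDurationSec && isAvailable(name)
  );
  return match ?? null;
}
```

Passing a health-check or rate-limit probe as `isAvailable` lets the same function fall through to the next provider when the preferred one is saturated.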
1. Client Request
↓
2. API Validation
↓
3. Authentication Check
↓
4. Rate Limit Check
↓
5. Credit Deduction
↓
6. Job Creation (Redis)
↓
7. Queue Job (BullMQ)
↓
8. Worker Picks Job
↓
9. Provider Selection
↓
10. AI Generation (External API)
↓
11. Status Polling
↓
12. Video Download
↓
13. Upload to S3
↓
14. Thumbnail Generation
↓
15. Database Update
↓
16. Elasticsearch Index
↓
17. Cache Invalidation
↓
18. Client Notification (WebSocket/Polling)
1. Client Search Request
↓
2. Cache Check (Redis)
↓ (if miss)
3. Build Elasticsearch Query
↓
4. Execute Search
↓
5. Aggregate Results
↓
6. Enrich with Metadata (DB)
↓
7. Cache Results
↓
8. Return to Client
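Steps 2, 3, and 7 of the search flow follow the cache-aside pattern. The sketch below shows the shape under several assumptions: `buildSearchQuery` and `searchWithCache` are hypothetical names, the boost of 3 on `title` and the 60-second TTL are illustrative values, and a `Map` stands in for Redis. Real Redis and search clients are asynchronous, so their calls would be awaited. The calls are synchronous here only to keep the example short.

```javascript
const crypto = require('node:crypto');

// Step 3: multi-field query over fields from the index mapping.
// The boost on title is an illustrative assumption.
function buildSearchQuery(text, { quality, from = 0, size = 20 } = {}) {
  return {
    from,
    size,
    query: {
      bool: {
        must: [
          { multi_match: { query: text, fields: ['title^3', 'description', 'prompt'] } },
        ],
        filter: quality ? [{ term: { quality } }] : [],
      },
    },
  };
}

// Steps 2 and 7: check the cache, fall back to the search backend on a
// miss, then store the result with an expiry.
function searchWithCache(text, options, { cache, executeSearch, ttlMs = 60000, now = Date.now }) {
  const body = buildSearchQuery(text, options);
  const hash = crypto.createHash('sha256').update(JSON.stringify(body)).digest('hex');
  const key = `cache:search:${hash}`;

  const hit = cache.get(key);
  if (hit && hit.expiresAt > now()) {
    return { ...hit.value, cached: true };
  }

  const value = executeSearch(body);
  cache.set(key, { value, expiresAt: now() + ttlMs });
  return { ...value, cached: false };
}
```

Hashing the full request body, rather than the query text alone, keeps results for different filters or pages from colliding in the cache.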
- API servers: Auto-scale based on CPU (target: 70%)
- Workers: Scale based on queue depth
- Database: Read replicas for read-heavy workloads
- Cache: Redis cluster with sharding
- Database: Upgrade instance type as needed
- Elasticsearch: Add more nodes for indexing performance
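Scaling workers on queue depth, as noted in the list above, comes down to a clamped division. This is a sketch with assumed numbers: the target of 10 waiting jobs per worker and the bounds of 1 and 20 workers are illustrative, and `desiredWorkerCount` is a hypothetical name.

```javascript
// Returns how many workers should run for the given backlog, keeping the
// result between `min` and `max`. All defaults are illustrative.
function desiredWorkerCount(queueDepth, { jobsPerWorker = 10, min = 1, max = 20 } = {}) {
  const needed = Math.ceil(queueDepth / jobsPerWorker);
  return Math.min(max, Math.max(min, needed));
}
```

An autoscaling hook could call this on a timer with the current waiting-job count and set the service's desired task count to the result.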
- Connection pooling
- Query result caching
- CDN for static assets
- Compression (gzip/brotli)
- Database query optimization
- Eager loading relationships
- Pagination on all list endpoints
- JWT tokens (24h expiry)
- API keys for programmatic access
- Bcrypt for password hashing (10 rounds)
- Session management via Redis
- Role-based access control (RBAC)
- Resource ownership validation
- Rate limiting per user/API key
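Per-user rate limiting can be illustrated with a fixed-window counter keyed by the `ratelimit:{user_id}:{endpoint}` pattern. The sketch below is an assumption about the approach, not the project's middleware: a `Map` stands in for Redis counters with an expiry, and `createRateLimiter`, `limit`, and `windowMs` are hypothetical names.

```javascript
// Fixed-window limiter: each key gets a counter that resets when its
// window ends. The Map is an in-memory stand-in for Redis.
function createRateLimiter({ limit, windowMs, now = Date.now }) {
  const counters = new Map();

  return function check(userId, endpoint) {
    const key = `ratelimit:${userId}:${endpoint}`;
    const currentTime = now();
    let entry = counters.get(key);

    // Start a fresh window on first use or once the old one has expired.
    if (!entry || currentTime >= entry.resetAt) {
      entry = { count: 0, resetAt: currentTime + windowMs };
      counters.set(key, entry);
    }

    entry.count += 1;
    return {
      allowed: entry.count <= limit,
      remaining: Math.max(0, limit - entry.count),
      resetAt: entry.resetAt,
    };
  };
}
```

A fixed window is the simplest variant. It allows short bursts at window boundaries, which a sliding-window or token-bucket scheme would smooth out.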
- Encryption at rest (RDS, S3, ElastiCache)
- Encryption in transit (TLS 1.3)
- Secrets in AWS Secrets Manager
- VPC isolation
- Security groups
- WAF for application-layer (Layer 7) attack and DDoS mitigation
- Prompt filtering for unsafe content
- Content moderation APIs
- CORS policies
- CSP headers
- Request latency (p50, p95, p99)
- Error rates
- Queue depth
- Worker utilization
- Database connections
- Cache hit rates
- AI provider response times
- Structured logging (JSON)
- Log aggregation (CloudWatch)
- Error tracking (Sentry)
- Request tracing (X-Ray)
- High error rate (> 1%)
- High latency (> 2s)
- Queue backup (> 100 jobs)
- Database connection errors
- AI provider failures
- Disk space warnings
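The numeric alert conditions above can be expressed as a small rule check. This is a sketch only: the metric field names and alert identifiers are hypothetical, and applying the 2-second latency threshold to p95 is an assumption, since the list does not say which percentile it refers to.

```javascript
// Thresholds taken from the alert list above.
const THRESHOLDS = {
  errorRate: 0.01, // more than 1% of requests failing
  latencyMs: 2000, // more than 2 s (assumed to mean p95)
  queueDepth: 100, // more than 100 waiting jobs
};

// Returns the names of all alerts that the metrics snapshot triggers.
function evaluateAlerts(metrics) {
  const alerts = [];
  if (metrics.errorRate > THRESHOLDS.errorRate) alerts.push('high_error_rate');
  if (metrics.p95LatencyMs > THRESHOLDS.latencyMs) alerts.push('high_latency');
  if (metrics.queueDepth > THRESHOLDS.queueDepth) alerts.push('queue_backup');
  return alerts;
}
```

In a CloudWatch-based setup these checks would normally live in alarm definitions. The function form is mainly useful for unit-testing the thresholds.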
- Database: Automated daily backups (7-day retention)
- S3: Versioning enabled
- Redis: AOF persistence + snapshots
- RTO (Recovery Time Objective): 1 hour
- RPO (Recovery Point Objective): 15 minutes
- Multi-region failover capability
- Database point-in-time recovery
| Layer | Technology | Purpose |
|---|---|---|
| Runtime | Node.js 18+ | Application server |
| Framework | Express.js | Web framework |
| Database | PostgreSQL 15 | Primary datastore |
| Cache | Redis 7 | Caching & sessions |
| Search | Elasticsearch/OpenSearch 2.9+ | Full-text search |
| Queue | BullMQ | Job processing |
| Storage | AWS S3 | Object storage |
| CDN | CloudFront | Content delivery |
| Container | Docker | Containerization |
| Orchestration | ECS/Kubernetes | Container orchestration |
| Load Balancer | AWS ALB | Traffic distribution |
| Monitoring | CloudWatch | Metrics & logs |
| APM | Sentry/NewRelic | Performance monitoring |
- GraphQL API - More flexible querying
- WebSocket Support - Real-time updates
- Multi-region Deployment - Lower latency globally
- Video Streaming - HLS/DASH support
- Machine Learning - Personalized recommendations
- Blockchain Integration - NFT minting for videos
- Social Features - Comments, likes, shares
- Advanced Analytics - User behavior tracking