VideoGen Messenger Architecture

System Overview

VideoGen Messenger is a distributed, cloud-native AI video generation platform designed for high availability, scalability, and performance.

┌─────────────┐
│   Clients   │
│  (iOS/Web)  │
└──────┬──────┘
       │
       ▼
┌─────────────────────────────────────┐
│      API Gateway / Load Balancer    │
│          (AWS ALB + WAF)            │
└────────┬───────────┬─────────┬──────┘
         │           │         │
         ▼           ▼         ▼
   ┌─────────┐ ┌─────────┐ ┌─────────┐
   │  API    │ │  API    │ │  API    │
   │Instance │ │Instance │ │Instance │
   │  (ECS)  │ │  (ECS)  │ │  (ECS)  │
   └────┬────┘ └────┬────┘ └────┬────┘
        │           │           │
        └───────────┴───────────┘
                    │
      ┌─────────────┼───────────────┐
      │             │               │
      ▼             ▼               ▼
┌──────────┐ ┌─────────────┐ ┌─────────────┐
│   RDS    │ │    Redis    │ │Elasticsearch│
│(Postgres)│ │(ElastiCache)│ │ (OpenSearch)│
└──────────┘ └─────────────┘ └─────────────┘
        │
        ▼
   ┌────────┐
   │BullMQ  │
   │Workers │
   └────┬───┘
        │
        ▼
   ┌──────────────┐
   │AI Providers  │
   │(Veo/Runway/  │
   │  Minimax)    │
   └──────┬───────┘
          │
          ▼
     ┌────────┐
     │   S3   │
     │(Videos)│
     └────┬───┘
          │
          ▼
     ┌───────────┐
     │CloudFront │
     │   (CDN)   │
     └───────────┘

Core Components

1. API Layer

Technology: Express.js, Node.js 18+

Responsibilities:

  • RESTful API endpoints
  • Request validation
  • Authentication & Authorization
  • Rate limiting
  • Error handling
  • Response formatting

Key Files:

  • backend/api/server.js - Main Express server
  • backend/api/routes/ - Route definitions
  • backend/api/controllers/ - Business logic
  • backend/api/middleware/ - Middleware functions

Scaling: Horizontal auto-scaling based on CPU/memory metrics

2. Service Layer

Responsibilities:

  • Core business logic
  • External API integration
  • Data processing
  • Job orchestration

Services:

GenerationService

  • AI provider integration (Google Veo, Runway, Minimax)
  • Provider selection logic
  • Status tracking
  • Content safety checks

SearchService

  • Elasticsearch integration
  • Multi-field search with boosting
  • Autocomplete/suggestions
  • Faceted search
  • Result caching

VideoService

  • Video metadata management
  • Storage orchestration
  • Thumbnail generation
  • Video processing

TrendingService

  • Real-time trending calculations
  • View count aggregation
  • Engagement metrics
  • Cache management

QueueService

  • BullMQ job management
  • Job scheduling
  • Priority queue handling
  • Worker coordination

3. Database Layer

Primary Database: PostgreSQL 15+ (AWS RDS)

Schema Design:

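-- uuid_generate_v4() is provided by the uuid-ossp extension
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Users table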
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    name VARCHAR(255),
    api_key VARCHAR(255) UNIQUE,
    subscription_tier VARCHAR(50) DEFAULT 'free',
    credits INTEGER DEFAULT 100,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

-- Videos table
CREATE TABLE videos (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    user_id UUID REFERENCES users(id),
    title VARCHAR(500),
    description TEXT,
    prompt TEXT NOT NULL,
    video_url VARCHAR(1000),
    thumbnail_url VARCHAR(1000),
    quality VARCHAR(50),
    duration INTEGER,
    status VARCHAR(50),
    provider VARCHAR(50),
    metadata JSONB,
    views INTEGER DEFAULT 0,
    likes INTEGER DEFAULT 0,
    shares INTEGER DEFAULT 0,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

-- Generation jobs table
CREATE TABLE generation_jobs (
    id UUID PRIMARY KEY,
    user_id UUID REFERENCES users(id),
    video_id UUID REFERENCES videos(id),
    prompt TEXT NOT NULL,
    quality VARCHAR(50),
    duration INTEGER,
    status VARCHAR(50),
    provider VARCHAR(50),
    provider_job_id VARCHAR(255),
    error TEXT,
    progress DECIMAL(3,2),  -- presumably a 0.00-1.00 fraction; DECIMAL(3,2) cannot hold 100
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

-- Indexes
CREATE INDEX idx_videos_user_id ON videos(user_id);
CREATE INDEX idx_videos_status ON videos(status);
CREATE INDEX idx_videos_created_at ON videos(created_at DESC);
CREATE INDEX idx_generation_jobs_user_id ON generation_jobs(user_id);
CREATE INDEX idx_generation_jobs_status ON generation_jobs(status);

Connections:

  • Connection pooling: Min 2, Max 10
  • SSL/TLS enabled
  • Read replicas for scaling reads
  • Automated backups (7-day retention)

4. Cache Layer

Technology: Redis 7+ (AWS ElastiCache)

Use Cases:

  • Session storage
  • API response caching
  • Job status tracking
  • Rate limit counters
  • Search result caching
  • Trending video cache

Key Patterns:

cache:search:{query_hash}           # Search results
generation:job:{job_id}             # Job status
trending:videos:24h                 # Trending cache
ratelimit:{user_id}:{endpoint}      # Rate limits
session:{session_id}                # User sessions
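
The key patterns above can be kept in one place with small builder functions. The following is a minimal sketch, not the project's actual code: the `keys` object and `hashQuery` helper are hypothetical names, and the query normalisation (lowercasing, collapsing whitespace) and the 16-character truncated SHA-256 digest are assumptions.

```javascript
const { createHash } = require('node:crypto');

// Normalise the query so that equivalent searches share one cache entry.
// Lowercasing and whitespace collapsing are assumptions, not documented behaviour.
function hashQuery(query) {
  const normalised = query.trim().toLowerCase().replace(/\s+/g, ' ');
  return createHash('sha256').update(normalised).digest('hex').slice(0, 16);
}

// One builder per key pattern listed above.
const keys = {
  search: (query) => `cache:search:${hashQuery(query)}`,
  job: (jobId) => `generation:job:${jobId}`,
  trending: () => 'trending:videos:24h',
  rateLimit: (userId, endpoint) => `ratelimit:${userId}:${endpoint}`,
  session: (sessionId) => `session:${sessionId}`,
};

console.log(keys.job('1234')); // generation:job:1234
console.log(keys.search('Cats  Dancing') === keys.search('cats dancing')); // true
```

Centralising the builders keeps every service on the same key format, which matters for cache invalidation.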

Configuration:

  • Cluster mode enabled
  • Automatic failover
  • Multi-AZ deployment
  • TLS encryption

5. Search Layer

Technology: Elasticsearch/OpenSearch 2.9+

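Index Structure (the min_gram and max_gram values are illustrative; the edge_ngram tokenizer's defaults of 1 and 2 are too short for autocomplete):

{
  "videos": {
    "settings": {
      "number_of_shards": 2,
      "number_of_replicas": 1,
      "analysis": {
        "tokenizer": {
          "autocomplete_tokenizer": {
            "type": "edge_ngram",
            "min_gram": 2,
            "max_gram": 20,
            "token_chars": ["letter", "digit"]
          }
        },
        "analyzer": {
          "autocomplete_analyzer": {
            "type": "custom",
            "tokenizer": "autocomplete_tokenizer",
            "filter": ["lowercase", "asciifolding"]
          }
        }
      }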
    },
    "mappings": {
      "properties": {
        "video_id": { "type": "keyword" },
        "title": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword" },
            "autocomplete": { "type": "text", "analyzer": "autocomplete_analyzer" }
          }
        },
        "description": { "type": "text" },
        "prompt": { "type": "text" },
        "tags": { "type": "keyword" },
        "quality": { "type": "keyword" },
        "duration": { "type": "float" },
        "views": { "type": "long" },
        "created_at": { "type": "date" }
      }
    }
  }
}
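
Multi-field search with boosting and a quality facet could be expressed against this mapping roughly as follows. This is a sketch under assumptions: `buildSearchQuery` is a hypothetical helper, and the boost values and default page size are illustrative rather than taken from the project.

```javascript
// Build a search request body for the "videos" index described above.
// Boost values and the default page size are illustrative assumptions.
function buildSearchQuery(text, { quality, from = 0, size = 20 } = {}) {
  const filter = [];
  if (quality) filter.push({ term: { quality } });

  return {
    from,
    size,
    query: {
      bool: {
        must: [{
          multi_match: {
            query: text,
            // title matches count three times as much as description or prompt
            fields: ['title^3', 'description', 'prompt', 'tags^2'],
          },
        }],
        filter,
      },
    },
    // Facet counts per quality level (faceted search).
    aggs: { by_quality: { terms: { field: 'quality' } } },
  };
}

console.log(JSON.stringify(buildSearchQuery('dancing cat', { quality: 'HD' }), null, 2));
```

Placing the quality restriction in `filter` rather than `must` keeps it out of relevance scoring.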

6. Queue System

Technology: BullMQ with Redis

Queue Types:

  1. Generation Queue

    • Priority: High
    • Concurrency: 5 workers
    • Job types: Video generation requests
    • Retry: 3 attempts with exponential backoff
  2. Processing Queue

    • Priority: Medium
    • Concurrency: 10 workers
    • Job types: Thumbnail generation, encoding
    • Retry: 2 attempts
  3. Indexing Queue

    • Priority: Low
    • Concurrency: 5 workers
    • Job types: Elasticsearch indexing
    • Retry: 5 attempts
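
The queue settings above map naturally onto a small configuration table. The sketch below also shows a generic doubling retry schedule for the generation queue. The `QUEUES` object, `retryDelays` helper, and 1-second base delay are assumptions, and the schedule is a plain exponential illustration rather than BullMQ's exact internal formula.

```javascript
// Queue settings transcribed from the list above.
const QUEUES = {
  generation: { priority: 'high', concurrency: 5, attempts: 3, backoff: 'exponential' },
  processing: { priority: 'medium', concurrency: 10, attempts: 2 },
  indexing: { priority: 'low', concurrency: 5, attempts: 5 },
};

const BASE_DELAY_MS = 1000; // hypothetical base delay

// Delays before each retry. The first attempt runs immediately,
// so N attempts produce N - 1 delays, each double the previous one.
function retryDelays(attempts, baseDelayMs = BASE_DELAY_MS) {
  const delays = [];
  for (let retry = 0; retry < attempts - 1; retry += 1) {
    delays.push(baseDelayMs * 2 ** retry);
  }
  return delays;
}

console.log(retryDelays(QUEUES.generation.attempts)); // [ 1000, 2000 ]
```

In BullMQ the equivalent settings are passed as the `attempts` and `backoff` job options.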

Job Lifecycle:

Created → Queued → Active → Completed/Failed
                     ↓
                 Delayed (retry)
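
The lifecycle can be read as a small state machine. The following sketch encodes the transitions from the diagram. The `TRANSITIONS` table and `transition` helper are hypothetical, and the edge from delayed back to queued is an assumption about how a retry re-enters the queue.

```javascript
// Allowed transitions, read off the lifecycle diagram above.
const TRANSITIONS = {
  created: ['queued'],
  queued: ['active'],
  active: ['completed', 'failed', 'delayed'],
  delayed: ['queued'], // assumption: a delayed job re-enters the queue for its retry
  completed: [],
  failed: [],
};

// Return the next state, or throw if the move is not allowed.
function transition(current, next) {
  if (!Object.hasOwn(TRANSITIONS, current)) {
    throw new Error(`Unknown state: ${current}`);
  }
  if (!TRANSITIONS[current].includes(next)) {
    throw new Error(`Illegal transition: ${current} -> ${next}`);
  }
  return next;
}

console.log(transition('active', 'delayed')); // delayed
```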

7. Storage Layer

Technology: AWS S3 + CloudFront CDN

Bucket Structure:

videogen-videos-prod/
├── videos/
│   ├── {user_id}/
│   │   └── {video_id}.mp4
├── thumbnails/
│   ├── {user_id}/
│   │   └── {video_id}.jpg
└── temp/
    └── {job_id}.mp4
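
Object keys for this layout can be derived from the IDs alone. A minimal sketch, assuming hypothetical helper names (`s3Keys`, `cdnUrl`) and a placeholder CDN domain:

```javascript
// Object keys matching the bucket layout above.
const s3Keys = {
  video: (userId, videoId) => `videos/${userId}/${videoId}.mp4`,
  thumbnail: (userId, videoId) => `thumbnails/${userId}/${videoId}.jpg`,
  temp: (jobId) => `temp/${jobId}.mp4`,
};

// Public URL via the CDN. The domain is a placeholder, not a real distribution.
function cdnUrl(key, cdnDomain = 'cdn.example.com') {
  return `https://${cdnDomain}/${key}`;
}

console.log(cdnUrl(s3Keys.video('u1', 'v42'))); // https://cdn.example.com/videos/u1/v42.mp4
```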

Access Patterns:

  • Videos: Served via CloudFront CDN
  • Thumbnails: Optimized for fast loading
  • Temp files: Auto-expire after 24 hours

Storage Classes:

  • Hot videos (< 30 days): S3 Standard
  • Warm videos (30-90 days): S3 Intelligent-Tiering
  • Cold videos (> 90 days): S3 Glacier
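
The tiering rule above reduces to a function of object age. In practice this would more likely be configured as S3 lifecycle rules than run in application code, so the sketch below only illustrates the thresholds. `storageClassFor` is a hypothetical name, and treating exactly 30 and exactly 90 days as warm is an assumption based on the "30-90 days" wording.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Map a video's age to the storage class listed above.
// Exactly 30 or 90 days counts as warm, following the "30-90 days" wording.
function storageClassFor(createdAt, now = new Date()) {
  const ageDays = (now.getTime() - createdAt.getTime()) / DAY_MS;
  if (ageDays < 30) return 'STANDARD';
  if (ageDays <= 90) return 'INTELLIGENT_TIERING';
  return 'GLACIER';
}

const now = new Date('2024-06-01T00:00:00Z');
console.log(storageClassFor(new Date('2024-05-20T00:00:00Z'), now)); // STANDARD
console.log(storageClassFor(new Date('2024-04-01T00:00:00Z'), now)); // INTELLIGENT_TIERING
console.log(storageClassFor(new Date('2024-01-01T00:00:00Z'), now)); // GLACIER
```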

8. AI Provider Integration

Providers:

  1. Google Veo 3

    • Quality: HD, 4K
    • Max duration: 8 seconds
    • Best for: High-quality, realistic videos
    • Rate limits: 100 requests/minute
  2. Runway Gen-3

    • Quality: HD, 4K, Premium
    • Max duration: 10 seconds
    • Best for: Motion control, premium quality
    • Rate limits: 50 requests/minute
  3. Minimax Video-01

    • Quality: SD, HD
    • Max duration: 6 seconds
    • Best for: Fast, cost-effective generation
    • Rate limits: 200 requests/minute

Provider Selection Logic:

Quality: 4K → Veo3 or Runway
Quality: HD → Runway > Veo3 > Minimax
Quality: SD → Minimax > Veo3 > Runway
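
The selection rules can be sketched as a preference list per quality, filtered by each provider's maximum duration and current availability. This is an illustration, not the GenerationService implementation. `selectProvider` and the `unavailable` set are hypothetical, and trying Veo3 before Runway for 4K is an assumption, since the source lists them without an order.

```javascript
// Maximum durations transcribed from the provider list above.
const PROVIDERS = {
  veo3: { maxDurationSec: 8 },
  runway: { maxDurationSec: 10 },
  minimax: { maxDurationSec: 6 },
};

// Preference order per quality. For 4K the source says "Veo3 or Runway";
// trying Veo3 first is an assumption.
const PREFERENCE = {
  '4K': ['veo3', 'runway'],
  HD: ['runway', 'veo3', 'minimax'],
  SD: ['minimax', 'veo3', 'runway'],
};

// Return the first preferred provider that is available and supports the duration.
function selectProvider(quality, durationSec, unavailable = new Set()) {
  const candidates = PREFERENCE[quality];
  if (!candidates) throw new Error(`Unsupported quality: ${quality}`);
  const choice = candidates.find(
    (name) => !unavailable.has(name) && durationSec <= PROVIDERS[name].maxDurationSec,
  );
  return choice ?? null;
}

console.log(selectProvider('HD', 5)); // runway
console.log(selectProvider('SD', 8)); // veo3 (Minimax caps at 6 seconds)
console.log(selectProvider('4K', 9, new Set(['runway']))); // null
```

Returning `null` rather than throwing lets the caller decide whether to reject the request or queue it until a provider recovers.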

Data Flow

Video Generation Flow

1. Client Request
   ↓
2. API Validation
   ↓
3. Authentication Check
   ↓
4. Rate Limit Check
   ↓
5. Credit Deduction
   ↓
6. Job Creation (Redis)
   ↓
7. Queue Job (BullMQ)
   ↓
8. Worker Picks Job
   ↓
9. Provider Selection
   ↓
10. AI Generation (External API)
    ↓
11. Status Polling
    ↓
12. Video Download
    ↓
13. Upload to S3
    ↓
14. Thumbnail Generation
    ↓
15. Database Update
    ↓
16. Elasticsearch Index
    ↓
17. Cache Invalidation
    ↓
18. Client Notification (WebSocket/Polling)
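
Steps 2 through 7 form the synchronous request path. The sketch below walks through them with in-memory stand-ins. Everything named here (`submitGeneration`, `users`, `queue`, `COST_PER_VIDEO`, the status codes chosen) is an assumption for illustration. Rate limiting is left out for brevity, and a real implementation would be asynchronous.

```javascript
// In-memory stand-ins for the users table and the BullMQ queue.
const users = new Map([['u1', { credits: 2 }]]);
const queue = [];
let nextJobId = 1;

const COST_PER_VIDEO = 1; // hypothetical credit cost per generation

function submitGeneration(userId, body) {
  // Step 2: validation
  if (typeof body.prompt !== 'string' || body.prompt.trim() === '') {
    return { status: 400, error: 'prompt is required' };
  }
  // Step 3: authentication (reduced here to "user exists")
  const user = users.get(userId);
  if (!user) return { status: 401, error: 'unknown user' };
  // Step 4 (rate limiting) is omitted in this sketch.
  // Step 5: credit deduction
  if (user.credits < COST_PER_VIDEO) {
    return { status: 402, error: 'insufficient credits' };
  }
  user.credits -= COST_PER_VIDEO;
  // Steps 6-7: create the job record and enqueue it
  const job = { id: `job-${nextJobId++}`, userId, prompt: body.prompt, status: 'queued' };
  try {
    queue.push(job);
  } catch (err) {
    user.credits += COST_PER_VIDEO; // refund if enqueueing fails
    throw err;
  }
  return { status: 202, jobId: job.id };
}
```

With the seed data above, two calls for user `u1` are accepted with status 202 and a third is rejected with 402 once the credits run out. Refunding on enqueue failure keeps a user from paying for a job that never entered the queue.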

Search Flow

1. Client Search Request
   ↓
2. Cache Check (Redis)
   ↓ (if miss)
3. Build Elasticsearch Query
   ↓
4. Execute Search
   ↓
5. Aggregate Results
   ↓
6. Enrich with Metadata (DB)
   ↓
7. Cache Results
   ↓
8. Return to Client
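
The search flow is a cache-aside pattern: check the cache, fall through to the search backend on a miss, then store the result. A minimal synchronous sketch follows, with a `Map` standing in for Redis. `cachedSearch`, `executeSearch`, and the 60-second TTL are assumptions, and the key uses the normalised query directly where the real key pattern uses a hash.

```javascript
const TTL_MS = 60_000; // hypothetical cache lifetime
const cache = new Map(); // stands in for Redis

// executeSearch stands in for steps 3-6 (build query, execute, aggregate, enrich).
function cachedSearch(query, executeSearch, now = Date.now()) {
  const key = `cache:search:${query.trim().toLowerCase()}`;
  const hit = cache.get(key);
  if (hit && hit.expiresAt > now) {
    return { results: hit.results, cached: true }; // step 2: cache hit
  }
  const results = executeSearch(query);
  cache.set(key, { results, expiresAt: now + TTL_MS }); // step 7: cache results
  return { results, cached: false };
}
```

A second call with the same query inside the TTL returns `cached: true` without invoking `executeSearch` again.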

Scalability Considerations

Horizontal Scaling

  • API servers: Auto-scale based on CPU (target: 70%)
  • Workers: Scale based on queue depth
  • Database: Read replicas for read-heavy workloads
  • Cache: Redis cluster with sharding
  • Elasticsearch: Add more nodes for indexing performance

Vertical Scaling

  • Database: Upgrade instance type as needed
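
The two scaling signals, CPU against a 70% target and queue depth, can be modelled with simple proportional rules. This is a rough model for capacity reasoning, not the algorithm AWS auto scaling actually runs. The function names, the min/max bounds, and the jobs-per-worker figure are all assumptions.

```javascript
// Proportional scaling toward a CPU target: 4 tasks at 90% CPU with a 70% target
// need about 4 * 90 / 70 = 5.14 tasks, rounded up to 6.
function desiredApiTasks(currentTasks, cpuPercent, { target = 70, min = 2, max = 20 } = {}) {
  const raw = Math.ceil((currentTasks * cpuPercent) / target);
  return Math.min(max, Math.max(min, raw));
}

// Size the worker pool so each worker has at most jobsPerWorker queued jobs.
function desiredWorkers(queueDepth, { jobsPerWorker = 10, min = 1, max = 50 } = {}) {
  const raw = Math.ceil(queueDepth / jobsPerWorker);
  return Math.min(max, Math.max(min, raw));
}

console.log(desiredApiTasks(4, 90)); // 6
console.log(desiredWorkers(120)); // 12
```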

Performance Optimizations

  • Connection pooling
  • Query result caching
  • CDN for static assets
  • Compression (gzip/brotli)
  • Database query optimization
  • Eager loading relationships
  • Pagination everywhere

Security Architecture

Authentication

  • JWT tokens (24h expiry)
  • API keys for programmatic access
  • Bcrypt for password hashing (cost factor 10)
  • Session management via Redis
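
To make the 24-hour token expiry concrete, here is a minimal HS256 JWT sketch built on Node's `crypto` module. It is for illustration only: `issueToken` and `verifyToken` are hypothetical names, the claim set is an assumption, and a production service would normally rely on a maintained JWT library rather than hand-rolled signing.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

const DAY_SECONDS = 24 * 60 * 60;
const b64url = (value) => Buffer.from(value).toString('base64url');

function sign(data, secret) {
  return createHmac('sha256', secret).update(data).digest('base64url');
}

// Issue an HS256 token that expires 24 hours after nowSec.
function issueToken(userId, secret, nowSec = Math.floor(Date.now() / 1000)) {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const payload = b64url(JSON.stringify({ sub: userId, iat: nowSec, exp: nowSec + DAY_SECONDS }));
  return `${header}.${payload}.${sign(`${header}.${payload}`, secret)}`;
}

// Return the claims if the signature matches and the token has not expired, else null.
function verifyToken(token, secret, nowSec = Math.floor(Date.now() / 1000)) {
  const [header, payload, signature] = token.split('.');
  if (!header || !payload || !signature) return null;
  const expected = Buffer.from(sign(`${header}.${payload}`, secret));
  const actual = Buffer.from(signature);
  // Constant-time comparison avoids leaking how many characters matched.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) return null;
  const claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8'));
  return claims.exp > nowSec ? claims : null;
}

const token = issueToken('u1', 'dev-secret', 1000);
console.log(verifyToken(token, 'dev-secret', 1000).sub); // u1
console.log(verifyToken(token, 'dev-secret', 1000 + DAY_SECONDS)); // null (expired)
```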

Authorization

  • Role-based access control (RBAC)
  • Resource ownership validation
  • Rate limiting per user/API key
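
Per-user rate limiting can be sketched as a fixed-window counter. With Redis this is typically an `INCR` on the `ratelimit:{user_id}:{endpoint}` key plus an `EXPIRE` when the key is first created. The version below uses a `Map` so it runs standalone. `allowRequest` and the default limit and window are assumptions.

```javascript
// Fixed-window counter keyed like the Redis rate-limit pattern.
// The default limit and window are hypothetical.
const counters = new Map();

function allowRequest(userId, endpoint, { limit = 60, windowMs = 60_000, now = Date.now() } = {}) {
  const key = `ratelimit:${userId}:${endpoint}`;
  const entry = counters.get(key);
  // Start a new window if none exists or the previous one has elapsed.
  if (!entry || now >= entry.resetAt) {
    counters.set(key, { count: 1, resetAt: now + windowMs });
    return true;
  }
  entry.count += 1;
  return entry.count <= limit;
}
```

A fixed window is simple but allows short bursts across a window boundary. A sliding window or token bucket smooths that at the cost of more state.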

Data Security

  • Encryption at rest (RDS, S3, ElastiCache)
  • Encryption in transit (TLS 1.3)
  • Secrets in AWS Secrets Manager
  • VPC isolation
  • Security groups
  • WAF for application-layer (L7) request filtering and DDoS mitigation

Content Security

  • Prompt filtering for unsafe content
  • Content moderation APIs
  • CORS policies
  • CSP headers

Monitoring & Observability

Metrics

  • Request latency (p50, p95, p99)
  • Error rates
  • Queue depth
  • Worker utilization
  • Database connections
  • Cache hit rates
  • AI provider response times
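
Latency percentiles such as p50, p95, and p99 can be computed from raw samples with the nearest-rank method. A small sketch with made-up sample data follows. `percentile` is a hypothetical helper, and production systems usually get these figures from the metrics backend rather than computing them in process.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

const latenciesMs = [120, 80, 95, 2100, 110, 105, 90, 100, 130, 85];
console.log(percentile(latenciesMs, 50)); // 100
console.log(percentile(latenciesMs, 95)); // 2100
```

The example shows why tail percentiles matter: one slow request leaves the median untouched but dominates p95.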

Logging

  • Structured logging (JSON)
  • Log aggregation (CloudWatch)
  • Error tracking (Sentry)
  • Request tracing (X-Ray)

Alerts

  • High error rate (> 1%)
  • High latency (> 2s)
  • Queue backup (> 100 jobs)
  • Database connection errors
  • AI provider failures
  • Disk space warnings
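
The numeric thresholds above can be checked against a metrics snapshot as sketched below. The metric field names and the `firingAlerts` helper are hypothetical, and applying the 2-second latency threshold to p95 is an assumption, since the source does not say which percentile it refers to.

```javascript
// Thresholds from the alert list above; metric field names are hypothetical.
const THRESHOLDS = { errorRate: 0.01, latencyMs: 2000, queueDepth: 100 };

// Return the names of all alerts whose threshold is exceeded.
function firingAlerts(metrics) {
  const alerts = [];
  if (metrics.errorRate > THRESHOLDS.errorRate) alerts.push('high_error_rate');
  if (metrics.p95LatencyMs > THRESHOLDS.latencyMs) alerts.push('high_latency');
  if (metrics.queueDepth > THRESHOLDS.queueDepth) alerts.push('queue_backup');
  return alerts;
}

console.log(firingAlerts({ errorRate: 0.02, p95LatencyMs: 800, queueDepth: 150 }));
// [ 'high_error_rate', 'queue_backup' ]
```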

Disaster Recovery

Backup Strategy

  • Database: Automated daily backups (7-day retention)
  • S3: Versioning enabled
  • Redis: AOF persistence + snapshots

Recovery Procedures

  • RTO (Recovery Time Objective): 1 hour
  • RPO (Recovery Point Objective): 15 minutes
  • Multi-region failover capability
  • Database point-in-time recovery

Technology Stack Summary

Layer          | Technology       | Purpose
---------------|------------------|------------------------
Runtime        | Node.js 18+      | Application server
Framework      | Express.js       | Web framework
Database       | PostgreSQL 15    | Primary datastore
Cache          | Redis 7          | Caching & sessions
Search         | Elasticsearch 8  | Full-text search
Queue          | BullMQ           | Job processing
Storage        | AWS S3           | Object storage
CDN            | CloudFront       | Content delivery
Container      | Docker           | Containerization
Orchestration  | ECS/Kubernetes   | Container orchestration
Load Balancer  | AWS ALB          | Traffic distribution
Monitoring     | CloudWatch       | Metrics & logs
APM            | Sentry/New Relic | Performance monitoring

Future Enhancements

  1. GraphQL API - More flexible querying
  2. WebSocket Support - Real-time updates
  3. Multi-region Deployment - Lower latency globally
  4. Video Streaming - HLS/DASH support
  5. Machine Learning - Personalized recommendations
  6. Blockchain Integration - NFT minting for videos
  7. Social Features - Comments, likes, shares
  8. Advanced Analytics - User behavior tracking