skabartem/video_maker

AI-Powered Video Maker

An intelligent video creation tool that generates engaging videos from character descriptions and a topic. It combines state-of-the-art AI models with an interactive Telegram approval workflow.

Features

  • Intelligent Plot Generation: Uses Hermes LLM to create logical, engaging plots and scene breakdowns
  • High-Quality Visuals: Generates images with Nano Banana (Imagen 3) and videos with VEO 3.1
  • Interactive Approval: Telegram bot workflow for approving/rejecting generated content
  • Iterative Refinement: Analyzes feedback and regenerates only problematic scenes
  • Professional Prompting: Research-based prompt engineering for optimal results
  • Automated Stitching: Seamlessly combines scenes into final video

Architecture

┌─────────────────┐
│  User Input     │
│  via CLI        │
└────────┬────────┘
         │
         ▼
┌─────────────────────────────────────────┐
│         Workflow Orchestrator           │
├─────────────────────────────────────────┤
│  1. Plot Generation (Hermes LLM)        │
│  2. Image Generation (Nano Banana)      │
│  3. Telegram Approval Workflow          │
│  4. Video Generation (VEO 3.1)          │
│  5. Video Stitching (FFmpeg)            │
│  6. Iterative Feedback Loop             │
└─────────────────────────────────────────┘
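The stage order in the diagram can be pictured as a list of functions applied one after another to a shared context. The sketch below is illustrative only: `run_stages`, `Context`, and the two stand-in stages are hypothetical names and do not describe the actual contents of `src/workflow.py`.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Stage = Tuple[str, Callable[[Context], Context]]


def run_stages(stages: List[Stage], context: Context) -> Context:
    """Run each stage in order, recording which ones have completed."""
    for name, stage in stages:
        context = stage(context)
        context.setdefault("completed", []).append(name)
    return context


# Stand-in stages; real ones would call Hermes, Nano Banana, VEO and FFmpeg.
def plot_stage(ctx: Context) -> Context:
    return {**ctx, "scenes": [f"scene about {ctx['topic']}"]}


def image_stage(ctx: Context) -> Context:
    return {**ctx, "images": [f"image_{i}.png" for i, _ in enumerate(ctx["scenes"])]}


STAGES: List[Stage] = [("plot", plot_stage), ("images", image_stage)]
```

Calling `run_stages(STAGES, {"topic": "a quest"})` returns the context with a `completed` list, which makes it easy to see where a run stopped.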

Prerequisites

  • Python 3.9+
  • FFmpeg installed on your system
  • Google Cloud account with Vertex AI enabled
  • Hermes API key
  • Nano Banana (AI Studio) API key
  • Telegram Bot Token and Chat ID

Installation

  1. Clone the repository:
git clone <repository-url>
cd video_maker
  2. Install dependencies:
pip install -r requirements.txt
  3. Install FFmpeg (if not already installed):
# Ubuntu/Debian
sudo apt-get install ffmpeg

# macOS
brew install ffmpeg

# Windows
# Download from https://ffmpeg.org/download.html
  4. Set up Google Cloud credentials:
# Download service account JSON from Google Cloud Console
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
  5. Configure environment variables:
cp .env.example .env
# Edit .env with your API keys and configuration

Configuration

Edit the .env file with your credentials:

# API Keys
HERMES_API_KEY=your_hermes_key
NANO_BANANA_API_KEY=your_nano_banana_key
VEO_PROJECT_ID=your_google_cloud_project_id
VEO_LOCATION=us-central1
TELEGRAM_BOT_TOKEN=your_telegram_bot_token
TELEGRAM_CHAT_ID=your_telegram_chat_id

# Google Cloud credentials (optional if already exported in your shell)
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

# Video Settings
DEFAULT_VIDEO_LENGTH=30
MAX_RETRIES=3
FRAME_GENERATION_MODE=first
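As a rough illustration of how these values might be read and checked, the sketch below loads them from the process environment. It assumes the contents of .env have already been loaded into the environment, and the choice of which keys are mandatory, the `Settings` class, and `load_settings` are assumptions for this example rather than the project's actual `src/config.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumed to be mandatory for this sketch; the project may differ.
REQUIRED_KEYS = (
    "HERMES_API_KEY",
    "NANO_BANANA_API_KEY",
    "VEO_PROJECT_ID",
    "TELEGRAM_BOT_TOKEN",
    "TELEGRAM_CHAT_ID",
)


@dataclass(frozen=True)
class Settings:
    veo_location: str
    default_video_length: int
    max_retries: int
    frame_generation_mode: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Fail early on missing keys, then parse the optional settings."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError("missing configuration: " + ", ".join(missing))
    return Settings(
        veo_location=env.get("VEO_LOCATION", "us-central1"),
        default_video_length=int(env.get("DEFAULT_VIDEO_LENGTH", "30")),
        max_retries=int(env.get("MAX_RETRIES", "3")),
        frame_generation_mode=env.get("FRAME_GENERATION_MODE", "first"),
    )
```

Reporting every missing key at once, instead of stopping at the first, saves a round trip when several values are absent.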

Setting up Telegram Bot

  1. Create a bot with @BotFather
  2. Get your bot token
  3. Start a chat with your bot
  4. Get your chat ID from @userinfobot
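To check that the token and chat ID work together, a test message can be sent through the Telegram Bot API `sendMessage` method. The sketch below only builds the HTTP request with the standard library; `build_send_message` is a hypothetical helper and not part of `src/telegram/bot.py`.

```python
import json
import urllib.request


def build_send_message(token: str, chat_id: str, text: str) -> urllib.request.Request:
    """Build a POST request for the Bot API sendMessage method."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen(...)` sends the message. If the bot stays silent, the token or chat ID is the likely culprit.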

Usage

Basic Usage

python main.py \
  --characters "a brave knight in shining armor" "a wise old wizard" \
  --topic "an epic quest to find a magical artifact" \
  --length 30 \
  --style "cinematic fantasy"

Command Line Arguments

  • --characters, -c: Character descriptions (can specify multiple)
  • --topic, -t: What the video is about (required)
  • --length, -l: Video length in seconds (default: 30)
  • --style, -s: Video style (optional)
  • --validate-config: Validate configuration and exit
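The arguments above map naturally onto `argparse`. The following is a minimal sketch of such a parser, not the actual code in `main.py`; in particular, treating `--topic` as required only when `--validate-config` is absent is an assumption made so that the validation flag can be used on its own.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--characters", "-c", nargs="+", default=[],
                        help="one or more character descriptions")
    parser.add_argument("--topic", "-t", help="what the video is about")
    parser.add_argument("--length", "-l", type=int, default=30,
                        help="video length in seconds")
    parser.add_argument("--style", "-s", default=None, help="video style")
    parser.add_argument("--validate-config", action="store_true",
                        help="validate configuration and exit")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Enforce the required topic by hand so --validate-config works alone.
    if not args.validate_config and not args.topic:
        parser.error("--topic is required unless --validate-config is given")
    return args
```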

Example Commands

Fantasy Adventure:

python main.py \
  -c "a brave warrior princess" "a mischievous dragon" \
  -t "becoming unlikely friends during a festival" \
  -l 45 \
  -s "animated, whimsical"

Documentary Style:

python main.py \
  -c "a marine biologist" \
  -t "discovering a new species in the deep ocean" \
  -l 60 \
  -s "documentary, National Geographic style"

Sci-Fi:

python main.py \
  -c "a space explorer" "an alien ambassador" \
  -t "first contact with an alien civilization" \
  -l 40 \
  -s "cinematic sci-fi, dramatic lighting"

Workflow

1. Plot Generation

The system uses Hermes LLM to create:

  • Engaging plot summary
  • Detailed scene breakdown
  • Character actions and cinematography
  • Audio descriptions

2. Image Generation & Approval

For each scene:

  • Generates optimized image prompts using Hermes
  • Creates images with Nano Banana (Imagen 3)
  • Sends to Telegram for approval
  • Regenerates rejected images until all are approved
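The approve-or-regenerate cycle above can be sketched as a loop over the scenes that are still pending. In this sketch `generate` and `ask_approval` are hypothetical callables standing in for the image client and the Telegram bot, and the `max_rounds` cap is an assumption added so the loop cannot run forever.

```python
from typing import Callable, Dict, Iterable


def approve_all(
    scene_ids: Iterable[int],
    generate: Callable[[int], str],
    ask_approval: Callable[[int, str], bool],
    max_rounds: int = 3,
) -> Dict[int, str]:
    """Regenerate only rejected scenes until every scene has an approved image."""
    approved: Dict[int, str] = {}
    pending = list(scene_ids)
    for _ in range(max_rounds):
        if not pending:
            break
        still_pending = []
        for scene_id in pending:
            image = generate(scene_id)
            if ask_approval(scene_id, image):
                approved[scene_id] = image
            else:
                still_pending.append(scene_id)
        pending = still_pending
    if pending:
        raise RuntimeError(f"scenes never approved: {pending}")
    return approved
```

Approved images are never regenerated, so later rounds only pay for the scenes that were rejected.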

3. Video Generation

  • Converts approved images to videos using VEO 3.1
  • Uses research-based prompting for best quality
  • Includes audio generation
  • Maintains character and scene consistency

4. Video Stitching

  • Combines all scene videos
  • Normalizes resolution and frame rate
  • Creates seamless final video
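One way to combine clips while normalizing resolution and frame rate is FFmpeg's `concat` filter, with each video stream first passed through `scale`, `fps`, and `setsar`. The sketch below builds such a command; it is not taken from `src/video/stitcher.py`, the 1280x720 at 24 fps defaults are arbitrary, and it assumes every clip has exactly one video and one audio stream.

```python
from typing import List


def build_concat_command(
    inputs: List[str],
    output: str,
    width: int = 1280,
    height: int = 720,
    fps: int = 24,
) -> List[str]:
    """Build an ffmpeg command that normalizes and concatenates clips."""
    if not inputs:
        raise ValueError("at least one input clip is required")
    cmd = ["ffmpeg", "-y"]
    for path in inputs:
        cmd += ["-i", path]
    filters = []
    concat_inputs = ""
    for i in range(len(inputs)):
        # Bring each video stream to a common size, frame rate and pixel aspect.
        filters.append(f"[{i}:v]scale={width}:{height},fps={fps},setsar=1[v{i}]")
        concat_inputs += f"[v{i}][{i}:a]"
    filters.append(f"{concat_inputs}concat=n={len(inputs)}:v=1:a=1[v][a]")
    cmd += ["-filter_complex", ";".join(filters), "-map", "[v]", "-map", "[a]", output]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Note that a plain `scale` stretches clips whose aspect ratio differs from the target.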

5. Iterative Feedback

  • Sends final video via Telegram
  • Analyzes user feedback with Hermes
  • Regenerates only problematic scenes
  • Re-stitches and repeats until approved

Project Structure

video_maker/
├── main.py                 # Entry point
├── requirements.txt        # Python dependencies
├── .env.example           # Example configuration
├── README.md              # This file
├── src/
│   ├── __init__.py
│   ├── config.py          # Configuration management
│   ├── workflow.py        # Main orchestrator
│   ├── api/
│   │   ├── hermes_client.py      # Hermes LLM integration
│   │   ├── nano_banana_client.py # Image generation
│   │   └── veo_client.py         # Video generation
│   ├── telegram/
│   │   └── bot.py         # Telegram approval bot
│   └── video/
│       └── stitcher.py    # Video processing
├── output/                # Generated content
│   ├── images/
│   ├── videos/
│   └── final/
└── temp/                  # Temporary files

Advanced Features

Prompt Engineering

The system uses research-based VEO 3.1 prompting:

  • Detailed cinematography specifications
  • Precise lighting descriptions (e.g., "soft wrap", "hard rim")
  • Camera movements (dolly, tracking, crane shots)
  • Character consistency across scenes
  • Audio and dialogue integration

Regeneration Logic

Smart scene regeneration:

  • Keeps approved scenes unchanged
  • Only regenerates problematic parts
  • Maintains plot coherence
  • Preserves good elements
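The idea of keeping approved scenes untouched can be expressed in a few lines. In this sketch the set of flagged scene numbers is taken as given (in the project it would come from the feedback analysis), and `regenerate_flagged` and `regenerate` are hypothetical names.

```python
from typing import Callable, Dict, Set


def regenerate_flagged(
    videos: Dict[int, str],
    flagged: Set[int],
    regenerate: Callable[[int], str],
) -> Dict[int, str]:
    """Return a new scene-to-video mapping with only flagged scenes redone."""
    return {
        scene_id: regenerate(scene_id) if scene_id in flagged else path
        for scene_id, path in videos.items()
    }
```

Because the result is a new mapping, the previous version stays available for comparison or rollback.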

Error Handling

  • Automatic retry with exponential backoff
  • Multiple generation attempts for failed scenes
  • Graceful degradation
  • Detailed error logging
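Retry with exponential backoff means waiting longer after each consecutive failure, typically doubling the delay. A minimal sketch follows; the function name, the one-second base delay, and the injectable `sleep` parameter are choices made for this example, not details of the project's implementation.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying up to max_retries times with doubling delays."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With the defaults, a call that keeps failing is attempted four times in total, with waits of 1, 2, and 4 seconds in between.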

Troubleshooting

Common Issues

FFmpeg not found:

# Install ffmpeg for your system
sudo apt-get install ffmpeg  # Ubuntu/Debian
brew install ffmpeg          # macOS

API Authentication errors:

  • Verify all API keys in .env
  • Check Google Cloud credentials path
  • Ensure VEO 3.1 is enabled in your GCP project

Telegram bot not responding:

  • Verify bot token is correct
  • Check that you've started a chat with the bot
  • Confirm chat ID is accurate

Video generation fails:

  • VEO 3.1 generates clips of at most 8 seconds; make sure no single scene exceeds that limit
  • Check internet connection
  • Verify GCP quotas and billing
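The 8-second limit determines how many clips a video of a given length needs. The sketch below splits a total length into the fewest clips of near-equal duration; `plan_clip_lengths` is a hypothetical helper, and whether the VEO API accepts every integer duration it returns is not something this README states, so treat the output as a planning aid.

```python
import math
from typing import List


def plan_clip_lengths(total_seconds: int, max_clip: int = 8) -> List[int]:
    """Split total_seconds into the fewest clips, none longer than max_clip."""
    if total_seconds <= 0 or max_clip <= 0:
        raise ValueError("lengths must be positive")
    count = math.ceil(total_seconds / max_clip)
    base, extra = divmod(total_seconds, count)
    # The first `extra` clips get one additional second each.
    return [base + 1] * extra + [base] * (count - extra)
```

For the default 30-second video this gives four clips, two of 8 seconds and two of 7.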

Validation

Test your configuration:

python main.py --validate-config

API Rate Limits

Be aware of rate limits:

  • VEO 3.1: Check Google Cloud quotas
  • Nano Banana: AI Studio limits
  • Hermes: API-specific limits

The system includes automatic retry logic and delays between requests.

Performance Tips

  1. Parallel Processing: Images are generated in batches
  2. Caching: Approved content is saved and reused
  3. Smart Regeneration: Only problematic scenes are redone
  4. Scene Length: Shorter scenes (5-8s) generate faster
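Batch generation can be sketched with a thread pool that processes a few items at a time and keeps the results in input order. How the project actually batches its requests is not described here, so `generate_in_batches`, `worker`, and the batch size of 4 are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def generate_in_batches(
    items: Sequence[A],
    worker: Callable[[A], B],
    batch_size: int = 4,
) -> List[B]:
    """Run worker over items, batch_size at a time, preserving order."""
    results: List[B] = []
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        for start in range(0, len(items), batch_size):
            batch = items[start:start + batch_size]
            # map() yields results in input order; extend() waits for the batch.
            results.extend(pool.map(worker, batch))
    return results
```

Finishing one batch before starting the next keeps the number of concurrent requests bounded, which helps with the rate limits mentioned above.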

Contributing

Contributions welcome! Areas for improvement:

  • Additional video generation models
  • More transition effects
  • Advanced scene planning
  • Voice-over generation
  • Music integration

License

MIT License - see LICENSE file for details

Acknowledgments

  • Hermes LLM by Nous Research for intelligent prompting
  • Nano Banana (Imagen 3) by Google for image generation
  • VEO 3.1 by Google for video generation
  • FFmpeg for video processing
  • Research on VEO 3.1 prompting best practices (2025)

Support

For issues and questions:

  1. Check the Troubleshooting section
  2. Verify configuration with --validate-config
  3. Review logs in the output directories
  4. Open an issue on GitHub

Happy video making! 🎬
