A TypeScript-based CLI pipeline that automates the generation of faceless motivational videos using the Vercel AI SDK, ElevenLabs, and Pexels APIs, optimized for TikTok with FFmpeg.
- Clone the repository
- Install dependencies:
pnpm install
- Create a .env file based on .env.example and add your API keys:
NODE_ENV=development
OPENAI_API_KEY=your_openai_api_key
OPENAI_MODEL=gpt-4o
DEBUG=true
ELEVENLABS_API_KEY=your_elevenlabs_api_key
ELEVENLABS_DEFAULT_VOICE_ID=your_elevenlabs_voice_id
PEXELS_API_KEY=your_pexels_api_key
- Make sure you have FFmpeg installed on your system. For macOS:
brew install ffmpeg
To generate a complete TikTok-optimized video from a topic, use the generate-video command:
pnpm generate-video "your topic here" [options]
This will:
- Generate a script based on the topic
- Convert the script to speech using ElevenLabs
- Fetch related videos from Pexels based on the script's keywords and visual tags
- Compose everything into a final video optimized for TikTok
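The four steps above can be sketched as a single orchestration function. This is a minimal illustration with hypothetical types and step names, not the project's actual module API; the real CLI wires in the script, voice, and video modules described later.

```typescript
// Hypothetical shapes for illustration; field names are assumptions.
interface Script {
  text: string;
  keywords: string[];
  visualTags: string[];
}

interface PipelineResult {
  script: Script;
  voicePath: string;
  videoPaths: string[];
  finalPath: string;
}

// Each stage is injected so the pipeline itself stays a pure sequence.
type Steps = {
  generateScript: (topic: string) => Promise<Script>;
  generateVoice: (script: Script) => Promise<string>;
  fetchVideos: (queries: string[]) => Promise<string[]>;
  composeVideo: (voicePath: string, videoPaths: string[]) => Promise<string>;
};

async function generateVideo(topic: string, steps: Steps): Promise<PipelineResult> {
  const script = await steps.generateScript(topic);                // 1. script from topic
  const voicePath = await steps.generateVoice(script);             // 2. ElevenLabs TTS
  const videoPaths = await steps.fetchVideos(
    [...script.keywords, ...script.visualTags],                    // 3. Pexels search terms
  );
  const finalPath = await steps.composeVideo(voicePath, videoPaths); // 4. FFmpeg composition
  return { script, voicePath, videoPaths, finalPath };
}
```

Injecting the stages this way also makes the skip flags natural to implement: a skipped stage is replaced by a step that loads an existing asset instead of generating a new one.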
Script Generation Options:
- --duration <seconds>: Target script duration
- --tone <string>: Tone of the script (e.g., "motivational", "educational")
- --audience <string>: Target audience (e.g., "young adults", "professionals")
Voice Generation Options:
- --voice <id>: ElevenLabs voice ID
- --stability <number>: Voice stability (0.0-1.0)
- --similarity-boost <number>: Voice similarity boost (0.0-1.0)
Video Options:
- --video-query <string>: Custom search query for video (defaults to script's visual prompt or keywords)
- --video-count <number>: Number of videos to fetch (default: 3)
- --video-orientation <orientation>: 'landscape', 'portrait', or 'square' (default: 'portrait' for TikTok)
- --width <pixels>: Output video width (default: 1080 for TikTok)
- --height <pixels>: Output video height (default: 1920 for TikTok)
- --fps <number>: Output video frame rate (default: 30)
Caption Options:
- --disable-captions: Disable captions in the video (captions are enabled by default)
- --animated-captions: Use animated word highlighting for captions (default: true)
Workflow Options:
- --skip-script: Skip script generation (requires --use-timestamp)
- --skip-voice: Skip voice generation (requires --use-timestamp)
- --skip-video-fetch: Skip video fetching (requires --use-timestamp)
- --skip-video-composition: Skip video composition
- --use-timestamp <timestamp>: Use existing assets with the specified timestamp
Example:
pnpm generate-video "discipline" --tone "motivational" --audience "young adults" --video-count 5
The generated videos are specifically optimized for TikTok with:
- 9:16 vertical aspect ratio (1080x1920 pixels)
- Proper video encoding settings for TikTok's platform
- Smart transitions between video clips
- Videos automatically selected based on script content
- Center-crop scaling to maintain visual focus
- Captions with word-by-word highlighting based on ElevenLabs timestamps (enabled by default)
- Subtle background filter to enhance caption readability
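The scaling and encoding choices above can be expressed as an FFmpeg argument builder. This is a sketch: the filter chain (scale-to-cover followed by a center crop, which is FFmpeg's default crop position) matches the behavior described, but the specific encoder flags are typical H.264 settings assumed for illustration, not necessarily the project's exact configuration.

```typescript
// Build FFmpeg arguments for a TikTok-oriented output: scale the source up
// to cover the 9:16 frame, then center-crop to the exact dimensions.
function buildTikTokArgs(
  input: string,
  output: string,
  width = 1080,
  height = 1920,
  fps = 30,
): string[] {
  const filter = [
    // Scale so the frame is fully covered, preserving aspect ratio.
    `scale=${width}:${height}:force_original_aspect_ratio=increase`,
    // Crop the overflow; FFmpeg crops from the center by default.
    `crop=${width}:${height}`,
    `fps=${fps}`,
  ].join(",");
  return [
    "-i", input,
    "-vf", filter,
    "-c:v", "libx264",          // widely supported H.264 encoding
    "-pix_fmt", "yuv420p",      // broadest player compatibility
    "-movflags", "+faststart",  // streamable MP4 (moov atom up front)
    output,
  ];
}
```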
The script dictates which videos are fetched: visual tags from the script are used to search for relevant footage that matches the narrative.
Each time you run one of the CLI commands, a new directory is created in the output folder using the current timestamp as the directory name. All files generated during that run are stored in this directory:
output/
├── 1684521234567/ (first run)
│ ├── script.json
│ ├── voice.mp3
│ ├── videos.json
│ ├── final.mp4
│ ├── final_with_captions.mp4 (if captions enabled)
│ ├── metadata.json
│ └── videos/ (raw video files)
│ ├── video-main-1.mp4
│ └── video-nature-1.mp4
└── 1684521345678/ (second run)
├── script.json
├── voice.mp3
└── ...
This organization keeps all files from a single generation run together, making it easier to manage multiple generations and preventing file conflicts.
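The per-run directory convention amounts to a small helper like the following. This is a hypothetical sketch; the real CLI's implementation may differ in detail, but the naming scheme (a millisecond Unix timestamp under the output folder) matches the layout shown above.

```typescript
import { mkdirSync } from "node:fs";
import { join } from "node:path";

// Create a fresh run directory under the output folder, named by the
// current Unix timestamp in milliseconds, e.g. output/1684521234567.
function createRunDir(outputRoot = "output", timestamp = Date.now()): string {
  const dir = join(outputRoot, String(timestamp));
  mkdirSync(dir, { recursive: true }); // also creates the root if missing
  return dir;
}
```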
You can run specific parts of the pipeline by using the skip flags with an existing timestamp:
# Generate script and voice, but use existing videos for composition
pnpm generate-video "discipline" --skip-video-fetch --use-timestamp 1684521234567
# Use existing script and voice, but fetch new videos and compose
pnpm generate-video "discipline" --skip-script --skip-voice --use-timestamp 1684521234567
When using --use-timestamp, the CLI will look for existing files in the specified timestamp directory.
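Resolving a previous run comes down to mapping each skippable stage to the file it expects inside the timestamp directory. A hypothetical sketch of that lookup, using the file names from the output layout above:

```typescript
import { join } from "node:path";

// Map each skippable stage to the asset it expects to find in the
// timestamp directory. File names follow the documented output layout.
function resolveExistingAssets(timestamp: string, outputRoot = "output") {
  const dir = join(outputRoot, timestamp);
  return {
    scriptPath: join(dir, "script.json"), // read when --skip-script is set
    voicePath: join(dir, "voice.mp3"),    // read when --skip-voice is set
    videosPath: join(dir, "videos.json"), // read when --skip-video-fetch is set
  };
}
```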
- src/lib/scriptGenerator: Script generation module using the Vercel AI SDK with OpenAI
- src/lib/voiceGenerator: Voice generation module using the ElevenLabs API
- src/lib/videoFetcher: Video fetching and composition module using the Pexels API and FFmpeg
- src/cli: CLI commands for the various pipeline steps
- Uses Vercel AI SDK for type-safe, structured AI responses
- Pexels API for high-quality stock videos
- ElevenLabs for realistic voice generation with precise timestamp data
- FFmpeg for video composition with TikTok-optimized settings
- Advanced captions using ASS subtitle format with word-level highlighting
- Zod schemas ensure consistent output format
- Script-driven video selection for better content relevance
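To illustrate the structured output the Zod schemas enforce, the generated script might look like the shape below. The exact field names are assumptions; it is shown here as a plain interface with a runtime guard so the example stays dependency-free, whereas the real project defines the equivalent with Zod.

```typescript
// Hypothetical shape of a generated script; the real project validates
// this with a Zod schema, which plays the same role as the guard below.
interface GeneratedScript {
  text: string;          // narration sent to ElevenLabs
  keywords: string[];    // search terms for Pexels
  visualTags: string[];  // visual cues that drive footage selection
}

function isGeneratedScript(value: unknown): value is GeneratedScript {
  const v = value as Partial<GeneratedScript> | null;
  return (
    typeof v === "object" && v !== null &&
    typeof v.text === "string" &&
    Array.isArray(v.keywords) && v.keywords.every((k) => typeof k === "string") &&
    Array.isArray(v.visualTags) && v.visualTags.every((t) => typeof t === "string")
  );
}
```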
UNLICENSED - Private use only