STARCaster: Spatio-Temporal AutoRegressive Video Diffusion for Identity- and View-Aware Talking Portraits
Updated Dec 16, 2025
Serverless Docker deployment for ByteDance LatentSync 1.6 lip-sync model, supporting local, staging, and production environments.
Convert audio to video with AI-generated images and word-by-word captions. Telegram bot powered by AssemblyAI, DeepSeek, and FFmpeg. Free and open source.