Generate stunning AI-powered videos from text prompts or images using the Wan 2.0 model by Alibaba.
- 🎬 Text-to-Video Generation - Create videos from descriptive text prompts
- 🖼️ Image-to-Video Animation - Bring static images to life with AI
- 🌐 Modern Web Interface - Beautiful, responsive UI with glassmorphism design
- ⚡ Real-time Processing - Fast video generation with progress tracking
- 📱 Mobile Responsive - Works seamlessly on all devices
- 🎨 Customizable Settings - Control resolution, FPS, and duration
- Python 3.8 or higher
- CUDA-compatible GPU (recommended) or CPU
- 8GB+ RAM (16GB+ recommended)
- Clone or navigate to the project directory:

  ```bash
  cd d:\Freelancing\ImagetoVideo
  ```

- Create a virtual environment:

  ```bash
  python -m venv venv
  ```

- Activate the virtual environment:

  Windows:

  ```bash
  venv\Scripts\activate
  ```

  Linux/Mac:

  ```bash
  source venv/bin/activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Set up environment variables:

  ```bash
  copy .env.example .env
  ```

  Edit the `.env` file to configure your settings (optional).
- Start the Flask server:

  ```bash
  python app.py
  ```

- Open your browser and navigate to `http://localhost:5000`
- Start creating videos!
- Select the Text to Video tab
- Enter a descriptive prompt (e.g., "A majestic dragon flying over a medieval castle at sunset")
- Optionally add a negative prompt to avoid unwanted elements
- Configure settings (duration, FPS, resolution)
- Click Generate Video
- Wait for processing and download your video!
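The steps above can also be driven programmatically. Below is a minimal sketch using only the Python standard library; the endpoint and JSON fields are taken from the API section later in this README, and the server is assumed to be running at `http://localhost:5000`:

```python
import json
import urllib.request

def build_text_to_video_request(prompt, negative_prompt="", duration=5,
                                fps=24, resolution=720,
                                base_url="http://localhost:5000"):
    """Assemble the POST request for the /api/text-to-video endpoint."""
    payload = {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "duration": duration,   # seconds
        "fps": fps,
        "resolution": resolution,
    }
    return urllib.request.Request(
        f"{base_url}/api/text-to-video",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_text_to_video_request(
    "A majestic dragon flying over a medieval castle at sunset")
# urllib.request.urlopen(req)  # uncomment with the server running
```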
- Select the Image to Video tab
- Upload an image (PNG, JPG, JPEG, or WEBP)
- Optionally add a motion prompt (e.g., "gentle camera zoom")
- Configure settings
- Click Generate Video
- Download your animated video!
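Scripting the image upload requires a `multipart/form-data` body. In practice a library such as `requests` handles this for you; the hand-rolled sketch below just shows what gets sent (field names match the API section later in this README; the filename and bytes are placeholders):

```python
import io
import uuid

def encode_multipart(fields, file_field, filename, file_bytes,
                     content_type="image/png"):
    """Encode text fields plus one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header).
    """
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    for name, value in fields.items():
        buf.write(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"\r\n\r\n{value}\r\n'.encode())
    buf.write(
        f'--{boundary}\r\nContent-Disposition: form-data; '
        f'name="{file_field}"; filename="{filename}"\r\n'
        f'Content-Type: {content_type}\r\n\r\n'.encode())
    buf.write(file_bytes)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"

body, ctype = encode_multipart(
    {"prompt": "gentle camera zoom", "duration": 5, "fps": 24, "resolution": 720},
    "image", "photo.png", b"<raw image bytes>")
```

POST `body` to `/api/image-to-video` with `Content-Type` set to the returned `ctype`.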
- "A serene lake at sunrise with mist rising from the water, cinematic 4k"
- "Futuristic city with flying cars and neon lights, cyberpunk style"
- "Ocean waves crashing on a rocky shore, slow motion"
- "Northern lights dancing in the night sky over snowy mountains"
- "Slow zoom in with gentle camera movement"
- "Pan from left to right smoothly"
- "Character walking forward naturally"
- "Leaves rustling in the wind"
Edit the `.env` file to customize:

```
# Model Configuration
MODEL_NAME=alibaba-pai/wan-2.0-5b
DEVICE=cuda  # Options: cuda, cpu, mps
USE_FP16=True

# Generation Settings
DEFAULT_RESOLUTION=720
DEFAULT_FPS=24
DEFAULT_DURATION=5
MAX_VIDEO_LENGTH=10

# Server Settings
HOST=0.0.0.0
PORT=5000
```

Project structure:

```
ImagetoVideo/
├── app.py               # Flask backend server
├── model_handler.py     # Wan 2.0 model integration
├── requirements.txt     # Python dependencies
├── .env.example         # Environment configuration template
├── .gitignore           # Git ignore rules
├── static/              # Frontend files
│   ├── index.html       # Main web interface
│   ├── css/
│   │   └── style.css    # Styling with glassmorphism
│   └── js/
│       └── app.js       # Client-side JavaScript
├── uploads/             # Temporary image uploads
└── outputs/             # Generated videos
```
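The `.env` values feed the server at startup. A sketch of reading them with the documented defaults (the actual app may use `python-dotenv`, which loads `.env` into the process environment first; the function name here is illustrative):

```python
import os

def load_settings():
    """Read generation settings from the environment,
    falling back to the defaults documented in .env.example."""
    return {
        "model_name": os.environ.get("MODEL_NAME", "alibaba-pai/wan-2.0-5b"),
        "device": os.environ.get("DEVICE", "cuda"),          # cuda, cpu, or mps
        "use_fp16": os.environ.get("USE_FP16", "True").lower() == "true",
        "resolution": int(os.environ.get("DEFAULT_RESOLUTION", 720)),
        "fps": int(os.environ.get("DEFAULT_FPS", 24)),
        "duration": int(os.environ.get("DEFAULT_DURATION", 5)),
        "max_length": int(os.environ.get("MAX_VIDEO_LENGTH", 10)),
    }

settings = load_settings()
```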
```
GET /api/health
```

```
POST /api/text-to-video
Content-Type: application/json

{
  "prompt": "Your prompt here",
  "negative_prompt": "Optional",
  "duration": 5,
  "fps": 24,
  "resolution": 720
}
```

```
POST /api/image-to-video
Content-Type: multipart/form-data

image: <file>
prompt: "Optional motion description"
duration: 5
fps: 24
resolution: 720
```

```
GET /api/download/<filename>
```
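Once generation finishes, the result can be fetched through the download endpoint. A small sketch (the exact response schema of the generation endpoints isn't documented here, so the filename is assumed known; `download_url` and `download_video` are illustrative names):

```python
import urllib.request
from pathlib import Path

def download_url(filename, base_url="http://localhost:5000"):
    """Build the URL for the GET /api/download/<filename> endpoint."""
    return f"{base_url}/api/download/{filename}"

def download_video(filename, dest_dir="."):
    """Fetch a generated video and save it locally (server must be running)."""
    dest = Path(dest_dir) / filename
    with urllib.request.urlopen(download_url(filename)) as resp:
        dest.write_bytes(resp.read())
    return dest

print(download_url("my_video.mp4"))
# → http://localhost:5000/api/download/my_video.mp4
```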
Minimum:

- CPU: 4+ cores
- RAM: 8GB
- Storage: 10GB free space
- GPU: Not required (CPU mode available)

Recommended:

- CPU: 8+ cores
- RAM: 16GB+
- Storage: 20GB+ free space
- GPU: NVIDIA GPU with 8GB+ VRAM (RTX 3060 or better)
If the model fails to load, the application will run in mock mode for demonstration purposes. To use the actual Wan 2.0 model:
- Ensure you have sufficient GPU memory
- Try setting `USE_FP16=True` in `.env` to reduce memory usage
- Consider using a smaller model variant
- Check your internet connection for model downloads
- Reduce resolution in settings
- Decrease video duration
- Set `USE_FP16=True`
- Close other GPU-intensive applications
- Use GPU instead of CPU (set `DEVICE=cuda`)
- Reduce resolution and FPS
- Enable half-precision (`USE_FP16=True`)
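The `DEVICE` and `USE_FP16` settings interact: half-precision mainly pays off on an accelerator. Below is a sketch of how the device/dtype choice might be made; this is an assumption about `model_handler.py`'s behavior, not its actual code, and it falls back to CPU (or mock mode without PyTorch installed):

```python
def pick_device_and_dtype(prefer="cuda", use_fp16=True):
    """Pick a compute device and tensor dtype, falling back to
    CPU/float32 when the preferred accelerator is unavailable."""
    try:
        import torch
    except ImportError:
        return "cpu", None  # no torch: the app runs in mock mode
    if prefer == "cuda" and torch.cuda.is_available():
        return "cuda", torch.float16 if use_fp16 else torch.float32
    if (prefer == "mps" and getattr(torch.backends, "mps", None)
            and torch.backends.mps.is_available()):
        return "mps", torch.float16 if use_fp16 else torch.float32
    return "cpu", torch.float32  # fp16 rarely helps on CPU

device, dtype = pick_device_and_dtype()
```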
- First run will download the Wan 2.0 model (~10GB), which may take time
- Video generation time depends on duration, resolution, and hardware
- Generated videos are saved in the `outputs/` directory
- Uploaded images are temporarily stored and automatically cleaned up
- Backend: Flask, Python
- AI/ML: PyTorch, Hugging Face Diffusers, Wan 2.0
- Frontend: HTML5, CSS3, Vanilla JavaScript
- Video Processing: OpenCV, imageio
This project is for educational and demonstration purposes. Please refer to the Wan 2.0 model license for commercial usage restrictions.
Contributions are welcome! Feel free to submit issues or pull requests.
Made with ❤️ using Wan 2.0 AI