hhhh1138/VDOT

VDOT: Efficient Unified Video Creation via Optimal Transport Distillation

Paper PDF Project Page

Yutong Wang¹, Haiyu Zhang³˒², Tianfan Xue⁴˒², Yu Qiao², Yaohui Wang², Chang Xu¹*, Xinyuan Chen²*

¹USYD, ²Shanghai AI Laboratory, ³BUAA, ⁴CUHK

Introduction

VDOT is an efficient, unified video creation model that produces high-quality results in just 4 denoising steps. It applies computational Optimal Transport (OT) within the distillation process, which stabilizes training and improves both training and inference efficiency. VDOT covers a wide range of tasks in a single model, including Reference-to-Video (R2V), Video-to-Video (V2V), Masked Video Editing (MV2V), and arbitrary compositions of these tasks, matching the versatility of VACE at a significantly lower inference cost.
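
For readers new to computational OT, the sketch below shows what an entropic-regularized transport plan looks like on a toy problem, using the standard Sinkhorn iteration. It is a generic illustration only and is not VDOT's training objective or code; the function names `sinkhorn` and `transport_cost`, the cost matrix, and the `reg` value are all assumptions made for this example.

```python
import math


def sinkhorn(a, b, cost, reg=0.5, n_iters=200):
    """Approximate the entropic-regularized OT plan between histograms a and b.

    a, b  : lists of non-negative weights, each summing to 1
    cost  : cost[i][j] is the cost of moving mass from bin i of a to bin j of b
    reg   : entropic regularization strength (illustrative value)
    """
    n, m = len(a), len(b)
    # Gibbs kernel derived from the cost matrix.
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def transport_cost(plan, cost):
    """Total cost of a transport plan: sum over i, j of plan[i][j] * cost[i][j]."""
    return sum(p * c for row_p, row_c in zip(plan, cost) for p, c in zip(row_p, row_c))


# Toy problem: two bins, unit cost for moving mass between different bins.
source = [0.7, 0.3]
target = [0.4, 0.6]
cost_matrix = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(source, target, cost_matrix)
print("plan:", plan)
print("cost:", transport_cost(plan, cost_matrix))
```

The rows of the returned plan sum (approximately) to `source` and the columns to `target`. Smaller `reg` values bring the plan closer to the unregularized optimum but typically need more iterations to converge.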

[Demo video: sour_cover_compressed_.mp4]

⚙️ Installation

The codebase was tested with Python 3.10.13, CUDA version 12.4, and PyTorch >= 2.5.1.
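
As a quick sanity check before running anything, the following sketch compares a local environment against the tested versions listed above. It is not part of the VDOT codebase; the helper names `parse_version` and `check_environment` are hypothetical, and a reported mismatch only means the combination is untested, not that it will fail.

```python
import sys


def parse_version(text):
    """Turn a string such as '2.5.1+cu124' into (2, 5, 1).

    Parsing stops at the first component that is not purely numeric.
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_environment(python_version, torch_version, cuda_version):
    """Return a list of notes where the environment differs from the tested setup."""
    notes = []
    if parse_version(python_version)[:2] != (3, 10):
        notes.append(f"Python {python_version} differs from the tested 3.10.13")
    if torch_version is None:
        notes.append("PyTorch is not installed")
    elif parse_version(torch_version) < (2, 5, 1):
        notes.append(f"PyTorch {torch_version} is older than 2.5.1")
    if cuda_version is None or parse_version(cuda_version)[:2] != (12, 4):
        notes.append(f"CUDA {cuda_version} differs from the tested 12.4")
    return notes


if __name__ == "__main__":
    try:
        import torch  # optional here: the check still runs without it
        torch_version, cuda_version = torch.__version__, torch.version.cuda
    except ImportError:
        torch_version, cuda_version = None, None
    current_python = ".".join(str(n) for n in sys.version_info[:3])
    for note in check_environment(current_python, torch_version, cuda_version):
        print("note:", note)
```

An empty result means the environment matches the tested configuration.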

🚀 Usage

Acknowledgement

We are grateful to the following awesome projects: VACE, Wan, and Self-Forcing.

BibTeX
