Wenhao Sun,
Rong-Cheng Tu,
Jingyi Liao,
Zhao Jin,
Dacheng Tao
Nanyang Technological University
ICML'25: A training-free method to accelerate video DiTs without compromising output quality.
Install PyTorch. We have tested the code with PyTorch 2.5.0 and CUDA 12.4, but it should work with other versions as well. You can install PyTorch using the following command:
python -m pip install torch==2.5.0 torchvision==0.20.0 --index-url https://download.pytorch.org/whl/cu124
Install the dependencies:
python -m pip install -r requirements.txt
Install our optimized Euclidean distance operator for better performance:
python -m pip install .
We use general scripts to demonstrate the usage of our method. You can find the detailed scripts for each model in the scripts folder:
- CogVideoX: scripts/cogvideox/inference.sh
- Mochi-1: scripts/mochi/inference.sh
- HunyuanVideo: scripts/hyvideo/inference.sh
- FastVideo-Hunyuan: scripts/fast_hyvideo/inference.sh
Run the baseline model sampling without acceleration:
CUDA_VISIBLE_DEVICES=0 python scripts/<model_name>/inference.py \
--model_name <model_name> \
--pretrained_model_path <model_name_on_hf> \
--text_prompt_file configs/prompts.txt \
--output_dir results/baseline \
--log_level info \
--seed 42 \
--log_latency
- You can edit configs/prompts.txt or the --text_prompt_file option to change the text prompt.
- The log_latency option is enabled to log the latency of DiTs.
- See scripts/<model_name>/inference.py for more detailed explanations of the arguments.
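The prompt file passed to --text_prompt_file can also be generated programmatically. The sketch below assumes one prompt per line, which this README does not state; check scripts/<model_name>/inference.py for the format actually expected. The helper name write_prompts is illustrative and not part of the repository.

```python
from pathlib import Path


def write_prompts(prompts, path="configs/prompts.txt"):
    """Write prompts to a text file, one per line (assumed format)."""
    # Collapse internal whitespace and newlines so each prompt stays on one line,
    # and skip prompts that are empty after stripping.
    cleaned = [" ".join(p.split()) for p in prompts if p.strip()]
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return out
```

Writing to a separate file, for example write_prompts(my_prompts, path="configs/my_prompts.txt"), and pointing --text_prompt_file at it keeps the default configs/prompts.txt untouched.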
Run the video DiTs with AsymRnR for acceleration:
CUDA_VISIBLE_DEVICES=0 python scripts/<model_name>/inference.py \
--model_name <model_name> \
--pretrained_model_path <model_name_on_hf> \
--text_prompt_file configs/prompts.txt \
--output_dir results/arnr-base \
--log_level info \
--seed 42 \
+ --enable_rnr \
+ --schedule_file schedulers/<scheduler>.safetensors \
+ --rnr_config_file configs/<reduction_configuration_file>.yaml \
--log_latency
- The enable_rnr option is used to enable AsymRnR.
- The schedule_file option specifies the schedule file, which stores the similarity distribution of the baseline model. See Section 3.4 of the paper for more details.
- The rnr_config_file option is used to scale the acceleration. See the configs folder and the Appendix of the paper for more details.
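To compare the baseline against one or more reduction configurations, the two commands above can be assembled in Python. This is a sketch that uses only the flags shown in this README. The helper name build_command is hypothetical, the requirement that the schedule and config files be passed together is an assumption based on the command above, and the placeholders in angle brackets must be replaced with real names from the scripts, schedulers and configs folders.

```python
import shlex


def build_command(model_name, pretrained_model_path, output_dir,
                  schedule_file=None, rnr_config_file=None, seed=42):
    """Return the inference command as an argument list.

    AsymRnR is enabled only when both schedule_file and rnr_config_file are given.
    """
    cmd = [
        "python", f"scripts/{model_name}/inference.py",
        "--model_name", model_name,
        "--pretrained_model_path", pretrained_model_path,
        "--text_prompt_file", "configs/prompts.txt",
        "--output_dir", output_dir,
        "--log_level", "info",
        "--seed", str(seed),
    ]
    if schedule_file is not None and rnr_config_file is not None:
        cmd += ["--enable_rnr",
                "--schedule_file", schedule_file,
                "--rnr_config_file", rnr_config_file]
    elif schedule_file is not None or rnr_config_file is not None:
        # Assumption: the AsymRnR flags are always used together, as in the README.
        raise ValueError("schedule_file and rnr_config_file must be given together")
    cmd.append("--log_latency")
    return cmd


# Placeholders are kept as in the README; substitute real values before running.
baseline = build_command("<model_name>", "<model_name_on_hf>", "results/baseline")
accelerated = build_command(
    "<model_name>", "<model_name_on_hf>", "results/arnr-base",
    schedule_file="schedulers/<scheduler>.safetensors",
    rnr_config_file="configs/<reduction_configuration_file>.yaml",
)
print(shlex.join(baseline))
print(shlex.join(accelerated))
```

Each list can be passed to subprocess.run with CUDA_VISIBLE_DEVICES set in the environment, so the baseline and accelerated runs share the same prompts and seed.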
- ... square_dist is not available ...
  - Please make sure you have installed the optimized Euclidean distance operator by running python -m pip install . in the root directory of the repository.
- ImportError: libc10.so: cannot open shared object file: No such file or directory
  - libc10.so is made available by PyTorch. Please import torch before import square_dist.
- ... libstdc++.so.6: version GLIBCXX_x.x.xx not found ..
  - This error is due to an incompatible GCC version. The simplest solution is to install libstdcxx-ng by running conda install -c conda-forge libstdcxx-ng.
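The import order required by the libc10.so item above can be wrapped in a small helper. This is a sketch: the function name load_square_dist is hypothetical, and it assumes only what this README states, namely that the operator is importable as square_dist after running python -m pip install . in the repository root.

```python
import importlib


def load_square_dist():
    """Import torch first, then square_dist; return the module or None.

    Importing torch first makes libc10.so available to the operator.
    """
    try:
        importlib.import_module("torch")  # must come before square_dist
        return importlib.import_module("square_dist")
    except ImportError as err:
        # PyTorch or the operator is missing, or a shared library failed to load.
        print(f"square_dist is not available: {err}")
        return None


square_dist = load_square_dist()
```

If the helper returns None, the printed message shows which of the errors listed above applies.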
- Project page to introduce the method and results
- 2025-01-26 Add more visualization results in the Supplementary Material
- 2025-01-23 Code released
Thanks to the authors of the following repositories for their great work and for open-sourcing their code and models: Diffusers, CogVideoX, Mochi-1, HunyuanVideo, FastVideo
If you find our work useful, please consider citing our paper:
@article{sun2024asymrnr,
title={AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration},
author={Sun, Wenhao and Tu, Rong-Cheng and Liao, Jingyi and Jin, Zhao and Tao, Dacheng},
journal={arXiv preprint arXiv:2412.11706},
year={2024}
}