Implementation of the Diffusion Transformer (DiT) model from the paper "Scalable Diffusion Models with Transformers".
See here for the official PyTorch implementation.
Requirements:
- Python 3.8
- TensorFlow 2.12
Use --train_file_pattern=<file_pattern> and --test_file_pattern=<file_pattern> to specify the paths of the training and test datasets. For example:
python ae_train.py --train_file_pattern='./train_dataset_path/*.png' --test_file_pattern='./test_dataset_path/*.png'
Use --file_pattern=<file_pattern> to specify the dataset path.
python ldt_train.py --file_pattern='./dataset_path/*.png'
Note: training the DiT requires a pretrained AutoencoderKL. Set ae_dir and ae_name in the ldt_config.py file to point to it.
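The relevant entries in ldt_config.py might look like the following sketch. The variable names are taken from the note above, but the values are placeholders; check the actual config file for the exact layout.

```python
# ldt_config.py (illustrative fragment -- values are assumptions, not the repo's defaults)
ae_dir = "./ae"        # directory containing the pretrained AutoencoderKL checkpoint
ae_name = "model_1"    # checkpoint name of the AutoencoderKL to load
```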
Use --model_dir=<model_dir> and --ldt_name=<ldt_name> to specify the pre-trained model. For example:
python sample.py --model_dir=ldt --ldt_name=model_1 --diffusion_steps=40
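The sampler combines the cosine schedule and DDIM listed in the implementation notes below. A minimal NumPy sketch of that combination is shown here; the model call is stubbed out, and all function and argument names are illustrative rather than the repo's actual API.

```python
import numpy as np

def cosine_alpha_bar(t, s=0.008):
    """Cumulative signal level at continuous time t in [0, 1] (cosine schedule)."""
    return np.cos((t + s) / (1 + s) * np.pi / 2) ** 2

def ddim_sample(denoise_fn, shape, diffusion_steps=40, seed=0):
    """Deterministic DDIM sampling: walk t from 1 down to 0 in `diffusion_steps` steps."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)                     # start from pure noise
    times = np.linspace(1.0, 0.0, diffusion_steps + 1)
    for t_now, t_next in zip(times[:-1], times[1:]):
        # clip to avoid dividing by zero at t = 1, where alpha_bar -> 0
        ab_now = np.clip(cosine_alpha_bar(t_now), 1e-5, 1.0)
        ab_next = np.clip(cosine_alpha_bar(t_next), 1e-5, 1.0)
        eps = denoise_fn(x, t_now)                     # model's noise prediction
        # recover the clean-sample estimate, then re-noise it to the next level
        x0 = (x - np.sqrt(1.0 - ab_now) * eps) / np.sqrt(ab_now)
        x = np.sqrt(ab_next) * x0 + np.sqrt(1.0 - ab_next) * eps
    return x
```

With --diffusion_steps=40 the sampler takes 40 denoising steps from pure noise; fewer steps trade sample quality for speed.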
Adjust hyperparameters in the ae_config.py and ldt_config.py files.
Implementation notes:
- LDT is designed to offer reasonable performance using a single GPU (RTX 3080 Ti).
- LDT largely follows the original DiT model.
- DiT Block with adaLN-Zero.
- Diffusion Transformer with Linformer attention.
- Cosine schedule.
- DDIM sampler.
- FID evaluation.
- AutoencoderKL with PatchGAN discriminator and hinge loss.
- This implementation uses code from the beresandras repo, released under the MIT Licence.
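The adaLN-Zero block listed above conditions each transformer block on the timestep embedding through a learned shift, scale, and gate, with the gate projection zero-initialized so every residual branch starts as the identity. A minimal NumPy sketch of that mechanism (names illustrative, not the repo's API):

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Plain layer norm without learned affine parameters."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def adaln_zero_block(x, cond, sublayer, w_mod):
    """One adaLN-Zero residual branch.

    w_mod maps the conditioning vector to (shift, scale, gate); the columns
    producing `gate` are zero-initialized, so the block computes the identity
    at the start of training.
    """
    shift, scale, gate = np.split(cond @ w_mod, 3, axis=-1)
    h = layer_norm(x) * (1 + scale) + shift    # adaptive layer norm
    return x + gate * sublayer(h)              # zero-initialized gating

# At initialization the gate columns of w_mod are zero,
# so the block output equals its input.
d = 8
w_mod = np.concatenate(
    [np.random.default_rng(0).normal(scale=0.02, size=(d, 2 * d)),
     np.zeros((d, d))], axis=-1)
rng = np.random.default_rng(1)
x, cond = rng.standard_normal((2, d)), rng.standard_normal((2, d))
y = adaln_zero_block(x, cond, sublayer=np.tanh, w_mod=w_mod)
```

Zero-initializing the gate is what makes deep DiT stacks trainable from scratch: every block is initially a no-op, and the network gradually learns how much of each sublayer's output to admit.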
Curated samples from FFHQ
License: MIT

