RDTF

A project for generating dynamic images from text (under active development).

Overview

RDTF (Text-Driven Dynamic Image Generation) is a project for generating dynamic (animated) images from text descriptions. It fine-tunes a pretrained generative model so that the resulting animation follows the input text.

The project builds on the i2vgen model and applies LoRA (Low-Rank Adaptation) fine-tuning to adapt it to dynamic image generation. LoRA trains only a small set of low-rank weight updates while the base model weights stay frozen, which keeps adaptation efficient while aiming to preserve the output quality of the base model.
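To make the efficiency argument concrete, the following is a minimal sketch of the general LoRA idea in plain Python. It is not code from this repository: the layer size, rank, and function names are assumptions chosen for illustration. For a weight matrix W of shape d_out x d_in, LoRA trains two small matrices B (d_out x r) and A (r x d_in) and uses W + (alpha / r) * B A as the adapted weight.

```python
# Illustrative sketch of the LoRA idea; not code from the RDTF repository.
# All sizes and names below are hypothetical.


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters trained by LoRA for one weight: B (d_out x r) plus A (r x d_in)."""
    return rank * (d_out + d_in)


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, b, a, alpha: float, rank: int):
    """Return W + (alpha / rank) * B @ A, the weight used after adaptation."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    d_out, d_in, rank = 1024, 1024, 8
    full = d_out * d_in
    lora = lora_trainable_params(d_out, d_in, rank)
    print(f"full fine-tuning: {full} params, LoRA: {lora} params")
```

With these illustrative numbers, the low-rank update trains 16,384 parameters for the layer instead of 1,048,576, which is where the efficiency of LoRA comes from.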

Features

Text-to-Dynamic-Image: Convert text descriptions into dynamic images with smooth animations
LoRA Fine-tuning: Efficient model adaptation using LoRA technology
Customizable Training: Flexible training pipeline for different dynamic image generation tasks
Easy Integration: Simple invocation process for generating dynamic images

Installation

Prerequisites

Python 3.8+
PyTorch
diffusers library
Other dependencies (see requirements.txt)

Setup

# Clone the repository
git clone https://github.com/yourusername/RDTF.git
cd RDTF

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
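After installation, a quick check along the following lines can confirm that the prerequisites are in place. This is a sketch that uses only the standard library; the function name check_environment is hypothetical, and the import names torch and diffusers are assumed to correspond to the PyTorch and diffusers prerequisites listed above.

```python
import importlib.util
import sys

# Import names assumed for the PyTorch and diffusers prerequisites.
REQUIRED_MODULES = ("torch", "diffusers")


def check_environment(min_python=(3, 8), modules=REQUIRED_MODULES):
    """Return a dict mapping each requirement to True (satisfied) or False."""
    report = {"python": sys.version_info[:2] >= tuple(min_python)}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for requirement, ok in check_environment().items():
        print(f"{requirement}: {'ok' if ok else 'MISSING'}")
```

Running it inside the activated virtual environment reports which requirement, if any, is still missing.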

Training

RDTF is trained by LoRA fine-tuning of the i2vgen base model.

Training Command

# Run the training script
bash shells/train_multitaskpretrain.sh

Training Configuration

You can modify the training parameters in the shells/train_multitaskpretrain.sh script, including:

Learning rate
Number of epochs
Batch size
Dataset paths
LoRA rank and alpha values
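The parameters listed above can be pictured as one configuration object. The sketch below is an assumption made for illustration, not the project's actual configuration format: every field name and default value is a hypothetical placeholder, and the real values live in the shell script. The lora_scale property reflects the standard LoRA convention of scaling the low-rank update by alpha / rank.

```python
from dataclasses import dataclass


@dataclass
class TrainConfig:
    """Hypothetical container for the tunable training parameters."""

    # All names and defaults below are placeholders, not the project's values.
    learning_rate: float = 1e-4
    num_epochs: int = 10
    batch_size: int = 4
    dataset_path: str = "data/train"
    lora_rank: int = 8
    lora_alpha: float = 16.0

    def validate(self) -> None:
        """Raise ValueError if any numeric setting is not strictly positive."""
        numeric = {
            "learning_rate": self.learning_rate,
            "num_epochs": self.num_epochs,
            "batch_size": self.batch_size,
            "lora_rank": self.lora_rank,
            "lora_alpha": self.lora_alpha,
        }
        for name, value in numeric.items():
            if value <= 0:
                raise ValueError(f"{name} must be positive, got {value}")

    @property
    def lora_scale(self) -> float:
        """Scaling factor applied to the low-rank update (alpha / rank)."""
        return self.lora_alpha / self.lora_rank


if __name__ == "__main__":
    config = TrainConfig()
    config.validate()
    print(config, "lora_scale =", config.lora_scale)
```

Because the update is scaled by alpha / rank, changing the rank without adjusting alpha also changes the effective strength of the adaptation, so the two values are usually tuned together.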

Inference

Generate Dynamic Images

To generate dynamic images using the trained model:

# Run with the Python interpreter of your activated environment
python examples_lora.py

Custom Inputs

Modify the examples_lora.py file to provide your own text prompts and adjust generation parameters such as:

Output resolution
Animation length
Style parameters
Sampling steps
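As an alternative to editing the script for every run, generation parameters could be exposed as command-line options. The following is a sketch of that pattern with argparse, not the interface of examples_lora.py: every flag name and default value is hypothetical, and style parameters are left out because their form is not specified here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for hypothetical generation options."""
    parser = argparse.ArgumentParser(description="Hypothetical generation options")
    parser.add_argument("--prompt", required=True, help="text description to animate")
    parser.add_argument("--width", type=int, default=256, help="output width in pixels")
    parser.add_argument("--height", type=int, default=256, help="output height in pixels")
    parser.add_argument("--num-frames", type=int, default=16, help="animation length in frames")
    parser.add_argument("--steps", type=int, default=50, help="number of sampling steps")
    return parser


def parse_options(argv=None):
    """Parse options and reject non-positive numeric values."""
    args = build_parser().parse_args(argv)
    for name in ("width", "height", "num_frames", "steps"):
        if getattr(args, name) <= 0:
            raise ValueError(f"--{name.replace('_', '-')} must be positive")
    return args


if __name__ == "__main__":
    # Demo with a fixed argument list; pass None to read from sys.argv instead.
    options = parse_options(["--prompt", "a cat waving", "--num-frames", "24"])
    print(vars(options))
```

The parsed values would then be passed to whatever generation call the script makes, in place of constants edited by hand.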

Examples

Check out the examples/ directory for sample outputs and corresponding input prompts.

Contributing

We welcome contributions to RDTF! Please read our CONTRIBUTING.md for details on our code of conduct and submission process.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements

Thanks to the developers of the i2vgen model
LoRA implementation based on the peft library
Diffusers library by Hugging Face

Contact

For questions and feedback, please open an issue on the repository.

