VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo


๐Ÿช Overview

VeOmni is a versatile framework for both single- and multi-modal pre-training and post-training. It empowers users to seamlessly scale models of any modality across various accelerators, offering both flexibility and user-friendliness.

Our guiding principles when building VeOmni are:

  • Flexibility and Modularity: VeOmni is built with a modular design, allowing users to decouple most components and replace them with their own implementations as needed.

  • Trainer-free: VeOmni avoids rigid, structured trainer classes (e.g., PyTorch Lightning or the Hugging Face Trainer). Instead, VeOmni keeps training scripts linear, exposing the entire training logic to users for maximum transparency and control (see the sketch after this list).

  • Omni model native: VeOmni enables users to effortlessly scale any omni-model across devices and accelerators.

  • Torch native: VeOmni is designed to leverage PyTorch's native functions to the fullest extent, ensuring maximum compatibility and performance.
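
To make the trainer-free principle concrete, below is a minimal sketch of what a linear, fully exposed training script looks like in plain PyTorch. It is illustrative only: the model, data, and hyperparameters are placeholders, and none of the names are VeOmni APIs.

```python
# Illustrative sketch of a "trainer-free" linear training script in plain PyTorch.
# All names below (ToyModel, make_dataloader, ...) are placeholders, not VeOmni APIs.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(128, 10)

    def forward(self, x):
        return self.proj(x)


def make_dataloader() -> DataLoader:
    # Random tensors as a stand-in for a real (multi-modal) dataset.
    x = torch.randn(1024, 128)
    y = torch.randint(0, 10, (1024,))
    return DataLoader(TensorDataset(x, y), batch_size=32, shuffle=True)


def main() -> None:
    model = ToyModel()
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()

    # The whole training logic stays in one visible loop: no callbacks,
    # no hidden hooks, nothing wrapped inside a Trainer class.
    for epoch in range(2):
        for step, (x, y) in enumerate(make_dataloader()):
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
            if step % 10 == 0:
                print(f"epoch {epoch} step {step} loss {loss.item():.4f}")


if __name__ == "__main__":
    main()
```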

🔥 Latest News

📚 Key Features

  • FSDP and FSDP2 backends for training (a plain-PyTorch sketch of FSDP plus distributed checkpointing follows this list).
  • Sequence parallelism with DeepSpeed Ulysses, with support for both non-async and async modes.
  • Expert parallelism for training large MoE models, such as Qwen3-MoE.
  • Efficient GroupGEMM kernel for MoE models, along with Liger-Kernel support.
  • Compatible with Hugging Face Transformers models: Qwen3, Qwen3-VL, Qwen3-MoE, etc.
  • Dynamic batching strategy and omni-data processing.
  • Torch Distributed Checkpoint (DCP) for checkpointing.
  • Support for training on both NVIDIA GPUs and Ascend NPUs.
  • Experiment tracking with wandb.
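
As a rough illustration of the PyTorch-native building blocks named above (FSDP sharding and Torch Distributed Checkpoint), the sketch below wraps a toy model in FSDP and saves a sharded checkpoint with DCP. This is plain PyTorch, not VeOmni's own API, and it assumes the script is launched with torchrun on NVIDIA GPUs.

```python
# Plain-PyTorch sketch: FSDP sharding + Torch Distributed Checkpoint (DCP).
# Not VeOmni's API; launch with `torchrun --nproc_per_node=<gpus> this_script.py`.
import os

import torch
import torch.distributed as dist
import torch.distributed.checkpoint as dcp
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, StateDictType


def main() -> None:
    dist.init_process_group("nccl")
    torch.cuda.set_device(int(os.environ.get("LOCAL_RANK", "0")))

    model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)).cuda()
    model = FSDP(model)  # parameters are sharded across data-parallel ranks
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # One toy step so there is a non-trivial state to checkpoint.
    loss = model(torch.randn(8, 1024, device="cuda")).pow(2).mean()
    loss.backward()
    optimizer.step()

    # DCP writes per-rank shards instead of gathering a full state dict on rank 0.
    with FSDP.state_dict_type(model, StateDictType.SHARDED_STATE_DICT):
        dcp.save({"model": model.state_dict()}, checkpoint_id="./ckpt_step_1")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Saving sharded state this way keeps per-rank checkpoint memory proportional to the local shard, which is what makes distributed checkpointing practical for very large models.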

๐Ÿ“ Upcoming Features and Changes

  • VeOmni v0.2 Roadmap #268, #271
  • ViT balance tool #280
  • Validation dataset during training #247
  • RL post-training for omni-modality models with VeRL #262

🚀 Getting Started

Documentation

Quick Start

โœ๏ธ Supported Models

| Model             | Model size                   | Example config file |
| ----------------- | ---------------------------- | ------------------- |
| DeepSeek 2.5/3/R1 | 236B/671B                    | deepseek.yaml       |
| Llama 3-3.3       | 1B/3B/8B/70B                 | llama3.yaml         |
| Qwen 2-3          | 0.5B/1.5B/3B/7B/14B/32B/72B/ | qwen2_5.yaml        |
| Qwen2-3 VL/QVQ    | 2B/3B/7B/32B/72B             | qwen3_vl_dense.yaml |
| Qwen3-VL MoE      | 30BA3B/235BA22B              | qwen3_vl_moe.yaml   |
| Qwen3-MoE         | 30BA3B/235BA22B              | qwen3-moe.yaml      |
| Wan               | Wan2.1-I2V-14B-480P          | wan_sft.yaml        |
| Omni Model        | Any Modality Training        | seed_omni.yaml      |

To add support for new models to VeOmni, see Support New Models.

โ›ฐ๏ธ Performance

For more details, please refer to our paper.

💡 Awesome work using VeOmni

🎨 Contributing

Contributions from the community are welcome! Please check out CONTRIBUTING.md and our project roadmap (to be updated).

๐Ÿ“ Citation and Acknowledgement

If you find VeOmni useful for your research and applications, feel free to give us a star ⭐ or cite us using:

@article{ma2025veomni,
  title={VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo},
  author={Ma, Qianli and Zheng, Yaowei and Shi, Zhelun and Zhao, Zhongkai and Jia, Bin and Huang, Ziyue and Lin, Zhiqi and Li, Youjie and Yang, Jiacheng and Peng, Yanghua and others},
  journal={arXiv preprint arXiv:2508.02317},
  year={2025}
}

Thanks to the following projects for their excellent work:

🌱 About ByteDance Seed Team

Founded in 2023, the ByteDance Seed Team is dedicated to crafting the industry's most advanced AI foundation models. The team aspires to become a world-class research team and make significant contributions to the advancement of science and society. You can get to know ByteDance Seed better through the following channels 👇
