ParaUni: Enhance Generation in Unified Multimodal Model with Reinforcement-driven Hierarchical Parallel Information Interaction

Overview

We propose ParaUni, which extracts features from the various layers of a vision-language model (VLM) in parallel for comprehensive information interaction, while retaining a flexible separated architecture to enhance generation in unified multimodal models. Concretely, visual features from all VLM layers are fed in parallel into a Layer Integration Module (LIM), which efficiently integrates fine-grained details and semantic abstractions and provides the fused representation as a condition to the diffusion model.
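To make the data flow concrete, here is a minimal sketch of fusing per-layer features into one conditioning tensor. It is not the LIM itself: this README does not describe the module's internals, so the sketch substitutes a plain softmax-weighted sum over layers. The names `fuse_layers`, `softmax`, `layer_logits`, and the toy feature values are all hypothetical and chosen only for illustration.

```python
import math
from typing import List

Matrix = List[List[float]]  # one layer's features, indexed [token][hidden_dim]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a flat list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_layers(layer_feats: List[Matrix], layer_logits: List[float]) -> Matrix:
    """Softmax-weighted sum of per-layer token features.

    Assumption: a hypothetical stand-in for the LIM, not the authors' module.
    Every layer must share the same [token][hidden_dim] shape.
    """
    if len(layer_feats) != len(layer_logits):
        raise ValueError("need exactly one logit per layer")
    weights = softmax(layer_logits)
    num_tokens = len(layer_feats[0])
    hidden_dim = len(layer_feats[0][0])
    fused = [[0.0] * hidden_dim for _ in range(num_tokens)]
    # All layers contribute at once, rather than only the final layer.
    for weight, feats in zip(weights, layer_feats):
        for t in range(num_tokens):
            for d in range(hidden_dim):
                fused[t][d] += weight * feats[t][d]
    return fused


# Toy input: 3 layers, 2 tokens, 2 hidden dimensions (made-up values).
shallow = [[1.0, 0.0], [0.0, 1.0]]  # stands in for fine-grained detail
middle = [[2.0, 2.0], [2.0, 2.0]]
deep = [[0.0, 3.0], [3.0, 0.0]]     # stands in for semantic abstraction

# Equal logits give every layer the same weight.
condition = fuse_layers([shallow, middle, deep], [0.0, 0.0, 0.0])
```

In this sketch `condition` has the same shape as a single layer's features and would play the role of the fused representation handed to the diffusion model. In a trained system the per-layer weights would be learned rather than fixed.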

Result

Citation

Thanks to the developers of OpenUni for their excellent work. Our code is adapted from OpenUni and Flow-GRPO. If our work assists your research, feel free to give us a star ⭐ or cite us using:

@article{tan2025parauni,
  title={ParaUni: Enhance Generation in Unified Multimodal Model with Reinforcement-driven Hierarchical Parallel Information Interaction},
  author={Tan, Jiangtong and Liu, Lin and Huang, Jie and Zhang, Xiaopeng and Tian, Qi and Zhao, Feng},
  journal={arXiv preprint arXiv:2512.05422},
  year={2025}
}
