Darklightened/Awsome-Diffusion-LLM

 
 


Awesome-Large-Language-Diffusion-Models


A comprehensive list of papers about Large-Language-Diffusion-Models.


Important

Contributions welcome:

  • If a relevant paper is missing from this list, please contact us or submit a pull request directly.

  • If you think your paper belongs in a different category, please contact us or submit a pull request.

  • If your paper has since been accepted at a venue, please consider updating its entry.

  • Thank you!


💥 News 💥

  • 🔥🔥🔥 Awesome-LLDM is now open!

⭐️ Useful Resources (Blogs & Technical Reports)


⚙️ Framework


Survey Papers

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Discrete Diffusion in Large Language and Multimodal Models: A Survey | 2025 | Arxiv | |
| Diffusion-based Large Language Models Survey | 2025 | Arxiv | |
| A Survey on Parallel Text Generation: From Parallel Decoding to Diffusion Language Models | 2025 | Arxiv | |

Large Diffusion Language Models (≥7B)


Scaling

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| David helps Goliath: Inference-Time Collaboration Between Small Specialized and Large General Diffusion LMs | 2023 | NAACL | |
| Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning | 2023 | Arxiv | |
| TESS 2: A Large-Scale Generalist Diffusion Language Model | 2025 | ACL | Adapted from Mistral-7B-v0.1 |
| Scaling Diffusion Language Models via Adaptation from Autoregressive Models | 2025 | ICLR | 127M~7B (GPT2, LLaMA2) |
| Large Language Diffusion Models | 2025 | Arxiv | LLaDA-8B |
| LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models | 2025 | Arxiv | |
| Large Language Models to Diffusion Finetuning | 2025 | Arxiv | |
| LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs | 2025 | Arxiv | Long context scaling |
| Dream 7B: Diffusion Large Language Models | 2025 | Arxiv | |
| UltraLLaDA: Scaling the Context Length to 128K for Diffusion Large Language Models | 2025 | Arxiv | |

AR-to-Diffusion Adaptation

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Scaling Diffusion Language Models via Adaptation from Autoregressive Models | 2025 | ICLR | 127M~7B (GPT2, LLaMA2) |
| SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation | 2025 | Arxiv | |
| From Next-Token to Next-Block: A Principled Adaptation Path for Diffusion LLMs | 2025 | Arxiv | |

Accelerating

Caching

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Accelerating Diffusion Language Model Inference via Efficient KV Caching and Guided Diffusion | 2025 | Arxiv | |
| dKV-Cache: The Cache for Diffusion Language Models | 2025 | Arxiv | |
| Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding | 2025 | Arxiv | |
| Fast-dLLM v2: Efficient Block-Diffusion LLM | 2025 | Arxiv | |
| d^2Cache: Accelerating Diffusion-Based LLMs via Dual Adaptive Caching | 2025 | Arxiv | |
| Attention Is All You Need for KV Cache in Diffusion LLMs | 2025 | Arxiv | |

Decoding

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Accelerating Diffusion LLMs via Adaptive Parallel Decoding | 2025 | Arxiv | |
| Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding | 2025 | Arxiv | |
| Wide-In, Narrow-Out: Revokable Decoding for Efficient and Effective DLLMs | 2025 | Arxiv | |
| Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles | 2025 | Arxiv | |
| AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size | 2025 | Arxiv | |
| Fast-dLLM v2: Efficient Block-Diffusion LLM | 2025 | Arxiv | |
| Spiffy: Multiplying Diffusion LLM Acceleration via Lossless Speculative Decoding | 2025 | Arxiv | |
| dParallel: Learnable Parallel Decoding for dLLMs | 2025 | Arxiv | |
| Learning to Parallel: Accelerating Diffusion Large Language Models via Learnable Parallel Decoding | 2025 | Arxiv | |
| DiffuSpec: Unlocking Diffusion Language Models for Speculative Decoding | 2025 | Arxiv | |
| Self Speculative Decoding for Diffusion Large Language Models | 2025 | Arxiv | |
| CreditDecoding: Accelerating Parallel Decoding in Diffusion Large Language Models with Trace Credits | 2025 | Arxiv | |
| Accelerating Diffusion LLM Inference via Local Determinism Propagation | 2025 | Arxiv | |
| Saber: An Efficient Sampling with Adaptive Acceleration and Backtracking Enhanced Remasking for Diffusion Language Model | 2025 | Arxiv | |
| SpecDiff-2: Scaling Diffusion Drafter Alignment For Faster Speculative Decoding | 2025 | Arxiv | |
| Fast-Decoding Diffusion Language Models via Progress-Aware Confidence Schedules | 2025 | Arxiv | |

Distillation

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Beyond Autoregression: Fast LLMs via Self-Distillation Through Time | 2025 | ICLR | <7B |
| FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Model | 2025 | Arxiv | |
| CDLM: Consistency Diffusion Language Models For Faster Sampling | 2025 | Arxiv | |

Sparsity

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Attention Sinks in Diffusion Language Models | 2025 | Arxiv | |
| SparseD: Sparse Attention for Diffusion Language Models | 2025 | Arxiv | |

Quantization

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs | 2025 | Arxiv | Quantization |

Reasoning & Alignment

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models | 2025 | Arxiv | |
| d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning | 2025 | Arxiv | |
| Diffusion of Thought: Chain-of-Thought Reasoning in Diffusion Language Models | 2024 | NeurIPS | |
| wd1: Weighted Policy Optimization for Reasoning in Diffusion Language Models | 2025 | Arxiv | |
| Thinking Inside the Mask: In-Place Prompting in Diffusion LLMs | 2025 | Arxiv | |
| Review, Remask, Refine (R3): Process-Guided Block Diffusion for Text Generation | 2025 | ICML | |
| Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models | 2025 | Arxiv | |
| DiFFPO: Training Diffusion LLMs to Reason Fast and Furious via Reinforcement Learning | 2025 | Arxiv | |
| Principled and Tractable RL for Reasoning with Diffusion Language Models | 2025 | Arxiv | |
| Improving Reasoning for Diffusion Language Models via Group Diffusion Policy Optimization | 2025 | Arxiv | |
| Improving Discrete Diffusion Unmasking Policies Beyond Explicit Reference Policies | 2025 | Arxiv | |
| d2: Improved Techniques for Training Reasoning Diffusion Language Models | 2025 | Arxiv | |
| Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Step | 2025 | Arxiv | |
| Inpainting-Guided Policy Optimization for Diffusion Large Language Models | 2025 | Arxiv | |
| Beyond Surface Reasoning: Unveiling the True Long Chain-of-Thought Capacity of Diffusion Large Language Models | 2025 | Arxiv | |
| Step-Aware Policy Optimization for Reasoning in Diffusion Large Language Models | 2025 | Arxiv | |
| MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization | 2025 | Arxiv | |
| Enhancing Reasoning for Diffusion LLMs via Distribution Matching Policy Optimization | 2025 | Arxiv | |
| Boundary-Guided Policy Optimization for Memory-efficient RL of Diffusion Large Language Models | 2025 | Arxiv | |
| Coevolutionary Continuous Discrete Diffusion: Make Your Diffusion Language Model a Latent Reasoner | 2025 | Arxiv | |
| SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models | 2025 | Arxiv | |
| LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning | 2025 | Arxiv | |
| TR2-D2: Tree Search Guided Trajectory-Aware Fine-Tuning for Discrete Diffusion | 2025 | Arxiv | |
| MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models | 2025 | Arxiv | |
| Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall | 2025 | Arxiv | |
| RFG: Test-Time Scaling for Diffusion Large Language Model Reasoning with Reward-Free Guidance | 2025 | Arxiv | |
| Preference-Based Alignment of Discrete Diffusion Models | 2025 | Arxiv | |
| The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs | 2025 | Arxiv | |

Ordering (e.g. Block, AR, Decoding Strategies)

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control | 2023 | ACL | <7B, Simplex, Blockwise |
| AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation | 2023 | NeurIPS | <7B, AR-like noise |
| Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models | 2025 | ICLR | <7B |
| Train for the Worst, Plan for the Best: Understanding Token Ordering in Masked Diffusions | 2025 | ICML | <7B |
| Don't Let It Fade: Preserving Edits in Diffusion Language Models via Token Timestep Allocation | 2025 | NeurIPS | <7B |
| Any-Order Flexible Length Masked Diffusion | 2025 | Arxiv | |

Theory

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Theoretical Benefit and Limitation of Diffusion Language Model | 2025 | NeurIPS | |
| Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models | 2025 | Arxiv | |
| What Makes Diffusion Language Models Super Data Learners? | 2025 | Arxiv | |
| Why mask diffusion does not work | 2025 | Arxiv | |
| Diffusion Language Models Know the Answer Before Decoding | 2025 | Arxiv | |

Modeling

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Structured Denoising Diffusion Models in Discrete State-Spaces | 2021 | NeurIPS | <7B, Discrete |
| Diffusion-LM Improves Controllable Text Generation | 2022 | NeurIPS | <7B, Embedding |
| DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models | 2023 | ACL | <7B, Masked |
| Latent Diffusion for Language Generation | 2023 | NeurIPS | <7B, Latent |
| Likelihood-Based Diffusion Language Models | 2023 | NeurIPS | <7B |
| Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution | 2024 | ICML | <7B, Discrete |
| Simple and Effective Masked Diffusion Language Models | 2024 | NeurIPS | <7B, Masked |
| Think While You Generate: Discrete Diffusion with Planned Denoising | 2025 | ICLR | <7B, Discrete |
| The Diffusion Duality | 2025 | ICML | <7B, Uniform |
| Generalized Interpolating Discrete Diffusion | 2025 | ICML | <7B, Discrete |
| Esoteric Language Models | 2025 | Arxiv | |
| Sequential Diffusion Language Models | 2025 | Arxiv | |
| LLaDA-MoE: A Sparse MoE Diffusion Language Model | 2025 | Arxiv | MoE |
| Next Semantic Scale Prediction via Hierarchical Diffusion Language Models | 2025 | Arxiv | |

Guidance & Constraints

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Diffusion Models Beat GANs on Image Synthesis | 2021 | NeurIPS | Image, Classifier Guidance |
| Classifier-Free Diffusion Guidance | 2021 | NeurIPS | Image, Classifier-free Guidance |
| Diffusion-LM Improves Controllable Text Generation | 2022 | NeurIPS | <7B |
| SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control | 2023 | ACL | <7B |
| Constrained Discrete Diffusion | 2025 | NeurIPS | <7B |
| Don't Let It Fade: Preserving Edits in Diffusion Language Models via Token Timestep Allocation | 2025 | NeurIPS | <7B |
| DINGO: Constrained Inference for Diffusion LLMs | 2025 | Arxiv | Constrained decoding |
| CtrlDiff: Boosting Large Diffusion Language Models with Dynamic Block Prediction and Controllable Generation | 2025 | Arxiv | |

Downstream Applications

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Planning with Diffusion Models for Target-Oriented Dialogue Systems | 2025 | ACL | Dialogue |
| DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation | 2025 | Arxiv | Code Generation |
| Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference | 2025 | Arxiv | Code Generation |
| Beyond Autoregression: An Empirical Study of Diffusion Large Language Models for Code Generation | 2025 | Arxiv | Code Generation |
| Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies | 2025 | Arxiv | VLA |
| LLaDA-VLA: Vision Language Diffusion Action Models | 2025 | Arxiv | VLA |
| dVLA: Diffusion Vision-Language-Action Model with Multimodal Chain-of-Thought | 2025 | Arxiv | VLA |

Diffusion Language Models (<7B)

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Diffusion-LM Improves Controllable Text Generation | 2022 | NeurIPS | Embedding |
| DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models | 2023 | ICLR | Embedding |
| DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models | 2023 | ACL | Masked |
| Latent Diffusion for Language Generation | 2023 | NeurIPS | Latent |
| Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution | 2024 | ICML | Masked |
| SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control | 2023 | ACL | Simplex, Blockwise |
| AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation | 2023 | NeurIPS | AR-like noise |
| Likelihood-Based Diffusion Language Models | 2023 | NeurIPS | Plaid1B |
| Scaling up Masked Diffusion Models on Text | 2024 | ICLR | 1.1B |
| Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models | 2025 | ICLR | |
| The Diffusion Duality | 2025 | ICML | |
| Generalized Interpolating Discrete Diffusion | 2025 | ICML | |
| Train for the Worst, Plan for the Best: Understanding Token Ordering in Masked Diffusions | 2025 | ICML | |
| Esoteric Language Models | 2025 | Arxiv | |
| Reinforced Context Order Recovery for Adaptive Reasoning and Planning | 2025 | Arxiv | |
| Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning | 2025 | ICLR | |
| Your Absorbing Discrete Diffusion Secretly Models the Bayesian Posterior | 2025 | Arxiv | |
| Any-Order Flexible Length Masked Diffusion | 2025 | Arxiv | |
| Edit Flows: Flow Matching with Edit Operations | 2025 | Arxiv | |
| DLM-One: Diffusion Language Models for One-Step Sequence Generation | 2025 | Arxiv | |
| Simplified and Generalized Masked Diffusion for Discrete Data | 2024 | NeurIPS | |

Multi-Modal Diffusion Models

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Diffuse Everything: Multimodal Diffusion Models on Arbitrary State Spaces | 2025 | ICML | |
| MMaDA: Multimodal Large Diffusion Language Models | 2025 | Arxiv | |
| LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning | 2025 | Arxiv | |
| Unified Multimodal Discrete Diffusion | 2025 | Arxiv | |
| Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding | 2025 | Arxiv | |
| LaViDa: A Large Diffusion Language Model for Multimodal Understanding | 2025 | Arxiv | |
| Dual Diffusion for Unified Image Generation and Understanding | 2025 | Arxiv | |
| Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model | 2025 | Arxiv | |
| Show-o2: Improved Native Unified Multimodal Models | 2025 | Arxiv | |
| Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding | 2025 | Arxiv | |
| MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation | 2025 | Arxiv | |
| DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models | 2025 | Arxiv | |

Seminal Diffusion Papers

| Paper Title | Year | Conference/Journal | Remark |
| --- | --- | --- | --- |
| Deep Unsupervised Learning using Nonequilibrium Thermodynamics | 2015 | ICML | Diffusion Formulation |
| Denoising Diffusion Probabilistic Models | 2020 | NeurIPS | |
| Denoising Diffusion Implicit Models | 2021 | ICLR | |
| Score-Based Generative Modeling through Stochastic Differential Equations | 2021 | ICLR | |
| DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps | 2022 | NeurIPS | |
| High-Resolution Image Synthesis with Latent Diffusion Models | 2022 | CVPR | |
| Scalable Diffusion Models with Transformers | 2023 | ICCV | |
| Score-based Generative Modeling in Latent Space | 2021 | NeurIPS | Latent |
| Structured Denoising Diffusion Models in Discrete State-Spaces | 2021 | NeurIPS | Discrete |
| Vector Quantized Diffusion Model for Text-to-Image Synthesis | 2022 | CVPR | VQ |
| Diffusion Models Beat GANs on Image Synthesis | 2021 | NeurIPS | CG |
| Classifier-Free Diffusion Guidance | 2021 | NeurIPS | CFG |
| Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning | 2023 | ICLR | Self-conditioning |
| Progressive Distillation for Fast Sampling of Diffusion Models | 2022 | ICLR | Distillation |
| Consistency Models | 2023 | ICML | |

Contact

We welcome all researchers to contribute to this repository.

If you have a related paper that was not added to the library, please contact us.

Email: jake630@snu.ac.kr / wjk9904@snu.ac.kr / qicher@snu.ac.kr
