Hi, thank you for releasing dParallel! The idea of learnable parallel decoding for diffusion LLMs is very exciting!
I have one technical question:
Since the paper does not report FLOPs, I'm trying to estimate the computational cost of dParallel relative to baseline diffusion-LLM decoding.
May I ask:
1. How should FLOPs be counted for dParallel?
2. If possible, could you share the script / config you used to profile FLOPs?
I'd like to reproduce similar measurements; I've included a rough sketch of my current estimate below.
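For context, this is the back-of-the-envelope estimate I'm working with right now. It assumes a dense transformer where one full forward pass over a length-L sequence costs roughly 2·N·L FLOPs (ignoring the quadratic attention term), and that diffusion-style decoding repeats a full forward pass at every denoising step. The model size, sequence length, and step counts below are placeholders I made up, not numbers from the paper, so please correct me if this is not how you count FLOPs for dParallel.

```python
# Rough analytic FLOPs estimate for diffusion-LLM decoding.
# All concrete numbers below are placeholders, not values from the paper.

def forward_flops(n_params: float, seq_len: int) -> float:
    """~2 * params * tokens FLOPs for one full forward pass of a dense
    transformer (attention's L^2 term ignored for simplicity)."""
    return 2.0 * n_params * seq_len

def decoding_flops(n_params: float, seq_len: int, num_steps: int) -> float:
    """Diffusion-style decoding re-runs a full forward pass at every
    denoising step, so total cost ~= steps * per-step cost."""
    return num_steps * forward_flops(n_params, seq_len)

if __name__ == "__main__":
    n_params = 8e9           # placeholder: an 8B-parameter model
    seq_len = 1024           # placeholder: prompt + generated tokens

    baseline_steps = 1024    # placeholder: e.g. one token finalized per step
    dparallel_steps = 128    # placeholder: fewer steps with parallel decoding

    base = decoding_flops(n_params, seq_len, baseline_steps)
    dpar = decoding_flops(n_params, seq_len, dparallel_steps)
    print(f"baseline : {base:.3e} FLOPs")
    print(f"dParallel: {dpar:.3e} FLOPs ({base / dpar:.1f}x fewer)")
```

Under these assumptions the FLOPs ratio is just the ratio of decoding steps, so I mainly want to confirm whether that is the right way to think about it, or whether there are extra per-step costs I should be accounting for.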
Thanks again for the great work. Looking forward to your guidance!