Bug and question about dLLM-cache applied to MMaDA #4

@Kaiwen-Zhu

Description

input_ids, attention_mask = uni_prompting((prompt_text, image_tokens), 't2i_gen')

cache_instance.reset_cache(prompt_length=input_ids.shape[1])

Thanks for your awesome work! However, there seems to be a bug in demo_MMada_t2i_cache.py. In the code above, prompt_length is set to input_ids.shape[1], which is the length of the entire sequence (prompt + response) rather than the length of the prompt alone. This results in severe performance degradation.
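To illustrate the fix I have in mind: the prompt length passed to reset_cache should cover only the prompt tokens, not the appended response region. The sketch below is not the repository's code; the helper name, the packed layout (prompt followed by response), and the lengths are assumptions for illustration.

```python
def prompt_length_from_packed(total_len: int, response_len: int) -> int:
    """Recover the prompt length from a packed (prompt + response) sequence,
    assuming the response region has a known fixed length (e.g. the number
    of image tokens in T2I generation). Hypothetical helper, not repo code."""
    assert total_len > response_len, "packed sequence must contain a prompt"
    return total_len - response_len

# Example with assumed sizes: a 1024-token image region packed after the prompt.
total_len = 1152      # what input_ids.shape[1] would report (assumed)
response_len = 1024   # image-token region length (assumed)

prompt_len = prompt_length_from_packed(total_len, response_len)
# Buggy call:  cache_instance.reset_cache(prompt_length=total_len)
# Fixed call:  cache_instance.reset_cache(prompt_length=prompt_len)
```

With the buggy call, the cache treats the whole sequence, including the still-masked response region, as "prompt", which is why quality collapses.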

Also, after fixing this, in my preliminary experiments the acceleration of the MMaDA T2I task falls far short of the speedups reported for language tasks in the paper. For example, with prompt_interval_steps = 10, gen_interval_steps = 5, transfer_ratio = 0.5, the output image is acceptable and the time drops from 5.2s to 3.4s. However, when I reduce transfer_ratio to 0.3, the output image becomes messy and the time is 3.0s. Is this expected? What configuration do you suggest for the MMaDA T2I task, and what acceleration rate should it achieve?
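For reference, my understanding of how the two interval parameters interact is sketched below: prompt features are recomputed every prompt_interval_steps denoising steps and response features every gen_interval_steps, with cached features reused in between. This is my own reading of the scheme, not the repository's implementation.

```python
def refresh_schedule(step: int,
                     prompt_interval_steps: int,
                     gen_interval_steps: int) -> tuple[bool, bool]:
    """Sketch (not repo code): decide at a given denoising step whether the
    prompt-side and generation-side caches are refreshed. Steps that refresh
    neither cache reuse stored features, which is where the speedup comes from;
    transfer_ratio would then control how aggressively response features are
    reused between refreshes."""
    refresh_prompt = step % prompt_interval_steps == 0
    refresh_gen = step % gen_interval_steps == 0
    return refresh_prompt, refresh_gen
```

Under this reading, with prompt_interval_steps = 10 and gen_interval_steps = 5, only every fifth step recomputes response features, so lowering transfer_ratio further forces more reuse of stale features, which may explain the messy output at 0.3.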
