@TmacAaron TmacAaron commented Nov 2, 2025

What does this PR do?

The original RoPE implementation in Flux and Wan recomputes the rotary_embeddings at every step. The rotary_embeddings are identical across steps, so the RoPE computation only needs to run once per inference.

This PR caches the rotary_embeddings the first time they are computed and reuses the cached values in the following steps. When inference finishes, the cache is released.
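
The caching pattern described above can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code in this PR: `compute_rope`, `RopeCache`, and `run_inference` are hypothetical names, and plain Python lists stand in for tensors. It assumes the embeddings depend only on sequence length, head dimension, and the base `theta`, so those values form the cache key.

```python
import math
from typing import Dict, List, Tuple

Table = List[List[float]]


def compute_rope(seq_len: int, dim: int, theta: float = 10000.0) -> Tuple[Table, Table]:
    """Build (cos, sin) tables of shape [seq_len][dim // 2] (illustrative helper)."""
    freqs = [theta ** (-2.0 * i / dim) for i in range(dim // 2)]
    cos = [[math.cos(pos * f) for f in freqs] for pos in range(seq_len)]
    sin = [[math.sin(pos * f) for f in freqs] for pos in range(seq_len)]
    return cos, sin


class RopeCache:
    """Hypothetical cache: computes the tables once per key, then reuses them."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[int, int, float], Tuple[Table, Table]] = {}
        self.misses = 0  # number of times the tables were actually computed

    def get(self, seq_len: int, dim: int, theta: float = 10000.0) -> Tuple[Table, Table]:
        key = (seq_len, dim, theta)
        if key not in self._store:
            self.misses += 1
            self._store[key] = compute_rope(seq_len, dim, theta)
        return self._store[key]

    def clear(self) -> None:
        """Release the cached tables."""
        self._store.clear()


def run_inference(cache: RopeCache, num_steps: int, seq_len: int, dim: int) -> None:
    """Stand-in for a multi-step inference loop (assumed structure)."""
    try:
        for _ in range(num_steps):
            cos, sin = cache.get(seq_len, dim)
            # A real model would apply cos/sin to the query and key tensors here.
    finally:
        # Release the cache once inference has finished, even on error.
        cache.clear()
```

With this sketch, a 20-step run computes the tables once and leaves the cache empty afterwards. Clearing in a `finally` block is one possible way to make sure the memory is released; the PR may handle the release differently.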

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@TmacAaron changed the title from "rope cache" to "rope cache for flux and wan" on Dec 9, 2025