
Conversation


@TmacAaron commented on Nov 4, 2025

What does this PR do?

When memory is constrained, VAE tiling is a key optimization for reducing the memory footprint. At its core it works as follows (see the sketch after the list):

  1. Split the VAE’s original input (e.g., high-resolution images or latent tensors) into multiple smaller, independently processable tiles;
  2. Feed each tile into the VAE separately for encoding or decoding to generate tile-specific results;
  3. Stitch all processed tiles back into their original positional layout, blending the overlapping regions so the final output is seamless.
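
For intuition, here is a minimal sketch of that tiled-decode flow, written against a diffusers-style `vae.decode(...).sample` call. The tile size, overlap, upscale factor, and the uniform averaging of overlaps are illustrative assumptions, not the exact diffusers implementation (which feathers the blend across the overlap):

```python
import torch

@torch.no_grad()
def tiled_decode(vae, latents, tile_size=64, overlap=16, upscale=8):
    """Sketch: decode `latents` tile by tile and average the overlaps."""
    b, _, h, w = latents.shape
    stride = tile_size - overlap
    out = weight = None

    for top in range(0, max(h - overlap, 1), stride):
        for left in range(0, max(w - overlap, 1), stride):
            # 1. cut one tile out of the latent grid
            tile = latents[:, :, top:top + tile_size, left:left + tile_size]
            # 2. decode the tile independently of all other tiles
            decoded = vae.decode(tile).sample
            if out is None:
                out = latents.new_zeros(b, decoded.shape[1], h * upscale, w * upscale)
                weight = torch.zeros_like(out)
            # 3. paste the decoded tile back at its original position and
            #    count how many tiles contributed to each output pixel
            t, l = top * upscale, left * upscale
            out[:, :, t:t + decoded.shape[2], l:l + decoded.shape[3]] += decoded
            weight[:, :, t:t + decoded.shape[2], l:l + decoded.shape[3]] += 1.0

    # 4. average overlapping regions (a real implementation feathers them)
    return out / weight
```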

However, the default VAE tiling workflow processes tiles sequentially, even when multiple devices are available. This underutilizes hardware, since each tile's VAE computation is independent and could run in parallel.

To address this inefficiency, this PR introduces parallelized tiled VAE processing for AutoencoderKL and AutoencoderKLWan. The independent tile computations are distributed across the available devices, so multiple tiles are processed simultaneously. This improves hardware utilization and accelerates end-to-end VAE encoding/decoding throughput, while retaining the memory-saving benefits of tiling for memory-constrained environments.
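Conceptually, the parallelization amounts to dispatching independent tile decodes to different devices and gathering the results back for stitching. The sketch below shows one simple way that could look, with round-robin dispatch and a per-device VAE replica; it is an assumption for illustration, not necessarily the mechanism this PR uses (which may rely on torch.distributed or another scheme), and `decode_tiles_parallel` and the deep-copied replicas are hypothetical names:

```python
import copy
import torch

@torch.no_grad()
def decode_tiles_parallel(vae, tiles, devices):
    """Sketch: decode a list of latent tiles round-robin across `devices`.

    Results are returned in the original tile order, on `devices[0]`, so the
    stitching/blending step from the tiling loop above can run unchanged.
    """
    # one VAE replica per device (wasteful but simple; a real implementation
    # would shard the work without duplicating weights on every GPU)
    replicas = [copy.deepcopy(vae).to(device) for device in devices]

    # CUDA kernel launches are asynchronous, so decodes issued to different
    # GPUs in this loop can execute concurrently
    pending = []
    for i, tile in enumerate(tiles):
        replica = replicas[i % len(devices)]
        device = devices[i % len(devices)]
        pending.append(replica.decode(tile.to(device)).sample)

    # gather everything back to one device for stitching
    return [decoded.to(devices[0]) for decoded in pending]
```

For example, `devices = [f"cuda:{i}" for i in range(torch.cuda.device_count())]` would spread the tiles across all visible GPUs while each individual tile still fits comfortably in memory.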

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@TmacAaron TmacAaron changed the title implement vae dp for AutoencoderKL and AutoencoderKLWan [WIP] Support parallel tiled vae for AutoencoderKL and AutoencoderKLWan Nov 4, 2025
