Tensor Parallel demo #4851

@KimmiShi

Hi, I have run SDXL with tensor parallelism as well as sequence parallelism. My PR is linked below, in case it helps anyone who needs it.

Motivation:
I wanted to avoid gradient checkpointing in order to get higher throughput when the inputs have a higher resolution, such as 720p.

However, tensor parallelism comes at a cost, and I did not gain any throughput from TP (tested at 720*1080 on A100, with batch size 16 and AMP).

In case someone has the same idea, or wants to try running tensor parallel with more blocks, my code changes are below:

PR: support tensor parallel for sdxl
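
For readers new to the technique, here is a minimal single-process sketch of the idea behind tensor parallelism and of where its cost comes from. It is not the code from the PR. It assumes one common scheme (first weight matrix split by columns, second split by rows, partial outputs summed), and the "ranks" are simulated with plain Python lists. All function names here (`mlp_tensor_parallel`, `all_reduce_sum`, and so on) are hypothetical and chosen only for illustration.

```python
# Single-process illustration of tensor parallelism on a two-layer MLP.
# Assumption: column-split first layer, row-split second layer, then a sum
# across ranks. This is NOT the PR's implementation; ranks are simulated.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def relu(m):
    """Element-wise ReLU."""
    return [[max(0.0, v) for v in row] for row in m]

def split_columns(m, parts):
    """Split matrix m into `parts` equal column blocks."""
    step = len(m[0]) // parts
    return [[row[i * step:(i + 1) * step] for row in m] for i in range(parts)]

def split_rows(m, parts):
    """Split matrix m into `parts` equal row blocks."""
    step = len(m) // parts
    return [m[i * step:(i + 1) * step] for i in range(parts)]

def all_reduce_sum(partials):
    """Element-wise sum of per-rank partial outputs.

    On real hardware this step is a collective communication between GPUs,
    which is the overhead that tensor parallelism adds.
    """
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*partials)]

def mlp_reference(x, w1, w2):
    """Unsharded two-layer MLP: relu(x @ w1) @ w2."""
    return matmul(relu(matmul(x, w1)), w2)

def mlp_tensor_parallel(x, w1, w2, world_size):
    """Same MLP, with the hidden dimension sharded across `world_size` ranks."""
    w1_shards = split_columns(w1, world_size)  # each rank holds some hidden units
    w2_shards = split_rows(w2, world_size)     # matching rows of the second layer
    # Each rank computes a partial output from its own shard only.
    partials = [matmul(relu(matmul(x, a)), b)
                for a, b in zip(w1_shards, w2_shards)]
    return all_reduce_sum(partials)

x = [[1.0, -2.0], [0.5, 3.0]]
w1 = [[1.0, -1.0, 2.0, 0.5], [0.0, 1.0, -1.0, 2.0]]
w2 = [[1.0, 0.0], [2.0, -1.0], [0.5, 1.0], [-1.0, 1.0]]

print(mlp_reference(x, w1, w2))
print(mlp_tensor_parallel(x, w1, w2, world_size=2))
```

Each simulated rank stores only its shard of the weights and of the hidden activation, which is the memory saving that can make gradient checkpointing unnecessary. The final sum is the extra communication step that has to be paid for on every such block.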