GPU parallelism and other acceleration methods #55

@Schuyler-Zixiao-Shi

Hi Ming and the ptychi team,

I’ve been experimenting with ptychi over the past month and wanted to share a few thoughts on potential acceleration strategies. Yi mentioned that some of these ideas have already been explored (or are actively under development), so I’m mainly posting this to consolidate the discussion and provide some context from our side.

On our end, most of our production runs are on A100 GPUs. In practice, many of our A100s are partitioned into 20 GB MIG instances, so one scenario we are particularly interested in is whether multiple 20 GB instances on the same physical A100 could be used cooperatively for a single ~16 GB dataset.

For reference, a typical dataset for us has 512 × 512 scan positions. At that scale, a batch size of 512 (or larger) is often feasible. Conceptually, this seems like a workload that could be split across multiple 20 GB instances if communication between them within the same node is efficient. Extending this across multiple A100s would be even more attractive, although I suspect inter-GPU or inter-node communication overhead may quickly become the limiting factor.
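To make the sizes concrete, here is a rough back-of-the-envelope sketch. The 128 × 128 detector size and float32 storage are assumptions for illustration only (they happen to give 16 GiB for 512 × 512 positions, but our actual detector size and dtype may differ), and the helper names are hypothetical, not part of ptychi.

```python
# Back-of-the-envelope memory estimate for splitting a dataset across workers.
# ASSUMPTIONS: 128 x 128 detector pixels and float32 (4 bytes) per pixel.
GIB = 1024 ** 3


def dataset_bytes(n_scan_y, n_scan_x, det_y, det_x, bytes_per_pixel):
    """Total size in bytes of the raw diffraction patterns."""
    return n_scan_y * n_scan_x * det_y * det_x * bytes_per_pixel


def positions_per_worker(n_positions, n_workers):
    """Split scan positions as evenly as possible across workers."""
    base, extra = divmod(n_positions, n_workers)
    return [base + 1 if i < extra else base for i in range(n_workers)]


total = dataset_bytes(512, 512, 128, 128, 4)
print(total / GIB)  # 16.0

# Share of the raw data held by each of two hypothetical 20 GB instances.
shares = positions_per_worker(512 * 512, 2)
per_worker_gib = [n * 128 * 128 * 4 / GIB for n in shares]
print(per_worker_gib)  # [8.0, 8.0]
```

This only counts the raw diffraction data. Object, probe, gradient, and FFT workspace buffers would add to the per-instance footprint.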

In addition, I was curious about the cumulative impact of smaller FFT-related optimizations (e.g. batched FFTs, plan reuse, fixed grid sizes, avoiding unnecessary out-of-place copies). Yi mentioned that some of these have already been tested, and that the speedups were not very significant in isolation. It would still be interesting to understand whether a combination of these techniques might yield a more noticeable improvement.
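As a minimal illustration of what I mean by batched FFTs, the sketch below compares one FFT call per pattern against a single call over the whole stack. It uses NumPy on the CPU purely for illustration; the array sizes are made up, and this is not a claim about how ptychi currently performs its transforms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch: 8 complex exit waves of 64 x 64 pixels each.
batch = rng.standard_normal((8, 64, 64)) + 1j * rng.standard_normal((8, 64, 64))

# Unbatched: one 2D FFT call per pattern, then restack.
looped = np.stack([np.fft.fft2(wave) for wave in batch])

# Batched: a single call that transforms the last two axes of the whole stack.
batched = np.fft.fft2(batch, axes=(-2, -1))

# Both approaches give the same result; only the call pattern differs.
same = np.allclose(looped, batched)
print(same)
```

On a GPU framework such as PyTorch, `torch.fft.fft2` likewise transforms the last two dimensions by default, so a stack of patterns can go through one call. Whether that helps in practice would depend on how much per-call overhead the current code path already has.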

Thanks a lot for all the work that’s gone into ptychi, and for sharing your experiences so far.

Take care,
Schuyler
