VariableCellArrayReal::synchronize: parallelize on item and second dimension when pack/unpack messages #1981

@DavidDureau

Consider a variable of type VariableCellArrayReal whose second dimension is large (for example 38400).

When this variable is synchronized between GPUs, the messages are packed and unpacked on the GPU using the Accelerator API:

void _copyFrom(const RunQueue* queue, SmallSpan<const Int32> indexes,
.

When the number of items (nb_index) is low and the second dimension (sub_size) is very high, the _copyFrom and _copyTo methods are expensive because they do not expose enough parallelism: only the loop over the items is parallelized.

Is it possible to parallelize over both nb_index and sub_size using a RUNCOMMAND_LOOP2?
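To make the request concrete, here is a minimal sketch in plain C++ of the two iteration strategies. It does not use the Arcane Accelerator API: the function names `pack_per_item` and `pack_per_element`, the flat `std::vector` storage, and the row-major layout `values[item * sub_size + j]` are all assumptions made for illustration. The first function mirrors a pack loop with one unit of work per item. The second iterates over the full nb_index × sub_size space, so every (i, j) pair is an independent unit of work, which is the kind of decomposition a two-dimensional loop would give.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference version: one unit of work per item, with a
// sequential inner loop over sub_size. If the outer loop is the only one
// mapped to GPU threads, only indexes.size() threads are busy.
std::vector<double> pack_per_item(const std::vector<double>& values,
                                  const std::vector<std::int32_t>& indexes,
                                  std::size_t sub_size)
{
  std::vector<double> buffer(indexes.size() * sub_size);
  for (std::size_t i = 0; i < indexes.size(); ++i) {
    const std::size_t item = static_cast<std::size_t>(indexes[i]);
    assert((item + 1) * sub_size <= values.size()); // item row must exist
    for (std::size_t j = 0; j < sub_size; ++j)
      buffer[i * sub_size + j] = values[item * sub_size + j];
  }
  return buffer;
}

// Hypothetical 2D version: a single flat loop over nb_index * sub_size.
// Each iteration k handles exactly one (i, j) pair and writes one distinct
// buffer entry, so all iterations are independent and could each be given
// to its own thread.
std::vector<double> pack_per_element(const std::vector<double>& values,
                                     const std::vector<std::int32_t>& indexes,
                                     std::size_t sub_size)
{
  const std::size_t nb_index = indexes.size();
  const std::size_t total = nb_index * sub_size;
  std::vector<double> buffer(total);
  for (std::size_t k = 0; k < total; ++k) {
    const std::size_t i = k / sub_size; // position in the index list
    const std::size_t j = k % sub_size; // position in the second dimension
    const std::size_t item = static_cast<std::size_t>(indexes[i]);
    assert((item + 1) * sub_size <= values.size()); // item row must exist
    buffer[k] = values[item * sub_size + j];
  }
  return buffer;
}
```

Both functions fill the buffer identically; only the shape of the iteration space differs. As an illustration with an assumed nb_index of 10 and the sub_size of 38400 mentioned above, the per-item form exposes 10 independent units of work while the per-element form exposes 384000.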
