@RezaYazdaniAminabadi
Contributor

This PR fuses the query, key, and value sequence-parallel AlltoAlls into a single call, using the new API changes drafted in a PR for the Megatron-DeepSpeed repo. With these changes, we can reduce the communication overhead, especially when increasing the SP degree beyond 2 (to 4 or 8).
cc: @samadejacobs @tjruwase @tohtana
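
For readers unfamiliar with the idea, here is a minimal pure-Python sketch of what fusing the three AlltoAlls means. It is a simulation, not the code in this PR: `all_to_all`, `separate`, `fused`, `WORLD`, and `HEADS` are hypothetical names chosen for illustration, and the "tensors" are just labels of the form `(name, head, source_rank)`. It only shows that packing q, k, and v into one exchange delivers the same data as three exchanges while issuing one collective instead of three.

```python
WORLD = 4  # sequence-parallel degree (assumed for illustration)
HEADS = 8  # number of attention heads, must be divisible by WORLD

calls = 0  # counts simulated collectives


def all_to_all(send):
    """Simulated all-to-all: recv[dst][src] = send[src][dst]."""
    global calls
    calls += 1
    world = len(send)
    return [[send[src][dst] for src in range(world)] for dst in range(world)]


def shard(name, rank):
    # One label per head, standing in for that head's local sequence shard.
    return [(name, h, rank) for h in range(HEADS)]


def by_head_group(items, world):
    # Split the heads into `world` contiguous groups, one per destination rank.
    per = len(items) // world
    return [items[g * per:(g + 1) * per] for g in range(world)]


def separate():
    # Unfused path: one all-to-all each for q, k, and v.
    out = {}
    for name in "qkv":
        send = [by_head_group(shard(name, r), WORLD) for r in range(WORLD)]
        out[name] = all_to_all(send)
    return out


def fused():
    # Fused path: pack q, k, v chunks per destination, exchange once, unpack.
    per = HEADS // WORLD
    send = []
    for r in range(WORLD):
        g = {n: by_head_group(shard(n, r), WORLD) for n in "qkv"}
        send.append([g["q"][d] + g["k"][d] + g["v"][d] for d in range(WORLD)])
    recv = all_to_all(send)
    out = {}
    for i, name in enumerate("qkv"):
        out[name] = [[chunk[i * per:(i + 1) * per] for chunk in recv[d]]
                     for d in range(WORLD)]
    return out


def run(fn):
    # Reset the counter, run one path, and report how many collectives it used.
    global calls
    calls = 0
    result = fn()
    return result, calls
```

Calling `run(separate)` and `run(fused)` returns identical per-rank layouts, with the fused path issuing a single simulated collective. On real hardware the benefit would come from fewer collective launches and larger messages; the actual saving depends on the interconnect and SP degree and is not modeled here.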
