Conversation

@LittleNyima
During multi-GPU training, the current implementation loads every rank's text encoder onto the same GPU, which causes CUDA OOM. The correct approach is to initialize the accelerator first and only then move the text encoders to CUDA, so each rank targets its own device.
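The fix described above can be sketched as follows. This is a minimal, hypothetical illustration, not the PR's actual code: it assumes a torchrun-style launcher that sets `LOCAL_RANK` per process, and `load_text_encoder` is a placeholder name. With Hugging Face Accelerate, `Accelerator().device` performs this per-rank resolution once the accelerator has been initialized.

```python
import os

def resolve_device() -> str:
    """Return this rank's own GPU, mirroring what Accelerator().device
    yields after the accelerator has been initialized.
    Assumes the launcher (e.g. torchrun) sets LOCAL_RANK per process."""
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    return f"cuda:{local_rank}"

# Buggy pattern: a hard-coded device string resolves to cuda:0 on every
# rank, so all text encoders pile onto one GPU and trigger CUDA OOM:
#     text_encoder = load_text_encoder().to("cuda")

# Fixed pattern: initialize the accelerator first, then move the model
# to this rank's own device:
#     accelerator = Accelerator()
#     text_encoder = load_text_encoder().to(accelerator.device)

os.environ["LOCAL_RANK"] = "3"
print(resolve_device())  # cuda:3
```

The key point is ordering: device placement must happen after the distributed context exists, otherwise every process falls back to the default GPU.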
