High GPU memory consumption #6
Open
Description
Hi,
I tried to integrate the TTLayer into Transformer-XL,
but I found that it consumes much more GPU memory than the original model.
Have you experienced this problem? Do you know of any way around it?
(BTW, I also applied a few fixes for multi-GPU training. For example, the tensor train objects are not moved to the GPU when you call model.to(device), which breaks the model in distributed training.)
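
To illustrate the model.to(device) problem, here is a minimal sketch. It assumes the cause is that the cores are held as plain tensors rather than registered parameters; that is my guess at the mechanism, not something confirmed from this repository's code. The class names `PlainAttrLayer` and `RegisteredLayer` and the core shapes are hypothetical. A dtype conversion stands in for the device move so the example runs without a GPU; both go through the same `nn.Module` machinery.

```python
import torch
import torch.nn as nn


class PlainAttrLayer(nn.Module):
    """Hypothetical layer that keeps its cores as plain tensors in a list."""

    def __init__(self, shapes):
        super().__init__()
        # A plain Python list of tensors is invisible to nn.Module bookkeeping.
        self.cores = [torch.randn(*s) for s in shapes]


class RegisteredLayer(nn.Module):
    """Same cores, but registered through nn.ParameterList."""

    def __init__(self, shapes):
        super().__init__()
        self.cores = nn.ParameterList(
            [nn.Parameter(torch.randn(*s)) for s in shapes]
        )


shapes = [(1, 4, 3), (3, 4, 1)]  # illustrative TT-core shapes (assumption)
plain = PlainAttrLayer(shapes)
registered = RegisteredLayer(shapes)

# .to(...) only touches registered parameters and buffers. A dtype change is
# used here so the sketch runs on CPU; a device move is applied the same way.
plain.to(torch.float64)
registered.to(torch.float64)

plain_dtypes = [c.dtype for c in plain.cores]
registered_dtypes = [c.dtype for c in registered.cores]
n_plain_params = len(list(plain.parameters()))
n_registered_params = len(list(registered.parameters()))
print(plain_dtypes, registered_dtypes, n_plain_params, n_registered_params)
```

In this sketch the unregistered cores are left untouched by `.to(...)` and do not show up in `parameters()`, so an optimizer or a distributed wrapper would not see them either. Registering them (as parameters, or as buffers if they should not be trained) is the kind of fix I mean.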