Could you give some guidance on how to train the model on multiple GPUs? I have a very large dataset and I run out of memory (32 GB total).
Should I use PyTorch's DistributedDataParallel, or just assign different parts of the model to different GPUs (model parallelism)?
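In case it helps clarify what I'm considering, here is the rough DDP setup I had in mind (a minimal sketch based on the torch.distributed docs; the `Linear` model and random tensors are just placeholders for my actual model and dataset):

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

def main():
    # torchrun sets RANK, WORLD_SIZE, LOCAL_RANK, etc. in the environment
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model and data -- I would swap in my real ones here
    model = torch.nn.Linear(128, 10).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    dataset = TensorDataset(torch.randn(10_000, 128),
                            torch.randint(0, 10, (10_000,)))
    sampler = DistributedSampler(dataset)  # each rank gets a disjoint shard
    loader = DataLoader(dataset, batch_size=64, sampler=sampler)

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = torch.nn.CrossEntropyLoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle the sharding each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()  # gradients are all-reduced across ranks here
            optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()  # launch with: torchrun --nproc_per_node=4 train_ddp.py
```

Note that DDP replicates the full model on every GPU, so I'm not sure it would actually reduce per-GPU memory in my case, which is part of why I'm asking.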
Thanks.