enable subtomogram alignment on multiple gpus #69

Conversation
Hey @alncat, thanks for the contribution!

@sroet, you may also check another commit, about the shift determined during alignment. I multiplied it by the binning factor.
I will check it tomorrow. |
@sroet,
I think it might be beneficial to let multiple processes run on the same GPU. The alignment or averaging process takes less than 1 GB of GPU memory, and a modern NVIDIA GPU often has at least 4 GB, so we can run at least 4 processes on a single GPU, maximizing performance. This should be a better way than limiting each GPU to a single process.

That is very dependent on the size of your subtomogram, but I do agree that at least having the option to spawn more processes per GPU seems like a good idea.

That is fine; the problem is that it now might be random which GPU each process uses (and even how many processes try to use a single GPU), especially if one GPU is (significantly) slower than the others (because it is also processing graphics output for a workstation, or something like that). For example, if you have 4 processes (p0, p1, p2, and p3) and 2 GPUs (g0, g1), where g0 is way slower than g1, the processes can end up unevenly piled onto one GPU. Which is why I proposed to link the GPU index to the MPI process rank (see the sketch below).
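A minimal sketch of that rank-based pinning, assuming mpi4py and a hypothetical helper name (`gpu_for_rank` is an illustration, not PyTom's actual code):

```python
# Hypothetical sketch, not PyTom's implementation: pin each MPI process to a
# GPU deterministically by linking the GPU index to the process rank.
from mpi4py import MPI

def gpu_for_rank(gpu_ids, rank):
    """Return the GPU this rank should use: rank modulo the number of GPUs."""
    return gpu_ids[rank % len(gpu_ids)]

comm = MPI.COMM_WORLD
gpu_ids = [0, 1]  # e.g. two physical GPUs
my_gpu = gpu_for_rank(gpu_ids, comm.Get_rank())
# With 4 ranks and 2 GPUs this yields p0->g0, p1->g1, p2->g0, p3->g1,
# so the assignment is fixed rather than random.
```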
@sroet

PyTom/pytom/alignment/GLocalSampling.py, line 903 in 3a89e44

So the evenSplitList here is of size 20. When it is passed into mpi.parfor, the argument list generated by list(zip(...)) is padded to the true size of the MPI world, which is 21 (line 147 in 3a89e44). But note the condition at line 155 in 3a89e44; otherwise, a process will get the padded argument. The first element of that argument is the subset of the particle list this process will work on, while the last element is the gpuID that the process will bind to. So the modification gives each split its own copy of the gpuIDs (see the sketch below). My alignment command typically runs with 21 MPI processes.
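A minimal sketch of that pairing, with hypothetical names (`even_split_list` and `gpu_ids` are illustrations, not PyTom's exact identifiers):

```python
# Hypothetical illustration of the described modification: pair every
# particle-list split with its own copy of the gpuIDs before mpi.parfor
# distributes the pairs over the worker processes.
gpu_ids = [0, 1]
even_split_list = [f"split_{i}" for i in range(20)]  # 20 splits for 20 workers

# Each worker receives (its split, its own gpuIDs copy); copying per split
# avoids every worker sharing one list object.
args = list(zip(even_split_list, [list(gpu_ids) for _ in even_split_list]))

subset, worker_gpu_ids = args[0]  # first element: particle subset; last: gpuIDs
```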
Thank you for the check and the extensive write-up! Do you mind adding/pushing the modification? That is the only thing that is left for me to approve this!
@sroet, thank you for accepting this! In GLocalAlignmentPlan, the volumes are binned:

PyTom/pytom/gpu/gpuStructures.py, line 236 in 3a89e44

If the input volume is binned, then at

PyTom/pytom/gpu/gpuStructures.py, line 437 in 3a89e44

the shift determined during alignment has to be multiplied by the binning factor to be correct for the unbinned volume.
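A minimal sketch of that correction, assuming a (dx, dy, dz) shift vector (`unbin_shift` and `binning_factor` are hypothetical names, not PyTom's API):

```python
import numpy as np

# Hypothetical sketch: a shift found on a volume binned by `binning_factor`
# corresponds to a `binning_factor`-times larger shift in the unbinned volume.
def unbin_shift(shift_binned, binning_factor):
    """Rescale a (dx, dy, dz) shift from binned to original voxel coordinates."""
    return np.asarray(shift_binned, dtype=float) * binning_factor

# A shift of (1.5, -2.0, 0.5) voxels found with binning 2 becomes (3.0, -4.0, 1.0).
print(unbin_shift((1.5, -2.0, 0.5), 2))
```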
correct shift when alignment is performed with binning
@sroet I revised the commit and resubmitted :-). Thank you for reviewing!
sroet left a comment
LGTM (looks good to me!), will merge after the unit tests complete and pass.
Thanks again for the contribution.
Dear PyTom developers,
I have made a small modification which enables GLocalSampling to utilize multiple GPUs across multiple processes. This is achieved by creating a copy of the gpuIDs for each split on each process. My personal tests found that this modification greatly enhances the parallelism of the GLocal job.
Best Regards,
Zhenwei