MPI #141
Determine the maximum value of the key without implicit assumptions.
We were reusing the first few random numbers following source generation (the RNG state was the same at the beginning of sampling the source particle and at its first flight). This commit moves the RNG back to before the source generation is performed, thus preventing the reuse.
No communication takes place at this stage.
Will be reproducible if the fixes to source_init are merged.
Allows limiting the console I/O to the master process only. Applied only to the fixed source calculation at the moment.
It will be used for sampling without replacement.
Results from non-master processes are not combined, hence they are lost at the moment.
Fixes a bug for the synchronised scoreMemory: the buffer value in parallelBin was not properly set to 0 again after the transfer (a sketch of the fix pattern follows this list of commits).
It is not reproducible at the moment.
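For the scoreMemory fix above, a minimal illustrative sketch (the names below are hypothetical, not SCONE's actual scoreMemory interface): once a buffered score has been transferred into its main bin, the buffer must be zeroed, otherwise the same value is added again on the next transfer.

```fortran
! Sketch only (hypothetical names): transfer a buffered score into the main
! bin and reset the buffer so it is not counted twice on the next transfer.
subroutine transferBin(bins, parallelBins, idx)
  implicit none
  real(8), intent(inout) :: bins(:)          ! accumulated scores
  real(8), intent(inout) :: parallelBins(:)  ! per-cycle buffer
  integer, intent(in)    :: idx

  bins(idx) = bins(idx) + parallelBins(idx)
  parallelBins(idx) = 0.0_8   ! the reset that was missing before the fix
end subroutine transferBin
```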
```fortran
  use mpi_func, only : isMPIMaster, getWorkshare, getOffset, getMPIRank
#ifdef MPI
  use mpi_func, only : MASTER_RANK, MPI_Bcast, MPI_INT, MPI_COMM_WORLD, &
                       MPI_DOUBLE, mpi_reduce, MPI_SUM
```
Suggested change:
```diff
-                       MPI_DOUBLE, mpi_reduce, MPI_SUM
+                       MPI_DOUBLE, MPI_REDUCE, MPI_SUM
```
```fortran
#ifdef MPI
    ! Print the population numbers referred to all processes to screen
    call mpi_reduce(nStart, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
```
Suggested change:
```diff
-    call mpi_reduce(nStart, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
+    call MPI_REDUCE(nStart, nTemp, 1, MPI_INT, MPI_SUM, MASTER_RANK, MPI_COMM_WORLD, error)
```
To get the tests passing, I think it's a case of including an MPI installation in the Docker image? mikolajkowalski/scone-test
Force-pushed from a978577 to 3fb146e.
After a long time spent fighting with the compiler... it is still winning. The problem is related to the Docker container though, rather than to SCONE. OMPI doesn't like it when we try to run the tests as the root user in the container (this refers to the unitTests). I managed to force it with gfortran-9 and gfortran-10 (which, as you can see, run ok). However, I couldn't find a way to get gfortran-8 to agree to run!
Another thing to note is that in the particleDungeon, the previous sampling-without-replacement algorithm now lives in a new procedure called samplingWithoutReplacement, and it isn't used.
The reason here is most probably that the OMPI version in the gfortran-8 image is older. In general we should probably push up the compiler versions for CI and maybe drop gfortran-8 :-/ If we want to keep gfortran-8 we could just try MPICH?
No reason to keep dead code. It will be preserved in the Git history if anyone ever wants to inspect it.
Yes, that makes sense about the older OMPI version (annoying!). I agree about dropping gfortran-8 and adding newer versions (all this in a separate PR). But in this case, the problem remains that I can't get the tests to pass in this PR... I will surely manage with more work, but I wonder if it's worth spending time on this.
We can just make the new PR quick (yes... I know) and then rebase this one on the
ChasingNeutrons left a comment:
Only a few very small things, then I'm happy.
```fortran
  !!
  !! Perform nearest neighbor load balancing
  !!
#ifdef MPI
```
Wouldn't it cut out lots of ifdefs if you only put the #ifdef MPI inside the load balancing procedure, rather than wrapping it around each place it occurs?
You mean at runtime? I'm not sure what you mean here. It could be that there are better ways to place those statements, but I am not sure it makes a big difference at all. I am tempted to leave it as is for now!
I mean that the contents of the function (somewhere after the 'popSizes' definition up to right before 'end subroutine') could be surrounded by an ifdef, but the function itself needn't be. This would allow you to remove the ifdefs around its call sites and around its declaration as a procedure. I think this is desirable because (understandably) there are ifdefs everywhere, so it would be nice for readability to be able to remove a few.
I changed it! I see arguments to do it either way but I don't have a strong preference.
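To illustrate the pattern settled on here, a minimal sketch (placeholder names, not SCONE's actual particleDungeon code): the guard sits inside the body, so the procedure and its call sites compile unconditionally and the routine reduces to a no-op in a serial build.

```fortran
module loadBalance_demo
  implicit none
contains

  !! Sketch only: the #ifdef lives inside the body, so the call sites and the
  !! procedure declaration need no guards of their own.
  subroutine loadBalance(popSizes)
#ifdef MPI
    use mpi
#endif
    integer, intent(inout) :: popSizes(:)
#ifdef MPI
    integer :: rank, nProcs, ierr

    call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
    call MPI_COMM_SIZE(MPI_COMM_WORLD, nProcs, ierr)
    ! ... decide how many particles to pass to rank-1 / rank+1 based on
    !     popSizes and exchange them; in a serial build this block vanishes ...
#endif
  end subroutine loadBalance

end module loadBalance_demo
```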
Finally! I think this is ready to go.
This PR adds MPI support. The main differences are in the tallies and in the dungeon, for population normalisation between cycles and for load balancing (following Paul Romano's PhD thesis).
In the tallies, one key change is that a new report, closeCycle, was added in addition to reportCycleEnd. With MPI, there are two options: mpiSync 1 means that the tallies are synchronised every cycle; mpiSync 0 means that the tallies from each rank are collected only at the end of the calculation. All calculations give identical results with mpiSync 1; with mpiSync 0 they agree within statistics. NOTE that this option applies to all tallies included in a tally admin rather than to individual clerks. Splitting reportCycleEnd into two procedures (i.e., adding closeCycle) makes reproducibility easier for most tallyClerks.
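As a rough illustration of what per-cycle synchronisation amounts to (this is not the actual tallyAdmin interface, just a sketch with made-up names): every rank contributes its local scores and the summed result ends up on the master.

```fortran
! Illustration only: combine per-rank tally scores onto the master process.
! With mpiSync 1 something like this happens every cycle; with mpiSync 0 the
! per-rank results are only combined once, at the end of the calculation.
subroutine syncScores(localScores, totalScores, masterRank)
  use mpi
  implicit none
  real(8), intent(in)  :: localScores(:)
  real(8), intent(out) :: totalScores(:)
  integer, intent(in)  :: masterRank
  integer :: ierr

  call MPI_REDUCE(localScores, totalScores, size(localScores), &
                  MPI_DOUBLE_PRECISION, MPI_SUM, masterRank, MPI_COMM_WORLD, ierr)
end subroutine syncScores
```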
In the dungeon, population normalisation was implemented using a new data structure, heapQueue. Note that, to ensure reproducibility, particles have to be sorted before and after sampling. Then, load balancing is performed by transferring particles between processes.
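For readers unfamiliar with the technique, here is a rough self-contained sketch of the idea behind a bounded heap used for sampling without replacement (placeholder names, not the SCONE heapQueue API): keep the N particles with the smallest random keys by maintaining a max-heap of size N and replacing the current maximum whenever a smaller key arrives.

```fortran
module heap_demo
  implicit none
contains

  !! Insert a key into a bounded max-heap that retains only the maxSize
  !! smallest keys seen so far (heap(1) is the largest of the kept keys).
  subroutine pushBounded(heap, n, maxSize, key)
    real(8), intent(inout) :: heap(:)
    integer, intent(inout) :: n
    integer, intent(in)    :: maxSize
    real(8), intent(in)    :: key

    if (n < maxSize) then
      n = n + 1
      heap(n) = key
      call siftUp(heap, n)
    else if (key < heap(1)) then
      heap(1) = key          ! replace the current maximum
      call siftDown(heap, n)
    end if
  end subroutine pushBounded

  !! Restore the max-heap property after appending an element at position i0
  subroutine siftUp(heap, i0)
    real(8), intent(inout) :: heap(:)
    integer, intent(in)    :: i0
    integer :: i, parent
    real(8) :: tmp

    i = i0
    do while (i > 1)
      parent = i / 2
      if (heap(parent) >= heap(i)) exit
      tmp = heap(parent); heap(parent) = heap(i); heap(i) = tmp
      i = parent
    end do
  end subroutine siftUp

  !! Restore the max-heap property after replacing the root
  subroutine siftDown(heap, n)
    real(8), intent(inout) :: heap(:)
    integer, intent(in)    :: n
    integer :: i, child
    real(8) :: tmp

    i = 1
    do
      child = 2 * i
      if (child > n) exit
      if (child < n) then
        if (heap(child + 1) > heap(child)) child = child + 1
      end if
      if (heap(i) >= heap(child)) exit
      tmp = heap(i); heap(i) = heap(child); heap(child) = tmp
      i = child
    end do
  end subroutine siftDown

end module heap_demo
```

A caller would size heap(maxSize), call pushBounded once per candidate particle with that particle's random key, and afterwards keep exactly the particles whose keys remain in the heap; the selection depends only on the keys, which is what makes it reproducible provided the keys themselves are.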
Results seem to be reproducible for all tallies, and all the tests pass successfully.
@Mikolaj-A-Kowalski However, the GitHub tests seem to fail during compilation because the MPI library is not available. Do you know if there's an easy solution to this? Otherwise I'll have a look.