Description
Issue by Dmitriy Morozov
Sunday Feb 08, 2015 at 18:51 GMT
Many processors (for example, the ones on Edison) place memory next to the core that first touched it. Currently, we do not take this into consideration at all. Initial blocks are created in serial (to avoid problems with threaded MPI IO), which means the input data is probably placed next to the core on which the main thread is running. In some cases, this may be OK (if extra data structures are created later during the parallel computation), but in other cases (all dense data), this is surely suboptimal.
Worse yet, when we run the actual computation, we do not keep track of which thread is preferred for each block (in terms of memory placement). We should consider taking this into account.
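
To make the idea concrete, here is a minimal sketch of one possible approach, not what the library currently does. It assumes a first-touch placement policy and that worker threads stay pinned to their cores (pinning is not shown). `Block`, `preferred_thread`, and `create_blocks_first_touch` are hypothetical names made up for illustration. Each thread allocates and first-touches the buffers of the blocks assigned to it and records itself as the preferred thread; the main thread could then still fill the buffers in serial to keep IO single-threaded.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical block type for illustration only.
struct Block
{
    std::size_t               size = 0;
    std::unique_ptr<double[]> data;                  // dense payload
    unsigned                  preferred_thread = 0;  // thread that first touched `data`
};

// Blocks are assigned to threads round-robin. Each thread allocates and
// writes to (first-touches) its own blocks, so under a first-touch policy
// the pages should end up near the core running that thread. This is an
// assumption about the platform; an allocator may also hand back pages
// that were already touched elsewhere.
std::vector<Block> create_blocks_first_touch(std::size_t num_blocks,
                                             std::size_t block_size,
                                             unsigned    num_threads)
{
    assert(num_threads > 0);

    std::vector<Block>       blocks(num_blocks);
    std::vector<std::thread> workers;

    for (unsigned t = 0; t < num_threads; ++t)
        workers.emplace_back([&blocks, block_size, num_threads, t]
        {
            // Each thread writes only to its own blocks, so no locking is needed.
            for (std::size_t b = t; b < blocks.size(); b += num_threads)
            {
                Block& blk = blocks[b];
                blk.size = block_size;
                blk.data.reset(new double[block_size]);  // not value-initialized, so no touch yet
                for (std::size_t i = 0; i < block_size; ++i)
                    blk.data[i] = 0.0;                   // first touch happens here
                blk.preferred_thread = t;                // remember placement for later scheduling
            }
        });

    for (auto& w : workers)
        w.join();

    return blocks;
}
```

During the computation, a scheduler could then hand each block to the thread stored in `preferred_thread` when that thread is free, and fall back to any idle thread otherwise.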