558 develop shared memory parallel backend for dense alp #163
base: 303-develop-reference_dense-backend
Conversation
…backend (#129) Introduce the essential functionality of the parallel backend to support allocating a matrix in parallel and setting its elements to a single value.
* Replication factor in Distribution becomes static constexpr
* Take into account replication factor in thread grid layout
* Consider full thread grid coordinates (including rt) in containers and operations
* Add missing comments and fix a mistake in another comment
* Avoid calling distribution functions multiple times
* Pass thread coordinates using ThreadCoords object
* Compute number of threads within the distribution
* Encapsulate thread coordinates within the local coordinates
* Remove thread-grid related structures and getters since they are no longer needed
* Explain the reason behind hard-coded rt value
* Calculate block id inside distribution
* Rename distribution to a more descriptive name
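For orientation, here is a minimal, hypothetical C++/OpenMP sketch of two of the ideas listed above: encapsulating a thread's position in the thread grid in a small ThreadCoords-like object (row, column, and replication-layer coordinates), and setting every element of an already-allocated buffer to a single value in parallel. All names and the flat buffer layout are assumptions for illustration and do not come from the ALP sources.

```cpp
#include <cstddef>

// Hypothetical encapsulation of a thread's position in the thread grid,
// mirroring the ThreadCoords idea from the commit list above.
struct ThreadCoords {
	const std::size_t tr; // thread-row coordinate
	const std::size_t tc; // thread-column coordinate
	const std::size_t rt; // replication-layer coordinate
	ThreadCoords( std::size_t tr, std::size_t tc, std::size_t rt ) :
		tr( tr ), tc( tc ), rt( rt ) {}
};

// Minimal sketch of "set every element to a single value" done in parallel:
// each iteration writes a distinct element, so no synchronisation is needed
// beyond the implicit barrier at the end of the parallel region.
void set_to_value( double * const data, const std::size_t n, const double value ) {
	#pragma omp parallel for
	for( std::size_t i = 0; i < n; ++i ) {
		data[ i ] = value;
	}
}
```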
…hared-memory-parallel-backend-for-dense-alp-backup
…lop-shared-memory-parallel-backend-for-dense-alp
* Started drafting mxm using 2.5D algo
* WIP implementation of shifting + compute
* First complete draft
* compiling mxm test on general matrices
* shared mem (ge) mxm passing functional tests
* Tmp debugging info + fix in shift computation
* Refactoring includes
* Fixing mismatching new/delete bug; enabling allocation/computation on sub-grid of threads; added unit tests for non-cubic mxm
* Fixed style and typos
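To make the "shifting + compute" structure referred to above more concrete, the following is a hedged, self-contained sketch of a shared-memory 2.5D-style multiply over plain row-major buffers; it is not the ALP mxm of this PR. Threads are viewed as a t x t x layers grid, each replication layer accumulates its share of the k-blocks into a private partial result, and the layers are reduced at the end. The function name, the blocking scheme, and the assumption that t divides n are all illustrative.

```cpp
#include <cstddef>
#include <vector>

// Sketch of the block-wise structure behind a shared-memory 2.5D matrix multiply.
// All names are illustrative and do not reflect the ALP API.
void mxm_25d_sketch(
	std::vector< double > &C, const std::vector< double > &A, const std::vector< double > &B,
	const std::size_t n,      // matrices are n x n, stored row-major
	const std::size_t t,      // thread grid is t x t x layers
	const std::size_t layers  // replication factor (the "0.5 D")
) {
	const std::size_t bs = n / t; // block size, assuming t divides n
	// one partial C per replication layer, reduced at the end
	std::vector< std::vector< double > > partial( layers, std::vector< double >( n * n, 0.0 ) );

	// every (layer, block-row, block-column) triple works independently
	#pragma omp parallel for collapse( 3 )
	for( std::size_t rt = 0; rt < layers; ++rt )
		for( std::size_t tr = 0; tr < t; ++tr )
			for( std::size_t tc = 0; tc < t; ++tc )
				// layer rt handles every layers-th "shift" step k
				for( std::size_t k = rt; k < t; k += layers )
					for( std::size_t i = tr * bs; i < ( tr + 1 ) * bs; ++i )
						for( std::size_t j = tc * bs; j < ( tc + 1 ) * bs; ++j )
							for( std::size_t kk = k * bs; kk < ( k + 1 ) * bs; ++kk )
								partial[ rt ][ i * n + j ] += A[ i * n + kk ] * B[ kk * n + j ];

	// reduce the per-layer partial results into C
	#pragma omp parallel for
	for( std::size_t idx = 0; idx < n * n; ++idx ) {
		double sum = 0.0;
		for( std::size_t rt = 0; rt < layers; ++rt ) {
			sum += partial[ rt ][ idx ];
		}
		C[ idx ] += sum;
	}
}
```

The replication factor trades extra memory (one partial C per layer) for fewer k-steps per layer, which is the essential 2.5D trade-off.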
| if [ "${BACKEND:0:4}" != "alp_" ]; then | ||
| # Temporarily execute tests only for alp_reference backend | ||
| # until all backends start supporting all smoke tests. | ||
| if [ "${BACKEND}" != "alp_reference" ]; then |
Should we enable omp tests?
```cpp
/**
 * \internal general mxm implementation that all mxm variants using
 * structured matrices refer to.
 */
```
A more detailed explanation would be useful.
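As a suggestion only, the comment could name the operands, the semiring semantics, and the dispatch role of the function; the exact behaviour (in particular whether C is overwritten or accumulated into) should be taken from the implementation rather than from this sketch:

```cpp
/**
 * \internal General mxm implementation to which all mxm variants on
 *           structured matrices dispatch.
 *
 * Computes the product of A and B under the given semiring and accumulates
 * it into C, taking the structures, views, and index-mapping functions of
 * all three operands into account. The public mxm overloads only differ in
 * how they fix structures and views before delegating to this function.
 */
```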
```cpp
>
RC mxm_generic(
	alp::Matrix< OutputType, OutputStructure,
	Density::Dense, OutputView, OutputImfR, OutputImfC, omp > &C,
```
Indent (here and in the following lines) is missing.
I think we follow this style:
```cpp
alp::Matrix<
	InputType1, InputStructure1,
	Density::Dense, InputView1, InputImfR1, InputImfC1, omp
> &A,
```
```cpp
// /** Type encapsulating the local block coordinate. */
// struct LocalBlockCoord {
//
// 	const size_t tr;
// 	const size_t tc;
// 	const size_t rt;
// 	const size_t br;
// 	const size_t bc;
//
// 	LocalBlockCoord(
// 		const size_t tr, const size_t tc,
// 		const size_t rt,
// 		const size_t br, const size_t bc
// 	) :
// 		tr( tr ), tc( tc ),
// 		rt( rt ),
// 		br( br ), bc( bc ) {}
//
// };
```
This is not needed any more. There are several commented-out blocks in this file. Perhaps some of them should be removed?
```cpp
RC local_rc = SUCCESS;

// Broadcast A and B to all c-dimensional layers
if( local_rc == SUCCESS && da.isActiveThread( th_ijk_a ) && th_ijk_a.rt > 0 ) {
```
The check local_rc == SUCCESS is always true here: local_rc is initialised to SUCCESS just above and is not modified before this point.
```cpp
// LocalBlockCoord mapBlockGlobalToLocal( const GlobalBlockCoord &g ) const {
// 	(void) g;
// 	return LocalBlockCoord( 0, 0, 0, 0, 0 );
// }
//
// GlobalBlockCoord mapBlockLocalToGlobal( const LocalBlockCoord &l ) const {
// 	const size_t block_id_r = l.br * Tr + l.tr;
// 	const size_t block_id_c = l.bc * Tc + l.tc;
// 	return GlobalBlockCoord( block_id_r, block_id_c ); // Temporary
// }
```
Old comments, remove them?
```cpp
/**
 * AMF for parallel shared memory backend.
 *
```
Should we leave a space for generalization of this choice?
```cpp
imf_r( imf_r ), imf_c( imf_c ),
num_threads( num_threads ),
distribution( imf_r.n, imf_c.n, num_threads ) {
	std::cout << "Entering AMF normal constructor\n";
```
Should this message go to DEBUG?
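For example, the print could be compiled out unless a debug flag is defined; ALP_DEBUG below is only a placeholder for whatever debug switch the codebase actually uses:

```cpp
#ifdef ALP_DEBUG // hypothetical flag name; substitute the project's actual debug macro
	std::cout << "Entering AMF normal constructor\n";
#endif
```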
```cpp
imf_c( std::move( amf.imf_c ) ),
num_threads( amf.num_threads ),
distribution( std::move( amf.distribution ) ) {
	std::cout << "Entering OMP AMF move constructor\n";
```
Same comment about message.
```cpp
 * for R and/or C.
 *
 */
storage_index_type getStorageIndex( const size_t i, const size_t j, const size_t s, const size_t P ) const {
```
Is const size_t P needed here?
|
Slot for 0.9 - but @djelovina, is this one still current or should we focus on the one you more recently rebased?
This PR merges the current omp reference backend into 303. Future extensions to either the sequential or the parallel backend would use feature branches off 303 only.