Skip to content

Conversation

@hidanielesgit
Copy link

This PR merges current omp reference backend into 303. Future extensions on either sequential or parallel backends would use feature branches from 303 only.

Vladimir Dimic and others added 6 commits December 5, 2022 11:26
…backend (#129)

Introduce essential functionalities of the parallel backend to support allocating a matrix in parallel and setting its elements to a single value.
* Replication factor in Distribution becomes static constexpr
* Take into account replication factor in thread grid layout
* Consider full thread grid coordinates (including rt) in containers and operations
* Add missing comments and fix a mistake in another comment
* Avoid calling distribution functions multiple times
* Pass thread coordinates using ThreadCoords object
* Compute number of threads within the distribution
* Encapsulate thread coordinates within the local coordinates
* Remove thread-grid related structures and getters since they are no longer needed
* Explain the reason behind hard-coded rt value
* Calculate block id inside distribution
* Rename distribution to a more descriptive name
…hared-memory-parallel-backend-for-dense-alp-backup
…lop-shared-memory-parallel-backend-for-dense-alp
* Started drafting mxm using 2.5D algo

* WIP implementation of shifting + compute

* First complete draft

* compiling mxm test on general matrices

* shared mem (ge) mxm passing functional tests

* Tmp debugging info + fix in shift computation

* Refactoring includes

* - Fixing mismatching new/delete bug
- Enabling allocation/computation on sub-grid of threads
-  Added unit tests for non-cubic mxm

* Fixed style and typos
Comment on lines -514 to +516
if [ "${BACKEND:0:4}" != "alp_" ]; then
# Temporarily execute tests only for alp_reference backend
# until all backends start supporting all smoke tests.
if [ "${BACKEND}" != "alp_reference" ]; then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we enable omp tests?

Comment on lines +53 to +56
/**
* \internal general mxm implementation that all mxm variants using
* structured matrices refer to.
*/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more detailed explanation would be useful.

>
RC mxm_generic(
alp::Matrix< OutputType, OutputStructure,
Density::Dense, OutputView, OutputImfR, OutputImfC, omp > &C,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent (here and in the following lines) is missing.
I think we follow this style:

alp::Matrix< 
	InputType1, InputStructure1, 
	Density::Dense, InputView1, InputImfR1, InputImfC1, omp 
> &A,

Comment on lines +134 to +152
// /** Type encapsulating the local block coordinate. */
// struct LocalBlockCoord {

// const size_t tr;
// const size_t tc;
// const size_t rt;
// const size_t br;
// const size_t bc;

// LocalBlockCoord(
// const size_t tr, const size_t tc,
// const size_t rt,
// const size_t br, const size_t bc
// ) :
// tr( tr ), tc( tc ),
// rt( rt ),
// br( br ), bc( bc ) {}

// };
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not needed any more. There are several commented-out blocks in this file. Perhaps some of them should be removed?

RC local_rc = SUCCESS;

// Broadcast A and B to all c-dimensional layers
if( local_rc == SUCCESS && da.isActiveThread( th_ijk_a ) && th_ijk_a.rt > 0 ) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the check local_rc == SUCCESS is always true

Comment on lines +186 to +195
// LocalBlockCoord mapBlockGlobalToLocal( const GlobalBlockCoord &g ) const {
// (void) g;
// return LocalBlockCoord( 0, 0, 0, 0, 0 );
// }

// GlobalBlockCoord mapBlockLocalToGlobal( const LocalBlockCoord &l ) const {
// const size_t block_id_r = l.br * Tr + l.tr;
// const size_t block_id_c = l.bc * Tc + l.tc;
// return GlobalBlockCoord( block_id_r, block_id_c );// Temporary
// }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old comments, remove them?


/**
* AMF for parallel shared memory backend.
*
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we leave a space for generalization of this choice?

imf_r( imf_r ), imf_c( imf_c ),
num_threads( num_threads ),
distribution( imf_r.n, imf_c.n, num_threads ) {
std::cout << "Entering AMF normal constructor\n";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this message got to DEBUG?

imf_c( std::move( amf.imf_c ) ),
num_threads( amf.num_threads ),
distribution( std::move( amf.distribution ) ) {
std::cout << "Entering OMP AMF move constructor\n";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about message.

* for R and/or C.
*
*/
storage_index_type getStorageIndex( const size_t i, const size_t j, const size_t s, const size_t P ) const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is const size_t P needed here?

@anyzelman
Copy link
Member

Slot for 0.9 - but @djelovina , is this one still current or should we focus on the one you more recently rebased?

@anyzelman anyzelman added the question Further information is requested label Oct 17, 2025
@anyzelman anyzelman added this to the v0.9 milestone Oct 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

question Further information is requested

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants