forked from llvm/llvm-project
Open
Labels: bug (Something isn't working)
Description
For parallel reductions where the accumulators are all initialized to the same value, the PHI nodes are merged into the instructions, and the reduction is performed on the register holding the initial value instead of on "new" registers.
    for(unsigned i=0; i < MATSIZE; ++i) {
      #pragma unroll 1
      for(unsigned j=0; j < MATSIZE; j+=UNROLL) {
        register double acc[UNROLL];
        #pragma unroll
        for (int u = 0; u < UNROLL; ++u) acc[u] = 0.0; // <- All is fine if this is e.g. c[i*MATSIZE+j+u];
        #pragma frep infer
        for(unsigned k=0; k < MATSIZE; ++k) {
          #pragma unroll
          for (int u = 0; u < UNROLL; ++u)
          {
            acc[u] += __builtin_ssr_pop(0)*__builtin_ssr_pop(1);
          }
        }
        #pragma unroll
        for (int u = 0; u < UNROLL; ++u) c[i*MATSIZE+j+u] = acc[u];
      }
    }

With 0.0 this results in the wrong assembly:
    fmadd.d ft5, ft1, ft0, ft3
    fmadd.d ft6, ft1, ft0, ft3
    fmadd.d ft7, ft1, ft0, ft3
    # ...

For individual initial values (c[i*MATSIZE+j+u]), the problem disappears:

    fmadd.d ft3, ft1, ft0, ft3
    fmadd.d ft4, ft1, ft0, ft4
    fmadd.d ft5, ft1, ft0, ft5
    fmadd.d ft6, ft1, ft0, ft6
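
To make the effect of the two listings concrete, here is a rough C model of what each one computes. It is only a sketch under assumptions: the two SSR streams are modelled as plain arrays `a` and `b` (hypothetical names), `UNROLL` is fixed to 4, and the shared addend register (`ft3` in the first listing) is assumed not to be overwritten anywhere in the elided `# ...` part. Since `fmadd.d rd, rs1, rs2, rs3` computes `rd = rs1*rs2 + rs3`, the first listing adds every product to the unchanged initial value, while the second adds it to the running accumulator.

```c
#include <assert.h>
#include <stddef.h>

#define UNROLL 4

/* Intended semantics (second listing): each accumulator is both the
 * addend and the destination, i.e. acc[u] = a*b + acc[u]. */
void reduce_expected(const double *a, const double *b, size_t n,
                     double init, double acc[UNROLL]) {
    assert(a != NULL && b != NULL && acc != NULL);
    for (int u = 0; u < UNROLL; ++u) acc[u] = init;
    for (size_t k = 0; k < n; ++k)
        for (int u = 0; u < UNROLL; ++u)
            acc[u] = a[k * UNROLL + u] * b[k * UNROLL + u] + acc[u];
}

/* Model of the first listing: the addend is always the register that
 * holds the shared initial value, so earlier partial sums are lost.
 * Assumption: that register is never written inside the loop. */
void reduce_miscompiled(const double *a, const double *b, size_t n,
                        double init, double acc[UNROLL]) {
    assert(a != NULL && b != NULL && acc != NULL);
    const double shared = init; /* plays the role of ft3 */
    for (int u = 0; u < UNROLL; ++u) acc[u] = init;
    for (size_t k = 0; k < n; ++k)
        for (int u = 0; u < UNROLL; ++u)
            acc[u] = a[k * UNROLL + u] * b[k * UNROLL + u] + shared;
}
```

Under this model, the miscompiled variant leaves each `acc[u]` equal to the initial value plus only the last product of its lane, rather than the full sum over `k`.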