-
Notifications
You must be signed in to change notification settings - Fork 3
Description
Hello,
I am using OpenCilk to run a DAG of a perfect binary tree, shown in the following.

Each node performs some computations and edges are the dependencies between nodes.
To run the DAG, I write the following code,
std::vector results(7);
for (size_t l = 0; l <= 2; ++l) {
if (l == 0) {
results[l] = computation(l);
}
else {
int count = std::pow(2, l);
for (int c = 0; c < count; ++c) {
cilk_spawn compuation(results[count-1+c]); // e.g. node t3 needs the data stored in results[1]
}
cilk_sync;
}
}
The code does a level synchronization.
For example in the figure above, node t1 and node t2 must complete before nodes t3-t6 start.
However, this implementation is not the fastest way to run the above DAG because node t2 can run in parallel with t3 and t4.
How do I modify the code to achieve a higher parallelism? Thank you.