SpdExitWorkgroup barriers may not be strict enough #13

Could you explain why the _SPD GLSL_ version only uses one barrier when cutting off all workgroups except the last?

```cpp
// Only last active workgroup should proceed
bool SpdExitWorkgroup(AU1 numWorkGroups, AU1 localInvocationIndex, AU1 slice) 
{
    // global atomic counter
    if (localInvocationIndex == 0)
    {
        SpdIncreaseAtomicCounter(slice);
    }
    SpdWorkgroupShuffleBarrier();
    return (SpdGetAtomicCounter() != (numWorkGroups - 1));
}

void SpdWorkgroupShuffleBarrier() {
#ifdef A_GLSL
    barrier();
#endif 
#ifdef A_HLSL
    GroupMemoryBarrierWithGroupSync();
#endif
}
```

According to the _GLSL_ specification, there should be a pair of calls: `barrier()` plus `memoryBarrierImage()`.
It seems unlikely that _AMD_ and every user of the algorithm would tolerate _UB_.

[Direct link to the official SPD snippet](https://github.com/GPUOpen-Effects/FidelityFX-SPD/blob/master/ffx-spd/ffx_spd.h#L381)

Important notes:
1. The mip 5 data lives in VRAM and must be synchronized before the last remaining workgroup runs. I don't think the `barrier()` call alone is enough; an additional `memoryBarrierImage()` is needed.
2. The mip 5 image has the `coherent` qualifier, but I find the spec's description of it very hazy.
3. I can't think of any example that uses `coherent` without `memoryBarrierImage()`.
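
To make the notes above concrete, here is a minimal sketch of the stricter variant I have in mind. It is not the official SPD code: the resource names (`mip5Image`, `globalAtomic`, `sharedCounter`), the bindings, the image format and the function name `ExitWorkgroupStrict` are all hypothetical, and whether the extra `memoryBarrierImage()` is actually required is exactly the open question.

```glsl
#version 450

// Hypothetical resources for illustration only; a real SPD integration
// declares its own bindings and formats.
layout(local_size_x = 256) in;
layout(binding = 0, rgba8) coherent uniform image2D mip5Image;
layout(std430, binding = 1) coherent buffer GlobalAtomic { uint counter; } globalAtomic;

shared uint sharedCounter;

// Returns true for every workgroup except the last one to arrive.
bool ExitWorkgroupStrict(uint numWorkGroups)
{
    // Assumption under discussion: order this invocation's image stores
    // before the global counter is incremented.
    memoryBarrierImage();
    // Wait until every invocation in the workgroup has reached this point.
    barrier();

    if (gl_LocalInvocationIndex == 0u)
    {
        // atomicAdd returns the counter value before the increment.
        sharedCounter = atomicAdd(globalAtomic.counter, 1u);
    }
    // Broadcast the counter value to the whole workgroup, as the original does.
    barrier();
    return sharedCounter != (numWorkGroups - 1u);
}

void main()
{
    // Stand-in for the mip 5 write performed by each workgroup.
    imageStore(mip5Image, ivec2(gl_WorkGroupID.xy), vec4(1.0));

    uint numWorkGroups = gl_NumWorkGroups.x * gl_NumWorkGroups.y;
    if (ExitWorkgroupStrict(numWorkGroups))
    {
        return;
    }

    // Only the last workgroup gets here and reads texels written by others.
    vec4 texel = imageLoad(mip5Image, ivec2(0));
    imageStore(mip5Image, ivec2(0), texel);
}
```

The only difference from the snippet quoted above is the `memoryBarrierImage()` plus `barrier()` pair placed before the atomic increment. The sketch leaves open whether the last workgroup needs a matching memory barrier before its first `imageLoad`.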

Side note: the _HLSL_ code would need additional synchronization as well (`DeviceMemoryBarrier`?).
