SpdExitWorkgroup barriers might not be strict enough #13

@Goshido

Could you explain why the SPD GLSL version only uses one barrier when cutting off all workgroups except the last?

// Only last active workgroup should proceed
bool SpdExitWorkgroup(AU1 numWorkGroups, AU1 localInvocationIndex, AU1 slice) 
{
    // global atomic counter
    if (localInvocationIndex == 0)
    {
        SpdIncreaseAtomicCounter(slice);
    }
    SpdWorkgroupShuffleBarrier();
    return (SpdGetAtomicCounter() != (numWorkGroups - 1));
}

void SpdWorkgroupShuffleBarrier() {
#ifdef A_GLSL
    barrier();
#endif 
#ifdef A_HLSL
    GroupMemoryBarrierWithGroupSync();
#endif
}

According to the GLSL specification, there should be a pair of calls: barrier() + memoryBarrierImage().
It can't be that AMD and all users of the algorithm are relying on UB.

The direct link to official SPD snippet

Important notes:

  1. The mip 5 data lives in VRAM and should be synchronized before the last single workgroup runs. I think the barrier() call alone is not enough; an additional memoryBarrierImage() is needed.
  2. The mip 5 image has the coherent qualifier, but the spec's description of it is very hazy to me.
  3. I can't think of any example that uses coherent without memoryBarrierImage().

Side note: HLSL code would need additional sync (DeviceMemoryBarrier?) as well.
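
For illustration, here is a minimal GLSL sketch of the stricter variant the notes above ask about. It is not the official SPD code. The names `mip5Image`, `GlobalAtomic`, `sharedCounter` and `ExitWorkgroupStrict`, the bindings, the image format and the workgroup size are all hypothetical. It assumes each invocation writes its part of mip 5 with imageStore() before calling the exit check. Whether the extra memoryBarrierImage() is actually required is exactly the open question of this issue.

```glsl
#version 450

layout(local_size_x = 256) in;

// Hypothetical resources; the real SPD integration defines its own bindings.
layout(binding = 0, r32f) coherent uniform image2D mip5Image;
layout(binding = 1, std430) coherent buffer GlobalAtomic { uint counter; } globalAtomic;

// Workgroup-local copy of the counter value returned by the atomic.
shared uint sharedCounter;

// Returns true for every workgroup except the last one to arrive.
bool ExitWorkgroupStrict(uint numWorkGroups, uint localInvocationIndex)
{
    // Order this invocation's earlier image stores before later memory operations.
    memoryBarrierImage();
    // Wait until every invocation in the workgroup has issued its memory barrier.
    barrier();

    if (localInvocationIndex == 0u)
    {
        // atomicAdd returns the value before the increment.
        sharedCounter = atomicAdd(globalAtomic.counter, 1u);
    }

    // Make the shared counter visible to the whole workgroup before reading it.
    memoryBarrierShared();
    barrier();
    return sharedCounter != (numWorkGroups - 1u);
}

void main()
{
    // Stand-in for the mip 5 write done by each workgroup.
    if (gl_LocalInvocationIndex == 0u)
    {
        imageStore(mip5Image, ivec2(gl_WorkGroupID.xy), vec4(1.0));
    }

    uint numWorkGroups = gl_NumWorkGroups.x * gl_NumWorkGroups.y;
    if (ExitWorkgroupStrict(numWorkGroups, gl_LocalInvocationIndex))
    {
        return;
    }

    // Only the last workgroup gets here and reads what the others wrote.
    vec4 value = imageLoad(mip5Image, ivec2(0, 0));
    imageStore(mip5Image, ivec2(0, 0), value);
}
```

The memory barrier is placed before the first barrier() so that every invocation's image stores are ordered ahead of the counter increment done by invocation 0. The early return is uniform across the workgroup because all invocations read the same shared value.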
