Skip to content
This repository was archived by the owner on Jan 1, 2025. It is now read-only.
This repository was archived by the owner on Jan 1, 2025. It is now read-only.

Division by zero caused by mask operation #243

@Chenyang-1024

Description

@Chenyang-1024

If each pixel in the input image does not belong to the q-th class, then when generating the mask for masked attention, attn_mask[b, q, :] = True will be converted to attn_mask[b, q, :] = float('-inf') in nn.MultiheadAttention. Finally, when attn_mask is used for the Softmax(attn_mask, dim=-1) operation to calculate the attention map, the NaN caused by the divide by 0 error will appear. : (
This problem came up when I applied masked attention to my semantic segmentation task. : (
image

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions