
Why is the mask in BertSelfOutput not passed through the MaskedLayerNorm? #160

@andrewboldi

Description

In the following code from deberta/bert.py, why is the mask not passed to MaskedLayerNorm in line 38? And if the mask is not needed, couldn't we call hidden_states = self.LayerNorm(hidden_states) directly?

class BertSelfOutput(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.dense = nn.Linear(config.hidden_size, config.hidden_size)
        self.LayerNorm = LayerNorm(config.hidden_size, config.layer_norm_eps)
        self.dropout = StableDropout(config.hidden_dropout_prob)
        self.config = config

    def forward(self, hidden_states, input_states, mask=None):
        hidden_states = self.dense(hidden_states)
        hidden_states = self.dropout(hidden_states)
        hidden_states += input_states
        hidden_states = MaskedLayerNorm(self.LayerNorm, hidden_states)
        return hidden_states
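For context, here is a minimal pure-Python sketch of the control flow I understand MaskedLayerNorm to have: normalize first, then zero out masked positions, and with mask=None simply return the plain LayerNorm output. This is an illustration only (the real implementation works on torch tensors); the helper names layer_norm and masked_layer_norm are mine, not from the repo. Under that assumption, calling it without a mask is indeed equivalent to self.LayerNorm(hidden_states).

```python
import math

def layer_norm(x, eps=1e-7):
    # Plain layer norm over the last dimension of a list-of-lists "tensor".
    out = []
    for row in x:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out

def masked_layer_norm(norm_fn, hidden, mask=None):
    # Sketch of a MaskedLayerNorm-style wrapper: normalize, then zero out
    # padded positions. With mask=None it reduces to norm_fn(hidden).
    out = norm_fn(hidden)
    if mask is None:
        return out
    return [[v * m for v in row] for row, m in zip(out, mask)]

x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
# No mask: identical to the bare layer norm call.
assert masked_layer_norm(layer_norm, x) == layer_norm(x)
# With a mask, position 0 is zeroed after normalization.
masked = masked_layer_norm(layer_norm, x, mask=[0, 1])
assert all(v == 0.0 for v in masked[0])
```

So the question stands: if forward never forwards its mask argument, the wrapper adds nothing over calling the LayerNorm module directly.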

Thank you in advance!
