I am trying to get the model's attention weights like this:
```python
outputs = self.model(
    vision_x=vision_x,
    lang_x=lang_x,
    attention_mask=attention_mask,
    clear_conditioned_layers=clear_conditioned_layers,
    past_key_values=past_key_values,
    use_cache=(past_key_values is not None),
    output_attentions=True,
)
```
However, the attention tuple it returns contains only `None` entries.
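For reference, this is roughly how I check the result (assuming the output exposes the standard Hugging Face `attentions` field):

```python
# Each entry should be a per-layer attention tensor, but every one is None.
attns = outputs.attentions
print(len(attns))                      # number of layers
print(all(a is None for a in attns))   # prints True -> no weights were returned
```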
I stepped into the code and it looks like a bug in the MPT code under "huggingface/modules/transformers_modules/": the `output_attentions` parameter is dropped when `MPTBlock.forward()` is called in blocks.py.
I tried to patch this, but when I run the model again the code reverts to its original version.
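The change I attempted looks roughly like the sketch below (hypothetical; the argument name `output_attentions` and the exact call-site signature are my assumptions for this MPT revision and may differ in the cached copy):

```python
# Forward the output_attentions flag into each block and collect the
# per-layer attention weights instead of silently dropping them.
all_self_attns = () if output_attentions else None
for block in self.blocks:
    x, attn_weights, past_key_value = block(
        x,
        past_key_value=past_key_value,
        attn_bias=attn_bias,
        attention_mask=attention_mask,
        is_causal=self.is_causal,
        output_attentions=output_attentions,  # currently not passed through to the block
    )
    if output_attentions:
        all_self_attns = all_self_attns + (attn_weights,)
```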
Is there a solution or workaround for this?