Making attention multi work with positive and negative activations #3

@neelnanda-io

It's nice to be able to plot a function on pairs of tokens, like attention, but where the values can be positive or negative (e.g. attention head logit contribution or attention head neuron contribution). Currently AttnMulti is bad at this. I think the easiest fix would be adjusting the colour map function to give each head a pair of colours, one for + and one for -.
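
To make the colour-pair idea concrete, here is a minimal sketch in plain Python. It is not AttnMulti's actual colour map code. The names `head_colour_pair` and `signed_colour` are hypothetical, and using the complementary hue for negative values and fading to white at zero are assumptions made for illustration.

```python
import colorsys


def head_colour_pair(head_index, n_heads):
    """Return (positive_rgb, negative_rgb) for one head (hypothetical helper).

    Assumption: positive hues are spread over [0, 0.5) and each negative hue
    is the complement (hue + 0.5), so no two heads share a colour.
    """
    hue = head_index / n_heads / 2
    positive = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    negative = colorsys.hsv_to_rgb(hue + 0.5, 1.0, 1.0)
    return positive, negative


def signed_colour(value, max_abs, head_index, n_heads):
    """Map a signed value to an RGB triple with components in [0, 1].

    Zero maps to white; magnitude moves the colour towards the head's
    positive or negative colour, saturating at |value| == max_abs.
    """
    positive, negative = head_colour_pair(head_index, n_heads)
    base = positive if value >= 0 else negative
    strength = min(abs(value) / max_abs, 1.0) if max_abs > 0 else 0.0
    # Linear blend from white (1, 1, 1) towards the base colour.
    return tuple(1.0 + strength * (c - 1.0) for c in base)
```

With this scheme, head 0 would show positive values in red and negative values in cyan, and a value of zero is white for every head. Normalising by a shared `max_abs` keeps intensities comparable across heads; a per-head maximum would be another reasonable choice.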
