Different key and query tokens?

Example use case: we have 3 tokens at the end of a prompt, and we want to see the attention probabilities from those tokens (the destination/query positions) back to every token in the sequence (the source/key positions). This could be done via something like:

```
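# Suggested call: src_tokens and dest_tokens are the proposed new parameters.
cv.attention.attention_patterns(
    attention=attention,
    src_tokens=tokens,        # keys: the full sequence
    dest_tokens=tokens[-3:],  # queries: only the last 3 tokens
)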
```

Not sure how difficult this would be to implement.
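
For what it's worth, the data side of this looks like a plain row slice; the harder part is probably the rendering, which would need to handle a non-square pattern. Below is a rough sketch of the slicing only, in plain Python. It assumes `attention` is laid out as `[head][dest_pos][src_pos]`, and `select_dest_rows` is a hypothetical helper name, not something in the library.

```python
from typing import List, Sequence

Attention = List[List[List[float]]]  # assumed layout: [head][dest_pos][src_pos]


def select_dest_rows(attention: Attention, dest_positions: Sequence[int]) -> Attention:
    """Keep only the rows for the given destination (query) positions.

    Hypothetical helper: every source (key) column is kept, so the
    result has shape [head][len(dest_positions)][src_pos].
    """
    return [[head[d] for d in dest_positions] for head in attention]


# Toy data: 2 heads, 5 tokens, uniform causal attention
# (position i attends equally to positions 0..i).
tokens = ["The", "cat", "sat", "on", "mats"]
n = len(tokens)
num_heads = 2
attention = [
    [[1.0 / (i + 1) if j <= i else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(num_heads)
]

# Last 3 tokens as destinations, whole sequence as sources.
dest_positions = range(n - 3, n)
dest_tokens = tokens[-3:]
src_tokens = tokens
sub_attention = select_dest_rows(attention, dest_positions)
```

Here `sub_attention` is a 2 x 3 x 5 pattern that would pair with `dest_tokens` and `src_tokens` in the call suggested above. Each row still sums to 1, since no source columns are dropped.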