Tests for self attention using CumConcatLayer #590
This should be ready now. I left in
There is also I did not check in detail, is this really exactly the same, or just similar? What is the difference?
Oh okay, I missed that. The test case in this PR also checks that
Hm, I don't know. It might be nice as an example. I would say leave it. |
Some tests for #391.
I didn't run any of this yet (so most likely buggy) because there is no implementation.
It still needs CumConcatLayer from #589, but also the common arguments for DotLayer from #569.
Also, some way to set dim tags (e.g. set_dim_tags: Dict[str, DimensionTag] for ReinterpretDataLayer. I'm not sure, didn't we add this already somewhere?)
We might also want more tests than this: e.g. some with positional encoding.
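For illustration, here is a rough framework-independent sketch of the property such a test would probably check: attending step by step over cumulatively concatenated keys and values should give the same result as masked self attention over the whole sequence. This is plain Python and does not use the RETURNN API. The function names (stepwise_self_attention, masked_self_attention) are made up for this example, and the 1/sqrt(dim) scaling is an assumption.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]


def dot(a, b):
    """Dot product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))


def attend(query, keys, values):
    """Single-query dot-product attention over the given keys and values."""
    scale = 1.0 / math.sqrt(len(query))  # assumed scaling
    weights = softmax([dot(query, k) * scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def stepwise_self_attention(q, k, v):
    """Per step: append the new key/value to a growing history, then attend."""
    keys_so_far, values_so_far, out = [], [], []
    for t in range(len(q)):
        keys_so_far.append(k[t])  # cumulative concat along a growing axis
        values_so_far.append(v[t])
        out.append(attend(q[t], keys_so_far, values_so_far))
    return out


def masked_self_attention(q, k, v):
    """All steps over the full sequence, with future positions set to -inf."""
    n, dim = len(q), len(v[0])
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for t in range(n):
        energies = [dot(q[t], k[s]) * scale if s <= t else -math.inf
                    for s in range(n)]
        weights = softmax(energies)
        out.append([sum(w * v[s][d] for s, w in enumerate(weights))
                    for d in range(dim)])
    return out


if __name__ == "__main__":
    rng = random.Random(42)
    n_time, n_dim = 5, 4
    q, k, v = ([[rng.gauss(0, 1) for _ in range(n_dim)] for _ in range(n_time)]
               for _ in range(3))
    a = stepwise_self_attention(q, k, v)
    b = masked_self_attention(q, k, v)
    same = all(math.isclose(x, y, rel_tol=1e-9, abs_tol=1e-12)
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    print("step-wise and masked variants agree:", same)
```

A variant with positional encoding could reuse the same comparison, adding the encoding to q and k before both calls.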