I hope this is the appropriate venue to post this. I don't have an implementation yet, but maybe this ticket could encourage some work.
I am currently interested in this stochastic depth paper:
http://arxiv.org/pdf/1603.09382v2.pdf
I was going to have a go at implementing this, but I am stumped by the identity transform mentioned in equation (2). If the current layer and the next layer have different output shapes, the output of the current layer has to be linearly projected so that it matches the dimensions of the next layer's output. I'm not clear on how this is done, and I'm afraid the answer is blatantly obvious. Is the "projection matrix" (or whatever it's called) a matrix of some appropriate shape consisting solely of ones? And how would this be done for convolutional networks?
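For reference, here is a minimal NumPy sketch of the two shortcut choices described in the ResNet paper that stochastic depth builds on: a parameter-free shortcut that zero-pads the extra channels, and a learned 1x1 convolution. In the learned case the weights are trained like any other layer rather than fixed to ones. The function names, the NCHW layout, and the shapes below are my own assumptions for illustration, and I have not checked which variant the stochastic depth authors use.

```python
import numpy as np

def shortcut_pad(x, out_channels, stride=1):
    """Parameter-free shortcut (hypothetical helper).

    Subsamples spatially with the given stride, then appends all-zero
    channels until there are out_channels of them. x is (N, C, H, W).
    """
    x = x[:, :, ::stride, ::stride]
    extra = out_channels - x.shape[1]
    return np.pad(x, ((0, 0), (0, extra), (0, 0), (0, 0)), mode="constant")

def shortcut_project(x, w, stride=1):
    """Learned shortcut (hypothetical helper).

    A strided 1x1 convolution is the same as subsampling and then applying
    one matrix multiply over the channel axis at every pixel.
    w has shape (out_channels, in_channels).
    """
    x = x[:, :, ::stride, ::stride]
    return np.einsum("oc,nchw->nohw", w, x)

# Toy input: batch of 2, 16 channels, 8x8 feature maps.
x = np.ones((2, 16, 8, 8))
# Stand-in for trained weights; in a real network these would be learned.
w = np.random.default_rng(0).normal(size=(32, 16))

padded = shortcut_pad(x, 32, stride=2)
projected = shortcut_project(x, w, stride=2)
```

Both results have shape `(2, 32, 4, 4)`, so either can be added to the output of a block that doubles the channels and halves the spatial size.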
It seems like that's the only roadblock for me -- the binomial mask is easy to do.
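A rough sketch of the mask, as I read the paper: during training, a single Bernoulli draw `b` per block gates the residual branch, giving `ReLU(b * f(x) + id(x))`; at test time the branch is kept and scaled by its survival probability. The names `stochastic_depth_block`, `residual_fn`, and `survival_prob` are hypothetical, the identity here assumes matching shapes, and drawing once per mini-batch is my assumption.

```python
import numpy as np

def stochastic_depth_block(x, residual_fn, survival_prob, rng, train=True):
    """One residual block with a stochastic-depth gate (illustrative only).

    residual_fn plays the role of f_l; x is passed through unchanged on
    the shortcut, so input and output shapes are assumed to match.
    """
    if train:
        # One Bernoulli draw decides whether the whole branch is active.
        b = rng.binomial(1, survival_prob)
        if b == 0:
            # Branch dropped: skip computing residual_fn entirely.
            return np.maximum(x, 0.0)
        return np.maximum(residual_fn(x) + x, 0.0)
    # Test time: keep the branch, scaled by its survival probability.
    return np.maximum(survival_prob * residual_fn(x) + x, 0.0)

rng = np.random.default_rng(0)
x = np.array([[-1.0, 2.0, 3.0]])

def double(h):
    # Stand-in for the convolutional branch of a real block.
    return 2.0 * h

dropped = stochastic_depth_block(x, double, 0.0, rng)            # gate always 0
kept = stochastic_depth_block(x, double, 1.0, rng)               # gate always 1
scaled = stochastic_depth_block(x, double, 0.5, rng, train=False)
```

Setting `survival_prob` to 0 or 1 makes the gate deterministic, which is a convenient way to check both code paths.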
Let me know what you think.
PS: Interestingly, I found a post asking how to go about implementing this, but it seems to omit the identity transform: