Could you explain the difference between sg_target and gradient in your code? Mainly, which one backpropagates the synthetic gradient all the way back? Is it future_grad?
I know this is not an issue, but I would like to rewrite the code on my own in TensorFlow.
Any help is much appreciated!!