Right now, batch normalization uses the statistics of the current minibatch directly, during both training and testing.
We need to fine-tune the batch normalization parameters (e.g., lower the decay) in order to get good moving average estimates.
Once these parameters are tuned, moving averages will be used during evaluation.
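
To illustrate why the decay matters, here is a minimal sketch in plain Python. It assumes the common convention `moving = decay * moving + (1 - decay) * batch_value`; the function names, the synthetic data (mean 5, standard deviation 2) and the step count are hypothetical and not taken from this project's code. It is meant only to show the effect, not to reproduce the actual implementation.

```python
import random

def batch_stats(batch):
    """Return the mean and (biased) variance of one minibatch."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return mean, var

def update_moving(moving, batch_value, decay):
    """Exponential moving average update (assumed convention)."""
    return decay * moving + (1.0 - decay) * batch_value

def normalize(batch, mean, var, eps=1e-5):
    """Normalize a batch with the given statistics."""
    return [(x - mean) / (var + eps) ** 0.5 for x in batch]

def train_moving_stats(decay, steps=200, batch_size=32, seed=0):
    """Simulate training steps and return the moving mean and variance."""
    rng = random.Random(seed)
    moving_mean, moving_var = 0.0, 1.0  # typical initial values
    for _ in range(steps):
        # Hypothetical activations drawn from N(5, 2^2).
        batch = [rng.gauss(5.0, 2.0) for _ in range(batch_size)]
        mean, var = batch_stats(batch)
        moving_mean = update_moving(moving_mean, mean, decay)
        moving_var = update_moving(moving_var, var, decay)
    return moving_mean, moving_var

# A high decay has not converged after 200 steps; a lower decay has.
slow_mean, slow_var = train_moving_stats(decay=0.999)
fast_mean, fast_var = train_moving_stats(decay=0.9)

# Evaluation mode: normalize with the moving averages, not batch statistics.
eval_batch = [3.0, 5.0, 7.0]
eval_out = normalize(eval_batch, fast_mean, fast_var)
```

With few training steps, a decay close to 1 leaves the moving averages near their initial values, so normalizing with them at evaluation time would give poorly scaled activations. Lowering the decay lets the estimates track the data sooner, at the cost of noisier averages.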