Fine-tune batch normalization parameters

Right now, batch normalization is applied using the statistics of the current minibatch directly, both during training and at test time.

We need to fine-tune the batch normalization parameters (e.g., lower the decay) so that the moving averages of the mean and variance become good estimates.
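
To illustrate why the decay matters, here is a minimal sketch of the usual exponential moving average update, assuming the convention where `decay` is the weight kept on the old estimate. The function names and the numbers are hypothetical and not taken from this codebase.

```python
def update_moving_average(moving, batch_value, decay):
    """One exponential moving average step.

    Assumed convention: `decay` is the weight on the previous estimate,
    so a lower decay lets the estimate follow the batch statistics faster.
    """
    return decay * moving + (1.0 - decay) * batch_value


def estimate_after_steps(true_mean, decay, steps, init=0.0):
    """Run `steps` updates where every batch reports exactly `true_mean`."""
    moving = init
    for _ in range(steps):
        moving = update_moving_average(moving, true_mean, decay)
    return moving


if __name__ == "__main__":
    # Hypothetical setting: the true mean is 5.0 and training lasts 100 steps.
    for decay in (0.999, 0.99, 0.9):
        print(decay, estimate_after_steps(5.0, decay, steps=100))
```

With a decay close to 1 and a short training run, the moving average stays near its initial value, which is one reason a lower decay can give better estimates.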

Once these parameters are tuned, the moving averages will be used during evaluation instead of the minibatch statistics.
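
The intended behavior could look roughly like the sketch below: minibatch statistics while training, moving averages at evaluation. It is a framework-agnostic illustration for a single feature; `batch_norm_1d`, `state`, and `is_training` are hypothetical names, and the learned scale and shift are omitted for brevity.

```python
import math


def batch_norm_1d(values, state, is_training, decay=0.9, eps=1e-5):
    """Normalize a list of scalars (one feature across a minibatch).

    `state` is a dict holding the moving "mean" and "var" (hypothetical layout).
    Training: normalize with the minibatch statistics and update the moving averages.
    Evaluation: normalize with the stored moving averages and leave them unchanged.
    """
    if is_training:
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        state["mean"] = decay * state["mean"] + (1.0 - decay) * mean
        state["var"] = decay * state["var"] + (1.0 - decay) * var
    else:
        mean, var = state["mean"], state["var"]
    return [(v - mean) / math.sqrt(var + eps) for v in values]


if __name__ == "__main__":
    state = {"mean": 0.0, "var": 1.0}
    print(batch_norm_1d([1.0, 3.0], state, is_training=True))
    print(batch_norm_1d([1.0, 3.0], state, is_training=False))
```

At evaluation the output no longer depends on which other samples happen to be in the batch, so a single example can be normalized on its own.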