Need a better explanation of initializing the weights of a neural network from a Gaussian distribution.
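
To make the question concrete, here is a minimal sketch of what Gaussian initialization usually means: every weight is drawn independently from a normal distribution with mean 0 and a small standard deviation. The function name `gaussian_init`, the layer sizes, and the default scaling `std = 1 / sqrt(fan_in)` are assumptions chosen for illustration, not something stated in the question. That scaling is one common heuristic: for independent, unit-variance inputs it keeps the variance of each neuron's weighted sum close to 1.

```python
import math
import random


def gaussian_init(fan_in, fan_out, std=None, seed=0):
    """Return a fan_in x fan_out weight matrix with entries drawn from N(0, std^2).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    if std is None:
        # Assumed default: scale by 1/sqrt(fan_in) so that the sum of
        # fan_in weighted unit-variance inputs has variance close to 1.
        std = 1.0 / math.sqrt(fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


# Example layer: 256 inputs feeding 64 neurons (sizes are arbitrary).
weights = gaussian_init(fan_in=256, fan_out=64)

# Check the empirical statistics of the sampled weights.
flat = [w for row in weights for w in row]
sample_mean = sum(flat) / len(flat)
sample_std = math.sqrt(sum((w - sample_mean) ** 2 for w in flat) / len(flat))
print(f"mean={sample_mean:.4f}, std={sample_std:.4f}, target std={1 / 16:.4f}")
```

The mean is zero so that no neuron starts out biased toward a positive or negative output, and the weights are random rather than all equal so that different neurons receive different gradients and do not learn identical features. A standard deviation that is too large can saturate activations such as sigmoid or tanh, while one that is too small makes the signal shrink from layer to layer.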