Verify that the gradients are correct and that their variance decreases as the batch size increases. For each model, write an example use case.
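One possible way to run both gradient checks is sketched below. It is only an illustration under stated assumptions: the source does not say which models are meant, so a linear least-squares model stands in for them, and the names `loss`, `grad`, `numerical_grad`, and `minibatch_grad_variance` are hypothetical helpers rather than part of any given codebase. Correctness is checked by comparing the analytic gradient with a central finite-difference estimate, and the variance is measured empirically over repeated random minibatches.

```python
import numpy as np

def loss(w, X, y):
    """Half mean squared error of a linear model (assumed stand-in model)."""
    residual = X @ w - y
    return 0.5 * np.mean(residual ** 2)

def grad(w, X, y):
    """Analytic gradient of `loss` with respect to w."""
    return X.T @ (X @ w - y) / len(y)

def numerical_grad(w, X, y, eps=1e-6):
    """Central finite-difference approximation of the gradient."""
    g = np.zeros_like(w)
    for i in range(len(w)):
        step = np.zeros_like(w)
        step[i] = eps
        g[i] = (loss(w + step, X, y) - loss(w - step, X, y)) / (2 * eps)
    return g

def minibatch_grad_variance(w, X, y, batch_size, n_trials, rng):
    """Empirical variance of the minibatch gradient, summed over coordinates."""
    grads = np.empty((n_trials, len(w)))
    for t in range(n_trials):
        idx = rng.choice(len(y), size=batch_size, replace=False)
        grads[t] = grad(w, X[idx], y[idx])
    return grads.var(axis=0).sum()

# Synthetic data; sizes and noise level are arbitrary choices for the sketch.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=2000)
w = np.zeros(5)

# Check 1: the analytic gradient should match the finite-difference estimate.
grad_error = np.max(np.abs(grad(w, X, y) - numerical_grad(w, X, y)))
print("max abs gradient error:", grad_error)

# Check 2: the gradient variance should shrink as the batch size grows.
variances = {b: minibatch_grad_variance(w, X, y, b, 2000, rng) for b in (4, 32, 256)}
for b, v in variances.items():
    print(f"batch size {b:4d}: gradient variance {v:.6f}")
```

For another model, the same two checks should carry over by swapping in that model's `loss` and `grad`; for a non-quadratic loss the finite-difference comparison is only approximate, so the tolerance may need loosening.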