For example, during training the total_loss gets as high as 4.0. Why does this happen? Do you divide the loss by batch_size? Please help me.
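
To make the question concrete, here is a minimal sketch of what I mean by dividing by the batch size. The function name `total_loss`, the per-sample loss values, and the batch size of 8 are all made up for illustration and are not taken from this project's code; I don't know which form the project actually uses.

```python
def total_loss(per_sample_losses, divide_by_batch_size):
    """Reduce a list of per-sample losses to a single number (hypothetical helper)."""
    summed = sum(per_sample_losses)
    if divide_by_batch_size:
        # Mean reduction: the value does not grow with the batch size.
        return summed / len(per_sample_losses)
    # Sum reduction: the value scales with the batch size.
    return summed


# Hypothetical batch of 8 samples, each with a loss of 0.5.
batch = [0.5] * 8

summed_loss = total_loss(batch, divide_by_batch_size=False)
mean_loss = total_loss(batch, divide_by_batch_size=True)

print("sum over batch:", summed_loss)
print("mean over batch:", mean_loss)
```

If the loss is summed rather than averaged, a large total_loss could simply reflect the batch size and not a problem with training.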