Because of our RIB rotations, the variance node can take on arbitrary values, including negative ones.
"Fixes" include var = torch.abs(var). (This is actually fine -- if var becomes negative then even abs(var) is gonna be a nonsense value and the loss is gonna be terrible still.)
Think about which option is least terrible and implement it.
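A minimal sketch of the candidate options, for comparison. The helper name make_positive, the eps floor, and the method names are all made up here -- this is not the actual fix, just the trade-offs side by side:

```python
import torch
import torch.nn.functional as F


def make_positive(var: torch.Tensor, method: str = "softplus",
                  eps: float = 1e-8) -> torch.Tensor:
    """Map a possibly-negative variance tensor to strictly positive values.

    Trade-offs (hypothetical menu, not a recommendation):
      - "abs":      nonsense in -> nonsense out; loss stays terrible, which
                    at least keeps the problem visible.
      - "clamp":    hard floor at eps; gradient is zero below the floor, so
                    a clamped value cannot be pushed back by training.
      - "softplus": smooth and positive everywhere; gradients never vanish,
                    but small negative inputs get quietly mapped near log(2).
    """
    if method == "abs":
        return torch.abs(var) + eps
    if method == "clamp":
        return torch.clamp(var, min=eps)
    if method == "softplus":
        return F.softplus(var) + eps
    raise ValueError(f"unknown method: {method}")


var = torch.tensor([-0.5, 0.0, 2.0])
for m in ("abs", "clamp", "softplus"):
    out = make_positive(var, m)
    assert (out > 0).all()  # every variant yields strictly positive variance
```

Note that "clamp" kills the gradient for any value driven below eps, while "softplus" keeps gradients alive but changes the effective scale of small variances; "abs" is the only one whose badness stays loud in the loss.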