index = torch.randint(0, self.num_ddim_timesteps, (x.shape[0],), device=self.device).long()
Maybe the timestep for the win and lose samples should be the same.
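To make the suggestion concrete, here is a minimal sketch of drawing one timestep index per preference pair and reusing it for both samples. The helper names (`sample_shared_timesteps`, `add_noise`), the toy `alphas_cumprod` schedule, and the choice to also share the noise tensor are my assumptions for illustration, not code from this repo.

```python
import torch

def sample_shared_timesteps(batch_size, num_timesteps, device="cpu"):
    """Draw one timestep index per preference pair (hypothetical helper)."""
    # torch.randint returns int64 by default, so no extra .long() is needed.
    return torch.randint(0, num_timesteps, (batch_size,), device=device)

def add_noise(x0, noise, alphas_cumprod, index):
    """Forward diffusion: sqrt(a_bar) * x0 + sqrt(1 - a_bar) * noise."""
    # Reshape a_bar to (B, 1, 1, ...) so it broadcasts over the sample dims.
    a_bar = alphas_cumprod[index].view(-1, *([1] * (x0.dim() - 1)))
    return a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise

torch.manual_seed(0)
num_timesteps = 50
alphas_cumprod = torch.linspace(0.999, 0.01, num_timesteps)  # toy schedule (assumption)
x_win = torch.randn(4, 3, 8, 8)   # stand-in for preferred samples
x_lose = torch.randn(4, 3, 8, 8)  # stand-in for rejected samples

# Sample the index once and reuse it for both members of each pair.
index = sample_shared_timesteps(x_win.shape[0], num_timesteps)
noise = torch.randn_like(x_win)  # sharing the noise too is an assumption
noisy_win = add_noise(x_win, noise, alphas_cumprod, index)
noisy_lose = add_noise(x_lose, noise, alphas_cumprod, index)
```

With a shared `index`, the win and lose errors for a pair are measured at the same noise level, so their difference is not dominated by the timestep gap.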
The curve of my training process is shown below. Why is win_diff also increasing while lose_diff is increasing? In principle, shouldn't lose_diff keep increasing and win_diff keep decreasing?
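One possible reading, sketched under assumptions: if win_diff and lose_diff are each the model's denoising error minus the reference model's error (on the win and lose samples respectively), and the loss is a DPO-style log-sigmoid of their difference, then the loss only depends on the gap between the two. The function name `dpo_style_loss`, the value `beta=1.0`, and the numbers below are hypothetical and chosen only to illustrate the arithmetic.

```python
import math

def dpo_style_loss(win_diff, lose_diff, beta=1.0):
    """-log(sigmoid(-beta * (win_diff - lose_diff))) for one pair (assumed form)."""
    margin = -beta * (win_diff - lose_diff)
    # log1p(exp(-margin)) is equal to -log(sigmoid(margin)).
    return math.log1p(math.exp(-margin))

# Both diffs grow between the two snapshots, but lose_diff grows faster.
early = dpo_style_loss(win_diff=0.1, lose_diff=0.3)
later = dpo_style_loss(win_diff=0.2, lose_diff=0.6)
print(early > later)  # the loss still goes down
```

Under that assumed form, the loss can keep decreasing while win_diff rises, as long as lose_diff rises faster. Whether this matches the curves here depends on how the two quantities are actually computed.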
