I just tested your code, and it turns out that with
.chunk(num_patches, dim=0)
you are actually splitting different samples into one group and encouraging them to be similar. What you should do is set dim=1, so that the augmented views from one sample are grouped together.
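Here is a small sketch of the difference. It assumes the embeddings are laid out as (batch, num_patches, feature_dim), which I have not confirmed against the repository; the sizes and variable names are made up for illustration. If the views are instead concatenated along dim 0, the grouping works out differently.

```python
import torch

batch_size, num_patches, feat_dim = 4, 2, 3  # hypothetical sizes

# Assumed layout: z[i, j] is the embedding of view j of sample i.
z = torch.arange(batch_size * num_patches * feat_dim, dtype=torch.float32)
z = z.reshape(batch_size, num_patches, feat_dim)

# dim=0 cuts along the sample axis, so each chunk holds different samples.
by_sample_axis = z.chunk(num_patches, dim=0)

# dim=1 cuts along the view axis: chunk j holds view j of every sample,
# so row i of every chunk belongs to the same sample i.
by_view_axis = z.chunk(num_patches, dim=1)

shapes_dim0 = [tuple(c.shape) for c in by_sample_axis]
shapes_dim1 = [tuple(c.shape) for c in by_view_axis]
print(shapes_dim0)
print(shapes_dim1)
```

Under this layout, only the dim=1 chunks line up row by row with the same sample, which is what a loss that pulls views together needs.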