This seems related to using accelerator.accumulate() together with DeepSpeed’s ZeRO Stage 2 (and also Stage 1). It might be due to a version incompatibility between accelerate and deepspeed. Could you please share the specific versions of accelerate and deepspeed you are using?
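
In case it helps, here is a minimal sketch for collecting those versions from the same Python environment that runs the training script. It only uses the standard library's `importlib.metadata`; the helper name `installed_versions` is made up for illustration and is not part of either library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("accelerate", "deepspeed")):
    """Return a {package_name: version string or None} mapping.

    Hypothetical helper for this thread, not an accelerate or deepspeed API.
    """
    found = {}
    for name in packages:
        try:
            # Reads the installed distribution's metadata without importing it.
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running `accelerate env` or DeepSpeed's `ds_report` from the command line should give a fuller environment report, if you would rather paste that output.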