Description
Hi, thank you for this interesting and insightful work.
I tried to train GPT2 with your code for SIM-CoT reasoning, but during training I observed strange model behavior.
Firstly, as I understand from the README and the training configs, training a model for SIM-CoT requires three stages:
- Train in the CoT regime using SFT
- Train in the Coconut regime
- Train in the SIM-CoT regime
I tried to change as little as possible in your code and configs when running my experiments: I only changed the precision to bf16 and added TensorBoard support. However, when I trained the model in the CoT regime, the validation loss would not go below 0.6, and the final validation accuracy was only 8%. Previously, I had trained GPT2 for CoT using my own code, and the model reached a validation loss of 0.1 and an accuracy of 40%. I could not find the problem in your code or the reason why I could not reach similar metrics with it. Have you encountered similar issues?
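For reference, the only substantive change I made was switching the forward/backward pass to bf16 mixed precision. A minimal sketch of what that change looks like (model and data here are placeholders, not the repo's actual code):

```python
import torch
import torch.nn as nn

# Placeholder model standing in for the GPT2 training step;
# the real code keeps fp32 master weights and only autocasts compute.
model = nn.Linear(8, 8)
x = torch.randn(2, 8)

# bf16 autocast: matmul-heavy ops run in bfloat16 inside this region.
# (On GPU one would use device_type="cuda"; "cpu" is used here so the
# sketch runs anywhere.)
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = model(x)

print(out.dtype)  # torch.bfloat16 inside the autocast region
```

Unlike fp16, bf16 keeps the fp32 exponent range, so no loss scaling is needed; I would not expect this change alone to explain the loss gap.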
Secondly, when I continued training in the Coconut regime, the loss behavior was also strange. With each latent-schedule iteration the validation loss only increased: it reached its minimum early in training and then climbed to an absolute value of 1.2 by the end. The accuracy behaved correspondingly, peaking at 23% and then falling to 16%. Could you also clarify this behavior?
I also launched SIM-CoT training (from a standard GPT2 checkpoint) and got the same validation loss behavior as in the Coconut training, only with an accuracy of just 4%.
Thank you in advance for your response!