Thank you for your great work!
I'm conducting various experiments to condition the decoder on observations.
In your ablation study for the observation-conditioned decoder, were all hyperparameters the same as in the released code? Also, how were the observation tokens constructed?
In some conditioning experiments, I've observed cases where the autoencoder's grad_norm increases. Could this indicate potential issues with training?
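To be concrete about what I mean by grad_norm: I'm assuming the usual global L2 norm taken over all of the autoencoder's gradients, which may differ from what your code logs. A minimal sketch in plain Python, where the function names and the window size are hypothetical choices of mine and not taken from the released code:

```python
import math


def global_grad_norm(grads):
    """Global L2 norm over all gradient values.

    `grads` is a list of per-parameter gradient lists (hypothetical layout).
    """
    return math.sqrt(sum(g * g for group in grads for g in group))


def is_trending_up(norms, window=3):
    """Return True if the mean of the last `window` logged norms is larger
    than the mean of the `window` norms before them."""
    if len(norms) < 2 * window:
        return False  # not enough history to compare two windows
    recent = sum(norms[-window:]) / window
    earlier = sum(norms[-2 * window:-window]) / window
    return recent > earlier
```

By "increases" I mean that a check like `is_trending_up` over the logged norms keeps returning True across a run, rather than a single isolated spike.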
Congratulations on having your paper accepted at a top conference!