Why does my loss become NaN after training for a while? My input and output channels are both 30, and the sequence length is 10.