Firstly, thank you for your excellent work on extending your PLNet framework to an HNN structure. The theoretical results look really clean, and I have been working on replicating your results on alternative datasets.
I have been adapting your Port-Hamiltonian structure (Section III-B of the paper) to a non-autonomous Input-Output (I/O) modeling setting, where I use an encoder to derive latent state embeddings from windowed measurements of inputs and outputs. However, I have encountered two problems, and I was wondering if you have noticed these during your work as well:
- I have noticed significant training instability when fitting the Stable Port-Hamiltonian Neural Dynamics (pH-SHND) model to I/O data. In particular, gradients frequently explode after initially low training/validation losses (an example is given below):
```
Training model: IO_pH_SHND
Epoch 1/100 | Train Loss: 0.075884 | Val Loss: 0.047648
Epoch 10/100 | Train Loss: 0.001505 | Val Loss: 0.001205
Epoch 20/100 | Train Loss: 0.000694 | Val Loss: 0.002039
Epoch 30/100 | Train Loss: 0.000336 | Val Loss: 0.000529
Epoch 40/100 | Train Loss: 0.001453 | Val Loss: 0.000416
Epoch 50/100 | Train Loss: 0.000158 | Val Loss: 0.000122
```
While I can reduce the chances of this happening with careful (and precise) tuning of the cosine learning-rate scheduler and with early stopping, my first impression is that the Port-Hamiltonian SHND is particularly fragile compared to architectures such as an Input Convex Neural Network (ICNN) embedded within an encoder-decoder structure.
I would appreciate any insights into whether this fragility has been observed on your end, or whether the model is known to be more sensitive in a non-autonomous context.
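For reference, the other mitigation I have been experimenting with is global-norm gradient clipping applied before each optimizer step. A minimal NumPy sketch of the rule (the function name and threshold are my own, illustrative choices, not from your codebase):

```python
import numpy as np

def clip_global_norm(grads, max_norm):
    """Rescale a list of gradient arrays so that their global L2 norm
    does not exceed max_norm; also return the pre-clipping norm, which
    is useful for logging when explosions start."""
    total_norm = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / (total_norm + 1e-12)
        grads = [g * scale for g in grads]
    return grads, total_norm

# Example: an exploding gradient of norm 50 gets rescaled to norm <= 1.0
grads = [np.array([30.0, 40.0])]
clipped, norm = clip_global_norm(grads, max_norm=1.0)
print(norm)                        # 50.0 (pre-clipping norm)
print(np.linalg.norm(clipped[0]))  # ~1.0 after clipping
```

Logging the pre-clipping norm each step has at least let me see how far ahead of the loss spike the gradients begin to grow, though it of course only masks rather than explains the instability.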
- A more theoretical question: does introducing an encoder (mapping windowed observations to a latent state) compromise the guarantees around passivity, dissipativity, or contraction? My intuition is that, whether the encoder is fixed or learned jointly, the structural constraints on the overall input-output dynamics should still be preserved, since the encoder is essentially acting as a state estimator rather than adding anything to the structure itself. But I would love to hear your thoughts on this.
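To make my setup concrete, this is the shape of the encoder I am using; all dimensions and the linear parameterization below are illustrative (my actual encoder is a small network learned jointly with the dynamics):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a window of H past (input, output) pairs
# is mapped to a latent state z of dimension n_z.
H, n_u, n_y, n_z = 10, 1, 1, 4

# Encoder e: R^{H*(n_u+n_y)} -> R^{n_z}; a linear map for illustration.
W_e = rng.standard_normal((n_z, H * (n_u + n_y)))

def encode(u_window, y_window):
    """Stack the windowed I/O measurements and map them to a latent state.
    The map acts purely as a state estimator: the latent pH dynamics
    themselves are untouched by it."""
    w = np.concatenate([u_window.ravel(), y_window.ravel()])
    return W_e @ w

u_win = rng.standard_normal((H, n_u))
y_win = rng.standard_normal((H, n_y))
z0 = encode(u_win, y_win)  # initial latent state fed to the pH-SHND model
print(z0.shape)            # (4,)
```

So the encoder only supplies the initial condition in latent coordinates; the structured dynamics then evolve that state forward, which is why I would expect the structural guarantees to carry over.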
Thanks for your time!