I get the output below (a warning, not a fatal error) when running the main function:
constructing SpatialTransformer_ of depth 1 w/ 512 channels and 16 heads
Attention mode 'linear' is not available. Falling back to native attention. This is not a problem in Pytorch >= 2.0. FYI, you are running with PyTorch version 2.4.0
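For context: this warning typically means the optional xformers package could not be imported, so the attention layer falls back to PyTorch's native implementation. On PyTorch >= 2.0 (you are on 2.4.0) that fallback uses the built-in scaled_dot_product_attention, so the message is informational rather than an error. As a minimal sketch (assuming the check is a simple import-availability test, which is an assumption about the library's internals), you can verify whether xformers is importable in your environment:

```python
import importlib.util

def has_module(name: str) -> bool:
    """Return True if the named package can be imported in this environment."""
    return importlib.util.find_spec(name) is not None

# Hypothetical check mirroring the warning's trigger: if 'xformers' is
# missing, the code falls back to native PyTorch attention, which is
# fine on PyTorch >= 2.0.
print("xformers available:", has_module("xformers"))
```

If you want the requested attention mode instead of the fallback, installing a version of xformers built against your PyTorch/CUDA combination should silence the warning; otherwise it is safe to ignore on PyTorch 2.x.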