Releases: PolymathicAI/the_well
v1.2.0
The Well v1.2.0 (Nov 11, 2025)
Enhancements
Metrics
Added the following metrics:
- Histogram W1 (2e035d7)
- Dynamic time warping (51e517d)
- Normalized L1 (0b7bb44)
The first two evaluate results over full trajectories rather than individual frames, at varying levels of granularity. W1 can be useful for tracking whether fields have the correct distribution of intensities, though it ignores local structure. DTW, on the other hand, can be useful for evaluating whether trajectories are simply offset by a difference in propagation speed.
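As a rough illustration of what these two metrics measure, here is a minimal pure-Python sketch. It is not the implementation from the commits above: the function names histogram_w1 and dtw_distance, the bin count, and the absolute-difference cost are all assumptions made for this example.

```python
def histogram_w1(x, y, bins=32):
    """Approximate 1D Wasserstein-1 distance between two sets of field values.

    Hypothetical helper: bins both inputs on a shared range and integrates
    the absolute difference of the two cumulative histograms.
    """
    lo, hi = min(min(x), min(y)), max(max(x), max(y))
    if hi == lo:
        return 0.0
    width = (hi - lo) / bins

    def normalized_histogram(values):
        counts = [0] * bins
        for v in values:
            # Clamp so the maximum value falls in the last bin.
            counts[min(int((v - lo) / width), bins - 1)] += 1
        return [c / len(values) for c in counts]

    cdf_x = cdf_y = total = 0.0
    for px, py in zip(normalized_histogram(x), normalized_histogram(y)):
        cdf_x += px
        cdf_y += py
        total += abs(cdf_x - cdf_y) * width
    return total


def dtw_distance(a, b):
    """Classic dynamic time warping cost with an absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# W1 only sees the distribution of values, so reordering a field changes nothing.
field = [0.0, 0.1, 0.5, 0.9, 1.0, 0.3]
w1_shuffled = histogram_w1(field, field[::-1])
w1_shifted = histogram_w1(field, [v + 0.5 for v in field])

# The same pulse arriving one step early: a framewise error penalizes it, DTW does not.
pulse = [0, 0, 1, 2, 1, 0, 0]
early_pulse = [0, 1, 2, 1, 0, 0, 0]
framewise_l1 = sum(abs(p - q) for p, q in zip(pulse, early_pulse))
dtw = dtw_distance(pulse, early_pulse)
```

In this toy case the reordered field has zero W1 even though its spatial layout differs, and the early pulse has zero DTW cost even though its framewise L1 error is nonzero.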
Additionally:
- Added explicit epsilon parameter to metrics in the forward pass (c73b1cb)
- Updated the default video code to be slightly more adaptive and parameterizable (6911c98)
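To illustrate why an explicit epsilon is useful, the sketch below uses a normalized L1 error whose denominator can vanish. The function name, the normalization by the target's L1 norm, and the default value of eps are assumptions for this example and may differ from the metrics in the commits above.

```python
def normalized_l1(pred, target, eps=1e-7):
    """L1 error divided by the L1 norm of the target (assumed convention).

    eps keeps the ratio finite when the target is identically zero.
    """
    error = sum(abs(p - t) for p, t in zip(pred, target))
    scale = sum(abs(t) for t in target)
    return error / (scale + eps)


typical = normalized_l1([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
zero_target = normalized_l1([0.0, 0.0], [0.0, 0.0])  # no ZeroDivisionError
```

Exposing eps as a parameter lets the caller pick a value suited to the scale of the field rather than relying on a hard-coded constant.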
Data utilities
Tools for normalizing predictions between methods:
- Min time prediction option (ae93512) - Allows the user to specify which step is the first to be predicted in full trajectory rollouts. Useful for comparing models with different context lengths, particularly on datasets that transition between different regimes.
- Added option to restrict the number of trajectories or samples for limited training experiments (e41ded5) - Restricts data to a random subset of the original data in order to simulate low-data settings.
- Option to pass unnormalized time (eee8405) - Allows passing unnormalized time; mostly useful here for writing tests. Note: be wary of using this to overfit models to specific parts of datasets with regime transitions. No data in the Well are truly time-dependent; the fact that time may be useful here is an artifact of dataset construction.
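The first two options can be pictured with the following sketch. The helper names rollout_split and random_subset and their parameters are hypothetical and do not correspond to the actual arguments of the Well dataset classes; the sketch only shows the intended effect.

```python
import random


def rollout_split(n_steps, context_length, first_predicted_step=None):
    """Return (context indices, target indices) for a full-trajectory rollout.

    Hypothetical helper: by default the first predicted step comes right
    after the context, but it can be pushed later so that models with
    different context lengths are scored on the same steps.
    """
    if first_predicted_step is None:
        first_predicted_step = context_length
    if not context_length <= first_predicted_step < n_steps:
        raise ValueError("first_predicted_step must leave room for the context")
    context = list(range(first_predicted_step - context_length, first_predicted_step))
    targets = list(range(first_predicted_step, n_steps))
    return context, targets


def random_subset(n_total, n_keep, seed=0):
    """Pick a reproducible random subset of trajectory indices."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_total), n_keep))


# Two models with context lengths 4 and 8, both asked to start predicting at step 8.
short_ctx, short_targets = rollout_split(20, context_length=4, first_predicted_step=8)
long_ctx, long_targets = rollout_split(20, context_length=8, first_predicted_step=8)

# Simulate a low-data experiment that keeps 10 of 100 trajectories.
subset = random_subset(n_total=100, n_keep=10, seed=0)
```

With a shared first predicted step, both models are evaluated on steps 8 through 19, so neither gets credit for an easier early regime that the other never sees.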
Misc
- Requirement updates to avoid wandb issue (5b86f01)
- Synced history between dev and prod.
Fixes
- Fixed HF path lookup which was broken in the normalization concatenation (e6a36c1)
Public contributions
v1.1.0
The Well v1.1.0 (April 4, 2025)
Enhancements
Metrics (944975e)
Added the following metrics:
- Mean absolute error
- Pearson Correlation
Updated the expected path structure for Well-formatted non-Well data to match Well data (944975e)
Users can now pass paths pointing to non-Well data produced in the Well format to use in the existing pipeline.
Augmentations (944975e)
Added the following augmentation options in a tensor-law consistent manner:
- Rotations
- Resize
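One reading of "tensor-law consistent" is that vector components are transformed together with the grid, not just moved to new positions. The sketch below shows this for a 90-degree rotation of a 2D vector field stored as nested lists. The helper names and the index convention (x along the first index, y along the second) are assumptions for this example, not the Well's augmentation API.

```python
def rot90_grid(field):
    """Rotate a square 2D grid by 90 degrees: out[i][j] = field[j][n - 1 - i]."""
    n = len(field)
    return [[field[j][n - 1 - i] for j in range(n)] for i in range(n)]


def rot90_vector_field(vx, vy):
    """Rotate grid positions and vector components together.

    With x along the first index and y along the second, this rotation
    maps a vector (vx, vy) to (-vy, vx).
    """
    vx_moved, vy_moved = rot90_grid(vx), rot90_grid(vy)
    new_vx = [[-v for v in row] for row in vy_moved]
    new_vy = vx_moved
    return new_vx, new_vy


# A radial field v(x, y) = (x, y) about the grid center is rotation invariant.
n = 5
c = (n - 1) / 2
radial_vx = [[i - c for j in range(n)] for i in range(n)]
radial_vy = [[j - c for j in range(n)] for i in range(n)]

rotated_vx, rotated_vy = rot90_vector_field(radial_vx, radial_vy)
naive_vx = rot90_grid(radial_vx)  # moves values but leaves components untouched
```

The consistent rotation returns the radial field unchanged, as it should, while rotating each channel as if it were a scalar produces a field that no longer points outward.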
Refactoring to allow for easier modification (944975e)
Users can now more easily extend the Well dataset objects for their own purposes, because the functionality of __getitem__ has been split into a small number of sub-components. Extension now only requires users to replace the components needed in their workflow rather than the full object.
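The general pattern looks roughly like the sketch below. The class and method names (BaseTrajectoryDataset, _load, _preprocess, _to_example) are hypothetical stand-ins chosen for this example and are not the names used in the Well codebase.

```python
class BaseTrajectoryDataset:
    """Hypothetical dataset whose __getitem__ is composed of small steps."""

    def __init__(self, trajectories):
        self.trajectories = trajectories

    def __len__(self):
        return len(self.trajectories)

    def _load(self, index):
        # Fetch the raw sample.
        return list(self.trajectories[index])

    def _preprocess(self, sample):
        # Default: leave the sample unchanged.
        return sample

    def _to_example(self, sample):
        # Split into model input and prediction target.
        return {"input": sample[:-1], "target": sample[-1]}

    def __getitem__(self, index):
        return self._to_example(self._preprocess(self._load(index)))


class ScaledDataset(BaseTrajectoryDataset):
    """Overrides a single step instead of rewriting __getitem__."""

    def _preprocess(self, sample):
        return [2.0 * v for v in sample]


dataset = ScaledDataset([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
example = dataset[0]
```

Only the overridden step changes; loading and input/target splitting are inherited as-is.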
Plotting updates (944975e)
- Power spectrum now generated for last timestep rather than average
- Bug fix in 3D (sliced) video
- Update "padding" BC mode to handle nD padding masks
Download dataset statistics along with data (#22)
Previously, the statistics were included in the GitHub repo but not with the data itself. This update automatically downloads the statistics yaml files if they do not already exist in the given path.
Hugging Face integration
Delta vs full training within (18fcd05)
Now users can decide if they want to train models to predict the time step such that
Normalization changes also within (18fcd05)
Dataset normalization has been extended to include new options for both full and delta prediction. These now use parameterizable normalization modules for easy extension.
RMS and delta statistics addition (944975e)
Additionally, we've added new normalization options beyond those in the paper, including normalization by the root-mean-square of the field and normalization based on delta statistics.
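A toy example of these options, assuming that delta prediction means predicting the change between consecutive steps and that delta statistics are the mean and standard deviation of those changes. The helper names are hypothetical and this is not the normalization module from the commits above.

```python
import math


def rms(values):
    """Root-mean-square of a sequence of values."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def mean_std(values):
    """Population mean and standard deviation."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)


trajectory = [1.0, 2.0, 4.0, 7.0]

# Full prediction: the target is the next state itself.
full_targets = trajectory[1:]

# Delta prediction: the target is the change from the current state.
delta_targets = [nxt - cur for cur, nxt in zip(trajectory, trajectory[1:])]

# RMS normalization divides by the root-mean-square without subtracting a mean.
field_rms = rms(trajectory)
rms_normalized = [v / field_rms for v in trajectory]

# Delta targets are normalized with statistics of the deltas, not of the field.
delta_mean, delta_std = mean_std(delta_targets)
normalized_deltas = [(d - delta_mean) / delta_std for d in delta_targets]

# Undoing the normalization and adding the delta back recovers the next state.
recovered = [
    cur + (z * delta_std + delta_mean)
    for cur, z in zip(trajectory, normalized_deltas)
]
```

Deltas are typically much smaller than the field itself, which is why scaling them with the field's statistics would leave the targets poorly conditioned.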
Fixes
Bug fixes (18fcd05)
- Fix AFNO and AViT for non-square data
- Fixed normalization bug in test dataset. Now normalization is applied correctly in test rollouts.
Data fixes
- Added Rayleigh-Benard Uniform and fixed x dimension in existing data - Previously, the x dimension in Rayleigh-Benard was listed as being uniformly spaced. The data was actually sampled on Chebyshev nodes, so this spacing was not accurate. This has been addressed in two ways: 1. The x dimension in the existing dataset has been updated to accurately reflect the spacing. 2. We have added a new rayleigh_benard_uniform dataset that has been resampled onto a uniform grid from the underlying spectral representation of the solver. This ensures backward compatibility while also enabling analysis of the uniformly spaced data.
- Replaced outlier trajectory in acoustic_scattering_inclusions - acoustic_scattering_inclusions chunk 11, position 47 had a corrupted trajectory. This has been replaced with a newly simulated valid example.
- Fix shear_flow documentation (#13)
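To show why the Rayleigh-Benard spacing matters, the sketch below compares Chebyshev nodes with a uniform grid. Which Chebyshev grid the solver uses is not stated in these notes; Chebyshev-Gauss nodes on a unit interval are assumed here, and the helper names are hypothetical.

```python
import math


def chebyshev_nodes(n, lower=0.0, upper=1.0):
    """Chebyshev-Gauss nodes mapped to [lower, upper], in increasing order."""
    nodes = [math.cos(math.pi * (2 * k + 1) / (2 * n)) for k in range(n)]
    return sorted(lower + (upper - lower) * (x + 1) / 2 for x in nodes)


def uniform_nodes(n, lower=0.0, upper=1.0):
    """Evenly spaced nodes including both endpoints."""
    return [lower + (upper - lower) * k / (n - 1) for k in range(n)]


def spacings(nodes):
    """Distances between neighboring nodes."""
    return [b - a for a, b in zip(nodes, nodes[1:])]


cheb_spacing = spacings(chebyshev_nodes(16))
uniform_spacing = spacings(uniform_nodes(16))

# Chebyshev nodes cluster near the boundaries and spread out in the middle.
edge_gap = cheb_spacing[0]
center_gap = cheb_spacing[len(cheb_spacing) // 2]
```

Treating such a grid as uniform would misplace points near the walls, for example when computing finite-difference derivatives or spectra, which is what the corrected x dimension and the resampled dataset address.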