Merged
6 changes: 4 additions & 2 deletions notebooks/Gallery.ipynb
@@ -35,12 +35,14 @@
"\n",
"These tutorials can be run on a 4GB GPU using relatively low volumes of data (3-10GB). They will also work in HPC environments.\n",
"\n",
-"| Title | Description | Image | Notebooks | Last Tested |\n",
+"| Topic | Description | Image | Notebooks | Last Tested |\n",
"|-------|--------------|-------|-------------|-------------|\n",
"| **Simplified weather model** | Train a reduced-size weather model on a standard GPU with fetchable dataset | ![Image showing FourCastMini prediction outputs](https://pyearthtools.readthedocs.io/en/latest/_images/notebooks_tutorial_FourCastMini_Demo_18_1.png) | [Train and run a simplified global weather model (low hardware and data requirements)](./tutorial/FourCastMini_Demo.ipynb) | 18 Aug 2025 |\n",
"| **MLX Demo** | Shows how to integrate PyEarthTools with a non-PyTorch framework (Apple MLX) optimised for M-series chips | ![Image showing weather model outputs from MLX demo](https://pyearthtools.readthedocs.io/en/latest/_images/notebooks_tutorial_MLX-Demo-Custom-Arch_13_1.png) | [MLX Framework Example](./tutorial/MLX-Demo-Custom-Arch.ipynb) | 8 Jun 2025 | \n",
"| **Convolutional Neural Net on ERA5** | Shows all steps to train a CNN on ERA5, running on CPU or a standard GPU | ![Image showing weather model outputs](https://pyearthtools.readthedocs.io/en/latest/_images/notebooks_tutorial_CNN-Model-Training_44_1.png) | [End-to-end CNN Training Example](./tutorial/CNN-Model-Training.ipynb) | 25 Aug 2025 |\n",
-"| **Radar Visualisation** | Shows how to visualise radar data as a time-series, in 2D and in 3D | ![Image showing a top down view of radar data](https://pyearthtools.readthedocs.io/en/latest/_images/notebooks_RadarVisualisation_10_1.png) | [Radar Visualisation](./RadarVisualisation.ipynb) | 23 Aug 2025 |\n"
+"| **Radar Visualisation** | Shows how to visualise radar data as a time-series, in 2D and in 3D | ![Image showing a top down view of radar data](https://pyearthtools.readthedocs.io/en/latest/_images/notebooks_RadarVisualisation_10_1.png) | [Radar Visualisation](./RadarVisualisation.ipynb) | 23 Aug 2025 |\n",
+"| **LUCIE Climate Model** | Train a climate model | (no image) | [LUCIE-Training](./tutorial/LUCIE/LUCIE-Training.ipynb) | 13 Nov 2025 |\n",
+"| **LUCIE Climate Model** | Make predictions from a climate model | (no image) | [LUCIE-Inference](./tutorial/LUCIE/LUCIE-Inference.ipynb) | 13 Nov 2025 |\n"
]
},
{
147 changes: 147 additions & 0 deletions notebooks/tutorial/LUCIE/LUCIE-Inference.ipynb

Large diffs are not rendered by default.

194 changes: 194 additions & 0 deletions notebooks/tutorial/LUCIE/LUCIE-Training.ipynb
@@ -0,0 +1,194 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a3575c36-ed8d-4bae-ab90-aefe441949f9",
"metadata": {},
"source": [
"# Training the LUCIE model\n",
"\n",
"LUCIE is a climate model developed by Haiwen Guan, Troy Arcomano, Ashesh Chattopadhyay and Romit Maulik (2024). See their preprint at https://doi.org/10.48550/arXiv.2405.16297 and the archive of their training data, code and results here https://doi.org/10.5281/zenodo.15164648.\n",
"\n",
"The code in PyEarthTools is based on their code repository at https://github.com/ISCLPennState/LUCIE, which is made available under the MIT license (see the PyEarthTools NOTICE file for full details).\n",
"\n",
"LUCIE is of interest to climate researchers because its rollouts remain stable over many decades. The model is licensed under compatible terms, so we can provide a bundled, customised version of LUCIE that can be used within the PyEarthTools framework, integrated with its data pipelines and configured flexibly.\n",
"\n",
"We have only just begun this integration, so for now the model does not make extensive use of the PyEarthTools classes. This is expected to change fairly quickly, and this notebook will be updated as it does. However, in the interest of getting the bundled version to the community as soon as possible for those already seeking to work with the model, we present it as a \"work in progress\".\n",
"\n",
"You need to manually download the original published dataset from Zenodo and update the paths in this notebook to point to it. The initial focus is on reproducing the paper closely, using the same data and only slightly modified code (changes to support more devices and updates for compatibility). Subsequently, we will develop the code further so that it can adapt to new data sources.\n",
"\n",
"The intention is to:\n",
" - [done] Supply the source code to train and run the model in PyEarthTools\n",
" - [done] Validate that the model can train without obvious code-level errors\n",
" - Validate inference and reproduce the training results to ensure the trained model is valid\n",
" - Support library updates and other changes\n",
" - Support multiple ML backends beyond CUDA\n",
" - Support connection to multiple data sources through PET data accessors\n",
" - Move the normalisation into a PET pipeline so it can be easily modified and experimented with\n",
"\n",
"If you would like to know more, or get involved with this work, please [let us know on the issue tracker](https://github.com/ACCESS-Community-Hub/PyEarthTools/issues/211).\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e5068eca-cfcc-4dec-bf88-8b1fb870dc3b",
"metadata": {},
"outputs": [],
"source": [
"import lucie\n",
"import torch"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "f69a338a-ff4e-465f-a664-cd76630baa52",
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path\n",
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "b81e2c08-bf62-49fc-9090-0595cbfd24ab",
"metadata": {},
"outputs": [],
"source": [
"# Prefer CUDA if available, otherwise Apple MPS, otherwise fall back to the CPU\n",
"device = torch.device(\"mps\" if torch.backends.mps.is_available() else \"cpu\")\n",
"device = torch.device(\"cuda:0\" if torch.cuda.is_available() else device)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "4180dd8c-ff64-466b-b3bc-9771b2053a57",
"metadata": {},
"outputs": [],
"source": [
"regridded_path = Path.home() / 'dev/data/lucie' / 'era5_T30_regridded.npz'"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "7f5ca64a-87c8-4cae-a2e3-3a4788066a73",
"metadata": {},
"outputs": [],
"source": [
"regridded_data = lucie.train.load_data(regridded_path)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "b53d4754-4303-4325-801a-afa626aac582",
"metadata": {},
"outputs": [],
"source": [
"preprocessed_path = Path.home() / 'dev/data/lucie' / 'era5_T30_preprocessed.npz'\n",
"preprocessed_data = np.load(preprocessed_path)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "22148013-b8d6-40c7-8c11-9f8545295b85",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Starting Training\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
" 0%| | 0/2 [00:00<?, ?it/s]\n",
" 0%| | 0/50 [00:00<?, ?it/s]\u001b[A\n",
" 6%|███████▏ | 3/50 [00:00<00:01, 29.47it/s]\u001b[A\n",
" 32%|█████████████████████████████████████▊ | 16/50 [00:00<00:00, 84.68it/s]\u001b[A\n",
" 58%|███████████████████████████████████████████████████████████████████▊ | 29/50 [00:00<00:00, 103.67it/s]\u001b[A\n",
"100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 104.68it/s]\u001b[A\n",
" 50%|████████████████████████████████████████████████████████████ | 1/2 [00:19<00:19, 19.24s/it]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"2 year rollout bias tensor(nan, device='mps:0')\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
" 0%| | 0/50 [00:00<?, ?it/s]\u001b[A\n",
" 24%|████████████████████████████ | 12/50 [00:00<00:00, 118.86it/s]\u001b[A\n",
" 50%|██████████████████████████████████████████████████████████▌ | 25/50 [00:00<00:00, 121.39it/s]\u001b[A\n",
"100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:00<00:00, 121.63it/s]\u001b[A\n",
"100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:38<00:00, 19.23s/it]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"2 year rollout bias tensor(nan, device='mps:0')\n"
]
}
],
"source": [
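"# Short debug run (debug_sample_limit=50, n_epochs=2) to check that the training loop executes\n",
"model = lucie.train.load_data_and_train(device, regridded_data, preprocessed_data, debug_sample_limit=50, n_epochs=2)"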
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "9f8f25eb-90d4-4737-8ef0-0be54617280f",
"metadata": {},
"outputs": [],
"source": [
"torch.save(model.state_dict(), \"model.pth\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d1b20be2-074e-4896-aa6f-0c259b5bd222",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
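The training notebook asks the reader to download the Zenodo dataset manually and then point the notebook at it, and it picks a device in the order CUDA, MPS, CPU. A minimal pre-flight sketch of those two steps is given below. The helper names `find_missing_files` and `choose_device_name` are hypothetical and are not part of `lucie` or PyEarthTools; the file names and the `~/dev/data/lucie` directory are taken from the notebook cells and should be adjusted to wherever the data was actually downloaded.

```python
from pathlib import Path

# File names as they appear in the notebook cells above.
EXPECTED_FILES = ("era5_T30_regridded.npz", "era5_T30_preprocessed.npz")


def find_missing_files(data_dir, expected=EXPECTED_FILES):
    """Return the expected file names not present in data_dir (hypothetical helper)."""
    data_dir = Path(data_dir)
    return [name for name in expected if not (data_dir / name).is_file()]


def choose_device_name(cuda_available, mps_available):
    """Mirror the notebook's preference order: CUDA, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda:0"
    if mps_available:
        return "mps"
    return "cpu"


if __name__ == "__main__":
    # Directory used in the notebook; change this to your own download location.
    data_dir = Path.home() / "dev/data/lucie"
    missing = find_missing_files(data_dir)
    if missing:
        print(f"Missing from {data_dir}: {', '.join(missing)}")
    else:
        print("All expected LUCIE data files found.")
```

In the notebook itself, the two flags would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and the returned name would be passed to `torch.device(...)`. Keeping the selection logic free of any `torch` import makes the ordering easy to check on a machine without a GPU.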