Code example for "Latent diffusion models for parameterization and data assimilation of facies-based geomodels" [https://arxiv.org/abs/2406.14815] and "Latent diffusion models for parameterization of facies-based geomodels and their use in data assimilation" [https://doi.org/10.1016/j.cageo.2024.105755]
We present a deep-learning geological parameterization for complex facies-based geomodels, using generative latent diffusion models (LDMs), first published by Rombach et al. (2022). Diffusion models are trained to "denoise", which enables them to generate new geological realizations from input fields of random noise. Built on Denoising Diffusion Probabilistic Models (DDPMs), introduced by Ho et al. (2020), the LDM representation reduces the number of variables to be determined during history matching while preserving realistic geological features in posterior models. The model developed in this work includes a variational autoencoder (VAE) for dimension reduction, a U-net for the denoising process, and Denoising Diffusion Implicit Model (DDIM; Song et al., 2021) noise scheduling for inference.
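To make the inference step more concrete, the following is a minimal NumPy sketch of deterministic DDIM sampling (eta = 0) in a latent space. It is not the repository's implementation: the linear beta schedule, the step counts, and the names `make_alpha_bar`, `ddim_step`, `ddim_sample` and `eps_model` are assumptions for illustration. In the actual workflow the noise predictor would be the trained U-net and the final latent would be decoded by the VAE.

```python
import numpy as np


def make_alpha_bar(num_train_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_train_steps)
    return np.cumprod(1.0 - betas)


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0) from timestep t to an earlier one."""
    # Estimate the clean latent implied by the predicted noise.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)
    # Re-noise that estimate to the earlier (less noisy) timestep.
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps_pred


def ddim_sample(eps_model, shape, alpha_bar, num_inference_steps=20, seed=0):
    """Run the DDIM loop from pure Gaussian noise down to a clean latent.

    eps_model is a hypothetical callable (x, t) -> predicted noise that stands
    in for the trained U-net.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)
    # Evenly spaced subset of the training timesteps, from noisy to clean.
    timesteps = np.linspace(len(alpha_bar) - 1, 0, num_inference_steps).round().astype(int)
    for i, t in enumerate(timesteps):
        # After the last step the latent is treated as noise-free (alpha_bar = 1).
        ab_prev = alpha_bar[timesteps[i + 1]] if i + 1 < len(timesteps) else 1.0
        x = ddim_step(x, eps_model(x, t), alpha_bar[t], ab_prev)
    return x
```

Because the update is deterministic, each input noise field maps to a single output latent, so the noise field can act as the set of parameters adjusted during history matching.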
Our application involves conditional 2D three-facies (channel-levee-mud) systems. The LDM provides realizations that are visually consistent with samples from geomodeling software. Quantitative metrics involving spatial and flow-response statistics show general agreement between the diffusion-generated models and reference realizations. The smoothness of the parameterization is assessed through latent-space interpolation tests. The LDM is then used for ensemble-based data assimilation, which achieves significant uncertainty reduction, posterior P10-P90 forecasts that generally bracket observed data, and consistent posterior geomodels.
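A latent-space interpolation test can be sketched as follows. The text does not state which interpolation rule is used, so spherical linear interpolation (slerp), a common choice for Gaussian latents, is an assumption here, as are the names `slerp` and `interpolation_path`. Each interpolated latent would then be passed through the denoising and decoding steps to check that the resulting geomodels vary gradually.

```python
import numpy as np


def slerp(z0, z1, t):
    """Spherical linear interpolation between two latent arrays of equal shape."""
    a, b = z0.ravel(), z1.ravel()
    cos_omega = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    if np.isclose(np.sin(omega), 0.0):
        # Nearly parallel (or antiparallel) latents: fall back to linear interpolation.
        return (1.0 - t) * z0 + t * z1
    w0 = np.sin((1.0 - t) * omega) / np.sin(omega)
    w1 = np.sin(t * omega) / np.sin(omega)
    return w0 * z0 + w1 * z1


def interpolation_path(z0, z1, num_points=8):
    """Stack of latents moving from z0 (t = 0) to z1 (t = 1)."""
    return np.stack([slerp(z0, z1, t) for t in np.linspace(0.0, 1.0, num_points)])
```

Slerp keeps the norm of intermediate points close to that of the endpoints, which plain linear interpolation of high-dimensional Gaussian samples does not.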
scripts/ - Directory to store the .py scripts for dataset preparation, variational autoencoder (VAE) training and U-net training, as well as the geomodel generation code.
data/ - Directory to store the training dataset used in this study (2D, three-facies channelized geomodels). The dataset is stored as a datasets.Dataset folder (diffusers_dataset/). Alternatively, the data is available at the link [https://drive.google.com/drive/folders/1JCaaaJOvfReaqPbIBVtVnPAH7TVc4AA5?usp=sharing]
testing/ - Directory to store the reference (geomodeling software-generated) m_petrel_200.npy and LDM-generated m_ldm_200.npy ensembles used for flow simulations and history matching. Synthetic "true" models used in history matching are saved as m_true_1.npy and m_true_2.npy. All are stored as .npy files.
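As a quick way to inspect the ensembles in testing/, the sketch below loads a .npy file and reports the fraction of cells taking each value. The helper names `load_ensemble` and `facies_proportions` are hypothetical, and the array shapes and facies encoding are not documented here; the proportions are only meaningful if the arrays hold a small set of discrete facies codes.

```python
from pathlib import Path

import numpy as np


def load_ensemble(path):
    """Load an ensemble of geomodels saved as a single .npy array."""
    return np.load(Path(path))


def facies_proportions(ensemble):
    """Fraction of cells taking each distinct value across the whole ensemble."""
    values, counts = np.unique(ensemble, return_counts=True)
    return {v.item(): float(c / ensemble.size) for v, c in zip(values, counts)}


if __name__ == "__main__":
    # File names come from the directory listing; the relative path assumes
    # the script is run from the repository root.
    for name in ("m_petrel_200.npy", "m_ldm_200.npy"):
        path = Path("testing") / name
        if path.exists():
            m = load_ensemble(path)
            print(name, m.shape, facies_proportions(m))
        else:
            print(f"{path} not found; run from the repository root")
```

Comparing the printed proportions for the reference and LDM-generated ensembles gives a first, coarse consistency check before running flow simulations.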
Code implementations are based on the following repositories:
Running the scripts requires the libraries datasets, diffusers, and either monai or monai-generative.
This workflow is tested with Python 3.9 and PyTorch 1.8 (CPU/GPU).
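Before running the scripts, the environment can be checked with a short standard-library script such as the sketch below. The function names are hypothetical, and the import name `generative` for the monai-generative package is an assumption; adjust it if the installed package exposes a different module name.

```python
import importlib.util

# Libraries named in the requirements above, plus PyTorch.
REQUIRED = ("torch", "datasets", "diffusers")
# At least one of these is assumed to be needed; "generative" is taken to be
# the import name of monai-generative.
MONAI_OPTIONS = ("monai", "generative")


def is_installed(module_name):
    """True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def check_environment():
    """Return (status, ok): per-module availability and an overall flag."""
    status = {name: is_installed(name) for name in REQUIRED + MONAI_OPTIONS}
    ok = all(status[name] for name in REQUIRED) and any(
        status[name] for name in MONAI_OPTIONS
    )
    return status, ok


if __name__ == "__main__":
    status, ok = check_environment()
    for name, found in status.items():
        print(f"{name:12s} {'found' if found else 'MISSING'}")
    print("environment looks complete" if ok else "some requirements are missing")
```

The check only confirms that the modules can be located; it does not verify versions against the tested Python 3.9 and PyTorch 1.8 setup.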
Guido Di Federico, Louis J. Durlofsky
Department of Energy Science & Engineering, Stanford University
Contact: gdifede@stanford.edu



