MASF: A Multitemporal Anti-Speckle Framework with Physics-Informed Unsupervised Refinement for SAR Denoising
A Deep Learning Framework for Speckle Noise Suppression in Sentinel-1 GRD SAR Imagery using Physics-Informed Learning and Attention Mechanisms
Read in Portuguese (Leia em Português)
MASF (Multitemporal Anti-Speckle Framework) is a comprehensive and modular deep learning pipeline designed for the suppression of speckle noise in Sentinel-1 Synthetic Aperture Radar (SAR) imagery. The framework leverages a two-stage training process, combining supervised learning on multi-temporal targets with an optional unsupervised, physics-informed refinement stage. It integrates modern attention-based backbones (Swin Transformer or Mamba) within a U-Net architecture and supports multi-task learning for simultaneous image reconstruction and noise mask prediction. The entire pipeline, from data pre-processing with ESA SNAP to model evaluation and inference, is designed to be reproducible and extensible for academic research.
- Key Features
- Methodology and Pipeline
- Project Structure
- Installation
- Usage: Running the Pipeline
- Model Architectures
- How to Cite
- License
- Contact
This framework integrates a wide range of modern techniques for robust SAR image denoising:
- Dual backbones: easily switch between a Swin Transformer (MASF-Swin) for spatial attention and a Mamba (MASF-Mamba) State Space Model (SSM) for linear-time sequence modeling, both integrated as a U-Net bottleneck.
- Supervised Training: The model is initially trained using a composite perceptual loss to match a clean target image.
- Unsupervised Refinement: An optional second stage refines the model using only noisy images and a physics-informed loss function, enforcing statistical properties of the speckle noise without a ground truth.
- Multi-task learning: simultaneously train the model to reconstruct the clean image and predict the residual noise mask, with the two objectives weighted against each other.
- Physics-informed losses: in the refinement stage, the model is constrained by losses that enforce known physical properties of speckle noise, such as zero mean and constant variance in the log domain.
- Gated skip connections: SkipGate modules intelligently fuse encoder features into the decoder, improving gradient flow and feature recombination.
- Automated pre-processing of raw Sentinel-1 .zip files using ESA SNAP graphs
- Robust multi-temporal target generation using methods like simple averaging, Stroobants, or Quegan filters
- Efficient Dataset and DataLoader with on-the-fly patch extraction and data augmentation
- Baseline comparison: scripts to quantitatively and qualitatively compare the model against classical speckle filters (e.g., Lee, Frost, Gamma MAP) using a wide range of reference-based and no-reference metrics.
- Full-scene inference: a complete script applies a final trained model to new, full-scene Sentinel-1 products, handling patching and seamless reconstruction.
- Testing: a full suite of pytest unit and integration tests ensures the reliability of each component.
The project follows a structured, multi-stage pipeline from raw data to a refined, denoised image product.
Raw Sentinel-1 GRD products (.zip files) are processed using the SNAP Graph Processing Tool (GPT). The provided graphs (preprocess_graph_with_dem.xml, preprocess_graph_without_dem.xml) perform the following steps:
- Apply-Orbit-File
- ThermalNoiseRemoval
- Remove-GRD-Border-Noise
- Radiometric Calibration to Sigma0
- Ellipsoid Correction
- (Optional) Add Elevation band from SRTM 1-Sec DEM
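For orientation, the sketch below shows how such a graph is typically invoked through SNAP's `gpt` command-line tool from Python; the graph path and the `-Pinput`/`-Poutput` parameter names are assumptions that must match the placeholders declared inside the XML graph (in the actual pipeline this step is wrapped by `scripts/preprocess_data.py`).

```python
import subprocess

# Hedged sketch: run one SNAP processing graph with gpt. The graph location and
# the -P parameter names are assumptions; they must correspond to the
# ${placeholders} defined inside preprocess_graph_with_dem.xml.
subprocess.run(
    [
        "gpt", "preprocess_graph_with_dem.xml",
        "-Pinput=data/01_raw/site_a/S1A_IW_GRDH_scene.zip",    # hypothetical input
        "-Poutput=data/02_processed_noisy/site_a/scene.dim",   # hypothetical output
    ],
    check=True,  # raise CalledProcessError if gpt fails
)
```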
The pre-processed scenes for a single geographic location are co-registered to a master scene. A clean "ground truth" target image is then generated by applying a multi-temporal filter (e.g., temporal median) across the time-series stack. This averages out the speckle noise, creating a stable reference. The individual co-registered scenes serve as the noisy inputs.
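As a concrete illustration of the simplest variant of this step (the Stroobants and Quegan filters are more involved), a per-pixel temporal statistic over the co-registered stack might be computed as sketched below; the function name and array layout are assumptions, not the project's actual code.

```python
import numpy as np

def multitemporal_target(stack: np.ndarray, method: str = "median") -> np.ndarray:
    """Build a clean reference from a co-registered time series.

    stack: (T, H, W) array of co-registered intensity scenes for one location.
    Speckle is largely uncorrelated between acquisitions, so a per-pixel
    temporal statistic suppresses it while preserving stable scene content.
    """
    if method == "median":
        return np.median(stack, axis=0)
    if method == "mean":
        return stack.mean(axis=0)
    raise ValueError(f"unknown method: {method}")
```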
The model is trained in a supervised manner using the noisy scenes as input and the multi-temporal average as the target. Training uses a PerceptualCompositeLoss, a weighted sum of multiple loss functions chosen to produce high-quality visual results:
- Charbonnier Loss (L1-variant): For sharp edge reconstruction
- SSIM Loss: To preserve structural similarity
- Gradient Loss (Sobel): To maintain edge fidelity
- Frequency Loss (FFT): To ensure high-frequency details are correctly reconstructed
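A minimal sketch of how such a weighted composite can be assembled in PyTorch is shown below; the class name, the weights, and the omission of the SSIM term (which a library such as pytorch-msssim could supply) are illustrative assumptions, not the actual PerceptualCompositeLoss implementation. In the real pipeline the relative weights would be natural candidates for the Optuna hyperparameter search described later.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CompositeLossSketch(nn.Module):
    """Hypothetical weighted composite of Charbonnier, Sobel-gradient and
    FFT-magnitude terms; an SSIM term would be added in the same way."""

    def __init__(self, w_charb=1.0, w_grad=0.5, w_fft=0.1, eps=1e-3):
        super().__init__()
        self.w_charb, self.w_grad, self.w_fft, self.eps = w_charb, w_grad, w_fft, eps
        sobel = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        self.register_buffer("kx", sobel.reshape(1, 1, 3, 3))
        self.register_buffer("ky", sobel.t().reshape(1, 1, 3, 3))

    def _gradients(self, img):
        # img: (B, 1, H, W) single-channel patches (assumed)
        return torch.cat([F.conv2d(img, self.kx, padding=1),
                          F.conv2d(img, self.ky, padding=1)], dim=1)

    def forward(self, pred, target):
        charb = torch.sqrt((pred - target) ** 2 + self.eps ** 2).mean()   # sharp edges
        grad = F.l1_loss(self._gradients(pred), self._gradients(target))  # edge fidelity
        fft = F.l1_loss(torch.fft.rfft2(pred).abs(),
                        torch.fft.rfft2(target).abs())                    # high frequencies
        return self.w_charb * charb + self.w_grad * grad + self.w_fft * fft
```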
After supervised training, the model can be further refined in an unsupervised stage. Using only noisy images (no targets), the model is updated with a PhysicsCompositeLoss that penalizes deviations from known statistical properties of speckle noise:
- SpeckleVarianceLoss: Enforces that the variance of the residual noise is spatially constant
- SpeckleMeanLoss: Enforces that the mean of the residual noise is zero
- HistogramLoss: Forces the distribution of the residual noise to match a theoretical Gumbel distribution
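To make these constraints concrete, the sketch below shows how the zero-mean and constant-variance penalties could be written for log-domain inputs (the Gumbel histogram term is omitted); the function name, window size, and weighting are illustrative assumptions, not the framework's actual loss classes.

```python
import torch
import torch.nn.functional as F

def physics_residual_losses(noisy_log, denoised_log, window: int = 7):
    """Hypothetical physics-informed terms on log-intensity images (B, 1, H, W).

    The residual (noisy minus denoised) is treated as the estimated log-speckle
    component, which should have zero mean and spatially constant variance.
    """
    residual = noisy_log - denoised_log
    # Zero-mean term: the log-domain speckle residual should average to zero.
    mean_loss = residual.mean().pow(2)
    # Constant-variance term: penalise how much the local variance of the
    # residual fluctuates across the scene.
    kernel = torch.ones(1, 1, window, window, dtype=residual.dtype,
                        device=residual.device) / window ** 2
    local_mean = F.conv2d(residual, kernel, padding=window // 2)
    local_sq = F.conv2d(residual ** 2, kernel, padding=window // 2)
    local_var = (local_sq - local_mean ** 2).clamp(min=0)
    var_loss = local_var.var()
    return mean_loss, var_loss
```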
```
.
├── config/
│   └── config.yaml              # Main configuration file for the entire pipeline
├── data/
│   ├── 01_raw/                  # Input for raw Sentinel-1 .zip files
│   ├── 02_processed_noisy/      # Output of SNAP pre-processing & co-registration
│   ├── 03_cleaned_targets/      # Output of multi-temporal averaging
│   ├── 04_metadata_cache/       # Cached dataset metadata for faster loading
│   ├── 05_inference_input/      # Folder for new images for inference
│   └── 06_inference_output/     # Final denoised products from inference
├── results/
│   ├── checkpoints/             # Saved model checkpoints (.pth)
│   ├── models/                  # Final best models for training and refinement
│   └── plots/                   # Output for visualizations and comparisons
├── scripts/
│   ├── train.py                 # Main training script
│   ├── refine_unsupervised.py   # Unsupervised refinement script
│   ├── evaluate.py              # Evaluation script for computing metrics
│   ├── inference.py             # Inference script for new, full-scene images
│   ├── compare_baselines.py     # Script to compare model against classical filters
│   └── ...                      # Other utility scripts
├── src/
│   ├── models/                  # Model architecture definitions (Swin, Mamba, blocks)
│   ├── losses/                  # Perceptual and physical loss function definitions
│   ├── augmentations/           # Physical and perceptual data augmentations
│   ├── utils/                   # Utility functions for logging, metrics, etc.
│   └── ...                      # Other source files
│
├── tests/                       # Pytest tests for all major components
├── environment.yaml             # Conda environment file
├── pyproject.toml               # Project metadata
└── README.md                    # This file
```
This project requires Conda for environment management and the ESA SNAP Toolbox for data pre-processing.
- Conda: Install Miniconda (recommended) or Anaconda
- Nvidia Drivers: Ensure you have Nvidia drivers installed that support CUDA 11.8 or higher.
- ESA SNAP: Download and install the latest version of the SNAP Toolbox. Ensure the `gpt` command-line tool is in your system's PATH.
- Clone the repository:

```bash
git clone https://github.com/RonaldoGorgulho/MASF.git
cd masf-project
```

- Create the Conda environment from the `environment.yaml` file. This will install all required dependencies:

```bash
conda env create -f environment.yaml
```

- Activate the new environment:

```bash
conda activate masf
```

- Install dependencies:

```bash
pip install --upgrade pip setuptools wheel
pip install -e .
pip install --no-cache-dir triton==3.1.0
pip install --no-cache-dir datasets==3.6.0
pip install --no-cache-dir "causal-conv1d==1.5.0.post8" --no-build-isolation
pip install ninja pybind11
pip install --no-cache-dir mamba-ssm==2.2.4 --no-build-isolation
```
All scripts are executed from the command line and configured via config/config.yaml.
Place your raw Sentinel-1 .zip files into data/01_raw/<location_name>/. Then, run the pre-processing script:
```bash
python scripts/preprocess_data.py
```

This will execute the SNAP graphs and create co-registered, analysis-ready data in data/02_processed_noisy/ and data/03_cleaned_targets/.
Before training, calculate the mean and standard deviation of the training set for normalization:
```bash
python scripts/compute_stats.py --config config/config.yaml
```

Optimize the model's hyperparameters using Optuna to find the best configuration for your dataset:

```bash
python scripts/hyperparam_search.py --stage supervised
```

Visualize the results (History, Importance, Slice Plot) and identify the best trial:

```bash
python scripts/visualize_hyperparam_search.py --storage sqlite:///results/supervised_search.db --output-dir results/plots --format html
```

Start the supervised training process. You can select the model type (swin or mamba) and training mode (single-task or multi-task) in config.yaml:

```bash
python scripts/train.py --config config/config.yaml
```

Perform unsupervised, physics-informed refinement on the best model from the previous stage:
```bash
python scripts/refine_unsupervised.py --config config/config.yaml
```

MASF-Swin: utilizes a Swin Transformer block in the bottleneck. This allows the model to capture non-local spatial dependencies and contextual information through its windowed self-attention mechanism, which is highly effective for image feature extraction.
MASF-Mamba: employs a Mamba (State Space Model) block in the bottleneck. Mamba is adapted for 2D data by processing image patches in multiple directions, offering a linear-time alternative to quadratic attention mechanisms while effectively modeling long-range dependencies.
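To illustrate the multi-directional scanning idea (not the repository's actual implementation), the sketch below flattens a 2D feature map in row-major and column-major order, runs a generic sequence model over each direction and its reverse, and averages the results; in MASF-Mamba the sequence model would be a Mamba block from the mamba-ssm package.

```python
import torch
import torch.nn as nn

class DirectionalScan2D(nn.Module):
    """Hypothetical sketch: apply a 1D sequence model along four scan
    directions of a (B, C, H, W) feature map and average the outputs."""

    def __init__(self, seq_model: nn.Module):
        super().__init__()
        self.seq_model = seq_model  # any module mapping (B, L, C) -> (B, L, C)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        row = x.flatten(2).transpose(1, 2)                   # row-major (B, H*W, C)
        col = x.transpose(2, 3).flatten(2).transpose(1, 2)   # column-major (B, W*H, C)
        outputs = []
        for seq, is_row in ((row, True), (col, False)):
            fwd = self.seq_model(seq)                        # forward scan
            bwd = self.seq_model(seq.flip(1)).flip(1)        # reverse scan
            y = ((fwd + bwd) / 2).transpose(1, 2)            # back to (B, C, L)
            y = y.reshape(b, c, h, w) if is_row else y.reshape(b, c, w, h).transpose(2, 3)
            outputs.append(y)
        return torch.stack(outputs).mean(dim=0)
```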
Both architectures share a common convolutional encoder/decoder structure and use attention-based SkipGates for improved feature fusion.
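For intuition, here is a minimal attention-gating sketch in the spirit of these SkipGates (modelled on the additive attention gate of Attention U-Net; the actual SkipGate module may differ). Gating the skip path this way lets the decoder down-weight encoder activations that are dominated by speckle while keeping structural detail.

```python
import torch
import torch.nn as nn

class SkipGateSketch(nn.Module):
    """Hypothetical attention gate: reweight encoder skip features using the
    decoder features at the same resolution before they are concatenated."""

    def __init__(self, enc_ch: int, dec_ch: int, inter_ch: int):
        super().__init__()
        self.theta = nn.Conv2d(enc_ch, inter_ch, kernel_size=1)  # project skip features
        self.phi = nn.Conv2d(dec_ch, inter_ch, kernel_size=1)    # project gating signal
        self.psi = nn.Conv2d(inter_ch, 1, kernel_size=1)         # scalar attention map
        self.act = nn.ReLU(inplace=True)

    def forward(self, skip: torch.Tensor, gate: torch.Tensor) -> torch.Tensor:
        # skip: (B, enc_ch, H, W) encoder features; gate: (B, dec_ch, H, W) decoder features
        attn = torch.sigmoid(self.psi(self.act(self.theta(skip) + self.phi(gate))))
        return skip * attn  # attended skip features, ready for concatenation in the decoder
```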
If you use this project or its methodology in your research, please cite this work. The recommended BibTeX format is:
```bibtex
@misc{Gorgulho2025MASF,
author = {Gorgulho, Ronaldo},
title = {MASF: Multitemporal Anti-Speckle Framework for Sentinel-1 Imagery},
year = {2025},
publisher = {GitHub},
journal = {GitHub Repository},
howpublished = {\url{https://github.com/RonaldoGorgulho/MASF.git}}
}
```

This project is distributed under the terms of the MIT License. For more details, see the LICENSE file in the root of the repository.
Ronaldo Gorgulho
- Email: ronaldo.gfo.santos@unesp.br
- GitHub: https://github.com/RonaldoGorgulho
The scientific foundation for the methodologies implemented in this framework is detailed in the REFERENCES file.