OrthoSAM: Multi-Scale Extension of SegmentAnything for River Pebble Delineation from Large Orthophotos
This work is currently under review. Please cite as: Chan, V., Rheinwalt, A., Bookhagen, B.: OrthoSAM: Multi-Scale Extension of SegmentAnything for River Pebble Delineation from Large Orthophotos, in review. PDF | Preprint.
The three example orthomosaics from the Ravi river in the western Himalaya described in the manuscript and used as examples in this repository are available on Zenodo https://doi.org/10.5281/zenodo.16567549
Sediment characteristics and grain-size distribution are crucial for understanding natural hazards, hydrologic conditions, and ecosystems. Traditional methods for collecting this information are costly, labor-intensive, and time-consuming. To address this, we present OrthoSAM, a workflow leveraging the Segment Anything Model (SAM) for automated delineation of densely packed pebbles in high-resolution orthomosaics.
Our framework consists of a tiling scheme, improved input (seed) point generation, and a multi-scale resampling scheme. Validation on synthetic images shows precision close to 1, recall above 0.9, and mean IoU above 0.9. Using a large synthetic dataset, we show with the two-sample Kolmogorov-Smirnov test that the grain-size distribution is reproduced accurately. We identified a size detection limit of 30 pixels in diameter, corresponding to an area of about 700 pixels for a circular object. Applying OrthoSAM to orthomosaics from the Ravi River in India, we delineated 6087 pebbles with high precision and recall. The resulting grain dataset includes measurements such as area, axis lengths, perimeter, RGB statistics, and smoothness, providing valuable input for further analysis in geomorphology and ecosystem studies.
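As a quick sanity check, the ~700-pixel figure for the detection limit is simply the pixel area of a 30-pixel-diameter circle:

```python
# Count the pixels inside a circle of 30 px diameter (radius 15),
# the reported size detection limit.
radius = 15
area_px = sum(
    1
    for y in range(-radius, radius + 1)
    for x in range(-radius, radius + 1)
    if x * x + y * y <= radius * radius
)
print(area_px)  # 709 pixels on the raster; the continuous area is pi * 15**2 ~= 707
```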
- Setup Guide
- Code and Examples
- Several example Jupyter Notebooks to process individual images or large orthomosaics either locally on your desktop system or on Google Colab. We also provide several videos that will guide you through the processing steps.
- Python 3.12+
For required packages, please see requirements.txt. This project was developed and tested using Python's built-in virtual environment module, venv.
Additionally, the code requires python>=3.8, pytorch>=1.7 and torchvision>=0.8. The installation instructions can be found here.
Use git clone https://github.com/UP-RS-ESP/OrthoSAM.git to download the repository (skip this step if you have already done it):
git clone https://github.com/UP-RS-ESP/OrthoSAM.git
cd OrthoSAM
- Install environment:
conda create -y -n OrthoSAM -c conda-forge python=3.12 pip ipython jupyterlab numpy pandas numba scipy scikit-learn scikit-image matplotlib cupy pytorch torchvision
conda activate OrthoSAM
- Install requirements from the GitHub repository:
pip install -r requirements.txt && orthosam-setup
- Install the conda kernel for Jupyter Lab. Make sure that you are not in the OrthoSAM subfolder, because there is a conflict with the `code` directory:
cd ~
python -m ipykernel install --user --name=OrthoSAM
- Create a virtual environment:
python -m venv venv
- Activate the virtual environment
On macOS/Linux:
source venv/bin/activate
On Windows:
venv\Scripts\activate
- To install all required packages and set up OrthoSAM:
pip install -r requirements.txt && orthosam-setup
If all dependencies have already been installed (e.g. via conda or a virtual environment), you can install OrthoSAM directly from the repository root:
pip install -e . && orthosam-setup
orthosam-setup performs the following setup tasks after installing OrthoSAM:
- Create config.json in OrthoSAM/config.json. This sets absolute paths for OrthoSAM. For more details on the configuration file, see the Configuration file section below.
- Create required directories: OrthoSAM/MetaSAM/ for storing model checkpoints; data/ and output/ for inputs and outputs.
- Download default SAM model checkpoints: sam_vit_h_4b8939.pth, sam_vit_l_0b3195.pth, sam_vit_b_01ec64.pth
Note: This command must be run from the repository root directory.
To verify your installation:
python -c "import OrthoSAM; print('OK')"
config.json can be used to specify directory paths and which checkpoint to use. If you wish to set any default parameter, it can be added to config.json. Please note that parameters defined in the script take priority over the configuration file.
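As an illustration, a minimal config.json might look like the following. The key names and values here are assumptions for the sketch; consult the config.json generated by orthosam-setup for the authoritative names and defaults:

```json
{
  "model_type": "vit_h",
  "checkpoint_dir": "OrthoSAM/MetaSAM/",
  "data_dir": "data/",
  "output_dir": "output/",
  "Discord_notification": false
}
```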
As processing time can be long when dealing with large images, we have added a notification function using Discord Webhooks. To enable this function, set 'Discord_notification' to True in the parameters or in the configuration file.
Please follow these steps to set up Discord notifications.
- Go to the Discord channel where you would like the notifications to be sent. Click Edit Channel.
- Go to Integrations, Webhooks.
- If you do not already have a Webhook, click New Webhook and then Copy Webhook URL.
- Create DWH.txt in the OrthoSAM/OrthoSAM directory to store your Webhook URL.
echo "your_webhook_url_here" > OrthoSAM/DWH.txt
To use the Ravi River orthomosaics from the western Himalaya presented in our manuscript, download data from https://doi.org/10.5281/zenodo.16567549 or use your own data.
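Under the hood, a Discord webhook notification is just an HTTP POST of a small JSON payload to the webhook URL stored in DWH.txt. A minimal standard-library sketch of this mechanism (the helper names below are illustrative, not part of the OrthoSAM API):

```python
import json
import urllib.request

def build_payload(message: str) -> bytes:
    """Encode a message as the JSON body that Discord webhooks expect."""
    return json.dumps({"content": message}).encode("utf-8")

def notify(webhook_url: str, message: str) -> None:
    """POST a notification message to a Discord webhook URL."""
    req = urllib.request.Request(
        webhook_url,
        data=build_payload(message),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # network call; this fires the notification
```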
- OrthoSAM_Notebook_Ravi_crop_local.ipynb: Example walk-through for processing a cropped version of the Ravi River orthomosaic (processing time: a few minutes). YouTube Video
- OrthoSAM_Notebook_Sedinet_local.ipynb: Example walk-through for processing a single image from the SediNet library.
- OrthoSAM_Notebook_Sedinet_Colab.ipynb: Example walk-through using Google Colab for processing. Note that the Google Colab version without additional computational resources will not be able to process large images. YouTube Video
- OrthoSAM_Notebook_imagegrains_local.ipynb: Example to process an image from the imagegrains library (processing time: a few minutes).
- Example_generate_synthetic_pebble_images.ipynb: Example of the pebble image generator with shadows to create large training datasets. YouTube Video
- OrthoSAM_notebook.ipynb: General instructions on how to create parameters and run OrthoSAM (select a single image).
- OrthoSAM_presets.ipynb: Examples of using OrthoSAM with predefined parameters for large orthomosaics and for compact fine objects. When using predefined parameters, only the output directory, input path, and resolution are required.
- OrthoSAM_with_create_para.py: Script to create parameters and run OrthoSAM.
- generate_synthetic_images.ipynb: General instructions on how to generate synthetic pebble images.
- Core.py: Contains the main function to run OrthoSAM, and functions with predefined parameters for large orthomosaics and compact fine objects (see OrthoSAM_presets.ipynb).
- synthetic_generator.ipynb: Contains the function to generate synthetic pebble images.
- para_helper.ipynb: Parameter assistance.
- update_config.py: Updates the data directory and checkpoint directory paths.
- utility.py: Utility functions.
- config.json: Configuration file to define model type, checkpoint directory, data directory, output directory, and any default parameters.
- DWH.txt: File to store your Discord webhook URL.
- ImageGrains_outputs.zip: Compressed archive containing segmentation outputs produced by OrthoSAM on the ImageGrains dataset. ImageGrains is publicly available and can be accessed at https://zenodo.org/records/8005771.
- SediNet_outputs.zip: Compressed archive containing segmentation outputs produced by OrthoSAM on the SediNet dataset. SediNet is a publicly available dataset and can be accessed at https://github.com/DigitalGrainSize/SediNet.
We have developed our framework based on the Meta AI Segment Anything Model. For more details regarding the model, please visit their GitHub repository: https://github.com/facebookresearch/segment-anything