WildSAT: Learning Satellite Image Representations from Wildlife Observations

arXiv preprint · ICCV paper

This is the official repository for the paper "WildSAT: Learning Satellite Image Representations from Wildlife Observations".

Overview

Species distributions encode valuable ecological and environmental information, yet their potential for guiding representation learning in remote sensing remains underexplored. We introduce WildSAT, which pairs satellite images with millions of geo-tagged wildlife observations readily available on citizen science platforms. WildSAT employs a contrastive learning approach that jointly leverages satellite images, species occurrence maps, and textual habitat descriptions to train or fine-tune models. This approach significantly improves performance on diverse satellite image recognition tasks, outperforming both ImageNet-pretrained models and satellite-specific baselines. Additionally, by aligning visual and textual information, WildSAT enables zero-shot retrieval, allowing users to search geographic locations based on textual descriptions. WildSAT surpasses recent cross-modal learning methods, including approaches that align satellite images with ground imagery or wildlife photos, demonstrating the advantages of our approach.
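To make the contrastive objective above concrete, here is a minimal NumPy sketch of a symmetric InfoNCE-style loss over a batch of paired embeddings (e.g. satellite-image features vs. text features). This is an illustration, not the paper's implementation: the actual loss, modalities, and hyperparameters in WildSAT may differ, and the temperature of 0.07 is a common default rather than a value taken from the paper.

```python
import numpy as np

def info_nce(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    img_emb, txt_emb: (B, D) arrays where row i of each is a matching pair.
    """
    # L2-normalize so dot products are cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature       # (B, B) similarity matrix
    labels = np.arange(len(logits))          # matching pairs lie on the diagonal

    def xent(l):
        # Row-wise softmax cross-entropy with the diagonal as the target class
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # Average the image-to-text and text-to-image directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

Perfectly aligned pairs (identical embeddings) drive the loss toward zero, while unrelated pairs keep it high, which is what pulls matching satellite/text representations together during training.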

Setup the environment

  1. Create a conda environment: `conda create -n wildsat python=3.9`
  2. Activate the environment: `conda activate wildsat`
  3. Install the required packages: `pip install -r requirements.txt`

Quickstart

This shows how to extract features from satellite images and use them for retrieving relevant images.

  1. Activate your environment and install the required package for GritLM: `pip install gritlm`
  2. Download our sample model here. This is an ImageNet pre-trained ViT-B/16 model further fine-tuned with WildSAT.
  3. Download a small set of data here.
  4. Run the notebook `quickstart.ipynb`.
  • Make sure to specify the location of the sample data downloaded in step 3 and the location of the checkpoint downloaded in step 2.
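The notebook handles feature extraction and retrieval end-to-end. For orientation, here is a standalone sketch of the retrieval step, assuming features have already been extracted with the checkpoint above; the function and variable names are illustrative, not the notebook's API.

```python
import numpy as np

def retrieve(query_feat, gallery_feats, k=5):
    """Return indices of the k gallery images most similar to the query.

    query_feat: (D,) feature of the query image (or text, once aligned).
    gallery_feats: (N, D) features of the candidate satellite images.
    """
    # Normalize so the dot product equals cosine similarity
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = g @ q                       # cosine similarity to each gallery item
    return np.argsort(-sims)[:k], sims # top-k indices, plus all similarities
```

Because WildSAT aligns image and text embeddings in one space, the same cosine-similarity ranking works whether the query feature comes from a satellite image or a textual habitat description.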

Dataset

  1. Download the Sentinel satellite images from SatlasPretrain here.
  2. Download the Wikipedia data from LE-SINR here. Place it in `data/wiki_data_v4.pt`.
  3. Download the bioclimatic variables here. Place them in `data/bioclim*.npy`. These are used by SINR to extract location features.
  4. Download the mapping between satellite images, locations, and text here. Place it in `data/dataloader_data.npy`.

Sample code for visualizing the dataset is provided in `data_explore.ipynb`.

Training the model

  1. Make sure all components of the dataset have been downloaded (see Dataset).
  2. Download all pre-trained model checkpoints. Extract them and place them in `wildsat/checkpoints/*`. These serve as the starting points for the different pre-trained models; this step is not needed if you want to start from a randomly initialized model or an ImageNet pre-trained model.
  3. Run training, e.g. for a randomly initialized RN50 model: `python train.py --satellite_encoder "resnet50" --satellite_notpretrained`. For other model options, see the table below.
| Architecture | Pre-training | Training command | Checkpoint when fully trained with WildSAT |
| --- | --- | --- | --- |
| ViT-B/16 | ImageNet1k | `python train.py --satellite_encoder "vitb16" --use_bnft --is_tunefc` | link |
| ViT-B/16 | CLIP | `python train.py --satellite_encoder "vitb16" --satellite_encoder_ckpt "clip" --lora_layer_types 'attn.k_proj' 'attn.v_proj' 'attn.q_proj' 'attn.out_proj' 'visual_projection' --use_lora --use_dora` | link |
| ViT-B/16 | Prithvi | `python train.py --satellite_encoder "vitb16" --satellite_encoder_ckpt "prithvi"` | link |
| ViT-B/16 | SatCLIP | `python train.py --satellite_encoder "vitb16" --satellite_encoder_ckpt "checkpoints/satclip/satclip-vit16-l10.ckpt"` | link |
| ViT-B/16 | None (Random) | `python train.py --satellite_encoder "vitb16" --satellite_notpretrained` | link |
| Swin-T | ImageNet1k | `python train.py --satellite_encoder "swint" --use_bnft --is_tunefc` | link |
| Swin-T | Satlas | `python train.py --satellite_encoder "swint" --satellite_encoder_ckpt "satlas-backbone"` | link |
| Swin-T | None (Random) | `python train.py --satellite_encoder "swint" --satellite_notpretrained` | link |
| RN50 | ImageNet1k | `python train.py --satellite_encoder "resnet50" --use_bnft --is_tunefc` | link |
| RN50 | MoCov3 | `python train.py --satellite_encoder "resnet50" --satellite_encoder_ckpt "checkpoints/moco_v3/r-50-100ep.pth.tar" --use_bnft --is_tunefc` | link |
| RN50 | SatCLIP | `python train.py --satellite_encoder "resnet50" --satellite_encoder_ckpt "checkpoints/satclip/satclip-resnet50-l10.ckpt"` | link |
| RN50 | Satlas | `python train.py --satellite_encoder "resnet50" --satellite_encoder_ckpt "satlas-backbone"` | link |
| RN50 | SeCo | `python train.py --satellite_encoder "resnet50" --satellite_encoder_ckpt "checkpoints/seco/seco_resnet50_100k.ckpt"` | link |
| RN50 | None (Random) | `python train.py --satellite_encoder "resnet50" --satellite_notpretrained` | link |

Citation

If you find this work helpful, please cite our paper:

@inproceedings{daroya2025wildsat,
  title={WildSAT: Learning Satellite Image Representations from Wildlife Observations},
  author={Daroya, Rangel and Cole, Elijah and Mac Aodha, Oisin and Van Horn, Grant and Maji, Subhransu},
  booktitle={IEEE/CVF International Conference on Computer Vision},
  year={2025}
}
