From 1b1f838241444a2ada1bf55114ff9ff01b480dad Mon Sep 17 00:00:00 2001 From: taeil Date: Wed, 24 Mar 2021 21:31:24 -0700 Subject: [PATCH 01/14] Add SEN12MS as sub module --- .gitmodules | 3 ++ README.md | 85 ++++++++++++++++++++++++++++----------------- SEN12MS | 1 + references/setup.md | 68 ++++++++---------------------------- 4 files changed, 73 insertions(+), 84 deletions(-) create mode 100644 .gitmodules create mode 160000 SEN12MS diff --git a/.gitmodules b/.gitmodules new file mode 100644 index 0000000..065358c --- /dev/null +++ b/.gitmodules @@ -0,0 +1,3 @@ +[submodule "SEN12MS"] + path = SEN12MS + url = git@github.com:Berkeley-Data/SEN12MS.git diff --git a/README.md b/README.md index 492d5dd..3afb1d5 100644 --- a/README.md +++ b/README.md @@ -58,54 +58,77 @@ bands_std = {'s1_std': [4.525339, 4.3586307], 1082.4341, 1057.7628, 1136.1942, 1132.7898, 991.48016]} ``` -**NEXT**, copy the pretraining template -```bash -cd src/utils -cp templates/pretraining-config-template.sh pretrain-configs/sen12ms-small.sh -# edit pretrain-configs/sen12ms-small.sh +## Pre-training with SEN12MS Dataset +[OpenSelfSup](https://github.com/Berkeley-Data/OpenSelfSup) +- see `src/utils/pretrain-runner.sh` for end-to-end run (require prep creating config files). -# once edited, generate the project -./gen-pretrain-project.sh pretrain-configs/my-dataset-config.sh -``` +Check installation by pretraining using mocov2, extracting the model weights, evaluating the representations, and then viewing the results on tensorboard or [wandb](https://wandb.ai/cal-capstone/hpt): -What just happened? We generated a bunch of pretraining configs in the following location (take a look at all of these files to get a feel for how this works): -``` -OpenSelfSup/configs/hpt-pretrain/${shortname} +Set up experimental tracking and model versioning: +```bash +export WANDB_API_KEY= +export WANDB_ENTITY=cal-capstone +export WANDB_PROJECT=hpt2 ``` -**NEXT**, you're ready to kick off a trial run to make sure the pretraining is working as expected =) +Run pre-training +```bash +cd OpenSelfSup +# (sanity check) Single GPU training on samll dataset +CUDA_VISIBLE_DEVICES=1 ./tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep.py --debug + +# (sanity check) 4 GPUs training on samll dataset +CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep.py 4 + +# distributed full training +CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_fulltrain_20ep.py 4 +``` + +Extract pre-trained model ```bash -# the `-t` flag means `trial`: it'll only run a 50 iter pretraining - ./utils/pretrain-runner.sh -t -d OpenSelfSup/configs/hpt-pretrain/${shortname} +BACKBONE=work_dirs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep/epoch_20_moco_in_baseline.pth + +# method 1: From working dir(same system for pre-training) +# CHECKPOINT=work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20.pth + +# method 2: from W&B, {projectid}/{W&B run id} (any system) +CHECKPOINT=hpt2/3l4yg63k + +# Extract the backbone +python tools/extract_backbone_weights.py ${BACKBONE} ${CHECKPOINT} + ``` -**NEXT**, if this works, kick off the full training. 
NOTE: you can kick this off multiple times as long as the config directories share the same filesystem + +## Evaluating Pretrained Representations + +Using OpenSelfSup ```bash -# simply removing the `-t` flag from above - ./utils/pretrain-runner.sh -d OpenSelfSup/configs/hpt-pretrain/${shortname} +python tools/train.py $CFG --pretrained $PRETRAIN + +# RESISC finetune example +tools/train.py --local_rank=0 configs/benchmarks/linear_classification/resisc45/r50_last.py --pretrained work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20_moco_in_basetrain.pth --work_dir work_dirs/benchmarks/linear_classification/resisc45/moco-selfsup/r50_v2_resisc_in_basetrain_20ep-r50_last --seed 0 --launcher=pytorch + + + ``` -**NEXT**, if you want to perform BYOL pretraining, add `-b` flag. + +Using Sen12ms ```bash -# simply add the `-b` flag to above. - ./utils/pretrain-runner.sh -d OpenSelfSup/configs/hpt-pretrain/${shortname} -b ``` -Congratulations: you've launch a full hierarchical pretraining experiment. -**FAQs/PROBLEMS?** -* How does `pretrain-runner.sh` keep track of what's been pretrained? - * In each config directory, it creates a `.pretrain-status` folder to keep track of what's processing/finished. See them with e.g. `find OpenSelfSup/configs/hpt-pretrain -name '.pretrain-status'` -* How to redo a pretraining, e.g. because it crashed or something changed? Remove the - * Remove the associate `.proc` or `.done` file. Find these e.g. - ```bash - find OpenSelfSup/configs/hpt-pretrain -name '.proc' - find OpenSelfSup/configs/hpt-pretrain -name '.done' - ``` -## Evaluating Pretrained Representations + +#### Previous +``` +# Evaluate the representations (NOT SURE) +./benchmarks/dist_train_linear.sh configs/benchmarks/linear_classification/resisc45/r50_last.py ${BACKBONE} +``` + This has been simplified to simply: ```bash ./utils/pretrain-evaluator.sh -b OpenSelfSup/work_dirs/hpt-pretrain/${shortname}/ -d OpenSelfSup/configs/hpt-pretrain/${shortname} diff --git a/SEN12MS b/SEN12MS new file mode 160000 index 0000000..feb3e96 --- /dev/null +++ b/SEN12MS @@ -0,0 +1 @@ +Subproject commit feb3e9602c430d53d388e496fbb22b8ff76f68db diff --git a/references/setup.md b/references/setup.md index 728009e..e030be0 100644 --- a/references/setup.md +++ b/references/setup.md @@ -31,11 +31,13 @@ conda activate hpt # NOTE: if you are not using CUDA 10.2, you need to change the 10.2 in this command appropriately. Make sure to use torch 1.6.0 # (check CUDA version with e.g. 
`cat /usr/local/cuda/version.txt`) -# latest + +# latest torch conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch # 1.6 torch (no support for torchvision transform on tensor) conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch + #colorado machine conda install pytorch==1.2.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch @@ -61,7 +63,7 @@ mkdir ~/data mv resisc45 ~/data # replace/set $DATA and $CODE as appropriate -# e.g., ln -s /home/ubuntu/data/resisc45 /home/ubuntu/hpt/OpenSelfSup/data/resisc45/all +# e.g., ln -s /home/ubuntu/data/resisc45 /home/ubuntu/OpenSelfSup/data/resisc45/all ln -s $DATA/resisc45 $CODE/OpenSelfSup/data/resisc45/all e.g., ln -s /home/ubuntu/data/resisc45 /home/ubuntu/hpt/OpenSelfSup/data/resisc45/all @@ -69,8 +71,8 @@ e.g., ln -s /home/ubuntu/data/resisc45 /home/ubuntu/hpt/OpenSelfSup/data/resisc4 ### Download Pretrained Models ``` shell -cd OpenSelfSup/data/basetrain_chkpts/ -./download-pretrained-models.sh +mkdir OpenSelfSup/data/basetrain_chkpts +tools/download-pretrained-models.sh ``` ## Verify Install With RESISC DataSet @@ -82,18 +84,22 @@ Check installation by pretraining using mocov2, extracting the model weights, ev ```bash export WANDB_API_KEY= export WANDB_ENTITY=cal-capstone -export WANDB_PROJECT=hpt2 +export WANDB_PROJECT=hpt3 #export WANDB_MODE=dryrun + + + + cd OpenSelfSup # Sanity check with single train and single epoch -CUDA_VISIBLE_DEVICES=1 ./tools/single_train.sh configs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep.py --debug +CUDA_VISIBLE_DEVICES=x ./tools/single_train.sh configs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep.py --debug -CUDA_VISIBLE_DEVICES=1 ./tools/single_train.sh /scratch/crguest/OpenSelfSup/configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep.py --work_dir work_dirs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep/ --debug +CUDA_VISIBLE_DEVICES=x ./tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep.py --work_dir work_dirs/selfsup/moco/r50_v2_sen12ms_in_basetrain_1ep/ --debug # Sanity check: MoCo for 20 epoch on 4 gpus -./tools/dist_train.sh configs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep.py 4 +CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh configs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep.py 4 # if debugging, use tools/train.py configs/selfsup/moco/r50_v2_resisc_in_basetrain_1ep.py --work_dir work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_1ep/ --debug @@ -102,7 +108,7 @@ tools/train.py configs/selfsup/moco/r50_v2_resisc_in_basetrain_1ep.py --work_dir CHECKPOINT=work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20.pth BACKBONE=work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20_moco_in_basetrain.pth # Extract the backbone -python tools/extract_backbone_weights.py ${CHECKPOINT} ${BACKBONE} +python tools/extract_backbone_weights.py ${BACKBONE} ${CHECKPOINT} # Evaluate the representations ./benchmarks/dist_train_linear.sh configs/benchmarks/linear_classification/resisc45/r50_last.py ${BACKBONE} @@ -114,47 +120,3 @@ tensorboard --logdir . 
``` -## Verify Install With SEN12MS Dataset -[OpenSelfSup](https://github.com/Berkeley-Data/OpenSelfSup) - -Check installation by pretraining using mocov2, extracting the model weights, evaluating the representations, and then viewing the results on tensorboard or [wandb](https://wandb.ai/cal-capstone/hpt): - -```bash -export WANDB_API_KEY= -export WANDB_ENTITY=cal-capstone -export WANDB_PROJECT=hpt2 - -cd OpenSelfSup - -# single GPU training -CUDA_VISIBLE_DEVICES=1 ./tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep.py --debug - -CUDA_VISIBLE_DEVICES=1 ./tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_fulltrain_20ep.py --debug - - -# command for remote debugging, use full path -python /scratch/crguest/OpenSelfSup/tools/train.py /scratch/crguest/OpenSelfSup/configs/selfsup/moco/r50_v2_sen12ms_in_fulltrain_20ep.py --debug - -CUDA_VISIBLE_DEVICES=1 python ./tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_fulltrain_20ep.py --debug - -# Sanity check: MoCo for 20 epoch on 4 gpus -#CUDA_VISIBLE_DEVICES=0,1,2,3 -CUDA_VISIBLE_DEVICES=1 ./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep.py 4 - -# distributed training -#CUDA_VISIBLE_DEVICES=0,1,2,3 -./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_fulltrain_20ep.py 4 - -BACKBONE=work_dirs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep/epoch_20_moco_in_baseline.pth -# method 1: from working dir -CHECKPOINT=work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20.pth -# method 2: from W&B, {projectid}/{W&B run id} -CHECKPOINT=hpt2/3l4yg63k - -# Extract the backbone -python tools/extract_backbone_weights.py ${BACKBONE} ${CHECKPOINT} - -# Evaluate the representations -./benchmarks/dist_train_linear.sh configs/benchmarks/linear_classification/resisc45/r50_last.py ${BACKBONE} - -``` \ No newline at end of file From d388be7ccda7000d3ba87d3352959e86b6a65492 Mon Sep 17 00:00:00 2001 From: taeil Date: Wed, 24 Mar 2021 21:35:48 -0700 Subject: [PATCH 02/14] Add OpenSelfSup as sub module --- .gitmodules | 3 +++ OpenSelfSup | 1 + 2 files changed, 4 insertions(+) create mode 160000 OpenSelfSup diff --git a/.gitmodules b/.gitmodules index 065358c..7bf3595 100644 --- a/.gitmodules +++ b/.gitmodules @@ -1,3 +1,6 @@ [submodule "SEN12MS"] path = SEN12MS url = git@github.com:Berkeley-Data/SEN12MS.git +[submodule "OpenSelfSup"] + path = OpenSelfSup + url = https://github.com/Berkeley-Data/OpenSelfSup.git diff --git a/OpenSelfSup b/OpenSelfSup new file mode 160000 index 0000000..2fcf673 --- /dev/null +++ b/OpenSelfSup @@ -0,0 +1 @@ +Subproject commit 2fcf673f0f133f6ab966d018e17cc5ddc4edf80f From aea59e56ae79a95913c051caaf4d2914be3ee93d Mon Sep 17 00:00:00 2001 From: taeil Date: Wed, 24 Mar 2021 22:54:44 -0700 Subject: [PATCH 03/14] updated submodule url --- .gitmodules | 3 ++- SEN12MS | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/.gitmodules b/.gitmodules index 7bf3595..742d048 100644 --- a/.gitmodules +++ b/.gitmodules @@ -1,6 +1,7 @@ [submodule "SEN12MS"] path = SEN12MS - url = git@github.com:Berkeley-Data/SEN12MS.git + url = https://github.com/Berkeley-Data/SEN12MS.git + branch = taeil [submodule "OpenSelfSup"] path = OpenSelfSup url = https://github.com/Berkeley-Data/OpenSelfSup.git diff --git a/SEN12MS b/SEN12MS index feb3e96..42a01d2 160000 --- a/SEN12MS +++ b/SEN12MS @@ -1 +1 @@ -Subproject commit feb3e9602c430d53d388e496fbb22b8ff76f68db +Subproject commit 42a01d232d46614d4789215b349b81237f4078cd From 8dd800e7de7342c3d94ba4b102150cea744bbb5a Mon Sep 
17 00:00:00 2001 From: taeil Date: Fri, 26 Mar 2021 22:03:39 -0700 Subject: [PATCH 04/14] added instructions --- references/evaluation.md | 32 +++++++++++++++++++++++++++++++ references/model_architectures.md | 5 +++++ 2 files changed, 37 insertions(+) create mode 100644 references/evaluation.md create mode 100644 references/model_architectures.md diff --git a/references/evaluation.md b/references/evaluation.md new file mode 100644 index 0000000..7dce8dd --- /dev/null +++ b/references/evaluation.md @@ -0,0 +1,32 @@ +## download pre-trained models + +Some of key pre-trained models are on s3 (s3://sen12ms/pretrained): +- [200 epochs w/o augmentation: vivid-resonance-73](https://wandb.ai/cjrd/BDOpenSelfSup-tools/runs/3qjvxo2p/overview?workspace=user-cjrd) +- [20 epochs w/o augmentation: silvery-oath7-2rr3864e](https://wandb.ai/cal-capstone/hpt2/runs/2rr3864e?workspace=user-taeil) +- [sen12ms-baseline: soft-snowflake-3.pth](https://wandb.ai/cal-capstone/SEN12MS/runs/3gjhe4ff/overview?workspace=user-taeil) + +``` +aws configure +aws s3 sync s3://sen12ms/pretrained . --dryrun +aws s3 sync s3://sen12ms/pretrained_sup . --dryrun +``` + +Any other models can be restored by run ID if stored with W&B. Go to files section under the run to find `*.pth` files + +#### Extract Backbone +``` +BACKBONE=work_dirs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep/epoch_20_moco_in_baseline.pth +# method 1: from working dir +CHECKPOINT=work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20.pth +# method 2: from W&B, {projectid}/{W&B run id} +CHECKPOINT=hpt2/3l4yg63k + +# Extract the backbone +python tools/extract_backbone_weights.py ${BACKBONE} ${CHECKPOINT} +``` + +#### Evaluate the representations :confused: :question: + +``` +./benchmarks/dist_train_linear.sh configs/benchmarks/linear_classification/resisc45/r50_last.py ${BACKBONE} +``` \ No newline at end of file diff --git a/references/model_architectures.md b/references/model_architectures.md new file mode 100644 index 0000000..da5b7d4 --- /dev/null +++ b/references/model_architectures.md @@ -0,0 +1,5 @@ +#### Key model architectures and terms: +- ResNet50_1x1: adding conv1x1 to original ResNet50 used by sen12ms +- ResNet50: original ResNet50 used by sen12ms +- Moco: original ResNet50 initialized the weight by Moco backbone +- Moco_1x1: ResNet50_1x1 initialized the weight by Moco backbone and input module \ No newline at end of file From 553a2fd38fd3d99c2511d79de8fcc2dfd586e1d2 Mon Sep 17 00:00:00 2001 From: taeil Date: Sat, 27 Mar 2021 09:48:38 -0700 Subject: [PATCH 05/14] updated instructions --- SEN12MS | 2 +- references/setup.md | 28 ++++++++++++++++++++++++++++ 2 files changed, 29 insertions(+), 1 deletion(-) diff --git a/SEN12MS b/SEN12MS index 42a01d2..80ce4ca 160000 --- a/SEN12MS +++ b/SEN12MS @@ -1 +1 @@ -Subproject commit 42a01d232d46614d4789215b349b81237f4078cd +Subproject commit 80ce4ca236e442b31d83d1e977c9f96509fcd782 diff --git a/references/setup.md b/references/setup.md index e030be0..567bd0a 100644 --- a/references/setup.md +++ b/references/setup.md @@ -120,3 +120,31 @@ tensorboard --logdir . 
``` +## setup sub-modules for sen12ms and openselfsup repo + +Cloning +```console +git clone --recurse-submodules https://github.com/Berkeley-Data/hpt.git + +``` + +or alternatiely +``` +git submodule init +git submodule update +``` + +additional config +``` +git config push.recurseSubmodules on-demand +# show status including submodule +git config status.submodulesummary 1 +``` + +update +``` +git submodule update --remote +``` + +For mroe info: [7.11 Git Tools - Submodules](https://git-scm.com/book/en/v2/Git-Tools-Submodules) + \ No newline at end of file From 9372a6073da726f3f3eb5190fb8841ddb6f42071 Mon Sep 17 00:00:00 2001 From: Taeil Goh Date: Sat, 27 Mar 2021 10:01:28 -0700 Subject: [PATCH 06/14] Update model_architectures.md --- references/model_architectures.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/references/model_architectures.md b/references/model_architectures.md index da5b7d4..464bede 100644 --- a/references/model_architectures.md +++ b/references/model_architectures.md @@ -2,4 +2,5 @@ - ResNet50_1x1: adding conv1x1 to original ResNet50 used by sen12ms - ResNet50: original ResNet50 used by sen12ms - Moco: original ResNet50 initialized the weight by Moco backbone -- Moco_1x1: ResNet50_1x1 initialized the weight by Moco backbone and input module \ No newline at end of file +- Moco_1x1: ResNet50_1x1 initialized the weight by Moco backbone and input module +- Moco_1x1random: ResNet50_1x1 randomly the weight by Moco backbone and input module From 21e6e88936f6da6e1ceece3dfe3d49e383bdbab4 Mon Sep 17 00:00:00 2001 From: Taeil Goh Date: Sat, 27 Mar 2021 16:09:06 -0700 Subject: [PATCH 07/14] Update model_architectures.md --- references/model_architectures.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/references/model_architectures.md b/references/model_architectures.md index 464bede..b217473 100644 --- a/references/model_architectures.md +++ b/references/model_architectures.md @@ -1,6 +1,6 @@ #### Key model architectures and terms: -- ResNet50_1x1: adding conv1x1 to original ResNet50 used by sen12ms -- ResNet50: original ResNet50 used by sen12ms -- Moco: original ResNet50 initialized the weight by Moco backbone -- Moco_1x1: ResNet50_1x1 initialized the weight by Moco backbone and input module -- Moco_1x1random: ResNet50_1x1 randomly the weight by Moco backbone and input module +- ResNet50_1x1: adding conv1x1 to original ResNet50 used by sen12ms (supervised training) +- ResNet50: original ResNet50 used by sen12ms (supervised training) +- Moco: original ResNet50 initialized the weight by Moco backbone (transfer learning) +- Moco_1x1: ResNet50_1x1 initialized the weight by Moco backbone and input module (transfer learning) +- Moco_1x1random: ResNet50_1x1 randomly the weight by Moco backbone and input module (transfer learning) From ad6b17e65920e8f94c0fbd020544f9d1b2f3e75c Mon Sep 17 00:00:00 2001 From: taeil Date: Sat, 3 Apr 2021 19:10:27 -0700 Subject: [PATCH 08/14] updated instructions --- OpenSelfSup | 2 +- README.md | 21 ++++- SEN12MS | 2 +- references/evaluation.md | 7 +- references/model_architectures.md | 2 +- references/setup.md | 150 ------------------------------ 6 files changed, 22 insertions(+), 162 deletions(-) delete mode 100644 references/setup.md diff --git a/OpenSelfSup b/OpenSelfSup index 2fcf673..c28dd45 160000 --- a/OpenSelfSup +++ b/OpenSelfSup @@ -1 +1 @@ -Subproject commit 2fcf673f0f133f6ab966d018e17cc5ddc4edf80f +Subproject commit c28dd4505e143cf4c89fe737adfb43826e2fc266 diff --git a/README.md 
b/README.md index 3afb1d5..68c22bb 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ This is a research repository for the submission "Self-Supervised Pretraining Improves Self-Supervised Pretraining" -For initial setup, refer to [setup instructions](references/setup.md). +For initial setup, refer to [setup instructions](setup_pretraining.md). ## Setup Weight & Biases Tracking @@ -68,21 +68,32 @@ Set up experimental tracking and model versioning: ```bash export WANDB_API_KEY= export WANDB_ENTITY=cal-capstone -export WANDB_PROJECT=hpt2 +export WANDB_PROJECT=hpt4 ``` Run pre-training ```bash cd OpenSelfSup +# set which GPUs to use +# CUDA_VISIBLE_DEVICES=1 +# CUDA_VISIBLE_DEVICES=0,1,2,3 + # (sanity check) Single GPU training on samll dataset -CUDA_VISIBLE_DEVICES=1 ./tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep.py --debug +/tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_aug_20ep.py --debug + +# (sanity check) Single GPU training on samll dataset on sen12ms fusion +./tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_12ch_in_smoketrain_aug_2ep.py --debug + # (sanity check) 4 GPUs training on samll dataset -CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep.py 4 +./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_aug_20ep.py 4 + +# (sanity check) 4 GPUs training on samll fusion dataset +./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_12ch_in_smoketrain_aug_2ep.py 4 # distributed full training -CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_fulltrain_20ep.py 4 +/tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_fulltrain_20ep.py 4 ``` Extract pre-trained model diff --git a/SEN12MS b/SEN12MS index 80ce4ca..a8a7760 160000 --- a/SEN12MS +++ b/SEN12MS @@ -1 +1 @@ -Subproject commit 80ce4ca236e442b31d83d1e977c9f96509fcd782 +Subproject commit a8a7760d81f83b015d341bd3f3ebc87741b0658e diff --git a/references/evaluation.md b/references/evaluation.md index 7dce8dd..76b1190 100644 --- a/references/evaluation.md +++ b/references/evaluation.md @@ -13,16 +13,15 @@ aws s3 sync s3://sen12ms/pretrained_sup . --dryrun Any other models can be restored by run ID if stored with W&B. 
Go to files section under the run to find `*.pth` files -#### Extract Backbone +#### Extract and Convert Backbone ``` -BACKBONE=work_dirs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep/epoch_20_moco_in_baseline.pth # method 1: from working dir CHECKPOINT=work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20.pth # method 2: from W&B, {projectid}/{W&B run id} -CHECKPOINT=hpt2/3l4yg63k +CHECKPOINT=hpt3/2brjqb28 # Extract the backbone -python tools/extract_backbone_weights.py ${BACKBONE} ${CHECKPOINT} +python classification/models/convert_moco_to_resnet50.py -i hpt3/2brjqb28 -o pretrained/moco ``` #### Evaluate the representations :confused: :question: diff --git a/references/model_architectures.md b/references/model_architectures.md index 464bede..ab54a31 100644 --- a/references/model_architectures.md +++ b/references/model_architectures.md @@ -3,4 +3,4 @@ - ResNet50: original ResNet50 used by sen12ms - Moco: original ResNet50 initialized the weight by Moco backbone - Moco_1x1: ResNet50_1x1 initialized the weight by Moco backbone and input module -- Moco_1x1random: ResNet50_1x1 randomly the weight by Moco backbone and input module +- Moco_1x1RND: ResNet50_1x1 randomly the weight by Moco backbone and input module diff --git a/references/setup.md b/references/setup.md deleted file mode 100644 index 567bd0a..0000000 --- a/references/setup.md +++ /dev/null @@ -1,150 +0,0 @@ - - -## (optional) GPU instance - -Use `Deep Learning AMI (Ubuntu 18.04) Version 40.0` AMI -- on us-west-2, ami-084f81625fbc98fa4 -- additional disk may be required for data - -Once logged in -``` -# update conda to the latest -conda update -n base conda - -conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch - -``` - -## Installation - -**Dependency repo** -- [modified OpenSelfSup](https://github.com/Berkeley-Data/OpenSelfSup) -- [modified SEN12MS](https://github.com/Berkeley-Data/SEN12MS) -- [modified irrigation_detection](https://github.com/Berkeley-Data/irrigation_detection) - -```bash -# clone dependency repo on the same levels as this repo and cd into this repo - -# setup environment -conda create -n hpt python=3.7 ipython -conda activate hpt - -# NOTE: if you are not using CUDA 10.2, you need to change the 10.2 in this command appropriately. Make sure to use torch 1.6.0 -# (check CUDA version with e.g. `cat /usr/local/cuda/version.txt`) - -# latest torch -conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch - -# 1.6 torch (no support for torchvision transform on tensor) -conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch - -#colorado machine -conda install pytorch==1.2.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch - -# install local submodules -cd OpenSelfSup -pip install -v -e . -``` - -## Data installation - -Installing and setting up all 16 datsets is a bit of work, so this tutorial shows how to install and setup RESISC-45, and provides links to repeat those steps with other datasets. - -### RESISC-45 -RESISC-45 contains 31,500 aerial images, covering 45 scene classes with 700 images in each class. 
- -``` shell -# cd to the directory where you want the data, $DATA -wget -q https://bit.ly/3pfkHYp -O resisc45.tar.gz -md5sum resisc45.tar.gz # this should be 964dafcfa2dff0402d0772514fb4540b -tar xf resisc45.tar.gz - -mkdir ~/data -mv resisc45 ~/data - -# replace/set $DATA and $CODE as appropriate -# e.g., ln -s /home/ubuntu/data/resisc45 /home/ubuntu/OpenSelfSup/data/resisc45/all -ln -s $DATA/resisc45 $CODE/OpenSelfSup/data/resisc45/all - -e.g., ln -s /home/ubuntu/data/resisc45 /home/ubuntu/hpt/OpenSelfSup/data/resisc45/all -``` - -### Download Pretrained Models -``` shell -mkdir OpenSelfSup/data/basetrain_chkpts -tools/download-pretrained-models.sh -``` - -## Verify Install With RESISC DataSet -[OpenSelfSup](https://github.com/Berkeley-Data/OpenSelfSup) - -Check installation by pretraining using mocov2, extracting the model weights, evaluating the representations, and then viewing the results on tensorboard or [wandb](https://wandb.ai/cal-capstone/hpt): - - -```bash -export WANDB_API_KEY= -export WANDB_ENTITY=cal-capstone -export WANDB_PROJECT=hpt3 -#export WANDB_MODE=dryrun - - - - - -cd OpenSelfSup - -# Sanity check with single train and single epoch -CUDA_VISIBLE_DEVICES=x ./tools/single_train.sh configs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep.py --debug - -CUDA_VISIBLE_DEVICES=x ./tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep.py --work_dir work_dirs/selfsup/moco/r50_v2_sen12ms_in_basetrain_1ep/ --debug - -# Sanity check: MoCo for 20 epoch on 4 gpus -CUDA_VISIBLE_DEVICES=0,1,2,3 ./tools/dist_train.sh configs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep.py 4 - -# if debugging, use -tools/train.py configs/selfsup/moco/r50_v2_resisc_in_basetrain_1ep.py --work_dir work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_1ep/ --debug - -# make some variables so its clear what's happening -CHECKPOINT=work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20.pth -BACKBONE=work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20_moco_in_basetrain.pth -# Extract the backbone -python tools/extract_backbone_weights.py ${BACKBONE} ${CHECKPOINT} - -# Evaluate the representations -./benchmarks/dist_train_linear.sh configs/benchmarks/linear_classification/resisc45/r50_last.py ${BACKBONE} - -# View the results (optional if wandb is not configured) -cd work_dirs -# you may need to install tensorboard -tensorboard --logdir . 
-``` - - -## setup sub-modules for sen12ms and openselfsup repo - -Cloning -```console -git clone --recurse-submodules https://github.com/Berkeley-Data/hpt.git - -``` - -or alternatiely -``` -git submodule init -git submodule update -``` - -additional config -``` -git config push.recurseSubmodules on-demand -# show status including submodule -git config status.submodulesummary 1 -``` - -update -``` -git submodule update --remote -``` - -For mroe info: [7.11 Git Tools - Submodules](https://git-scm.com/book/en/v2/Git-Tools-Submodules) - \ No newline at end of file From 5980c8d9f614ca123e88f686613303507a5fd64f Mon Sep 17 00:00:00 2001 From: taeil Date: Sat, 3 Apr 2021 19:19:07 -0700 Subject: [PATCH 09/14] updated references --- references/setup_pretraining.md | 125 ++++++++++++++++++++++++++++++++ 1 file changed, 125 insertions(+) create mode 100644 references/setup_pretraining.md diff --git a/references/setup_pretraining.md b/references/setup_pretraining.md new file mode 100644 index 0000000..e9e2ae3 --- /dev/null +++ b/references/setup_pretraining.md @@ -0,0 +1,125 @@ + + +## (optional) GPU instance + +Use `Deep Learning AMI (Ubuntu 18.04) Version 40.0` AMI +- on us-west-2, ami-084f81625fbc98fa4 +- additional disk may be required for data + +Once logged in +``` +# update conda to the latest +conda update -n base conda + +conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch + +``` + +## Installation + +**Dependency repo** +- [modified OpenSelfSup](https://github.com/Berkeley-Data/OpenSelfSup) +- [modified SEN12MS](https://github.com/Berkeley-Data/SEN12MS) +- [modified irrigation_detection](https://github.com/Berkeley-Data/irrigation_detection) + +```bash +# clone dependency repo on the same levels as this repo and cd into this repo + +# setup environment +conda create -n hpt python=3.7 ipython +conda activate hpt + +# NOTE: if you are not using CUDA 10.2, you need to change the 10.2 in this command appropriately. Make sure to use torch 1.6.0 +# (check CUDA version with e.g. `cat /usr/local/cuda/version.txt`) + +# latest torch +conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch + +# 1.6 torch (no support for torchvision transform on tensor) +conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch + +#llano machine +conda install pytorch==1.2.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch + +# install local submodules +cd OpenSelfSup +pip install -v -e . +``` + +## Data installation + +Installing and setting up all 16 datsets is a bit of work, so this tutorial shows how to install and setup RESISC-45, and provides links to repeat those steps with other datasets. + +### RESISC-45 +RESISC-45 contains 31,500 aerial images, covering 45 scene classes with 700 images in each class. 
+ +``` shell +# cd to the directory where you want the data, $DATA +wget -q https://bit.ly/3pfkHYp -O resisc45.tar.gz +md5sum resisc45.tar.gz # this should be 964dafcfa2dff0402d0772514fb4540b +tar xf resisc45.tar.gz + +mkdir ~/data +mv resisc45 ~/data + +# replace/set $DATA and $CODE as appropriate +# e.g., ln -s /home/ubuntu/data/resisc45 /home/ubuntu/OpenSelfSup/data/resisc45/all +ln -s $DATA/resisc45 $CODE/OpenSelfSup/data/resisc45/all + +e.g., ln -s /home/ubuntu/data/resisc45 /home/ubuntu/hpt/OpenSelfSup/data/resisc45/all +``` + +### Download Pretrained Models +``` shell +tools/download-pretrained-models.sh +mkdir OpenSelfSup/data/basetrain_chkpts +mv +``` + +## Verify Install With RESISC DataSet +[OpenSelfSup](https://github.com/Berkeley-Data/OpenSelfSup) + +Check installation by pretraining using mocov2, extracting the model weights, evaluating the representations, and then viewing the results on tensorboard or [wandb](https://wandb.ai/cal-capstone/hpt): + + +```bash +cd OpenSelfSup + +CUDA_VISIBLE_DEVICES=0,1,2,3 + +# Sanity check with single train and single epoch +./tools/single_train.sh configs/selfsup/moco/r50_v2_resisc_in_basetrain_1ep.py --debug + +# Sanity check: MoCo for 20 epoch on 4 gpus + ./tools/dist_train.sh configs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep.py 4 +``` + + +## setup sub-modules for sen12ms and openselfsup repo + +Cloning +```console +git clone --recurse-submodules https://github.com/Berkeley-Data/hpt.git + +``` + +or alternatiely +``` +git submodule init +git submodule update +``` + +additional config +``` +git config push.recurseSubmodules on-demand +# show status including submodule +git config status.submodulesummary 1 +``` + +update +``` +git submodule update --remote +``` + +For mroe info: [7.11 Git Tools - Submodules](https://git-scm.com/book/en/v2/Git-Tools-Submodules) + \ No newline at end of file From 55d312ecc6371e5d3b5f5f3f9ada00de286326b7 Mon Sep 17 00:00:00 2001 From: taeil Date: Sat, 3 Apr 2021 19:26:30 -0700 Subject: [PATCH 10/14] updated references --- README.md | 19 +++++++++++++++++-- references/evaluation.md | 31 ------------------------------- 2 files changed, 17 insertions(+), 33 deletions(-) delete mode 100644 references/evaluation.md diff --git a/README.md b/README.md index 68c22bb..c6d9322 100644 --- a/README.md +++ b/README.md @@ -71,7 +71,7 @@ export WANDB_ENTITY=cal-capstone export WANDB_PROJECT=hpt4 ``` -Run pre-training +#### Run pre-training ```bash cd OpenSelfSup @@ -96,7 +96,22 @@ cd OpenSelfSup /tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_fulltrain_20ep.py 4 ``` -Extract pre-trained model +#### (OPTIONAL) download pre-trained models + +Some of key pre-trained models are on s3 (s3://sen12ms/pretrained): +- [200 epochs w/o augmentation: vivid-resonance-73](https://wandb.ai/cjrd/BDOpenSelfSup-tools/runs/3qjvxo2p/overview?workspace=user-cjrd) +- [20 epochs w/o augmentation: silvery-oath7-2rr3864e](https://wandb.ai/cal-capstone/hpt2/runs/2rr3864e?workspace=user-taeil) +- [sen12ms-baseline: soft-snowflake-3.pth](https://wandb.ai/cal-capstone/SEN12MS/runs/3gjhe4ff/overview?workspace=user-taeil) + +``` +aws configure +aws s3 sync s3://sen12ms/pretrained . --dryrun +aws s3 sync s3://sen12ms/pretrained_sup . --dryrun +``` + +#### Extract pre-trained model +Any other models can be restored by run ID if stored with W&B. 
Go to files section under the run to find `*.pth` files + ```bash BACKBONE=work_dirs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep/epoch_20_moco_in_baseline.pth diff --git a/references/evaluation.md b/references/evaluation.md deleted file mode 100644 index 76b1190..0000000 --- a/references/evaluation.md +++ /dev/null @@ -1,31 +0,0 @@ -## download pre-trained models - -Some of key pre-trained models are on s3 (s3://sen12ms/pretrained): -- [200 epochs w/o augmentation: vivid-resonance-73](https://wandb.ai/cjrd/BDOpenSelfSup-tools/runs/3qjvxo2p/overview?workspace=user-cjrd) -- [20 epochs w/o augmentation: silvery-oath7-2rr3864e](https://wandb.ai/cal-capstone/hpt2/runs/2rr3864e?workspace=user-taeil) -- [sen12ms-baseline: soft-snowflake-3.pth](https://wandb.ai/cal-capstone/SEN12MS/runs/3gjhe4ff/overview?workspace=user-taeil) - -``` -aws configure -aws s3 sync s3://sen12ms/pretrained . --dryrun -aws s3 sync s3://sen12ms/pretrained_sup . --dryrun -``` - -Any other models can be restored by run ID if stored with W&B. Go to files section under the run to find `*.pth` files - -#### Extract and Convert Backbone -``` -# method 1: from working dir -CHECKPOINT=work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20.pth -# method 2: from W&B, {projectid}/{W&B run id} -CHECKPOINT=hpt3/2brjqb28 - -# Extract the backbone -python classification/models/convert_moco_to_resnet50.py -i hpt3/2brjqb28 -o pretrained/moco -``` - -#### Evaluate the representations :confused: :question: - -``` -./benchmarks/dist_train_linear.sh configs/benchmarks/linear_classification/resisc45/r50_last.py ${BACKBONE} -``` \ No newline at end of file From 92c6232059d6da8865830c1585f87a3813129128 Mon Sep 17 00:00:00 2001 From: taeil Date: Sat, 10 Apr 2021 23:08:51 -0700 Subject: [PATCH 11/14] added project file for hpt --- .idea/.gitignore | 8 +++++ .idea/deployment.xml | 22 ++++++++++++++ .idea/hpt.iml | 19 ++++++++++++ .../inspectionProfiles/profiles_settings.xml | 6 ++++ .idea/misc.xml | 7 +++++ .idea/modules.xml | 8 +++++ .idea/other.xml | 6 ++++ .idea/runConfigurations/single_train.xml | 30 +++++++++++++++++++ .idea/vcs.xml | 8 +++++ OpenSelfSup | 2 +- README.md | 7 ++--- SEN12MS | 2 +- references/model_architectures.md | 18 +++++++++++ 13 files changed, 137 insertions(+), 6 deletions(-) create mode 100644 .idea/.gitignore create mode 100644 .idea/deployment.xml create mode 100644 .idea/hpt.iml create mode 100644 .idea/inspectionProfiles/profiles_settings.xml create mode 100644 .idea/misc.xml create mode 100644 .idea/modules.xml create mode 100644 .idea/other.xml create mode 100644 .idea/runConfigurations/single_train.xml create mode 100644 .idea/vcs.xml diff --git a/.idea/.gitignore b/.idea/.gitignore new file mode 100644 index 0000000..73f69e0 --- /dev/null +++ b/.idea/.gitignore @@ -0,0 +1,8 @@ +# Default ignored files +/shelf/ +/workspace.xml +# Datasource local storage ignored files +/dataSources/ +/dataSources.local.xml +# Editor-based HTTP Client requests +/httpRequests/ diff --git a/.idea/deployment.xml b/.idea/deployment.xml new file mode 100644 index 0000000..6a6d49b --- /dev/null +++ b/.idea/deployment.xml @@ -0,0 +1,22 @@ + + + + + + + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/.idea/hpt.iml b/.idea/hpt.iml new file mode 100644 index 0000000..3e2e3fe --- /dev/null +++ b/.idea/hpt.iml @@ -0,0 +1,19 @@ + + + + + + + + + + + + + \ No newline at end of file diff --git a/.idea/inspectionProfiles/profiles_settings.xml b/.idea/inspectionProfiles/profiles_settings.xml new file mode 100644 index 
0000000..105ce2d --- /dev/null +++ b/.idea/inspectionProfiles/profiles_settings.xml @@ -0,0 +1,6 @@ + + + + \ No newline at end of file diff --git a/.idea/misc.xml b/.idea/misc.xml new file mode 100644 index 0000000..8598883 --- /dev/null +++ b/.idea/misc.xml @@ -0,0 +1,7 @@ + + + + + + + \ No newline at end of file diff --git a/.idea/modules.xml b/.idea/modules.xml new file mode 100644 index 0000000..b0929b6 --- /dev/null +++ b/.idea/modules.xml @@ -0,0 +1,8 @@ + + + + + + + + \ No newline at end of file diff --git a/.idea/other.xml b/.idea/other.xml new file mode 100644 index 0000000..a708ec7 --- /dev/null +++ b/.idea/other.xml @@ -0,0 +1,6 @@ + + + + + \ No newline at end of file diff --git a/.idea/runConfigurations/single_train.xml b/.idea/runConfigurations/single_train.xml new file mode 100644 index 0000000..c310ab9 --- /dev/null +++ b/.idea/runConfigurations/single_train.xml @@ -0,0 +1,30 @@ + + + + + \ No newline at end of file diff --git a/.idea/vcs.xml b/.idea/vcs.xml new file mode 100644 index 0000000..b120a90 --- /dev/null +++ b/.idea/vcs.xml @@ -0,0 +1,8 @@ + + + + + + + + \ No newline at end of file diff --git a/OpenSelfSup b/OpenSelfSup index c28dd45..add4f80 160000 --- a/OpenSelfSup +++ b/OpenSelfSup @@ -1 +1 @@ -Subproject commit c28dd4505e143cf4c89fe737adfb43826e2fc266 +Subproject commit add4f806fa4ca9128975e9688d81e621def47334 diff --git a/README.md b/README.md index c6d9322..c43f900 100644 --- a/README.md +++ b/README.md @@ -9,7 +9,7 @@ For initial setup, refer to [setup instructions](setup_pretraining.md). ```bash export WANDB_API_KEY= export WANDB_ENTITY=cal-capstone -export WANDB_PROJECT=hpt +export WANDB_PROJECT=scene_classification #export WANDB_MODE=dryrun ``` @@ -83,14 +83,13 @@ cd OpenSelfSup /tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_aug_20ep.py --debug # (sanity check) Single GPU training on samll dataset on sen12ms fusion -./tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_12ch_in_smoketrain_aug_2ep.py --debug - +./tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_fusion_in_smoke_aug.py --debug # (sanity check) 4 GPUs training on samll dataset ./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_aug_20ep.py 4 # (sanity check) 4 GPUs training on samll fusion dataset -./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_12ch_in_smoketrain_aug_2ep.py 4 +./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_fusion_in_smoke_aug.py 4 # distributed full training /tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_fulltrain_20ep.py 4 diff --git a/SEN12MS b/SEN12MS index a8a7760..f824eeb 160000 --- a/SEN12MS +++ b/SEN12MS @@ -1 +1 @@ -Subproject commit a8a7760d81f83b015d341bd3f3ebc87741b0658e +Subproject commit f824eebb94e6e08b211cf963d6dc768c82c3bd08 diff --git a/references/model_architectures.md b/references/model_architectures.md index c687de9..1300689 100644 --- a/references/model_architectures.md +++ b/references/model_architectures.md @@ -10,3 +10,21 @@ - Moco_1x1Rnd: adding conv1x1 block to the ResNet50 used by Sen12ms. ResNet50 layers are initialized with the weight from Moco but input module is initialized with random weights - Finetune v2 (1k dataset) - freezing ResNet50 fully or partially does not seem to help with accuracy. We will continue explore and share the results once we are sure there is no issue with implementation. 
+ + +#### Key pretrained models + +Some pretrained models: + +**Sensor Augmentation** +- [vivid-resonance-73](https://wandb.ai/cjrd/BDOpenSelfSup-tools/runs/3qjvxo2p) +- [silvery-oath-7](https://wandb.ai/cal-capstone/hpt2/runs/2rr3864e) +- sen12_crossaugment_epoch_1000.pth: 1000 epocs + +**Data Fusion - Augmentation Set 2** +- [(optional fusion) crimson-pyramid-70](https://wandb.ai/cal-capstone/hpt4/runs/2iu8yfs6): 200 epochs +- [(partial fusion) laced-water-61](https://wandb.ai/cal-capstone/hpt4/runs/367tz8vs) 200 epochs, 32K +- [(partial fusion) visionary-lake-62](https://wandb.ai/cal-capstone/hpt4/runs/1srlc7jr/overview?workspace=user-taeil) should deprecate. different number of epochs from other pretrained models +- [(full fusion) electric-mountain-33](https://wandb.ai/cal-capstone/hpt4/runs/ak0xdbfu) + +**Data Fusion - Augmentation Set 1** \ No newline at end of file From 081acb50d6c69151898ed4479b75dfeed431a9bf Mon Sep 17 00:00:00 2001 From: taeil Date: Tue, 27 Apr 2021 22:50:02 -0700 Subject: [PATCH 12/14] update repo --- .idea/deployment.xml | 2 +- OpenSelfSup | 2 +- README.md | 418 +++++++++++------------------- SEN12MS | 2 +- metrics.md | 125 +++++++++ paper_draft.md | 193 -------------- references/hpt_repo.md | 351 +++++++++++++++++++++++++ references/model_architectures.md | 23 +- 8 files changed, 644 insertions(+), 472 deletions(-) create mode 100644 metrics.md delete mode 100644 paper_draft.md create mode 100644 references/hpt_repo.md diff --git a/.idea/deployment.xml b/.idea/deployment.xml index 6a6d49b..fde1520 100644 --- a/.idea/deployment.xml +++ b/.idea/deployment.xml @@ -12,7 +12,7 @@ - + diff --git a/OpenSelfSup b/OpenSelfSup index add4f80..6573f31 160000 --- a/OpenSelfSup +++ b/OpenSelfSup @@ -1 +1 @@ -Subproject commit add4f806fa4ca9128975e9688d81e621def47334 +Subproject commit 6573f31a4f12a3a5a4c87a40624c9ed339049028 diff --git a/README.md b/README.md index c43f900..dce761f 100644 --- a/README.md +++ b/README.md @@ -1,351 +1,225 @@ -# Hierarchical Pretraining: Research Repository +## Abstract -This is a research repository for the submission "Self-Supervised Pretraining Improves Self-Supervised Pretraining" +We present a sensor-based location invariance momentum contrast for unsupervised visual representation learning in remote sensing application, where unlabeled data is well-known challenges to deep learning domain and accurate training data remains comparably scarce. In this study, we first introduce the use of SEN12MS datasets, a curated large-scale training data that include versatile remote sensing information from different sensors with global scene distributions. To continually bridge the gap between supervised and unsupervised learning on computer vision tasks in remote sensing applications, we exploit the geo-alignment data structure from SEN12MS to propose two training methods. One set is to construct sensor-based geo-alignment positive pairs in contrastive learning to design the natural augmentation. Another set is fusing data from different sensors with the objective of learning better representations. [last sentence subject to changes] Our experiments show that the proposed method outperforms the supervised learning counterpart when transferring to downstream tasks in scene classification for remote sensing data. -For initial setup, refer to [setup instructions](setup_pretraining.md). -## Setup Weight & Biases Tracking +## Introduction +The performance of deep convolutional neural networks depends on their capability and the amount of training data. 
The datasets are becoming larger in every domain and different kinds of network architectures like [VGG](https://arxiv.org/pdf/1409.1556.pdf), [GoogLeNet](https://arxiv.org/pdf/1409.4842.pdf), [ResNet](https://arxiv.org/pdf/1512.03385.pdf), [DenseNet](https://arxiv.org/pdf/1608.06993.pdf), etc., increased network models' capacity. -```bash -export WANDB_API_KEY= -export WANDB_ENTITY=cal-capstone -export WANDB_PROJECT=scene_classification -#export WANDB_MODE=dryrun -``` +However, the collection and annotation of large-scale datasets are time-consuming and expensive. Many self-supervised methods were proposed to learn visual features from large-scale unlabeled data without using any human annotations to avoid time-consuming and costly data annotations. Contrastive learning of visual representations has emerged as the front-runner for self-supervision and has demonstrated superior performance on downstream tasks. All contrastive learning frameworks involve maximizing agreement between positive image pairs relative to negative/different images via a contrastive loss function; this pretraining paradigm forces the model to learn good representations. These approaches typically differ in how they generate positive and negative image pairs from unlabeled data and how the data are sampled during pretraining. -## Base Training +Self-supervised approaches such as Momentum Contrast (MoCo) ([He et al., 2019](https://arxiv.org/pdf/1911.05722.pdf); [Chen et al.,2020](https://arxiv.org/pdf/2003.04297.pdf)) can leverage unlabeled data to produce pre-trained models for subsequent fine-tuning on labeled data. In addition to MoCo, these include frameworks such as SimCLR ([Chen et al., 2020](https://arxiv.org/pdf/2002.05709.pdf)) and PIRL ([Misra and Maaten, 2020](https://openaccess.thecvf.com/content_CVPR_2020/papers/Misra_Self-Supervised_Learning_of_Pretext-Invariant_Representations_CVPR_2020_paper.pdf)). -[OpenSelfSup](https://github.com/Berkeley-Data/OpenSelfSup) +Remote sensing data has become broadly available at the petabyte scale, offering unprecedented visibility into natural and human activity across the Earth. In remote sensing, labeled data is usually scarce and hard to obtain. Due to the success of self-supervised learning methods, we explore their application to large-scale remote sensing datasets. -Right now we assume ImageNet base trained models. -```bash -cd OpenSelfSup/data/basetrain_chkpts/ -./download-pretrained-models.sh -``` +While most self-supervised image analysis techniques focus on natural imagery, remote sensing differs in several critical ways. Natural imagery often has one subject; remote sensing images contain numerous objects such as buildings, trees, roads, rivers, etc. Additionally, the important content changes unpredictably within just a few pixels or between images at the same location from different times. Multiple satellites capture images of the same locations on earth with a wide variety of resolutions, spectral bands (channels), and revisit rates, such that any specific problem can require a different +combination of sensor inputs([Reiche et al., 2018](https://doi.org/10.1016/j.rse.2017.10.034),[Rustowicz et al., 2019](https://openaccess.thecvf.com/content_CVPRW_2019/papers/cv4gc/Rustowicz_Semantic_Segmentation_of_Crop_Type_in_Africa_A_Novel_Dataset_CVPRW_2019_paper.pdf)). 
While MoCo and other contrastive learning methods have demonstrated promising results on natural image classification tasks, their application to remote sensing has been limited.

Traditional contrastive learning uses augmentation to generate positive pairs. Inspired by recent success (Geo-aware Paper) in using natural augmentation to create positive pairs, we propose to use positive pairs from different sensors observing the same location.

In this work, we demonstrate that pre-training [MoCo-v2](https://openaccess.thecvf.com/content_CVPR_2020/papers/He_Momentum_Contrast_for_Unsupervised_Visual_Representation_Learning_CVPR_2020_paper.pdf) on data from multiple sensors leads to improved representations for remote sensing applications.
![](web/images/architectures_1_and_2.png)

## Related Work
#### Self-supervised contrastive learning
Many self-supervised learning methods learn visual features without using any human-annotated labels. Whereas supervised learning requires a data pair Xi and Yi in which Yi is annotated by human labelers, self-supervised learning trains on data Xi together with a pseudo label Pi that is generated automatically for a pre-defined pretext task without any human annotation. The pseudo label Pi can be generated from attributes of the images or videos, such as image context, or by traditional hand-designed methods. As long as the pseudo labels P are produced without human annotation, the method qualifies as self-supervised learning. Recently, self-supervised learning methods have achieved great progress.

Self-supervised contrastive learning approaches such as [MoCo](https://arxiv.org/pdf/1911.05722.pdf), [MoCo-v2](https://arxiv.org/pdf/2003.04297.pdf), [SimCLR](https://arxiv.org/pdf/2002.05709.pdf), and [PIRL](https://openaccess.thecvf.com/content_CVPR_2020/papers/Misra_Self-Supervised_Learning_of_Pretext-Invariant_Representations_CVPR_2020_paper.pdf) have demonstrated superior performance and have emerged as the front-runners on various downstream tasks. The intuition behind these methods is to learn representations by pulling positive image pairs from the same instance closer together in latent space while pushing negative pairs from different instances further away. These methods differ in the type of contrastive loss, the generation of positive and negative pairs, and the sampling method.

Contrastive learning of visual representations using MoCo ([**MoCo-v2**](https://arxiv.org/pdf/2003.04297.pdf) - Chen, et al., Facebook AI Research, 2020) has emerged as the front-runner for self-supervision and has demonstrated superior performance on downstream tasks.
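To make the mechanism concrete, below is a minimal PyTorch-style sketch of the two ingredients that distinguish MoCo from the other frameworks listed above: a momentum-updated key encoder and a queue of negative keys. The encoders, queue size, and hyperparameters are illustrative placeholders, not the code used in this repository.

```python
# Minimal sketch of MoCo's momentum update and negative queue (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

query_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
key_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
key_encoder.load_state_dict(query_encoder.state_dict())  # start from identical weights

queue = F.normalize(torch.randn(4096, 128), dim=1)  # queue of encoded negative keys
momentum, temperature = 0.999, 0.2

@torch.no_grad()
def momentum_update():
    # Key encoder slowly trails the query encoder: k <- m * k + (1 - m) * q
    for q_param, k_param in zip(query_encoder.parameters(), key_encoder.parameters()):
        k_param.data.mul_(momentum).add_(q_param.data, alpha=1 - momentum)

def contrastive_loss(view_q, view_k):
    q = F.normalize(query_encoder(view_q), dim=1)       # queries (with gradient)
    with torch.no_grad():
        momentum_update()
        k = F.normalize(key_encoder(view_k), dim=1)      # positive keys (no gradient)
    l_pos = (q * k).sum(dim=1, keepdim=True)             # similarity to the positive key
    l_neg = q @ queue.t()                                 # similarity to queued negatives
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature
    labels = torch.zeros(q.size(0), dtype=torch.long)     # the positive sits at index 0
    return F.cross_entropy(logits, labels)

loss = contrastive_loss(torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32))
```

Queue management (enqueueing the newest keys and discarding the oldest) is omitted here for brevity.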
#### Performance gap in Satellite imagery
There is a performance gap between supervised learning with labels and the self-supervised contrastive learning method [MoCo-v2](https://arxiv.org/pdf/2003.04297.pdf) on remote sensing datasets. For instance, on the Functional Map of the World ([fMoW](https://arxiv.org/abs/1711.07846)) image classification benchmark, there is an 8% gap in top-1 accuracy between supervised and self-supervised methods. By leveraging spatially aligned images over time to construct temporal positive pairs in contrastive learning and by using geo-location in the design of pretext tasks, **[Geography-Aware Self-supervised Learning](https://arxiv.org/pdf/2011.09980.pdf)** (Ayush, et al., Stanford University, 2020) was able to close the gap between self-supervised and supervised learning on image classification, object detection, and semantic segmentation for remote sensing and other geo-tagged image datasets.

In this work, we provide an effective approach for improving representation learning with [MoCo-v2](https://arxiv.org/pdf/2003.04297.pdf) using data from different satellites.

## Problem Definition
Does contrastive pre-training with data from multiple sensors lead to improved representations for remote sensing applications?

We pre-train the contrastive model using unlabeled data from multiple satellites and use that model for downstream remote sensing tasks.

We want to show that treating images of the same location from different satellites as naturally augmented views in MoCo-v2 provides high-quality representations and transferable initializations for satellite imagery interpretation. Despite many differences in data and task properties between natural image classification and satellite imagery interpretation, we aim to demonstrate the benefit of MoCo-v2 pretraining across patches from different satellites and to investigate representation transfer to a target dataset.

### Datasets
- [todo] keep only sen12ms.
+- +To validate our ideas, we did experiments on datasets with different satellite imageries with variations in dataset size, channels, and image ground resolutions. The statistics of these datasets are given below. Readers are requested to see the the supplementary materials for examples and additional details of these datasets. +| Dataset | Satellites | Number of Images | Image Size | Labels | Notes | +|---|---|---|---|---|---| +| [BigEarthNet](https://arxiv.org/pdf/1902.06148.pdf) | Sentinel-2A/B |590,326 patches; 12 Bands | 20x20 to 120x120 | Multiple, 43 Full and 19 Simplified | No overlapping; 10 European Countries | +| [SEN12MS](https://arxiv.org/pdf/1906.07789.pdf) | Sentinel-1A/B; Sentinel-2A/B; MODIS (Terra and Aqua) | 541,986 patches; 180662 triplets (3\*180662); 4, 2 and 13 Bands | 256X256 | Single, 17 Full and 10 Simplified | Partial overlapping | +| [FMoW](https://arxiv.org/abs/1711.07846) | QuickBird-2; GeoEye-1; WorldView-2; WorldView-3 | 1,047,691 patches; 4, 8 and RGB Bands | Variable Over 2500x2500 | Multiple, up to 63; Bounding Box Annotations | Includes False Detection; Variable timestamp overlapping | -#### Run pre-training -```bash -cd OpenSelfSup +##### SEN12MS +The SEN12MS dataset contains 180,662 patch triplets of corresponding Sentinel-1 dual-pol SAR data, Sentinel-2 multi-spectral images, and MODIS-derived land cover maps. The patches are distributed across the land masses of the Earth and spread over all four meteorological seasons. This is reflected by the dataset structure. The captured scenes were tiled into patches of 256 X 256 pixels in size and implemented a stride of 128 pixels, resulting in an overlap between adjacent patches. +Only 3847 patches do not have any overlap with adjacent patches. +Most of the overlap occurs around 25% and 50% of the area with few patches overlapping less than 15% and more than 75%. -# set which GPUs to use -# CUDA_VISIBLE_DEVICES=1 -# CUDA_VISIBLE_DEVICES=0,1,2,3 + All patches are provided in the form of 16-bit GeoTiffs containing the following specific information: +* Sentinel-1 SAR: 2 channels corresponding to sigma nought backscatter values in dB scale for VV and VH polarization. +* Sentinel-2 Multi-Spectral: 13 channels corresponding to the 13 spectral bands (B1, B2, B3, B4, B5, B6, B7, B8, B8a, B9, B10, B11, B12). +* MODIS Land Cover: 4 channels corresponding to IGBP, LCCS Land Cover, LCCS Land Use, and LCCS Surface Hydrology layers. -# (sanity check) Single GPU training on samll dataset -/tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_aug_20ep.py --debug +(TODO for Ernesto) should discuss dataset split for training and test (holdout set) +- 32K set +- potential issue with full set +- test set +- 1K set -# (sanity check) Single GPU training on samll dataset on sen12ms fusion -./tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_fusion_in_smoke_aug.py --debug +## Method +In this section, we briefly review Contrastive Learning Framework for unsupervised learning and detail our proposed approach to improve Moco-v2, a recent contrastive learning framework, on satellite imagery from multiple sensors data. -# (sanity check) 4 GPUs training on samll dataset -./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_aug_20ep.py 4 +**Multiple-Sensor** +Update on different bands, different satellites etc. with images. -# (sanity check) 4 GPUs training on samll fusion dataset -./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_fusion_in_smoke_aug.py 4 +#### 1. 
Contrastive Learning Framework
+Contrastive methods attempt to learn a mapping f_q from raw pixels to semantically meaningful representations z in an unsupervised way. The training objective encourages representations corresponding to pairs of images that are known a priori to be semantically similar (positive pairs) to be closer to each other than typical unrelated pairs (negative pairs). With similarity measured by dot product, recent approaches in contrastive learning differ in the type of contrastive loss and the generation of positive and negative pairs. In this work, we focus on the state-of-the-art contrastive learning framework [MoCo-v2](https://arxiv.org/pdf/2003.04297.pdf), an improved version of [MoCo](https://arxiv.org/pdf/1911.05722.pdf), and study improved methods for the construction of positive and negative pairs tailored to remote sensing applications.

-# distributed full training
-/tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_fulltrain_20ep.py 4
-```
+#### 2. Sensor-based Geo-alignment Positive Pairs
+Given that SEN12MS is one of the largest remote sensing datasets available, with global scene distribution and a wealth of versatile remote sensing information, it is natural to leverage geo-aligned imagery from different remote sensing sensors when constructing positive and negative pairs. For example, Sentinel-1 provides two images (vertical and horizontal polarization) and Sentinel-2 provides thirteen images (different wavelength bands) of the same patch. Any combination from the same patch corresponds to a positive pair without the need for additional augmentation, while a negative pair is any image from a different patch, whether from the same or a different satellite.

-#### (OPTIONAL) download pre-trained models
+In short, given an image x_i(s1) collected from Sentinel-1, we can randomly select another image x_i(s2) collected from Sentinel-2 that is geographically aligned with x_i(s1). This geographically aligned pair x_i(s1) and x_i(s2) provides a sensor-based geo-alignment positive pair (v and v’ in Figure xxx) that is fed to the query and key encoders of MoCo-v2 to train the contrastive learning framework.

-Some of key pre-trained models are on s3 (s3://sen12ms/pretrained):
-- [200 epochs w/o augmentation: vivid-resonance-73](https://wandb.ai/cjrd/BDOpenSelfSup-tools/runs/3qjvxo2p/overview?workspace=user-cjrd)
-- [20 epochs w/o augmentation: silvery-oath7-2rr3864e](https://wandb.ai/cal-capstone/hpt2/runs/2rr3864e?workspace=user-taeil)
-- [sen12ms-baseline: soft-snowflake-3.pth](https://wandb.ai/cal-capstone/SEN12MS/runs/3gjhe4ff/overview?workspace=user-taeil)
+For a sample x_i(s1), our GeoSensorInfoNCE objective can be written as follows:

-```
-aws configure
-aws s3 sync s3://sen12ms/pretrained . --dryrun
-aws s3 sync s3://sen12ms/pretrained_sup . --dryrun
-```
+***INSERT FIGURE HERE (GeoSensorInfoNCE)***

-#### Extract pre-trained model
-Any other models can be restored by run ID if stored with W&B. Go to files section under the run to find `*.pth` files
+where z_i(s1) and z_i(s2) are the encoded representations of the geo-aligned positive pair x_i(s1) and x_i(s2), N denotes the number of negative samples, {k_j}, j = 1..N, are the encoded negative pairs, and \lambda is the temperature hyperparameter. 
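+
+A plausible rendering of this objective in LaTeX, assuming it takes the standard InfoNCE form used by MoCo with the geo-aligned pair as query and key (the exact expression belongs in the figure above and may differ):
+
+```latex
+\mathcal{L}_{i} = -\log
+  \frac{\exp\!\left( z_i^{(s1)} \cdot z_i^{(s2)} / \lambda \right)}
+       {\exp\!\left( z_i^{(s1)} \cdot z_i^{(s2)} / \lambda \right)
+        + \sum_{j=1}^{N} \exp\!\left( z_i^{(s1)} \cdot k_j / \lambda \right)}
+```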
-
```bash
-BACKBONE=work_dirs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep/epoch_20_moco_in_baseline.pth
+We use actual images of the same location captured by different sensors. Inspired by the success of geography-aware self-supervised learning (insert ref -- **geo xxxxx paper**), which constructs temporal positive pairs from real images, we rely on the assumption that using real images as positive pairs encourages the network to learn better representations for real sensor data than approaches that focus on augmentation strategies and synthetic images.
+xxxxx

-# method 1: From working dir(same system for pre-training)
-# CHECKPOINT=work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20.pth
+#### 3. Geo-alignment Data Fusion

-# method 2: from W&B, {projectid}/{W&B run id} (any system)
-CHECKPOINT=hpt2/3l4yg63k
+As an alternative to the first approach, we fuse Sentinel-1 (2 bands) and Sentinel-2 (10 bands) images of the same location and apply a set of band combinations (Sentinel-1 and Sentinel-2 together, Sentinel-2 only, and Sentinel-1 only) to construct one fused image, so that we can run straightforward contrastive learning directly under MoCo v2.

-# Extract the backbone
-python tools/extract_backbone_weights.py ${BACKBONE} ${CHECKPOINT}
+#### 4. 1x1 Convolution filters

-```
+In both of the training schemes above, the naturally augmented positive and negative pairs and the data fusion approach, the volume (number of bands) of the inputs from different sensors differs. To match the expected number of image channels, our study applies the Network in Network concept (Min Lin et al.) (insert ref -- **NIN**) to the source images in the sensor-based geo-alignment pair scheme, as well as to the fused image in our second set of experiments. We introduce an extra 1x1 convolution filter block that performs cross-channel sampling, matching and aligning the channel depth of images from different sensors while adding non-linearity before the MoCo v2 encoding. This block carries out a non-trivial computation on the input volume, and we hope it increases the generalization capability of the network.

-## Evaluating Pretrained Representations
+## Experiments
+#### Pre-training on SEN12MS

-Using OpenSelfSup
-```bash
-python tools/train.py $CFG --pretrained $PRETRAIN
+Pre-training is performed twice for comparison purposes. First, examples from all patches are included (180,662). Second, pre-training uses a sample of the dataset in which patches do not overlap with their adjacent patches. This sample is selected on a first-come, first-served basis, ignoring any patch that overlaps an already selected one. The selection consists of 35,792 patches.

-# RESISC finetune example
-tools/train.py --local_rank=0 configs/benchmarks/linear_classification/resisc45/r50_last.py --pretrained work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20_moco_in_basetrain.pth --work_dir work_dirs/benchmarks/linear_classification/resisc45/moco-selfsup/r50_v2_resisc_in_basetrain_20ep-r50_last --seed 0 --launcher=pytorch
+The model is pre-trained under different scenarios to compare performance. ***First, the model is trained using the original MoCo V2 approach, where the input image is augmented by Gaussian blur, elastic transformation, and vertical and horizontal flips***. 
Second, the model is trained with the approach proposed in this work, using images from different satellites as positive pairs. ***Third, in order to generalize the model, augmentation is applied to both satellites during training***. Pre-training is also done with both the complete dataset and the non-overlapping sample described in the previous section.
+The encoders use the ***ResNet50*** architecture (50 layers deep, 2048-dimensional features) with 128 output nodes.
+These encoders are designed for RGB input (3 bands), while the SEN12MS data have 2, 4, and 13 bands for S1, LC, and S2 respectively.
+To overcome this structural constraint, a convolutional layer is included before the encoders to map inputs with different numbers of bands to 3 channels (a minimal sketch of such a block is given after this section).
+***The weights of this layer are not updated during training***.
+The momentum constant (***m***) is ***0.9*** and the learning rate is ***0.03***. The temperature factor for the loss function is ***0.2***. The batch size is ***64***.

-```
+#### Transfer Learning Experiments
+We compare supervised learning with the HPT model.

-Using Sen12ms
-```bash
-```
+1. SEN12MS - Scene Classification
+- Supervised Learning Benchmark vs HPT model
+**Implementation Details**
+- downloaded pretrained models from t
+- the original IGBP land cover scheme has 17 classes.
+- the simplified version of the IGBP scheme has 10 classes, which are derived and consolidated from the original 17 classes.
+**Qualitative Analysis**

-#### Previous
-```
-# Evaluate the representations (NOT SURE)
-./benchmarks/dist_train_linear.sh configs/benchmarks/linear_classification/resisc45/r50_last.py ${BACKBONE}
-```
+- Supervised training (full dataset)
+  - baseline: download the pre-trained models and evaluate without finetuning.
+- Supervised training (1k dataset)
+  - Supervised: original ResNet50 used by Sen12ms
+  - Supervised_1x1: adding a conv1x1 block to the ResNet50 used by Sen12ms
+- Finetune/transfer learning (1k dataset)
+  - Moco: the ResNet50 used by Sen12ms is initialized with the weights from the MoCo backbone
+  - Moco_1x1: adding a conv1x1 block to the ResNet50 used by Sen12ms; both the input module and the ResNet50 layers are initialized with the weights from MoCo
+  - Moco_1x1Rnd: adding a conv1x1 block to the ResNet50 used by Sen12ms; the ResNet50 layers are initialized with the weights from MoCo, but the input module is initialized with random weights
+- Finetune v2 (1k dataset)
+  - freezing ResNet50 fully or partially does not seem to help accuracy. We will continue to explore and share the results once we are sure there is no issue with the implementation.
+
+Other findings:
+- ResNet50_1x1 (s2) shows similar accuracy at 100 and 500 epochs (especially for multi-label).
+- ResNet50_1x1 (s2) shows significantly better results with 0.001 than with 0.00001 (both single-label and multi-label)

-This has been simplified to simply:
-```bash
-./utils/pretrain-evaluator.sh -b OpenSelfSup/work_dirs/hpt-pretrain/${shortname}/ -d OpenSelfSup/configs/hpt-pretrain/${shortname}
-```
-where `-b` is the backbone directory and `-d` is the config directory. This command also works for cross-dataset evaluation (e.g. evaluate models trained on Resic45 and evaluate on UC Merced dataset).

+(findings pending verification)
+- Comparing models with and without the 1x1 conv block, almost all models with the 1x1 conv block underperform their counterparts without it. It appears that adding a 1x1 conv layer as a volume filter may lose some band information in the finetune evaluations. 
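+
+A minimal PyTorch sketch of such a 1x1 convolution input block (our illustration only; the class name `Conv1x1InputBlock`, the band counts, and the wiring are assumptions rather than the repository's actual implementation):
+
+```python
+import torch
+import torch.nn as nn
+from torchvision.models import resnet50
+
+class Conv1x1InputBlock(nn.Module):
+    """Map an arbitrary number of sensor bands to 3 channels with a 1x1 convolution."""
+    def __init__(self, in_bands: int):
+        super().__init__()
+        # a 1x1 convolution performs cross-channel sampling only (no spatial mixing),
+        # followed by a non-linearity before the MoCo-v2 encoder
+        self.proj = nn.Sequential(
+            nn.Conv2d(in_bands, 3, kernel_size=1, bias=False),
+            nn.BatchNorm2d(3),
+            nn.ReLU(inplace=True),
+        )
+
+    def forward(self, x):
+        return self.proj(x)
+
+# e.g. 2 bands for Sentinel-1 and 10 bands for Sentinel-2 as used in this work
+adapters = {"s1": Conv1x1InputBlock(2), "s2": Conv1x1InputBlock(10)}
+encoder = resnet50(num_classes=128)  # ResNet50 encoder with 128 output nodes
+
+# mirror "the weights of this layer are not updated during training"
+for adapter in adapters.values():
+    for p in adapter.parameters():
+        p.requires_grad = False
+
+x_s1 = torch.randn(4, 2, 256, 256)   # a batch of Sentinel-1 patches
+z = encoder(adapters["s1"](x_s1))    # 128-d representations for the contrastive loss
+```
+The fusion variant would use the same idea with a single block over the stacked s1/s2 input (12 bands).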
-
-**FAQ**
+### Results

-Where are the checkpoints and logs? E.g., if you pass in `configs/hpt-pretrain/resisc` as the config directory, then the working directories for this evalution is e.g. `work_dirs/hpt-pretrain/resisc/linear-eval/...`. If w&b is enabled, it will be logged on weight & biases
+#### Sensor-based Augmentation

-## Finetuning
-Assuming you generated the pretraining project as specified above, finetuning is as simple as:
+Our sensor-based geo-alignment positive pair approach takes Sentinel-2 images with 10 bands and Sentinel-1 images with 2 bands from the same locations and passes them through a 1x1 convolution block before the MoCo v2 framework. Each image is thereby mapped to a generalizable 3-channel representation, from Sentinel-2 and Sentinel-1 separately, which feeds the query and key encoders under MoCo v2.

-```bash
-./utils/finetune-runner.sh -d ./OpenSelfSup/configs/hpt-pretrain/${shortname}/finetune/ -b ./OpenSelfSup/work_dirs/hpt-pretrain/${shortname}/
-```
-where `-b` is the backbone directory and `-d` is the config directory
-Note: to finetune using other backbones, simply pass in a different backbone directory (the script searches for `final_backbone.pth` files in the provided directory tree)

+The evaluations use the SEN12MS scene classification pipeline. Overall, multi-label accuracy is better than single-label accuracy across both supervised and MoCo models. In general, the label noise in the SEN12MS dataset places an irreducible error floor on the highest accuracy we can reach. Because the provided supervised pre-trained models on the full dataset do not include s1-only data, our finetuning starts from the 1k dataset and applies both the supervised models and the MoCo models to the s2 dataset as well as the s1/s2 dataset, so that our approach can be compared with the provided supervised model (full dataset). In addition, we apply different finetuning strategies with and without the 1x1 conv block weights produced by MoCo.

+In the finetuning/transfer learning results, models that introduce the 1x1 convolution weights from MoCo underperform those without the 1x1 convolution block; it appears that adding the 1x1 convolution block from MoCo may distort the finetune evaluations, suggesting the learned representation may not be optimal. We continue to explore transfer learning using either the simplified dataset and evaluations or a dataset with less label noise.

-## Finetuning only on pretrained checkpoints with BEST linear analysis
+#### Geo-alignment Data Fusion

-First, specify the pretraining epochs which gives the best linear evaluation result in `./utils/top-linear-analysis-ckpts.txt`. Here is an example:
+aug set 1: resizecrop
+aug set 2: resizecrop, blur
+aug set 3: aug set 2 + color jittering / grayscale (optional for now)

-```
-# dataset best-moco-bt best-sup-bt best-no-bt
-chest_xray_kids 5000 10000 100000
-resisc 5000 50000 100000
-chexpert 50000 50000 400000
-```
-, in which for `chest_xray_kids` dataset, `5000`-iters, `10000`-iters, `100000`-iters are the best pretrained models under `moco base-training`, `imagenet-supervised base-training`, and `no base-training`, respectively.
+* all fusion: the stacked s1/s2 image is augmented and used as q and k. 
+* partial fusion: s1, s2, and s1/s2 images are equally mixed in the training dataset

-Second, run the following command to perform finetuning only on the best checkpoints (same as above, except that the change of script name):
-```bash
-./utils/finetune-runner-top-only.sh -d ./OpenSelfSup/configs/hpt-pretrain/${shortname}/finetune/ -b ./OpenSelfSup/work_dirs/hpt-pretrain/${shortname}
-```
+
+### Ablation
+#### SEN12MS evaluation
+scene classification (multi-label)
+**?Question/Discussion -- (1) Should we reduce the decimal places, or perhaps report percentages? Other papers such as MoCo, MoCo v2, and SimCLR all use a more succinct number format for accuracy results, and it reads more clearly. (2) Should we temporarily report the best result for each dataset? MoCo has a random sampling mechanism, so results vary unless we add further detail and finetune with different parameters; for now, it would be good to showcase results showing that we are heading in a promising direction (and for the presentation). We definitely need to revisit this part.**

-## Pretraining on top of pretraining
-Using the output of previously pretrained models, it is very easy to correctly setup pretraining on top of the pretraining.
+
+https://wandb.ai/cal-capstone/scene_classification/reports/Evaluation--Vmlldzo1OTgzNjA

-Simply create a new config
-```
-utils/pretrain-configs/dataname1-dataname2.sh
-```
-(see `resisc-ucmerced.sh` for an example)
+### para
+We perform transfer learning experiments with our two proposed methods on land cover scene classification across xx label classes using the SEN12MS dataset to understand the quality of the learned representations. Given that image samples are randomly drawn into positive and negative pairs in MoCo v2 when computing the loss during training, we finetune with different hyper-parameters and report the best results on each evaluation dataset.

-and then set the basetrained models to be the `final_backbone.pth` from the output of the last pretrained. e.g. for using resisc-45 outputs:
-```
+Using the model pre-trained with our sensor-based geo-aligned pairs approach, finetuning on 1,024 samples for multi-class classification shows better performance on the combined s1 and s2 data than the supervised model trained on the same number of samples. Because our learned representation is constructed by encoders from two sensors with multiple bands of information, we argue that the downstream results should be better on datasets that include both s1 and s2 data, as the results indeed show.

-export basetrain_weights=(
-    "work_dirs/hpt-pretrain/resisc/moco_v2_800ep_basetrain/50000-iters/final_backbone.pth"
+Whereas looking at the evaluation results for our geo-alignment data fusion approach on 1,024 samples, as shown in the table, we can see that:

-    "work_dirs/hpt-pretrain/resisc/imagenet_r50_supervised_basetrain/50000-iters/final_backbone.pth"
-
-    "work_dirs/hpt-pretrain/resisc/no_basetrain/200000-iters/final_backbone.pth"
-)
-```
-(see `resisc-ucmerced.sh` for an example)
-To select which backbones to use, evaluate the linear performance from the various source outputs (e.g. all the resisc pretrained outputs) on the target data (e.g. on uc-merced data).
-Then simply generate the project and execute the pretraining as normal:
+ - (1) In general, the downstream multi-class accuracy of our data fusion under the MoCo framework is better than that of the supervised counterparts on the combined s1 and s2 data. 
Similar to our first approach, since the fused representation is learned from combinations of s1 and s2 data, the downstream results are better on datasets that include both s1 and s2 data.
-```
-./gen-pretrain-project.sh pretrain-configs/dataname1-dataname2.sh
+ - (2) The optional fusion approach is slightly less accurate than the full fusion and partial fusion strategies across the evaluation dataset combinations.

-./pretrain-runner.sh -d OpenSelfSup/configs/hpt-pretrain/$dataname1-dataname2
-```
+ - (3) Looking only at the pre-trained model, evaluation on s1 and s2 together shows better accuracy than s1 or s2 alone. This is in line with our expectations.

-## Object Detection / Semantic Segmentation
-Object detection/segmentation uses detectron2 and takes place in the directory
-```
-OpenSelfSup/benchmarks/detection
-```
+#### BigEarthNet Evaluation (TBD)

-**First:** Check if the dataset configs you need are already present in `configs`. E.g. if you're working with CoCo, you'll see the following 2 configs:
-```
-configs/coco_R_50_C4_2x.yaml
-configs/coco_R_50_C4_2x_moco.yaml
-```
-We'll use the config with the `_moco` suffix for all obj det and segmentation. If your configs already exist, skip the next step.
+https://wandb.ai/cal-capstone/scene_classification/reports/Evaluation--Vmlldzo1OTgzNjA

-**Next:** assuming your configs do not exist, set up the configs you need for your dataset by copying an existing set of configs
-```
-cp configs/coco_R_50_C4_2x.yaml ${MYDATA}_R50_C4_2x.yaml
-cp configs/coco_R_50_C4_2x_moco.yaml ${MYDATA}_R50_C4_2x_moco.yaml
-```
-Edit `${MYDATA}_R50_C4_2x.yaml` and set `MIN_SIZE_TRAIN` and `MIN_SIZE_TEST` to be appropriate for your dataset. Also, rename `TRAIN` and `TEST` to have your dataset name, set `MASK_ON` to `True` if doing semantic segmentation, and update `STEPS` and `MAX_ITER` if running the training for a different amount of time is appropriate (check relevant publications / codebases to set the training schedule).
+### Relationship between batch size and training epochs

-Edit `${MYDATA}_R50_C4_2x_moco.yaml` and set `PIXEL_MEAN` and `PIXEL_STD` (use `compute-dataset-pixel-mean-std.py` script above, if you don't know them).
+Hypothesis: batch size affects how models train and evaluate. When the number of training epochs is small (100, 200), larger batch sizes have a significant advantage over smaller ones. [With more training epochs, we also show that the gaps between different batch sizes decrease.] As such, our pre-training approach with the MoCo v2 contrastive learning framework on remote sensing imagery helps stabilize model performance as a function of labeled dataset size, compared to a supervised model on the same amount of data.

-Then, edit `train_net.py` and add the appropriate data registry lines for your train/val data
-```
-register_coco_instances("dataname_train", {}, "obj-labels-in-coco-format_train.json", "datasets/dataname/dataname_train")
-register_coco_instances("dataname_val", {}, "obj-labels-in-coco-format_val.json", "datasets/dataname/dataname_val")
-```
-Then, setup symlinks to your data under `datasets/dataname/dataname_train` and `datasets/dataname/dataname_val`, where you replace dataname with your dataname used in the config/registry.

-**Next**, convert your backbone(s) to detectron format, e.g. 
(NOTE: I recommend keeping backbones in the same directory that they are originally present in, and appending a `-detectron2` suffix) -``` -python convert-pretrain-to-detectron2.py ../../data/basetrain_chkpts/imagenet_r50_supervised.pth ../../data/basetrain_chkpts/imagenet_r50_supervised-detectron2.pth -``` - -**Next** kick off training -``` -python train_net.py --config-file configs/DATANAME_R_50_C4_24k_moco.yaml --num-gpus 4 OUTPUT_DIR results/${UNIQUE_DATANAME_EXACTLY_DESCRIBING_THIS_RUN}/ TEST.EVAL_PERIOD 2000 MODEL.WEIGHTS ../../data/basetrain_chkpts/imagenet_r50_supervised-detectron2.pth SOLVER.CHECKPOINT_PERIOD ${INT_HOW_OFTEN_TO_CHECKPOINT} -``` -results will be in `results/${UNIQUE_DATANAME_EXACTLY_DESCRIBING_THIS_RUN}`, and you can use tensorboard to view them. - -## Commit and Share Results -Run the following command to grab all results (linear analysis, finetunes, etc) and put them into the appropriate json results file in `results/`: -``` -./utils/update-all-results.sh -``` - -You can verify the results in `results` and then add the new/updated results file to git and commit. - -**Did you get an error message such as:** -``` -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! - -Please investigate as your results may not be complete. -(see errors in file: base-training/utils/tmp/errors.txt) - -will not include partial result for /home/XXX/base-training/utils/../OpenSelfSup/work_dirs/hpt-pretrain/resisc/finetune/1000-labels/imagenet_r50_supervised_basetrain/50000-iters-2500-iter-0_01-lr-finetune/20200911_170916.log.json -!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -``` -This means that this particular evaluation run did not appear to run for enough iterations. Investigate the provided log file, rerun any necessary evaluations, and remove the offending log file. - -**Debugging this script** this script finds the top val accuracy, and save the corresponding test accuracy using the following script: -``` -./utils/agg-results.sh -``` -which outputs results to `utils/tmp/results.txt` and errors to `utils/tmp/errors.txt`. Look at this file if your results aren't being generated correctly. - -## Generate plots - -```bash -cd utils -python plot-results.py -``` - -See plots in directory `plot-results` -(you can also pass in a `--data` flag to only generate plots for a specific dataset, e.g. `python plot-results.py --data resisc`) - - -**To plot the eval & test acc curves**, use `./utils/plot.py` -```bash -cd utils -python plot.py --fname PLOT_NAME --folder FOLDER_CONTAINING_DIFFERENT_.PTH_FOLDERs -``` - -**To Generate plot for Exp-2-finetuning**, do -```bash -bash utils/plot-results-exp-2.sh -``` - -See plot in directory `plot-results/exp-2`. - -**To Generate plot for Exp-3-Hierarchical Pretraining**, do -```bash -bash utils/plot-results-exp-3.sh -``` - -See plot in directory `plot-results/exp-3`. - - -## Getting activations for similarity measures - -Run `get_acts.py` with a model used for a classifaction task -(one that has a test/val set).\ -Alternatively, run dist_get_acts as follows: -```shell -bash dist_get_acts.sh ${CFG} ${CHECKPOINT} [--grab_conv...] -``` -Default behavior is to grab the entire batch of linear layers. -Setting `--grab_conv` will capture a single batch of all convolutional layers.\ -Layers will be saved in `${WORK_DIR}/model_acts.npz`. -The npz contains a dictionary which maps layer names to the activations. 
- - -## Debugging and Developing Within OpenSelfSup - -Here's a command that will allow breakpoints (WARNING: the results with the debug=true flag SHOULD NOT BE USED -- they disable sync batch norms and are not comparable to other results): - -```bash -# from OpenSelfSup/ -# replace with your desired config -python tools/train.py configs/hpt-pretrain/resisc/moco_v2_800ep_basetrain/500-iters.py --work_dir work_dirs/debug --debug - -``` diff --git a/SEN12MS b/SEN12MS index f824eeb..edf41d4 160000 --- a/SEN12MS +++ b/SEN12MS @@ -1 +1 @@ -Subproject commit f824eebb94e6e08b211cf963d6dc768c82c3bd08 +Subproject commit edf41d4b2bf0b756f06327737330e7c892c34dd7 diff --git a/metrics.md b/metrics.md new file mode 100644 index 0000000..ba11f9d --- /dev/null +++ b/metrics.md @@ -0,0 +1,125 @@ + +#### Fusion approach + +**SEN12MS (1024)** +| aug set 2| s1 | s2 | s1/s2 | Note | +| --- | --- | --- | --- | --- | +| Supervised () | ? | ?| ? | | +| [all fusion]() | ? | ? | ? | running | +| [partial fusion]() | ? | ? | ? | done | +| [optional fusion]() | ? | ? | ? | done | + +**SEN12MS (512)** +| aug set 2| s1 | s2 | s1/s2 | Note | +| --- | --- | --- | --- | --- | +| Supervised () | ? | ?| ? | | +| [all fusion]() | ? | ? | ? | running | +| [partial fusion]() | ? | ? | ? | done | +| [optional fusion]() | ? | ? | ? | done | + +**BigEarthNet** +| aug set 2| s1 | s2 | s1/s2 | Note | +| --- | --- | --- | --- | --- | +| Supervised (1024) | ? | ?| ? | running | +| [all fusion]() | ? | ? | ? | running | +| [partial fusion]() | ? | ? | ? | done | +| [optional fusion]() | ? | ? | ? | done | + +**BigEarthNet (512)** +| aug set 2| s1 | s2 | s1/s2 | Note | +| --- | --- | --- | --- | --- | +| Supervised (1024) | ? | ?| ? | running | +| [all fusion]() | ? | ? | ? | running | +| [partial fusion]() | ? | ? | ? | done | +| [optional fusion]() | ? | ? | ? | done | + + +#### sensor augmentation + +| | Metrics|single-label |multi-label | Note | +| --- | --- | --- | --- | --- | +| | | | | | +| full dataset | Supervised s2 | .57 | .60| | +| | Supervised s1/s2 | .45 | .64|| +| | Supervised RGB | .45 | .58| | +| | | | | | +|s2 | Supervised 1x1 | .3863 | .4893 | | +| | Supervised | .4355 | .5931 | too good?| +| | Moco 1x1 RND | .4345 | .6004 | | +| | Moco 1x1 | .4469 | **.601**| not necessarily better | +| | Moco 1x1 RND (1000ep) | .4264 | .5757 | overfitting? | +| | Moco 1x1 (1000ep) | .4073 | .5622 | overfitting? 
| + +| | Metrics|single-label |multi-label | Note | +| --- | --- | --- | --- | --- | +|s1/s2 | :white_check_mark: Supervised 1x1 | .4094 | .5843 | | +| | :white_check_mark: Supervised | .4426 | .4678 | | +| | :no_entry_sign: Moco 1x1 RND | .4477 | .5317 | | +| | :no_entry_sign: Moco 1x1 | .4474 | .5302 | no conv1 weight transfer | +| | :no_entry_sign: **Moco** | .4718 | **.6697** | no conv1 weight transfer | + +- single-label: Average Accuracy +- multi-label: Overall Accuracy + + +crimson-pyramid + +**aug set 1(TBD)** + +| aug set 1| s1 | s2 | s1/s2 | Note | +| --- | --- | --- | --- | --- | +| Supervised (full) | xx | xx | xx | xx | +| Supervised (1024) | xx | xx | xx | xx | +| --- | --- | --- | --- | --- | +| [sensor-based augmentation] | xx | xx | xx | xx | +| [all fusion] | xx | xx| xx | xx | +| [partial fusion] | xx | xx | xx | xx | +| [optional fusion] | xx | xx | xx | xx| + + +**aug set 2** + +| aug set 2| s1 | s2 | s1/s2 | Note | +| --- | --- | --- | --- | --- | +| Supervised (full) | [Pretrained model is not provided](https://syncandshare.lrz.de/getlink/fiCDbqiiSFSNwot5exvUcW1y/trained_models) | [.60](https://wandb.ai/cal-capstone/sup_scene_cls/runs/3mg9zr5t) | [.64](https://wandb.ai/cal-capstone/sup_scene_cls/runs/2lda2016) | need to retest s1, s2 with zero padding | +| Supervised (1024) | [0.4003](https://wandb.ai/cal-capstone/sup_scene_cls/runs/555fv4cb) | [0.6108](https://wandb.ai/cal-capstone/sup_scene_cls/runs/3m1h27zt) | [.5856](https://wandb.ai/cal-capstone/sup_scene_cls/runs/dpwjby4o) | | +| --- | --- | --- | --- | --- | +| [sensor-based augmentation] | - | [0.6277](https://wandb.ai/cal-capstone/SEN12MS/runs/2826nuca) | [0.6697](https://wandb.ai/cal-capstone/SEN12MS/runs/22tv0kud) | xx | +| [all fusion](https://wandb.ai/cal-capstone/hpt4/runs/ak0xdbfu/overview) | xx | [.6251]? | [.5957](https://wandb.ai/cal-capstone/scene_classification/runs/2y2q8boi) | | +| [partial fusion](https://wandb.ai/cal-capstone/hpt4/runs/367tz8vs) | [.4729](https://wandb.ai/cal-capstone/scene_classification/runs/1qx384cs) | [.5812](https://wandb.ai/cal-capstone/scene_classification/runs/1bdmms2d) |[.6072](https://wandb.ai/cal-capstone/scene_classification/runs/1meu9iym) | | +| [optional fusion](https://wandb.ai/cal-capstone/hpt4/runs/2iu8yfs6) | [.4824](https://wandb.ai/cal-capstone/scene_classification/runs/tu3vuefx) | [.5601](https://wandb.ai/cal-capstone/scene_classification/runs/2hdbuxtv) | [.5884](https://wandb.ai/cal-capstone/scene_classification/runs/y5x2xce6) | | + + +- Supervised (full) s1, s2 need to be retested with zero padding 12 channel. + + +#### BigEarthNet Evaluation (TBD) +scence classification (multi or single label?) 
+ +**aug set 1(TBD)** + + +| aug set 1| s1 | s2 | s1/s2 | Note | +| --- | --- | --- | --- | --- | +| Supervised (full) | xx | xx | xx | xx | +| Supervised (1024) | xx | xx | xx | xx | +| --- | --- | --- | --- | --- | +| [sensor-based augmentation] | xx | xx | xx | xx | +| [all fusion] | xx | xx | xx | xx | +| [partial fusion] | xx | xx | xx | xx | +| [optional fusion] | xx | xx | xx | xx| + + +**aug set 2** + +| aug set 2| s1 | s2 | s1/s2 | Note | +| --- | --- | --- | --- | --- | +| Supervised (full) | xx | xx | xx | xx | +| Supervised (1024) | [.4008](https://wandb.ai/cal-capstone/sup_scene_cls/runs/1lnfsmdi) | [.5496](https://wandb.ai/cal-capstone/sup_scene_cls/runs/3fpzht5f) | [.5423](https://wandb.ai/cal-capstone/sup_scene_cls/runs/1qma48o1) | xx | +| --- | --- | --- | --- | --- | +| [sensor-based augmentation] | xx | xx | xx | xx | +| [all fusion] | xx | xx | xx | xx | +| [partial fusion] | [.4279](https://wandb.ai/cal-capstone/scene_classification/runs/2a1tlnbv) | [.5351](https://wandb.ai/cal-capstone/scene_classification/runs/2f0pjxwx) | [.5352](https://wandb.ai/cal-capstone/scene_classification/table?workspace=user-kenhan) | xx | +| [optional fusion] | [.4478](https://wandb.ai/cal-capstone/scene_classification/runs/36c8z6ae) | [.5120](https://wandb.ai/cal-capstone/scene_classification/runs/3oazvjke) | [.5294](https://wandb.ai/cal-capstone/scene_classification/runs/nar53xcn) | xx| + +- Supervised (full) s1, s2 need to be retested with zero padding 12 channel. diff --git a/paper_draft.md b/paper_draft.md deleted file mode 100644 index 0f36c1b..0000000 --- a/paper_draft.md +++ /dev/null @@ -1,193 +0,0 @@ -## Abstract - -We present a sensor-based location invariance momentum contrast for unsupervised visual representation learning in remote sensing application, where unlabeled data is well-known challenges to deep learning domain and accurate training data remains comparably scarce. In this study, we first introduce the use of SEN12MS datasets, a curated large-scale training data that include versatile remote sensing information from different sensors with global scene distributions. To continually bridge the gap between supervised and unsupervised learning on computer vision tasks in remote sensing application, we exploit the geo-alignment data structure from SEN12MS and construct sensor-based geo-alignment positive pairs in contrastive learning to design the natural augmentation. [last sentence subject to changes] Our experiments show that the proposed method outperforms the supervised learning counterpart when transferring to downstream tasks in scene classification for remote sensing data. - - -## Introduction -The performance of deep convolutional neural networks depends on their capability and the amount of training data. The datasets are becoming larger in every domain and different kinds of network architectures like [VGG](https://arxiv.org/pdf/1409.1556.pdf), [GoogLeNet](https://arxiv.org/pdf/1409.4842.pdf), [ResNet](https://arxiv.org/pdf/1512.03385.pdf), [DenseNet](https://arxiv.org/pdf/1608.06993.pdf), etc., increased network models' capacity. - -However, the collection and annotation of large-scale datasets are time-consuming and expensive. Many self-supervised methods were proposed to learn visual features from large-scale unlabeled data without using any human annotations to avoid time-consuming and costly data annotations. 
Contrastive learning of visual representations has emerged as the front-runner for self-supervision and has demonstrated superior performance on downstream tasks. All contrastive learning frameworks involve maximizing agreement between positive image pairs relative to negative/different images via a contrastive loss function; this pretraining paradigm forces the model to learn good representations. These approaches typically differ in how they generate positive and negative image pairs from unlabeled data and how the data are sampled during pretraining. - -Self-supervised approaches such as Momentum Contrast (MoCo) ([He et al., 2019](https://arxiv.org/pdf/1911.05722.pdf); [Chen et al.,2020](https://arxiv.org/pdf/2003.04297.pdf)) can leverage unlabeled data to produce pre-trained models for subsequent fine-tuning on labeled data. In addition to MoCo, these include frameworks such as SimCLR ([Chen et al., 2020](https://arxiv.org/pdf/2002.05709.pdf)) and PIRL ([Misra and Maaten, 2020](https://openaccess.thecvf.com/content_CVPR_2020/papers/Misra_Self-Supervised_Learning_of_Pretext-Invariant_Representations_CVPR_2020_paper.pdf)). - -Remote sensing data has become broadly available at the petabyte scale, offering unprecedented visibility into natural and human activity across the Earth. In remote sensing, labeled data is usually scarce and hard to obtain. Due to the success of self-supervised learning methods, we explore their application to large-scale remote sensing datasets. - -While most self-supervised image analysis techniques focus on natural imagery, remote sensing differs in several critical ways. Natural imagery often has one subject; remote sensing images contain numerous objects such as buildings, trees, roads, rivers, etc. Additionally, the important content changes unpredictably within just a few pixels or between images at the same location from different times. Multiple satellites capture images of the same locations on earth with a wide variety of resolutions, spectral bands (channels), and revisit rates, such that any specific problem can require a different -combination of sensor inputs([Reiche et al., 2018](https://doi.org/10.1016/j.rse.2017.10.034),[Rustowicz et al., 2019](https://openaccess.thecvf.com/content_CVPRW_2019/papers/cv4gc/Rustowicz_Semantic_Segmentation_of_Crop_Type_in_Africa_A_Novel_Dataset_CVPRW_2019_paper.pdf)). - -While MoCo and other contrastive learning methods have demonstrated promising results on natural image classification tasks, their application to remote sensing applications has been limited. - -Traditional contrative learning utilizes augmentation to generate positive pair. Inspired by recent success (Geo-aware Paper) using natural augmentation to create positive pairs, we propose to use positive pairs from different sensors for the same location. - -In this work, we demonstrate that pre-training [MoCo-v2](https://openaccess.thecvf.com/content_CVPR_2020/papers/He_Momentum_Contrast_for_Unsupervised_Visual_Representation_Learning_CVPR_2020_paper.pdf) on data from multiple sensors lead to improved representations for remote sensing applications. - -## Related Work -#### Self-supervised contrastive learning -Many self-supervised learning methods for visual feature learning have been developed without using any - human-annotated labels. 
Compared to supervised learning methods which require a data pair Xi - and Yi while Yi is annotated by human labors, self-supervised learning also trained with data Xi along - with its pseudo label Pi while Pi is automatically generated for a pre-defined pretext task without involving any - human annotation. The pseudo label Pi can be generated by using attributes of images or videos such as the context of - images or by traditional hand-designed methods. As long as the pseudo labels P are automatically generated - without involving human annotations, then the methods belong to self-supervised learning. Recently, self-supervised - learning methods have achieved great progress. - - Self-supervised contrastive learning approaches such as [MoCo](https://arxiv.org/pdf/1911.05722.pdf) , - [MoCo-v2](https://arxiv.org/pdf/2003.04297.pdf), [SimCLR](https://arxiv.org/pdf/2002.05709.pdf), and [PIRL](https://openaccess.thecvf.com/content_CVPR_2020/papers/Misra_Self-Supervised_Learning_of_Pretext-Invariant_Representations_CVPR_2020_paper.pdf) have demonstrated - superior performance and have emerged as the fore-runner on various downstream tasks. The intuition behind these - methods are to learn representations by pulling positive image pairs from the same instance closer in latent space - while pushing negative pairs from difference instances further away. These methods, on the other hand, differ in the - type of contrastive loss, generation of positive and negative pairs, and sampling method. - - Contrastive learning of visual representations using MoCo ([**MoCo-v2**](https://arxiv.org/pdf/2003.04297.pdf) - Chen, et - al., Facebook AI Research, 2020) has emerged as the front-runner for self-supervision and has demonstrated superior performance on downstream tasks. - -#### Performance gap in Satellite imagery -There is a performance gap between supervised learning using labels and self-supervised contrastive learning method, [MoCo-v2](https://arxiv.org/pdf/2003.04297.pdf), on remote - sensing datasets. For instance, on the Functional Map of the World ([fMoW](https://arxiv.org/abs/1711.07846)) image classification - benchmark, there is an 8% gap in top 1 accuracy between supervised and self-supervised methods. By leveraging spatially aligned - images over time to construct temporal positive pairs in contrastive learning and geo-location in the design of pre-text tasks, **[Geography-Aware - Self-supervised Learning](https://arxiv.org/pdf/2011.09980.pdf)** (Ayush, et al., Stanford University, 2020) were able to - close the gap between self-supervised and supervised learning on image classification, object detection and semantic - segmentation on remote sensing and other geo-tagged image datasets. - -In this work, we provide an effective approach for improving representation learning using data from different satellite imagery using [MoCo-v2](https://arxiv.org/pdf/2003.04297.pdf). - -## Problem Definition -Does contrastive pre-training with data from multiple sensors lead to improved representations for remote sensing applications? - -Pre-train the contrastive model using unlabeled data from multiple satellites and use that model for downstream remote sensing tasks. - -We want to show that our approach to using images from different satellites for the same location as naturally augmented images as input to the MoCo-v2 method provides high-quality representations and transferable initializations for satellite imagery interpretation. 
Despite many differences in the data and task properties between natural image classification and satellite imagery interpretation, we want to show the benefit of MoCo-v2 pretraining across multiple patches from different satellites for satellite imagery and investigate representation transfer to a target dataset. - -### Datasets -- [todo] keep only sen12ms. -- -To validate our ideas, we did experiments on datasets with different satellite imageries with variations in dataset size, channels, and image ground resolutions. The statistics of these datasets are given below. Readers are requested to see the the supplementary materials for examples and additional details of these datasets. -| Dataset | Satellites | Number of Images | Image Size | Labels | Notes | -|---|---|---|---|---|---| -| [BigEarthNet](https://arxiv.org/pdf/1902.06148.pdf) | Sentinel-2A/B |590,326 patches; 12 Bands | 20x20 to 120x120 | Multiple, up to 43 | No overlapping; 10 European Countries | -| [SEN12MS](https://arxiv.org/pdf/1906.07789.pdf) | Sentinel-1A/B; Sentinel-2A/B; MODIS (Terra and Aqua) | 541,986 patches; 180662 triplets (3\*180662); 4, 2 and 13 Bands | 256X256 | Single, 17 Full and 10 Simplified | Partial overlapping | -| [FMoW](https://arxiv.org/abs/1711.07846) | QuickBird-2; GeoEye-1; WorldView-2; WorldView-3 | 1,047,691 patches; 4, 8 and RGB Bands | Variable Over 2500x2500 | Multiple, up to 63; Bounding Box Annotations | Includes False Detection; Variable timestamp overlapping | - -##### SEN12MS -The SEN12MS dataset contains 180,662 patch triplets of corresponding Sentinel-1 dual-pol SAR data, Sentinel-2 multi-spectral images, and MODIS-derived land cover maps. The patches are distributed across the land masses of the Earth and spread over all four meteorological seasons. This is reflected by the dataset structure. The captured scenes were tiled into patches of 256 X 256 pixels in size and implemented a stride of 128 pixels, resulting in an overlap between adjacent patches. -Only 3847 patches do not have any overlap with adjacent patches. -Most of the overlap occurs around 25% and 50% of the area with few patches overlapping less than 15% and more than 75%. - - All patches are provided in the form of 16-bit GeoTiffs containing the following specific information: -* Sentinel-1 SAR: 2 channels corresponding to sigma nought backscatter values in dB scale for VV and VH polarization. -* Sentinel-2 Multi-Spectral: 13 channels corresponding to the 13 spectral bands (B1, B2, B3, B4, B5, B6, B7, B8, B8a, B9, B10, B11, B12). -* MODIS Land Cover: 4 channels corresponding to IGBP, LCCS Land Cover, LCCS Land Use, and LCCS Surface Hydrology layers. - -## Method -In this section, we briefly review Contrastive Learning Framework for unsupervised learning and detail our proposed approach to improve Moco-v2, a recent contrastive learning framework, on satellite imagery from multiple sensors data. - -**Multiple-Sensor** -Update on different bands, different satellites etc. with images. - -![](web/images/moco_framework.png)![](web/images/current_approach.png) - -#### 1. Contrastive Learning Framework -Contrastive methods attempt to learn a mapping fq from raw pixels to semantically meaningful representations z in an unsupervised way. The training objective encourages representations corresponding to pairs of images that are known a priori to be semantically similar (positive pairs) to be closer to each other than typical unrelated pairs (negative pairs). 
With similarity measured by dot product, recent approaches in contrastive learning differ in the type of contrastive loss and generation of positive and negative pairs. In this work, we focus on the state-of-the-art contrastive learning framework [MoCo-v2](https://arxiv.org/pdf/2003.04297.pdf), an improved version of [MoCo](https://arxiv.org/pdf/1911.05722.pdf), and study improved methods for the construction of positive and negative pairs tailored to remote sensing applications. - -#### 2. Sensors-based Geo-alignment Positive Pairs -Given the SEN12MS that provides the largest remote sensing dataset available with its global scene distribution and the wealth of versatile remote sensing information, It is natural to leverage the geo-alignment imagery from different remote sensing sensors while constructing positive or negative pairs . For example, Sentinel 1 consists of two images (vertical and horizontal polarization) and Sentinel 2 consist of thirteen images (different wavelength bands) of the same patch. Any combination from the same patch would correspond to a positive pair without the need of additional augmentation, while negative pairs would correspond to any image from different patches without restriction of the same or different satellites. - -In short, given an image x_i(s1) collected from sentinel 1, we can randomly select another image x_i(s2) collected from sentinel 2 that is geographically aligned with the x_i(s1), and then have them passthrough MoCo-v2 to the geographically aligned image pair x_i(s1) and x_i(s2), which provides us with a sensor-based geo-alignment positive pair ( v and v’ in Figure xxx) that can be used for training the contrastive learning framework by the query and key encoders in MoCo-v2. - -For a sample of x_i(s1), our GeoSensorInforNCE objective loss can be demonstrated as follows: - -***INSERT FIGURE HERE (GeoSensorInfoNCE)*** - -where z_i(s1) and z_i(s2) are the encoded representation of the randomly geo-aligned positive pair x_i(s1) and x_i(s2). N denotes the number of negative samples, {k_j}j=1_n are the encoded negative pairs, and \lambda is the temperature hyperparameter. - -What we used are the actual images from the same location but different sensors. With the inspiration of the success of geography-aware self-supvervised learing (insert ref -- **geo xxxxx paper** )that constructs temporal pairs from real images, we also rely on the assumptions that the actual images for positive pairs encourages the entire network to learn better representations for real sensors data than the one focusing on augmentation strategies and synthetic images. -xxxxx - -#### 3. 1x1 Convolution filters - -From the above perspective of constructing naturally augmented positive and negative pairs in contrastive learning, we noticed that the volume (bands) of the inputs from different sensors are different. In order to match the typical dimensions of the image channels, our study also applies the Network in Network concept (Min Lin et al)(insert ref --**NIN**) to the sourced images. As such, we introduced an extra layer of one by one convolution filter block to perform cross channel sampling, thereby matching and aligning the depth of the channels from different sensor images while introducing non-linearity before the MoCo v2 encoding. With the implementation, we leverage this trick to carry out a pretty non-trivial computation on the input volume whereas we hope to increase the generalization capability in the network. 
- - - - -## Experiments -#### Pre-training on SEN12MS - -Pre-training is performed twice for comparison proposes. First, examples from all patches are included (180,662). Second, pre-train includes a sample of the dataset which patches do not overlap with their adjacent patches. This sample of the dataset is selected on firs come first serve basis and any adjacent overlapping patch is ignored. The selection consist of 35,792 patches. - -The model is pre-trained on different scenarios to compare the performance of the model. ***First, the model is trained by using the original approach of MoCo V2. The input image is augmented by gaussian blur, elastic transformation, vertical and horizontal flip***. Second, the model with the approach proposed in this work that is using images from different satellites as positive pairs. ***Third, in order to generalize the model, augmentation is applied to both satellites during training***. The pre-train is also done with both the complete dataset and the non-overlapping sample described in the previous section. - -The encoders have ***ResNet50*** architecture (50 layers deep, 2048 nodes) with 128 output nodes. -These encoders are designed for a RGB input (3 bands) and Sen12MS data set is 2, 4 and 13 bands for S1, LC and S2 respectively. -To overcome this structure constrain, a convolutional layer is included before the encoders to map the input with different bands to 3. -***The weights of this layer are not updated during training***. -The momentum constant (***m***) is ***0.9*** and the learning rate is ***0.03***. The temperature factor for the loss function is ***0.2***. The batch size is ***64***. - - -#### Transfer Learning Experiments - -We compared supervised learning with HPT model - -1. SEN12MS - Scene Classification - -- Supervised Learning Benchmark vs HPT model - -**Implementation Details ** - -- downloaded pretrained models from t -- the original IGBP land cover scheme has 17 classes. -- the simplified version of IGBP classes has 10 classes, which derived and consolidated from the original 17 classes. - -**Qualitative Analyis** - -- Supervised training (full dataset) - - baseline: downloaded the pre-trained the models and evaluate without finetuning. -- Supervised training (1k dataset) - - Supervised: original ResNet50 used by Sen12ms - - Supervised_1x1: adding conv1x1 block to the ResNet50 used by Sen12ms -- Finetune/transfer learning (1k dataset) - - Moco: the ResNet50 used by Sen12ms is initialized with the weight from Moco backbone - - Moco_1x1: adding conv1x1 block to the ResNet50 used by Sen12ms and both input module and ResNet50 layers are initialized with the weight from Moco - - Moco_1x1Rnd: adding conv1x1 block to the ResNet50 used by Sen12ms. ResNet50 layers are initialized with the weight from Moco but input module is initialized with random weights -- Finetune v2 (1k dataset) - - freezing ResNet50 fully or partially does not seem to help with accuracy. We will continue explore and share the results once we are sure there is no issue with implementation. 
- -| Metrics|single-label Average Accuracy (%)|multi-label Overall Accuracy (%) | -| --- | --- | --- | -| Supervised s2 (full) | .57 | .60| -| Supervised s1/s2 (full) | .45 | .64| -| Supervised RGB (full) | .45 | .58| -| --- | --- | --- | -| Supervised s2 (1024) | **.4355** | .5931 | -| Supervised s1/s2 (1024) | .4652 | .4652 | -| Supervised 1x1 s2 (1024) | **.3863** | .4893 | -| Supervised 1x1 s1/s2 (1024) | .4094 | .5843 | -| Moco s2 (1024) | .4545 | **.6277** | -| Moco s1/s2 (1024) | .4514 | **.6697** | -| Moco 1x1 s2 (1024)| .4454 | **.601**| -| Moco 1x1 s1/s2 (1024)| _.425_ (?) | .5302 | -| Moco 1x1 RND s2 (1024)| 0.371 | .5374 | -| Moco 1x1 RND s1/s2 (1024)| .4477 | .5152 | - -(before): before learning rate adjustment - -other findings: -- ResNet50_1x1 (s2) 100 epoch and 500 epoch shows similar accuracy. (especially for multi-label). -- ResNet50_1x1 (s2) shows significantly better result with 0.001 than 0.00001 (both single label and multi-label) - -(findings pending verifications) -- By looking at the results between models with 1x1 conv and without 1x1 conv counterparts, almost all models with 1x1 conv block underperform the ones without 1x1 conv block. It appears that adding 1x1 conv layer as a volumn filters may loss some bands information overall with the finetune evalutions. - -## Conclusion - - - -## References -TODO: Use APA style later. Do this once the draft is ready by taking the links in the document, giving them a number and use APA style generator. -[1] -[2] -[3] -[4] -[5] diff --git a/references/hpt_repo.md b/references/hpt_repo.md new file mode 100644 index 0000000..c43f900 --- /dev/null +++ b/references/hpt_repo.md @@ -0,0 +1,351 @@ +# Hierarchical Pretraining: Research Repository + +This is a research repository for the submission "Self-Supervised Pretraining Improves Self-Supervised Pretraining" + +For initial setup, refer to [setup instructions](setup_pretraining.md). + +## Setup Weight & Biases Tracking + +```bash +export WANDB_API_KEY= +export WANDB_ENTITY=cal-capstone +export WANDB_PROJECT=scene_classification +#export WANDB_MODE=dryrun +``` + +## Base Training + +[OpenSelfSup](https://github.com/Berkeley-Data/OpenSelfSup) + +Right now we assume ImageNet base trained models. +```bash +cd OpenSelfSup/data/basetrain_chkpts/ +./download-pretrained-models.sh +``` + +## Pretraining With a New Dataset + +[hpt](https://github.com/Berkeley-Data/hpt) + +We have a handy set of config generators to make pretraining with a new dataset easy and consistent! + +**FIRST**, you will need the image pixel mean/std of your dataset, if you don't have it, you can do: +```bash +cd src/data/ + +# for sen12ms, run multiples times replacing --use_s1 by --use_s2 or --use_RGB +./compute-dataset-pixel-mean-std-sen12ms.py --data_dir /storage/sen12ms_x --data_index_dir /scratch/crguest/hpt/data --use_s1 --numworkers 1 + +# for others +./compute-dataset-pixel-mean-std.py --data /scratch/crguest/data/sen12ms_small --numworkers 20 --batchsize 256 + +where image-folder has the structure from ImageFolder in pytorch +class/image-name.jp[e]g +or whatever image extension you're using +``` +if your dataset is not arranged in this way, you can either: +(i) use symlinks to put it in this structure +(ii) update the above script to read in your data + +NOTE: For sen12ms, the code is not working as expected (refer to [this issue](https://github.com/Berkeley-Data/hpt/issues/24), until then use the following. 
+``` +bands_mean = {'s1_mean': [-11.76858, -18.294598], + 's2_mean': [1226.4215, 1137.3799, 1139.6792, 1350.9973, 1932.9058, + 2211.1584, 2154.9846, 2409.1128, 2001.8622, 1356.0801]} + +bands_std = {'s1_std': [4.525339, 4.3586307], + 's2_std': [741.6254, 740.883, 960.1045, 946.76056, 985.52747, + 1082.4341, 1057.7628, 1136.1942, 1132.7898, 991.48016]} +``` + +## Pre-training with SEN12MS Dataset +[OpenSelfSup](https://github.com/Berkeley-Data/OpenSelfSup) +- see `src/utils/pretrain-runner.sh` for end-to-end run (require prep creating config files). + +Check installation by pretraining using mocov2, extracting the model weights, evaluating the representations, and then viewing the results on tensorboard or [wandb](https://wandb.ai/cal-capstone/hpt): + +Set up experimental tracking and model versioning: +```bash +export WANDB_API_KEY= +export WANDB_ENTITY=cal-capstone +export WANDB_PROJECT=hpt4 +``` + +#### Run pre-training +```bash +cd OpenSelfSup + +# set which GPUs to use +# CUDA_VISIBLE_DEVICES=1 +# CUDA_VISIBLE_DEVICES=0,1,2,3 + +# (sanity check) Single GPU training on samll dataset +/tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_aug_20ep.py --debug + +# (sanity check) Single GPU training on samll dataset on sen12ms fusion +./tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_fusion_in_smoke_aug.py --debug + +# (sanity check) 4 GPUs training on samll dataset +./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_aug_20ep.py 4 + +# (sanity check) 4 GPUs training on samll fusion dataset +./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_fusion_in_smoke_aug.py 4 + +# distributed full training +/tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_fulltrain_20ep.py 4 +``` + +#### (OPTIONAL) download pre-trained models + +Some of key pre-trained models are on s3 (s3://sen12ms/pretrained): +- [200 epochs w/o augmentation: vivid-resonance-73](https://wandb.ai/cjrd/BDOpenSelfSup-tools/runs/3qjvxo2p/overview?workspace=user-cjrd) +- [20 epochs w/o augmentation: silvery-oath7-2rr3864e](https://wandb.ai/cal-capstone/hpt2/runs/2rr3864e?workspace=user-taeil) +- [sen12ms-baseline: soft-snowflake-3.pth](https://wandb.ai/cal-capstone/SEN12MS/runs/3gjhe4ff/overview?workspace=user-taeil) + +``` +aws configure +aws s3 sync s3://sen12ms/pretrained . --dryrun +aws s3 sync s3://sen12ms/pretrained_sup . --dryrun +``` + +#### Extract pre-trained model +Any other models can be restored by run ID if stored with W&B. 
Go to files section under the run to find `*.pth` files + +```bash +BACKBONE=work_dirs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep/epoch_20_moco_in_baseline.pth + +# method 1: From working dir(same system for pre-training) +# CHECKPOINT=work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20.pth + +# method 2: from W&B, {projectid}/{W&B run id} (any system) +CHECKPOINT=hpt2/3l4yg63k + +# Extract the backbone +python tools/extract_backbone_weights.py ${BACKBONE} ${CHECKPOINT} + +``` + + +## Evaluating Pretrained Representations + +Using OpenSelfSup +```bash +python tools/train.py $CFG --pretrained $PRETRAIN + +# RESISC finetune example +tools/train.py --local_rank=0 configs/benchmarks/linear_classification/resisc45/r50_last.py --pretrained work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20_moco_in_basetrain.pth --work_dir work_dirs/benchmarks/linear_classification/resisc45/moco-selfsup/r50_v2_resisc_in_basetrain_20ep-r50_last --seed 0 --launcher=pytorch + + + +``` + + +Using Sen12ms +```bash +``` + + + + + +#### Previous +``` +# Evaluate the representations (NOT SURE) +./benchmarks/dist_train_linear.sh configs/benchmarks/linear_classification/resisc45/r50_last.py ${BACKBONE} +``` + +This has been simplified to simply: +```bash +./utils/pretrain-evaluator.sh -b OpenSelfSup/work_dirs/hpt-pretrain/${shortname}/ -d OpenSelfSup/configs/hpt-pretrain/${shortname} +``` +where `-b` is the backbone directory and `-d` is the config directory. This command also works for cross-dataset evaluation (e.g. evaluate models trained on Resic45 and evaluate on UC Merced dataset). + +**FAQ** + +Where are the checkpoints and logs? E.g., if you pass in `configs/hpt-pretrain/resisc` as the config directory, then the working directories for this evalution is e.g. `work_dirs/hpt-pretrain/resisc/linear-eval/...`. If w&b is enabled, it will be logged on weight & biases + +## Finetuning +Assuming you generated the pretraining project as specified above, finetuning is as simple as: + +```bash +./utils/finetune-runner.sh -d ./OpenSelfSup/configs/hpt-pretrain/${shortname}/finetune/ -b ./OpenSelfSup/work_dirs/hpt-pretrain/${shortname}/ +``` +where `-b` is the backbone directory and `-d` is the config directory +Note: to finetune using other backbones, simply pass in a different backbone directory (the script searches for `final_backbone.pth` files in the provided directory tree) + + +## Finetuning only on pretrained checkpoints with BEST linear analysis + +First, specify the pretraining epochs which gives the best linear evaluation result in `./utils/top-linear-analysis-ckpts.txt`. Here is an example: + +``` +# dataset best-moco-bt best-sup-bt best-no-bt +chest_xray_kids 5000 10000 100000 +resisc 5000 50000 100000 +chexpert 50000 50000 400000 +``` +, in which for `chest_xray_kids` dataset, `5000`-iters, `10000`-iters, `100000`-iters are the best pretrained models under `moco base-training`, `imagenet-supervised base-training`, and `no base-training`, respectively. + +Second, run the following command to perform finetuning only on the best checkpoints (same as above, except that the change of script name): +```bash +./utils/finetune-runner-top-only.sh -d ./OpenSelfSup/configs/hpt-pretrain/${shortname}/finetune/ -b ./OpenSelfSup/work_dirs/hpt-pretrain/${shortname} +``` + + + +## Pretraining on top of pretraining +Using the output of previously pretrained models, it is very easy to correctly setup pretraining on top of the pretraining. 
+
+## Pretraining on top of pretraining
+Using the output of previously pretrained models, it is straightforward to set up pretraining on top of pretraining.
+Simply create a new config
+```
+utils/pretrain-configs/dataname1-dataname2.sh
+```
+(see `resisc-ucmerced.sh` for an example)
+
+and then set the basetrained models to be the `final_backbone.pth` files from the output of the previous pretraining, e.g. to use the RESISC-45 outputs:
+
+```
+export basetrain_weights=(
+  "work_dirs/hpt-pretrain/resisc/moco_v2_800ep_basetrain/50000-iters/final_backbone.pth"
+  "work_dirs/hpt-pretrain/resisc/imagenet_r50_supervised_basetrain/50000-iters/final_backbone.pth"
+  "work_dirs/hpt-pretrain/resisc/no_basetrain/200000-iters/final_backbone.pth"
+)
+```
+
+To select which backbones to use, evaluate the linear performance of the various source outputs (e.g. all the resisc pretrained outputs) on the target data (e.g. the uc-merced data).
+
+Then simply generate the project and execute the pretraining as normal:
+
+```
+./gen-pretrain-project.sh pretrain-configs/dataname1-dataname2.sh
+
+./pretrain-runner.sh -d OpenSelfSup/configs/hpt-pretrain/$dataname1-dataname2
+```
+
+## Object Detection / Semantic Segmentation
+Object detection/segmentation uses detectron2 and takes place in the directory
+```
+OpenSelfSup/benchmarks/detection
+```
+
+**First:** Check if the dataset configs you need are already present in `configs`. E.g. if you're working with COCO, you'll see the following 2 configs:
+```
+configs/coco_R_50_C4_2x.yaml
+configs/coco_R_50_C4_2x_moco.yaml
+```
+We'll use the config with the `_moco` suffix for all object detection and segmentation. If your configs already exist, skip the next step.
+
+**Next:** assuming your configs do not exist, set up the configs you need for your dataset by copying an existing set of configs
+```
+cp configs/coco_R_50_C4_2x.yaml ${MYDATA}_R50_C4_2x.yaml
+cp configs/coco_R_50_C4_2x_moco.yaml ${MYDATA}_R50_C4_2x_moco.yaml
+```
+Edit `${MYDATA}_R50_C4_2x.yaml` and set `MIN_SIZE_TRAIN` and `MIN_SIZE_TEST` to be appropriate for your dataset. Also, rename `TRAIN` and `TEST` to have your dataset name, set `MASK_ON` to `True` if doing semantic segmentation, and update `STEPS` and `MAX_ITER` if running the training for a different amount of time is appropriate (check relevant publications / codebases to set the training schedule).
+
+Edit `${MYDATA}_R50_C4_2x_moco.yaml` and set `PIXEL_MEAN` and `PIXEL_STD` (use the `compute-dataset-pixel-mean-std.py` script above if you don't know them).
+
+Then, edit `train_net.py` and add the appropriate data registry lines for your train/val data
+```
+register_coco_instances("dataname_train", {}, "obj-labels-in-coco-format_train.json", "datasets/dataname/dataname_train")
+register_coco_instances("dataname_val", {}, "obj-labels-in-coco-format_val.json", "datasets/dataname/dataname_val")
+```
+
+Then, set up symlinks to your data under `datasets/dataname/dataname_train` and `datasets/dataname/dataname_val`, replacing `dataname` with the dataset name used in the config/registry; a sketch is shown below.
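+
+For example, a minimal sketch of the symlink setup (the `/data/dataname/...` source paths are hypothetical -- point them at wherever your images actually live):
+```bash
+mkdir -p datasets/dataname
+# hypothetical source locations; adjust to your storage layout
+ln -s /data/dataname/train datasets/dataname/dataname_train
+ln -s /data/dataname/val datasets/dataname/dataname_val
+```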
+
+**Next**, convert your backbone(s) to detectron2 format (NOTE: I recommend keeping converted backbones in the same directory as the originals and appending a `-detectron2` suffix), e.g.
+```
+python convert-pretrain-to-detectron2.py ../../data/basetrain_chkpts/imagenet_r50_supervised.pth ../../data/basetrain_chkpts/imagenet_r50_supervised-detectron2.pth
+```
+
+**Next**, kick off training
+```
+python train_net.py --config-file configs/${MYDATA}_R50_C4_2x_moco.yaml --num-gpus 4 OUTPUT_DIR results/${UNIQUE_DATANAME_EXACTLY_DESCRIBING_THIS_RUN}/ TEST.EVAL_PERIOD 2000 MODEL.WEIGHTS ../../data/basetrain_chkpts/imagenet_r50_supervised-detectron2.pth SOLVER.CHECKPOINT_PERIOD ${INT_HOW_OFTEN_TO_CHECKPOINT}
+```
+Results will be in `results/${UNIQUE_DATANAME_EXACTLY_DESCRIBING_THIS_RUN}`, and you can use TensorBoard to view them.
+
+## Commit and Share Results
+Run the following command to grab all results (linear analysis, finetunes, etc.) and put them into the appropriate JSON results file in `results/`:
+```
+./utils/update-all-results.sh
+```
+
+You can verify the results in `results` and then add the new/updated results file to git and commit.
+
+**Did you get an error message such as:**
+```
+!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
+
+Please investigate as your results may not be complete.
+(see errors in file: base-training/utils/tmp/errors.txt)
+
+will not include partial result for /home/XXX/base-training/utils/../OpenSelfSup/work_dirs/hpt-pretrain/resisc/finetune/1000-labels/imagenet_r50_supervised_basetrain/50000-iters-2500-iter-0_01-lr-finetune/20200911_170916.log.json
+!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
+```
+This means that this particular evaluation run did not appear to run for enough iterations. Investigate the provided log file, rerun any necessary evaluations, and remove the offending log file.
+
+**Debugging this script:** it finds the top validation accuracy and saves the corresponding test accuracy using the following script:
+```
+./utils/agg-results.sh
+```
+which outputs results to `utils/tmp/results.txt` and errors to `utils/tmp/errors.txt`. Look at these files if your results aren't being generated correctly.
+
+## Generate plots
+
+```bash
+cd utils
+python plot-results.py
+```
+
+See the plots in the directory `plot-results`
+(you can also pass in a `--data` flag to only generate plots for a specific dataset, e.g. `python plot-results.py --data resisc`).
+
+**To plot the eval & test acc curves**, use `./utils/plot.py`
+```bash
+cd utils
+python plot.py --fname PLOT_NAME --folder FOLDER_CONTAINING_DIFFERENT_.PTH_FOLDERs
+```
+
+**To generate the plot for Exp-2-finetuning**, do
+```bash
+bash utils/plot-results-exp-2.sh
+```
+
+See the plot in directory `plot-results/exp-2`.
+
+**To generate the plot for Exp-3-Hierarchical Pretraining**, do
+```bash
+bash utils/plot-results-exp-3.sh
+```
+
+See the plot in directory `plot-results/exp-3`.
+
+## Getting activations for similarity measures
+
+Run `get_acts.py` with a model used for a classification task
+(one that has a test/val set).\
+Alternatively, run `dist_get_acts.sh` as follows:
+```shell
+bash dist_get_acts.sh ${CFG} ${CHECKPOINT} [--grab_conv...]
+```
+Default behavior is to grab a batch of activations from all linear layers.
+Setting `--grab_conv` will capture a single batch of all convolutional layers.\
+Layers will be saved in `${WORK_DIR}/model_acts.npz`.
+The npz contains a dictionary which maps layer names to the activations.
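+
+As a quick sanity check, you can print what was captured (a minimal sketch; assumes `WORK_DIR` points at the run's work dir):
+```bash
+# print each captured layer name and the shape of its activation batch
+python -c "import numpy as np; acts = np.load('${WORK_DIR}/model_acts.npz'); [print(k, acts[k].shape) for k in acts.files]"
+```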
+
+
+## Debugging and Developing Within OpenSelfSup
+
+Here's a command that will allow breakpoints (WARNING: the results produced with the `--debug` flag SHOULD NOT BE USED -- they disable sync batch norms and are not comparable to other results):
+
+```bash
+# from OpenSelfSup/
+# replace with your desired config
+python tools/train.py configs/hpt-pretrain/resisc/moco_v2_800ep_basetrain/500-iters.py --work_dir work_dirs/debug --debug
+```
+
diff --git a/references/model_architectures.md b/references/model_architectures.md
index 1300689..11fe6e6 100644
--- a/references/model_architectures.md
+++ b/references/model_architectures.md
@@ -11,20 +11,35 @@
 - Finetune v2 (1k dataset)
 - freezing ResNet50 fully or partially does not seem to help with accuracy. We will continue explore and share the results once we are sure there is no issue with implementation.
- #### Key pretrained models
+![[pretraining_loss_comparisions.png]]
+
 Some pretrained models:
 **Sensor Augmentation**
+- [dainty-dragon-14](https://wandb.ai/cal-capstone/hpt3/runs/b2de56v2) hpt3
+
+(old)
 - [vivid-resonance-73](https://wandb.ai/cjrd/BDOpenSelfSup-tools/runs/3qjvxo2p)
 - [silvery-oath-7](https://wandb.ai/cal-capstone/hpt2/runs/2rr3864e)
 - sen12_crossaugment_epoch_1000.pth: 1000 epocs
 **Data Fusion - Augmentation Set 2**
 - [(optional fusion) crimson-pyramid-70](https://wandb.ai/cal-capstone/hpt4/runs/2iu8yfs6): 200 epochs
-- [(partial fusion) laced-water-61](https://wandb.ai/cal-capstone/hpt4/runs/367tz8vs) 200 epochs, 32K
-- [(partial fusion) visionary-lake-62](https://wandb.ai/cal-capstone/hpt4/runs/1srlc7jr/overview?workspace=user-taeil) should deprecate. different number of epochs from other pretrained models
+- [(partial fusion) decent-bird-80](https://wandb.ai/cal-capstone/hpt4/runs/yuy7sdav) replaces [(partial fusion) laced-water-61](https://wandb.ai/cal-capstone/hpt4/runs/367tz8vs) and [visionary-lake-62](https://wandb.ai/cal-capstone/hpt4/runs/1srlc7jr) (consistent kernel size)
+- [(full fusion) volcanic-disco-84](https://wandb.ai/cal-capstone/hpt4/runs/21toacw1).
+
+**Data Fusion - Augmentation Set 1**
+- [(optional fusion) proud-snowball-86](https://wandb.ai/cal-capstone/hpt4/runs/3lsgncpe)
+- [silvery-meadow-88](https://wandb.ai/cal-capstone/hpt4/runs/1jkg2ym0)
+
+**Archived**
 - [(full fusion) electric-mountain-33](https://wandb.ai/cal-capstone/hpt4/runs/ak0xdbfu)
+- [(partial fusion) visionary-lake-62](https://wandb.ai/cal-capstone/hpt4/runs/1srlc7jr/overview?workspace=user-taeil) should be deprecated; different number of epochs from other pretrained models
+
+
+#### Running
 
-**Data Fusion - Augmentation Set 1**
\ No newline at end of file
+volcacine 128_64 all : gpu 9
+silvery-meadow-88: gpu 7

From 71f32a5f101e7c790b6b0fb8cc03d1703989a9ce Mon Sep 17 00:00:00 2001
From: taeil
Date: Wed, 12 May 2021 21:17:36 -0700
Subject: [PATCH 13/14] updated model description

---
 references/model_architectures.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/references/model_architectures.md b/references/model_architectures.md
index 11fe6e6..052ffd4 100644
--- a/references/model_architectures.md
+++ b/references/model_architectures.md
@@ -28,7 +28,9 @@ Some pretrained models:
 **Data Fusion - Augmentation Set 2**
 - [(optional fusion) crimson-pyramid-70](https://wandb.ai/cal-capstone/hpt4/runs/2iu8yfs6): 200 epochs
 - [(partial fusion) decent-bird-80](https://wandb.ai/cal-capstone/hpt4/runs/yuy7sdav) replaces [(partial fusion) laced-water-61](https://wandb.ai/cal-capstone/hpt4/runs/367tz8vs) and [visionary-lake-62](https://wandb.ai/cal-capstone/hpt4/runs/1srlc7jr) (consistent kernel size)
-- [(full fusion) volcanic-disco-84](https://wandb.ai/cal-capstone/hpt4/runs/21toacw1).
+- [(full fusion 200 epochs) volcanic-disco-84](https://wandb.ai/cal-capstone/hpt4/runs/21toacw1).
+- [(full fusion 500 epochs) pleasant-moon-88](https://wandb.ai/cal-capstone/hpt4/runs/11yc8up0)
+- [(full fusion 900 epochs) major-sky-90](https://wandb.ai/cal-capstone/hpt4/runs/3l1wwwvo)
 
 **Data Fusion - Augmentation Set 1**
 - [(optional fusion) proud-snowball-86](https://wandb.ai/cal-capstone/hpt4/runs/3lsgncpe)

From ca383f0e6c10145f44d1da832e23619b17cee481 Mon Sep 17 00:00:00 2001
From: taeil
Date: Sat, 5 Jun 2021 07:34:54 -0700
Subject: [PATCH 14/14] new metrics

---
 SEN12MS                           | 2 +-
 references/model_architectures.md | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/SEN12MS b/SEN12MS
index edf41d4..a6012cc 160000
--- a/SEN12MS
+++ b/SEN12MS
@@ -1 +1 @@
-Subproject commit edf41d4b2bf0b756f06327737330e7c892c34dd7
+Subproject commit a6012cc292f79147d85ae4e2d658476a5a5d7fd3
diff --git a/references/model_architectures.md b/references/model_architectures.md
index 052ffd4..7215889 100644
--- a/references/model_architectures.md
+++ b/references/model_architectures.md
@@ -32,6 +32,9 @@ Some pretrained models:
 - [(full fusion 500 epochs) pleasant-moon-88](https://wandb.ai/cal-capstone/hpt4/runs/11yc8up0)
 - [(full fusion 900 epochs) major-sky-90](https://wandb.ai/cal-capstone/hpt4/runs/3l1wwwvo)
+
+- [(full fusion 200 - 180K) stilted-mountain-91](https://wandb.ai/cal-capstone/hpt4/runs/xcthtqmn) - not evaluated yet
+
+
 **Data Fusion - Augmentation Set 1**
 - [(optional fusion) proud-snowball-86](https://wandb.ai/cal-capstone/hpt4/runs/3lsgncpe)
 - [silvery-meadow-88](https://wandb.ai/cal-capstone/hpt4/runs/1jkg2ym0)