## Commits (63)
- `fc9ccb9` add evalution assessment for sen12ms (Mar 22, 2021)
- `c4ea156` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `43fea0b` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `f93f140` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `520fe0f` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `c145c4c` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `00d3f12` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `ff18a68` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `b782681` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `08b0bab` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `3120c8b` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `7a31561` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `eebed49` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `871f186` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `49bbbc0` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `8701067` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `4c760c6` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `4920663` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `9d36b46` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `24dd6e0` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `58cbded` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `40d711c` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `2b49f21` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `8a77484` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `65d529b` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `ccd9330` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `90288a4` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `9418f93` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `14de6d8` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `e8455d6` Update evaluation_sen12ms_assessment.md (tchken, Mar 22, 2021)
- `ed78916` Made some minor changes (suryagutta, Mar 22, 2021)
- `45a8edb` Made some more minor changes (suryagutta, Mar 22, 2021)
- `1b1f838` Add SEN12MS as sub module (taeil, Mar 25, 2021)
- `d388be7` Add OpenSelfSup as sub module (taeil, Mar 25, 2021)
- `aea59e5` updated submodule url (taeil, Mar 25, 2021)
- `6d7f559` Merge remote-tracking branch 'origin/master' into taeil (taeil, Mar 26, 2021)
- `8dd800e` added instructions (taeil, Mar 27, 2021)
- `553a2fd` updated instructions (taeil, Mar 27, 2021)
- `9372a60` Update model_architectures.md (taeil, Mar 27, 2021)
- `21e6e88` Update model_architectures.md (taeil, Mar 27, 2021)
- `ad6b17e` updated instructions (taeil, Apr 4, 2021)
- `8928b87` updated references (taeil, Apr 4, 2021)
- `5980c8d` updated references (taeil, Apr 4, 2021)
- `55d312e` updated references (taeil, Apr 4, 2021)
- `e6b64f8` add tmp file for verifying mean and std (Apr 5, 2021)
- `6a9185a` add bigearthnet custom dataset class and calc dataset info script (Apr 6, 2021)
- `4049792` rm personal checkup file (Apr 6, 2021)
- `ce4e4c1` add desc (Apr 6, 2021)
- `7f42f00` Update README.md (tchken, Apr 6, 2021)
- `31bc8d9` rm rgb s2, re-work on s2 for bigearthnet (Apr 6, 2021)
- `61ab19a` rm rgb s2 to re work on bigearth s2 load (Apr 6, 2021)
- `de5d415` rm lines not used (Apr 7, 2021)
- `0e6d9fd` update BigEarthNet channel data stats calc. inlude also the s-2 data … (Apr 8, 2021)
- `4801489` correct s1 band name 'VH' (Apr 8, 2021)
- `0350894` Update README.md (tchken, Apr 8, 2021)
- `9c25985` Update README.md (tchken, Apr 8, 2021)
- `3aae7bb` Update README.md (tchken, Apr 8, 2021)
- `92c6232` added project file for hpt (taeil, Apr 11, 2021)
- `70d230c` Merge remote-tracking branch 'origin/taeil' into ken (cjrd, Apr 20, 2021)
- `d4eb4a8` merge (cjrd, Apr 20, 2021)
- `de661c0` submodule branch (cjrd, Apr 20, 2021)
- `368aa0a` removed submodules (cjrd, Apr 20, 2021)
- `741f6b2` rm submod (cjrd, Apr 20, 2021)
## Files changed

- `.gitmodules`: empty file added
- `.idea/.gitignore`: 8 additions, 0 deletions
- `.idea/deployment.xml`: 22 additions, 0 deletions
- `.idea/hpt.iml`: 19 additions, 0 deletions
- `.idea/inspectionProfiles/profiles_settings.xml`: 6 additions, 0 deletions
- `.idea/misc.xml`: 7 additions, 0 deletions
- `.idea/modules.xml`: 8 additions, 0 deletions
- `.idea/other.xml`: 6 additions, 0 deletions
- `.idea/runConfigurations/single_train.xml`: 30 additions, 0 deletions
- `.idea/vcs.xml`: 8 additions, 0 deletions
## README.md (97 additions, 30 deletions)

@@ -2,14 +2,14 @@

This is a research repository for the submission "Self-Supervised Pretraining Improves Self-Supervised Pretraining"

For initial setup, refer to [setup instructions](setup_pretraining.md).

## Setup Weights & Biases Tracking

```bash
export WANDB_API_KEY=<use your API key>
export WANDB_ENTITY=cal-capstone
export WANDB_PROJECT=scene_classification
#export WANDB_MODE=dryrun
```

@@ -58,54 +58,102 @@

```
bands_std = {'s1_std': [4.525339, 4.3586307],
1082.4341, 1057.7628, 1136.1942, 1132.7898, 991.48016]}
```
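These per-band statistics are intended for channel-wise input normalization before pretraining. A minimal sketch of how they would be applied (the `normalize_bands` helper and the Sentinel-1 mean values below are illustrative, not from the repo; only the `s1_std` values come from the stats above):

```python
import numpy as np

def normalize_bands(patch, means, stds):
    """Channel-wise normalization of a (bands, H, W) patch: (x - mean) / std."""
    means = np.asarray(means, dtype=np.float32).reshape(-1, 1, 1)
    stds = np.asarray(stds, dtype=np.float32).reshape(-1, 1, 1)
    return (patch.astype(np.float32) - means) / stds

# Illustrative Sentinel-1 example; substitute the dataset stats above for real runs.
s1 = np.random.rand(2, 256, 256).astype(np.float32) * 30 - 35  # fake VV/VH backscatter (dB)
s1_norm = normalize_bands(s1, means=[-11.8, -18.3], stds=[4.525339, 4.3586307])
```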

## Pre-training with SEN12MS Dataset
[OpenSelfSup](https://github.com/Berkeley-Data/OpenSelfSup)
- see `src/utils/pretrain-runner.sh` for an end-to-end run (requires first creating the config files).

Check the installation by pretraining with MoCo v2, extracting the model weights, evaluating the representations, and then viewing the results on TensorBoard or [wandb](https://wandb.ai/cal-capstone/hpt):

Set up experimental tracking and model versioning:
```bash
export WANDB_API_KEY=<use your API key>
export WANDB_ENTITY=cal-capstone
export WANDB_PROJECT=hpt4
```

#### Run pre-training
```bash
cd OpenSelfSup

# set which GPUs to use
# CUDA_VISIBLE_DEVICES=1
# CUDA_VISIBLE_DEVICES=0,1,2,3

# (sanity check) single-GPU training on a small dataset
./tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_aug_20ep.py --debug

# (sanity check) single-GPU training on a small SEN12MS fusion dataset
./tools/single_train.sh configs/selfsup/moco/r50_v2_sen12ms_fusion_in_smoke_aug.py --debug

# (sanity check) 4-GPU training on a small dataset
./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_basetrain_aug_20ep.py 4

# (sanity check) 4-GPU training on a small fusion dataset
./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_fusion_in_smoke_aug.py 4

# distributed full training
./tools/dist_train.sh configs/selfsup/moco/r50_v2_sen12ms_in_fulltrain_20ep.py 4
```

#### (OPTIONAL) download pre-trained models

Some key pre-trained models are on S3 (`s3://sen12ms/pretrained`):
- [200 epochs w/o augmentation: vivid-resonance-73](https://wandb.ai/cjrd/BDOpenSelfSup-tools/runs/3qjvxo2p/overview?workspace=user-cjrd)
- [20 epochs w/o augmentation: silvery-oath7-2rr3864e](https://wandb.ai/cal-capstone/hpt2/runs/2rr3864e?workspace=user-taeil)
- [sen12ms-baseline: soft-snowflake-3.pth](https://wandb.ai/cal-capstone/SEN12MS/runs/3gjhe4ff/overview?workspace=user-taeil)

```bash
aws configure
aws s3 sync s3://sen12ms/pretrained . --dryrun       # preview; drop --dryrun to actually copy
aws s3 sync s3://sen12ms/pretrained_sup . --dryrun
```

#### Extract pre-trained model
Any other model can be restored by run ID if it was stored with W&B: open the Files section under the run to find the `*.pth` files.

```bash
BACKBONE=work_dirs/selfsup/moco/r50_v2_sen12ms_in_basetrain_20ep/epoch_20_moco_in_baseline.pth

# method 1: from the working dir (same system used for pre-training)
# CHECKPOINT=work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20.pth

# method 2: from W&B, {project_id}/{run_id} (works from any system)
CHECKPOINT=hpt2/3l4yg63k

# extract the backbone weights
python tools/extract_backbone_weights.py ${BACKBONE} ${CHECKPOINT}

```
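The extraction step keeps only the encoder ("backbone") weights from the full MoCo checkpoint, dropping the projection head and the negatives queue, so downstream classifiers can load them directly. A rough sketch of that filtering with made-up key names (the actual script additionally handles real checkpoint files and W&B run IDs):

```python
def extract_backbone(state_dict, prefix="backbone."):
    """Keep only the entries under `prefix`, with the prefix stripped."""
    return {k[len(prefix):]: v
            for k, v in state_dict.items()
            if k.startswith(prefix)}

# Hypothetical MoCo checkpoint contents:
checkpoint = {
    "backbone.conv1.weight": "w0",  # encoder weight: kept as "conv1.weight"
    "neck.fc0.weight": "w1",        # projection head: dropped
    "queue": "q",                   # MoCo negatives queue: dropped
}
backbone = extract_backbone(checkpoint)  # {"conv1.weight": "w0"}
```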


## Evaluating Pretrained Representations

Using OpenSelfSup
```bash
# evaluate / fine-tune on top of the extracted weights
python tools/train.py $CFG --pretrained $PRETRAIN

# RESISC linear-classification fine-tune example
tools/train.py --local_rank=0 configs/benchmarks/linear_classification/resisc45/r50_last.py --pretrained work_dirs/selfsup/moco/r50_v2_resisc_in_basetrain_20ep/epoch_20_moco_in_basetrain.pth --work_dir work_dirs/benchmarks/linear_classification/resisc45/moco-selfsup/r50_v2_resisc_in_basetrain_20ep-r50_last --seed 0 --launcher=pytorch
```

To perform BYOL pretraining instead, add the `-b` flag.

Using Sen12ms
```bash
# simply add the `-b` flag to the command above
./utils/pretrain-runner.sh -d OpenSelfSup/configs/hpt-pretrain/${shortname} -b
```


Congratulations: you've launched a full hierarchical pretraining experiment.

**FAQs/PROBLEMS?**
* How does `pretrain-runner.sh` keep track of what's been pretrained?
* In each config directory, it creates a `.pretrain-status` folder to keep track of what's processing/finished. See them with e.g. `find OpenSelfSup/configs/hpt-pretrain -name '.pretrain-status'`
* How to redo a pretraining, e.g. because it crashed or something changed?
  * Remove the associated `.proc` or `.done` file. Find these with e.g.
```bash
find OpenSelfSup/configs/hpt-pretrain -name '.proc'
find OpenSelfSup/configs/hpt-pretrain -name '.done'
```

## Evaluating Pretrained Representations

#### Previous
```
# Evaluate the representations (NOT SURE)
./benchmarks/dist_train_linear.sh configs/benchmarks/linear_classification/resisc45/r50_last.py ${BACKBONE}
```

This has been simplified to:
```bash
./utils/pretrain-evaluator.sh -b OpenSelfSup/work_dirs/hpt-pretrain/${shortname}/ -d OpenSelfSup/configs/hpt-pretrain/${shortname}
```

@@ -301,3 +349,22 @@

```bash
python tools/train.py configs/hpt-pretrain/resisc/moco_v2_800ep_basetrain/500-it

```

## (Other) BigEarthNet bands mean and standard deviation

For S-1 data, the band names are {'VV', 'VH'}.

For S-2 data, the band names are {'B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B09', 'B11', 'B12', 'B8A'}.

First change into the data directory:

```bash
cd hpt/src/data/
```

Then calculate the band stats by running:

```bash
bash dataset_calc_BigEarthNet.sh
```
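Conceptually, a band-stats script like this accumulates per-band sums over every patch and derives the mean and standard deviation at the end. A minimal sketch under the assumption that each tile arrives as a `(bands, H, W)` array (file loading omitted; the `band_stats` name is illustrative, not the script's API):

```python
import numpy as np

def band_stats(patches):
    """Per-band mean and std over an iterable of (bands, H, W) arrays."""
    n = 0      # pixels seen per band
    s = None   # per-band sum of values
    sq = None  # per-band sum of squared values
    for p in patches:
        flat = p.astype(np.float64).reshape(p.shape[0], -1)
        n += flat.shape[1]
        s = flat.sum(axis=1) if s is None else s + flat.sum(axis=1)
        sq = (flat ** 2).sum(axis=1) if sq is None else sq + (flat ** 2).sum(axis=1)
    mean = s / n
    std = np.sqrt(sq / n - mean ** 2)
    return mean, std

# Two fake 2-band tiles with values 1 and 3: per-band mean 2.0, std 1.0.
mean, std = band_stats([np.ones((2, 4, 4)), 3 * np.ones((2, 4, 4))])
```

Accumulating sums rather than stacking all tiles in memory is what makes this workable for a dataset the size of BigEarthNet.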

Binary file added: `data/small_sample.pkl` (not shown).