Hey folks,
Let's take a look at a more straightforward contrastive learning baseline, using the following set of pretraining techniques, i.e. each of the following steps is a separate pretraining technique, ordered by increasing complexity:
1. (pretraining) Have conv1 take 12 bands as input and use all 12 bands for MoCo pretraining (this merges s1 and s2 so they are treated as a single "image"). Use whatever subset of the MoCo-v2 augmentations we have available across the 12 bands.
2. Do the same as (1.) except include cases where either s1 or s2 is dropped from both the query and key view (e.g. a query might have s1+s2, s1, or s2 as input, and the key would have the same input with a different set of augmentations). We can zero-pad the removed s1 or s2.
3. Do the same as (2.) except include cases where either s1 or s2 is dropped independently for the query and key view (e.g. a query might have s1+s2, s1, or s2 as input, and the key might independently have s1+s2, s1, or s2 as input).
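A minimal sketch of the view construction for the three techniques above, assuming s1 contributes 2 bands and s2 contributes 10 (the split and the function names are illustrative, not from the issue; only the 12-band total and the zero-padding are):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_view(s1, s2, mode):
    """Stack s1/s2 into one 12-band array, zero-padding a dropped modality."""
    if mode == "s1":
        s2 = np.zeros_like(s2)
    elif mode == "s2":
        s1 = np.zeros_like(s1)
    return np.concatenate([s1, s2], axis=0)  # (12, H, W)

def sample_pair(s1, s2, technique):
    """Build a (query, key) pair under techniques 1-3 (augmentations omitted)."""
    modes = ["s1+s2", "s1", "s2"]
    if technique == 1:            # (1.) both views always use all 12 bands
        q_mode = k_mode = "s1+s2"
    elif technique == 2:          # (2.) one mode sampled, shared by query and key
        q_mode = k_mode = rng.choice(modes)
    else:                         # (3.) modes sampled independently per view
        q_mode, k_mode = rng.choice(modes, size=2)
    return make_view(s1, s2, q_mode), make_view(s1, s2, k_mode)
```

Both views stay 12-channel regardless of the sampled mode, so the same encoder can consume them without architectural changes.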
The idea here is that in (1.) we're doing instance discrimination where an input image is really a composition of two images. In (2.) we're doing instance discrimination where an input image can also come from only s1 or s2, with the query and key sharing the same modality set. In (3.), the query and key can each independently be any of s1+s2, s1, or s2.