@iona5 iona5 commented Sep 18, 2024

The image_id attribute of the labels in iteration005 was missing the last character of the Planet image ID.

For example, where the Planet image ID was 4688749_0470419_2021-07-12_105c, the attribute value was only 4688749_0470419_2021-07-12_105.

This PR fixes the dataset accordingly.

# Fix iteration005: its image_id values are missing the last character.
from pathlib import Path

import geopandas as gpd

label_source_dir = Path("../training_data_creation/slumps/03_processed")
training_data_folder = Path("ML_training_labels/retrogressive_thaw_slumps/")

dataset_name = "TrainingLabel_RTS_PlanetScope_Nitze_iteration005"
gpkg_path = training_data_folder / dataset_name / f"{dataset_name}.gpkg"
layer_name = gpd.list_layers(gpkg_path).name[0]
gpkg_it005 = gpd.read_file(gpkg_path, layer=layer_name)

for broken_image_id in gpkg_it005.image_id.unique():
    # Find the source directory whose name starts with the truncated ID.
    matching_dirs = [p for p in label_source_dir.rglob(f"{broken_image_id}*") if p.is_dir()]
    if not matching_dirs:
        raise FileNotFoundError(f"no source directory found for {broken_image_id}")
    correct_image_id = matching_dirs[0].name
    print(f"-   {broken_image_id}\n => {correct_image_id}")
    gpkg_it005["image_id"] = gpkg_it005["image_id"].replace(broken_image_id, correct_image_id)

# Write the corrected layer back to the same GeoPackage.
gpkg_it005.to_file(gpkg_path, layer=layer_name)
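
The script above takes the first directory whose name starts with the truncated ID. As a possible safeguard, the lookup could instead require exactly one candidate that is one character longer than the truncated value. The sketch below is not part of this PR: `resolve_truncated_id` is a hypothetical helper, the "exactly one extra character" rule is an assumption based on the description above, and the second entry in `full_ids` is made up for illustration.

```python
def resolve_truncated_id(truncated: str, full_ids: list[str]) -> str:
    """Return the single full ID that extends `truncated` by exactly one character.

    Hypothetical helper: assumes every broken ID lost only its final character.
    """
    matches = sorted(
        {
            full_id
            for full_id in full_ids
            if full_id.startswith(truncated) and len(full_id) == len(truncated) + 1
        }
    )
    if len(matches) != 1:
        # Zero matches means the source is missing; several mean the prefix is ambiguous.
        raise ValueError(
            f"expected exactly one match for {truncated!r}, found {len(matches)}: {matches}"
        )
    return matches[0]


# Candidate names, e.g. collected from the directory names under label_source_dir.
# Only the first entry comes from the PR description; the second is illustrative.
full_ids = [
    "4688749_0470419_2021-07-12_105c",
    "4688750_0470419_2021-07-12_2416",
]

fixed = resolve_truncated_id("4688749_0470419_2021-07-12_105", full_ids)
print(fixed)
```

Raising on zero or multiple matches stops the run before a wrong ID could be written back to the GeoPackage.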

@iona5 iona5 marked this pull request as ready for review September 18, 2024 11:54