Fix(preprocessing): duplicated labels by hollandjg · Pull Request #67 · WilhelmusLab/ebseg

hollandjg · 2025-01-28T19:40:42Z

Ensure that floes detected at different scales don't have the same label.

Changes:

Avoid integer overflow by using cv2.connectedComponentsWithStats, which has a 32-bit integer type for tracking labels
Ensure that floes from iteration n+1 are labelled starting from highest_label_so_far + 1, i.e. one above the highest label assigned in any previous iteration
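
As a rough illustration of the second change, the sketch below merges label images from successive scales so that no label is reused. It is not the code in this PR: `merge_scale_labels` is a hypothetical helper, the inputs are assumed to be already labelled (0 for background, 1..k for floes), and the rule that earlier scales keep their pixels is an assumption made only for the example. The output uses int32 to mirror the 32-bit label type mentioned above.

```python
import numpy as np


def merge_scale_labels(per_scale_labels):
    """Merge per-scale label images into one int32 image with unique labels.

    Hypothetical helper: each input uses 0 for background and 1..k for floes.
    """
    output = np.zeros(per_scale_labels[0].shape, dtype=np.int32)
    highest_label_so_far = 0
    for labels in per_scale_labels:
        labels = labels.astype(np.int32)
        # Assumption for this sketch: pixels claimed at an earlier scale are kept.
        mask = (labels > 0) & (output == 0)
        # Shift this scale's labels above everything assigned so far.
        output[mask] = labels[mask] + highest_label_so_far
        highest_label_so_far = max(highest_label_so_far, int(output.max()))
    return output


scale_1 = np.array([[1, 1, 0, 0],
                    [0, 0, 0, 2]])
scale_2 = np.array([[0, 0, 0, 0],
                    [1, 1, 0, 0]])
merged = merge_scale_labels([scale_1, scale_2])
print(merged)
```

Without the offset, the floe found at the second scale would also be labelled 1 and would be indistinguishable from the first floe of the first scale.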

Resolutions:

resolves fix: add min- and max-floe size limits to preprocessing #66
resolves bug: floes at different scales can obliterate each other, leaving disconnected chunks #68

cpaniaguam

A few observations.

.vscode/settings.json

cpaniaguam · 2025-01-29T17:06:30Z

src/ebfloeseg/preprocess.py

+        logger.debug("output after this iteration\n %s" % count_blobs_per_label(output).query("count > 1"))

    # saving the props table
    output = opening(output)


See #66 (comment)

tests/test_preprocess.py

hollandjg · 2025-01-29T19:55:39Z

We still get some cases, like this one, where there's a disconnected part of the floe (yellow) which has nothing to do with the other floes (greenish) nearby:

... or this one:

(where you can see a disconnected component near the concave bit on the western side of the floe)

hollandjg · 2025-02-12T17:06:09Z

I've some new examples with the updated code. It turns out that the opening operation itself causes some of these cases to arise.

Example: a floe which has multiple "lobes" straight out of the FSD algorithm prior to cleaning

In this case, we need some kind of cleaning.

The input image with the floe marked (I can't see anything in the true-color image):

How it appears after the FSD algorithm:

With opening:

With opening then cleaning:

With cleaning alone:

Example: a floe which has an extra "lobe" once opening is applied

The input image with the floe marked (I can't see anything in the true-color image):

How it appears after the FSD algorithm:

With opening:

With opening then cleaning:

With cleaning alone:

Proposal: include "new" cleaning, check in paper whether opening is a core part of the algorithm and decide what to do with it

hollandjg · 2025-02-13T16:40:47Z

From Buckley et al. https://doi.org/10.5194/tc-18-5031-2024, Figure 2

It looks to me as though the "opening" step at the end isn't a fundamental part of the algorithm.
The opening step was in the original: https://github.com/WilhelmusLab/Segmentation_EB/blame/main/Segmentation_EB.ipynb and we don't have the history of that file in GitHub.

I think the question is whether we want that smoothing step to be applied to all the floes at the end. Given Carlos' comment #66 (comment) it might be inadvisable to leave as-is.

If the smoothing is necessary from a scientific point of view, we could:

Apply opening to each floe individually, but then we may have single pixels where the opening makes it look like they are part of two or more floes. We could apply the opening at each floe scale, so that floes detected at smaller scales wouldn't be allowed to chomp bits out of larger floes.
Get the contour surrounding each floe from the raw (unsmoothed) pixels, and then smooth that contour. There would be some "collisions" between those contours for sure, but the scale of those would hopefully be smaller than the precision we have on the floe edge position.

@danielmwatkins @cpaniaguam Any other ideas? Or preferences?

hollandjg · 2025-02-13T16:41:45Z

Note that the failing test is unrelated to this PR, and is fixed in

test(load): remove broken test that files downloaded look identical to those in repo #69

hollandjg · 2025-03-07T15:05:05Z

I'd like to suggest keeping the "opening" step at the end of the algorithm, and just using the new "cleaning" to remove the extra disconnected parts. That's the smallest change we can make to the algorithm whilst still fixing the problem.
@danielmwatkins @cpaniaguam @ellenbuckley
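
For readers following along, here is a minimal sketch of what such a cleaning step could look like. It is not the `clean_labels_with_multiple_blobs` function from this PR: `find_blobs` and `keep_largest_blob_per_label` are hypothetical names, keeping the largest blob is an assumption about which part should survive, and a small flood fill stands in for `cv2.connectedComponentsWithStats` so the example needs only NumPy.

```python
from collections import deque

import numpy as np


def find_blobs(mask):
    """Return the 8-connected blobs of a 2-D boolean mask as lists of (row, col)."""
    rows, cols = mask.shape
    seen = np.zeros(mask.shape, dtype=bool)
    blobs = []
    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        seen[r0, c0] = True
        queue = deque([(int(r0), int(c0))])
        blob = []
        while queue:
            r, c = queue.popleft()
            blob.append((r, c))
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if inside and mask[rr, cc] and not seen[rr, cc]:
                        seen[rr, cc] = True
                        queue.append((rr, cc))
        blobs.append(blob)
    return blobs


def keep_largest_blob_per_label(labels):
    """Set to background every blob of a label except its largest one."""
    cleaned = labels.copy()
    for label in np.unique(labels):
        if label == 0:  # 0 is background
            continue
        blobs = sorted(find_blobs(labels == label), key=len, reverse=True)
        for blob in blobs[1:]:  # everything after the largest blob is removed
            blob_rows, blob_cols = zip(*blob)
            cleaned[list(blob_rows), list(blob_cols)] = 0
    return cleaned


labels = np.array([[1, 1, 0, 2],
                   [1, 0, 0, 2],
                   [0, 0, 0, 0],
                   [1, 0, 0, 0]])
cleaned = keep_largest_blob_per_label(labels)
print(cleaned)
```

In the example, the isolated pixel of label 1 in the bottom-left corner is removed, while the three-pixel blob of label 1 and the single blob of label 2 are left untouched.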

danielmwatkins · 2025-03-24T19:28:14Z

This is an interesting case because it is one, like you note, where there really isn't anything there that it should be identifying. I like the idea of adding a minimal cleaning step at the end.

cpaniaguam

Thanks for this!

Because the number of different labels in the images might be high, it's probably better to avoid the accumulator pattern in favor of a comprehension. Left a suggestion regarding this.
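
To make the suggestion concrete, the sketch below counts blobs per label with a dict comprehension instead of appending to an accumulator inside a loop. It is only an illustration: `count_blobs` and `count_blobs_per_label_sketch` are hypothetical names, the result is a plain dict rather than whatever the PR's `count_blobs_per_label` returns, and a small flood fill stands in for a library connected-components call.

```python
from collections import deque

import numpy as np


def count_blobs(mask):
    """Count the 8-connected blobs of True pixels in a 2-D boolean mask."""
    rows, cols = mask.shape
    seen = np.zeros(mask.shape, dtype=bool)
    count = 0
    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        count += 1  # found a pixel of a blob not visited yet
        seen[r0, c0] = True
        queue = deque([(int(r0), int(c0))])
        while queue:
            r, c = queue.popleft()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if inside and mask[rr, cc] and not seen[rr, cc]:
                        seen[rr, cc] = True
                        queue.append((rr, cc))
    return count


def count_blobs_per_label_sketch(labels):
    """Map each non-background label to its number of blobs (comprehension form)."""
    return {
        int(label): count_blobs(labels == label)
        for label in np.unique(labels)
        if label != 0
    }


labels = np.array([[1, 1, 0, 2],
                   [0, 0, 0, 2],
                   [1, 0, 0, 2]])
counts = count_blobs_per_label_sketch(labels)
duplicated = [label for label, n in counts.items() if n > 1]
print(counts, duplicated)
```

Here label 1 is split into two blobs and label 2 forms a single blob, so only label 1 is reported as duplicated.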

src/ebfloeseg/preprocess.py

Looked into this. Decided against the requested changes.

hollandjg added 6 commits January 27, 2025 21:36

test(preprocess): add test throwing an error due to multiple disconnected components having the same label

6bc9203

test(preprocess): add new test cases with different data

2c5bd54

feat(preprocess): add logic to ensure that floes of different sizes get systematically increasing labels

8b1bc0b

chore(config): remove extra comma

989eb4c

test: abstract counting of blobs

c8c7a6c

add debugging logging calls

a6c58b4

hollandjg mentioned this pull request Jan 28, 2025

Tracker crashes sometimes when collision indices are out of bounds of matched pairs WilhelmusLab/IceFloeTracker.jl#545

Closed

hollandjg requested review from cpaniaguam, danielmwatkins, mirestrepo and tdivoll January 28, 2025 19:42

hollandjg added 2 commits January 28, 2025 22:17

reformat with black

3b08cdd

reformat test file

bead5b7

hollandjg mentioned this pull request Jan 28, 2025

fix: add min- and max-floe size limits to preprocessing #66

Closed

hollandjg added 6 commits January 29, 2025 14:26

test: skip zeroth background row

a81b902

chore(preprocess): add extra debug logging for iteration end

a50aaa7

chore(preprocess): add extra debug logging for output after final opening

94b469a

reformat file

69f0533

chore: remove additional logging functions

5641415

remove commented code

2a0c5e4

cpaniaguam requested changes Jan 29, 2025

hollandjg added 3 commits January 29, 2025 18:43

test: simplify fixtures for preprocessing test

6dba27c

remove vscode settings

0c454df

test: update output files

ef7bbd3

hollandjg added 5 commits February 11, 2025 22:31

add prototype working cleanup script for small blobs

b636370

refactor(preprocess): make function clean_labels_with_multiple_blobs

df52f14

add tests for clean label with multiple blobs

b0ca31b

add tests of error throwing

d2719d4

update test names

bd75010

test(preprocess): update test expected outputs

5d9de21

hollandjg added 3 commits February 12, 2025 23:50

reintroduce opening

fb6f57a

revert back expected results

0c25ac3

simplify loading function to use paths

17266c8

Merge branch 'main' into fix--duplicated-labels

9047286

hollandjg marked this pull request as ready for review March 7, 2025 14:57

reformat test_load

86ab2fb

hollandjg requested a review from ellenbuckley March 7, 2025 15:04

Add test files for preprocessing

3008b28

hollandjg requested a review from cpaniaguam March 25, 2025 11:04

cpaniaguam previously requested changes Mar 25, 2025

refactor: simplify count_blobs_per_label function with a comprehension

eb37f03

hollandjg requested a review from cpaniaguam March 26, 2025 15:09

hollandjg merged commit 7d2cc7e into main May 22, 2025
2 checks passed

hollandjg merged 30 commits into main from fix--duplicated-labels
