
Conversation

@vkakerbeck
Contributor

For a while now, we've been logging "confused" in the stepwise performance column when no semantic sensor is used. This happened because we set the on-object map (semantic) to 0s and 1s when we have no semantic sensor data (which became the default last year when @scottcanoe cleaned up that code for the DMC paper). However, when the stepwise_performance logging code looked at those values, it interpreted the 1s as object ID 1 (i.e. the first object in the dataset). So unless the LM actually recognized object 1, it would log "confused" (and incorrectly logged "correct" if the target wasn't actually the first object).

This PR fixes this issue (although maybe not in the most beautiful way) by setting the semantic values to a large number that wouldn't be in the semantic_id_to_label dict (setting it to np.inf doesn't work for various reasons, and a negative value seemed like it would introduce more confusion). I also added a line that correctly logs the overall no_label performance (which previously didn't happen).
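For reference, here's a minimal sketch of the idea. Note that `UNKNOWN_OBJECT_ID`, `build_on_object_map`, and the toy `semantic_id_to_label` mapping are illustrative names only, not the exact identifiers in tbp.monty:

```python
import numpy as np

# Illustrative sketch only: UNKNOWN_OBJECT_ID and this toy
# semantic_id_to_label mapping are NOT the exact identifiers in tbp.monty.
UNKNOWN_OBJECT_ID = 10_000  # sentinel assumed to never be a real semantic ID

semantic_id_to_label = {1: "mug", 2: "bowl"}  # toy mapping of real object IDs

def build_on_object_map(depth, max_depth=1.0):
    """Mark on-object pixels with the sentinel instead of 1."""
    on_object = depth < max_depth
    return np.where(on_object, UNKNOWN_OBJECT_ID, 0)

depth = np.array([0.2, 0.5, 2.0])  # two on-object pixels, one background
semantic = build_on_object_map(depth)

# With the old 0/1 encoding, the logger would look up ID 1 and conclude the
# ground-truth object is "mug"; the sentinel instead falls through to a
# no-label default.
labels = [semantic_id_to_label.get(int(s), "no_label") for s in semantic if s != 0]
```

The point is that any dict lookup on the sentinel misses, so the logger can fall back to "no_label" rather than treating every on-object pixel as the first object.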

@hlee9212 it would be nice to use this updated version in your demo so people aren't confused why the .csv files show "confused" in the second column if Monty correctly recognized the object. I'll try to merge the PR before then.

I spot-checked benchmarks, but since this only affects the stepwise performance logging, which we don't report in the benchmarks, it's not expected to have any effect.

@vkakerbeck vkakerbeck changed the title Fix stepwise performance logging fix!: Fix stepwise performance logging Dec 15, 2025
@jeremyshoemaker jeremyshoemaker added the triaged This issue or pull request was triaged label Dec 15, 2025
Contributor

@scottcanoe scottcanoe left a comment


note: It makes me slightly nervous to modify semantic_3d to do this, but I couldn't find anywhere in tbp.monty or my experiment code that distinguishes between nonzero values when use_semantic_sensor is False. And other solutions seem pretty complicated, so I get it.

suggestion: Drop a line in the DepthTo3DTransforms indicating this behavior. Maybe line 387.

thought: I bet habitat returns uint8 for semantic. It's a long shot, but 10_000 > 2**8, so I did wonder whether there's anywhere it could cause overflow. I don't think so, though.
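To illustrate the overflow concern: if the semantic map were ever cast to `uint8`, a 10_000 sentinel would silently wrap to a small value that could collide with a real object ID. This is a standalone NumPy sketch of generic casting behavior, not code from this PR:

```python
import numpy as np

# The sentinel survives in NumPy's default integer dtype...
semantic = np.array([10_000, 0])
assert semantic.dtype != np.uint8

# ...but an (unsafe, C-style) cast to uint8 wraps modulo 256:
# 10_000 % 256 == 16, which would look like the semantic ID of a
# real object to the logger.
wrapped = semantic.astype(np.uint8)
print(wrapped[0])  # 16
```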

@vkakerbeck
Contributor Author

Yes, I know. I looked into a lot of other options, but the use_semantic_sensor setting is so isolated from the config info that Monty or even the experiment gets that I couldn't see another way that wasn't super hacky to check for this. How I'm conceptualizing it now is that 10000 basically represents the "unknown_object" ID, i.e. one that would not be in semantic_id_to_label (unless we have 10000 distinct objects).

@tristanls-tbp
Contributor

issue: This will become a problem in the future, as it assumes that 10,000 is large enough. So when it isn't, we are back to the same problem this pull request is intended to fix. Can we find a permanent solution?

@tristanls-tbp
Contributor

note: I'm still having trouble finding where the logging is going wrong. If the desired state is not to log "confused" when the DepthTo3DTransform does not use a semantic sensor, then that seems like it requires a logging configuration to tell the logger the semantic sensor is not being used. Either way there is coupling. However, with a configuration param, the coupling is explicit in the configuration, whereas with this pull request, the coupling is a magic number passing through the data path (in essence, we are changing what Monty observes so that a logger logs correctly).

@vkakerbeck
Contributor Author

Currently, we basically set all on-object pixels to semantic ID 1 (the ID of the first object in the list) when we don't use the semantic sensor. Setting it to a value other than 1 (and one that is not defined for any of the other objects) is already an improvement. I agree that it would be cleanest to have a check like `if use_semantic_sensor is False` in the logging code instead (or in addition). But I couldn't find a good way to do this since the transform parameters are so isolated from the rest of the code. If you have a good suggestion, let me know.
For now, I think this is better than what we had before (i.e. still using a magic number, but one that doesn't have a double meaning). If we want to be safer, I can change the number to be larger, like 9999999999. It's pretty unlikely that we will ever have an experiment with that many objects. FWIW, I initially tried setting it to np.inf, but that doesn't work with some of the type conversions later in the code.
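On the np.inf point, here's a quick standalone check of why infinity breaks under integer conversion while a large finite sentinel doesn't. This shows generic Python/NumPy behavior, not the exact code path in Monty:

```python
import numpy as np

# Python's int() refuses to convert infinity outright, so any explicit
# int conversion along the data path would crash on an np.inf sentinel.
try:
    int(np.inf)
    inf_converts = True
except OverflowError:
    inf_converts = False

# A large finite sentinel, by contrast, round-trips losslessly through
# float64 (it is well below 2**53, the float64 integer-precision limit).
sentinel = 9_999_999_999
assert int(float(sentinel)) == sentinel
```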
