Fix depth_uint8_decoding calculation for output #198

Open
kleinicke wants to merge 1 commit into NVlabs:master from kleinicke:patch-1

kleinicke commented Oct 18, 2025

uint8 images were converted incorrectly: the channels were multiplied by 255 instead of 256, so the bytes were not shifted by 8 bits as a 24-bit value requires. This might cause serious issues and might even have harmed the training of the network.

When the dataset for FoundationStereo was computed, was this formula used to save the images as 24-bit? Or does this issue only occur in the training process? As long as dataset creation and training are consistent, this issue is fine for this network. But it should be documented for everyone else who trains with the dataset that this formula was used. The previous formula basically interprets
00000000 00000001 00000000 (1*255)
00000000 00000000 11111111 (255)
both as 255.

So for disparities scaled by a factor of 1000, this can mean that a disparity of 300 stored as a true 24-bit value is read back as roughly 298 by the previous formula.

Before merging this change, the implications should be discussed briefly.
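
To make the arithmetic above concrete, here is a minimal sketch of the two weightings. The helper names `decode_depth` and `encode_depth_24bit` are hypothetical and are not taken from the repository, and the channel order (most significant byte first) is an assumption.

```python
import numpy as np

def decode_depth(rgb, base, scale=1000.0):
    """Combine the three uint8 channels of `rgb` (..., 3) into one value.

    base=256 treats the channels as the bytes of a 24-bit integer;
    base=255 reproduces the weighting described for the previous formula.
    """
    rgb = np.asarray(rgb).astype(np.float64)
    value = rgb[..., 0] * base * base + rgb[..., 1] * base + rgb[..., 2]
    return value / scale

def encode_depth_24bit(depth, scale=1000.0):
    """Split round(depth * scale) into three bytes, most significant first."""
    v = np.round(np.asarray(depth, dtype=np.float64) * scale).astype(np.int64)
    channels = [(v >> 16) & 0xFF, (v >> 8) & 0xFF, v & 0xFF]
    return np.stack(channels, axis=-1).astype(np.uint8)

# The two byte patterns from the description collide under the base-255 weighting.
a = np.array([0, 1, 0], dtype=np.uint8)
b = np.array([0, 0, 255], dtype=np.uint8)
collide_255 = decode_depth(a, 255, scale=1.0) == decode_depth(b, 255, scale=1.0)
distinct_256 = decode_depth(a, 256, scale=1.0) != decode_depth(b, 256, scale=1.0)

# A disparity of 300 stored as a true 24-bit value, read back both ways.
rgb_300 = encode_depth_24bit(300.0)
as_256 = float(decode_depth(rgb_300, 256))
as_255 = float(decode_depth(rgb_300, 255))
print(rgb_300.tolist(), as_256, as_255)
```

Under these assumptions the 24-bit encoding of 300 is the byte triple (4, 147, 224), which the base-255 weighting reads back as about 297.8. The mismatch only appears when the encoder and decoder use different bases.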

kleinicke (Author) commented

OK, I took a look at the dataset. It skips the R=255 and G=255 values.
So the data interpretation was consistent for this network, but any network trained on the dataset in the future should make sure to also skip the G=255 values.
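
One way to read this observation is that the files were written with base-255 digits, so some channels never take the value 255. The sketch below is written under that assumption. It is not the dataset's actual writer: the helper names are hypothetical, and which channel holds which digit in the real files is not established here. It shows a round trip that stays consistent with the base-255 decoding, plus a small check that can be run on a loaded image to see which channels ever reach 255.

```python
import numpy as np

def encode_base255(depth, scale=1000.0):
    """Write round(depth * scale) as three base-255 digits, most significant first."""
    v = np.round(np.asarray(depth, dtype=np.float64) * scale).astype(np.int64)
    hi, rest = np.divmod(v, 255 * 255)
    mid, lo = np.divmod(rest, 255)
    return np.stack([hi, mid, lo], axis=-1).astype(np.uint8)

def decode_base255(rgb, scale=1000.0):
    """Inverse of encode_base255, matching the previous 255-based weighting."""
    rgb = np.asarray(rgb).astype(np.float64)
    return (rgb[..., 0] * 255 * 255 + rgb[..., 1] * 255 + rgb[..., 2]) / scale

def channels_using_255(rgb):
    """Return, per channel, whether the value 255 occurs anywhere."""
    rgb = np.asarray(rgb)
    return [bool((rgb[..., c] == 255).any()) for c in range(rgb.shape[-1])]

# Synthetic disparities in steps of 0.025, i.e. exact integers after scaling.
depths = np.linspace(0.0, 500.0, 20001)
encoded = encode_base255(depths)
roundtrip_error = float(np.abs(decode_base255(encoded) - depths).max())
uses_255 = channels_using_255(encoded)
print(roundtrip_error, uses_255)
```

Running `channels_using_255` on a few real dataset images would be a quick way to confirm which convention a given file follows before choosing the decoder.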
