Is there a specific reason for evaluating semantic scene completion and scene completion only on certain (masked) voxels?
Would the metric still work without these masks?
Or did you observe some anomalies in the metric when omitting the masks?
Or is the large number of empty voxels in the target volume what causes problems?
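
To make the question concrete, here is a minimal sketch of the comparison I have in mind. It is not taken from your evaluation code: the function name `completion_iou`, the toy arrays, and the mask are all hypothetical, and I am assuming that completion is scored as a binary occupancy IoU.

```python
import numpy as np


def completion_iou(pred_occ, gt_occ, mask=None):
    """Binary occupancy IoU = TP / (TP + FP + FN).

    Hypothetical helper for illustration only. If `mask` is given,
    only voxels where the mask is True take part in the score.
    """
    pred_occ = np.asarray(pred_occ, dtype=bool)
    gt_occ = np.asarray(gt_occ, dtype=bool)
    if mask is None:
        # No mask: evaluate on every voxel of the volume.
        mask = np.ones_like(gt_occ, dtype=bool)
    mask = np.asarray(mask, dtype=bool)

    p, g = pred_occ[mask], gt_occ[mask]
    intersection = np.logical_and(p, g).sum()  # TP
    union = np.logical_or(p, g).sum()          # TP + FP + FN
    return float(intersection) / float(union) if union > 0 else float("nan")


# Toy volume with 8 voxels (made-up values, flattened for readability).
gt = np.array([1, 1, 0, 0, 0, 0, 0, 0], dtype=bool)
pred = np.array([1, 0, 1, 0, 0, 0, 0, 1], dtype=bool)
mask = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=bool)  # assumed evaluation mask

iou_all = completion_iou(pred, gt)           # every voxel counts
iou_masked = completion_iou(pred, gt, mask)  # only masked voxels count
print(iou_all, iou_masked)
```

With an IoU defined like this, voxels that are empty in both the prediction and the ground truth never enter the score, so in this toy case the two numbers differ only because of the false positive outside the mask. That is why I am unsure whether empty voxels alone explain the need for the masks.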