This repository contains the implementation of CTKH (Calibrated Top-k Histogram) and EF (Encoded Fusion) semantic fusion techniques introduced in our paper "Memory-Efficient Real Time Many-Class 3D Metric-Semantic Mapping", IROS 2025.
You can find CTKH and EF implementations on lines 610 and 1266 respectively. (This repository is a fork of https://github.com/joaomcm/Semantic-3D-Mapping-Uncertainty-Calibration/tree/main.)
Follow the instructions below to reproduce our experimental results or to run 3D semantic reconstructions on specific scenes from the ScanNet, ScanNet++, and BS3D datasets.
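For orientation, the idea behind CTKH is to keep a per-voxel histogram over at most k classes rather than all classes. The sketch below illustrates a generic top-k histogram update; it is a hypothetical helper for intuition only, not the repository's CTKH implementation (which also includes calibration).

```python
# Hypothetical sketch of a top-k class histogram (NOT the repository's
# CTKH code): each voxel stores counts for at most k classes.
def topk_update(hist, label, weight=1.0, k=4):
    """Add `weight` to `label`'s bin, keeping at most k bins."""
    if label in hist or len(hist) < k:
        hist[label] = hist.get(label, 0.0) + weight
    else:
        # Histogram is full: replace the weakest bin only if the new
        # observation outweighs it.
        weakest = min(hist, key=hist.get)
        if weight > hist[weakest]:
            del hist[weakest]
            hist[label] = weight
    return hist

hist = {}
for label in [3, 3, 7, 3, 9]:
    topk_update(hist, label, k=2)
# At most 2 bins survive; class 3 dominates.
```

Storing k bins instead of one per class is what makes the per-voxel memory footprint independent of the number of classes.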
- Clone the repo:

  ```shell
  git clone --recurse-submodules https://github.com/uiuc-iml/memory-efficient-3d-semantic-mapping.git
  cd memory-efficient-3d-semantic-mapping
  ```
- Set up the environment:

  ```shell
  conda create -n semantic_mapping python=3.10.16
  conda activate semantic_mapping
  pip install -r requirements.txt
  ```
- Install PyTorch for your CUDA version, e.g.:

  ```shell
  pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
  ```

  (Refer to pytorch.org/get-started/locally/ for the command matching your setup.)
- Download the dataset of your choice: ScanNet v2, ScanNet++, or BS3D.
- Download the pre-trained weights for the segmentation models and place them in their respective folders under /segmentation_model_checkpoints:
  - ScanNet: fine-tuned Segformer (https://uofi.app.box.com/s/lnuxvqh77tulivbew7c9y0m6jh5y23ti), ESANet (https://uofi.app.box.com/s/hd3mlqcnwh9k1i3f5ffur5kcup32htby).
  - ScanNet++: fine-tuned Segformer (Download)
  - BS3D: we use the off-the-shelf 150-class segformer-b4-finetuned-ade-512-512 model.
We provide weights for the encoder-decoder architecture used in EF at encoding dimensions 2, 4, and 8 for the segmentation models listed below. To use Encoded Fusion (EF), download the weights and place them in the corresponding directory under calibration_experiments/EF_weights/:
| Model | num_classes | encoding dim = 2 | encoding dim = 4 | encoding dim = 8 |
|---|---|---|---|---|
| ESANet (ScanNet) | 21 | Download | Download | Download |
| Segformer (ScanNet) | 21 | Download | Download | Download |
| Segformer (ScanNet++) | 101 | Download | Download | Download |
| Segformer (BS3D) | 150 | Download | Download | Download |
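To make the role of the encoding dimension concrete, here is a minimal sketch of the idea behind EF: each voxel fuses low-dimensional codes instead of full per-class histograms, so memory scales with the encoding dimension rather than num_classes. The linear maps below are hypothetical stand-ins for the learned encoder/decoder; this is not the repository's code.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES, ENC_DIM = 21, 4  # e.g. ESANet (ScanNet) with encoding dim 4

# Hypothetical linear encoder/decoder standing in for the learned networks.
W_enc = rng.standard_normal((ENC_DIM, NUM_CLASSES))
W_dec = rng.standard_normal((NUM_CLASSES, ENC_DIM))

def encode(p):
    """Map class probabilities to a low-dimensional code."""
    return W_enc @ p

def decode(z):
    """Map a code back to class scores (argmax gives the label)."""
    return W_dec @ z

# Per-voxel fusion: average codes from multiple views instead of storing
# a full NUM_CLASSES-bin histogram -> memory ~ ENC_DIM, not NUM_CLASSES.
views = [rng.dirichlet(np.ones(NUM_CLASSES)) for _ in range(5)]
fused_code = np.mean([encode(p) for p in views], axis=0)
label = int(np.argmax(decode(fused_code)))
```

Smaller encoding dimensions trade accuracy for memory, which is why weights are provided for dims 2, 4, and 8.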
- Specify the paths to the dataset, results directory, etc. in settings/directory_definitions.json.
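The file might look something like the sketch below; the key names here are illustrative only, so check the settings/directory_definitions.json shipped with the repository for the actual schema:

```json
{
  "dataset_dir": "/path/to/scannet",
  "results_dir": "/path/to/results",
  "checkpoints_dir": "/path/to/segmentation_model_checkpoints"
}
```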
- Run reconstructions for the semantic fusion techniques listed in settings/experiments_and_short_names.json:

  ```shell
  cd calibration_experiments
  python perform_reconstruction.py --dataset "scannet++"
  ```

  Use the argument "scannet" for ScanNet, "scannet++" for ScanNet++, and "bs3d" for BS3D. The script also saves per-scene memory-usage and update-time plots to results_dir/{experiment_name}/visuals/.
- Create the ground-truth reconstructions (required only for ScanNet):

  ```shell
  cd ../dataset_creating_and_finetuning
  python create_reconstruction_gts.py
  ```

  Note that only ScanNet and ScanNet++ have ground-truth annotations.
- Run the evaluation script:

  ```shell
  cd ../calibration_experiments
  python run_full_eval_scannet.py  # or: python run_full_eval_scannetpp.py
  ```

  The results are written to {results_dir}/quant_eval.
- Example:

  ```shell
  python perform_reconstruction.py --dataset "bs3d" --scene "dining" --integration "CTKH" --k 4
  ```

  Note: --integration must be one of ["CTKH", "Naive Averaging", "Naive Bayesian", "Histogram", "EF"]; --k applies to CTKH and EF only. This writes the point cloud, the label file, and the plots of VRAM usage and update times to the results_dir specified in settings/directory_definitions.json.
