RWTH Aachen University
Mask4Former is a transformer-based model for 4D panoptic segmentation that achieves state-of-the-art performance on the SemanticKITTI test set.
[Project Webpage] [arXiv]
- 2025-05-25: MinkowskiEngine has been replaced with Spconv to simplify installation and enable 16-bit operations. Training should now be roughly twice as fast. See the Minkowski tag for the previous version of the code.
- 2024-01-29: Mask4Former was accepted to ICRA 2024.
- 2023-09-28: Mask4Former is on arXiv.
The main dependencies of the project are the following:
python: 3.11
cuda: 12.4

We use uv as the Python package and project manager. You can also set up your environment however you prefer and install the packages with plain pip (see the equivalent command after the uv instructions below).
You can set up a uv virtual environment in your local directory as follows:
uv venv --python 3.11
source .venv/bin/activate
You can install the python packages as follows:
uv pip install -r requirements.txt --no-deps --extra-index-url https://download.pytorch.org/whl/cu124
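If you prefer plain pip over uv, the equivalent command (a sketch using the same requirements file and index URL) is:
pip install -r requirements.txt --no-deps --extra-index-url https://download.pytorch.org/whl/cu124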
After installing the dependencies, we preprocess the SemanticKITTI dataset. This can take some time.
python -m datasets.preprocessing.semantic_kitti_preprocessing preprocess \
--data_dir $SEMANTICKITTI_DIR/SemanticKITTI/dataset \
--save_dir data/semantic_kitti \
--generate_instances True
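The --data_dir argument should point to the standard SemanticKITTI release; the root path below is a hypothetical example:
export SEMANTICKITTI_DIR=/path/to/your/data  # hypothetical path, adjust to your setup
# expected layout of the standard SemanticKITTI release:
# $SEMANTICKITTI_DIR/SemanticKITTI/dataset/sequences/00/velodyne/000000.bin
# $SEMANTICKITTI_DIR/SemanticKITTI/dataset/sequences/00/labels/000000.label
# $SEMANTICKITTI_DIR/SemanticKITTI/dataset/sequences/00/poses.txt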
Train Mask4Former:
python main_panoptic_4d.py

In the simplest case, the inference command looks as follows:
python main_panoptic_4d.py \
general.mode="validate" \
general.ckpt_path='PATH_TO_CHECKPOINT.ckpt'

Or you can use DBSCAN to boost the scores even further:
python main_panoptic_4d.py \
general.mode="validate" \
general.ckpt_path='PATH_TO_CHECKPOINT.ckpt' \
general.dbscan_eps=1.0

The provided model, trained after the submission, achieves 71.3 LSTQ without DBSCAN and 72.0 with DBSCAN post-processing on the validation set.
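If you want to tune the clustering, one simple option (an illustrative sketch that only reuses the overrides shown above) is to sweep general.dbscan_eps over a few values:
# evaluate the same checkpoint with several DBSCAN eps values
for eps in 0.5 1.0 1.5; do
  python main_panoptic_4d.py \
    general.mode="validate" \
    general.ckpt_path='PATH_TO_CHECKPOINT.ckpt' \
    general.dbscan_eps=$eps
done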
@inproceedings{yilmaz24mask4former,
title = {{Mask4Former: Mask Transformer for 4D Panoptic Segmentation}},
author = {Yilmaz, Kadir and Schult, Jonas and Nekrasov, Alexey and Leibe, Bastian},
booktitle = {{International Conference on Robotics and Automation (ICRA)}},
year = {2024}
}
