Robotic systems demand accurate and comprehensive 3D environment perception, requiring the simultaneous capture of photo-realistic appearance (optical), precise layout and shape (geometric), and open-vocabulary scene understanding (semantic). Existing methods typically satisfy only a subset of these requirements while exhibiting optical blurring, geometric irregularities, and semantic ambiguities. To address these challenges, we propose OmniMap, the first online mapping framework that simultaneously captures optical, geometric, and semantic scene attributes while maintaining real-time performance and model compactness. At the architectural level, OmniMap employs a tightly coupled 3DGS-Voxel hybrid representation that combines fine-grained modeling with structural stability. At the implementation level, OmniMap identifies key challenges across modalities and introduces several innovations: adaptive camera modeling for motion blur and exposure compensation, a hybrid incremental representation with normal constraints, and probabilistic fusion for robust instance-level understanding. Extensive experiments show OmniMap's superior performance in rendering fidelity, geometric accuracy, and zero-shot semantic segmentation compared to state-of-the-art methods across diverse scenes. The framework's versatility is further demonstrated through a variety of downstream applications, including multi-domain scene Q&A, interactive editing, perception-guided manipulation, and map-assisted navigation.
Tested on Ubuntu 20.04/24.04 with CUDA 11.8.
git clone https://github.com/BIT-DYN/omnimap.git
cd omnimap
conda env create -f environment.yaml
conda activate omnimap
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.1.2+cu118.html
Run the following exports every time before using the environment, or add them to a conda activation script:
export CUDA_HOME=$CONDA_PREFIX
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib:$LD_LIBRARY_PATH
To make this permanent, add it to a conda activation script:
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'export CUDA_HOME=$CONDA_PREFIX
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib:$LD_LIBRARY_PATH' > $CONDA_PREFIX/etc/conda/activate.d/cuda_env.sh
pip install --no-build-isolation thirdparty/simple-knn
pip install --no-build-isolation thirdparty/diff-gaussian-rasterization
pip install --no-build-isolation thirdparty/lietorch
Note: The mmyolo package has been copied from the YOLO-World repository into thirdparty/mmyolo/ to resolve a dependency conflict. The original YOLO-World had a version constraint that prevented using mmcv versions newer than 2.0.0, but this project requires mmcv 2.1.0; this has been fixed in the local copy.
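Optionally, check that the three CUDA extensions built and import correctly. This is a minimal sanity check; the module names follow the upstream simple-knn, diff-gaussian-rasterization, and lietorch packages and are assumed unchanged in the bundled copies:
python -c "from simple_knn._C import distCUDA2; from diff_gaussian_rasterization import GaussianRasterizer; import lietorch; print('CUDA extensions OK')"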
cd ..
git clone --recursive https://github.com/AILab-CVC/YOLO-World.git
cd YOLO-World
pip install mmcv==2.1.0 -f https://download.openmmlab.com/mmcv/dist/cu118/torch2.1/index.html
pip install -r <(grep -v "opencv-python" requirements/basic_requirements.txt)
pip install -e . --no-build-isolation
cd ../omnimap
Fix a YOLO-World syntax error: in YOLO-World/yolo_world/models/detectors/yolo_world.py, line 61, replace:
self.text_feats, None = self.backbone.forward_text(texts)
with:
self.text_feats, _ = self.backbone.forward_text(texts)
Download the pretrained YOLO-Worldv2-L (CLIP-Large) weights to weights/yolo-world/.
pip install flash-attn==2.5.8 --no-build-isolation
pip install git+https://github.com/baaivision/tokenize-anything.git
Download the pretrained weights to weights/tokenize-anything/.
pip install -U sentence-transformers
pip install transformers==4.36.2
Note: If you see a warning like sentence-transformers 5.2.0 has requirement transformers<6.0.0,>=4.41.0, but you have transformers 4.36.2, you can safely ignore it.
Download pretrained weights to weights/sbert/:
cd weights/sbert
git clone https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
pip install --no-build-isolation git+https://github.com/lvis-dataset/lvis-api.git
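Optionally, verify that the SBERT model loads from the local clone (a minimal check; the path assumes the clone above and that the command is run from the repository root):
python -c "from sentence_transformers import SentenceTransformer; m = SentenceTransformer('weights/sbert/all-MiniLM-L6-v2'); print(m.encode(['hello']).shape)"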
python -m spacy download en_core_web_sm
(This part is unnecessary because the data folder already exists with all required files.)
mkdir -p data/coco/lvis && cd data/coco/lvis
wget https://huggingface.co/GLIPModel/GLIP/resolve/main/lvis_v1_minival_inserted_image_name.json
cd ../../..
cp -r ../YOLO-World/data/texts data/
Update the paths of the models above in the configuration files under config/.
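The exact keys depend on the config files shipped in config/; the sketch below is hypothetical (key names are illustrative, not the actual schema) and only shows the intent of pointing each model entry at the weights downloaded above:
# Hypothetical keys -- match them to the actual entries in config/*.yaml
yolo_world_weights: weights/yolo-world/<your-downloaded-checkpoint>
tap_weights: weights/tokenize-anything/<your-downloaded-checkpoint>
sbert_model: weights/sbert/all-MiniLM-L6-v2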
(Some packages may change your mmcv version; please reinstall mmcv and check that its version is 2.1.0.)
pip install mmcv==2.1.0 -f https://download.openmmlab.com/mmcv/dist/cu118/torch2.1/index.html
If you encounter AttributeError: module 'torch.utils._pytree' has no attribute 'register_pytree_node', install the compatible version of transformers:
pip install transformers==4.36.2
This version is compatible with PyTorch 2.1.2. Newer versions of transformers require PyTorch 2.2+.
python -c "import torch; import mmcv; import mmdet; from tokenize_anything import model_registry; print('Setup complete')"
Note: You may get the error AssertionError: MMCV==2.2.0 is used but incompatible. Please install mmcv>=2.0.0rc4, <2.1.0. If so, go to the __init__.py that raises the error and change mmcv_maximum_version to 2.2.0.
OmniMap has been validated on Replica (the same sequences as used by vMap) and ScanNet. Please download the following datasets:
- Replica Demo - Replica Room 0 only for faster experimentation.
- Replica - All Pre-generated Replica sequences.
- ScanNet - Official ScanNet sequences.
Update the dataset path in config/replica_config.yaml or config/scannet_config.yaml:
path:
  data_path: /path/to/your/dataset
Run the following command to start incremental mapping.
# for replica
python demo.py --dataset replica --scene {scene} --vis_gui
# for scannet
python demo.py --dataset scannet --scene {scene} --vis_gui
You can use --start {start_id} and --length {length} to specify the starting frame ID and the mapping duration, respectively. The --vis_gui flag controls online visualization; disabling it may improve processing speed.
# Replica
python demo.py --dataset replica --scene room_0
# ScanNet
python demo.py --dataset scannet --scene scene0000_00
After building the map, the results will be saved in the folder outputs/{scene}, which contains the rendered outputs and evaluation metrics.
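To map only part of a sequence, combine the --start and --length flags described above, for example (the start frame and length here are illustrative):
python demo.py --dataset replica --scene room_0 --start 100 --length 400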
We use the rendered depth and color images to generate a colored mesh. Run the following command to perform this operation.
# for replica
python tsdf_integrate.py --dataset replica --scene {scene}
# for scannet
python tsdf_integrate.py --dataset scannet --scene {scene}

omnimap/
├── config/
│   ├── replica_config.yaml
│   ├── scannet_config.yaml
│   └── yolo-world/
├── data/
│   ├── coco/lvis/
│   └── texts/
├── weights/
│   ├── yolo-world/
│   ├── tokenize-anything/
│   └── sbert/
├── thirdparty/
│   ├── simple-knn/
│   ├── diff-gaussian-rasterization/
│   ├── lietorch/
│   └── mmyolo/
└── demo.py
If you find our work helpful, please cite:
@article{omnimap,
title={OmniMap: A Comprehensive Mapping Framework Integrating Optics, Geometry, and Semantics},
author={Deng, Yinan and Yue, Yufeng and Dou, Jianyu and Zhao, Jingyu and Wang, Jiahui and Tang, Yujie and Yang, Yi and Fu, Mengyin},
journal={IEEE Transactions on Robotics},
year={2025}
}
We would like to express our gratitude to the open-source projects and their contributors: HI-SLAM2, 3D Gaussian Splatting, YOLO-World, and TAP. Their valuable work has greatly contributed to the development of our codebase.
