🎯 We present a new paradigm for UAV camera localization and calibration using geospatial data (geodata), specifically orthophotos (DOPs) and digital surface models (DSMs). This approach is particularly useful for UAVs operating in urban environments, where traditional localization methods may struggle with the complexity of the surroundings. By leveraging geodata, we achieve accurate localization and calibration even under challenging conditions. We also propose a large-scale benchmark dataset for UAV visual localization, comprising a diverse set of images and geodata from various environments. It serves as a valuable resource for researchers and practitioners, enabling the development and evaluation of new localization algorithms.
- ⭐ About the Project
- 🔍 Dataset Features
- 📊 Dataset Samples
- 🧰 Getting Started
- 🔧 Usage
- 🏃 Run Scripts
- ⚠️ License
- 🙏 Acknowledgements
OrthoLoC is a framework for UAV camera localization and calibration using orthographic geodata. The project provides a large-scale dataset and implementation of algorithms for matching, localization, and calibration of UAV imagery using orthophotos and digital surface models.
- 📸 Large-scale Dataset: 16,427 high-resolution UAV images with high-precision camera parameters (intrinsics + extrinsics), sourced from multiple geographic environments (Europe + US).
- 🛫 Multi-altitude: Imagery captured at varied altitudes (23m-154m), offering diverse perspectives and coverage.
- 🔄 Multi-viewpoint: Diverse viewpoints with wide camera angles (tilting: 0°-86.8°), offering challenging and dynamic perspectives.
- 🌍 Multi-geographic Environments: Coverage of urban 🏙️, rural 🌾, highway 🚗, and suburban 🏡 landscapes, with high-res geodata alignment.
- 📍 High-precision Ground-truth: Paired UAV-geodata images for precise localization and calibration, minimizing retrieval errors and ensuring accuracy.
- 🌐 Geospatial Alignment: Well-aligned orthographic maps (orthophotos and elevation maps) for precise UAV localization and calibration.
- 🔬 Real-world Applicability: A foundation for evaluating decoupled UAV localization and calibration, enabling practical, real-world use cases.
- Python 3.10
You can create a virtual environment using conda or venv:
conda create -n ortholoc python=3.10 -y
conda activate ortholoc
or
python3 -m venv ortholoc
source ortholoc/bin/activate
To install the package, you can use pip directly from GitHub (this might take a while):
pip install git+https://github.com/deepscenario/OrthoLoC.git
To force GPU support, use the following command:
pip install "git+https://github.com/deepscenario/OrthoLoC.git#egg=ortholoc[gpu]" --extra-index-url https://download.pytorch.org/whl/cu118
Clone the project
git clone https://github.com/deepscenario/OrthoLoC.git
Install the library
cd OrthoLoC
pip install -e .
To force GPU support, use the following command:
cd OrthoLoC
pip install -e ".[gpu]" --extra-index-url https://download.pytorch.org/whl/cu118
Our dataset is available here: OrthoLoC Dataset
- You can either download the dataset manually or let our scripts do it for you.
- You can use either URLs or relative paths to the dataset when using our scripts.
- Example of a relative path: "demo/samples/highway_forest.npz" will automatically download https://cvg.cit.tum.de/webshare/g/papers/Dhaouadi/OrthoLoC/demo/samples/highway_forest.npz
- Example of a URL: "https://cvg.cit.tum.de/webshare/g/papers/Dhaouadi/OrthoLoC/full/test_inPlace/" as dataset_dir will automatically download the full folder.
- The dataset will be downloaded to the default cache directory of your system, e.g. on Ubuntu to ~/.cache/ortholoc/. You can change the cache directory by setting the environment variable ORTHOLOC_CACHE_DIR to your desired path.
- Not setting --dataset_dir in the scripts will automatically download the dataset (on demand) to the cache directory.
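For example, to redirect the cache to a custom location (the path below is illustrative):
export ORTHOLOC_CACHE_DIR=/data/ortholoc_cache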
├── demo # example data
│ ├── urban_residential.jpg # query image
│ ├── urban_residential_DOP.tif # DOP image as tif file
│ ├── urban_residential_DSM.tif # DSM image as tif file
│ ├── urban_residential_extrinsics.json # extrinsics of the query image as json file (for evaluation only)
│ ├── urban_residential_intrinsics.json # intrinsics of the query image as json file
│ ├── urban_residential_xDOP.tif # DOP with cross domain as tif file
│ ├── urban_residential_xDSM.tif # DSM with cross domain as tif file
│ │ ├── samples # a few samples from the dataset as .npz files
│ │ ├── highway_forest.npz
│ │ ├── ...
├── full
│ ├── train
│ │ ├── L01_R0000.npz
│ │ ├── ...
│ ├── val
│ │ ├── L01_R0014.npz
│ │ ├── ...
│ ├── test_inPlace
│ │ ├── L01_R0012.npz
│ │ ├── ...
│ ├── test_outPlace
│ │ ├── L08_R0000.npz
│ │ ├── ...
├── unpacked
│ ├── train
│ │ ├── DOPs
│ │ │ ├── L01_R0000.tif
│ │ │ ├── ...
│ │ ├── DSMs
│ │ │ ├── L01_R0000.tif
│ │ │ ├── ...
│ │ ├── cameras
│ │ │ ├── L01_R0000.json
│ │ │ ├── ...
│ │ ├── meshes
│ │ │ ├── L01_R0000.ply
│ │ │ ├── ...
│ │ ├── keypoints
│ │ │ ├── L01_R0000.ply
│ │ │ ├── ...
│ │ ├── point_maps
│ │ │ ├── L01_R0000.ply
│ │ │ ├── ...
│ │ ├── queries
│ │ │ ├── L01_R0000.jpg
│ │ │ ├── ...
│ ├── val # same structure as train
│ │ ├── ...
│ ├── test_inPlace # same structure as train
│ │ ├── ...
│ ├── test_outPlace # same structure as train
│ │ ├── ...
The folder demo contains data in raw format (the query image as .jpg and the geodata as .tif files) as well as a folder called samples containing a small subset of the dataset for quick testing and debugging. The folder full contains the full dataset, divided into four parts: train, val, test_inPlace, and test_outPlace. The train and val folders contain the samples for training and validation, while the test_inPlace and test_outPlace folders contain the samples for testing. The test_inPlace samples are taken from the same locations as the train and val samples, while the test_outPlace samples are taken from different locations.
The file names follow the pattern {location_id}_{type_of_the_sample}{sample_index}.npz, where:
- location_id: The ID of the location (e.g. L01, L02, ...)
- type_of_the_sample: The type of the sample ("R" for same domain, "xDOP" for DOP domain shift, "xDSM" for DSM domain shift, "xDOPDSM" for DOP and DSM domain shift)
- sample_index: The index of the sample within the location (e.g. 0000, 0001, ...)
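If you need to parse these names programmatically, here is a minimal sketch; the regular expression is our own assumption derived from the pattern above, not part of the package:

import re

# parse names like "L01_R0012.npz" into their components (illustrative only)
SAMPLE_NAME = re.compile(r"(?P<location_id>L\d+)_(?P<sample_type>R|xDOP|xDSM|xDOPDSM)(?P<sample_index>\d+)\.npz")
match = SAMPLE_NAME.fullmatch("L01_R0012.npz")
if match:
    print(match["location_id"], match["sample_type"], match["sample_index"])  # L01 R 0012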
Each .npz file contains the following keys: ['sample_id', 'image_query', 'point_map', 'image_dop', 'dsm', 'scale', 'extrinsics', 'intrinsics', 'keypoints', 'vertices', 'faces', 'extrinsics_refined']
- sample_id: The ID of the sample as string
- image_query: The query image as numpy array of shape (H, W, 3)
- point_map: The point map as numpy array of shape (H, W, 3)
- image_dop: The DOP image as numpy array of shape (H_geo, W_geo, 3), H_geo = W_geo = 1024
- dsm: The DSM image as numpy array of shape (H_geo, W_geo, 3), H_geo = W_geo = 1024
- scale: The scale of a single pixel in the DOP and DSM images in meters
- extrinsics: The extrinsics (world to cam) of the query image as numpy array of shape (3, 4) derived from 3D reconstruction
- intrinsics: The intrinsics of the query image as numpy array of shape (3, 3)
- keypoints: The 3D keypoints of the query image as numpy array of shape (N, 3)
- vertices: The 3D vertices of the local mesh as numpy array of shape (M, 3)
- faces: The faces of the local mesh as numpy array of shape (L, 3)
- extrinsics_refined: Camera pose parameters as numpy array of shape (3, 4) optimized to compensate for rasterization artifacts and missing 0.5D information (building facades) in the DSM. Computed using ground truth correspondences between query images and DOP/DSM data via PnP RANSAC. Provided for research purposes but not used in official benchmarking.
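As a minimal sketch of inspecting a sample with plain numpy (note that plain numpy will not trigger the automatic download, so the file must already exist locally):

import numpy as np

# load a sample; allow_pickle=True in case string fields are stored as object arrays
sample = np.load("demo/samples/highway_forest.npz", allow_pickle=True)
print(sample["sample_id"])
image_query = sample["image_query"]  # (H, W, 3) UAV query image
image_dop = sample["image_dop"]      # (1024, 1024, 3) orthophoto
dsm = sample["dsm"]                  # (1024, 1024, 3) digital surface model
extrinsics = sample["extrinsics"]    # (3, 4) world-to-cam pose
intrinsics = sample["intrinsics"]    # (3, 3) camera matrix
# approximate ground extent of the DOP in meters (scale is meters per pixel)
extent_m = image_dop.shape[0] * float(sample["scale"])
print(image_query.shape, extent_m)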
To use the dataset as a PyTorch Dataset, you can do the following:
from ortholoc.dataset import OrthoLoC
dataset = OrthoLoC(
    dataset_dir=None,  # path to the dataset; if None, the dataset will be downloaded automatically
    sample_paths=None,  # paths to specific samples (cannot be set together with dataset_dir), e.g. ["demo/samples/highway_rural.npz", "demo/samples/urban_residential_xDOPDSM.npz"]
    set_name='all',  # name of the set (all, train, val, test_inPlace, test_outPlace)
    start=0.,  # start fraction of the set to use
    end=1.,  # end fraction of the set to use
    mode=0,  # 0 for all samples, 1 for samples whose UAV imagery and geodata share the same domain, 2 for samples with DOP domain shift, 3 for samples with DOP and DSM domain shifts
    new_size=None,  # new size of the images (useful for training)
    limit_size=None,  # maximum size of the images (useful for debugging)
    shuffle=True,  # shuffle the dataset
    scale_query_image=1.0,  # scale of the query image (1.0 keeps the original size)
    scale_dop_dsm=1.0,  # scale of the DOP and DSM images
    gt_matching_confidences_decay=1.0,  # decay of the GT matching confidences (the larger the value, the less confident the GT becomes for non-unique points such as points on facades)
    covisibility_ratio=1.0,  # covisibility ratio (0.0 exclusive to 1.0; the larger, the more of the geodata area is visible to the UAV)
    return_tensor=False,  # if True, samples are returned as dicts of torch tensors while iterating
    predownload=False,  # if True, download the dataset while constructing the dataset object; otherwise download on demand while iterating
)
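Since OrthoLoC is a PyTorch Dataset, you can iterate it directly or wrap it in a standard DataLoader. A minimal sketch, assuming the returned dicts mirror the .npz keys listed above:

from torch.utils.data import DataLoader
from ortholoc.dataset import OrthoLoC

# load a demo sample and iterate over it as torch tensors
dataset = OrthoLoC(sample_paths=["demo/samples/highway_rural.npz"], return_tensor=True)
loader = DataLoader(dataset, batch_size=1)
for batch in loader:
    print(batch["image_query"].shape)  # assumed key, mirroring the .npz fields
    break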
You can also use the unpacked version of the dataset, where each data type is stored in a separate folder. This is useful if you are working with C++ or other languages that do not support .npz loading.
Of course, you can still load .npz files in C++ using libraries such as cnpy.
For each script, consult the help message for more options. All the weights for matching algorithms will be downloaded automatically.
- Use --device cpu and set the environment variable CUDA_VISIBLE_DEVICES='' if you want to run the scripts on CPU.
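For example, to run the matching script (described below) on CPU:
CUDA_VISIBLE_DEVICES='' run-matching --sample assets/demo/samples/urban_residential_xDOPDSM.npz --matcher Mast3R --device cpu --angles 0 --show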
To run the image matching from a sample of the dataset or from two images, you can do the following:
run-matching --sample assets/demo/samples/urban_residential_xDOPDSM.npz --matcher Mast3R --device cuda --angles 0 --show
To run the localization and/or calibration from a sample of the dataset or from custom data, you can do the following:
run-localization --sample assets/demo/samples/urban_residential_xDOPDSM.npz --matcher Mast3R --device cuda --angles 0 --show
This will output the following results:
Pose (world to cam):
[[ 1.0548955e-01 -9.9438059e-01 8.8982619e-03 -1.0469512e+01]
[-9.0781039e-01 -9.9950403e-02 -4.0729618e-01 -2.2552296e+01]
[ 4.0589681e-01 3.4887560e-02 -9.1325277e-01 1.0123373e+02]]
Intrinsics:
[[772.4036 0. 516.1471 ]
[ 0. 772.4036 385.53125]
[ 0. 0. 1. ]]
You can use your own images and geodata by specifying the paths to the files directly:
run-localization --image assets/demo/urban_residential.jpg --dop assets/demo/urban_residential_DOP.tif --dsm assets/demo/urban_residential_DSM.tif --intrinsics assets/demo/urban_residential_intrinsics.json --matcher Mast3R --device cuda --angles 0 --show
Important notes:
- You can add --adhop to enable AdHoP refinement (see the example after these notes).
- If you do not provide intrinsics parameters, the system will automatically estimate them (performing calibration).
- Ensure your geodata covers the area visible in the query image. The localization and calibration framework requires sufficient overlap between the query image and geodata. Using geodata with large areas not visible in the query image may lead to poor results.
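For example, the same localization run as above with AdHoP refinement enabled:
run-localization --sample assets/demo/samples/urban_residential_xDOPDSM.npz --matcher Mast3R --device cuda --angles 0 --adhop --show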
To benchmark performance across a set of samples from the dataset:
run-benchmark --dataset_dir assets/demo/samples/ --output_dir ./output/ --matcher Mast3R --device cuda
Please run run-benchmark -h to see how to configure the parameters for running a specific benchmark.
To visualize a single sample from the dataset:
visualize-sample --sample assets/demo/samples/highway_rural.npz --show
or
visualize-sample --sample https://cvg.cit.tum.de/webshare/g/papers/Dhaouadi/OrthoLoC/full/test_inPlace/L01_R0056.npz --show
To create a visualization of some samples in a dataset directory:
visualize-dataset --dataset_dir full/test_outPlace/ --n_scenes 2 --show
or
visualize-dataset --sample_ids L17_R0066 L09_R0396 L32_R0008 --show
Distributed under CC BY-NC-SA 4.0. See LICENSE.md for more information. For licenses of the matching methods used in the project, please refer to the respective repositories.
This work is a result of the joint research project STADT:up. The project is supported by the German Federal Ministry for Economic Affairs and Climate Action (BMWK), based on a decision of the German Bundestag. The author is solely responsible for the content of this publication.
Special thanks to Vincentqyw for developing the Image Matching WebUI. This tool provides a user-friendly interface for matching between images using various state-of-the-art algorithms.
Big thanks to the German government for making geospatial data freely available to everyone. These open data portals are a goldmine for developers, researchers, and anyone curious about spatial information. Here's where you can find them:
- Bavaria: geodaten.bayern.de/opengeodata
- Berlin: gdi.berlin.de
- Hesse: gds.hessen.de
- Hamburg: geoportal-hamburg.de
- North Rhine-Westphalia: opengeodata.nrw.de
- Baden-Württemberg: opengeodata.lgl-bw.de
@inproceedings{dhaouadi2025ortholoc,
title = {OrthoLoC: {UAV} 6-DoF Localization and Calibration Using Orthographic Geodata},
author = {Oussema Dhaouadi and Riccardo Marin and Johannes Michael Meier and Jacques Kaiser and Daniel Cremers},
booktitle = {The Thirty-ninth Annual Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year = {2025}
}



