curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync
The dataset is available on HuggingFace 🤗 at: https://huggingface.co/datasets/dfki-av/drivergaze360
uv run \
torchrun --standalone --nproc-per-node=gpu \
main.py --model DriverGaze360
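Training combines several attention losses, weighted by the `--w-*` flags listed in the options below. As a rough illustration only (not the repository's actual implementation), here is a NumPy sketch of such a weighted combination, using standard formulations assumed from the flag names (KLD, CC, NSS, MSE):

```python
import numpy as np

def kld(p, q, eps=1e-8):
    # KL divergence between two maps, each normalized to sum to 1.
    p = p / (p.sum() + eps)
    q = q / (q.sum() + eps)
    return float(np.sum(p * np.log(p / (q + eps) + eps)))

def cc(p, q, eps=1e-8):
    # Pearson correlation coefficient between two standardized maps.
    p = (p - p.mean()) / (p.std() + eps)
    q = (q - q.mean()) / (q.std() + eps)
    return float((p * q).mean())

def nss(pred, fixations, eps=1e-8):
    # Normalized scanpath saliency: mean of the standardized
    # prediction at ground-truth fixation locations.
    z = (pred - pred.mean()) / (pred.std() + eps)
    return float(z[fixations.astype(bool)].mean())

def combined_loss(pred, gt, fixations,
                  w_nss=1.0, w_kld=1.0, w_cc=1.0, w_mse=1.0):
    # Higher NSS/CC is better, so those terms enter with a negative sign.
    mse = float(((pred - gt) ** 2).mean())
    return (w_kld * kld(gt, pred) + w_mse * mse
            - w_cc * cc(pred, gt) - w_nss * nss(pred, fixations))
```

The exact loss definitions and signs used by `main.py` may differ; this sketch only shows how the individual weights could trade the terms off against each other.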
uv run \
torchrun --standalone --nproc-per-node=gpu \
main.py \
--model DriverGaze360 \
--inference \
--video-path VIDEO_PATH \
--video-outpath VIDEO_OUTPATH \
--ckpt CKPT
usage: main.py [-h] [--no-logs] [--save-dir SAVE_DIR] [--model MODEL] [--num-epochs NUM_EPOCHS] [--batch-size BATCH_SIZE] [--lr LR] [--w-nss W_NSS] [--w-kld W_KLD] [--w-cc W_CC] [--w-mse W_MSE] [--w-sal W_SAL] [--w-ss W_SS] [--use-amp] [--resume] [--ckpt CKPT]
[--num-workers NUM_WORKERS] [-T T] [--overlap OVERLAP] [--frame-stride FRAME_STRIDE] [--train-path TRAIN_PATH] [--val-path VAL_PATH] [--img-size IMG_SIZE IMG_SIZE] [--weighted-samples]
Training script for DriverGaze360
options:
-h, --help show this help message and exit
--no-logs disable logging
--save-dir SAVE_DIR save directory for outputs
Model Config:
--model MODEL Model architecture
--num-epochs NUM_EPOCHS
Number of training epochs
--batch-size BATCH_SIZE
Batch size
--lr LR Learning rate
--w-nss W_NSS Weight for NSS loss
--w-kld W_KLD Weight for KLD loss
--w-cc W_CC Weight for cross-correlation loss
--w-mse W_MSE Weight for MSE loss
--w-sal W_SAL Weight for Saliency loss
--w-ss W_SS Weight for Semantic Segmentation loss
--use-amp Use mixed precision
--resume Resume training from ckpt
--ckpt CKPT Model Checkpoint
Dataset Config:
--num-workers NUM_WORKERS
Number of data loader workers
-T T Number of consecutive frames
--overlap OVERLAP Number of overlapping frames
--frame-stride FRAME_STRIDE
Stride between frames
--train-path TRAIN_PATH
Path to training data
--val-path VAL_PATH Path to validation data
--img-size IMG_SIZE IMG_SIZE
Input image size (H, W)
--weighted-samples Use weighted sampler with stored KLDs
Inference:
--inference Perform inference on a video
--video-path VIDEO_PATH
Path of video folder
--video-outpath VIDEO_OUTPATH
Save path
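The `-T`, `--overlap`, and `--frame-stride` options above suggest that videos are cut into clips of `T` frames, sampled every `frame_stride` raw frames, with consecutive clips sharing `overlap` sampled frames. The dataset code may differ in detail; the following is a small sketch of one plausible windowing scheme under those assumptions:

```python
def clip_starts(num_frames, T, overlap, frame_stride):
    """Start indices of clips of T sampled frames over a video.

    Each clip spans T * frame_stride raw frames; consecutive clips
    share `overlap` sampled frames (assumed semantics).
    """
    assert 0 <= overlap < T, "overlap must be smaller than T"
    span = T * frame_stride               # raw frames covered by one clip
    step = (T - overlap) * frame_stride   # advance between clip starts
    starts, s = [], 0
    while s + span <= num_frames:
        starts.append(s)
        s += step
    return starts

def clip_frames(start, T, frame_stride):
    # Raw-frame indices belonging to the clip that begins at `start`.
    return [start + i * frame_stride for i in range(T)]
```

For example, a 20-frame video with `T=4`, `overlap=2`, `frame_stride=2` yields clips starting at frames 0, 4, 8, and 12, where adjacent clips share two sampled frames.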
- Add data processing scripts
- Add training scripts
- Add inference scripts
If you find this work useful in your research, please consider citing:
@article{govil_2025_cvpr,
title = {DriverGaze360: OmniDirectional Driver Attention with Object-Level Guidance},
author = {Shreedhar Govil and Didier Stricker and Jason Rambach},
year = {2025},
eprint = {2512.14266},
archivePrefix = {arXiv},
primaryClass = {cs.CV},
url = {https://arxiv.org/abs/2512.14266}
}
This work was partially funded by the European Union's Horizon Europe Research and Innovation Programme under Grant Agreement No. 101076360 (BERTHA) and by the German Federal Ministry of Research, Technology and Space under Grant Agreement No. 16IW24009 (COPPER). The authors would like to express their sincere appreciation to Prateek Kumar Sharma for his support with data collection and the implementation of driving scenarios. We also gratefully acknowledge Ruben Abad, Alex Levy, and Prof. Antonio M. López from the Computer Vision Center (CVC) for their methodological guidance and for providing the code used to implement the goal-directed navigation routes applied in collecting part of the dataset presented in this study. Finally, we sincerely thank all the participants who contributed to the dataset collection, as well as our colleagues at DFKI for their valuable feedback and support throughout this project.
The views and opinions expressed in this publication are solely those of the author(s) and do not necessarily reflect those of the European Union or the European Climate, Infrastructure and Environment Executive Agency (CINEA). Neither the European Union nor the granting authority can be held responsible for them.
