SpatialDINO: A Self-Supervised 3D Vision Transformer that enables Segmentation and Tracking in Crowded Cellular Environments
SpatialDINO is a self-supervised foundation model for analyzing 3D fluorescence microscopy images, built by adapting DINOv2-style joint-embedding training to learn dense volumetric features directly from unlabeled 3D datasets. By exploiting true 3D context rather than slice-wise "2.5D" aggregation, it enables automated detection and segmentation in crowded, anisotropic, low-contrast volumes, as well as tracking in 4D time-lapse data. SpatialDINO generalizes across targets and imaging conditions without voxel-level annotation or retraining.
Authors: Alex Lavaee*, Arkash Jain*, Gustavo Scanavachi*, Jose Inacio Costa-Filho*, Adam Ingemansson, Tom Kirchhausen
* Equal contribution
All datasets and pre-trained models are publicly available through AWS S3. The datasets can also be accessed with Mirante4D.
Download training datasets:
```bash
aws s3 cp s3://spatialdino/dataset_part1/ ./datasets/ --recursive --no-sign-request
```

Download inference datasets:

```bash
aws s3 cp s3://spatialdino/inference_data/ ./inference_data/ --recursive --no-sign-request
```

Download models:

```bash
aws s3 cp s3://spatialdino/models/ ./models/ --recursive --no-sign-request
```

List available data:

```bash
aws s3 ls s3://spatialdino/ --no-sign-request
```

uv is a faster drop-in replacement for conda that we use for environment management. Download and install it via either

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

or

```bash
wget -qO- https://astral.sh/uv/install.sh | sh
```

Clone the repository:

```bash
git clone --recursive git@github.com:kirchhausenlab/spatialdino.git
```

In the repository directory, run
```bash
uv venv --python 3.12
uv sync --all-packages
```

This creates a single root `.venv` shared by the core `spatialdino` package and the GUI server in `apps/server`.
In the repository directory, run
```bash
cd apps/web
npm install
```

In the repository directory, run

```bash
cd apps/web
npm run dev
```

Then follow the GUI instructions shown in the terminal and browser.
This project requires CUDA version 12 or higher. Verify the installed CUDA version by running:

```bash
nvcc --version
```
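As a quick sanity check, the release number in the `nvcc --version` output can also be parsed programmatically. This is an illustrative sketch, not code from the repository; the sample output string below is hypothetical and your actual output will differ:

```python
import re

# Sample `nvcc --version` output (hypothetical; paste your own output here).
sample = """nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 12.4, V12.4.131"""

# Extract the "release X.Y" version and check the major version.
match = re.search(r"release (\d+)\.(\d+)", sample)
major, minor = int(match.group(1)), int(match.group(2))
assert major >= 12, f"CUDA {major}.{minor} found, but CUDA 12+ is required"
print(major, minor)  # -> 12 4
```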
⚠️ **Important:** Before running inference, ensure you have the pretrained model. Use the model path `../models/backbone.pth`, which contains the pretrained weights for the DINO vision transformer.
```bash
#!/bin/bash
folder_path="/nfs/data1expansion/datasync3/Gustavo/20210422_0p5_0p55_sCMOS_Gu_AP2/CS1_Ap2_live_3colorsDic/Ex07_488_60mW_z0p5/ch488nmCamA/DS"
file_start=0
file_end=1  # exclusive; leave unset to process through the end
save_path="/raid1/cme_tests/results/ablations/ap2_test"

export OMP_NUM_THREADS=32
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export NUM_PROC_PER_NODE=$(echo "$CUDA_VISIBLE_DEVICES" | tr ',' '\n' | wc -l)

uv run torchrun --nnodes 1 --node_rank 0 --nproc_per_node $NUM_PROC_PER_NODE \
    --rdzv_endpoint=localhost:9999 ./scripts/inference/inference.py \
    file_path="$folder_path" \
    save_path="$save_path" \
    file_start=$file_start \
    file_end=$file_end \
    global_hist_min=null \
    global_hist_max=null \
    crop_params="[0,0,0,0,0,0]"
```

- `folder_path`: Path to the folder containing images
- `file_start`/`file_end`: File slice passed to `fnames[file_start:file_end]` (`file_end` is exclusive)
- `save_path`: Path to save results
- `global_hist_min`/`global_hist_max`: Optional global histogram bounds. If both are provided, inference uses those shared values for all volumes instead of the default per-volume normalization. These correspond to the values written by `scripts/inference/norm_per_vol.py`.
- `OMP_NUM_THREADS`: Number of threads to use
- `CUDA_VISIBLE_DEVICES`: List of GPUs to use
- `NUM_PROC_PER_NODE`: Number of processes/GPUs per node
- `crop_params`: Parameters for cropping images
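The `file_start`/`file_end` pair behaves like a standard Python slice, which is worth spelling out since `file_end` is exclusive. A small sketch with made-up filenames (not from any actual dataset):

```python
# Hypothetical sorted list of volume files found in folder_path.
fnames = ["vol_000.tif", "vol_001.tif", "vol_002.tif", "vol_003.tif"]

file_start, file_end = 0, 1
print(fnames[file_start:file_end])  # ['vol_000.tif'] -- file_end is exclusive

# Leaving file_end unset corresponds to slicing through the end of the list.
file_end = None
print(fnames[file_start:file_end])  # all four files
```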
If you do not have a `.bashrc` file, create one:

```bash
touch ~/.bashrc
```

Add the following to your `.bashrc` file:
NCCL Configuration (NVIDIA's communication library):
```bash
export NCCL_SOCKET_NTHREADS=4    # number of threads per socket
export NCCL_NSOCKS_PERTHREAD=4   # number of sockets per thread
export NCCL_IB_DISABLE=0         # enable InfiniBand
export NCCL_IB_HCA="mlx5"        # use Mellanox InfiniBand
export CUDA_HOME="/usr/local/cuda-12"  # choose the correct CUDA version
export PATH=$CUDA_HOME/bin:$PATH
export CPATH="$CUDA_HOME/include:$CPATH"
```

C++ Library:

```bash
export CXX=g++
```

Distributed Training:

```bash
export NCCL_SOCKET_IFNAME=ib   # use all InfiniBand interfaces
export RDZV_BACKEND="c10d"
export OMP_NUM_THREADS=16
export NUM_ALLOWED_FAILURES=3
export RDZV_ID="2001"          # set the rdzv id to be the same for all nodes
export MASTER_PORT="29500"     # set the master port to be the same for all nodes
```

To get the master address, get the IP address of your InfiniBand interface:

```bash
ifconfig ib0
```

Example output:
```
ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 2044
        inet 10.1.0.11  netmask 255.255.0.0  broadcast 10.1.255.255
        ...
```

Set the master address (in this case `10.1.0.11`):

```bash
export MASTER_ADDR="10.1.0.11"
export RDZV_ENDPOINT="$MASTER_ADDR:$MASTER_PORT"
```

For multi-node training with 3 nodes and 8 GPUs per node, run the following on every node, setting `NODE_RANK` to 0, 1, or 2 on each node before launching:
```bash
# Arguments explanation:
# --nnodes: number of nodes (e.g. 3)
# --node_rank: rank of this node (0, 1, ..., n-1 for n nodes)
# --nproc_per_node: number of processes/GPUs per node (e.g. 8)
# --rdzv-backend: rendezvous backend (e.g. c10d)
# --rdzv-endpoint: address:port of the master node (e.g. 10.10.10.10:29500)
torchrun --nnodes 3 --nproc_per_node 8 --node_rank $NODE_RANK \
    --rdzv-id $RDZV_ID --rdzv-backend $RDZV_BACKEND \
    --rdzv-endpoint $RDZV_ENDPOINT scripts/train/pretrain.py
```

- Issues: SpatialDINO Issues
- Contact: Jose Inacio Costa-Filho (joseinacio@tklab.hms.harvard.edu), Tom Kirchhausen (kirchhausen@crystal.harvard.edu)
```bibtex
@article{spatialdino2025,
  author = {Lavaee, Alex and Jain, Arkash and Scanavachi Moreira Campos, Gustavo and Costa-Filho, Jose Inacio and Ingemansson, Adam and Kirchhausen, Tom},
  title = {SpatialDINO: A Self-Supervised 3D Vision Transformer that enables Segmentation and Tracking in Crowded Cellular Environments},
  year = {2026},
  doi = {10.64898/2025.12.31.697247},
  publisher = {Cold Spring Harbor Laboratory},
  url = {https://www.biorxiv.org/content/early/2026/01/02/2025.12.31.697247},
  journal = {bioRxiv}
}
```