
Instant Video Models

This is the official code for our NeurIPS 2025 paper Instant Video Models: Universal Adapters for Stabilizing Image-Based Networks.

The goal of this work is to modify existing single-frame models to improve stability and robustness on video. Our approach involves minimal modifications to the original model. The original parameters can be frozen, such that we only train a lightweight set of stabilization adapters.

Note

I spent some time cleaning up this codebase after the paper deadline. I re-ran a few experiments to test core functionality. However, I did not test all of the config files, due to the significant compute resources required to re-run experiments. If you encounter a problem, please submit a GitHub issue and I will do my best to help. -Matt

Usage

The two main scripts are ./scripts/train.py and ./scripts/evaluate.py. They should be invoked from the top level of the repository.

The runs folder contains a .yml configuration file for each experiment in the paper. The filepath of the config file is passed to the training or evaluation script. For example, to fine-tune the Deeplab segmentation model on the VIPER dataset:

./scripts/train.py runs/deeplab_viper/finetune_base_model.yml

Then, to train controlled spatial-fusion stabilizers on top of the fine-tuned model:

./scripts/train.py runs/deeplab_viper/train_controlled_spatial.yml

Finally, the stabilized model can be evaluated with:

./scripts/evaluate.py runs/deeplab_viper/evaluate_controlled_spatial.yml

Scripts will print progress and results to the terminal. Results, weights, and log files are saved in a unique directory created in outputs.

Config files whose names start with an underscore are not meant to be run directly; they are intended to be imported by other configs.

Some experiments load weights generated by a previous run. In this case, WEIGHTS_FILEPATH is used as a placeholder in config files. Replace WEIGHTS_FILEPATH with the location of the previously generated weights (typically in outputs).
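One way to perform the substitution is with sed. The output path below is a hypothetical example; use the directory that your earlier run actually created under outputs:

```shell
# Replace the WEIGHTS_FILEPATH placeholder in a config file.
# The weights path below is a hypothetical example, not a real run directory.
sed -i 's|WEIGHTS_FILEPATH|outputs/finetune_base_model/weights.pt|' \
    runs/deeplab_viper/train_controlled_spatial.yml
```

Editing the config file by hand works just as well.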

Config values can be overridden on the command line. For example, to increase the number of stabilizer training epochs to 100:

./scripts/train.py runs/deeplab_viper/train_controlled_spatial.yml epochs=100

An important use case for overrides is setting lambda, the strength of the temporal smoothness penalty (see our paper for details). If a script complains that lambda is not defined, it probably expects a command-line override, for example:

./scripts/evaluate.py runs/deeplab_viper/evaluate_controlled_spatial.yml lambda=0.4

Some of the paper experiments sweep lambda over a range of values. In this case, something like the following can be used:

for lambda in 0.1 0.2 0.4 0.8
do
  ./scripts/train.py runs/deeplab_viper/train_controlled_spatial.yml lambda=$lambda
done

For some baselines, lambda is replaced with another parameter controlling the degree of smoothing: the gaussian stabilizer uses sigma, and the simple_fixed stabilizer uses alpha. These parameters should be overridden on the command line, analogous to lambda.
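The same loop pattern shown above for lambda applies to these parameters. The config filename and sigma values below are hypothetical examples:

```shell
# Sweep sigma for the gaussian stabilizer, analogous to the lambda loop above.
# The config filename train_gaussian.yml and the sigma values are examples.
for sigma in 0.5 1.0 2.0
do
  ./scripts/train.py runs/deeplab_viper/train_gaussian.yml sigma=$sigma
done
```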

Some models and datasets require manual downloads. See the "Dataset Setup" and "Model Setup" sections below for instructions.

General Setup

Git Submodules

After cloning this repository, initialize submodules (containing code for external models) by running:

git submodule init
git submodule update

Dependencies

We manage dependencies using Conda. To create the Conda environment, run:

conda env create --file environment.yml

Then activate the environment with:

conda activate instant-video-models

The file environment_precise.yml contains more exact package versions and can be used to reproduce the original development environment. To create an environment based on environment_precise.yml, run:

conda env create --file environment_precise.yml

Python Path

Scripts assume that the working directory is on the Python path. To set this up, run the following in Bash (or add it to .bashrc):

if [[ :$PYTHONPATH: != *:.:* ]]
then
    export PYTHONPATH="$PYTHONPATH:."
fi

Dataset Setup

DAVIS

Run the script ./scripts/datasets/download_davis.sh to download and unpack the DAVIS dataset.

Run the script ./scripts/datasets/crop_davis.py to crop the DAVIS dataset to only the annotated sections. This step is only required if running the adversarial robustness experiments.

Need for Speed (NFS)

Run the script ./scripts/datasets/download_nfs.sh to download and unpack the NFS dataset.

Local Laplacian NFS

After downloading NFS, run the following to generate the local Laplacian variant:

matlab -batch 'run("scripts/datasets/generate_nfs_local_laplacian.m")'

This requires a MATLAB installation with the image processing toolbox.

Comment or uncomment the marked blocks at the top of generate_nfs_local_laplacian.m to adjust the strength of the local Laplacian effect.

RobustSpring

Download the following files from the Spring website:

  • test_frame_left.zip
  • test_frame_right.zip
  • rain.zip
  • snow.zip

Place the downloaded files in data/robust_spring. Then run ./scripts/datasets/unpack_robust_spring.sh to unpack the zip files. Note this script deletes the original zip files after unpacking to save space.

VIPER

Download the following ZIP files from Google Drive:

Place the downloaded files in data/viper. Then run ./scripts/datasets/unpack_viper.sh to unpack the zip files. Note this script deletes the original zip files after unpacking to save space.

VisionSim Depth

Please contact us for a copy of this data (the dataset is ~200 GB and we have not found a good way to share it publicly).

Model Setup

AdaIn

Download the files decoder.pth and vgg_normalised.pth from this GitHub releases page. Place these files in ./weights/adain. Then run ./scripts/models/prepare_adain_weights.py to generate a single merged weight file that can be used more easily with our codebase.

Deeplab

Download the best_deeplabv3plus_mobilenet_cityscapes_os16.pth weights using this link and place them in weights/deeplab. Then run ./scripts/models/prepare_deeplab_weights.py to convert these weights to a format usable with our codebase.

HDRNet

Download weights using this link. Extract the contents of the zip file to weights/hdrnet. Then run ./scripts/models/prepare_hdrnet_weights.py to convert these weights to a format usable with our codebase.

NAFNet

Download the NAFNet-SIDD-width32.pth weights using this Google Drive link. Place this file in ./weights/nafnet. Then run ./scripts/models/prepare_nafnet_weights.py to convert these weights to a format usable with our codebase.
