
Video Inpainting Localization with Contrastive Learning

Official implementation of the paper "Video Inpainting Localization with Contrastive Learning". This repo provides code and trained weights.

Framework

Overview of the proposed video inpainting localization scheme ViLocal. Every 5 consecutive frames form an input unit used to predict the inpainting localization map of the middle frame. (a) Training stage 1: ViLocal trains the encoder network with contrastive supervision. (b) Training stage 2: ViLocal trains the decoder network with localization supervision.
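The stage-1 contrastive supervision can be sketched as a supervised pixel-level contrastive loss: encoder features of inpainted pixels are pulled together and pushed away from pristine-pixel features. The sketch below is a minimal NumPy illustration of that idea; the loss form, temperature, and variable names are assumptions, not the paper's exact formulation.

```python
import numpy as np

def pixel_contrastive_loss(features, mask, temperature=0.1):
    """Supervised contrastive loss over pixel embeddings (illustrative sketch).

    features: (N, D) array of L2-normalized pixel embeddings
    mask:     (N,) binary labels (1 = inpainted, 0 = pristine)
    """
    sim = features @ features.T / temperature            # (N, N) cosine similarities
    np.fill_diagonal(sim, -np.inf)                       # exclude self-pairs
    exp_sim = np.exp(sim - sim.max(axis=1, keepdims=True))
    same = (mask[:, None] == mask[None, :]).astype(float)
    np.fill_diagonal(same, 0.0)                          # self is not a positive
    pos = (exp_sim * same).sum(axis=1)                   # mass on same-class pixels
    total = exp_sim.sum(axis=1)                          # mass on all other pixels
    valid = same.sum(axis=1) > 0                         # anchors with >= 1 positive
    return float(-np.mean(np.log(pos[valid] / total[valid])))
```

Minimizing this quantity drives same-class (inpainted vs. pristine) embeddings closer than cross-class ones, which is the separation the stage-2 decoder then turns into a localization map.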

Dependency

  • torch 1.7.0
  • python 3.7

Datasets

  1. DAVIS2016
  2. DAVIS2017
  3. MOSE
  4. VOS2k5-800 (in this paper we use 800 videos from VOS2k5)

The MOSE100 dataset used in this paper can be found in this

Video inpainting algorithms

  1. VI
  2. OP
  3. CP
  4. E2FGVI
  5. FuseFormer
  6. STTN

Usage

For example, to train:

cd train_stage1
python train.py

cd train_stage2
python train.py

For example, to test, first download train_VI_OP.pth:

cd train_stage2
python split_files.py # split files
python construct5frames.py # construct 5-frames groups
python test.py 
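The grouping step performed by construct5frames.py can be pictured as a sliding window of 5 consecutive frames whose middle frame carries the prediction target. The sketch below is a plain-Python illustration of that windowing; the actual stride and boundary handling of the authors' script are assumptions.

```python
def make_5frame_groups(frames):
    """Slide a 5-frame window over a frame sequence (illustrative sketch).

    Returns (window, center_index) pairs: each window of 5 consecutive
    frames is the input unit for localizing inpainting in its middle frame.
    """
    groups = []
    for i in range(len(frames) - 4):          # last valid window start
        window = frames[i:i + 5]
        groups.append((window, i + 2))        # index of the middle frame
    return groups
```

For a 7-frame clip this yields 3 windows, centered on frames 2, 3, and 4; frames near the clip boundaries would need padding or replication to be covered, which this sketch omits.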

For example, to run inference:

cd train_stage2
python inference.py 

Citation

If you use this code for your research, please cite our paper:

@article{lou2025video,
  title={Video Inpainting Localization with Contrastive Learning},
  author={Lou, Zijie and Cao, Gang and Lin, Man},
  journal={IEEE Signal Processing Letters},
  year={2025},
  publisher={IEEE}
}

License

Licensed under the Creative Commons Attribution-NonCommercial 4.0 International license, for non-commercial use only. Any commercial use requires formal permission first.