xjiangmed/STIM-TM

Token Merging via Spatiotemporal Information Mining for Surgical Video Understanding

STIM-TM applied to Surgformer

This repository provides code for applying Spatiotemporal Information Mining Token Merging (STIM-TM) to the Surgformer baseline.

The provided code extends the original code for Surgformer.

Installation

conda create -n STIM-TM python==3.8.13
conda activate STIM-TM
pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt
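
As a quick sanity check after installation, the following sketch compares the installed package versions with the ones pinned in the commands above. It is not part of the repository; the function name `check_environment` is hypothetical, and it relies only on the standard-library `importlib.metadata` module.

```python
import sys
from importlib import metadata

# Versions pinned in the installation commands above.
EXPECTED = {"torch": "1.13.0", "torchvision": "0.14.0", "torchaudio": "0.13.0"}


def check_environment(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}.

    Hypothetical helper, not part of the STIM-TM code base.
    """
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Local build tags such as "+cu117" are ignored when comparing.
        ok = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, (installed, ok) in check_environment().items():
        status = "ok" if ok else "mismatch"
        print(f"{name}: {installed or 'not installed'} ({status})")
```

Running the file inside the activated STIM-TM environment prints one line per package, which makes a wrong CUDA wheel or a missing package easy to spot before training.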

Training Surgformer

Follow the instructions in the Surgformer code to prepare the dataset and download the pre-trained parameters for TimeSformer.

Run the following script to start training:

sh scripts/train.sh

Testing Surgformer w/o token merging

  1. Run the following script for testing. It produces 0.txt, 1.txt, ... (testing with N GPUs yields N files):
sh scripts/test.sh
  2. Merge the files and generate a separate txt file for each video:
python datasets/convert_results/convert_cholec80.py
python datasets/convert_results/convert_autolaparo.py
  3. Use the Matlab evaluation code to compute the metrics.
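
To illustrate what the merge step does conceptually, here is a minimal sketch that combines per-GPU result files into one file per video. It does not reproduce `convert_cholec80.py` or `convert_autolaparo.py`. The function name `merge_results` and, in particular, the line format `<video_id> <frame_id> <prediction> <label>` are assumptions made for this example; check the files written by `scripts/test.sh` for the actual layout.

```python
from collections import defaultdict
from pathlib import Path


def merge_results(result_dir, out_dir):
    """Merge per-GPU files (0.txt, 1.txt, ...) into one file per video.

    ASSUMED line format (hypothetical, not taken from the repository):
        <video_id> <frame_id> <prediction> <label>
    """
    per_video = defaultdict(dict)
    for path in sorted(Path(result_dir).glob("*.txt")):
        if not path.stem.isdigit():  # only per-GPU files such as 0.txt, 1.txt
            continue
        for line in path.read_text().splitlines():
            parts = line.split()
            if len(parts) != 4 or not parts[1].isdigit():
                continue  # skip headers or malformed lines
            video_id, frame_id, pred, label = parts
            # Keying by frame index drops frames that appear in more than one file.
            per_video[video_id][int(frame_id)] = (pred, label)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for video_id, frames in per_video.items():
        # Write frames in temporal order, one tab-separated line per frame.
        lines = [f"{fid}\t{pred}\t{label}" for fid, (pred, label) in sorted(frames.items())]
        (out_dir / f"{video_id}.txt").write_text("\n".join(lines) + "\n")
    return sorted(per_video)
```

Sorting by frame index matters because each GPU only sees a subset of the frames, so the per-GPU files are generally not in temporal order on their own.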

Testing Surgformer w/ STIM-TM

Run the following script for testing:

sh scripts/test_STIM_TM.sh

Acknowledgements

Thanks to the authors of the following open-source projects:
