Token Merging via Spatiotemporal Information Mining for Surgical Video Understanding

STIM-TM applied to Surgformer

This repository provides code for applying Spatiotemporal Information Mining Token Merging (STIM-TM) to the Surgformer baseline.

The provided code extends the original code for Surgformer.

Installation

conda create -n STIM-TM python==3.8.13
conda activate STIM-TM
pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt
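After installation, it can help to confirm that the pinned PyTorch packages were actually picked up. The following is a minimal sketch, not part of the repository: the helper name `check_versions` is hypothetical, and the expected versions are simply copied from the install command above. It uses only the standard library, so it runs even if the installation failed.

```python
from importlib import metadata

# Versions copied from the pip command above (the "+cu117" suffix is ignored).
EXPECTED = {"torch": "1.13.0", "torchvision": "0.14.0", "torchaudio": "0.13.0"}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}.

    Hypothetical helper for illustration; it is not shipped with the repo.
    """
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Compare only the public version, dropping local tags such as "+cu117".
        ok = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: installed={installed} [{status}]")
```

Run it inside the activated STIM-TM environment; any line marked MISMATCH points to a package that is missing or at a different version than the one pinned above.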

Training Surgformer

Follow the instructions in the Surgformer code to prepare the dataset and download the pre-trained TimeSformer parameters.

Run the following script to start training:

sh scripts/train.sh

Testing Surgformer w/o token merging

Run the following script for testing. It writes one result file per GPU (0.txt, 1.txt, ...), so testing with N GPUs produces N files:

sh scripts/test.sh

Merge the per-GPU files and generate a separate txt file for each video, using the conversion script for your dataset:

python datasets/convert_results/convert_cholec80.py
python datasets/convert_results/convert_autolaparo.py
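To illustrate what the merging step does conceptually, here is a minimal sketch. It is not the repository's conversion script: the function name `merge_rank_files` and the line format `<video_id> <frame_id> <predicted_phase>` are assumptions made for this example, and the real result files may use different fields. The idea is to read every per-GPU file, group predictions by video, drop frames that appear more than once across files, and write one sorted file per video.

```python
import re
from collections import defaultdict
from pathlib import Path


def merge_rank_files(result_dir, out_dir):
    """Merge per-GPU files (0.txt, 1.txt, ...) into one txt file per video.

    Assumed, hypothetical line format: "<video_id> <frame_id> <predicted_phase>".
    Returns {video_id: number_of_frames_written}.
    """
    result_dir, out_dir = Path(result_dir), Path(out_dir)
    # Only pick up files whose name is a GPU rank, e.g. "0.txt", "1.txt".
    rank_files = sorted(
        (p for p in result_dir.glob("*.txt") if re.fullmatch(r"\d+", p.stem)),
        key=lambda p: int(p.stem),
    )
    per_video = defaultdict(dict)  # video_id -> {frame_id: prediction}
    for path in rank_files:
        for line in path.read_text().splitlines():
            parts = line.split()
            if len(parts) != 3:
                continue  # skip blank or malformed lines
            video_id, frame_id, pred = parts
            # If a frame shows up in more than one file, keep the first copy.
            per_video[video_id].setdefault(int(frame_id), pred)

    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for video_id, frames in per_video.items():
        lines = [f"{fid}\t{frames[fid]}" for fid in sorted(frames)]
        (out_dir / f"{video_id}.txt").write_text("\n".join(lines) + "\n")
        written[video_id] = len(lines)
    return written
```

For actual evaluation, use the provided `convert_cholec80.py` or `convert_autolaparo.py`, which know the real output format; the sketch only shows the group-by-video and de-duplicate logic.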

Use the MATLAB evaluation code (the evaluation_matlab folder) to compute the metrics.

Testing Surgformer w/ STIM-TM

Run the following script for testing:

sh scripts/test_STIM_TM.sh

Acknowledgements

Thanks to the authors of the following open-source projects:

