[AAAI 2026] BREPS: Bounding-Box Robustness Evaluation of Promptable Segmentation

Paper

This repository contains the official dataset and code implementation for the paper:
BREPS: Bounding-Box Robustness Evaluation of Promptable Segmentation


Setting Up the Environment

Install Dependencies:

```bash
conda install -y scikit-image
conda install -y -c anaconda cmake
pip install -e .
```
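
Optionally, sanity-check that your PyTorch build sees the GPU before launching any evaluation (a generic check, not specific to this repository):

```bash
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```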

Prepare Datasets & Model Checkpoints

This project builds upon RITM and TETRIS, so it uses the same dataset structure and evaluation scripts. You should therefore configure the paths to the datasets in config.yml (a sketch follows the table below). We evaluated the BREPS attack on the datasets listed below.


| Dataset | Description | Download Link |
|---------|-------------|---------------|
| GrabCut | 50 images with one object each (test) | GrabCut.zip (11 MB) |
| Berkeley | 96 images with 100 instances (test) | Berkeley.zip (7 MB) |
| DAVIS | 345 images with one object each (test) | DAVIS.zip (43 MB) |
| COCO_MVal | 800 images with 800 instances (test) | COCO_MVal.zip (127 MB) |
| TETRIS | 2000 images with 2531 instances (test) | TETRIS.zip (6.3 GB) |
| ACDC | 100 images with 100 instances (test) | ACDC.zip (14 MB) |
| BUID | 780 images with 780 instances (test) | BUID.zip (24 MB) |
| MedScribble | 56 images with 56 instances (test) | MedScribble.zip (2.5 MB) |
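
A minimal sketch of what the dataset paths in config.yml can look like, assuming the RITM/TETRIS-style layout; the keys and paths here are illustrative, so check the config.yml shipped with the repository for the exact schema:

```yaml
# Illustrative only -- verify key names against the repository's config.yml.
EXPS_PATH: ./experiments          # where evaluation results are written

GRABCUT_PATH: /data/datasets/GrabCut
BERKELEY_PATH: /data/datasets/Berkeley
DAVIS_PATH: /data/datasets/DAVIS
COCO_MVAL_PATH: /data/datasets/COCO_MVal
TETRIS_PATH: /data/datasets/TETRIS
```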

Real-User Study:

We collected 25,000 annotations: 50 user bounding boxes for each of 500 images from 10 datasets (all attack datasets plus ADE20K and PascalVOC). You can download the full user study data from this link.

To download checkpoints, please refer to the repositories of the relevant papers, or download all checkpoints used in this work at once: MODELS_CHECKPOINTS.zip (18 GB).

Run Optimization

Short Example

```bash
python3 scripts/evaluate_boxes_model_sam.py NoBRS \
    --checkpoint ../MODEL_CHECKPOINTS/SAM/sam_vit_b_01ec64.pth \
    --deterministic --save-ious --datasets=GrabCut \
    --n_opt_steps=50 --lr_mult=9 --iou-analysis --gpus="0" \
    --thresh=0.5 --optim_min --modality=bbox --lambda_mult 0.1
```

All the flags are the same as in the original models, except for the following additional ones (a sketch of the optimization loop they control follows the list):

- `--n_opt_steps` — number of optimization steps for the bounding box
- `--optim_min` — minimize IoU (maximization by default)
- `--lr_mult` — learning rate multiplier for the optimization (can be set to 0 with `n_opt_steps=1` for the baseline clicking strategy)
- `--n_workers` — number of parallel workers for evaluation (the maximum you can fit depends on your GPU)
- `--deterministic` — force deterministic algorithms in PyTorch (however, some models may use non-deterministic ops)
- `--lambda_mult` — regularization strength
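
To make the roles of these flags concrete, here is a minimal, hypothetical sketch of a gradient-based box-optimization loop in the spirit of the attack. The `predict_fn` callable, the base learning rate, and the squared-distance regularizer are assumptions for illustration, not the repository's exact implementation:

```python
import torch

def soft_iou(pred_logits, gt_mask, eps=1e-6):
    # Differentiable IoU between predicted mask probabilities and GT.
    p = torch.sigmoid(pred_logits)
    inter = (p * gt_mask).sum()
    return inter / (p.sum() + gt_mask.sum() - inter + eps)

def optimize_box(predict_fn, box, gt_mask, n_opt_steps=50,
                 lr_mult=9.0, lambda_mult=0.1, optim_min=True):
    # predict_fn: assumed differentiable callable mapping a (4,) box
    # tensor (x1, y1, x2, y2) to mask logits for a fixed image.
    box = box.clone().float().requires_grad_(True)
    box_init = box.detach().clone()
    opt = torch.optim.SGD([box], lr=lr_mult * 1.0)  # base lr is an assumption
    sign = 1.0 if optim_min else -1.0  # descend on IoU, or ascend on it
    for _ in range(n_opt_steps):
        opt.zero_grad()
        iou = soft_iou(predict_fn(box), gt_mask)
        # Keep the perturbed box close to the initial one (--lambda_mult).
        reg = lambda_mult * ((box - box_init) ** 2).mean()
        (sign * iou + reg).backward()
        opt.step()
    return box.detach()
```

With `--lr_mult=0` and `--n_opt_steps=1`, the box never moves, which reproduces the baseline run mentioned above.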

Full Benchmarking

Some models (SAM, SAM-HQ, MobileSAM, SAM2.1, SAM-HQ 2, RobustSAM, MedSAM) must be installed as separate packages and do not support backpropagation from the box prompt due to internal torch.no_grad calls. You can either remove such calls manually or download the already patched versions and install them with:

```bash
cd segment-anything-custom-build; pip install -e .
cd sam-hq-custom-build; pip install -e .
cd MobileSAM-custom-build; pip install -e .
...
```

and so on for each patched package.
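
If you patch a package yourself, the change usually amounts to dropping a @torch.no_grad() decorator from the prediction entry point (for example, SamPredictor.predict_torch in segment-anything) so that gradients can flow back to the box tensor. A self-contained illustration of the effect, with a stand-in method body:

```python
import torch

class PredictorBefore:
    @torch.no_grad()  # gradients stop here, so box attacks cannot backprop
    def predict_torch(self, boxes):
        return boxes * 2.0  # stand-in for the real mask decoder

class PredictorAfter:
    # Same method with the decorator removed: the autograd graph is kept,
    # so a loss on the output backpropagates to the box tensor.
    def predict_torch(self, boxes):
        return boxes * 2.0

box = torch.tensor([1.0, 1.0, 5.0, 5.0], requires_grad=True)
PredictorAfter().predict_torch(box).sum().backward()    # works: box.grad set
# PredictorBefore().predict_torch(box).sum().backward() # RuntimeError
```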

To benchmark all models, set up the environment, download all checkpoints to the MODEL_CHECKPOINTS folder, and then simply run bash runbboxparallel.sh, selecting the number of models appropriate for your server (a hypothetical excerpt of such a launcher is shown below). All hyperparameters are set following the original authors' implementations.
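
A hypothetical excerpt of what a parallel launcher like runbboxparallel.sh can look like, with one evaluation process per GPU; the model list and flag values are illustrative:

```bash
#!/usr/bin/env bash
# Illustrative only: the shipped script defines the real model list.
python3 scripts/evaluate_boxes_model_sam.py NoBRS \
    --checkpoint MODEL_CHECKPOINTS/SAM/sam_vit_b_01ec64.pth \
    --modality=bbox --optim_min --gpus="0" &
python3 scripts/evaluate_boxes_model_sam.py NoBRS \
    --checkpoint MODEL_CHECKPOINTS/MobileSAM/mobile_sam.pt \
    --modality=bbox --optim_min --gpus="1" &
wait  # block until both evaluations finish
```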

Metrics Calculation

After benchmarking, each model creates entries in the EXPS_PATH folder from config.yml. Merge it with the baseline experiments folder if you need IoU-Base@BBox scores. Use the provided Evaluate Models.ipynb Jupyter notebook to calculate all the metrics: IoU-Min@BBox, IoU-Max@BBox, and IoU-Base@BBox. You can also download the results obtained from benchmarking all models: experiments.zip (570 MB). Sample output:

```
--------------------------------
GrabCut
--------------------------------
mobile_sam
IOU  | Min 96.21 | Base 94.77 | Max 97.19 | Delta 0.99
--------------------------------
robustsam_checkpoint_b
IOU  | Min 45.73 | Base 84.62 | Max 90.07 | Delta 44.33
--------------------------------
```
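
For reference, the aggregation behind these numbers can be reproduced from the per-instance IoUs saved with --save-ious. A minimal sketch, assuming the arrays are aligned by instance (in the sample above, Delta matches Max - Min up to rounding):

```python
import numpy as np

def breps_scores(ious_min, ious_base, ious_max):
    # ious_*: per-instance IoUs in [0, 1] from the minimization,
    # baseline, and maximization runs (assumed layout).
    mn, base, mx = (100.0 * float(np.mean(a))
                    for a in (ious_min, ious_base, ious_max))
    return {"Min": mn, "Base": base, "Max": mx, "Delta": mx - mn}

# Toy usage:
print(breps_scores([0.95, 0.97], [0.94, 0.95], [0.96, 0.98]))
```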

To compute and visualise heatmaps, please check out HEATMAPS.md.

Citation

If you find this work useful for your research, please cite the original paper:

```bibtex
@article{moskalenko2026breps,
  title={BREPS: Bounding-Box Robustness Evaluation of Promptable Segmentation},
  author={Moskalenko, Andrey and Kuznetsov, Danil and Dudko, Irina and Iasakova, Anastasiia and Boldyrev, Nikita and Shepelev, Denis and Spiridonov, Andrei and Kuznetsov, Andrey and Shakhuro, Vlad},
  journal={arXiv preprint arXiv:2601.15123},
  year={2026}
}
```
