Evaluation metrics for the BADJA dataset

Thanks to the authors for the wonderful work. I have my own dataset in the same format as the BADJA dataset (original images and segmentation masks), and I would like to compute the same evaluation metrics on it that are used for BADJA. However, the README only seems to give guidance on computing the evaluation metrics for TAP-Vid.

Thank you for your help.