If anyone is still looking for a guide on data annotation/preparation and on fine-tuning the Table Transformer (either detection or structure recognition) with the Transformers Trainer, I have written two articles: one on data annotation/preparation and one on fine-tuning.
The GitHub repo is here: https://github.com/andyphua114/table-transformer-finetune-eval/tree/main
I hope this helps, and feel free to let me know if you have any questions.
Big thanks and credit to nielsr from the Hugging Face team for his notebook on fine-tuning DETR for object detection and his other notebooks on inference with the Table Transformer.