
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.


TATR with Box Relaxation

A clone of https://github.com/microsoft/table-transformer with several improvements and box relaxation.

Usage:

conda env create --name tatr --file=environment.yml

conda run --no-capture-output --live-stream -n tatr python -c 'from huggingface_hub import snapshot_download; snapshot_download(repo_id="bsmock/pubtables-1m", repo_type="dataset")'

Instead of environment.yml, it is also possible to use environment-latest.

The available scripts are described below. Note that only the ones using Python need conda, and even those can easily be modified to skip conda by installing the dependencies manually.

Just like with TATR v1.1, TSR evaluation should be performed on table images with very little padding, as created by create_padded_dataset.py. The val and test splits with tight padding are available at https://huggingface.co/datasets/aioaneid/relaxed-table-structure.
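
For example, the tightly padded splits can be fetched the same way as PubTables-1M above. This is only a sketch, assuming the dataset can be pulled with huggingface_hub's snapshot_download:

conda run --no-capture-output --live-stream -n tatr python -c 'from huggingface_hub import snapshot_download; snapshot_download(repo_id="aioaneid/relaxed-table-structure", repo_type="dataset")'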

The GriTS evaluation code can be executed in parallel on different batches of images, e.g.:

# Derive a deterministic seed (below 0x7FFFFF) from the split name and epoch.
seed=$( (echo 0 ${test_split_name} ${epoch} | sha512sum | awk '{printf "ibase=16; "toupper($1)}' && echo " % 7FFFFF") | bc) &&
conda run --no-capture-output --live-stream -n tatr python src/main.py --data_type structure --config_file src/structure_config.json --data_root_dirs ${d} --table_words_dir ${d}/words --data_root_image_extensions .jpg --data_root_multiplicities 1 --device ${device} --mode ${mode} --test_split_name ${test_split_name} --test_start_offset ${test_start_offset} --test_max_size ${test_max_size} --no-enable_bounds --model_load_path ${f}/model_${epoch}.pth --metrics_save_filepath ${metrics_save_path} --seed ${seed} --torch_num_threads 1

The metrics batches for a given epoch can then be merged using plots/aggregate_json_grits.py.
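
As an illustration, the batches could be launched by stepping --test_start_offset in fixed-size increments and running the slices in the background. This is only a sketch: batch_size, num_batches and the metrics file naming are arbitrary choices, and the remaining flags are reused from the command above.

# Hypothetical batching loop: evaluate num_batches slices of batch_size images each, in parallel.
batch_size=1000
num_batches=10
for i in $(seq 0 $((num_batches - 1))); do
  test_start_offset=$((i * batch_size))
  metrics_save_path=${f}/grits_${test_split_name}_${epoch}_batch_${i}.json
  conda run --no-capture-output --live-stream -n tatr python src/main.py --data_type structure --config_file src/structure_config.json --data_root_dirs ${d} --table_words_dir ${d}/words --data_root_image_extensions .jpg --data_root_multiplicities 1 --device ${device} --mode ${mode} --test_split_name ${test_split_name} --test_start_offset ${test_start_offset} --test_max_size ${batch_size} --no-enable_bounds --model_load_path ${f}/model_${epoch}.pth --metrics_save_filepath ${metrics_save_path} --seed ${seed} --torch_num_threads 1 &
done
wait
# Merge the resulting per-batch metrics files with plots/aggregate_json_grits.py.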

The training scripts support a new --mode option, validate, which can be run as a separate phase after training.
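
As a rough sketch (assuming main.py accepts --mode validate directly and that the same config and data flags as in the evaluation command above apply), validating a saved checkpoint might look like:

conda run --no-capture-output --live-stream -n tatr python src/main.py --data_type structure --config_file src/structure_config.json --data_root_dirs ${d} --table_words_dir ${d}/words --data_root_image_extensions .jpg --data_root_multiplicities 1 --device ${device} --mode validate --model_load_path ${f}/model_${epoch}.pth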

The code can also be used without box relaxation by specifying --no-enable_bounds when running main.py.

Performance Metrics

Training for Table Detection with a subset of the tables

| Model | Cardinality Error | AP | AR |
| --- | --- | --- | --- |
| All images, all tables | 0.0018 | 0.9800 | 0.9900 |
| Only images with exactly one object (table or table rotated) | 0.1050 | 0.8700 | 0.8870 |
| All images, one randomly selected object (table or table rotated) per image | 0.0186 | 0.9770 | 0.9880 |
| All images, all objects counted split by category (table or table rotated), one randomly selected object per image has a bounding box | 0.0018 | 0.9730 | 0.9880 |
| All images, all objects with hole and outer bounding boxes each relaxed by 2 pixels; TATR v1.1 cropping around the outer border | 0.0018 | 0.9800 | 0.9900 |
| All images, all objects with hole and outer bounding boxes each relaxed symmetrically by (up to) 4 pixels; TATR v1.1 cropping around the original bounding box | 0.0016 | 0.9790 | 0.9900 |
| All images, all objects with hole and outer bounding boxes each relaxed symmetrically by (up to) 8 pixels; TATR v1.1 cropping around the original bounding box | 0.0014 | 0.9760 | 0.9850 |

Training for Table Structure Recognition with Box Relaxation

The table below reports, for each configuration, the best results obtained up to and including model_28.pth:

| Model | Tables | Acc_Con | GriTS_Con | GriTS_Loc | GriTS_Top | Epochs |
| --- | --- | --- | --- | --- | --- | --- |
| TATR v1.0 | All | 0.8243 | 0.9850 | 0.9786 | 0.9849 | 20 |
| TATR v1.1 | All | 0.8326 | 0.9855 | 0.9797 | 0.9851 | 28.5 |
| TATR v1.1 with bug fixes | All | 0.8433 | 0.9862 | 0.9806 | 0.9858 | 28 |
| Constrained box relaxation | All | 0.8458 | 0.9866 | 0.9811 | 0.9861 | 28 |
| TATR v1.1 with bug fixes | Simple | 0.9661 | 0.9947 | 0.9934 | 0.9953 | 28 |
| Constrained box relaxation | Simple | 0.9667 | 0.9954 | 0.9941 | 0.9960 | 28 |
| TATR v1.1 with bug fixes | Complex | 0.7324 | 0.9786 | 0.9693 | 0.9774 | 28 |
| Constrained box relaxation | Complex | 0.7363 | 0.9789 | 0.9697 | 0.9773 | 28 |

Citation

If you find this work useful, please cite: Aioanei, D. (2025). Relaxed Bounding Boxes for Object Detection. ICCK Journal of Image Analysis and Processing, 1(3), 107–124. https://doi.org/10.62762/JIAP.2025.507329

In BibTeX format:

@article{aioanei2025relaxed,
  author  = {Aioanei, Daniel},
  title   = {Relaxed Bounding Boxes for Object Detection},
  journal = {ICCK Journal of Image Analysis and Processing},
  year    = {2025},
  volume  = {1},
  number  = {3},
  pages   = {107--124},
  doi     = {10.62762/JIAP.2025.507329},
  url     = {https://doi.org/10.62762/JIAP.2025.507329}
}
