Releases · docling-project/docling-eval · GitHub

11 Mar 16:38

docling-ops

v1.0.1 Latest

Fix

Remove hard pinning of docling-parse (#203) (901814d)

11 Mar 15:42

docling-ops

v1.0.0

Feature

Parallelize the evaluation of tables and cache the loading of external predictions (#190) (9d04a56)
Regression tests for CVAT to Docling conversion (#193) (8a10188)
CVAT box rotation support, structural cleanup (#191) (db068e9)
Improvements in user experience: Performance, error handling, logging (#189) (a850784)
Visualizer tool and command for datasets (#186) (373f959)
Extend the evaluators to support external predictions stored in files (#185) (53dbd95)
Convert Docling JSON inputs to image streams in FileDatasetBuilder (#184) (15888fd)
Allow subset to split routing in CVAT to HF exporter (#182) (ebb8800)
Ingest CVAT assets and filter submissions (#180) (b55b2ea)
Runtime optimizations for MultiLabelConfusionMatrix (#175) (5084a4d)
Add more fine-grained control in the DoclingEvalCOCOExporter (#149) (8f33420)
Remove legacy CvatDatasetBuilder code, use modernized code (#174) (693c224)
Introduce the PixelLayoutEvaluator to produce confusion matrices for the multi-label layout analysis (#173) (a79bac5)
Review-bundle builder, fixes for GraphCell with merged elements and more (#172) (21341ce)

Fix

Correct import path for TableStructureModel (#199) (a7e74a3)
Fix the reporting of doc_id, true_md, pred_md in markdown_text_evaluator.py (#196) (3ce7591)
PixelLayoutEvaluator: Set all-pixels background in case of a missing prediction and evaluate (#183) (4314091)
Fix empty prediction handling in markdown evaluator (#177) (9b6df83)
Consistency and performance improvements (#171) (8fb3a16)

Breaking

CvatDatasetBuilder now requires modern CVAT folder structure and uses convert_cvat_folder_to_docling() internally. (693c224)

05 Nov 18:26

docling-ops

v0.10.0

Feature

Extend the CLI for create-eval to receive the vlm-options and max_new_tokens parameters when the provider is GraniteDocling (#164) (8be2e83)
Harmonize picture classes for CVAT to Docling conversion (#167) (740157d)
Add more specific validation for reading-order, enhance validation report (5e5f2db)
Integrate textline_cells based OCR evaluation (#156) (3a9543c)

Fix

Validation fixes for list item impurity check (#169) (74e7b3e)
Don't report content-layer group violation multiple times (cb71009)
Handle merged elements regarding inclusion, don't flag single element pages (c10fdfd)
Missing transform to storage_scale for some items and table cells (1eb6b4e)
More CVAT validation and docling conversion fixes (#163) (6f59c7a)
Better control over scaling in CVAT transform, fixes for OCR (#162) (ef17b5a)
Fixes for CVAT validation, OCR in CVAT pipeline, logging, and more (#161) (80e449d)

Performance

Consistency and performance improvements (#170) (d4a0ef6)

01 Oct 03:42

docling-ops

v0.9.0

Feature

Exposed forced-ocr-option (#157) (ac21644)
Implementation of table structure conversion from CVAT to DoclingDocument (208cd14)

16 Sep 08:23

docling-ops

v0.8.1

Fix

OCR visualization and add OCR recognition metrics (#144) (d63a439)

02 Sep 21:18

laurachiticariu

v0.8.0

What's Changed

feat: Extend the Consolidator to export LaTeX files alongside the Excel report by @nikos-livathinos in #143
feat: Extend the DoclingEvalCOCOExporter to export a parquet dataset in COCO format by @nikos-livathinos in #145
feat: Several fixes and campaign tools extensions by @cau-git in #150
feat: Add Table structure evaluations for TEDS by @praveenmidde in #94

Full Changelog: v0.7.0...v0.8.0

Contributors

cau-git, nikos-livathinos, and praveenmidde

30 Jul 08:06

docling-ops

v0.7.0

Feature

Add CLI arguments to control the docling layout model (#136) (3e134ae)
Campaign tools (#139) (af2c222)
Add KeyValueEvaluator (#140) (bc60093)

Fix

Prevent crash from invalid bbox coordinates in HTML export (#142) (c31b107)

02 Jul 09:01

docling-ops

v0.6.0

Feature

Layout evaluation fixes, mode control and cleanup (#133) (629a451)
Introduce utility to export layout predictions from HF parquet files into pycocotools format. (#125) (54f7c81)
Add specific language support for XFUND dataset builder (#122) (4ca6a0e)
Tooling for CVAT validation, to DoclingDocument transformation, new Evaluators (#119) (2ee1104)

Fix

Move ibm-cos to hyperscaler (#135) (9aff6c1)
Update hyperscalers to support multiple image file types (#118) (a34f264)
Misc fixes (#131) (518e1ba)
CVAT to DoclingDoc: Ensure that nested list handling works across page boundaries (#129) (1b58377)
Important fixes for parquet serialization / deserialization, optimizations (#128) (53c22ef)
Fixes for the dataset visualizers (#127) (a127ea9)

Performance

Improve parquet writing with plain pyarrow (#134) (c08950b)

11 Jun 15:48

docling-ops

v0.5.0

Feature

Integrate OCR visualization (#121) (b39f2e7)
Add the segmentation layout evaluations to the consolidated Excel report. Update mypy overrides. (#120) (c4e7de0)
Update OCREvaluator with additional metrics (#78) (17e9fde)

Fix

Add the bbox to TableData from annotations (#123) (c4fe51f)
Treat th and td as equal for TEDS calculation (#114) (dbf9db7)
Add support for Google, AWS, and Azure prediction providers in the CLI (#115) (e8e7421)

28 May 09:22

docling-ops

v0.4.0

Feature

Extend the FileProvider and the CLI to accept parameters that control the source of the prediction images (#111) (42e1615)
Improvements for the MultiEvaluator (#95) (04fe2d9)
Add extra args for docling-provider and default annotations for CVAT (#98) (7903b6a)
Introduce SegmentedPage for OCR (#91) (be0ff6a)
Update CVAT for multi-page annotation, utility to create sliced PDFs (#90) (28d166d)
Add area level f1 (#86) (54d013b)

Fix

Small fixes (#108) (0628fa6)
Layout text not correctly populated in AWS prediction provider, add tests (#100) (6441688)
Dataset feature spec fixes, CVAT improvements (#97) (b79dd19)
Update boto3 AWS client to accept service credentials (#88) (4e01d0b)
Handle unsupported END2END evaluation and fix variable name in OCR (#87) (75311da)
Propagate CVAT parameters (#82) (1e2040a)

Documentation

Update README.md (#84) (518f684)
