Skip to content

fix: drop malformed bounding boxes#454

Open
b-hahn wants to merge 4 commits intodocling-project:mainfrom
b-hahn:main
Open

fix: drop malformed bounding boxes#454
b-hahn wants to merge 4 commits intodocling-project:mainfrom
b-hahn:main

Conversation

@b-hahn
Copy link
Copy Markdown

@b-hahn b-hahn commented Dec 10, 2025

This change handles the case where bounding boxes predicted by VLMs have a width or a height <=0. Previously, this would cause the pipeline to crash while saving the PIL image. Now, malformed bounding boxes are simply ignored, resulting in an empty provenance field while keeping the extracted text.

Fixes docling-project/docling#2763

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Dec 10, 2025

DCO Check Passed

Thanks @b-hahn, all your commits are properly signed off. 🎉

@mergify
Copy link
Copy Markdown

mergify bot commented Dec 10, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🔴 Require two reviewer for test updates

This rule is failing.

When test data is updated, we require two reviewers

  • #approved-reviews-by >= 2

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:

@b-hahn b-hahn force-pushed the main branch 4 times, most recently from 1c3bd97 to dd586ed Compare December 10, 2025 20:22
@codecov
Copy link
Copy Markdown

codecov bot commented Dec 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@kklein
Copy link
Copy Markdown

kklein commented Dec 11, 2025

@dolfim-ibm @cau-git:

I think that @b-hahn from yesterday would require another approval of the CI runs on your end. :)

@dolfim-ibm
Copy link
Copy Markdown
Member

@b-hahn you identified the right function, i.e. load_from_doctags(), to fix the issue. As we discussed, instead of making an empty prov: [] we would like to drop the element completely if it has zero size.

@cau-git
Copy link
Copy Markdown
Member

cau-git commented Dec 11, 2025

@dolfim-ibm we agreed on the proposed solution here. It’s legit for some elements to have missing prov

@kklein
Copy link
Copy Markdown

kklein commented Dec 12, 2025

@dolfim-ibm and @cau-git : How do you suggest to move forward now?

@simon376
Copy link
Copy Markdown

I think I found another issue that might relate to this, unfortunately I can't provide a stack trace since it is hard to reproduce and I didn't fully log everything on the server, and I couldn't find this error message in the code yet. Maybe you can have a look at this, too:
Pipeline VlmPipeline failed: Coordinate 'lower' is less than 'upper'

@dolfim-ibm
Copy link
Copy Markdown
Member

@b-hahn for the moment we should be good with the option which provides the empty prov: [].

Copy link
Copy Markdown
Member

@cau-git cau-git left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@b-hahn @kklein just reflecting the desired solution below. Let's first go only with your change in extract_bounding_box, then observe before introducing conditions that could potentially skip valid content.

text=caption_text,
parent=None,
)
if bbox is not None:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove this condition so captions are created also when the bbox is not present, as before.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

doc=doc,
parent=inline_group,
)
if common_bbox is not None:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove this condition so that _add_text is called also with a None bounding box, as before.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@b-hahn b-hahn closed this Dec 17, 2025
@b-hahn b-hahn reopened this Dec 17, 2025
@b-hahn b-hahn force-pushed the main branch 2 times, most recently from 378b1d6 to 074400f Compare December 17, 2025 21:30
@b-hahn b-hahn marked this pull request as ready for review December 17, 2025 21:32
@dosubot
Copy link
Copy Markdown

dosubot bot commented Dec 17, 2025

Related Documentation

Checked 8 published document(s) in 1 knowledge base(s). No updates required.

How did I do? Any feedback?  Join Discord

ceberam
ceberam previously approved these changes Feb 3, 2026
Copy link
Copy Markdown
Member

@ceberam ceberam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, it works as expected, thus approved.

I would only add the following suggestion. Turn this variable into a module constant, so that we remember the value choice we made. We can also add some docstrings. Something like:

_DOCTAGS_BBOX_MIN_DIMENSION: Final = 1 / 500
"""Minimum bounding box dimension threshold for DocTags import.

This constant represents the minimum width or height (as a fraction of the image
dimensions) that a bounding box must have to be considered valid during DocTags
import. The value ``1/500 = 0.002 = 0.2%`` means that any bounding box with width or
height less than 0.2% of the image width or height will be discarded.
"""

cau-git
cau-git previously approved these changes Feb 3, 2026
@ceberam
Copy link
Copy Markdown
Member

ceberam commented Feb 3, 2026

@b-hahn we just need your sign-off in the second commit (af71bb5) to merge this PR.

@ceberam
Copy link
Copy Markdown
Member

ceberam commented Feb 11, 2026

@b-hahn we just need your sign-off in the second commit (af71bb5) to merge this PR.

@b-hahn ^^^ were you able to have a look at this? ^^^
Also, consider the remark here #454 (review)

Signed-off-by: Benjamin Hahn <benjamin.hahn1@gmail.com>
Signed-off-by: Benjamin Hahn <benjamin.hahn1@gmail.com>
Signed-off-by: Benjamin Hahn <benjamin.hahn1@gmail.com>
Signed-off-by: Benjamin Hahn <benjamin.hahn1@gmail.com>
@b-hahn
Copy link
Copy Markdown
Author

b-hahn commented Feb 20, 2026

@ceberam Added a module-level constant and signed off all commits. Let me know if you need anything else to see this through.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SystemError: tile cannot extend outside image caused by VLM zero-area predictions (Integer Rounding Edge Case)

6 participants