Table column detection regression with vLLM ≥0.15 + latest transformers (dots.ocr detects 5 columns as 3)

After upgrading to vLLM 0.15.x and newer versions of transformers, I’m seeing a regression in table extraction accuracy when using dots.ocr.

Previously, tables were detected and extracted correctly, including the column structure and table headers. With the newer stack, some tables are mis-detected: specifically, 5-column tables are sometimes extracted as only 3 columns.

This behavior did not occur in my earlier tests with the older stack (vllm==0.10, transformers==4.51.3).

Expected behavior

Tables should preserve the correct number of columns (e.g., a 5-column table should be extracted as 5 columns).

Actual behavior

Some tables are detected with fewer columns than exist:

5 columns → detected as 3

This leads to merged or misaligned cells and incorrect structured output.

Environment

vLLM: 0.15

transformers: <version>

dots.ocr: <version/commit>

CUDA: 13.0

GPU: H100 NVL 96GB

Python: 3.12

OS: PyTorch (Vast) Docker image based on nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04
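
To pin down the exact package versions for the fields above, something like the following could be run inside the container. It is only a sketch: the package list (`vllm`, `transformers`, `torch`) and the helper name `collect_versions` are my own choices, and dots.ocr may not be installed as a distribution at all, in which case its commit hash has to come from `git` instead.

```python
import platform
from importlib import metadata


def collect_versions(packages=("vllm", "transformers", "torch")):
    """Return a {name: version} mapping for the given distributions.

    Packages that are not installed are reported as "not installed"
    instead of raising, so the output can be pasted into a bug report.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    # The interpreter version is reported alongside the packages.
    versions["python"] = platform.python_version()
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version}")
```

Running this once in the old environment and once in the new one gives two lists that can be compared line by line.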

Repro steps

Run dots.ocr with the vLLM backend

Process a document containing multi-column tables (≥5 columns)

Observe the incorrect column detection

I wish I had kept examples, but I didn't. When I find some, I will reply in this thread.
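
Until I have concrete samples, the column check itself could be automated roughly as follows. This is a sketch that assumes the table comes back as HTML `<table>` markup; the helper name `row_widths` is hypothetical, and nested tables and `rowspan` are not handled.

```python
from html.parser import HTMLParser


class _RowWidths(HTMLParser):
    """Collects the number of columns in each <tr>, honoring colspan."""

    def __init__(self):
        super().__init__()
        self.widths = []
        self._current = None  # column count of the row being parsed

    def _flush(self):
        if self._current is not None:
            self.widths.append(self._current)
            self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._flush()  # tolerate a missing </tr>
            self._current = 0
        elif tag in ("td", "th") and self._current is not None:
            span = dict(attrs).get("colspan") or "1"
            self._current += int(span) if span.isdigit() else 1

    def handle_endtag(self, tag):
        if tag == "tr":
            self._flush()


def row_widths(table_html):
    """Return the column count of every row in the given HTML."""
    parser = _RowWidths()
    parser.feed(table_html)
    parser.close()
    parser._flush()  # last row, if its </tr> was missing
    return parser.widths
```

For a known 5-column table, any row width other than 5 in the result from the new stack would confirm the regression, and diffing the widths between the two stacks shows which rows were merged.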

vLLM call params

``` python
# image_base64 and prompt are set earlier in the script (not shown).
payload = {
    "model": "model",
    "messages": [
        {
            "role": "user",
            "content": [
                # Page image first, then the text prompt.
                {
                    "type": "image_url",
                    "image_url": {
                        "url": image_base64
                    }
                },
                {
                    "type": "text",
                    "text": prompt
                }
            ]
        }
    ],
    "max_tokens": 8096,
    # "presence_penalty": 0.0,
    # "frequency_penalty": 0.0,
    # Sampling parameters used for every request.
    "repetition_penalty": 1.05,
    "temperature": 0.1,
    "top_p": 1.0,
    "top_k": 0,
    "min_p": 0.0
}
```
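
For an A/B comparison between the old and new stack, the same request could be sent to both servers with sampling effectively disabled, so that any difference is not due to randomness. The sketch below makes several assumptions that are not part of my original script: the helper names `build_payload` and `send`, a server reachable at `http://localhost:8000`, the image passed as a base64 data URL, and `temperature` lowered to 0.0 for the comparison. The endpoint path is the OpenAI-compatible `/v1/chat/completions` route that vLLM serves.

```python
import base64
import json
import urllib.request


def build_payload(image_bytes, prompt, mime="image/png", model="model"):
    """Build a chat-completions payload with the image as a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime};base64,{encoded}"
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": data_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
        "max_tokens": 8096,
        # Assumption: greedy decoding for the comparison (original used 0.1).
        "temperature": 0.0,
        "repetition_penalty": 1.05,
        "top_p": 1.0,
    }


def send(payload, base_url="http://localhost:8000"):
    """POST the payload and return the generated text of the first choice."""
    request = urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `send(build_payload(image_bytes, prompt), base_url=...)` once per server, with the same image and prompt, yields two outputs that can be compared directly.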
