
[Feature request] Expose advanced PaddleOCR parameters to fix missing dialogue dashes #92

@leontin8920


Describe the feature

Description:

Hi! First of all, thank you again for creating this tool!

I would like to request that a few advanced PaddleOCR parameters be exposed in the GUI (or via a config file) to solve a persistent issue: the OCR engine consistently ignores or cuts off the dialogue dashes (-) at the beginning of subtitle lines.

The Problem & What I've Tried:
This issue happens on all types of videos: regular color, grayscale, and black & white. I have tried everything I could think of to help the OCR catch these dashes. I went as far as fully binarizing the video (pure black text on a pure white background). I also created tightly fitted, precisely centered Crop Boxes. As the attached screenshot shows, I even created a separate, dedicated Crop Box for each individual text line to isolate them completely:

[Image: screenshot showing a separate Crop Box for each text line]

Despite all this perfect pre-processing, PaddleOCR's default detection model still drops them.

The Solution (The "Last Hope") proposed to me by Gemini AI:
As a last hope to fix this, it would be a massive game-changer if users could tweak the following PaddleOCR parameters directly from your app. According to Gemini, adjusting them can help with the "missing edge punctuation" issue:

1. det=False (Option to disable Detection of text area)

  • Why: Even when we provide a perfect, hand-drawn Crop Zone, PaddleOCR's default behavior (det=True) is to run its internal text detection model again on that small, cropped image. This model's job is to scan the image, find blobs of what it thinks is text, and draw its own, even tighter, internal bounding boxes around them before passing them to the recognition model.

  • The Failure Point: The failure happens right here. The detection model is trained to find words and is very aggressive about filtering out "noise". It often misclassifies small, isolated characters like a leading dialogue dash (-) as a digital artifact or a speck of dust. As a result, it creates an internal bounding box that starts after the dash, effectively excluding it. Consequently, the final recognition model never even sees the dash, because it was filtered out by the detector.

  • The Solution: By allowing det=False, we could bypass this entire problematic detection step. It would essentially tell the engine: "Trust the user's hand-drawn Crop Zone completely. Assume everything inside it is a single line of text and send the entire zone directly to the recognition model." This guarantees that the dash is never filtered out.

  • (Note: I understand this might require some logic tweaks on your end, since PaddleOCR won't return spatial bounding box coordinates when det=False. However, adding a checkbox like "Assume Crop Zone is a single text line" would be an incredibly powerful bypass for these hard-to-read punctuation cases).
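To make the note above a bit more concrete, here is a rough sketch of how the missing coordinates could be filled in. It assumes the PaddleOCR 2.x Python API, where (as far as I know) `det=False` is passed to the `ocr()` call and each line comes back as a plain `(text, score)` pair without a box. The helper names `crop_zone_as_box` and `wrap_rec_only_result` are made up for illustration and are not part of VideOCR or PaddleOCR; the idea is simply to reuse the Crop Zone rectangle as the bounding box.

```python
from typing import List, Tuple

Box = List[List[int]]  # four [x, y] corners, clockwise from top-left


def crop_zone_as_box(x: int, y: int, width: int, height: int) -> Box:
    """Use the user's Crop Zone rectangle itself as the text bounding box."""
    return [[x, y], [x + width, y], [x + width, y + height], [x, y + height]]


def wrap_rec_only_result(
    rec_result: Tuple[str, float],
    crop_zone: Tuple[int, int, int, int],
) -> list:
    """Turn a recognition-only (text, score) pair into a [box, (text, score)] entry.

    The output mimics the shape of a detection + recognition result, so the
    rest of a pipeline that expects boxes could stay unchanged (assumption).
    """
    text, score = rec_result
    return [crop_zone_as_box(*crop_zone), (text, score)]


# Hypothetical recognition-only output for one cropped subtitle line,
# with a Crop Zone given as (x, y, width, height) in frame pixels.
entry = wrap_rec_only_result(("- Where are you going?", 0.97), (100, 620, 800, 48))
print(entry)
```

With something like this, the "Assume Crop Zone is a single text line" checkbox would only need to switch between the normal path and the recognition-only path.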

2. det_db_unclip_ratio (might help if set higher than the default value of 1.5)

  • Why: The default value often makes the bounding box too tight around the words, physically cutting off the leading dash. Increasing this ratio expands each detected box outward, so edge punctuation is more likely to be included in the region passed to the recognition phase.
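A small worked example of why a larger unclip ratio could help. To my understanding, DB post-processing grows each detected region outward by `area * unclip_ratio / perimeter` pixels. The sketch below applies that formula to a plain axis-aligned rectangle, which is a simplification (the real code offsets polygons), and all pixel coordinates are made up.

```python
def unclip_distance(width, height, unclip_ratio):
    """Outward offset in pixels: box area * unclip_ratio / box perimeter."""
    return (width * height) * unclip_ratio / (2 * (width + height))


def expand_box(x0, y0, x1, y1, unclip_ratio):
    """Grow an axis-aligned box by the unclip distance on every side."""
    d = unclip_distance(x1 - x0, y1 - y0, unclip_ratio)
    return (x0 - d, y0 - d, x1 + d, y1 + d)


# Made-up numbers: the detector found the words at x = 60..360,
# while the leading dash sits further left at x = 20..35.
word_box = (60, 10, 360, 50)
dash_left_edge = 20

for ratio in (1.5, 2.5):
    left = expand_box(*word_box, ratio)[0]
    print(f"unclip_ratio={ratio}: left edge at x={left:.1f}, "
          f"dash fully inside: {left <= dash_left_edge}")
```

In this toy case the box expanded with 1.5 still starts to the right of the dash, while 2.5 pushes the left edge far enough out to cover it. The right value for real subtitles would have to be found by experiment, which is why a user-adjustable setting would be useful.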

3. det_db_box_thresh (might help if set lower than the default value of 0.6)

  • Why: This is the minimum average score a detected box must reach to be kept. Lowering it makes the DBNet detector keep lower-confidence boxes (such as one around an isolated dash) instead of discarding them as noise.

4. det_db_thresh (might help if set lower than the default value of 0.3)

  • Why: This is the threshold used to binarize the detector's probability map. Lowering it makes the binarization more sensitive, so very thin or faint strokes (like thin dashes) are kept as text pixels instead of being lost.
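To illustrate how the two thresholds interact, here is a toy sketch in plain Python. It is not PaddleOCR code: the real detector works on a 2D probability map and extracts contours, while this uses a single made-up row of per-pixel scores. The function names `find_regions` and `keep_regions` are hypothetical. The first threshold decides which pixels count as text at all, and the second decides which of the resulting regions survive based on their mean score.

```python
def find_regions(probs, thresh):
    """Binarize with `thresh`; return (start, end) runs of pixels above it."""
    regions, start = [], None
    for i, p in enumerate(probs):
        if p > thresh and start is None:
            start = i                      # a new run of text pixels begins
        elif p <= thresh and start is not None:
            regions.append((start, i))     # the run ended at the previous pixel
            start = None
    if start is not None:
        regions.append((start, len(probs)))
    return regions


def keep_regions(probs, thresh, box_thresh):
    """Keep only regions whose mean score is at least `box_thresh`."""
    kept = []
    for start, end in find_regions(probs, thresh):
        mean_score = sum(probs[start:end]) / (end - start)
        if mean_score >= box_thresh:
            kept.append((start, end))
    return kept


# Made-up scores: a faint two-pixel "dash", a gap, then a confident "word".
probs = [0.0, 0.25, 0.28, 0.0, 0.0, 0.9, 0.95, 0.9, 0.85, 0.0]

print(keep_regions(probs, thresh=0.3, box_thresh=0.6))   # default-like values
print(keep_regions(probs, thresh=0.2, box_thresh=0.6))   # only thresh lowered
print(keep_regions(probs, thresh=0.2, box_thresh=0.25))  # both lowered
```

In this toy row the faint dash only survives when both values are lowered: lowering the binarization threshold alone finds the dash pixels, but the resulting region is then rejected for its low mean score. That is why it would help to have both settings adjustable rather than just one.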

Allowing users to customize these parameters would make VideOCR the absolute ultimate tool for extracting hardcoded subs.

Thank you so much for your time and for considering this!


Labels: enhancement (New feature or request)
