Issue with two column format #3

@werkstattcodes

Hello,

Many thanks for all the effort you have put into this project.

I am currently working on session transcripts that are laid out in a two-column format. I had previously tried Tesseract (via the R packages tesseract and pdf_tools), but the results were not entirely satisfactory.

I have now tried your approach, but to my surprise it recognized only the header line of the page. Any suggestions as to what I am missing?
I have attached a sample page of the records. If your time permits, any help or suggestions would be very welcome.

Many thanks,

imfname_158653_page4.pdf
