Regression on PDF type detection #85

@MaximeLaffaire

Hi!

I have a question about this PR: you removed the PDF header detection within the first 1024 bytes, and I was wondering why?

As a reminder, I added that feature because the Adobe PDF specification describes it (page 1102, section 3.4.1).

I am curious about the security issues mentioned in the PR. Do you have more details about them, and do you know whether we could re-enable the header detection while still addressing the security concern?

For context: we have an app where users can upload PDF documents, and we use this library in our validation process.
We found that a lot of PDFs (coming from various sources, such as scanners or even Google invoices) have some data before the %PDF header.
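
To make the behaviour in question concrete, here is a minimal sketch of the lenient check, written in Python purely for illustration. It is not this library's code: the name `looks_like_pdf` and the constants are hypothetical, and it assumes the marker only has to appear in full somewhere within the first 1024 bytes.

```python
# Illustrative sketch only; names are hypothetical, not the library's API.
PDF_MAGIC = b"%PDF-"
SCAN_LIMIT = 1024  # assumed window: the marker must fit entirely inside it


def looks_like_pdf(data: bytes, scan_limit: int = SCAN_LIMIT) -> bool:
    """Return True if the %PDF- marker occurs within the first scan_limit bytes."""
    # Slicing first bounds the search, so leading junk is tolerated
    # but a marker buried deep in the file is ignored.
    return PDF_MAGIC in data[:scan_limit]


if __name__ == "__main__":
    strict_ok = b"%PDF-1.4\n..."
    with_prefix = b"\x00" * 100 + b"%PDF-1.7\n..."
    too_late = b"\x00" * 2000 + b"%PDF-1.7\n..."
    print(looks_like_pdf(strict_ok), looks_like_pdf(with_prefix), looks_like_pdf(too_late))
```

A strict check would instead require `data.startswith(PDF_MAGIC)`, which rejects files like the scanner output described above.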

Thank you :)
