Regression on PDF type detection #85

Hi!

I have a question about [this PR](https://github.com/neilharvey/FileSignatures/pull/74/files). Why did it remove detection of the PDF header within the first 1024 bytes?

As a reminder, I added that feature because the [Adobe spec for PDF](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.7old.pdf) describes it (page 1102, section 3.4.1).

I am curious about the security issues mentioned in the PR. Do you have more details about them, and do you see a way to re-enable the header detection while still addressing them?

For context: we have an app where users can upload PDF documents, and we use this library in our validation process.
We found that a lot of PDFs (from various sources, such as scanners or even Google invoices) have some data before the %PDF header.
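
To make the behavior I am describing concrete, here is a minimal sketch in Python. It is not the library's actual code, and `find_pdf_header`, `PDF_MAGIC`, and `SEARCH_WINDOW` are hypothetical names used only for illustration. It assumes the check is simply "the `%PDF` marker appears somewhere within the first 1024 bytes":

```python
PDF_MAGIC = b"%PDF"      # marker mentioned above
SEARCH_WINDOW = 1024     # assumption: window size taken from the spec note


def find_pdf_header(data: bytes, window: int = SEARCH_WINDOW) -> int:
    """Return the offset of the %PDF marker, or -1 if it is not found.

    Only matches that lie entirely inside the first `window` bytes count.
    """
    # bytes.find(sub, start, end) only reports matches fully inside [start, end)
    return data.find(PDF_MAGIC, 0, window)


def looks_like_pdf(data: bytes) -> bool:
    """True if the marker is found anywhere in the search window."""
    return find_pdf_header(data) != -1
```

With a check like this, a file with a few leading junk bytes before `%PDF` is still accepted, while a strict "offset 0 only" check rejects it. That difference is what we are seeing with our uploaded files.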

Thank you :)
