Umlauts in filename problem and PyPDF2 hiccups

After decrypting my database, I used menextract2pdf to write my annotations into the PDFs. I ran into a few errors:

`Could not find pdffile /Users/armin/Desktop/ProjekteOnHold/ceat/mendeley_archive/Mach - 1886 - BeitrÃ¤ge zur Analyse der Empfindungen.pdf`

This is an umlaut encoding issue: the filename's UTF-8 bytes are being read with the wrong encoding, so `ä` shows up as `Ã¤`. Adding `.decode("utf8")` on line 28 solved the problem for me.
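For anyone hitting the same thing, here is a minimal sketch of what the decode step does. It does not reproduce line 28 of menextract2pdf; the helper name `to_text` is made up for illustration, and I am assuming the path arrives as a UTF-8 byte string.

```python
def to_text(name, encoding="utf8"):
    """Return name as text, decoding only if it is a byte string.

    Hypothetical helper, not part of menextract2pdf.
    """
    if isinstance(name, bytes):
        return name.decode(encoding)
    return name


# The UTF-8 bytes of the word with the umlaut.
raw = "Beitr\u00e4ge".encode("utf8")

# Reading those bytes as Latin-1 produces the garbled form from the error message.
garbled = raw.decode("latin-1")

# Decoding them as UTF-8 restores the umlaut.
fixed = to_text(raw)
```

The guard on `bytes` means the helper can also be called on a name that is already text without raising.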


`zlib.error: Error -3 while decompressing data: incorrect header check`
and
`ValueError: invalid literal for int() with base 10: 'dobj'`

These errors were tied to specific, apparently corrupted PDFs. I added `print(fn)` to `processpdf(fn, fn_out, annotations)` so I could identify the culprits and remove them manually.
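An alternative to printing every filename would be to catch the exception per file and collect the failures, so one bad PDF does not stop the whole run. This is only a sketch: `run_all` and the `jobs` list are hypothetical, and `process` stands in for `processpdf`, which I am assuming can be called once per file with the three arguments shown above.

```python
def run_all(jobs, process):
    """Call process(fn, fn_out, annotations) for each job.

    Returns a list of (filename, error description) for the files that failed.
    Hypothetical wrapper, not part of menextract2pdf.
    """
    failed = []
    for fn, fn_out, annotations in jobs:
        try:
            process(fn, fn_out, annotations)
        except Exception as exc:
            # zlib.error and ValueError are both subclasses of Exception,
            # so both kinds of failure end up here.
            failed.append((fn, repr(exc)))
    return failed
```

Printing the returned list at the end gives the set of PDFs to inspect or remove by hand.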

Thank you for writing Menextract2pdf!
 
