All notable changes in pdfminer.six will be documented in this file.
The format is based on Keep a Changelog.
- Wrong order of text box grouping introduced by PR #315 (#335)
- Simple wrapper to easily extract text from a PDF file (#330)
- Support for extracting JBIG2 encoded images (#311 and #46)
- Sphinx documentation that is published on Read the Docs (#329)
- Unhandled AssertionError when dumping pdf containing reference to object id 0 (#318)
- Debug flag actually changes logging level to debug for pdf2txt.py and dumppdf.py (#325)
- Using argparse instead of getopt for command line interface of dumppdf.py (#321)
- Refactor `LTLayoutContainer.group_textboxes` for a significant speed up in layout analysis (#315)
- Files for external applications such as django, cgi and pyinstaller (#314)
- Support for Python 2 is dropped on January 1st, 2020 (#307)
- Contribution guidelines in CONTRIBUTING.md (#259)
- Support new encodings OneByteEncoding and DLIdent for CMaps (#283)
- Use `six.iteritems()` instead of `dict().iteritems()` to ensure Python2 and Python3 compatibility (#274)
- Properly convert Adobe Glyph names to unicode characters (#263)
- Allow CMap to be a content stream (#283)
- Resolve indirect objects for width and bounding boxes for fonts (#273)
- Actually updating stroke color in graphic state (#298)
- Interpret (invalid) negative font descent as a positive descent (#203)
- Correct colorspace comparison for images (#132)
- Allow for bounding boxes with zero height or width by removing assertion (#246)