docx2python/changelog at master · manuelguzmandao/docx2python · GitHub

---- version 1.25 - 200820 Added support for Table of Contents text

A docx table of contents is built like a set of hyperlinks, with each hyperlink element having an anchor (internal
link) instead of an href (external link).

Previously any document with a Table of Contents would fail with
`KeyError: '{http://schemas.openxmlformats.org/officeDocument/2006/relationships}id'` after failing to find an href.
Now, docx2python will continue without warning if an href is not found in a hyperlink element. If an href is found,
docx2python will print the href inside '<a href="{}">' as before. Anchor (internal link) elements are meaningless
outside the docx and are therefore ignored by docx2python.
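
The behavior described above might look roughly like the following sketch. It is not the docx2python source: `hyperlink_text`, the `rels` dictionary (relationship id to URL), and the closing `</a>` tag are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
R_ID = f"{{{R_NS}}}id"  # the key named in the old KeyError message


def hyperlink_text(element, rels):
    """Return hyperlink text, wrapped in an <a> tag only when an href exists.

    `rels` is a hypothetical dict mapping relationship ids to external URLs.
    """
    text = "".join(element.itertext())
    try:
        href = rels[element.attrib[R_ID]]
    except KeyError:
        # Anchor-only (internal) link, e.g. a Table of Contents entry:
        # keep the text, drop the link.
        return text
    return '<a href="{}">{}</a>'.format(href, text)


xml = (
    f'<w:p xmlns:w="{W_NS}" xmlns:r="{R_NS}">'
    '<w:hyperlink r:id="rId4"><w:r><w:t>external</w:t></w:r></w:hyperlink>'
    '<w:hyperlink w:anchor="_Toc1"><w:r><w:t>Chapter 1</w:t></w:r></w:hyperlink>'
    "</w:p>"
)
paragraph = ET.fromstring(xml)
rels = {"rId4": "https://example.com"}
results = [hyperlink_text(link, rels) for link in paragraph]
```

Here the first hyperlink carries an `r:id` and is rendered with its href, while the second has only an anchor and comes through as plain text.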


---- version 1.26 - 201005 Continue (with bullet) when numbering-format lookup fails

Some documents created in Pages use a different indexing scheme to specify numbered-list formats and values. I cannot
infer formats or values from such files without potentially changing existing behavior. The previous behavior in such
cases was to fail with an IndexError. v1.26 will now replace any numbering format with a "bullet" (--) when the format
or value cannot be inferred.

This will only happen where the program would previously have failed with an IndexError, so no previous behavior (which
allowed the program to complete) has been altered.
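
A minimal sketch of this fallback, assuming the numbering formats are held in a nested list indexed by list id and indent level. The function name `list_marker`, the data layout, and the `"1)"` rendering of decimal numbers are hypothetical, not taken from docx2python.

```python
def list_marker(formats, num_id, level, count):
    """Return the marker for a list item, or a bullet if the lookup fails.

    `formats[num_id][level]` is assumed to hold a format name such as
    "decimal". This layout is an illustration only.
    """
    try:
        fmt = formats[num_id][level]
    except IndexError:
        # Format cannot be inferred (e.g. an unfamiliar indexing scheme):
        # fall back to a bullet instead of failing.
        return "--"
    if fmt == "decimal":
        return f"{count})"
    return "--"


formats = [["decimal", "bullet"]]
known = list_marker(formats, 0, 0, 3)    # lookup succeeds
unknown = list_marker(formats, 7, 0, 1)  # num_id out of range
```

Only the `except IndexError` branch is new behavior in this sketch, which mirrors the point that inputs which previously succeeded are unaffected.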


---- version 1.27 - 201102 Continue when document properties are not found

`docx2python(file).properties` returns a dictionary of document properties (e.g., {'Author': 'Shay Hill'}). Google Docs
(and perhaps others) do not store such properties. When document properties cannot be found, v1.27 will continue and
return an empty dictionary for `docx2python(file).properties`.

This will only happen where the program would previously have failed with a KeyError, so no previous behavior (which
allowed the program to complete) has been altered.
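
The idea can be sketched with the standard library alone, assuming the properties live in `docProps/core.xml` inside the docx archive. `read_properties` and `make_archive` are hypothetical helpers, and the key names here come straight from the XML tags, so they may differ from what docx2python reports.

```python
import io
import xml.etree.ElementTree as ET
import zipfile


def read_properties(docx_bytes):
    """Return core document properties, or {} when the file has none."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        try:
            core = archive.read("docProps/core.xml")
        except KeyError:
            # ZipFile.read raises KeyError when the member is absent.
            return {}
    root = ET.fromstring(core)
    # Strip the XML namespace from each tag to get a plain key.
    return {child.tag.split("}")[-1]: child.text for child in root}


def make_archive(members):
    """Build an in-memory zip archive (test helper, not part of any API)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in members.items():
            archive.writestr(name, data)
    return buffer.getvalue()


CORE = (
    "<cp:coreProperties "
    'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:creator>Shay Hill</dc:creator>"
    "</cp:coreProperties>"
)
with_props = read_properties(make_archive({"docProps/core.xml": CORE}))
without_props = read_properties(make_archive({"word/document.xml": "<doc/>"}))
```

The second archive has no properties part, so the lookup falls through to the empty dictionary rather than raising.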


---- version 1.27.1 - 201115 Continue when image r:id is not found

A user found a docx `imagedata` element with a missing `r:id` attribute. The `r:id` value gives the location of an image
filename. I presume this `imagedata` element is a vector graphic, which `docx2python` does not and will not support.
This makes two out of three `r:id` lookup positions (`hyperlink`, `image`, and `imagedata`) for which users have found
absent `r:id`. None so far have contained anything meaningful for text export (internal links in a previous case and
vector graphics in this case). Now all `r:id` lookups take place within a `suppress(KeyError)` context.

This will only happen where the program would previously have failed with a KeyError, so no previous behavior (which
allowed the program to complete) has been altered.
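
A short sketch of an `r:id` lookup guarded by `contextlib.suppress`, assuming the element attributes and the relationship table are plain dictionaries. `image_filename` and `rels` are illustrative names, not docx2python internals.

```python
from contextlib import suppress

R_ID = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}id"


def image_filename(attrib, rels):
    """Return the image filename for an element, or '' if r:id is absent.

    `attrib` is the element's attribute dict and `rels` is a hypothetical
    mapping of relationship ids to filenames.
    """
    with suppress(KeyError):
        # Either lookup may raise KeyError: a missing r:id attribute or
        # an id with no matching relationship.
        return rels[attrib[R_ID]]
    return ""


rels = {"rId7": "media/image1.png"}
found = image_filename({R_ID: "rId7"}, rels)
missing = image_filename({}, rels)
```

Because `suppress` only swallows `KeyError`, any other exception still propagates, and a successful lookup returns exactly what it did before.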