DOCX: embedded fonts from word/fonts/ directory not loaded #104

@developer0hye

Description

DOCX files can embed font files (TTF/OTF) in the word/fonts/ directory, referenced via word/fontTable.xml and word/_rels/fontTable.xml.rels. These embedded fonts are completely ignored during conversion.

Root Cause

The DOCX parser does not extract or load fonts from the ZIP archive. Searching for embedded.*font, fontTable, or fonts/ in docx.rs yields zero matches.

The renderer (crates/office2pdf/src/render/pdf.rs) supports font_paths for additional font directories but only loads system fonts by default. Embedded DOCX fonts are never passed to the Typst compiler.

DOCX structure example:

word/fonts/CustomFont-regular.ttf
word/fonts/CustomFont-bold.ttf
word/fonts/CustomFont-italic.ttf
word/fonts/CustomFont-boldItalic.ttf
word/fontTable.xml           (references the font families)
word/_rels/fontTable.xml.rels (maps rIds to font files)
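As a rough illustration of the layout above, the sketch below picks the font entries out of a list of ZIP entry names. The helper names `is_embedded_font_entry` and `embedded_font_entries` are hypothetical and not part of office2pdf. It assumes the entry names have already been read from the archive by whatever ZIP reader the parser uses. It only matches plain .ttf/.otf files, as described in this issue; Word itself typically writes embedded fonts as obfuscated .odttf files, which would need de-obfuscation first and are not covered here.

```rust
use std::path::Path;

/// Returns true if a ZIP entry name looks like an embedded font file
/// directly under `word/fonts/`. Hypothetical helper, not office2pdf API.
fn is_embedded_font_entry(entry_name: &str) -> bool {
    let Some(rest) = entry_name.strip_prefix("word/fonts/") else {
        return false;
    };
    // Skip the directory entry itself and anything in nested directories.
    if rest.is_empty() || rest.contains('/') {
        return false;
    }
    // Compare the extension case-insensitively.
    match Path::new(rest).extension().and_then(|e| e.to_str()) {
        Some(ext) => matches!(ext.to_ascii_lowercase().as_str(), "ttf" | "otf"),
        None => false,
    }
}

/// Filters a list of archive entry names down to embedded font files,
/// preserving their original order.
fn embedded_font_entries<'a>(entry_names: &[&'a str]) -> Vec<&'a str> {
    entry_names
        .iter()
        .copied()
        .filter(|name| is_embedded_font_entry(name))
        .collect()
}
```

Matching on the path prefix alone is a simplification. A more faithful implementation would probably follow the relationship IDs in word/_rels/fontTable.xml.rels, since that file is what ties each font family in word/fontTable.xml to a concrete file.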

Expected

The PDF output uses the embedded fonts from the DOCX file.

Actual

Fallback fonts are used when the specified fonts are not installed on the system, causing visual differences in the output.

Fix Suggestion

  1. During DOCX parsing, extract the font files from word/fonts/ in the ZIP archive.
  2. Pass them to the Typst compiler as additional font data (similar to how images are passed as ImageAsset).
  3. Alternatively, instead of step 2, extract them to a temp directory and pass that directory via font_paths.
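The temp-directory variant could look roughly like the sketch below. `EmbeddedFont` and `write_fonts_to_dir` are hypothetical names introduced for illustration; they are not existing office2pdf types. The sketch assumes the font bytes have already been read out of the archive, and that the caller appends the returned directory to `font_paths`.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical in-memory font record, loosely analogous to `ImageAsset`.
struct EmbeddedFont {
    /// File name inside `word/fonts/`, e.g. "CustomFont-regular.ttf".
    file_name: String,
    /// Raw font bytes read from the ZIP archive.
    data: Vec<u8>,
}

/// Writes each font into `dir` (created if missing) and returns the
/// directory path, which a caller could then add to `font_paths`.
fn write_fonts_to_dir(fonts: &[EmbeddedFont], dir: &Path) -> io::Result<PathBuf> {
    fs::create_dir_all(dir)?;
    for font in fonts {
        // Keep only the final path component so that an entry name
        // such as "../x.ttf" cannot write outside `dir`.
        let Some(safe_name) = Path::new(&font.file_name).file_name() else {
            continue;
        };
        fs::write(dir.join(safe_name), &font.data)?;
    }
    Ok(dir.to_path_buf())
}
```

Writing to disk reuses the existing `font_paths` mechanism without touching the renderer, at the cost of temp-file cleanup. Passing the bytes in memory (step 2) avoids the filesystem but would need the renderer to accept font data directly.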

Labels: bug
