Fix pypdf bug: AttributeError when /DR missing from AcroForm #77

@jhs

Bug Report for Upstream pypdf Project

Purpose: This issue tracks the need to report and fix a bug in the pypdf library (py-pdf/pypdf).


Summary

pypdf crashes with AttributeError: 'dict' object has no attribute 'get_object' when filling form fields in PDFs that lack a /DR (Default Resources) entry in the AcroForm dictionary and reference fonts not in CORE_FONT_METRICS.

Location

  • File: pypdf/generic/_appearance_stream.py
  • Line: 437
  • Function: TextStreamAppearance.from_text_annotation()
  • Version: pypdf 5.7.0+ (tested with 6.3.0+)

Root Cause

Lines 433-437 contain a type error:

# BUGGY CODE
document_resources = cast(
    dict[Any, Any],
    acro_form.get("/DR", {}),  # ← Returns plain dict {} if /DR missing
)
document_font_resources = document_resources.get_object().get("/Font", DictionaryObject()).get_object()

The Problem:

  1. typing.cast() is only a hint for the type checker. At runtime it returns its argument unchanged and does not convert the object
  2. acro_form.get("/DR", {}) returns:
    • A pypdf DictionaryObject if /DR exists (has .get_object() method)
    • A plain Python dict {} if /DR is missing (no .get_object() method)
  3. Line 437 unconditionally calls .get_object(), which crashes when the default {} is returned
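
The failure mode described above can be reproduced without pypdf. The sketch below is only an illustration: StandInDictionaryObject is a hypothetical stand-in for pypdf's DictionaryObject, not the real class, and it assumes Python 3.9+ for the dict[Any, Any] expression.

```python
from typing import Any, cast


class StandInDictionaryObject(dict):
    """Hypothetical stand-in for pypdf's DictionaryObject (not the real class)."""

    def get_object(self) -> "StandInDictionaryObject":
        return self


# An AcroForm stand-in that has no "/DR" key.
acro_form = StandInDictionaryObject({"/Fields": []})

# Same shape as the buggy lines: cast() hands back the plain {} default as-is.
document_resources = cast(dict[Any, Any], acro_form.get("/DR", {}))
print(type(document_resources).__name__)  # dict

error_message = ""
try:
    document_resources.get_object()
except AttributeError as exc:
    error_message = str(exc)

print(error_message)  # 'dict' object has no attribute 'get_object'
```

The cast() call changes nothing about the object, so the plain dict default reaches the .get_object() call untouched.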

Correct Pattern

The same file already shows the correct pattern on lines 419-428:

# CORRECT CODE (lines 419-428)
document_resources = cast(
    DictionaryObject,
    cast(
        DictionaryObject,
        acro_form.get("/DR", DictionaryObject()),  # ← Uses DictionaryObject() as default
    ).get_object(),  # ← Calls get_object() immediately
)

Proposed Fix

Replace lines 433-437 with:

document_resources = cast(
    DictionaryObject,
    cast(
        DictionaryObject,
        acro_form.get("/DR", DictionaryObject()),  # ← Change {} to DictionaryObject()
    ).get_object(),
)
document_font_resources = document_resources.get("/Font", DictionaryObject()).get_object()
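
As a rough check of the proposed lookup, the sketch below runs the same chain of calls against a stand-in class, once without /DR and once with it. StandInDictionaryObject and font_resources are hypothetical names for illustration only; the real fix would use pypdf's own DictionaryObject inside from_text_annotation().

```python
from typing import cast


class StandInDictionaryObject(dict):
    """Hypothetical stand-in for pypdf's DictionaryObject (not the real class)."""

    def get_object(self) -> "StandInDictionaryObject":
        return self


def font_resources(acro_form: StandInDictionaryObject) -> StandInDictionaryObject:
    # Mirrors the proposed fix: the default has get_object(), like a real /DR.
    document_resources = cast(
        StandInDictionaryObject,
        acro_form.get("/DR", StandInDictionaryObject()),
    ).get_object()
    return document_resources.get("/Font", StandInDictionaryObject()).get_object()


without_dr = StandInDictionaryObject()
with_dr = StandInDictionaryObject(
    {"/DR": StandInDictionaryObject({"/Font": StandInDictionaryObject({"/Helv": "font-ref"})})}
)

print(font_resources(without_dr))  # {}
print(font_resources(with_dr))     # {'/Helv': 'font-ref'}
```

With the typed default, a missing /DR yields an empty font dictionary instead of raising.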

Minimal Complete Verifiable Example (MCVE)

Reproduction Code

from pypdf import PdfReader, PdfWriter

# Create a PDF with form fields using PyMuPDF
import fitz

doc = fitz.open()  # open() with no arguments creates a new, empty PDF
page = doc.new_page(width=612, height=792)

# Add a text field with /Helv font (not in CORE_FONT_METRICS)
widget = fitz.Widget()
widget.field_type = fitz.PDF_WIDGET_TYPE_TEXT
widget.field_name = "test_field"
widget.rect = fitz.Rect(50, 50, 300, 70)
widget.text_fontsize = 10
widget.text_font = "helv"  # This becomes /Helv in default appearance
widget.fill_color = (1, 1, 1)
widget.border_color = (0.7, 0.7, 0.7)
page.add_widget(widget)

doc.save("test_form.pdf")
doc.close()

# Now try to fill the field with pypdf
reader = PdfReader("test_form.pdf")
writer = PdfWriter(clone_from=reader)

# This will crash with AttributeError
writer.update_page_form_field_values(
    writer.pages[0],
    {"test_field": "Test Value"},
    auto_regenerate=False
)

writer.set_need_appearances_writer(True)
with open("test_form_filled.pdf", "wb") as f:
    writer.write(f)

Error Output

Traceback (most recent call last):
  File "test.py", line 29, in <module>
    writer.update_page_form_field_values(
  File "/path/to/pypdf/_writer.py", line 1045, in update_page_form_field_values
    appearance_stream_obj = TextStreamAppearance.from_text_annotation(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/pypdf/generic/_appearance_stream.py", line 437, in from_text_annotation
    document_font_resources = document_resources.get_object().get("/Font", DictionaryObject()).get_object()
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'get_object'

PDF Characteristics

The bug occurs when:

  1. AcroForm dictionary has no /DR (Default Resources) entry
  2. Form field default appearance references a font not in CORE_FONT_METRICS (e.g., /Helv)
  3. Code reaches the fallback path at lines 431-437

CORE_FONT_METRICS includes: Helvetica, Courier, Times-Roman, Arial, etc. (but NOT "Helv")

Workaround

Until fixed, use fonts that are in CORE_FONT_METRICS:

  • Change /Helv to /Helvetica in PDF default appearance strings
  • Or add a /DR dictionary to the AcroForm with proper font resources
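
For the first workaround, the font name in a default appearance string can be swapped with a small string helper. This is a minimal sketch: replace_da_font is a hypothetical helper, not part of pypdf, and it only rewrites the string. Writing the result back to each field's /DA entry is left out here.

```python
import re


def replace_da_font(default_appearance: str, old: str = "/Helv", new: str = "/Helvetica") -> str:
    """Replace a font name token in a default appearance (DA) string.

    Hypothetical helper: only whole tokens are replaced, so "/Helvetica"
    or "/HelvBold" are left alone when old is "/Helv".
    """
    pattern = r"(?<!\S)" + re.escape(old) + r"(?=\s)"
    # A function replacement avoids backslash processing in the new name.
    return re.sub(pattern, lambda _match: new, default_appearance)


print(replace_da_font("/Helv 10 Tf 0 g"))  # /Helvetica 10 Tf 0 g
```

The token-boundary checks keep the helper from touching names that merely start with the old font name.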

Testing Strategy

A proper unit test should:

  1. Create a PDF with form fields lacking /DR in AcroForm
  2. Use a font reference not in CORE_FONT_METRICS (e.g., /Helv)
  3. Attempt to update field values using update_page_form_field_values()
  4. Verify it either succeeds or fails gracefully (not with AttributeError)

Labels for pypdf Issue

When reporting to pypdf, use labels:

  • is-bug
  • workflow-forms
  • needs-pdf (will attach sample PDF)

Action Items
