Fix: Nested struct parsing fails to preserve nested fields (Issue #627) by laughingman7743 · Pull Request #628 · pyathena-dev/PyAthena

laughingman7743 · 2025-11-17T13:17:33Z

Summary

Fixes nested STRUCT (ROW) type parsing where nested fields were being lost during data conversion.

Problem

Issue #627 reported incorrect results when querying tables with nested structs such as:

positions = Column(
    "positions",
    AthenaArray(
        AthenaStruct(
            ("header", AthenaStruct(("stamp", AthenaTimestamp))),
            ("x", Float),
            ("y", Float)
        )
    ),
)

The actual data {header={stamp=xyz, seq=123}, x=4.736, y=0.583} was parsed as {'x': 4.736, 'y': 0.583}, dropping the entire header field.

Root Cause

The _parse_named_struct function in pyathena/converter.py had two issues:

1. Simple comma splitting - Used inner.split(",") which incorrectly split nested structures:

   Input: "header={stamp=2024-01-01, seq=123}, x=4.736"
   Wrong split: ["header={stamp=2024-01-01", " seq=123}", " x=4.736"]

2. Brace filtering - Skipped any key-value pairs containing { or } characters, removing all nested fields
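
The two issues can be reproduced in isolation. The sketch below is a hypothetical reconstruction written for illustration, not the original PyAthena code: naive_parse is an invented name, and type conversion is omitted, so values stay as strings.

```python
def naive_parse(struct_text: str) -> dict:
    """Hypothetical reconstruction of the pre-fix behaviour (illustration only)."""
    inner = struct_text.strip()[1:-1]  # drop the outer { }
    result = {}
    # Issue 1: a plain split ignores brace depth and cuts through nested structs.
    for pair in inner.split(","):
        # Issue 2: any fragment containing a brace is skipped outright.
        if "{" in pair or "}" in pair:
            continue
        if "=" not in pair:
            continue
        key, value = pair.split("=", 1)
        result[key.strip()] = value.strip()  # no type conversion in this sketch
    return result


text = "{header={stamp=2024-01-01, seq=123}, x=4.736}"

pieces = text[1:-1].split(",")
print(pieces)             # ['header={stamp=2024-01-01', ' seq=123}', ' x=4.736']
print(naive_parse(text))  # {'x': '4.736'}
```

Both fragments of the nested header value contain a brace, so both are discarded and only x survives.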

Solution

Updated _parse_named_struct to use _split_array_items helper for proper brace-depth-aware splitting
Added recursive parsing: when a value looks like a struct ({...}), call _to_struct recursively
Updated docstring to document nested struct support
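
A minimal sketch of the approach described above, assuming a simplified grammar of key=value pairs with optional nested {...} values. The names split_top_level, convert_scalar and parse_struct are hypothetical stand-ins; the actual _split_array_items and _to_struct in pyathena/converter.py may differ in signature and in how they convert scalar values.

```python
from typing import Any, Dict, List


def split_top_level(text: str) -> List[str]:
    """Split on commas that are not inside braces or brackets (hypothetical helper)."""
    items: List[str] = []
    current: List[str] = []
    depth = 0
    for ch in text:
        if ch in "{[":
            depth += 1
        elif ch in "}]":
            depth -= 1
        if ch == "," and depth == 0:
            # Only a top-level comma ends the current item.
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        items.append(tail)
    return items


def convert_scalar(value: str) -> Any:
    """Try int, then float, else keep the string (an assumed conversion policy)."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_struct(text: str) -> Dict[str, Any]:
    """Parse '{k=v, k2={...}}' into a dict, recursing into nested structs."""
    inner = text.strip()[1:-1]  # drop the outer { }
    result: Dict[str, Any] = {}
    for item in split_top_level(inner):
        if "=" not in item:
            continue
        key, value = item.split("=", 1)
        value = value.strip()
        if value.startswith("{") and value.endswith("}"):
            result[key.strip()] = parse_struct(value)  # nested struct
        else:
            result[key.strip()] = convert_scalar(value)
    return result


print(parse_struct("{header={stamp=2024-01-01, seq=123}, x=4.736}"))
# {'header': {'stamp': '2024-01-01', 'seq': 123}, 'x': 4.736}
```

Tracking depth while scanning keeps the comma inside header={...} from ending the item, and the recursive call handles any nesting level without special cases.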

Changes

pyathena/converter.py: Modified _parse_named_struct function
tests/pyathena/test_converter.py: Added 10 test cases for nested structs
tests/pyathena/sqlalchemy/test_base.py: Added 2 integration tests with real Athena queries

Testing

All tests pass:

✅ 71 converter tests (including 10 new nested struct tests)
✅ 2 new SQLAlchemy integration tests with Athena queries
✅ All existing tests pass without regression
✅ Lint/format/type checks pass

Test Coverage

Converter tests (test_converter.py):

Single-level nesting: {header={stamp=..., seq=...}, x=..., y=...}
Double nesting: {outer={middle={inner=value}}}
Triple nesting: {level1={level2={level3=...}}}
Multiple nested fields: {pos={x, y}, vel={x, y}, timestamp=...}
Arrays with nested structs: [{header={...}, x=...}]

SQLAlchemy integration tests (test_base.py):

Query execution with nested ROW types
Array query with nested structs

Verification

# Before fix
>>> from pyathena.converter import _to_struct
>>> _to_struct("{header={stamp=2024-01-01, seq=123}, x=4.736}")
{'x': 4.736}  # header is lost!

# After fix
>>> _to_struct("{header={stamp=2024-01-01, seq=123}, x=4.736}")
{'header': {'stamp': '2024-01-01', 'seq': 123}, 'x': 4.736}  # ✓ Correct!

Fixes #627

🤖 Generated with Claude Code

This commit fixes a critical bug where nested STRUCT (ROW) types were not being parsed correctly, causing nested fields to be lost during data conversion.

## Problem

The `_parse_named_struct` function in `pyathena/converter.py` was using simple comma-splitting which failed for nested structures like:

`{header={stamp=2024-01-01, seq=123}, x=4.736}`

This caused:

1. Incorrect splitting at commas inside nested braces
2. Nested fields being skipped due to brace-containing value filtering

## Solution

- Updated `_parse_named_struct` to use `_split_array_items` for proper brace-depth-aware splitting
- Added recursive parsing for nested struct values
- Updated docstring to document nested struct support

## Testing

Added comprehensive test cases:

- Converter tests: 7 nested struct patterns + 3 array patterns
- SQLAlchemy integration tests: Query execution with nested ROW types

All existing tests pass without regression.

Fixes #627

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

laughingman7743 mentioned this pull request Nov 17, 2025

Undesired behaviour with AthenaArray/AthenaStruct #627

Closed

laughingman7743 marked this pull request as ready for review November 17, 2025 13:18

laughingman7743 merged commit 8d4e41c into master Nov 17, 2025
5 checks passed

laughingman7743 deleted the fix/nested-struct-parsing-issue-627 branch November 17, 2025 13:34

laughingman7743 merged 1 commit into master from fix/nested-struct-parsing-issue-627
