upstream: umya-spreadsheet panics and parse failures on valid XLSX files #83

@developer0hye

Summary

22 XLSX files fail due to bugs in the upstream umya-spreadsheet (v2.3.3) crate. 15 files cause panics (caught by our catch_unwind wrapper), and 7 files return parse errors. These are not fixable in our codebase without patching or replacing the upstream library.

Panicking files (15 files) — caught by catch_unwind

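All of these would crash the process without our catch_unwind wrapper. Thirteen of the panics originate from unwrap() calls inside umya-spreadsheet; the two dataBar cases come from an explicit panic!() in its conditional formatting parser.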
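For readers unfamiliar with the pattern, here is a minimal sketch of what a catch_unwind wrapper of this kind can look like. It is not our actual wrapper: `ConvertError` and `catch_parse_panic` are hypothetical names chosen for illustration, and the closure stands in for whatever call into umya-spreadsheet is being guarded.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical error type for this sketch; not part of umya-spreadsheet.
#[derive(Debug, PartialEq)]
enum ConvertError {
    /// The library returned an ordinary error.
    Parse(String),
    /// The library panicked; the payload message is preserved when available.
    Panic(String),
}

/// Runs `f` and turns a panic inside it into `ConvertError::Panic`
/// instead of letting it unwind into the caller.
fn catch_parse_panic<T, F>(f: F) -> Result<T, ConvertError>
where
    F: FnOnce() -> Result<T, String>,
{
    match panic::catch_unwind(AssertUnwindSafe(f)) {
        Ok(Ok(value)) => Ok(value),
        Ok(Err(msg)) => Err(ConvertError::Parse(msg)),
        Err(payload) => {
            // Panic payloads are usually `&str` or `String`.
            let msg = if let Some(s) = payload.downcast_ref::<&str>() {
                (*s).to_string()
            } else if let Some(s) = payload.downcast_ref::<String>() {
                s.clone()
            } else {
                "unknown panic payload".to_string()
            };
            Err(ConvertError::Panic(msg))
        }
    }
}
```

Note that catch_unwind only intercepts unwinding panics. If the binary is built with `panic = "abort"`, the process still terminates, so a wrapper like this depends on the default unwind strategy.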

| File | Panic message | Root cause |
| --- | --- | --- |
| xlsx/libreoffice/chart_hyperlink.xlsx | unwrap() on Err: FileNotFound | Missing expected zip entry for chart hyperlink |
| xlsx/libreoffice/hyperlink.xlsx | unwrap() on Err: FileNotFound | Missing expected zip entry for hyperlink |
| xlsx/libreoffice/tdf130959.xlsx | unwrap() on Err: FileNotFound | Missing expected zip entry |
| xlsx/libreoffice/test_115192.xlsx | unwrap() on Err: FileNotFound | Missing expected zip entry |
| xlsx/poi/47504.xlsx | unwrap() on Err: FileNotFound | Missing expected zip entry |
| xlsx/poi/bug63189.xlsx | unwrap() on Err: FileNotFound | Missing expected zip entry |
| xlsx/poi/ConditionalFormattingSamples.xlsx | unwrap() on Err: FileNotFound | Missing expected zip entry |
| xlsx/libreoffice/check-boolean.xlsx | unwrap() on Err: ParseFloatError | Boolean cell value parsed as float |
| xlsx/libreoffice/functions-excel-2010.xlsx | unwrap() on Err: ParseIntError PosOverflow | Integer overflow parsing cell reference |
| xlsx/poi/FormulaEvalTestData_Copy.xlsx | unwrap() on Err: ParseIntError PosOverflow | Integer overflow parsing cell reference |
| xlsx/libreoffice/tdf100709.xlsx | unwrap() on None | Expected element missing |
| xlsx/poi/64450.xlsx | unwrap() on None | Expected element missing |
| xlsx/poi/sample-beta.xlsx | unwrap() on None | Expected element missing |
| xlsx/libreoffice/tdf162948.xlsx | Could not find dataBar end element | Malformed conditional formatting XML |
| xlsx/poi/NewStyleConditionalFormattings.xlsx | Could not find dataBar end element | Malformed conditional formatting XML |

Panic root causes (grouped)

  • FileNotFound (7 files): umya-spreadsheet calls unwrap() when looking up zip entries by relationship ID. If the referenced file doesn't exist in the archive (e.g., external hyperlinks, chart data), it panics instead of returning None/Err.
  • ParseFloatError / ParseIntError (3 files): unwrap() on str::parse::<f64>() or str::parse::<u32>() for cell values/references that contain non-numeric data or exceed u32::MAX.
  • unwrap() on None (3 files): Expected XML elements/attributes missing from the document structure.
  • dataBar end element (2 files): Conditional formatting parser can't find closing tag for <dataBar> — panics via panic!() macro.
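To make the upstream fix direction concrete, the sketch below shows the general shape of replacing unwrap() with fallible lookups and parses. None of this is umya-spreadsheet code: `ReadError`, `find_entry`, `parse_row`, and `parse_bool_cell` are hypothetical names, a `HashMap` stands in for the zip archive, and the accepted boolean spellings are an assumption made for illustration.

```rust
use std::collections::HashMap;
use std::num::ParseIntError;

/// Hypothetical error type; the real crate has its own error enum.
#[derive(Debug)]
enum ReadError {
    /// A relationship pointed at an entry that is not in the archive.
    MissingEntry(String),
    /// A row number was non-numeric or did not fit in u32.
    BadRow(ParseIntError),
}

/// Looks up an archive entry by path and returns Err instead of panicking
/// when the entry is absent (the FileNotFound group).
fn find_entry<'a>(
    archive: &'a HashMap<String, Vec<u8>>,
    path: &str,
) -> Result<&'a [u8], ReadError> {
    archive
        .get(path)
        .map(Vec::as_slice)
        .ok_or_else(|| ReadError::MissingEntry(path.to_string()))
}

/// Parses the numeric part of a cell reference, propagating overflow
/// as an error (the ParseIntError PosOverflow group).
fn parse_row(digits: &str) -> Result<u32, ReadError> {
    digits.parse::<u32>().map_err(ReadError::BadRow)
}

/// Reads a boolean cell without routing it through f64 parsing
/// (the ParseFloatError group). Accepted spellings are an assumption.
fn parse_bool_cell(raw: &str) -> Option<bool> {
    match raw.trim() {
        "1" => Some(true),
        "0" => Some(false),
        s if s.eq_ignore_ascii_case("true") => Some(true),
        s if s.eq_ignore_ascii_case("false") => Some(false),
        _ => None,
    }
}
```

With this shape, a caller can decide per case whether a missing entry or an out-of-range row is fatal or can be skipped, rather than having the decision made by a panic deep inside the parser.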

Parse error files (7 files) — non-panic failures

These return Err from umya-spreadsheet without panicking, but still fail conversion.

| File | Error | Root cause |
| --- | --- | --- |
| xlsx/libreoffice/forcepoint107.xlsx | IoError: Invalid checksum | Corrupted zip entry within a valid zip container |
| xlsx/libreoffice/tdf121887.xlsx | ZipError: specified file not found in archive | Missing expected entry (e.g., xl/worksheets/sheet1.xml) |
| xlsx/libreoffice/tdf131575.xlsx | ZipError: specified file not found in archive | Missing expected entry |
| xlsx/libreoffice/tdf76115.xlsx | ZipError: specified file not found in archive | Missing expected entry |
| xlsx/poi/49609.xlsx | ZipError: specified file not found in archive | Missing expected entry |
| xlsx/poi/56278.xlsx | ZipError: specified file not found in archive | Missing expected entry |
| xlsx/poi/59021.xlsx | ZipError: specified file not found in archive | Missing expected entry |

Possible mitigations

  1. Short-term: Already handled gracefully — panics are caught, errors are returned. No process crashes.
  2. Medium-term: Contribute fixes upstream to umya-spreadsheet to replace unwrap() calls with proper error handling.
  3. Long-term: Consider alternative XLSX parsing libraries (e.g., calamine for read-only parsing) or a custom OOXML parser.

Impact

22 / 2,831 files (0.8%) — all XLSX.

Related: #77

Labels: upstream (Issue caused by upstream dependency)
