fix: handle Avro reader schema with no fields by mzabaluev · Pull Request #9611 · apache/arrow-rs

mzabaluev · 2026-03-24T14:47:26Z

Which issue does this PR close?

Closes #9608 (Avro decoder can't handle a reader schema with no fields).

Rationale for this change

In the degenerate case when the Avro reader schema has no fields, the RecordDecoder should be able to produce empty record batches with the number of rows counted from the data. As an optimization for OCF, the reader could skip decoding altogether, relying on record counts provided by data blocks.
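The OCF shortcut mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR or in arrow-avro: `read_long` and `count_rows` are hypothetical helper names. It relies only on the block layout from the Avro specification (object count as a `long`, payload byte size as a `long`, the payload, then a 16-byte sync marker), and it assumes the input slice starts right after the file header. The sync marker is skipped, not verified.

```rust
/// Length of the sync marker that terminates every OCF data block.
const SYNC_LEN: usize = 16;

/// Decode one Avro `long` (zigzag-encoded varint) from the front of `buf`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input is truncated or the varint is longer than 10 bytes.
fn read_long(buf: &[u8]) -> Option<(i64, usize)> {
    let mut acc: u64 = 0;
    for (i, &b) in buf.iter().enumerate().take(10) {
        acc |= u64::from(b & 0x7f) << (7 * i);
        if b & 0x80 == 0 {
            // Undo zigzag encoding.
            let value = ((acc >> 1) as i64) ^ -((acc & 1) as i64);
            return Some((value, i + 1));
        }
    }
    None
}

/// Hypothetical helper: sum the object counts of all data blocks in
/// `blocks` without decoding any payload. Returns `None` on malformed input.
pub fn count_rows(mut blocks: &[u8]) -> Option<u64> {
    let mut total = 0u64;
    while !blocks.is_empty() {
        let (count, n) = read_long(blocks)?;
        blocks = &blocks[n..];
        let (size, n) = read_long(blocks)?;
        blocks = &blocks[n..];
        // Skip the (possibly compressed) payload and the sync marker.
        let skip = usize::try_from(size).ok()?.checked_add(SYNC_LEN)?;
        blocks = blocks.get(skip..)?;
        total = total.checked_add(u64::try_from(count).ok()?)?;
    }
    Some(total)
}
```

Because the payload is never inspected, this also works on compressed blocks; a real reader would still want to validate the sync marker against the one in the file header.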

What changes are included in this PR?

A row counter is maintained in the RecordDecoder state.
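As a rough illustration of the idea, here is a minimal sketch assuming a decoder that tracks nothing but a row count. `EmptyBatch` and `RowCountingDecoder` are hypothetical names made up for this example; they stand in for a zero-column record batch and for the counter state, and do not reflect the actual RecordDecoder internals.

```rust
/// Hypothetical stand-in for a record batch with zero columns:
/// the only information it carries is the number of rows.
#[derive(Debug, PartialEq)]
pub struct EmptyBatch {
    pub num_rows: usize,
}

/// Hypothetical decoder state for a reader schema with no fields.
pub struct RowCountingDecoder {
    batch_size: usize,
    rows: usize,
}

impl RowCountingDecoder {
    pub fn new(batch_size: usize) -> Self {
        Self { batch_size, rows: 0 }
    }

    /// Record that up to `n` rows were consumed from the input.
    /// Returns how many were accepted, bounded by the remaining batch capacity.
    pub fn record_rows(&mut self, n: usize) -> usize {
        let accepted = n.min(self.batch_size - self.rows);
        self.rows += accepted;
        accepted
    }

    /// True once the current batch has reached `batch_size` rows.
    pub fn batch_is_full(&self) -> bool {
        self.rows >= self.batch_size
    }

    /// Emit the counted rows as a column-less batch and reset the counter.
    /// Returns `None` when no rows have been counted since the last flush.
    pub fn flush(&mut self) -> Option<EmptyBatch> {
        if self.rows == 0 {
            return None;
        }
        let num_rows = std::mem::take(&mut self.rows);
        Some(EmptyBatch { num_rows })
    }
}
```

With a batch size of 4, recording 3 rows and then 3 more accepts only 1 of the second group, fills the batch, and the next flush yields a batch of 4 rows with no columns.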

Are these changes tested?

Added tests to verify decoder behavior given an empty reader schema for the data files in the test suite.

Are there any user-facing changes?

No.

rluvaton · 2026-04-06T16:35:21Z

@@ -512,7 +512,7 @@ impl<R: AsyncFileReader + Unpin + 'static> AsyncAvroFileReader<R> {
                    // We have a full batch ready, emit it
                    // (This is not mutually exclusive with the block being finished, so the state change is valid)
                    if self.decoder.batch_is_full() {
-                        return match self.decoder.flush() {
+                        return match self.decoder.flush_block() {


Why changed to flush_block?

It was something I noticed while tweaking: the OCF reader does not need the schema-updating code in flush. The method it calls to decode is decode_block, the logical companion to which is provided as flush_block.

Is it required for the fix? If not, can you please create a separate PR for this?

Moved out to #9726.

alamb · 2026-04-06T19:48:29Z

fyi @jecsand838

alamb · 2026-04-14T20:51:23Z

Marking as draft, as I think this PR is no longer waiting on feedback and I am trying to make it easier to find PRs in need of review. Please mark it as ready for review when it is ready for another look.

github-actions bot added the arrow and arrow-avro labels Mar 24, 2026

mzabaluev-flarion force-pushed the avro-empty-reader-schema branch from f8a3f4a to c518a19 Compare March 24, 2026 17:32

rluvaton reviewed Apr 6, 2026

alamb marked this pull request as draft April 14, 2026 20:51

fix: handle Avro reader schema with no fields

9175874

In the degenerate case when the Avro reader schema has no fields, the RecordDecoder should be able to produce empty record batches with the number of rows counted from the data.

mzabaluev-flarion force-pushed the avro-empty-reader-schema branch from 7be6a71 to 9175874 Compare April 15, 2026 12:23

mzabaluev marked this pull request as ready for review April 15, 2026 12:30

mzabaluev wants to merge 1 commit into apache:main from mzabaluev:avro-empty-reader-schema
