feat!: Support enums and tuples in SchemaAwareSerializer and implement SchemaAwareDeserializer #65

martin-augment wants to merge 12 commits into main

Conversation
… documented model (no tests)
…ng `#[avro(with)]`
…se `fake_variadic`
---------

Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
Walkthrough

This pull request adds schema-aware serialization and deserialization infrastructure to the apache_avro crate. It introduces new …
Cursor Bugbot has reviewed your changes and found 2 potential issues.
```rust
);

self.block.read_next_deser(self.reader_schema)
}
```
Reader passes reader_schema causing unexpected panic
High Severity
read_next_deser passes self.reader_schema directly to block.read_next_deser, but block.read_next_deser panics whenever read_schema.is_some(). When a user provides a reader_schema that equals the writer_schema, should_resolve_schema is false (so the assert passes), but the block still receives Some(...) and panics. The existing read_next method correctly passes None when !should_resolve_schema, but read_next_deser does not replicate this filtering logic.
Additional Locations (1)
value:useful; category:bug; feedback: The Bugbot AI reviewer is correct! The new serde deserializer does not support reader schema and this is explained in a TODO. But instead of panicking and crashing the application it would be better to return an Err.
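The filtering the reviewers describe can be sketched as follows. This is a minimal, hypothetical illustration using a stand-in `Schema` type (not the crate's real types or function signatures): only forward a reader schema downstream when resolution is actually required, mirroring what `read_next` already does.

```rust
// Simplified stand-in for the crate's Schema type (illustration only).
#[derive(PartialEq, Debug)]
struct Schema(&'static str);

/// Decide which reader schema to pass to the block reader.
/// When the configured reader schema equals the writer schema, no
/// resolution is needed, so `None` is forwarded instead of `Some(...)`,
/// avoiding the panic path described in the finding above.
fn effective_read_schema<'a>(
    writer_schema: &Schema,
    reader_schema: Option<&'a Schema>,
) -> Option<&'a Schema> {
    match reader_schema {
        // Schemas match: skip resolution entirely.
        Some(r) if r == writer_schema => None,
        other => other,
    }
}
```

With this filter in place, a reader schema that happens to equal the writer schema degrades gracefully to the no-resolution path instead of triggering the panic.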
```rust
&& union.is_nullable()
{
    let index = zag_i32(self.reader)?;
    let schema = union.get_variant(index as usize)?;
```
Unsafe as usize cast on potentially negative index
Medium Severity
In deserialize_option, index as usize is used to convert the result of zag_i32 without checking for negative values. Elsewhere in this file (e.g., with_union on line 137), usize::try_from(index) is correctly used to produce a proper error on negative indices. With corrupt data, the as usize cast silently wraps a negative value, leading to a confusing out-of-bounds error message instead of the expected ConvertI32ToUsize error.
value:useful; category:bug; feedback: The Bugbot AI reviewer is correct! Any casting from a wider integer type to a narrower one should be done by using the checked APIs and report Err in case the value does not fit in the target type. Prevents silently overflowing the value and using a totally wrong value for any following operations with that number.
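The checked conversion the reviewers recommend can be sketched like this (the `String` error is a hypothetical stand-in for the crate's `ConvertI32ToUsize` error variant):

```rust
/// Convert a decoded union variant index to usize, rejecting negative
/// values instead of letting `as usize` sign-extend and wrap them into
/// huge, meaningless indices.
fn variant_index(index: i32) -> Result<usize, String> {
    usize::try_from(index)
        .map_err(|_| format!("ConvertI32ToUsize: invalid union index {index}"))
}
```

The difference matters with corrupt data: `-1i32 as usize` silently becomes `usize::MAX`, producing a confusing out-of-bounds error far from the real cause, while the checked conversion fails immediately with a precise message.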
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces schema awareness to the serialization and deserialization processes, enabling more robust and reliable data handling. It includes support for enums and tuples, along with comprehensive documentation to guide developers in mapping Avro schemas to Rust types and vice versa. The changes enhance the library's ability to work with complex data structures while maintaining type safety.
🤖 Augment PR Summary

Summary: This PR expands schema-aware Serde support by adding a raw-Avro …

Changes: …

Technical Notes: Several reader-schema resolution paths are still marked WIP (panicking when resolution would be required), and documentation modules were added to clarify Serde↔Avro data model mappings.
```rust
    "Schema aware deserialisation does not resolve schemas yet"
);

self.block.read_next_deser(self.reader_schema)
```
read_next_deser always forwards self.reader_schema into Block::read_next_deser, but Block::read_next_deser panics whenever read_schema.is_some() (even when should_resolve_schema is false because schemas match). This makes Reader::into_deser_iter panic for a valid configured reader_schema that happens to equal the writer schema.
Severity: high
Other Locations
avro/src/reader/block.rs:232
value:useful; category:bug; feedback: The Augment AI reviewer is correct! The new serde deserializer does not support reader schema and this is explained in a TODO. But instead of panicking and crashing the application it would be better to return an Err.
```rust
&& union.is_nullable()
{
    let index = zag_i32(self.reader)?;
    let schema = union.get_variant(index as usize)?;
```
index is an i32 but is cast to usize (index as usize), which can wrap negative values and yield confusing out-of-bounds behavior. Other union paths here use usize::try_from(index) with a dedicated conversion error, which seems safer/consistent.
Severity: medium
value:useful; category:bug; feedback: The Augment AI reviewer is correct! Any casting from a wider integer type to a narrower one should be done by using the checked APIs and report Err in case the value does not fit in the target type. Prevents silently overflowing the value and using a totally wrong value for any following operations with that number.
```rust
{
    match self.schema {
        Schema::Record(record) if record.name.name() == name && record.fields.len() == len => {
            visitor.visit_map(RecordDeserializer::new(self.reader, record, self.config))
```
deserialize_tuple_struct calls visitor.visit_map(...), but tuple-struct Deserialize implementations typically expect sequence access from deserialize_tuple_struct. If the visitor doesn’t implement visit_map, tuple-struct deserialization from Avro records will fail at runtime.
Severity: medium
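A stripped-down illustration of the mismatch, using hypothetical stand-ins rather than serde's real traits: a derived tuple-struct visitor only knows how to consume a sequence, so handing it map entries fails at runtime.

```rust
// Minimal stand-in for a serde-style visitor. Tuple-struct visitors
// implement sequence access; map access falls back to an error,
// mirroring serde's default "invalid type" behavior.
trait Visitor {
    type Value;
    fn visit_seq(&self, items: &[i64]) -> Result<Self::Value, String>;
    fn visit_map(&self, _entries: &[(&str, i64)]) -> Result<Self::Value, String> {
        Err("invalid type: map, expected a tuple struct".to_string())
    }
}

// Visitor for a hypothetical `struct Pair(i64, i64);`
struct PairVisitor;

impl Visitor for PairVisitor {
    type Value = (i64, i64);
    fn visit_seq(&self, items: &[i64]) -> Result<Self::Value, String> {
        match items {
            [a, b] => Ok((*a, *b)),
            _ => Err(format!("expected 2 elements, got {}", items.len())),
        }
    }
}
```

This is why the comment suggests driving the same record deserializer through `visit_seq` instead of `visit_map` for tuple structs.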
```rust
| Schema::BigDecimal
| Schema::Decimal(_)
| Schema::Duration(_)
| Schema::Date,
```
Schema::Date is included in the branch that serializes string defaults as bytes, which would accept an invalid default type for date and write an incompatible encoding. It seems Schema::Date should only accept numeric defaults (like the Schema::Int | Schema::Date | Schema::TimeMillis arm above).
Severity: low
value:useful; category:bug; feedback: The Augment AI reviewer is correct! The Avro specification does not support encoding dates (and times) as strings, so this branch should be removed. Dates are encoded as numbers (days since epoch), so the default value must be an integer.
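The suggested fix can be sketched with simplified stand-in enums (not the crate's real `Schema` or default-value types): `date` is a logical type over `int` (days since the epoch), so its default should only be accepted by the numeric arm, never by the string-as-bytes arm.

```rust
// Hypothetical, simplified schema and default-value models for illustration.
enum Schema { Int, Date, TimeMillis, Bytes, Fixed }
enum Default { Int(i32), Str(String) }

fn validate_default(schema: &Schema, default: &Default) -> Result<(), String> {
    match (schema, default) {
        // Numeric logical types: a date default is days since epoch.
        (Schema::Int | Schema::Date | Schema::TimeMillis, Default::Int(_)) => Ok(()),
        // Byte-backed schemas may accept a string default; note that
        // Schema::Date is deliberately absent from this arm.
        (Schema::Bytes | Schema::Fixed, Default::Str(_)) => Ok(()),
        _ => Err("default value type does not match schema".to_string()),
    }
}
```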
PR Review: feat!: Support enums and tuples in SchemaAwareSerializer and implement SchemaAwareDeserializer

This is a substantial PR adding enum/tuple support to the serializer and implementing the full schema-aware deserializer. The overall design is solid and the modular decomposition (separate files for …).

Bugs / Correctness

1. Unsafe cast in … In …:

```rust
let index = zag_i32(self.reader)?;
let schema = union.get_variant(index as usize)?;
```

Every other place in this PR uses …

2. …

API Design Concerns

3. Panics in public iterator path: these should return …

4. Breaking changes need prominent documentation. Two silent breaking changes are buried in the diff: …

Both should be prominently documented in the PR description / CHANGELOG / migration guide.

Minor Issues

5. Error message regression. Old error messages included actionable hints like: …

The new messages drop these hints. Tests were updated to match the new (less helpful) wording. Consider preserving the actionability of these errors.

6. TODOs that affect correctness: …

Both are worth tracking as follow-up issues if not in scope for this PR.

7. …

8. Lines like …

Positive Highlights

…

Summary

Main blocking issues: #3 (panics instead of errors in the public iterator) and #4 (undocumented breaking changes). Issues #1 and #5 are moderate. The rest are minor polish.
Code Review
This is a substantial pull request that introduces a new schema-aware serializer and deserializer, refactors many parts of the library to use them, and improves the Avro schema mapping for several Rust types like `[T; N]` and `core::time::Duration`. The changes are well-structured and accompanied by extensive tests and new documentation, which is commendable.
My main feedback is regarding a regression in the quality of diagnostic error messages for mismatched types during serialization, particularly for uuid::Uuid. The new error messages are more generic and less helpful than before. I've left specific comments with suggestions on how to improve them.
```diff
 writer.write(uuid, &mut buffer).unwrap_err().to_string(),
-"Failed to serialize value of type bytes using schema Uuid(String): 55e840e29b41d4a7164466554400. Cause: Expected a string, but got 16 bytes. Did you mean to use `Schema::Uuid(UuidSchema::Fixed)` or `utils::serde_set_human_readable(true)`?"
+"Failed to serialize value of type `bytes` using Schema::Uuid(String): Expected Schema::Bytes | Schema::Fixed | Schema::BigDecimal | Schema::Decimal | Schema::Uuid(Bytes | Fixed) | Schema::Duration"
 );
```
The updated error message is less informative than the previous one. The old message was very specific, guiding the user on how to fix the issue by suggesting Schema::Uuid(UuidSchema::Fixed) or setting human_readable to true. The new message is a generic list of expected schemas for byte serialization.
While this is a consequence of the new generic serializer, it's a regression in diagnostics. Could we restore a more specific error message for this case? For example, by adding a special check for Schema::Uuid in serialize_bytes within avro/src/serde/ser_schema/mod.rs when the schema doesn't match.
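One way to restore the actionable hint, sketched with simplified stand-in types; the real change would live in `serialize_bytes` in `avro/src/serde/ser_schema/mod.rs` and use the crate's actual `Schema` enum, so everything below is an illustrative assumption:

```rust
// Hypothetical, simplified mirrors of the crate's schema types.
enum UuidSchema { String, Bytes, Fixed }
enum Schema { Bytes, Uuid(UuidSchema) }

/// Build the error message for the case where raw bytes arrive but the
/// schema expects something else; special-case Uuid(String) so the old,
/// actionable hint is preserved instead of the generic schema list.
fn bytes_mismatch_message(schema: &Schema) -> String {
    match schema {
        Schema::Uuid(UuidSchema::String) => {
            "Expected a string, but got bytes. Did you mean to use `Schema::Uuid(UuidSchema::Fixed)` or `utils::serde_set_human_readable(true)`?".to_string()
        }
        _ => {
            "Expected Schema::Bytes | Schema::Fixed | Schema::BigDecimal | Schema::Decimal | Schema::Uuid(Bytes | Fixed) | Schema::Duration".to_string()
        }
    }
}
```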
```diff
 writer.write(uuid, &mut buffer).unwrap_err().to_string(),
-r#"Failed to serialize value of type string using schema Uuid(Bytes): 550e8400-e29b-41d4-a716-446655440000. Cause: Expected bytes but got a string. Did you mean to use `Schema::Uuid(UuidSchema::String)` or `utils::serde_set_human_readable(false)`?"#
+r#"Failed to serialize value of type `str` using Schema::Uuid(Bytes): Expected Schema::String | Schema::Uuid(String)"#
 );
```
Similar to my other comment, the new error message here is a regression in terms of helpfulness. The previous error message clearly stated the problem (expected bytes, got a string) and suggested solutions (Schema::Uuid(UuidSchema::String) or setting human_readable to false). The new message is a generic list of schemas that can handle a string.
It would be great to improve the diagnostics here to be as helpful as they were before. This could likely be addressed by adding more specific checks within serialize_str in avro/src/serde/ser_schema/mod.rs for Uuid-related schemas.
```diff
 writer.write(uuid, &mut buffer).unwrap_err().to_string(),
-r#"Failed to serialize value of type string using schema Uuid(Fixed(FixedSchema { name: Name { name: "uuid", .. }, size: 16, .. })): 550e8400-e29b-41d4-a716-446655440000. Cause: Expected bytes but got a string. Did you mean to use `Schema::Uuid(UuidSchema::String)` or `utils::serde_set_human_readable(false)`?"#
+r#"Failed to serialize value of type `str` using Schema::Uuid(Fixed(FixedSchema { name: Name { name: "uuid", .. }, size: 16, .. })): Expected Schema::String | Schema::Uuid(String)"#
 );
```
Actionable comments posted: 13
🧹 Nitpick comments (2)
avro_derive/src/attributes/mod.rs (1)

321-344: Consider clarifying the error message when default is auto-disabled.

The logic to auto-disable `field_default` when `#[avro(with)]` is used is sound: calling `T::field_default()` doesn't make sense when the schema comes from a different source. However, the interaction with the subsequent validation (lines 336-344) could produce a confusing error. If a user writes:

```rust
#[avro(with)]
#[serde(with = "my_module", skip_serializing_if = "Option::is_none")]
field: Option<MyType>,
```

They'll receive an error stating `skip_serializing_if` is incompatible with `#[avro(default = false)]`, even though they never explicitly set `default = false`. The auto-disabling behavior isn't obvious from the error message.

Consider either:

- Providing a more specific error message when the default was auto-disabled due to `#[avro(with)]`
- Or documenting this interaction in the TODO mentioned on line 330

💡 Sketch of a clearer error message

```diff
+let default_was_auto_disabled = with != With::Trait && avro.default == FieldDefault::Trait;
 // TODO: Implement a better way to do this (maybe if user specifies `#[avro(with)]` also use that for the default)
 // Disable getting the field default, if the schema is not retrieved from the field type
 if with != With::Trait && avro.default == FieldDefault::Trait {
     avro.default = FieldDefault::Disabled;
 }
 if ((serde.skip_serializing && !serde.skip_deserializing)
     || serde.skip_serializing_if.is_some())
     && avro.default == FieldDefault::Disabled
 {
-    errors.push(syn::Error::new(
-        span,
-        "`#[serde(skip_serializing)]` and `#[serde(skip_serializing_if)]` are incompatible with `#[avro(default = false)]`"
-    ));
+    let msg = if default_was_auto_disabled {
+        "`#[serde(skip_serializing)]` and `#[serde(skip_serializing_if)]` require a default value, but `#[avro(with)]` disables auto-default. Please provide an explicit `#[avro(default = \"...\")]`"
+    } else {
+        "`#[serde(skip_serializing)]` and `#[serde(skip_serializing_if)]` are incompatible with `#[avro(default = false)]`"
+    };
+    errors.push(syn::Error::new(span, msg));
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@avro_derive/src/attributes/mod.rs` around lines 321-344: The validation error about incompatibility with `#[avro(default = false)]` can be misleading when the default was auto-disabled because `#[avro(with)]` was used; update the check around serde.skip_serializing/skip_serializing_if to detect that `with != With::Trait` and that the field default was originally `FieldDefault::Trait` (i.e., auto-disabled to `FieldDefault::Disabled`) and, in that case, push a clearer error via `errors.push(syn::Error::new(span, ...))` that mentions the default was auto-disabled due to `#[avro(with)]` (or alternatively include that context in the existing message), using the same symbols (`With::from_avro_and_serde`, `With::Trait`, `avro.default`, `FieldDefault::Trait`/`Disabled`, `serde.skip_serializing`, `serde.skip_serializing_if`, and `span`) so callers know why `default` became disabled.

avro/src/reader/block.rs (1)

218-254: Consider extracting duplicated infinite-loop protection logic.

Lines 247-250 duplicate the same check from `read_next` (lines 209-212). Consider extracting this into a helper method to reduce duplication.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@avro/src/reader/block.rs` around lines 218-254: Duplicate infinite-loop protection logic in read_next_deser and read_next should be extracted into a shared helper: create a method (e.g., check_no_progress_or_error or ensure_progress) that takes the original byte count and the current slice length (or both as usize), performs the b_original != 0 && b_original == current_len check, and returns AvroResult<()> with Err(Details::ReadBlock.into()) on failure; call this helper from read_next_deser (replacing lines 247-250) and from read_next (replacing lines 209-212), and update callers to adjust buf_idx and message_count only after the helper confirms progress.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@avro/src/documentation/serde_data_model_to_avro.rs`:
- Around line 56-57: The doc comment in serde_data_model_to_avro.rs contains a
stray backtick in the sentence "This is different from normal tuples`."—remove
the stray backtick so the sentence ends "normal tuples." (Locate the comment
block in serde_data_model_to_avro.rs around the tuple-struct note and edit the
text accordingly.)
In `@avro/src/reader/block.rs`:
- Around line 232-234: Replace the panic! in the branch that checks read_schema
(the SchemaAwareResolvingDeserializer TODO) with returning a proper error result
instead; add a new Details enum variant like SchemaResolutionNotImplemented with
#[error("Schema-aware deserialization does not support schema resolution yet")]
and then change the code in block.rs (the branch that currently calls panic!) to
return Err(Details::SchemaResolutionNotImplemented.into()) (or construct the
crate's error type expected by the surrounding function) so callers can handle
the unimplemented feature gracefully.
In `@avro/src/reader/datum.rs`:
- Around line 145-160: read_deser currently panics any time a reader schema is
present; change it to only error when the reader schema actually differs from
the writer schema. In read_deser, inspect self.reader (the Some((reader_schema,
_)) tuple) and compare reader_schema to self.writer; if they are identical,
proceed with the same SchemaAwareDeserializer path you use in the else branch
(using Config { names: self.resolved.get_names(), human_readable:
self.human_readable }), otherwise return an AvroResult::Err (e.g. an AvroError
indicating schema resolution is not implemented) instead of unconditionally
panicking; references: read_deser, self.reader, self.writer,
SchemaAwareDeserializer, self.resolved.get_names().
In `@avro/src/reader/mod.rs`:
- Around line 133-140: Replace the panic assert in read_next_deser with
returning an AvroResult::Err so the iterator yields an error instead of
panicking: check the should_resolve_schema flag in fn read_next_deser and if
true return an appropriate Avro error (e.g., a variant indicating
schema-resolving deserialization is not implemented or unsupported) wrapped in
AvroResult::Err; otherwise proceed to call
self.block.read_next_deser(self.reader_schema). Ensure the returned error type
matches the crate's AvroError/AvroResult conventions so callers receive a
recoverable Err instead of a panic.
In `@avro/src/serde/derive.rs`:
- Around line 809-815: The synthesized helper record names (created where
T::get_schema_in_ctxt is used and via Name::new_with_enclosing_namespace) can
collide with user-defined types because they live in the caller's namespace and
are directly checked against named_schemas; change the name-generation to
guarantee uniqueness by either: (a) creating the helper names in a dedicated
internal/synthetic namespace instead of enclosing_namespace, or (b)
loop-appending a deterministic collision suffix (e.g., incrementing counter or
short hash/UUID) until Name::new_with_enclosing_namespace returns a name not
present in named_schemas; apply this change to both the tuple/array helper
generation sites (the block using format!("A{N}_{}",
t_schema.unique_normalized_name()) and the analogous T{len} generation) and
ensure subsequent code emits the newly-reserved name rather than producing a
Schema::Ref to an existing user type.
In `@avro/src/serde/deser_schema/mod.rs`:
- Around line 1474-1476: The test constructs value2 using
TupleExternalEnum::Val1 again, so the tuple-variant path for the three-field
variant is never exercised; update the TestTupleExternalEnum construction for
value2 to use TupleExternalEnum::Val2 with three fields (matching the enum's
three-field tuple variant) so the Val2 branch is executed (refer to
TestTupleExternalEnum, TupleExternalEnum::Val2, and value2).
- Around line 633-645: In deserialize_tuple_struct, the code incorrectly calls
visitor.visit_map for tuple-structs; change it to call visitor.visit_seq so
Serde's tuple-struct visitors are satisfied. Specifically, in the match arm
handling Schema::Record where record.name.name() == name && record.fields.len()
== len, replace the visitor.visit_map(RecordDeserializer::new(self.reader,
record, self.config)) call with visitor.visit_seq using the same
RecordDeserializer instance (same params), mirroring how deserialize_tuple uses
visit_seq with ManyTupleDeserializer. Ensure the rest of the method's branches
remain unchanged.
In `@avro/src/serde/deser_schema/tuple.rs`:
- Around line 105-120: In next_element_seed, avoid incrementing current_field
before deserialization: retrieve schema using
self.schema.fields[self.current_field].schema, create the
SchemaAwareDeserializer and call seed.deserialize first (propagating any error),
and only after deserialize succeeds increment self.current_field and return
Some(value); this ensures SchemaAwareDeserializer::new or seed.deserialize
failures do not advance current_field and preserves caller state for retries or
inspection.
In `@avro/src/serde/ser_schema/record/field_default.rs`:
- Around line 83-92: The match on Value::String for byte-like schemas (the arm
matching Schema::Fixed, Decimal with InnerDecimalSchema::Fixed, Uuid::Fixed, and
Duration) should validate length by Unicode code points (s.chars().count()) not
UTF-8 bytes, and the serialization should convert each char to a raw byte by
mapping char as u32 & 0xFF into a Vec<u8> (instead of using s.as_bytes()); also
remove Schema::Date from the string-handling pattern so date logical types are
not treated as byte-strings. Locate and update the match arm(s) in
field_default.rs where Value::String(s) is validated and serialized (the
fixed/decimal/uuid/duration string branches) to use chars().count() for length
checks and iterate chars to produce bytes, and drop Schema::Date from that
pattern.
In `@avro/src/serde/ser_schema/tuple.rs`:
- Around line 282-297: The OneUnionTupleSerializer SerializeTuple impl currently
doesn't enforce exactly one element; add an element counter field (e.g.,
elements_written or element_count) to OneUnionTupleSerializer, increment it
inside serialize_element (alongside bytes_written), and return an appropriate
Error if serialize_element is called when the counter is already 1 (prevent >1
elements). In end(), validate the counter is exactly 1 and return an Error if
it's 0; update serialize_element and end to use this counter to enforce the
single-element requirement while keeping bytes_written logic intact.
- Around line 235-262: OneTupleSerializer currently doesn't enforce that exactly
one element was serialized; add a counter field (e.g., elements_serialized:
usize) to OneTupleSerializer, increment it in serialize_element (alongside
updating bytes_written), and change end() to validate that elements_serialized
== 1 returning an appropriate Err(Error::... ) when it's 0 or >1 (mirroring the
validation logic used by ManyTupleSerializer) so invalid tuple lengths produce
an error instead of silently succeeding.
In `@avro/src/serde/ser_schema/union.rs`:
- Around line 258-310: The serialize_bytes function currently only treats
SchemaKind::Bytes and fixed-size Schema::Fixed as byte-backed union variants, so
Schema::BigDecimal variants are ignored; update the logic in serialize_bytes
(including the BytesType::Bytes branch, the BytesType::Unset matching where
bytes_index is computed, and any error messages) to also consider
SchemaKind::BigDecimal as a bytes-backed variant (e.g., use
union.index_of_schema_kind for both SchemaKind::Bytes and SchemaKind::BigDecimal
or add a helper that checks either kind) so BigDecimal union members are
selected and handled with with_len semantics the same as bytes.
In `@avro/src/serde/with.rs`:
- Around line 546-550: The new helpers bigdecimal, bigdecimal_opt, array, and
array_opt currently define get_record_fields_in_ctxt without the leading usize
parameter, breaking the ABI; update each helper's get_record_fields_in_ctxt
signature to accept the unused leading usize parameter (i.e. change fn
get_record_fields_in_ctxt(_: &mut HashSet<Name>, _: NamespaceRef) ->
Option<Vec<RecordField>> to fn get_record_fields_in_ctxt(_: usize, _: &mut
HashSet<Name>, _: NamespaceRef) -> Option<Vec<RecordField>>) so they match the
existing helpers' signature and remain backward compatible with callers and the
macro-generated 3-argument form.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 66191e49-13e3-4c2e-a621-d0c50719d431
📒 Files selected for processing (42)
avro/examples/test_interop_single_object_encoding.rs
avro/src/bigdecimal.rs
avro/src/documentation/avro_data_model_to_serde.rs
avro/src/documentation/mod.rs
avro/src/documentation/serde_data_model_to_avro.rs
avro/src/error.rs
avro/src/lib.rs
avro/src/reader/block.rs
avro/src/reader/datum.rs
avro/src/reader/mod.rs
avro/src/reader/single_object.rs
avro/src/schema/mod.rs
avro/src/schema/union.rs
avro/src/serde/derive.rs
avro/src/serde/deser_schema/block.rs
avro/src/serde/deser_schema/enums.rs
avro/src/serde/deser_schema/identifier.rs
avro/src/serde/deser_schema/mod.rs
avro/src/serde/deser_schema/record.rs
avro/src/serde/deser_schema/tuple.rs
avro/src/serde/mod.rs
avro/src/serde/ser_schema/block.rs
avro/src/serde/ser_schema/mod.rs
avro/src/serde/ser_schema/record/field_default.rs
avro/src/serde/ser_schema/record/mod.rs
avro/src/serde/ser_schema/tuple.rs
avro/src/serde/ser_schema/union.rs
avro/src/serde/with.rs
avro/src/writer/datum.rs
avro/src/writer/mod.rs
avro/src/writer/single_object.rs
avro/tests/avro-rs-226.rs
avro/tests/avro-rs-285-bytes_deserialization.rs
avro/tests/schema.rs
avro/tests/serde_human_readable_false.rs
avro/tests/serde_human_readable_true.rs
avro/tests/union_schema.rs
avro_derive/src/attributes/mod.rs
avro_derive/tests/derive.proptest-regressions
avro_derive/tests/derive.rs
avro_derive/tests/serde.rs
licenserc.toml
```rust
//! - **Note:** Tuple structs with 0 or 1 elements will also be (de)serialized as a [`Schema::Record`]. This
//!   is different from normal tuples`.
```
Minor typo: stray backtick before period.
Line 57 has "normal tuples`." with a backtick before the period that appears unintentional.
Suggested fix
```diff
-//! - **Note:** Tuple structs with 0 or 1 elements will also be (de)serialized as a [`Schema::Record`]. This
-//!   is different from normal tuples`.
+//! - **Note:** Tuple structs with 0 or 1 elements will also be (de)serialized as a [`Schema::Record`]. This
+//!   is different from normal tuples.
```
value:good-to-have; category:bug; feedback: The CodeRabbit AI reviewer is correct! There is a trailing backtick in the sentence and it may break the rendering of the rustdoc. It should be removed.
```rust
let item = if read_schema.is_some() {
    // TODO: Implement SchemaAwareResolvingDeserializer
    panic!("Schema aware deserialisation does not resolve schemas yet");
```
Replace panic! with a proper error return.
Using panic! in library code for an unimplemented feature is problematic — it crashes the application without giving callers a chance to handle the situation gracefully. Return an error instead.
🛡️ Proposed fix

```diff
-let item = if read_schema.is_some() {
-    // TODO: Implement SchemaAwareResolvingDeserializer
-    panic!("Schema aware deserialisation does not resolve schemas yet");
-} else {
+let item = if let Some(_read_schema) = read_schema {
+    return Err(crate::error::Details::SchemaResolutionNotImplemented.into());
+} else {
```

You'll need to add the corresponding error variant to `Details`:

```rust
#[error("Schema-aware deserialization does not support schema resolution yet")]
SchemaResolutionNotImplemented,
```
value:useful; category:bug; feedback: The CodeRabbit AI reviewer is correct! The new serde deserializer does not support reader schema and this is explained in a TODO. But instead of panicking and crashing the application it would be better to return an Err.
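The pattern the reviewers ask for can be sketched with plain std types. `Details`, its variant, and `read_next_deser` below are stand-ins for the crate's real error machinery (the variant name mirrors the one proposed in the review); this is a sketch of the panic-to-`Err` conversion, not the crate's implementation:

```rust
use std::fmt;

// Hypothetical stand-in for the crate's `Details` error enum.
#[derive(Debug, PartialEq)]
enum Details {
    SchemaResolutionNotImplemented,
}

impl fmt::Display for Details {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "Schema-aware deserialization does not support schema resolution yet"
        )
    }
}

impl std::error::Error for Details {}

// Instead of `panic!`, return a recoverable Err when a reader schema is present.
fn read_next_deser(read_schema: Option<&str>) -> Result<Option<String>, Details> {
    if read_schema.is_some() {
        // TODO: Implement SchemaAwareResolvingDeserializer
        return Err(Details::SchemaResolutionNotImplemented);
    }
    Ok(Some("decoded item".to_string()))
}

fn main() {
    assert!(read_next_deser(None).is_ok());
    // The caller can now handle the unsupported configuration instead of crashing.
    assert_eq!(
        read_next_deser(Some("reader-schema")),
        Err(Details::SchemaResolutionNotImplemented)
    );
}
```

With this shape, callers keep working through the normal `Result` path, and the behavior can later be replaced by real schema resolution without an API change.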
pub fn read_deser<T: DeserializeOwned>(&self, reader: &mut impl Read) -> AvroResult<T> {
    // `reader` is `impl Read` instead of a generic on the function like T so it's easier to
    // specify the type wanted (`read_deser<String>` vs `read_deser<String, _>`)
    if let Some((_, _)) = &self.reader {
        // TODO: Implement SchemaAwareResolvingDeserializer
        panic!("Schema aware deserialisation does not resolve schemas yet");
    } else {
        T::deserialize(SchemaAwareDeserializer::new(
            reader,
            self.writer,
            Config {
                names: self.resolved.get_names(),
                human_readable: self.human_readable,
            },
        )?)
    }
Only reject actual schema adaptation here.
This panics whenever a reader schema was configured, even in the no-op case where the reader schema is equivalent to the writer schema. That makes read_deser stricter than read_value and turns an AvroResult<T> API into a panic path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@avro/src/reader/datum.rs` around lines 145 - 160, read_deser currently panics
any time a reader schema is present; change it to only error when the reader
schema actually differs from the writer schema. In read_deser, inspect
self.reader (the Some((reader_schema, _)) tuple) and compare reader_schema to
self.writer; if they are identical, proceed with the same
SchemaAwareDeserializer path you use in the else branch (using Config { names:
self.resolved.get_names(), human_readable: self.human_readable }), otherwise
return an AvroResult::Err (e.g. an AvroError indicating schema resolution is not
implemented) instead of unconditionally panicking; references: read_deser,
self.reader, self.writer, SchemaAwareDeserializer, self.resolved.get_names().
There was a problem hiding this comment.
value:useful; category:bug; feedback: The CodeRabbit AI reviewer is correct! The new serde deserializer does not support reader schema and this is explained in a TODO. But instead of panicking and crashing the application it would be better to return an Err.
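A minimal sketch of the suggested no-op check, using strings in place of real `Schema` values (in the crate one would compare parsed or canonical schemas, not raw text; `read_deser_guard` is an illustrative name, not the crate's API):

```rust
// Only reject when the reader schema actually differs from the writer schema.
// `&str` stands in for the crate's `Schema` type in this sketch.
fn read_deser_guard(reader: Option<&str>, writer: &str) -> Result<&'static str, String> {
    match reader {
        // Reader schema equals writer schema: resolution is a no-op, proceed.
        Some(r) if r == writer => Ok("deserialized with writer schema"),
        None => Ok("deserialized with writer schema"),
        // Genuine schema adaptation is not implemented yet: recoverable error.
        Some(_) => Err("schema resolution not implemented yet".to_string()),
    }
}

fn main() {
    assert!(read_deser_guard(None, "\"long\"").is_ok());
    // Equivalent reader schema should not be stricter than `read_value`.
    assert!(read_deser_guard(Some("\"long\""), "\"long\"").is_ok());
    assert!(read_deser_guard(Some("\"int\""), "\"long\"").is_err());
}
```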
fn read_next_deser<T: DeserializeOwned>(&mut self) -> AvroResult<Option<T>> {
    // TODO: Implement SchemaAwareResolvingDeserializer
    assert!(
        !self.should_resolve_schema,
        "Schema aware deserialisation does not resolve schemas yet"
    );

    self.block.read_next_deser(self.reader_schema)
Return an Err here instead of panicking.
ReaderDeser promises Iterator<Item = AvroResult<T>>, but this assert! breaks that contract and panics on an unsupported configuration instead of surfacing the failure as the iterator's first error.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@avro/src/reader/mod.rs` around lines 133 - 140, Replace the panic assert in
read_next_deser with returning an AvroResult::Err so the iterator yields an
error instead of panicking: check the should_resolve_schema flag in fn
read_next_deser and if true return an appropriate Avro error (e.g., a variant
indicating schema-resolving deserialization is not implemented or unsupported)
wrapped in AvroResult::Err; otherwise proceed to call
self.block.read_next_deser(self.reader_schema). Ensure the returned error type
matches the crate's AvroError/AvroResult conventions so callers receive a
recoverable Err instead of a panic.
value:useful; category:bug; feedback: The CodeRabbit AI reviewer is correct! The new serde deserializer does not support reader schema and this is explained in a TODO. But instead of panicking and crashing the application it would be better to return an Err.
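The iterator contract the comment refers to can be sketched as follows; the struct and field names are simplified stand-ins for the crate's `ReaderDeser`, and `String` stands in for the crate's error type:

```rust
// An iterator that surfaces the unsupported configuration as its first item
// (and then stops) instead of panicking inside `next`.
struct ReaderDeser {
    should_resolve_schema: bool,
    items: std::vec::IntoIter<i64>,
    errored: bool,
}

impl Iterator for ReaderDeser {
    type Item = Result<i64, String>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.errored {
            return None; // fused after the first error
        }
        if self.should_resolve_schema {
            self.errored = true;
            return Some(Err("schema resolution not implemented yet".to_string()));
        }
        self.items.next().map(Ok)
    }
}

fn main() {
    let mut bad = ReaderDeser {
        should_resolve_schema: true,
        items: vec![1, 2].into_iter(),
        errored: false,
    };
    // The failure arrives as the iterator's first item, not as a panic.
    assert!(matches!(bad.next(), Some(Err(_))));
    assert!(bad.next().is_none());

    let good = ReaderDeser {
        should_resolve_schema: false,
        items: vec![1, 2].into_iter(),
        errored: false,
    };
    assert_eq!(good.collect::<Result<Vec<_>, _>>(), Ok(vec![1, 2]));
}
```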
let t_schema = T::get_schema_in_ctxt(named_schemas, enclosing_namespace);
let name = Name::new_with_enclosing_namespace(
    format!("A{N}_{}", t_schema.unique_normalized_name()),
    enclosing_namespace,
)
.expect("Name is valid");
if named_schemas.contains(&name) {
Reserve a collision-proof name for synthesized tuple/array records.
A{N}_... and T{len}_... are created in the caller's namespace and checked against named_schemas like ordinary user types. If a real record already has that name, this code will silently emit a Schema::Ref to the existing type instead of the generated helper record.
Also applies to: 919-924
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@avro/src/serde/derive.rs` around lines 809 - 815, The synthesized helper
record names (created where T::get_schema_in_ctxt is used and via
Name::new_with_enclosing_namespace) can collide with user-defined types because
they live in the caller's namespace and are directly checked against
named_schemas; change the name-generation to guarantee uniqueness by either: (a)
creating the helper names in a dedicated internal/synthetic namespace instead of
enclosing_namespace, or (b) loop-appending a deterministic collision suffix
(e.g., incrementing counter or short hash/UUID) until
Name::new_with_enclosing_namespace returns a name not present in named_schemas;
apply this change to both the tuple/array helper generation sites (the block
using format!("A{N}_{}", t_schema.unique_normalized_name()) and the analogous
T{len} generation) and ensure subsequent code emits the newly-reserved name
rather than producing a Schema::Ref to an existing user type.
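Option (b) from the prompt, loop-appending a deterministic suffix until the synthesized name is free, can be sketched with plain strings. `reserve_unique` and the name format are assumptions for illustration, not the crate's API:

```rust
use std::collections::HashSet;

// Reserve a helper-record name that cannot shadow (or be shadowed by) a
// user-defined type: append an incrementing counter until the name is free,
// then record it so later lookups see it as taken.
fn reserve_unique(base: &str, named: &mut HashSet<String>) -> String {
    let mut candidate = base.to_string();
    let mut n = 0u32;
    while named.contains(&candidate) {
        n += 1;
        candidate = format!("{base}_{n}");
    }
    named.insert(candidate.clone());
    candidate
}

fn main() {
    let mut named: HashSet<String> = ["A2_Long".to_string()].into_iter().collect();
    // A user record already claimed "A2_Long", so the helper gets a suffix.
    assert_eq!(reserve_unique("A2_Long", &mut named), "A2_Long_1");
    // A fresh base name is taken verbatim.
    assert_eq!(reserve_unique("T3_Rec", &mut named), "T3_Rec");
}
```

Because the suffix is derived deterministically from the collision count, repeated derive expansions over the same schema set produce the same helper names.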
(
    Value::String(s),
    Schema::Fixed(fixed)
    | Schema::Decimal(DecimalSchema {
        inner: InnerDecimalSchema::Fixed(fixed),
        ..
    })
    | Schema::Uuid(UuidSchema::Fixed(fixed))
    | Schema::Duration(fixed),
) => s.len() == fixed.size,
🧩 Analysis chain

🌐 Web query:
According to the Apache Avro specification, how are record field defaults for `bytes` and `fixed` decoded from JSON strings, and are `date` logical-type defaults represented as JSON numbers rather than strings?

💡 Result:
According to the Apache Avro specification, record field defaults for bytes and fixed types are provided as JSON strings. These strings use Unicode code points 0-255 mapped directly to unsigned 8-bit byte values 0-255 (e.g., "\u00FF" for the byte 255). When decoding the schema from JSON, these strings are interpreted as the corresponding byte sequences or fixed values. For date logical types (which annotate an underlying int type representing days since the Unix epoch), defaults are represented as JSON numbers (integers), following the general rule for int/long field defaults, rather than strings. Examples from schema usage confirm defaults like 1502250227187 for timestamp logical types (on long) and similar integer defaults for date on int. The specification's field default table lists int,long as JSON integer, with no special override for logical types, which are schema attributes ignored during default value parsing.

Citations:
- 1: https://avro.apache.org/docs/1.11.1/specification/
- 2: https://avro.apache.org/docs/1.5.1/spec.html
- 3: https://avro.apache.org/docs/%2B%2Bversion%2B%2B/specification/
- 4: https://avro.apache.org/docs/1.11.4/specification/

🏁 Script executed:
cat -n avro/src/serde/ser_schema/record/field_default.rs | sed -n '75,100p'

Repository: martin-augment/avro-rs
Length of output: 1441

🏁 Script executed:
cat -n avro/src/serde/ser_schema/record/field_default.rs | sed -n '145,170p'

Repository: martin-augment/avro-rs
Length of output: 1381
Length of output: 1381
Decode Avro byte defaults by code point, not UTF-8.
For bytes/fixed defaults, Avro maps JSON string code points 0-255 to raw bytes 0-255. The code at line 92 uses s.len() to validate size, which counts UTF-8 bytes rather than code points—so "\u00FF" (1 code point, 2 UTF-8 bytes) would fail validation. Similarly, line 160 uses s.as_bytes() to serialize, which produces UTF-8 bytes instead of the raw byte values specified by Avro.
Line 159 also incorrectly includes Schema::Date in the byte-string pattern. Per the Avro specification, date logical-type defaults are JSON numbers (integers), not strings, and should not be handled by string-based serialization.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@avro/src/serde/ser_schema/record/field_default.rs` around lines 83 - 92, The
match on Value::String for byte-like schemas (the arm matching Schema::Fixed,
Decimal with InnerDecimalSchema::Fixed, Uuid::Fixed, and Duration) should
validate length by Unicode code points (s.chars().count()) not UTF-8 bytes, and
the serialization should convert each char to a raw byte by mapping char as u32
& 0xFF into a Vec<u8> (instead of using s.as_bytes()); also remove Schema::Date
from the string-handling pattern so date logical types are not treated as
byte-strings. Locate and update the match arm(s) in field_default.rs where
Value::String(s) is validated and serialized (the fixed/decimal/uuid/duration
string branches) to use chars().count() for length checks and iterate chars to
produce bytes, and drop Schema::Date from that pattern.
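The code-point-vs-UTF-8 distinction is easy to demonstrate. `decode_avro_byte_default` below is a hypothetical helper, and it uses `u8::try_from` (rejecting code points above 255) rather than the masking (`& 0xFF`) mentioned in the prompt, which is a stricter but arguably safer reading of the spec:

```rust
// "\u{00FF}" is one code point but two UTF-8 bytes, so `s.len()` (UTF-8 byte
// count) is the wrong length check and `s.as_bytes()` is the wrong encoding.
// Mapping each char's code point to a raw byte recovers the spec's intent.
fn decode_avro_byte_default(s: &str) -> Option<Vec<u8>> {
    s.chars()
        .map(|c| u8::try_from(c as u32).ok()) // code points > 255 are invalid here
        .collect()
}

fn main() {
    let s = "\u{00FF}";
    assert_eq!(s.len(), 2); // UTF-8 bytes: 0xC3 0xBF
    assert_eq!(s.chars().count(), 1); // one code point
    assert_eq!(decode_avro_byte_default(s), Some(vec![0xFF]));
    // A code point above 255 (here 0x65E5) cannot encode an Avro byte default.
    assert_eq!(decode_avro_byte_default("\u{65E5}"), None);
}
```

For a fixed-size check, the validation would then compare `s.chars().count()` (or the decoded `Vec<u8>` length) against `fixed.size`.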
impl<'s, 'w, W: Write, S: Borrow<Schema>> SerializeTuple for OneTupleSerializer<'s, 'w, W, S> {
    type Ok = usize;
    type Error = Error;

    fn serialize_element<T>(&mut self, value: &T) -> Result<(), Self::Error>
    where
        T: ?Sized + Serialize,
    {
        match self.schema {
            Schema::Union(union) => {
                self.bytes_written +=
                    value.serialize(UnionSerializer::new(self.writer, union, self.config))?;
            }
            schema => {
                self.bytes_written += value.serialize(SchemaAwareSerializer::new(
                    self.writer,
                    schema,
                    self.config,
                )?)?;
            }
        }
        Ok(())
    }

    fn end(self) -> Result<Self::Ok, Self::Error> {
        Ok(self.bytes_written)
    }
}
OneTupleSerializer doesn't validate that exactly one element was serialized.
Unlike ManyTupleSerializer which validates element count in end(), OneTupleSerializer::end() unconditionally returns success. If zero or more than one element is serialized, this could lead to incorrect Avro output without error.
💡 Proposed fix
struct OneTupleSerializer<'s, 'w, W: Write, S: Borrow<Schema>> {
    writer: &'w mut W,
    schema: &'s Schema,
    config: Config<'s, S>,
    bytes_written: usize,
+   element_written: bool,
}

impl<'s, 'w, W: Write, S: Borrow<Schema>> OneTupleSerializer<'s, 'w, W, S> {
    pub fn new(
        writer: &'w mut W,
        schema: &'s Schema,
        config: Config<'s, S>,
        bytes_written: Option<usize>,
    ) -> Self {
        Self {
            writer,
            schema,
            config,
            bytes_written: bytes_written.unwrap_or(0),
+           element_written: false,
        }
    }
}

impl<'s, 'w, W: Write, S: Borrow<Schema>> SerializeTuple for OneTupleSerializer<'s, 'w, W, S> {
    // ...
    fn serialize_element<T>(&mut self, value: &T) -> Result<(), Self::Error>
    where
        T: ?Sized + Serialize,
    {
+       if self.element_written {
+           return Err(Error::new(Details::SerializeTupleTooManyElements {
+               expected: 1,
+               actual: 2,
+           }));
+       }
        match self.schema {
            // ... existing code ...
        }
+       self.element_written = true;
        Ok(())
    }

    fn end(self) -> Result<Self::Ok, Self::Error> {
+       if !self.element_written {
+           return Err(Error::new(Details::SerializeTupleMissingElements {
+               position: 0,
+               total_elements: 1,
+           }));
+       }
        Ok(self.bytes_written)
    }
}

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@avro/src/serde/ser_schema/tuple.rs` around lines 235 - 262,
OneTupleSerializer currently doesn't enforce that exactly one element was
serialized; add a counter field (e.g., elements_serialized: usize) to
OneTupleSerializer, increment it in serialize_element (alongside updating
bytes_written), and change end() to validate that elements_serialized == 1
returning an appropriate Err(Error::... ) when it's 0 or >1 (mirroring the
validation logic used by ManyTupleSerializer) so invalid tuple lengths produce
an error instead of silently succeeding.
impl<'s, 'w, W: Write, S: Borrow<Schema>> SerializeTuple for OneUnionTupleSerializer<'s, 'w, W, S> {
    type Ok = usize;
    type Error = Error;

    fn serialize_element<T>(&mut self, value: &T) -> Result<(), Self::Error>
    where
        T: ?Sized + Serialize,
    {
        self.bytes_written +=
            value.serialize(UnionSerializer::new(self.writer, self.union, self.config))?;
        Ok(())
    }

    fn end(self) -> Result<Self::Ok, Self::Error> {
        Ok(self.bytes_written)
    }
Same issue: OneUnionTupleSerializer doesn't validate element count.
Similar to OneTupleSerializer, this should validate that exactly one element is serialized.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@avro/src/serde/ser_schema/tuple.rs` around lines 282 - 297, The
OneUnionTupleSerializer SerializeTuple impl currently doesn't enforce exactly
one element; add an element counter field (e.g., elements_written or
element_count) to OneUnionTupleSerializer, increment it inside serialize_element
(alongside bytes_written), and return an appropriate Error if serialize_element
is called when the counter is already 1 (prevent >1 elements). In end(),
validate the counter is exactly 1 and return an Error if it's 0; update
serialize_element and end to use this counter to enforce the single-element
requirement while keeping bytes_written logic intact.
fn serialize_bytes(mut self, v: &[u8]) -> Result<Self::Ok, Self::Error> {
    let (index, with_len) = match SER_BYTES_TYPE.get() {
        BytesType::Bytes => {
            if let Some(index) = self.union.index_of_schema_kind(SchemaKind::Bytes) {
                (index, true)
            } else {
                return Err(self.error("bytes", "Expected Schema::Bytes in variants"));
            }
        }
        BytesType::Fixed => {
            if let Some((index, _)) = self
                .union
                .find_fixed_of_size_n(v.len(), self.config.names)?
            {
                (index, false)
            } else {
                return Err(self.error(
                    "bytes",
                    format!("Expected Schema::Fixed(size: {}) in variants", v.len()),
                ));
            }
        }
        BytesType::Unset => {
            let bytes_index = self.union.index_of_schema_kind(SchemaKind::Bytes);
            let fixed_index = self
                .union
                .find_fixed_of_size_n(v.len(), self.config.names)?;
            // Find the first variant that matches the bytes or fixed
            match (bytes_index, fixed_index) {
                (Some(bytes_index), Some((fixed_index, _))) => {
                    (bytes_index.min(fixed_index), bytes_index < fixed_index)
                }
                (Some(bytes_index), None) => (bytes_index, true),
                (None, Some((fixed_index, _))) => (fixed_index, false),
                (None, None) => {
                    return Err(self.error(
                        "bytes",
                        format!(
                            "Expected Schema::Bytes | Schema::Fixed(size: {}) in variants",
                            v.len()
                        ),
                    ));
                }
            }
        }
    };
    let mut bytes_written = zig_i32(index as i32, &mut *self.writer)?;
    if with_len {
        bytes_written += self.write_bytes_with_len(v)?;
    } else {
        bytes_written += self.write_bytes(v)?;
    }
    Ok(bytes_written)
Schema::BigDecimal union members are unreachable from serialize_bytes.
This selector only matches SchemaKind::Bytes or fixed-size branches. A union containing Schema::BigDecimal therefore never selects its valid branch on the bytes path, so the new big-decimal adapters will still fail once they serialize into a non-null union. Please treat big-decimal as a bytes-backed union member here as well.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@avro/src/serde/ser_schema/union.rs` around lines 258 - 310, The
serialize_bytes function currently only treats SchemaKind::Bytes and fixed-size
Schema::Fixed as byte-backed union variants, so Schema::BigDecimal variants are
ignored; update the logic in serialize_bytes (including the BytesType::Bytes
branch, the BytesType::Unset matching where bytes_index is computed, and any
error messages) to also consider SchemaKind::BigDecimal as a bytes-backed
variant (e.g., use union.index_of_schema_kind for both SchemaKind::Bytes and
SchemaKind::BigDecimal or add a helper that checks either kind) so BigDecimal
union members are selected and handled with with_len semantics the same as
bytes.
pub fn get_record_fields_in_ctxt(
    _: &mut HashSet<Name>,
    _: NamespaceRef,
) -> Option<Vec<RecordField>> {
    None
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Current helper signatures in avro/src/serde/with.rs"
rg -n -C1 'pub fn get_record_fields_in_ctxt\s*\(' avro/src/serde/with.rs
echo
echo "Call sites / macro references outside avro/src/serde/with.rs"
rg -n -C2 '\bget_record_fields_in_ctxt\s*\(' -g '!avro/src/serde/with.rs'

Repository: martin-augment/avro-rs
Length of output: 19287
Update new helper signatures to match existing get_record_fields_in_ctxt ABI.
The new bigdecimal, bigdecimal_opt, array, and array_opt helpers at lines 546, 619, 695, and 771 drop the leading usize parameter compared to all existing helpers in the file. This breaks the public API contract: all prior helpers expect (_: usize, _: &mut HashSet<Name>, _: NamespaceRef), but the new ones only expose (_: &mut HashSet<Name>, _: NamespaceRef). The macro generates 2-argument calls, so the new signature technically works with the current derive macro, but it makes these new modules incompatible with the established pattern and any code expecting the historical ABI. For consistency and backward compatibility, add the usize parameter to all new helpers, even if unused.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@avro/src/serde/with.rs` around lines 546 - 550, The new helpers bigdecimal,
bigdecimal_opt, array, and array_opt currently define get_record_fields_in_ctxt
without the leading usize parameter, breaking the ABI; update each helper's
get_record_fields_in_ctxt signature to accept the unused leading usize parameter
(i.e. change fn get_record_fields_in_ctxt(_: &mut HashSet<Name>, _:
NamespaceRef) -> Option<Vec<RecordField>> to fn get_record_fields_in_ctxt(_:
usize, _: &mut HashSet<Name>, _: NamespaceRef) -> Option<Vec<RecordField>>) so
they match the existing helpers' signature and remain backward compatible with
callers and the macro-generated 3-argument form.
value:useful; category:bug; feedback: The Claude AI reviewer is correct! Any cast from a wider integer type to a narrower one should use the checked APIs and return an Err when the value does not fit in the target type. This prevents silently overflowing the value and carrying a totally wrong number into any subsequent operations.
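The checked-narrowing pattern this feedback asks for, sketched with std's `TryFrom` (the function name is illustrative, not the crate's API):

```rust
// Checked narrowing: surface an Err instead of silently wrapping like `as` does.
fn to_i32_checked(v: i64) -> Result<i32, String> {
    i32::try_from(v).map_err(|_| format!("value {v} does not fit in i32"))
}

fn main() {
    assert_eq!(to_i32_checked(42), Ok(42));
    assert!(to_i32_checked(i64::from(i32::MAX) + 1).is_err());
    // For contrast: `as` wraps silently and yields a totally different number.
    assert_eq!((i64::from(i32::MAX) + 1) as i32, i32::MIN);
}
```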


To review by AI