477: feat: Allow types to provide default values #53

Open
martin-augment wants to merge 1 commit into main from pr-477-2026-02-20-10-08-16

@martin-augment
Owner

477: To review by AI

`AvroSchemaComponent` is extended with a function `field_default`
which will be called when deriving a record to set the default value
for a field. The default implementation is to return `None`, which
means no default.

On the derive side, it is now possible to specify a default for a type
using `#[avro(default = "..")]`. It is also possible to disable setting
a default for a field with `#[avro(default = false)]`.

This enables users to use `#[serde(skip_serializing{_if})]` on most fields
without having to provide a default value.
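The hook described above can be sketched with a simplified stand-in trait. This is an illustration under stated assumptions, not the crate's actual API: `SchemaComponentSketch`, `Opaque`, `field_default_json`, and `default_for_field` are hypothetical names, and the default is modelled as JSON text (`Option<String>`) rather than `serde_json::Value` so the snippet needs no external crates.

```rust
/// Hypothetical stand-in for `AvroSchemaComponent`; only the default hook is modelled.
trait SchemaComponentSketch {
    /// JSON text of the default to use when this type appears as a record field.
    /// The provided implementation returns `None`, meaning "emit no default".
    fn field_default_json() -> Option<String> {
        None
    }
}

/// A primitive opts in by overriding the hook with its `Default` value.
impl SchemaComponentSketch for i32 {
    fn field_default_json() -> Option<String> {
        Some(i32::default().to_string())
    }
}

/// A type that keeps the provided `None` implementation.
struct Opaque;
impl SchemaComponentSketch for Opaque {}

/// What a derive could do for each field: ask the field's type for its default.
fn default_for_field<T: SchemaComponentSketch>() -> Option<String> {
    T::field_default_json()
}
```

With this shape, a field of type `i32` would pick up `0` while a field of type `Opaque` would get no default unless one is supplied explicitly.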
@gemini-code-assist

Summary of Changes

Hello @martin-augment, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances how default values are handled in Avro schemas generated by the apache-avro Rust library. By introducing a field_default method to the AvroSchemaComponent trait and extending the avro_derive macro, types can now declare their default Avro JSON representation. This change streamlines schema generation, especially for fields with skip_serializing attributes, and provides more granular control over the generated Avro schema defaults, ultimately improving schema compatibility and ease of use for developers.

Highlights

  • Introduced AvroSchemaComponent::field_default: A new trait method allowing types to define their default Avro JSON value when used as a field, providing a standardized way to specify defaults.
  • Enhanced #[avro(default = ...)] attribute: The avro_derive macro now supports specifying default values for types and fields, including #[avro(default = false)] to explicitly disable a default, and #[avro(default = r#"..."#)] for custom JSON values.
  • Automatic default value generation: Primitive types, collections, and logical types (e.g., Duration, Uuid, u64, u128, i128) now automatically provide default Avro JSON values based on their Default implementation.
  • Improved Uuid handling: Fixed a bug where Uuid values could not be converted from string representations when the schema was a fixed UUID, ensuring correct deserialization.
  • Updated documentation and error messages: Clarified usage of #[avro(default = ...)] in documentation and improved error messages for incompatible serde and avro attributes, specifically when skip_serializing is used with #[avro(default = false)].
Changelog
  • avro/src/serde/derive.rs
    • Added constants for common fixed-size default byte arrays.
    • Updated documentation for #[avro(default = ...)] on types and fields, clarifying JSON string encoding and the false option.
    • Introduced fn field_default() -> Option<serde_json::Value> to the AvroSchemaComponent trait.
    • Implemented field_default for various primitive types, String, str, char, Vec, [T; N], HashMap, Option, Duration, Uuid, u64, u128, and i128.
  • avro/src/types.rs
    • Added logic to convert Value::String to Uuid when the schema is a fixed UUID, including byte length validation.
  • avro_derive/src/attributes/avro.rs
    • Added default: Option<String> to ContainerAttributes for type-level defaults.
    • Changed default field in FieldAttributes from Option<String> to the new FieldDefault enum.
  • avro_derive/src/attributes/mod.rs
    • Defined FieldDefault enum (Trait, Disabled, Value(String)) to represent different default value strategies.
    • Implemented FromMeta for FieldDefault to parse #[avro(default = ...)] attributes.
    • Modified NamedTypeOptions::from_attributes to parse type-level #[avro(default = ...)] and store it as a TokenStream.
    • Updated validation logic to check for incompatibility between serde(skip_serializing) and #[avro(default = false)].
  • avro_derive/src/lib.rs
    • Updated create_trait_definition to include the generated field_default_impl token stream.
    • Modified get_struct_schema_def to correctly handle FieldDefault enum variants when generating field defaults.
    • Added type_to_field_default_expr helper function to generate field_default() calls for various types.
  • avro_derive/tests/derive.rs
    • Imported Duration and Uuid for new test cases.
    • Added #[avro(default = false)] to a field in test_basic_struct_with_defaults.
    • Added default attribute to the uuid fixed schema in avro_rs_397_uuid test.
    • Added new test cases avro_rs_476_field_default, avro_rs_476_field_default_false, and avro_rs_476_field_default_provided to verify the new default value functionality.
  • avro_derive/tests/ui/avro_rs_226_skip_serializing.rs
    • Added #[avro(default = false)] to the z field to test the new error message.
  • avro_derive/tests/ui/avro_rs_226_skip_serializing.stderr
    • Updated the expected error message to reflect the change from "require #[avro(default = ..)]" to "are incompatible with #[avro(default = false)]".
  • avro_derive/tests/ui/avro_rs_226_skip_serializing_if.rs
    • Added #[avro(default = false)] to the y field to test the new error message.
  • avro_derive/tests/ui/avro_rs_226_skip_serializing_if.stderr
    • Updated the expected error message similarly to the previous file.
Activity
  • The pull request was created by martin-augment.
  • The author requested review by AI.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai

coderabbitai bot commented Feb 20, 2026

Walkthrough

This pull request introduces comprehensive support for field-level defaults in Avro schema derivation. It adds a new field_default() method to the AvroSchemaComponent trait across the codebase, establishes a FieldDefault enum for attribute parsing, and implements type-specific default value generation for primitives and complex types. The changes include new constants for fixed-size byte sequences, extended derive macro logic to thread default values through trait implementations, and updates to validation to align with the new default-handling semantics. UUID resolution is enhanced to support String-to-Fixed conversion, and test coverage is expanded to validate defaults across various type scenarios.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request introduces a significant enhancement to the AvroSchema derive macro by allowing types to provide default values through the AvroSchemaComponent trait. This aligns the Rust implementation more closely with the Avro specification's support for default values in record fields. The implementation is thorough, covering many standard types and providing compile-time validation for JSON default values provided via attributes. The addition of #[avro(default = false)] to disable defaults is also a useful feature for fine-grained control over schema generation.

None
}

/// If `T` has a field default, this will return an array of with that default. Otherwise there is no default.

medium

There is a small typo in the doc comment: "of with" should be "with".

Suggested change
-/// If `T` has a field default, this will return an array of with that default. Otherwise there is no default.
+/// If `T` has a field default, this will return an array with that default. Otherwise there is no default.

Owner Author

value:valid-but-wont-fix; category:bug; feedback: The Gemini AI reviewer is not correct! The grammar would be better if "elements" is added instead of removing the "of": this will return an array of elements with that default

/// If `T` has a field default, this will return an array of with that default. Otherwise there is no default.
fn field_default() -> Option<serde_json::Value> {
T::field_default().map(|default| {
serde_json::Value::Array(std::array::from_fn::<_, N, _>(|_| default.clone()).to_vec())

medium

Using vec![default; N] is a simpler and more idiomatic way to create a vector with N cloned elements than std::array::from_fn followed by to_vec().

Suggested change
-serde_json::Value::Array(std::array::from_fn::<_, N, _>(|_| default.clone()).to_vec())
+serde_json::Value::Array(vec![default; N])

Owner Author

value:good-to-have; category:bug; feedback: The Gemini AI reviewer is correct! Using the vec! initializer will be shorter and more succinct.
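
The equivalence behind this suggestion can be checked in isolation. The sketch below uses a generic `T: Clone` in place of `serde_json::Value`; the function names `repeat_via_from_fn` and `repeat_via_vec_macro` are hypothetical and exist only for the comparison.

```rust
/// Original shape: build a fixed-size array with `from_fn`, then copy it into a `Vec`.
fn repeat_via_from_fn<T: Clone, const N: usize>(default: &T) -> Vec<T> {
    std::array::from_fn::<_, N, _>(|_| default.clone()).to_vec()
}

/// Suggested shape: `vec![value; n]` clones `value` to fill `n` slots.
fn repeat_via_vec_macro<T: Clone, const N: usize>(default: T) -> Vec<T> {
    vec![default; N]
}
```

Both produce `N` copies of the element. The `vec!` form skips the intermediate array and the second round of clones that `to_vec()` performs.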

serde_json::from_str(&default_value[..]).map_err(|e| {
vec![syn::Error::new(
ident.span(),
format!("Invalid avro default json: \n{e}"),

medium

"Avro" and "JSON" should be capitalized in the error message for consistency and correctness.

Suggested change
-format!("Invalid avro default json: \n{e}"),
+format!("Invalid Avro default JSON: \n{e}"),

Owner Author

value:good-to-have; category:documentation; feedback: The Gemini AI reviewer is correct! The names and abbreviations should be capitalized/upper-cased to emphasise their meaning.

)]
})?;
quote! {
Some(serde_json::from_str(#default_value).expect(format!("Invalid JSON: {:?}", #default_value).as_str()))

medium

Since the JSON string has already been validated at compile time (lines 124-130), using format! and as_str() inside expect is redundant. A simple static message is sufficient.

Suggested change
-Some(serde_json::from_str(#default_value).expect(format!("Invalid JSON: {:?}", #default_value).as_str()))
+Some(serde_json::from_str(#default_value).expect("Invalid JSON"))

Owner Author

value:good-to-have; category:documentation; feedback: The Gemini AI reviewer is correct! There is no need to parse the JSON twice. The successful result could be assigned and quoted instead. This way there won't be a need to expect() and provide a panic message.

@@ -199,7 +208,6 @@ fn get_struct_schema_def(
Some(serde_json::from_str(#default_value).expect(format!("Invalid JSON: {:?}", #default_value).as_str()))

medium

Since the JSON string has already been validated at compile time (lines 200-206), using format! and as_str() inside expect is redundant. A simple static message is sufficient.

Suggested change
-Some(serde_json::from_str(#default_value).expect(format!("Invalid JSON: {:?}", #default_value).as_str()))
+Some(serde_json::from_str(#default_value).expect("Invalid JSON"))

Owner Author

value:good-to-have; category:documentation; feedback: The Gemini AI reviewer is correct! There is no need to parse the JSON twice. The successful result could be assigned and quoted instead. This way there won't be a need to expect() and provide a panic message.

@claude

claude bot commented Feb 20, 2026

PR Review: feat: Allow types to provide default values

Overall this is a well-structured feature that extends AvroSchemaComponent to optionally supply schema field defaults at the type level. The test coverage is thorough. I have some concerns ranging from bugs to design questions worth discussing.


Critical Issues

1. Undocumented Breaking Change

The most significant concern: before this PR, all derived fields produced default: None in the schema (no default). After this PR, primitive types (i32, bool, String, etc.) automatically get defaults (0, false, "", ...) in the schema. This silently changes the schema output for all existing users of #[derive(AvroSchema)].

For example, existing code like:

#[derive(AvroSchema)]
struct MyRecord {
    count: i32,
    name: String,
}

will now produce a schema with "default":0 and "default":"" on those fields — which it did not before. This is a breaking change for any user relying on the absence of defaults, and it should be either gated behind an opt-in attribute or have a prominent semver-bump note.

2. Regression in skip_serializing Validation

The old condition:

// avro_derive/src/attributes/mod.rs
&& avro.default.is_none()

The new condition:

&& avro.default == FieldDefault::Disabled

Previously, using #[serde(skip_serializing)] without any #[avro(default = "...")] was a compile error — which was correct, since a skipped field needs a schema default for Avro compatibility. Now, with FieldDefault::Trait as the implicit default, the error is only raised if the user explicitly disables the default.

If a user's custom type doesn't implement field_default() (returns None by default), using skip_serializing on that field will now silently produce a schema without a default — the opposite of what the validation was meant to catch. This is a regression.
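
The gap described in this item can be modelled without the derive machinery. The sketch below is a simplified mirror of the `FieldDefault` enum named in the PR. `new_check_rejects` restates the new condition quoted above, while `effective_default_missing` is a hypothetical stricter check whose `trait_default` parameter stands in for whatever the field type's `field_default()` would return. A proc macro cannot evaluate that trait method at expansion time, so in practice such a check would presumably have to run when the schema is built.

```rust
/// Simplified mirror of the three strategies named in the PR; the payload is raw JSON text.
#[derive(Debug, Clone, PartialEq)]
enum FieldDefault {
    Trait,
    Disabled,
    Value(String),
}

/// Condition as quoted for the new code: only an explicit opt-out is rejected.
fn new_check_rejects(skips_serializing: bool, default: &FieldDefault) -> bool {
    skips_serializing && *default == FieldDefault::Disabled
}

/// Hypothetical stricter variant: also consider what the trait would actually return.
fn effective_default_missing(
    skips_serializing: bool,
    default: &FieldDefault,
    trait_default: Option<&str>,
) -> bool {
    let has_default = match default {
        FieldDefault::Value(_) => true,
        FieldDefault::Trait => trait_default.is_some(),
        FieldDefault::Disabled => false,
    };
    skips_serializing && !has_default
}
```

A skipped field with `FieldDefault::Trait` whose type returns `None` passes `new_check_rejects` but is flagged by `effective_default_missing`, which is the case this item is concerned about.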

3. Invalid Avro Schema Accepted in Test

In avro_rs_476_field_default_provided:

#[avro(default = "42.0")]
_r: Option<f64>,

The schema type is ["null", "double"]. According to the Avro specification, the default value for a union must be valid for the first type in the union (which is null). Setting "default": 42.0 on this field produces an invalid Avro schema. While this is not a new validation gap (explicit defaults were always accepted as-is), the test appears to treat it as a valid scenario.
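
The first-branch rule, as this review states it, can be expressed as a small check. The sketch is a toy model: `Branch`, `DefaultJson`, `matches_branch`, and `default_valid_for_union` are hypothetical names, only the `null` and `double` branches are covered, and nothing here reflects how the crate validates schemas.

```rust
/// Toy model of the two union branches produced for `Option<f64>`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Branch {
    Null,
    Double,
}

/// Toy model of the JSON default values relevant to those branches.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DefaultJson {
    Null,
    Number(f64),
}

/// Whether a default value is of the kind a given branch accepts.
fn matches_branch(default: DefaultJson, branch: Branch) -> bool {
    matches!(
        (default, branch),
        (DefaultJson::Null, Branch::Null) | (DefaultJson::Number(_), Branch::Double)
    )
}

/// First-branch rule as described in the review: the default must fit `union[0]`.
fn default_valid_for_union(default: DefaultJson, union: &[Branch]) -> bool {
    union
        .first()
        .map_or(false, |first| matches_branch(default, *first))
}
```

Under this rule, `42.0` is rejected for `["null", "double"]` but would be accepted if the branches were ordered `["double", "null"]`.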


Code Quality Issues

4. Typo in Documentation

// avro/src/serde/derive.rs
///    This default value can be overriden by providing a JSON string.

Should be overridden.

5. format! in .expect(): Unnecessary Allocation

In both avro_derive/src/attributes/mod.rs and avro_derive/src/lib.rs, the generated code contains:

serde_json::from_str(#default_value).expect(format!("Invalid JSON: {:?}", #default_value).as_str())

The format! allocates a String on every call, even though this path should never be reached at runtime (the JSON is validated at proc macro expansion time). Prefer a static string, or use unwrap() with a comment explaining why it's safe:

// SAFETY: JSON was validated at compile time by the proc macro
serde_json::from_str(#default_value).unwrap()

6. Misleading Comment

// avro/src/serde/derive.rs
///  - Implement `field_default()` if you want to use `#[serde(skip_serializing{,_if})]`.

field_default() now has a broader purpose — it's used for all auto-defaults, not just skip_serializing. This comment should be updated.


Design Observations

7. char Default Is the Null Character

char::default() is '\0', so char fields get "\u0000" as their schema default. This is arguably surprising — '\0' is a valid Avro string but semantically questionable as a default for a character field. Worth documenting or considering '\0' -> ' ' or similar.

8. Array Default Behavior for Types Without a Default Is Implicit

For fixed-size arrays where the element type's field_default() returns None, no default is emitted. This is correct and consistent, but it's non-obvious. The test shows _p_alt: [Bar; 5] produces no default, while _p: [u8; 5] produces [0,0,0,0,0]. A doc comment on the impl<T, const N: usize> AvroSchemaComponent for [T; N] block would help.
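
The propagation described in this item falls out of `Option::map`. A minimal sketch, assuming the element default is modelled as JSON text instead of `serde_json::Value` (`array_default` is a hypothetical name):

```rust
/// If the element type has a default, repeat it `N` times; otherwise there is no default.
fn array_default<const N: usize>(element_default: Option<String>) -> Option<Vec<String>> {
    element_default.map(|default| vec![default; N])
}
```

An element default of `0` with `N = 5` yields five zeros, matching the `[u8; 5]` case, while `None` stays `None`, matching `[Bar; 5]`.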

9. The #[avro(default = true)] Error Message

FromMeta::from_bool correctly rejects true, but the error "Expected false or a JSON string, got true" might confuse users who think they should write #[avro(default = true)] to enable defaults. Consider phrasing: "Use #[avro(default = false)] to disable defaults, or #[avro(default = \"...\")] to provide a JSON value."


Summary

| Category | Issue |
| --- | --- |
| Breaking | Auto-defaults change schema for all existing derived types |
| Bug | skip_serializing validation regression for types returning None from field_default() |
| Bug | Test exercises invalid Avro schema (non-null default on Option<T>) |
| Quality | format! in expect() allocates unnecessarily in generated code |
| Quality | Typo: "overriden" -> "overridden" |
| Quality | Misleading doc comment on when to implement field_default() |
| Design | char default is null character, possibly surprising |

The feature is useful and the implementation is mostly clean, but the breaking change aspect needs to be addressed — either by making auto-defaults opt-in (e.g., #[avro(auto_defaults)] at the container level) or by clearly documenting this as a breaking change with a major version bump.

@augmentcode

augmentcode bot commented Feb 20, 2026

🤖 Augment PR Summary

Summary: This PR extends schema derivation to support type-provided default values and makes field defaults more flexible.

Changes:

  • Adds AvroSchemaComponent::field_default() as a new hook for providing a default JSON value when a type is used as a record field
  • Extends #[avro(default = ...)] semantics on fields: default-from-trait by default, JSON override via string, and false to omit the default attribute
  • Adds a container-level #[avro(default = ...)] to define a type’s field-default (used by other schemas)
  • Implements field_default() for many built-in/standard types (primitives, strings, arrays/vectors, maps, Option) plus fixed/logical types like Duration and Uuid
  • Updates passthrough schema impls to forward field_default()
  • Adds UUID resolution support for fixed-UUID defaults represented as JSON strings
  • Adds extensive derive tests to validate default propagation, disabling defaults, and explicit default overrides

Technical Notes: Default values are represented as serde_json::Value embedded into the generated schema JSON, and fixed defaults are emitted as JSON strings with NUL bytes (rendered as \u0000 escapes).

@augmentcode augmentcode bot left a comment

Review completed. 3 suggestions posted.

Value::Uuid(Uuid::from_slice(bytes).map_err(Details::ConvertSliceToUuid)?)
}
(Value::String(ref string), UuidSchema::Fixed(_)) => {
let bytes = string.as_bytes();

For UuidSchema::Fixed, using string.as_bytes() treats the JSON default as UTF-8 bytes; this can change both the length and the actual byte values for non-ASCII \u00xx escapes, so fixed/uuid defaults may resolve incorrectly. It may be worth ensuring this follows Avro’s “fixed/bytes default as JSON string of raw bytes” interpretation (1 character → 1 byte).

Severity: medium
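
The difference between the two interpretations is easy to demonstrate with the standard library alone. In the sketch below, `utf8_bytes` is what `str::as_bytes` yields, and `latin1_bytes` is a hypothetical helper for the "1 character → 1 byte" reading; it is not taken from the crate.

```rust
use std::convert::TryFrom;

/// UTF-8 view: the bytes `str::as_bytes` returns.
fn utf8_bytes(s: &str) -> Vec<u8> {
    s.as_bytes().to_vec()
}

/// One-character-one-byte view: each code point must fit in a single byte.
/// Returns `None` if any character is above U+00FF.
fn latin1_bytes(s: &str) -> Option<Vec<u8>> {
    s.chars()
        .map(|c| u8::try_from(u32::from(c)).ok())
        .collect()
}
```

For ASCII-only defaults such as sixteen NUL characters the two views agree, which is why the problem only shows up once a byte value of 0x80 or above is involved: sixteen such characters become 32 bytes under UTF-8 and would fail a 16-byte length check.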

let default_value = match field_attrs.default {
Some(default_value) => {
FieldDefault::Disabled => quote! { None },
FieldDefault::Trait => type_to_field_default_expr(&field.ty)?,

FieldDefault::Trait always emits <FieldType as AvroSchemaComponent>::field_default(), even when the field schema is provided via #[avro(with = ...)]; this can break the documented “foreign type” use-case where the field type itself doesn’t implement AvroSchemaComponent. As-is, derives that previously worked with with may now fail to compile unless the field default is explicitly disabled/provided.

Severity: high

#get_record_fields_impl
}

fn field_default() -> ::std::option::Option<::serde_json::Value> {

The generated impl references ::serde_json::Value directly; consumers deriving AvroSchemaComponent may now need to add an explicit serde_json dependency even if they don’t otherwise use schema-default features. If that’s not intended, it may be worth ensuring the generated path resolves without requiring a direct dependency in downstream crates.

Severity: medium

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
avro_derive/tests/derive.rs (1)

2383-2393: Bar and Spam structs are duplicated across two test functions.

Both avro_rs_476_field_default and avro_rs_476_field_default_false define identical Bar and Spam locally. Moving them to module scope would eliminate the duplication and avoid divergence if either is changed later.

♻️ Proposed refactor: hoist shared types to module scope
+/// Recursive type with no auto-derivable default (used by avro_rs_476_* tests).
+#[derive(AvroSchema)]
+struct AvroRs476Bar {
+    _field: Box<AvroRs476Bar>,
+}
+
+/// Type with a struct-level default (used by avro_rs_476_* tests).
+#[derive(AvroSchema)]
+#[avro(default = r#"{"_field": true}"#)]
+struct AvroRs476Spam {
+    _field: bool,
+}

 #[test]
 fn avro_rs_476_field_default() {
-    #[derive(AvroSchema)]
-    struct Bar {
-        _field: Box<Bar>,
-    }
-
-    #[derive(AvroSchema)]
-    #[avro(default = r#"{"_field": true}"#)]
-    struct Spam {
-        _field: bool,
-    }
     // use AvroRs476Bar and AvroRs476Spam instead ...

Also applies to: 2434-2442

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@avro_derive/tests/derive.rs` around lines 2383 - 2393, The two tests
avro_rs_476_field_default and avro_rs_476_field_default_false duplicate the
local struct definitions Bar and Spam; hoist Bar and Spam to module scope
(declare them once at the top of the test module) and remove the local
definitions in both test functions so both tests reuse the same types (ensure
you keep the AvroSchema derives and the #[avro(default = ...)] attribute on Spam
when moving).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@avro_derive/tests/derive.rs`:
- Around line 2553-2554: The test sets an Avro default on the union field `_r:
Option<f64>` using #[avro(default = "42.0")], which violates Avro union-default
rules because Option<f64> maps to ["null","double"] and defaults must match the
first union branch (null); fix by changing the attribute to #[avro(default =
"null")] for `_r` (or remove the default entirely), or if you intentionally want
an invalid/default-for-second-branch case, keep it but add a clear comment like
`// Invalid per Avro spec` next to `#[avro(default = "42.0")]` to document the
deliberate deviation.
- Around line 1374-1375: Several test structs produce schemas with "default"
entries because the derive uses FieldDefault::Trait by default; update each test
struct used in schema equality assertions—e.g. TestAllSupportedBaseTypes,
TestBasicWithBool, TestSmartPointers, TestNested (or TestInnerStruct),
TestGeneric (and any other fixtures referenced in tests like test_basic_types,
test_inner_struct, test_smart_pointers, avro_3634_test_basic_with_bool,
test_generic_container_1/2)—by adding the attribute #[avro(default = false)] to
every primitive-typed field so the derived schema omits "default" and matches
the hardcoded expected schema fixtures.

Comment on lines +1374 to 1375
#[avro(default = false)]
c: f64,

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the context around the modified line
head -n 1400 avro_derive/tests/derive.rs | tail -n 100

Repository: martin-augment/avro-rs

Length of output: 2648


🏁 Script executed:

# Get the file size to see if we can read it all
wc -l avro_derive/tests/derive.rs

Repository: martin-augment/avro-rs

Length of output: 98


🏁 Script executed:

# Search for TestBasicStructWithDefaultValues to understand the test structure
rg -n "TestBasicStructWithDefaultValues" avro_derive/tests/derive.rs -A 20

Repository: martin-augment/avro-rs

Length of output: 3528


🏁 Script executed:

# Look for other tests that check f64 fields to see if they have defaults
rg -n "test_basic_types|test_inner_struct|avro_3634_test_basic_with_bool|test_smart_pointers" avro_derive/tests/derive.rs | head -20

Repository: martin-augment/avro-rs

Length of output: 411


🏁 Script executed:

# Search for field_default implementations to understand the auto-default mechanism
rg -n "field_default" avro_derive/ --type rust

Repository: martin-augment/avro-rs

Length of output: 15750


🏁 Script executed:

# Look at the FieldDefault enum definition to understand the mechanism
rg -n "enum FieldDefault" avro_derive/ -A 5

Repository: martin-augment/avro-rs

Length of output: 443


🏁 Script executed:

# Search for how FieldDefault is determined - is it opt-in or automatic?
rg -n "FieldDefault::" avro_derive/src/attributes/mod.rs -B 2 -A 2

Repository: martin-augment/avro-rs

Length of output: 308


🏁 Script executed:

# Check the actual test functions mentioned in the search results to understand the feature
sed -n '2383,2450p' avro_derive/tests/derive.rs

Repository: martin-augment/avro-rs

Length of output: 3349


🏁 Script executed:

# Also check the other field_default test
sed -n '2432,2510p' avro_derive/tests/derive.rs

Repository: martin-augment/avro-rs

Length of output: 3189


🏁 Script executed:

# Let's check the test_basic_types function to see if it has been updated
rg -n "fn test_basic_types" avro_derive/tests/derive.rs -A 200 | head -250

Repository: martin-augment/avro-rs

Length of output: 5809


🏁 Script executed:

# Search for schema assertions in test_basic_types
sed -n '263,340p' avro_derive/tests/derive.rs

Repository: martin-augment/avro-rs

Length of output: 1823


🏁 Script executed:

# Check test_inner_struct
sed -n '337,420p' avro_derive/tests/derive.rs

Repository: martin-augment/avro-rs

Length of output: 2446


🏁 Script executed:

# Look for avro_3634_test_basic_with_bool to see if it's been modified
rg -n "fn avro_3634_test_basic_with_bool" avro_derive/tests/derive.rs -A 100 | head -120

Repository: martin-augment/avro-rs

Length of output: 3358


🏁 Script executed:

# Let's count how many tests assert schema equality without defaults
# Look for the pattern where schema strings are compared
rg -n 'get_schema()' avro_derive/tests/derive.rs | wc -l

Repository: martin-augment/avro-rs

Length of output: 68


🏁 Script executed:

# See which tests use assert_eq with schema comparisons but likely don't have defaults in their expected strings
rg -n "assert_eq.*get_schema" avro_derive/tests/derive.rs | head -20

Repository: martin-augment/avro-rs

Length of output: 1302


🏁 Script executed:

# Let's find the definition of TestAllSupportedBaseTypes to see if it has #[avro(default = false)] attributes
rg -n "struct TestAllSupportedBaseTypes" avro_derive/tests/derive.rs -A 12

Repository: martin-augment/avro-rs

Length of output: 294


🏁 Script executed:

# Find TestBasicWithBool definition
rg -n "struct TestBasicWithBool" avro_derive/tests/derive.rs -A 5

Repository: martin-augment/avro-rs

Length of output: 171


🏁 Script executed:

# Check TestSmartPointers which is also in the list of tests that assert schema equality
rg -n "struct TestSmartPointers" avro_derive/tests/derive.rs -A 15

Repository: martin-augment/avro-rs

Length of output: 440


🏁 Script executed:

# Let's see the full test_smart_pointers function
sed -n '909,985p' avro_derive/tests/derive.rs

Repository: martin-augment/avro-rs

Length of output: 1912


🏁 Script executed:

# Let's check if there are any structs with #[avro(default = false)] on multiple fields
rg -n "#\[avro\(default = false\)\]" avro_derive/tests/derive.rs | head -30

Repository: martin-augment/avro-rs

Length of output: 1053


🏁 Script executed:

# Check the total count - are there many instances?
rg -c "#\[avro\(default = false\)\]" avro_derive/tests/derive.rs

Repository: martin-augment/avro-rs

Length of output: 68


🏁 Script executed:

# Let's specifically check if TestAllSupportedBaseTypes is used in multiple tests
rg -n "TestAllSupportedBaseTypes" avro_derive/tests/derive.rs | head -20

Repository: martin-augment/avro-rs

Length of output: 777


🏁 Script executed:

# Now let's understand the actual behavior - check if there's documentation or implementation 
# of when field_default() gets called
rg -n "fn field_default" avro/src/ --type rust -A 3 | head -50

Repository: martin-augment/avro-rs

Length of output: 2627


🏁 Script executed:

# Check what the actual implementation of field_default is for primitive types
rg -n "impl.*AvroSchemaComponent.*for.*bool\|impl.*AvroSchemaComponent.*for.*i32\|impl.*AvroSchemaComponent.*for.*f64" avro/src/ -A 10 | head -80

Repository: martin-augment/avro-rs

Length of output: 48


🏁 Script executed:

# Let's search for field_default implementations for specific primitive types
rg -n "impl AvroSchemaComponent" avro/src/serde/derive.rs | grep -E "bool|i32|f64|String" | head -20

Repository: martin-augment/avro-rs

Length of output: 48


🏁 Script executed:

# Search more broadly for the macro that implements field_default
rg -n "fn field_default" avro/src/serde/derive.rs

Repository: martin-augment/avro-rs

Length of output: 971


🏁 Script executed:

# Let's look at the actual implementations around line 530 where the macro is defined
sed -n '500,600p' avro/src/serde/derive.rs

Repository: martin-augment/avro-rs

Length of output: 3777


🏁 Script executed:

# Check what the avro_schema_impl_for macro does
rg -n "avro_schema_impl_for" avro/src/serde/derive.rs -A 20 | head -100

Repository: martin-augment/avro-rs

Length of output: 48


🏁 Script executed:

# Let's verify the exact behavior by looking at the test that DOES work with auto-defaults
sed -n '2383,2440p' avro_derive/tests/derive.rs

Repository: martin-augment/avro-rs

Length of output: 3144


🏁 Script executed:

# And compare it with a test that should be failing - test_basic_types
# Let's extract just the schema assertion to be clear
sed -n '313p' avro_derive/tests/derive.rs

Repository: martin-augment/avro-rs

Length of output: 130


🏁 Script executed:

# Check if there's any configuration or feature flag that controls auto-defaults
rg -n "cfg\|feature" avro_derive/tests/derive.rs | head -30

Repository: martin-augment/avro-rs

Length of output: 48


🏁 Script executed:

# Let's see if TestAllSupportedBaseTypes is defined with any special attributes
sed -n '246,260p' avro_derive/tests/derive.rs

Repository: martin-augment/avro-rs

Length of output: 314


🏁 Script executed:

# Let's verify by running a test to see if it passes or fails
# First, check if there are any recent changes to these test functions
git log --oneline -n 20 avro_derive/tests/derive.rs

Repository: martin-augment/avro-rs

Length of output: 117


🏁 Script executed:

# Let's look for all structs with primitive types that DON'T have #[avro(default = false)]
# and check what their expected schemas contain
rg -n "assert_eq!\(schema, .*::get_schema\(\)\)" avro_derive/tests/derive.rs | while read line; do
  linenum=$(echo "$line" | cut -d: -f1)
  testname=$(sed -n "1,${linenum}p" avro_derive/tests/derive.rs | grep "fn test_\|fn avro_" | tail -1)
  echo "Line $linenum: $testname"
done | head -30

Repository: martin-augment/avro-rs

Length of output: 1660


🏁 Script executed:

# Let's check a specific case: does the expected schema in test_basic_types contain "default"?
sed -n '264,311p' avro_derive/tests/derive.rs | grep -i "default"

Repository: martin-augment/avro-rs

Length of output: 48


🏁 Script executed:

# Check test_smart_pointers expected schema for defaults
sed -n '910,935p' avro_derive/tests/derive.rs | grep -i "default"

Repository: martin-augment/avro-rs

Length of output: 48


🏁 Script executed:

# Most importantly: check if TestAllSupportedBaseTypes struct definition has any #[avro(default=false)]
sed -n '246,260p' avro_derive/tests/derive.rs | grep "#\[avro"

Repository: martin-augment/avro-rs

Length of output: 48


🏁 Script executed:

# Let's verify the concern by checking if those tests would actually fail
# by looking at the actual vs expected - let's construct what the schemas SHOULD contain
# First, let's see what avro_rs_476_field_default() shows us about the expected schema with defaults

python3 << 'PYEOF'
import json

# From avro_rs_476_field_default test, here's what the schema WITH defaults looks like
with_defaults = r'{"type":"record","name":"Foo","fields":[{"name":"_a","type":"boolean","default":false},{"name":"_b","type":"int","default":0},{"name":"_c","type":"int","default":0},{"name":"_d","type":"int","default":0},{"name":"_e","type":"long","default":0},{"name":"_f","type":"int","default":0},{"name":"_g","type":"int","default":0},{"name":"_h","type":"long","default":0},{"name":"_i","type":"float","default":0.0},{"name":"_j","type":"double","default":0.0},{"name":"_k","type":"string","default":""},{"name":"_l","type":"string","default":""},{"name":"_m","type":"string","default":"\u0000"},{"name":"_n","type":{"type":"record","name":"Spam","fields":[{"name":"_field","type":"boolean","default":false}]},"default":{"_field":true}},{"name":"_o","type":{"type":"array","items":"boolean"},"default":[]},{"name":"_p","type":{"type":"array","items":"int"},"default":[0,0,0,0,0]},{"name":"_p_alt","type":{"type":"array","items":{"type":"record","name":"Bar","fields":[{"name":"_field","type":"Bar"}]}}},{"name":"_q","type":{"type":"map","values":"string"},"default":{}},{"name":"_r","type":["null","double"],"default":null},{"name":"_s","type":{"type":"fixed","name":"duration","size":12,"logicalType":"duration"},"default":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"name":"_t","type":{"type":"fixed","name":"uuid","size":16,"logicalType":"uuid"},"default":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"name":"_u","type":{"type":"fixed","name":"u64","size":8},"default":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"name":"_v","type":{"type":"fixed","name":"u128","size":16},"default":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"name":"_w","type":{"type":"fixed","name":"i128","size":16},"default":"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"},{"name":"_x","type":"Bar"}]}'

# Verify it's valid JSON and pretty-print to see the pattern
schema_obj = json.loads(with_defaults)
print("Fields with auto-defaults (from avro_rs_476_field_default test):")
for field in schema_obj['fields']:
    if 'default' in field:
        print(f"  {field['name']}: {field['type']} -> default={field['default']}")
    else:
        print(f"  {field['name']}: {field['type']} -> NO DEFAULT")
PYEOF

Repository: martin-augment/avro-rs

Length of output: 1412


🏁 Script executed:

# Now let's check: has the feature actually been enabled in the test file?
# Let's look at git diff to see what changed in this PR
git diff HEAD~1 avro_derive/tests/derive.rs | grep -A 5 -B 5 "avro(default = false)" | head -50

Repository: martin-augment/avro-rs

Length of output: 252


🏁 Script executed:

# Check if the feature is controlled by a runtime flag or if it's unconditionally on
rg -n "field_default_impl" avro_derive/src/lib.rs -B 3 -A 3

Repository: martin-augment/avro-rs

Length of output: 556


🏁 Script executed:

# Let's check git status to see if there are uncommitted changes
git status

Repository: martin-augment/avro-rs

Length of output: 131


🏁 Script executed:

# Check the git log to understand the PR better
git log --oneline -n 10

Repository: martin-augment/avro-rs

Length of output: 117


🏁 Script executed:

# Let's look at the actual commit that introduced the field_default feature
git show 13b4e53 --stat

Repository: martin-augment/avro-rs

Length of output: 11279


🏁 Script executed:

# Now let's see what files were modified in that commit
git show 13b4e53 avro_derive/tests/derive.rs | head -200

Repository: martin-augment/avro-rs

Length of output: 6021


🏁 Script executed:

# This is interesting - the entire test file was newly created in this commit
# Let's verify: are test_basic_types and other unchanged tests' expected schemas 
# supposed to have defaults or not?

# Check if there are any updates to test_basic_types or test_smart_pointers 
# in the full commit by looking at the full file as it exists now
sed -n '247,259p' avro_derive/tests/derive.rs

Repository: martin-augment/avro-rs

Length of output: 242


🏁 Script executed:

# Check if the expected schema in test_basic_types includes defaults
sed -n '264,311p' avro_derive/tests/derive.rs | grep -C 2 "default"

Repository: martin-augment/avro-rs

Length of output: 48


🏁 Script executed:

# So test_basic_types expects NO defaults. Let's check if TestAllSupportedBaseTypes 
# has been defined with #[avro(default = false)] on its fields
sed -n '245,260p' avro_derive/tests/derive.rs

Repository: martin-augment/avro-rs

Length of output: 315


🏁 Script executed:

# Now compare to a test that DOES have auto-defaults (avro_rs_476)
# which doesn't use #[avro(default = false)]
sed -n '2405,2440p' avro_derive/tests/derive.rs | grep -E "struct|#\[avro|_[a-z]:"

Repository: martin-augment/avro-rs

Length of output: 478


🏁 Script executed:

# Let's verify the core concern: does the new mechanism apply auto-defaults 
# based on the FieldDefault enum default value?
rg -n "FieldDefault::" avro_derive/src/lib.rs -B 3 -A 3

Repository: martin-augment/avro-rs

Length of output: 608


🏁 Script executed:

# Check how field_default is actually applied in the schema generation
rg -n "field_default\(\)" avro_derive/src/lib.rs -B 2 -A 2

Repository: martin-augment/avro-rs

Length of output: 524


🏁 Script executed:

# See the exact context where field_default is called
sed -n '180,220p' avro_derive/src/lib.rs

Repository: martin-augment/avro-rs

Length of output: 2316


🏁 Script executed:

# Count tests that assert schema equality and have primitive types but NO #[avro(default = false)]
# These are the ones that will fail

# First, let's list test names that check schemas
rg -n "fn test_|fn avro_" avro_derive/tests/derive.rs | grep -E "^[0-9]+:" | head -50

Repository: martin-augment/avro-rs

Length of output: 2518


🏁 Script executed:

# Now let's verify by checking a few key test structs to see if they have #[avro(default = false)]
# Check TestNested
sed -n '330,333p' avro_derive/tests/derive.rs

# Check TestOptional
sed -n '420,423p' avro_derive/tests/derive.rs

# Check TestGeneric
sed -n '453,458p' avro_derive/tests/derive.rs

Repository: martin-augment/avro-rs

Length of output: 427


Add #[avro(default = false)] to all test struct fields to match hardcoded expected schemas.

The auto-default feature enables FieldDefault::Trait by default, which calls field_default() on all field types. All primitive types (bool, i8, i16, i32, i64, u8, u16, u32, f32, f64, String, char) return Some(default_value), so derived schemas will automatically include "default" fields. However, the expected schemas in most tests—test_basic_types, test_inner_struct, test_smart_pointers, avro_3634_test_basic_with_bool, test_generic_container_1, test_generic_container_2, and others—do not include defaults. Test structs like TestAllSupportedBaseTypes, TestBasicWithBool, TestSmartPointers, TestNested, and TestGeneric lack #[avro(default = false)] on their primitive-typed fields, causing schema mismatches.

Add #[avro(default = false)] to all fields in test structs that are used in schema equality assertions where the expected schema omits default values. This applies to all structs with primitive fields that lack explicit default annotations in their hardcoded schema fixtures.
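
To make the mismatch concrete, here is a minimal, self-contained sketch of how the three default settings could resolve when a field is rendered. It is a model only, not the derive's actual code: the `SchemaComponent` trait, the `Value` variant, and `render_field` are hypothetical names, and JSON is kept as plain text to avoid external crates. Only `FieldDefault::Trait`, `FieldDefault::Disabled`, and `field_default()` come from the PR.

```rust
/// Hypothetical stand-in for the derive's per-field default setting.
#[derive(Debug, Clone, PartialEq)]
enum FieldDefault {
    /// `#[avro(default = false)]`: never emit a default.
    Disabled,
    /// No attribute given: ask the field type via `field_default()`.
    Trait,
    /// `#[avro(default = "..")]`: use the given JSON text as-is (assumed variant name).
    Value(String),
}

/// Simplified model of `AvroSchemaComponent`; JSON is plain text here.
trait SchemaComponent {
    fn type_name() -> &'static str;

    /// Mirrors the PR's default implementation: no default unless overridden.
    fn field_default() -> Option<String> {
        None
    }
}

impl SchemaComponent for i32 {
    fn type_name() -> &'static str {
        "int"
    }

    fn field_default() -> Option<String> {
        Some("0".to_string())
    }
}

/// Renders one record field, adding "default" only when one resolves.
fn render_field<T: SchemaComponent>(name: &str, setting: &FieldDefault) -> String {
    let default = match setting {
        FieldDefault::Disabled => None,
        FieldDefault::Trait => T::field_default(),
        FieldDefault::Value(json) => Some(json.clone()),
    };
    match default {
        Some(json) => format!(
            r#"{{"name":"{}","type":"{}","default":{}}}"#,
            name,
            T::type_name(),
            json
        ),
        None => format!(r#"{{"name":"{}","type":"{}"}}"#, name, T::type_name()),
    }
}
```

Under this model, an unannotated `i32` field resolves through `FieldDefault::Trait` and gains `"default":0`, which is why fixtures written without defaults stop matching unless the field is marked `#[avro(default = false)]`.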

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@avro_derive/tests/derive.rs` around lines 1374 - 1375, Several test structs
produce schemas with "default" entries because the derive uses
FieldDefault::Trait by default; update each test struct used in schema equality
assertions—e.g. TestAllSupportedBaseTypes, TestBasicWithBool, TestSmartPointers,
TestNested (or TestInnerStruct), TestGeneric (and any other fixtures referenced
in tests like test_basic_types, test_inner_struct, test_smart_pointers,
avro_3634_test_basic_with_bool, test_generic_container_1/2)—by adding the
attribute #[avro(default = false)] to every primitive-typed field so the derived
schema omits "default" and matches the hardcoded expected schema fixtures.

Owner Author

value:good-to-have; category:bug; feedback: The CodeRabbit AI reviewer is correct! The Pull Request proposes to add a "default" attribute to the fields of all generated schemas: when the application developer does not specify a default explicitly, the type's default is used. This is not the right approach. It would be better not to add a "default" attribute at all unless the user application asks for one, i.e. add the default attribute to the schema only when a field is annotated with #[avro(default = ...)] or AvroSchemaComponent::field_default() returns Some.

Comment on lines +2553 to +2554
#[avro(default = "42.0")]
_r: Option<f64>,

⚠️ Potential issue | 🟠 Major

Spec-violating union default: 42.0 is not a valid default for ["null","double"].

The Avro specification states that "when a default value is specified for a record field whose type is a union, the type of the default value must match the first element of the union. Thus, for unions containing null, the null is usually listed first, since the default value of such unions is typically null."

_r: Option<f64> maps to ["null","double"] (first type: null). The explicit "42.0" default is a double value matching the second type, which is spec-invalid. Default values for union fields correspond to the first schema in the union. While the test happens to pass (it only calls serde_json::to_string, which doesn't validate union default semantics), a schema registry or any strict Avro consumer will reject this schema.

The test should either use #[avro(default = "null")], or demonstrate non-null defaults only on non-union types (which are already well-covered in this test). If the intent is to document that #[avro(default = "..")] deliberately bypasses spec validation, that should at minimum carry a // Invalid per Avro spec comment.
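
The first-branch rule quoted above can be sketched as a small standalone check. This is an illustration under stated assumptions, not the crate's validation code: the `Json` and `Schema` enums below are deliberately tiny hypothetical models, and `default_is_valid` and `option_double` are names invented for this example.

```rust
/// Minimal JSON value model, just enough for this illustration.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
}

/// Minimal subset of Avro schema kinds (hypothetical model).
#[derive(Debug, Clone, PartialEq)]
enum Schema {
    Null,
    Boolean,
    Double,
    String,
    Union(Vec<Schema>),
}

/// Returns true if `default` is acceptable as a field default for `schema`.
/// For unions, only the first branch is consulted, per the rule quoted above.
fn default_is_valid(schema: &Schema, default: &Json) -> bool {
    match (schema, default) {
        (Schema::Null, Json::Null) => true,
        (Schema::Boolean, Json::Bool(_)) => true,
        (Schema::Double, Json::Number(_)) => true,
        (Schema::String, Json::Str(_)) => true,
        (Schema::Union(branches), _) => branches
            .first()
            .map_or(false, |first| default_is_valid(first, default)),
        _ => false,
    }
}

/// The shape `Option<f64>` maps to: ["null","double"].
fn option_double() -> Schema {
    Schema::Union(vec![Schema::Null, Schema::Double])
}
```

With this check, `Json::Null` is accepted for `option_double()` while `Json::Number(42.0)` is rejected, even though the same number is a valid default for a plain `Schema::Double` field.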

🛡️ Proposed fix
-        #[avro(default = "42.0")]
-        _r: Option<f64>,
+        // Option<f64> maps to ["null","double"]; spec-valid default must be null
+        #[avro(default = "null")]
+        _r: Option<f64>,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@avro_derive/tests/derive.rs` around lines 2553 - 2554, The test sets an Avro
default on the union field `_r: Option<f64>` using #[avro(default = "42.0")],
which violates Avro union-default rules because Option<f64> maps to
["null","double"] and defaults must match the first union branch (null); fix by
changing the attribute to #[avro(default = "null")] for `_r` (or remove the
default entirely), or if you intentionally want an
invalid/default-for-second-branch case, keep it but add a clear comment like `//
Invalid per Avro spec` next to `#[avro(default = "42.0")]` to document the
deliberate deviation.

Owner Author

value:useful; category:bug; feedback: The CodeRabbit AI reviewer is correct! Per the Avro specification, the type of the default value must match the first variant of a union schema. Since Option<Xyz> generates a union schema whose first variant is Schema::Null and whose second is the appropriate schema for Xyz, the only valid default value here is "null".

@martin-augment
Owner Author

3. Invalid Avro Schema Accepted in Test

In avro_rs_476_field_default_provided:

#[avro(default = "42.0")]
_r: Option<f64>,

The schema type is ["null", "double"]. According to the Avro specification, the default value for a union must be valid for the first type in the union (which is null). Setting "default": 42.0 on this field produces an invalid Avro schema. While this is not a new validation gap (explicit defaults were always accepted as-is), the test appears to treat it as a valid scenario.

value:useful; category:bug; feedback: The Claude AI reviewer is correct! Per the Avro specification, the type of the default value must match the first variant of a union schema. Since Option<Xyz> generates a union schema whose first variant is Schema::Null and whose second is the appropriate schema for Xyz, the only valid default value here is "null".

@martin-augment
Owner Author

1. Undocumented Breaking Change

The most significant concern: before this PR, all derived fields produced default: None in the schema (no default). After this PR, primitive types (i32, bool, String, etc.) automatically get defaults (0, false, "", ...) in the schema. This silently changes the schema output for all existing users of #[derive(AvroSchema)].

For example, existing code like:

#[derive(AvroSchema)]
struct MyRecord {
    count: i32,
    name: String,
}

will now produce a schema with "default":0 and "default":"" on those fields — which it did not before. This is a breaking change for any user relying on the absence of defaults, and it should be either gated behind an opt-in attribute or have a prominent semver-bump note.

value:good-to-have; category:bug; feedback: The Claude AI reviewer is correct! The Pull Request proposes to add a "default" attribute to the fields of all generated schemas: when the application developer does not specify a default explicitly, the type's default is used. This is not the right approach. It would be better not to add a "default" attribute at all unless the user application asks for one, i.e. add the default attribute to the schema only when a field is annotated with #[avro(default = ...)] or AvroSchemaComponent::field_default() returns Some.

@martin-augment
Owner Author

2. Regression in skip_serializing Validation

The old condition:

// avro_derive/src/attributes/mod.rs
&& avro.default.is_none()

The new condition:

&& avro.default == FieldDefault::Disabled

Previously, using #[serde(skip_serializing)] without any #[avro(default = "...")] was a compile error — which was correct, since a skipped field needs a schema default for Avro compatibility. Now, with FieldDefault::Trait as the implicit default, the error is only raised if the user explicitly disables the default.

If a user's custom type doesn't implement field_default() (returns None by default), using skip_serializing on that field will now silently produce a schema without a default — the opposite of what the validation was meant to catch. This is a regression.
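
One possible shape for a stricter check is sketched below. It is a hypothetical model, not the derive's code: `check_skip_serializing` and the `Value` variant are invented names, and `trait_default` stands in for whatever the field type's `field_default()` returns. Since a proc macro cannot evaluate a trait method during expansion, where such a check would run (macro expansion versus schema construction) is left open here.

```rust
/// Hypothetical stand-in for the derive's per-field default setting.
#[derive(Debug, Clone, PartialEq)]
enum FieldDefault {
    /// `#[avro(default = false)]`
    Disabled,
    /// No attribute: defer to the type's `field_default()`.
    Trait,
    /// `#[avro(default = "..")]` (assumed variant name).
    Value(String),
}

/// Rejects a `skip_serializing` field that would end up without a schema default.
/// `trait_default` models the result of the field type's `field_default()`.
fn check_skip_serializing(
    skip_serializing: bool,
    setting: &FieldDefault,
    trait_default: Option<&str>,
) -> Result<(), String> {
    if !skip_serializing {
        return Ok(());
    }
    let has_default = match setting {
        FieldDefault::Disabled => false,
        // The regression case: `Trait` alone does not guarantee a default.
        FieldDefault::Trait => trait_default.is_some(),
        FieldDefault::Value(_) => true,
    };
    if has_default {
        Ok(())
    } else {
        Err("skip_serializing requires a schema default for this field".to_string())
    }
}
```

The difference from the new condition quoted above is the `Trait` arm: it passes only when the type actually supplies a default, so a custom type that keeps the `None` implementation is still rejected.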

value:useful; category:bug; feedback: The Claude AI reviewer is correct! If AvroSchemaComponent is used to derive the schema, then the return value of its field_default() should be used to check whether there is a default, and thus whether skip_serializing can be used.
