351: fix: Don't depend on Serde to provide fields in the right order by martin-augment · Pull Request #9 · martin-augment/avro-rs

martin-augment · 2025-12-08T06:55:34Z

351: To review by AI

coderabbitai · 2025-12-08T06:55:44Z

Walkthrough

This pull request modifies Avro serialization to handle out-of-order struct fields and enforce default values for skipped fields. Changes include updating the SerializeRecordFieldWithSchema error variant to use owned String field names instead of &'static str, and introducing a new MissingDefaultForSkippedField error variant. The SchemaAwareWriteSerializeStruct now buffers out-of-order field data using field_cache and next_field tracking. The end-of-struct logic validates that all fields are either written or have defaults, returning appropriate errors when defaults are missing. Tests are updated to annotate skipped fields with #[avro(default)] attributes and verify error handling for missing defaults.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch pr-351-2025-12-08-06-55-31

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

augmentcode

Review completed. No suggestions at this time.

Comment augment review to trigger a new review at any time.

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 36c861a and a186161.

📒 Files selected for processing (3)

avro/src/error.rs (1 hunks)
avro/src/ser_schema.rs (8 hunks)
avro/tests/avro-rs-226.rs (5 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

avro/src/ser_schema.rs (4)

avro/src/error.rs (1)

new (38-42)

avro/src/writer.rs (2)

new (95-97)

schema (171-173)

avro/src/ser.rs (13)

new (56-63)

new (67-77)

new (81-88)

new (92-96)

new (100-106)

end (312-314)

end (328-330)

end (344-346)

end (364-372)

end (386-388)

end (418-427)

end (445-447)

end (465-476)

avro/src/schema.rs (5)

new (253-259)

new (375-377)

name (379-381)

name (1208-1216)

schema (7114-7114)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)

GitHub Check: Cursor Bugbot
GitHub Check: clippy_check (stable)
GitHub Check: ci (1.86.0, ubuntu-24.04, x86_64-unknown-linux-gnu)
GitHub Check: ci (stable, ubuntu-24.04-arm, aarch64-unknown-linux-gnu)
GitHub Check: ci (beta, ubuntu-24.04-arm, wasm32-unknown-unknown)
GitHub Check: claude-review
GitHub Check: ci (stable, ubuntu-24.04, wasm32-unknown-unknown)
GitHub Check: interop
GitHub Check: web-assembly

🔇 Additional comments (8)

avro/src/error.rs (1)

517-525: LGTM! Error variant changes are well-structured.

The change from &'static str to String for field_name properly supports dynamic field names from schema lookups. The new MissingDefaultForSkippedField variant provides clear error messaging for skipped fields without defaults.

avro/src/ser_schema.rs (4)

252-256: LGTM! Out-of-order field buffering design.

The field_cache and next_field tracking enables proper handling of fields that arrive out of schema order. The approach is straightforward and maintains correctness.

272-313: Well-implemented in-order vs out-of-order field handling.

The logic correctly:

Writes fields directly when in-order (fast path)

Buffers out-of-order fields for later emission

Flushes buffered fields as soon as their position is reached

The linear search in field_cache is acceptable for typical record sizes.

390-417: LGTM! skip_field correctly delegates to serialize_next_field.

The implementation properly:

Looks up the skipped field in the schema

Serializes the default value through serialize_next_field (maintaining field order)

Returns MissingDefaultForSkippedField when no default is available

2982-3026: Good test coverage for out-of-order field serialization.

The test correctly validates the core feature: a Rust struct with fields (a, b) serialized against a schema with fields (b, a). The roundtrip through Writer/Reader confirms correct behavior.

avro/tests/avro-rs-226.rs (3)

47-49: LGTM! Added required default for skip_serializing_if field.

The #[avro(default = "null")] annotation is necessary because the new implementation requires defaults for fields that may be skipped during serialization.

108-122: LGTM! Selective default annotations are correctly applied.

The defaults are appropriately added only to fields that may be skipped during serialization:

skip_serializing and skip_serializing_if: need #[avro(default = "null")]

skip_deserializing: correctly has no default (it's still serialized)

skip: uses #[avro(skip)] to exclude from schema entirely

137-201: LGTM! Comprehensive panic tests for missing defaults.

The three #[should_panic] tests properly validate error handling for:

skip_serializing_if without a default

skip without a default

skip_serializing without a default

The expected messages correctly match the MissingDefaultForSkippedField error format.

coderabbitai · 2025-12-08T06:58:45Z

avro/src/ser_schema.rs

+        // Check if all fields were written
+        if self.next_field != self.record_schema.fields.len() {
+            let name = self.record_schema.fields[self.next_field].name.clone();
+            return Err(Details::GetField(name).into());
+        }


⚠️ Potential issue | 🟡 Minor

Dead code: This check can never trigger.

The while loop at line 317 only exits when self.next_field == self.record_schema.fields.len(). Therefore, the condition at line 348 is always false and this block is unreachable.

Remove the redundant check:

} - // Check if all fields were written - if self.next_field != self.record_schema.fields.len() { - let name = self.record_schema.fields[self.next_field].name.clone(); - return Err(Details::GetField(name).into()); - } assert!( self.field_cache.is_empty(), "There should be no more unwritten fields at this point" ); Ok(self.bytes_written)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// Check if all fields were written

if self.next_field != self.record_schema.fields.len() {

let name = self.record_schema.fields[self.next_field].name.clone();

return Err(Details::GetField(name).into());

}

}

assert!(

self.field_cache.is_empty(),

"There should be no more unwritten fields at this point"

);

Ok(self.bytes_written)

🤖 Prompt for AI Agents

In avro/src/ser_schema.rs around lines 347 to 351, the final check that returns Err(Details::GetField(...)) is dead code because the preceding while loop always exits only when self.next_field == self.record_schema.fields.len(); remove this redundant if-block entirely (delete the lines that fetch the field name and return the error) and ensure any trailing code and formatting remain consistent after its removal.

value:useful; category:bug; feedback:The CodeRabbit AI reviewer is correct! There is no need to check that the tracked position and the number of fields are the same because this is already checked in the earlier whilez condition. Prevents adding unused code.

claude · 2025-12-08T06:59:48Z

Summary: This PR fixes issue apache#351 by handling cases where Serde serializes struct fields in a different order than the Avro schema expects.

Positive Aspects:

Well-designed core algorithm with minimal performance impact for in-order fields
Comprehensive test coverage including edge cases and error scenarios
Good error handling with new MissingDefaultForSkippedField error
Clear documentation

Issues Found:

CRITICAL - Dead Code (ser_schema.rs:347-350): The check "if self.next_field != self.record_schema.fields.len()" is unreachable because the while loop above already ensures this condition. Remove this redundant check.
Performance Issue (ser_schema.rs:287-290, 319-322): Linear search O(n) on field_cache for every field could cause O(n²) complexity with many out-of-order fields. Recommend using HashMap<usize, Vec> instead of Vec<(usize, Vec)> for O(1) lookups.
Production Assert (ser_schema.rs:352-355): Using assert! in library code can panic. Replace with debug_assert! or proper error handling.
Breaking Change (error.rs:518): Changing field_name from &static str to String is a breaking change. Document in release notes if public API.
Memory Concern: Unbounded buffering for out-of-order fields could consume significant memory with large records.

Performance Analysis:

Best case (in-order fields): Excellent - zero overhead
Worst case (reverse order): O(n²) complexity
Memory: Unbounded growth for out-of-order fields

Recommendations:
MUST FIX: Remove dead code check, replace assert with debug_assert or error
SHOULD FIX: Use HashMap for O(n) performance, add performance documentation
NICE TO HAVE: More edge case tests, performance benchmarks

Overall: Solid implementation with sound algorithm. Ready to merge after fixing the assert and dead code. HashMap optimization can be follow-up work.

martin-augment · 2025-12-08T09:32:45Z

2. Performance Issue (ser_schema.rs:287-290, 319-322): Linear search O(n) on field_cache for every field could cause O(n²) complexity with many out-of-order fields. Recommend using HashMap<usize, Vec> instead of Vec<(usize, Vec)> for O(1) lookups.

value:good-to-have; category:bug; feedback:The Claude AI reviewer is correct! Using a HashMap will be both more ergonomic and succinct. There won't be a need to iterate over the items to find but it could use a map lookup directly. Prevents suboptimal performance

martin-augment · 2025-12-08T09:35:03Z

CRITICAL - Dead Code (ser_schema.rs:347-350): The check "if self.next_field != self.record_schema.fields.len()" is unreachable because the while loop above already ensures this condition. Remove this redundant check.

value:useful; category:bug; feedback:The Claude AI reviewer is correct! There is no need to check that the tracked position and the number of fields are the same because this is already checked in the earlier whilez condition. Prevents adding unused code.

martin-augment · 2025-12-08T09:37:17Z

3. Production Assert (ser_schema.rs:352-355): Using assert! in library code can panic. Replace with debug_assert! or proper error handling.

value:good-to-have; category:bug; feedback:The Claude AI reviewer is correct! A software library should avoid panicking if possible. In the current case it should return an Err like for any other problem earlier or to use debug_assertion!() macro that will report the problem in dev mode only.

fix: Different field order between Serde and the Schema

a186161

augmentcode bot reviewed Dec 8, 2025

View reviewed changes

coderabbitai bot reviewed Dec 8, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

351: fix: Don't depend on Serde to provide fields in the right order#9

351: fix: Don't depend on Serde to provide fields in the right order#9
martin-augment wants to merge 1 commit intomainfrom
pr-351-2025-12-08-06-55-31

martin-augment commented Dec 8, 2025

Uh oh!

coderabbitai bot commented Dec 8, 2025 •

edited

Loading

Uh oh!

augmentcode bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Dec 8, 2025

Uh oh!

martin-augment Dec 8, 2025

Uh oh!

claude bot commented Dec 8, 2025

Uh oh!

martin-augment commented Dec 8, 2025

Uh oh!

martin-augment commented Dec 8, 2025

Uh oh!

martin-augment commented Dec 8, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

martin-augment commented Dec 8, 2025

Uh oh!

coderabbitai bot commented Dec 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Uh oh!

augmentcode bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 8, 2025

Choose a reason for hiding this comment

Uh oh!

martin-augment Dec 8, 2025

Choose a reason for hiding this comment

Uh oh!

claude bot commented Dec 8, 2025

Uh oh!

martin-augment commented Dec 8, 2025

Uh oh!

martin-augment commented Dec 8, 2025

Uh oh!

martin-augment commented Dec 8, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai bot commented Dec 8, 2025 •

edited

Loading