3445: AVRO-3520: [java] expose read schema in custom encoding#19
3445: AVRO-3520: [java] expose read schema in custom encoding#19martin-augment wants to merge 2 commits intomainfrom
Conversation
WalkthroughAdds schema-aware hooks to Avro's custom encoding reflection framework. Introduces a Changes
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (6)
🧰 Additional context used🧬 Code graph analysis (3)lang/java/avro/src/main/java/org/apache/avro/reflect/FieldAccessReflect.java (1)
lang/java/avro/src/test/java/org/apache/avro/reflect/TestCustomEncoderSchemaChanges.java (2)
lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
🔇 Additional comments (4)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| } | ||
| return result; | ||
| } | ||
|
|
There was a problem hiding this comment.
Bug: Schema-less accessors break custom encoding evolution
The getAccessorFor method returns accessors from the byName map which are created without schema information in the constructor. When custom encodings are accessed through getFieldAccessor, their withSchema method is never called, causing them to use default schemas instead of the actual field schemas. This breaks schema evolution for custom encodings accessed by field name.
Code Review: AVRO-3520 - Expose read schema in custom encodingSummaryThis PR successfully implements a mechanism to expose the read schema to custom encoders, enabling them to adapt to schema evolution. The implementation is solid with good test coverage. Below are detailed findings: ✅ Strengths
|
| * opportunity to detect schema changes and behave accordingly. Useful for | ||
| * maintaining backwards compatibility. | ||
| * | ||
| * @param schema the schema detected during read. |
There was a problem hiding this comment.
The Javadoc says "schema detected during read", but withSchema is also called on write paths (e.g., ReflectData.populateEncoderCache and in field access for writing), so this phrasing can be misleading. Consider generalizing to "active schema (read or write)" to reflect actual usage.
🤖 Was this useful? React with 👍 or 👎
There was a problem hiding this comment.
value:useful; category:documentation; feedback: The Augment AI reviewer is correct that the javadoc should be updated to not mention read or write schema. The method is used with both.
|
value:annoying; category:bug; feedback: The Codex AI reviewer talks about classes which are not changed in this Pull Request |
value:useful; category:bug; feedback: The Claude AI reviewer is correct that |
value:useful; category:bug; feedback: The Claude AI reviewer is correct that the new method could easily lead to concurrency issues if the override just does something like |
3445: To review by AI
Note
Plumbs read schema into custom encoders and field access so encoders can adapt to schema changes, with tests validating schema-dependent read/write behavior.
CustomEncoding.withSchema(Schema)to provide the active schema to encoders.ReflectDatato create schema-specific field accessors and to initialize top-level custom encoders viawithSchema(schema); introducebuildByNameto bind accessors per schema.FieldAccess#getAccessorto acceptSchemaand propagate field-level schema inFieldAccessReflect, wiring custom encoders withwithSchema.ReflectionUtilvalidation to use schema-aware accessors.TestCustomEncoderSchemaChangescovering top-level and nested custom encoders adapting to differing schemas (string vs int and field count).Written by Cursor Bugbot for commit f378f08. This will update automatically on new commits. Configure here.