3622: [C] Fix negative length validation in binary decoding#27
3622: [C] Fix negative length validation in binary decoding#27martin-augment wants to merge 1 commit intomainfrom
Conversation
The read_bytes() and read_string() functions in encoding_binary.c
decode length values using zigzag encoding, which can produce negative
numbers from malicious input. These negative values were passed directly
to avro_malloc(), causing allocation failures or undefined behavior.
This patch adds validation to reject negative length values with a
clear error message before attempting memory allocation.
Bug: Negative length values from varint decoding cause
allocation-size-too-big when cast to size_t
Impact: DoS via crafted binary input
Co-Authored-By: Claude <noreply@anthropic.com>
WalkthroughThe changes add validation checks to prevent negative length values from being processed in the encoding binary module. Specifically, the ✨ Finishing touches
📜 Recent review detailsConfiguration used: Organization UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
🔇 Additional comments (2)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Code Review - PR #27: [C] Fix negative length validation in binary decodingSummaryThis PR addresses a critical security vulnerability where negative length values from zigzag-decoded varints could be passed to ✅ StrengthsSecurity Fix - Critical
Code Quality
|
🤖 Augment PR SummarySummary: Adds explicit validation to reject negative lengths while decoding Avro binary Technical Notes: On malformed negative lengths, decoding now sets an error message and fails fast with 🤖 Was this summary useful? React with 👍 or 👎 |
| int rval; | ||
| check_prefix(rval, read_long(reader, len), | ||
| "Cannot read bytes length: "); | ||
| if (*len < 0) { |
There was a problem hiding this comment.
There was a problem hiding this comment.
value:useful; category:security; feedback:The Augment AI reviewer is correct! The same kind of problem as the fixed ones could be seen in the skip_bytes() method. It should be fixed the same way as the others. Prevents a security attack by crafting an Avro file with negative length for a field with a Bytes schema.
| avro_set_error("Invalid bytes length: %" PRId64, *len); | ||
| return EINVAL; | ||
| } | ||
| *bytes = (char *) avro_malloc(*len + 1); |
There was a problem hiding this comment.
avro_malloc is a thin wrapper over realloc and doesn’t guard size overflows; if *len is extremely large, *len + 1 can overflow (UB) and/or truncate when converted to size_t. Consider adding an upper-bound check (also applies to read_string’s str_len + 1) before computing the allocation size.
🤖 Was this useful? React with 👍 or 👎
There was a problem hiding this comment.
value:good-to-have; category:security; feedback:The Augment AI reviewer is correct! A specially crafted Avro file with a really big length value will lead to trying to allocate a lot of memory and possible crash due to out-of-memory error.
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lang/c/src/encoding_binary.c (1)
144-152: Missing negative length validation inskip_bytes.The same malicious varint input that could produce negative lengths in
read_bytescan also affectskip_bytes. Here,lenis passed directly toAVRO_SKIPwithout validation, which could cause undefined behavior depending on how the macro handles negative values.For consistency and completeness of the security fix, consider adding the same validation:
Proposed fix
static int skip_bytes(avro_reader_t reader) { int64_t len = 0; int rval; check_prefix(rval, read_long(reader, &len), "Cannot read bytes length: "); + if (len < 0) { + avro_set_error("Invalid bytes length: %" PRId64, len); + return EINVAL; + } AVRO_SKIP(reader, len); return 0; }
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
lang/c/src/encoding_binary.c
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Cursor Bugbot
🔇 Additional comments (2)
lang/c/src/encoding_binary.c (2)
130-133: LGTM! Correctly validates negative length before allocation.The validation properly rejects negative lengths decoded from malicious varint input, preventing the undefined behavior that would occur when casting a negative
int64_ttosize_tforavro_malloc().
182-185: LGTM! Consistent validation withread_bytes.The check correctly guards against negative string lengths before buffer allocation.
value:useful; category:security; feedback:The Claude AI reviewer is correct! The same kind of problem as the fixed ones could be seen in the skip_bytes() method. It should be fixed the same way as the others. Prevents a security attack by crafting an Avro file with negative length for a field with a Bytes schema. |
value:good-to-have; category:security; feedback:The Claude AI reviewer is correct! A specially crafted Avro file with a really big length value will lead to trying to allocate a lot of memory and possible crash due to out-of-memory error. |
value:useful; category:security; feedback:The CodeRabbit AI reviewer is correct! The same kind of problem as the fixed ones could be seen in the skip_bytes() method. It should be fixed the same way as the others. Prevents a security attack by crafting an Avro file with negative length for a field with a Bytes schema. |
3622: To review by AI