-
Notifications
You must be signed in to change notification settings - Fork 0
3646: AVRO-4228: [c++] Fix BinaryDecoder::arrayNext() to handle negative block counts #33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -170,7 +170,7 @@ size_t BinaryDecoder::doDecodeItemCount() { | |
| } | ||
|
|
||
| size_t BinaryDecoder::arrayNext() { | ||
| return static_cast<size_t>(doDecodeLong()); | ||
| return doDecodeItemCount(); | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The call to
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. value:delightful; category:security; feedback: The Gemini AI reviewer is correct! If the count value is size_t's min then the negation will lead to an overflow and even more issues because of this. Prevents a vulnerability if an attacker crafts an Avro file with such min count. |
||
| } | ||
|
|
||
| size_t BinaryDecoder::skipArray() { | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2100,6 +2100,42 @@ static void testJsonCodecReinit() { | |
| } | ||
| } | ||
|
|
||
| static void testArrayNegativeBlockCount() { | ||
| // Array of ints [10, 20, 30, 40, 50] encoded with a negative block count. | ||
| // Per the Avro spec, a negative count means: abs(count) items follow, | ||
| // preceded by a long byte-size of the block. | ||
| // | ||
| // Block 1: count=-3, bytesize=3, items: 10, 20, 30 | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This test puts the negative block count in the first block (handled by Severity: medium 🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. value:useful; category:bug; feedback: The Augment AI reviewer is correct! The new test passes even without the fix, so it does not prevent regressions at all. The test should be improved to use a negative count for the non-first block(s) too. |
||
| // Block 2: count=2, items: 40, 50 | ||
| // Terminal: count=0 | ||
| const uint8_t data[] = { | ||
| 0x05, // zigzag(-3) = 5 | ||
| 0x06, // zigzag(3) = 6 (byte-size of block) | ||
| 0x14, 0x28, 0x3c, // zigzag ints: 10, 20, 30 | ||
| 0x04, // zigzag(2) = 4 | ||
| 0x50, 0x64, // zigzag ints: 40, 50 | ||
| 0x00 // terminal | ||
| }; | ||
|
|
||
| InputStreamPtr is = memoryInputStream(data, sizeof(data)); | ||
| DecoderPtr d = binaryDecoder(); | ||
| d->init(*is); | ||
|
|
||
| std::vector<int32_t> result; | ||
| for (size_t n = d->arrayStart(); n != 0; n = d->arrayNext()) { | ||
| for (size_t i = 0; i < n; ++i) { | ||
| result.push_back(d->decodeInt()); | ||
| } | ||
| } | ||
|
|
||
| BOOST_CHECK_EQUAL(result.size(), 5u); | ||
| BOOST_CHECK_EQUAL(result[0], 10); | ||
| BOOST_CHECK_EQUAL(result[1], 20); | ||
| BOOST_CHECK_EQUAL(result[2], 30); | ||
| BOOST_CHECK_EQUAL(result[3], 40); | ||
| BOOST_CHECK_EQUAL(result[4], 50); | ||
|
Comment on lines
+2131
to
+2136
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This series of checks can be made more robust and concise. If the initial size check fails, the subsequent element access const std::vector<int32_t> expected = {10, 20, 30, 40, 50};
BOOST_CHECK_EQUAL_COLLECTIONS(result.begin(), result.end(), expected.begin(), expected.end());
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. value:good-to-have; category:bug; feedback: The Gemini AI reviewer is correct! Using BOOST_CHECK_EQUAL_COLLECTIONS will provide better error messages when asserting the values in a collection, like in this test. |
||
| } | ||
|
|
||
| static void testByteCount() { | ||
| OutputStreamPtr os1 = memoryOutputStream(); | ||
| EncoderPtr e1 = binaryEncoder(); | ||
|
|
@@ -2125,6 +2161,7 @@ init_unit_test_suite(int, char *[]) { | |
| ts->add(BOOST_PARAM_TEST_CASE(&avro::testJson, avro::jsonData, | ||
| ENDOF(avro::jsonData))); | ||
| ts->add(BOOST_TEST_CASE(avro::testJsonCodecReinit)); | ||
| ts->add(BOOST_TEST_CASE(avro::testArrayNegativeBlockCount)); | ||
| ts->add(BOOST_TEST_CASE(avro::testByteCount)); | ||
|
|
||
| return ts; | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doDecodeItemCount()negatesresultwhen it’s negative; if the encoded count isINT64_MIN,-resultoverflows (undefined behavior). SincearrayNext()now routes through this path, consider explicitly guarding that value and throwing instead of overflowing.Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value:delightful; category:security; feedback: The Augment AI reviewer is correct! If the count value is size_t's min then the negation will lead to an overflow and even more issues because of this. Prevents a vulnerability if an attacker crafts an Avro file with such min count.