2826: feat: Support ANSI mode SUM (Decimal types) #33

martin-augment wants to merge 8 commits into main
Conversation
Walkthrough

The changes introduce EvalMode support to the SUM aggregation function for Decimal128 types. In the protobuf `Sum` message, the boolean `fail_on_error` field is replaced with an `EvalMode` enum field. The `SumDecimal` UDF is updated to accept and store an `eval_mode` parameter throughout its API, including constructors, accumulators, and aggregator implementations. Overflow handling logic is refactored to respect the EvalMode: in ANSI mode, overflows raise errors; in non-ANSI modes, they set the sum to None. The Scala serialization layer is updated to map Spark's EvalMode to its protobuf representation. New test cases are added to verify decimal SUM and try_sum behavior under ANSI and non-ANSI modes with null values, group-by operations, and overflow scenarios.
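The overflow semantics described in the walkthrough can be sketched as a small standalone model. This is a hypothetical simplification with illustrative names (`EvalMode`, `sum_decimal`, `max_unscaled`); the real `SumDecimal` UDF operates on Arrow Decimal128 arrays, not plain slices.

```rust
// Hypothetical model of EvalMode-aware decimal summing: ANSI errors on
// overflow, while Legacy and Try both map overflow to None (SQL NULL).
#[derive(Clone, Copy, PartialEq)]
enum EvalMode {
    Legacy,
    Try,
    Ansi,
}

// Largest unscaled value representable at decimal precision `p` (illustrative helper).
fn max_unscaled(precision: u8) -> i128 {
    10i128.pow(precision as u32) - 1
}

// Sum a batch of unscaled decimal values at the given precision.
fn sum_decimal(values: &[i128], precision: u8, mode: EvalMode) -> Result<Option<i128>, String> {
    let bound = max_unscaled(precision);
    let mut sum: i128 = 0;
    for &v in values {
        sum = match sum.checked_add(v) {
            Some(s) if s.abs() <= bound => s,
            _ => {
                return if mode == EvalMode::Ansi {
                    Err("ARITHMETIC_OVERFLOW".to_string())
                } else {
                    Ok(None) // legacy / try_sum: overflow becomes NULL
                };
            }
        };
    }
    Ok(Some(sum))
}

fn main() {
    let near_max = max_unscaled(10);
    assert_eq!(sum_decimal(&[1, 2, 3], 10, EvalMode::Ansi), Ok(Some(6)));
    assert_eq!(sum_decimal(&[near_max, 1], 10, EvalMode::Try), Ok(None));
    assert!(sum_decimal(&[near_max, 1], 10, EvalMode::Ansi).is_err());
    println!("ok");
}
```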
Code Review for PR #33: ANSI Mode Support for Decimal SUM

Summary

This PR adds ANSI mode support to the SUM aggregate function for decimal types, implementing proper overflow handling that either throws errors (ANSI/TRY modes) or returns NULL (Legacy mode).

✅ Strengths

- Code Quality
- Test Coverage
- Architecture
🔍 Issues Found

1.
Actionable comments posted: 0
🧹 Nitpick comments (1)
spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala (1)
27-32: New decimal SUM / try_sum ANSI & overflow tests provide strong coverage

The added tests nicely cover:

- NULL behaviour for `sum` and `try_sum` on decimals, with and without GROUP BY, under both ANSI on/off.
- Decimal overflow semantics for `SUM` in ANSI vs non-ANSI (including grouped and repartitioned cases), asserting both Spark and Comet throw `ARITHMETIC_OVERFLOW` in ANSI mode.
- Decimal overflow handling for `try_sum`, including a partial-overflow GROUP BY where one group overflows and another succeeds.
- Correct wiring via `CometConf.getExprAllowIncompatConfigKey(classOf[Sum])` and use of `Sum`/`col` imports.

If you want to make the last "partial overflow" test even more self-documenting, you could also assert the concrete result (e.g., group 1 → NULL, group 2 → 500) instead of only relying on `checkSparkAnswerAndOperator`, but the current tests are already effective.

Also applies to: 1475-1635
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- native/core/src/execution/planner.rs (1 hunks)
- native/proto/src/proto/expr.proto (1 hunks)
- native/spark-expr/benches/aggregate.rs (2 hunks)
- native/spark-expr/src/agg_funcs/sum_decimal.rs (12 hunks)
- spark/src/main/scala/org/apache/comet/serde/aggregates.scala (3 hunks)
- spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-04T14:26:48.750Z
Learnt from: martin-augment
Repo: martin-augment/datafusion-comet PR: 7
File: native/spark-expr/src/math_funcs/abs.rs:201-302
Timestamp: 2025-11-04T14:26:48.750Z
Learning: In the abs function in native/spark-expr/src/math_funcs/abs.rs (Rust), NULL values for signed integers (Int8, Int16, Int32, Int64) and decimals (Decimal128, Decimal256) should return the argument as-is (e.g., ColumnarValue::Scalar(ScalarValue::Int8(None))) rather than panicking on unwrap().
Applied to files:
native/spark-expr/src/agg_funcs/sum_decimal.rs
🧬 Code graph analysis (4)
native/spark-expr/benches/aggregate.rs (1)
- native/spark-expr/src/agg_funcs/sum_decimal.rs (1)
  - try_new (47-63)

native/spark-expr/src/agg_funcs/sum_decimal.rs (2)
- native/spark-expr/src/utils.rs (1)
  - is_valid_decimal_precision (225-229)
- native/spark-expr/src/lib.rs (1)
  - arithmetic_overflow_error (110-114)

spark/src/main/scala/org/apache/comet/serde/aggregates.scala (3)
- spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala (8)
  - comet (79-81), comet (86-123), comet (125-127), comet (129-138), comet (152-154), comet (156-158), comet (160-166), comet (168-174)
- spark/src/main/scala/org/apache/comet/serde/QueryPlanSerde.scala (2)
  - QueryPlanSerde (45-717), evalModeToProto (439-446)
- spark/src/main/scala/org/apache/comet/serde/SupportLevel.scala (1)
  - Incompatible (39-39)

spark/src/test/scala/org/apache/comet/exec/CometAggregateSuite.scala (1)
- spark/src/main/scala/org/apache/spark/sql/comet/operators.scala (2)
  - CometHashAggregateExec (1104-1130), CometHashAggregateExec (1168-1225)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Cursor Bugbot
- GitHub Check: claude-review
🔇 Additional comments (16)
native/proto/src/proto/expr.proto (1)
120-124: Sum.eval_mode proto change: check versioning/compat expectations

Switching field 3 from `bool fail_on_error` to `EvalMode eval_mode` keeps the same field number and varint wire type, but changes the semantic meaning of existing encoded values (e.g., an old `true` now decodes as enum value `1`). This is fine if plans are never persisted across mixed Comet versions, but if there is any cross-version interchange, you may need to document that `Sum` is not backward-compatible across this change or gate it by version.

native/core/src/execution/planner.rs (1)
1867-1893: Decimal SUM now wired to EvalMode-aware UDF correctly

For decimal inputs you now: (1) decode `expr.eval_mode` via `from_protobuf_eval_mode`, and (2) construct `SumDecimal::try_new(datatype, eval_mode)` and wrap it in an `AggregateUDF`. This cleanly propagates LEGACY/TRY/ANSI into the decimal SUM implementation while leaving the non-decimal path unchanged and still backed by `sum_udaf`. Given `CometSum.getSupportLevel` only allows ANSI/TRY for decimal types, this looks consistent and safe.

native/spark-expr/benches/aggregate.rs (1)
35-36: Benchmark update to new SumDecimal API looks good

Importing `EvalMode` and constructing `SumDecimal` with `EvalMode::Legacy` aligns the benchmark with the new `try_new(data_type, eval_mode)` signature without changing the workload semantics. No issues here.

Also applies to: 98-101
spark/src/main/scala/org/apache/comet/serde/aggregates.scala (1)
32-34: Sum serde changes correctly gate ANSI/TRY and propagate EvalMode

- `CometSum.getSupportLevel` now treats ANSI/TRY as incompatible only when `sum.dataType` is non-decimal, which matches the native support matrix and ensures non-decimal SUM in ANSI/TRY still falls back to Spark.
- In `convert`, wiring `builder.setEvalMode(evalModeToProto(CometEvalModeUtil.fromSparkEvalMode(sum.evalMode)))` means the Rust side sees the exact eval mode for decimal SUM, consistent with how scalar/math casts already behave.

Overall this is a clean extension of EvalMode into the SUM aggregate path.

Also applies to: 216-225, 242-247
native/spark-expr/src/agg_funcs/sum_decimal.rs (12)
18-19: LGTM: EvalMode integration in struct definition.

The addition of the `eval_mode` field and necessary imports properly extends the `SumDecimal` struct to support ANSI mode semantics.

Also applies to: 43-43
47-63: LGTM: Constructor properly parameterized with EvalMode.

The `try_new` method correctly accepts and stores the `eval_mode` parameter, enabling mode-specific behavior downstream.
71-86: LGTM: Consistent state field semantics.

The implementation correctly propagates `eval_mode` to accumulators and consistently tracks both `sum` and `is_empty` state fields to match Spark's behavior. The comments clarify this design decision effectively.
133-153: LGTM: Accumulator initialization is correct.

The accumulator properly initializes with `sum = Some(0)` and `is_empty = true`, which correctly represents an empty aggregation state. The `eval_mode` parameter is stored for overflow handling.
155-177: LGTM: Overflow handling is correctly implemented.

The overflow detection and handling logic is sound:

- Early return for already-overflowed state prevents further processing
- `unwrap_or(0)` at line 162 is safe due to prior overflow check at line 157
- ANSI mode correctly raises errors on overflow
- Non-ANSI mode correctly sets `sum` to `None` while marking `is_empty` as `false`
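The single-value update path described above can be modeled roughly as follows. This is an illustrative sketch, not the actual `sum_decimal.rs` code: the field names mirror the review, but the real accumulator also carries precision/scale and works on Arrow values.

```rust
// Model of the per-value update: (sum, is_empty) state with EvalMode-aware
// overflow handling. sum == None (with is_empty == false) marks an overflow.
#[derive(Debug, PartialEq)]
struct State {
    sum: Option<i128>,
    is_empty: bool,
}

// Max unscaled value at precision 38 (38 nines), the Decimal128 ceiling.
const BOUND: i128 = 99_999_999_999_999_999_999_999_999_999_999_999_999;

fn update_single(state: &mut State, v: i128, ansi: bool) -> Result<(), String> {
    if !state.is_empty && state.sum.is_none() {
        return Ok(()); // already overflowed (non-ANSI): skip further processing
    }
    state.is_empty = false;
    match state.sum.unwrap_or(0).checked_add(v) {
        Some(s) if s.abs() <= BOUND => state.sum = Some(s),
        _ if ansi => return Err("ARITHMETIC_OVERFLOW".to_string()),
        _ => state.sum = None, // non-ANSI: remember the overflow as NULL
    }
    Ok(())
}

fn main() {
    let mut s = State { sum: Some(0), is_empty: true };
    update_single(&mut s, BOUND, false).unwrap();
    update_single(&mut s, 1, false).unwrap(); // overflows the precision bound
    assert_eq!(s, State { sum: None, is_empty: false });
    println!("ok");
}
```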
181-211: LGTM: Batch update correctly handles nulls and overflow.

The batch update logic correctly:

- Detects and skips already-overflowed states
- Updates `is_empty` status based on null counts
- Skips null values during iteration (lines 205-206)
- Delegates to `update_single` for per-value processing and error propagation
213-230: LGTM: Evaluation correctly distinguishes empty vs. overflow states.

The evaluation logic properly handles:

- Empty aggregations (no non-null values) → `None`
- Overflowed sums (non-ANSI mode) → `None`
- Valid sums with precision checks → decimal value

The defensive precision check at line 221 is good practice even though it shouldn't fail during normal operation.
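The empty-vs-overflow distinction can be reduced to a tiny rule (a sketch with illustrative names, omitting the precision check): both cases surface as NULL, but for different reasons recorded in different state fields.

```rust
// Final-evaluation rule: empty aggregations and overflowed (non-ANSI) sums
// both emit NULL, driven by is_empty and by sum == None respectively.
fn evaluate(sum: Option<i128>, is_empty: bool) -> Option<i128> {
    if is_empty {
        None // no non-null input rows at all; the initial Some(0) is ignored
    } else {
        sum // Some(total) for a valid sum, None if it overflowed earlier
    }
}

fn main() {
    assert_eq!(evaluate(Some(0), true), None); // empty aggregation
    assert_eq!(evaluate(Some(42), false), Some(42)); // valid sum
    assert_eq!(evaluate(None, false), None); // overflowed in a non-ANSI mode
    println!("ok");
}
```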
236-306: LGTM: State serialization and merging are correct.

The state and merge logic correctly handles:

- Serialization of both `sum` and `is_empty` state fields
- Detection and propagation of overflow states (lines 270-277)
- Safe unwrapping at lines 289-290 due to prior validation
- EvalMode-aware overflow handling during merge
- Proper short-circuiting for empty and overflowed states
308-353: LGTM: Groups accumulator initialization and per-group updates are correct.

The groups accumulator properly:

- Maintains per-group state vectors for `sum` and `is_empty`
- Initializes new groups to `Some(0)` and `true` in `resize_helper`
- Implements overflow detection and EvalMode-aware handling per group
- Safe use of `unwrap_or(0)` at line 340 due to prior overflow check at line 336
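The per-group state layout amounts to two parallel vectors indexed by group id, with new groups initialized as described. A minimal sketch (names are illustrative, not the actual `resize_helper` signature):

```rust
// Per-group state model: parallel vectors, resized together so that any
// newly created group starts as (sum = Some(0), is_empty = true).
#[derive(Default)]
struct GroupsState {
    sums: Vec<Option<i128>>,
    is_empty: Vec<bool>,
}

impl GroupsState {
    fn resize(&mut self, total_num_groups: usize) {
        self.sums.resize(total_num_groups, Some(0));
        self.is_empty.resize(total_num_groups, true);
    }
}

fn main() {
    let mut state = GroupsState::default();
    state.resize(3);
    assert_eq!(state.sums, vec![Some(0), Some(0), Some(0)]);
    assert_eq!(state.is_empty, vec![true, true, true]);
    println!("ok");
}
```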
388-449: LGTM: Group evaluation and state extraction are correct.

The evaluation logic correctly:

- Handles both `EmitTo::All` and `EmitTo::First(n)` cases consistently
- Checks `is_empty` and precision before emitting values
- Returns `None` for empty groups or overflowed sums
- Serializes per-group state as two arrays (sum, is_empty)
451-517: LGTM: Groups merge batch correctly handles per-group state merging.

The merge logic correctly:
- Processes each group independently with proper state extraction
- Detects and propagates overflow states (lines 480-488)
- Safely unwraps at lines 500-501 after validation
- Applies EvalMode-aware overflow handling during merge (lines 504-513)
- Handles empty state short-circuiting (lines 490-497)
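The per-pair merge rule behind these points can be modeled as follows. This is an illustrative sketch under the assumption that a `None` sum on a non-empty side marks an earlier non-ANSI overflow (in ANSI mode the overflow would already have errored); the real code merges state arrays.

```rust
// Model of merging two partial (sum, is_empty) states with EvalMode-aware
// overflow handling and empty-state short-circuiting.
const BOUND: i128 = 99_999_999_999_999_999_999_999_999_999_999_999_999; // 38 nines

type State = (Option<i128>, bool); // (sum, is_empty)

fn merge(acc: State, other: State, ansi: bool) -> Result<State, String> {
    let (a_sum, a_empty) = acc;
    let (b_sum, b_empty) = other;
    if b_empty {
        return Ok((a_sum, a_empty)); // nothing to fold in
    }
    if a_empty {
        return Ok((b_sum, false)); // adopt the other side's state
    }
    // A None sum on either non-empty side propagates the overflow.
    let (Some(a), Some(b)) = (a_sum, b_sum) else {
        return Ok((None, false));
    };
    match a.checked_add(b) {
        Some(s) if s.abs() <= BOUND => Ok((Some(s), false)),
        _ if ansi => Err("ARITHMETIC_OVERFLOW".to_string()),
        _ => Ok((None, false)),
    }
}

fn main() {
    assert_eq!(merge((Some(5), false), (Some(7), false), true), Ok((Some(12), false)));
    assert_eq!(merge((None, false), (Some(7), false), false), Ok((None, false)));
    assert_eq!(merge((Some(5), false), (Some(0), true), true), Ok((Some(5), false)));
    println!("ok");
}
```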
544-595: Tests updated correctly for new API.

The existing tests are properly updated to pass `EvalMode::Legacy` to the constructors. While these Rust unit tests don't cover overflow or ANSI mode behavior, the AI summary indicates that comprehensive edge case testing (overflow, ANSI/non-ANSI modes, nulls) is provided in the Scala test suite (CometAggregateSuite.scala).
value:incorrect-but-reasonable; category:bug; feedback:The Claude AI reviewer is not correct! The implementation is ANSI vs non-ANSI, i.e. TRY and LEGACY are treated the same, since both of them should return NULL (i.e. None) when the sum overflows. Only ANSI mode should return an error.
value:good-to-have; category:bug; feedback:The Claude AI reviewer is correct! Having a short comment about the possible values and their behavior would be helpful for the maintenance of the protobuf file. It would prevent consulting the rest of the code to see the possible values.
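Such a documenting comment might look like the sketch below. The field layout of `Sum` and the enum numbering are assumptions for illustration, not the actual contents of expr.proto; only the field-3 change from `bool fail_on_error` to `EvalMode eval_mode` is taken from the review.

```proto
// Evaluation mode for expressions whose result can overflow:
//   LEGACY - overflow silently produces NULL (pre-ANSI Spark behavior)
//   TRY    - overflow produces NULL (try_sum semantics)
//   ANSI   - overflow raises ARITHMETIC_OVERFLOW
enum EvalMode {
  LEGACY = 0;
  TRY = 1;
  ANSI = 2;
}

message Sum {
  Expr child = 1;         // hypothetical field layout
  DataType datatype = 2;  // hypothetical field layout
  // Was `bool fail_on_error`; same field number and varint wire type.
  EvalMode eval_mode = 3;
}
```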
value:good-to-have; category:bug; feedback:The Claude AI reviewer is correct! A javadoc for the method would be helpful. The method generates 50 tuples of 99999999999999999999 and 1. Those are used as values for 50 rows used in the test cases. Each test case uses a SQL query like "SELECT SUM(col(1)) from table", and here the summing leads to overflow.
Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.