2851: feat: Add expression registry to native planner #35
martin-augment wants to merge 3 commits into main
Walkthrough
This pull request introduces a modular, registry-based architecture for expression handling in the planner. A new arithmetic expression module provides builders for addition, subtraction, multiplication, division, integral division, remainder, and unary minus operations. Concurrently, core traits (ExpressionBuilder, OperatorBuilder) and enums (ExpressionType, OperatorType) are defined in a new traits module. An ExpressionRegistry maps expression types to their corresponding builders. The planner's public API is expanded to expose helper functions and the registry itself, shifting from monolithic match-based expression handling to a pluggable registry-driven approach while preserving fallback paths for non-registered cases.
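To make the walkthrough concrete, here is a minimal sketch of how a registry could map expression types to builders. It is not the PR's code: `SparkExpr`, `BinaryBuilder`, the string output, and the trait signature are simplified assumptions for illustration (per the review, the real builders receive the planner and a Spark protobuf expression and return a physical expression).

```rust
use std::collections::HashMap;

/// Registry key, loosely mirroring the `ExpressionType` enum named in the walkthrough.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ExpressionType {
    Add,
    Subtract,
    Multiply,
    Remainder,
}

/// Hypothetical stand-in for a Spark protobuf expression with two integer operands.
pub struct SparkExpr {
    pub kind: ExpressionType,
    pub left: i64,
    pub right: i64,
}

/// Simplified builder trait (assumption): returns a rendered string, not a physical expression.
pub trait ExpressionBuilder: Send + Sync {
    fn build(&self, expr: &SparkExpr) -> Result<String, String>;
}

/// Hypothetical builder parameterized by its operator symbol.
struct BinaryBuilder {
    symbol: &'static str,
}

impl ExpressionBuilder for BinaryBuilder {
    fn build(&self, expr: &SparkExpr) -> Result<String, String> {
        Ok(format!("({} {} {})", expr.left, self.symbol, expr.right))
    }
}

/// Maps each expression type to the builder responsible for it.
pub struct ExpressionRegistry {
    builders: HashMap<ExpressionType, Box<dyn ExpressionBuilder>>,
}

impl ExpressionRegistry {
    pub fn new() -> Self {
        let mut registry = Self { builders: HashMap::new() };
        registry.register_arithmetic_expressions();
        registry
    }

    /// Registers a subset of arithmetic builders; `Remainder` is left out on purpose
    /// so that the unregistered path can be observed.
    fn register_arithmetic_expressions(&mut self) {
        self.builders.insert(ExpressionType::Add, Box::new(BinaryBuilder { symbol: "+" }));
        self.builders.insert(ExpressionType::Subtract, Box::new(BinaryBuilder { symbol: "-" }));
        self.builders.insert(ExpressionType::Multiply, Box::new(BinaryBuilder { symbol: "*" }));
    }

    pub fn can_handle(&self, expr: &SparkExpr) -> bool {
        self.builders.contains_key(&expr.kind)
    }

    pub fn create_expr(&self, expr: &SparkExpr) -> Result<String, String> {
        match self.builders.get(&expr.kind) {
            Some(builder) => builder.build(expr),
            None => Err(format!("no builder registered for {:?}", expr.kind)),
        }
    }
}
```

Adding a new expression category in this sketch means writing one builder and one `insert` call, which is the pluggability the walkthrough describes.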
Code Review: Expression Registry Refactoring
Overview
This PR introduces a modular expression registry pattern to the native planner, refactoring arithmetic expressions out of the monolithic
✅ Strengths
🐛 Critical Issues
1. Copy/Clone Issue in
| Category | Status |
|---|---|
| Architecture | ✅ Excellent |
| Code Quality | |
| Bugs | ❌ 1 critical (Copy/Clone) |
| Performance | |
| Security | ✅ No concerns |
| Tests | ❌ Missing |
Verdict: Needs Changes before merging. Fix the Copy/Clone issue and add basic test coverage.
🎯 Action Items
Must Fix (Blocking):
1. ✅ Fix `expression_registry` copy/clone issue in `with_exec_id`
2. ✅ Add at least basic integration tests
Should Fix (Recommended):
3. Handle return_type.unwrap() more safely
4. Remove unimplemented expression types from enum
5. Add doc comments to public APIs
Nice to Have:
6. Consider using Arc<ExpressionRegistry> for performance
7. Optimize registry lookup to single HashMap access
Great work on the refactoring! The architecture is solid and this sets up a good foundation for future improvements. 🚀
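The two "Nice to Have" items can be sketched as follows. This is an illustration under assumptions, not the PR's code: the `BuildFn` function-pointer builders, the method names, and the `Planner` struct are hypothetical. It contrasts a `contains_key`-then-index lookup (two hash lookups) with a single `get`, and shows planners sharing one registry through `Arc`.

```rust
use std::collections::HashMap;
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ExpressionType {
    Add,
    Multiply,
    Remainder,
}

/// Hypothetical builder signature: evaluates two operands directly.
type BuildFn = fn(i64, i64) -> i64;

fn add(l: i64, r: i64) -> i64 {
    l + r
}

fn multiply(l: i64, r: i64) -> i64 {
    l * r
}

pub struct ExpressionRegistry {
    builders: HashMap<ExpressionType, BuildFn>,
}

impl ExpressionRegistry {
    pub fn new() -> Self {
        let mut builders: HashMap<ExpressionType, BuildFn> = HashMap::new();
        builders.insert(ExpressionType::Add, add);
        builders.insert(ExpressionType::Multiply, multiply);
        Self { builders }
    }

    /// Two hash lookups: one for the membership check, one for the fetch.
    pub fn build_two_lookups(&self, kind: ExpressionType, l: i64, r: i64) -> Option<i64> {
        if self.builders.contains_key(&kind) {
            let build = self.builders[&kind];
            Some(build(l, r))
        } else {
            None
        }
    }

    /// One hash lookup: `get` answers "is it registered?" and "which builder?" together.
    pub fn build_one_lookup(&self, kind: ExpressionType, l: i64, r: i64) -> Option<i64> {
        self.builders.get(&kind).map(|build| build(l, r))
    }
}

/// Hypothetical planner that shares the registry instead of owning a private copy.
pub struct Planner {
    pub exec_id: u64,
    pub registry: Arc<ExpressionRegistry>,
}

impl Planner {
    pub fn new(exec_id: u64) -> Self {
        Self { exec_id, registry: Arc::new(ExpressionRegistry::new()) }
    }

    /// Creates another planner that points at the same registry (a reference-count bump only).
    pub fn sharing_registry(&self, exec_id: u64) -> Self {
        Self { exec_id, registry: Arc::clone(&self.registry) }
    }
}
```

Whether the `Arc` is worth it depends on how often planners are created and how expensive the registry is to build; the review only lists it as optional.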
Actionable comments posted: 0
🧹 Nitpick comments (1)
native/core/src/execution/expressions/arithmetic.rs (1)
195-206: Consider defensive handling for missing `return_type`.
The `.unwrap()` on line 201 will panic if `return_type` is `None`. While this follows existing patterns in the codebase (e.g., `planner.rs` line 873), consider using `.ok_or_else()` for a more descriptive error:

```diff
-        let result = create_modulo_expr(
-            left,
-            right,
-            expr.return_type
-                .as_ref()
-                .map(crate::execution::serde::to_arrow_datatype)
-                .unwrap(),
+        let return_type = expr.return_type
+            .as_ref()
+            .map(crate::execution::serde::to_arrow_datatype)
+            .ok_or_else(|| ExecutionError::GeneralError(
+                "Remainder expression missing return_type".to_string()
+            ))?;
+        let result = create_modulo_expr(
+            left,
+            right,
+            return_type,
```

This is optional since other builders use the same pattern via `create_binary_expr`.
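The suggestion above can be tried in isolation with a small self-contained sketch. The types here (`ProtoDataType`, `DataType`, `RemainderExpr`, and this reduced `ExecutionError`) are hypothetical stand-ins, not the project's real definitions; only the `Option` to `Result` conversion with `ok_or_else` is the point.

```rust
/// Hypothetical, reduced error type; the real `ExecutionError` has more variants.
#[derive(Debug, PartialEq)]
pub enum ExecutionError {
    GeneralError(String),
}

/// Hypothetical stand-ins for the protobuf and Arrow data types.
pub struct ProtoDataType {
    pub type_id: i32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DataType {
    Int32,
    Int64,
}

/// Stand-in for the conversion function referenced in the review.
fn to_arrow_datatype(dt: &ProtoDataType) -> DataType {
    if dt.type_id == 0 {
        DataType::Int32
    } else {
        DataType::Int64
    }
}

/// Hypothetical expression message whose `return_type` may be absent.
pub struct RemainderExpr {
    pub return_type: Option<ProtoDataType>,
}

/// Returns an error instead of panicking when `return_type` is missing.
pub fn remainder_return_type(expr: &RemainderExpr) -> Result<DataType, ExecutionError> {
    expr.return_type
        .as_ref()
        .map(to_arrow_datatype)
        .ok_or_else(|| {
            ExecutionError::GeneralError("Remainder expression missing return_type".to_string())
        })
}
```

With `unwrap()`, the missing-type case would abort the process; here the caller receives an `Err` and can decide how to report it.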
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- native/core/src/execution/expressions/arithmetic.rs (1 hunks)
- native/core/src/execution/expressions/mod.rs (1 hunks)
- native/core/src/execution/planner.rs (10 hunks)
- native/core/src/execution/planner/expression_registry.rs (1 hunks)
- native/core/src/execution/planner/traits.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-04T14:26:48.750Z
Learnt from: martin-augment
Repo: martin-augment/datafusion-comet PR: 7
File: native/spark-expr/src/math_funcs/abs.rs:201-302
Timestamp: 2025-11-04T14:26:48.750Z
Learning: In the abs function in native/spark-expr/src/math_funcs/abs.rs (Rust), NULL values for signed integers (Int8, Int16, Int32, Int64) and decimals (Decimal128, Decimal256) should return the argument as-is (e.g., ColumnarValue::Scalar(ScalarValue::Int8(None))) rather than panicking on unwrap().
Applied to files:
native/core/src/execution/expressions/arithmetic.rs
🧬 Code graph analysis (3)
native/core/src/execution/planner/expression_registry.rs (2)
- native/core/src/execution/planner.rs (11): expr (647-651), expr (701-705), expr (1797-1801), expr (2259-2263), expr (2355-2355), expr (2390-2394), expr (2528-2528), new (170-177), new (2560-2572), create_expr (246-798), default (164-166)
- native/spark-expr/src/nondetermenistic_funcs/rand.rs (1): None (97-97)
native/core/src/execution/planner/traits.rs (1)
- native/core/src/execution/expressions/arithmetic.rs (7): build (39-60), build (67-88), build (95-116), build (123-144), build (151-175), build (182-212), build (219-234)
native/core/src/execution/planner.rs (1)
- native/core/src/execution/planner/expression_registry.rs (1): new (37-44)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Cursor Bugbot
- GitHub Check: claude-review
🔇 Additional comments (17)
native/core/src/execution/expressions/mod.rs (1)
20-20: LGTM!
The new module export follows the existing pattern and correctly exposes the arithmetic expression builders for use by the registry.
native/core/src/execution/expressions/arithmetic.rs (2)
38-61: LGTM! Clean implementation of the builder pattern.
The `AddBuilder` correctly delegates to `planner.create_binary_expr` with the appropriate operator. The pattern is consistent across all arithmetic builders.
215-235: LGTM!
The `UnaryMinusBuilder` correctly uses `create_negate_expr` with the `fail_on_error` flag from the protobuf expression.
native/core/src/execution/planner/expression_registry.rs (3)
31-53: Well-designed registry pattern enabling incremental migration.
The registry correctly checks `can_handle` before attempting dispatch, and the fallback path in `planner.rs` handles unregistered expressions. This enables gradual migration of expressions to the modular system.
74-106: Good modular structure with clear extension points.
The TODO comments and separate registration methods (`register_arithmetic_expressions`, etc.) provide clear guidance for future contributors to add more expression categories.
188-192: LGTM!
The `Default` implementation correctly delegates to `new()`, ensuring consistent initialization.
native/core/src/execution/planner.rs (7)
20-21: LGTM!
Clean module structure exposing the new registry and traits modules.
147-150: Public visibility is appropriate for cross-module usage.
Making `BinaryExprOptions` and its field public is necessary for the arithmetic builders to use `create_binary_expr_with_options`.
160-161: LGTM!
The `expression_registry` field is correctly added to `PhysicalPlanner`.
170-186: Correct initialization and preservation of registry.
The registry is properly initialized in `new()` and preserved in `with_exec_id()`, ensuring consistent state across planner instances.
251-259: Clean registry-first dispatch with fallback.
The pattern correctly checks `can_handle` before dispatching to the registry, falling back to the original match-based handling for unregistered expressions. This enables incremental migration while maintaining backward compatibility.
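The registry-first dispatch described in this comment could look roughly like the sketch below. It is a simplified assumption, not the code from `planner.rs`: `SparkExpr`, `ExprKind`, `BuildFn`, `build_add`, and the string results (prefixed with the path that produced them) are hypothetical and exist only to make the two paths visible.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ExprKind {
    Add,
    Negate,
}

/// Hypothetical stand-in for the Spark protobuf expression.
pub enum SparkExpr {
    Add(i64, i64),
    Negate(i64),
}

impl SparkExpr {
    fn kind(&self) -> ExprKind {
        match self {
            SparkExpr::Add(..) => ExprKind::Add,
            SparkExpr::Negate(_) => ExprKind::Negate,
        }
    }
}

/// Hypothetical builder signature used as the registry value.
type BuildFn = fn(&SparkExpr) -> Result<String, String>;

#[derive(Default)]
pub struct ExpressionRegistry {
    builders: HashMap<ExprKind, BuildFn>,
}

impl ExpressionRegistry {
    pub fn register(&mut self, kind: ExprKind, build: BuildFn) {
        self.builders.insert(kind, build);
    }

    pub fn can_handle(&self, expr: &SparkExpr) -> bool {
        self.builders.contains_key(&expr.kind())
    }

    pub fn create_expr(&self, expr: &SparkExpr) -> Result<String, String> {
        match self.builders.get(&expr.kind()) {
            Some(build) => build(expr),
            None => Err(format!("no builder registered for {:?}", expr.kind())),
        }
    }
}

/// Builder for `Add`, already migrated to the registry in this sketch.
fn build_add(expr: &SparkExpr) -> Result<String, String> {
    match expr {
        SparkExpr::Add(l, r) => Ok(format!("registry: {l} + {r}")),
        _ => Err("add builder received a non-Add expression".to_string()),
    }
}

pub struct Planner {
    registry: ExpressionRegistry,
}

impl Planner {
    pub fn new() -> Self {
        let mut registry = ExpressionRegistry::default();
        registry.register(ExprKind::Add, build_add);
        Self { registry }
    }

    pub fn create_expr(&self, expr: &SparkExpr) -> Result<String, String> {
        // Registry first: migrated expressions never reach the match below.
        if self.registry.can_handle(expr) {
            return self.registry.create_expr(expr);
        }
        // Fallback: the original match-based handling for everything not yet migrated.
        match expr {
            SparkExpr::Negate(v) => Ok(format!("legacy: -({v})")),
            SparkExpr::Add(..) => Err("Add should have been handled by the registry".to_string()),
        }
    }
}
```

Migrating `Negate` in this sketch would only require registering a builder for `ExprKind::Negate`; the legacy arm could then be removed in a later change.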
826-827: Public visibility enables builder access.
Making `create_binary_expr` public allows arithmetic builders to leverage this shared implementation.
2644-2650: Public visibility with complete enum mapping.
Making `from_protobuf_eval_mode` public enables the arithmetic builders to convert eval modes. The function correctly maps all three Spark eval modes (Legacy, Try, Ansi).
native/core/src/execution/planner/traits.rs (4)
30-39: LGTM! Well-defined trait for expression building.
The `ExpressionBuilder` trait has appropriate `Send + Sync` bounds for concurrent use and a clear contract for building physical expressions from Spark protobuf.
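As a rough illustration of why the `Send + Sync` supertraits matter, the sketch below shares one builder trait object across threads. The trait signature, `AddBuilder` body, and `build_in_parallel` helper are assumptions made for this example; the PR's trait takes different arguments.

```rust
use std::sync::Arc;
use std::thread;

/// Simplified builder trait (assumption); the supertraits make `dyn ExpressionBuilder`
/// itself `Send + Sync`, so an `Arc` of it can cross thread boundaries.
pub trait ExpressionBuilder: Send + Sync {
    fn build(&self, left: i64, right: i64) -> i64;
}

pub struct AddBuilder;

impl ExpressionBuilder for AddBuilder {
    fn build(&self, left: i64, right: i64) -> i64 {
        left + right
    }
}

/// Runs the same shared builder on one thread per input pair and collects
/// the results in input order.
pub fn build_in_parallel(builder: Arc<dyn ExpressionBuilder>, inputs: Vec<(i64, i64)>) -> Vec<i64> {
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|(left, right)| {
            let builder = Arc::clone(&builder);
            thread::spawn(move || builder.build(left, right))
        })
        .collect();

    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}
```

Without the supertrait bounds, `Arc<dyn ExpressionBuilder>` would not be `Send`, and the `thread::spawn` call would be rejected at compile time.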
41-52: Good forward scaffolding for operator registry.
The `OperatorBuilder` trait is prepared for a future operator registry similar to the expression registry. The `#[allow(dead_code)]` annotation correctly acknowledges this is scaffolding.
54-125: Comprehensive expression type enumeration.
The `ExpressionType` enum covers all current expression types with clear categorization (arithmetic, comparison, logical, etc.). This provides type-safe keys for registry dispatch.
127-145: Forward scaffolding for operator registry.
The `OperatorType` enum mirrors the operator types in the planner's `create_plan` match, preparing for a future modular operator system.
value:annoying; category:bug; feedback: The Claude AI reviewer is not correct! The ownership of this struct is moved, i.e. it is neither copied nor cloned. The build passes, which confirms that there is no such issue here.
value:useful; category:bug; feedback: The Claude AI reviewer is correct! Panicking in non-test code is bad practice because it crashes the application. It is better to return an Err and let the caller decide whether to crash or to handle it somehow.
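The ownership point in the first feedback entry can be illustrated with a small sketch. It assumes `with_exec_id` takes `self` by value, which the source implies but does not show; the struct fields other than `expression_registry` are hypothetical. The registry type deliberately implements neither `Copy` nor `Clone`, so the code only compiles because the field is moved.

```rust
/// Hypothetical registry that implements neither `Copy` nor `Clone`.
pub struct ExpressionRegistry {
    pub builder_names: Vec<String>,
}

/// Reduced stand-in for the planner; `exec_context_id` is a hypothetical field name.
pub struct PhysicalPlanner {
    pub exec_context_id: i64,
    pub expression_registry: ExpressionRegistry,
}

impl PhysicalPlanner {
    pub fn new() -> Self {
        Self {
            exec_context_id: 0,
            expression_registry: ExpressionRegistry {
                builder_names: vec!["add".to_string(), "subtract".to_string()],
            },
        }
    }

    /// Consumes `self` and moves the registry into the new planner.
    /// No copy or clone of the registry takes place; the old planner is gone afterwards.
    pub fn with_exec_id(self, exec_context_id: i64) -> Self {
        Self {
            exec_context_id,
            expression_registry: self.expression_registry,
        }
    }
}
```

Because the `Vec` inside the registry is moved rather than cloned, its heap buffer stays at the same address before and after the call, which is one way to observe that no duplication happened.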
Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.
2851: To review by AI