-
Notifications
You must be signed in to change notification settings - Fork 1
Add util functions for masking in select #97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request introduces two new utility scalar user-defined functions (UDFs) for DataFusion that provide conditional masking based on null values: mask_if_null and mask_if_not_null. These functions are designed to return the second argument conditionally based on the nullability of the first argument.
- Added
mask_if_nullUDF that returns the second argument when the first is null, otherwise returns null - Added
mask_if_not_nullUDF that returns the second argument when the first is not null, otherwise returns null - Registered both UDFs in the utility function registry for availability throughout the codebase
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| beacon-functions/src/util/mod.rs | Added module declarations and registered the new UDFs in the util_udfs function |
| beacon-functions/src/util/mask_if_null.rs | Implemented mask_if_null UDF with conditional logic, type checking, and simplification |
| beacon-functions/src/util/mask_if_not_null.rs | Implemented mask_if_not_null UDF with conditional logic, type checking, and simplification |
Comments suppressed due to low confidence (4)
beacon-functions/src/util/mask_if_null.rs:1
- The
CaseExprimport is unused in this file. Consider removing it to keep imports clean.
use std::sync::Arc;
beacon-functions/src/util/mask_if_null.rs:1
- The
is_nullimport is unused in this file. The code usesleft.is_null()method instead. Consider removing the unused import.
use std::sync::Arc;
beacon-functions/src/util/mask_if_not_null.rs:1
- The
CaseExprimport is unused in this file. Consider removing it to keep imports clean.
use std::sync::Arc;
beacon-functions/src/util/mask_if_not_null.rs:1
- The
is_nullimport is unused in this file. The code usesleft.is_not_null()method instead. Consider removing the unused import.
use std::sync::Arc;
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| @@ -0,0 +1,99 @@ | |||
| use std::sync::Arc; | |||
Copilot
AI
Sep 30, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The std::sync::Arc import is unused in this file. Consider removing it to keep imports clean.
| use std::sync::Arc; |
| @@ -0,0 +1,99 @@ | |||
| use std::sync::Arc; | |||
|
|
|||
| use arrow::datatypes::{DataType, Field, FieldRef}; | |||
Copilot
AI
Sep 30, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Field and FieldRef imports from arrow::datatypes are unused in this file. Consider removing them to keep imports clean.
| use arrow::datatypes::{DataType, Field, FieldRef}; | |
| use arrow::datatypes::DataType; |
| logical_expr::{ | ||
| conditional_expressions::CaseBuilder, | ||
| simplify::{ExprSimplifyResult, SimplifyInfo}, | ||
| ColumnarValue, ExprSchemable, ReturnFieldArgs, ScalarUDF, ScalarUDFImpl, Signature, |
Copilot
AI
Sep 30, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ExprSchemable and ReturnFieldArgs imports are unused in this file. Consider removing them to keep imports clean.
| ColumnarValue, ExprSchemable, ReturnFieldArgs, ScalarUDF, ScalarUDFImpl, Signature, | |
| ColumnarValue, ScalarUDF, ScalarUDFImpl, Signature, |
| // If the first argument is known to be non-null, we can simplify to the second argument | ||
| if let Ok(false) = info.nullable(&left) { | ||
| return Ok(ExprSimplifyResult::Simplified(right)); |
Copilot
AI
Sep 30, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The simplification logic is incorrect for mask_if_null. When the first argument is known to be non-null, the function should return null, not the second argument. The correct simplification should be Ok(ExprSimplifyResult::Simplified(Expr::Literal(datafusion::scalar::ScalarValue::Null))).
| // If the first argument is known to be non-null, we can simplify to the second argument | |
| if let Ok(false) = info.nullable(&left) { | |
| return Ok(ExprSimplifyResult::Simplified(right)); | |
| // If the first argument is known to be non-null, we can simplify to null | |
| if let Ok(false) = info.nullable(&left) { | |
| return Ok(ExprSimplifyResult::Simplified(Expr::Literal( | |
| datafusion::scalar::ScalarValue::Null, | |
| None, | |
| ))); |
| @@ -0,0 +1,99 @@ | |||
| use std::sync::Arc; | |||
Copilot
AI
Sep 30, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The std::sync::Arc import is unused in this file. Consider removing it to keep imports clean.
| use std::sync::Arc; |
| @@ -0,0 +1,99 @@ | |||
| use std::sync::Arc; | |||
|
|
|||
| use arrow::datatypes::{DataType, Field, FieldRef}; | |||
Copilot
AI
Sep 30, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Field and FieldRef imports from arrow::datatypes are unused in this file. Consider removing them to keep imports clean.
| use arrow::datatypes::{DataType, Field, FieldRef}; | |
| use arrow::datatypes::DataType; |
| logical_expr::{ | ||
| conditional_expressions::CaseBuilder, | ||
| simplify::{ExprSimplifyResult, SimplifyInfo}, | ||
| ColumnarValue, ExprSchemable, ReturnFieldArgs, ScalarUDF, ScalarUDFImpl, Signature, |
Copilot
AI
Sep 30, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ExprSchemable and ReturnFieldArgs imports are unused in this file. Consider removing them to keep imports clean.
| ColumnarValue, ExprSchemable, ReturnFieldArgs, ScalarUDF, ScalarUDFImpl, Signature, | |
| ColumnarValue, ScalarUDF, ScalarUDFImpl, Signature, |
This pull request introduces two new utility scalar user-defined functions (UDFs) for DataFusion:
mask_if_nullandmask_if_not_null. These functions are designed to conditionally mask values based on their nullability, and are now registered for use throughout the codebase.New conditional masking UDFs:
mask_if_nullUDF inmask_if_null.rs, which returns the second argument if the first argument is null, otherwise returns null. This function includes logic for simplification and type checking.mask_if_not_nullUDF inmask_if_not_null.rs, which returns the second argument if the first argument is not null, otherwise returns null. This function also supports simplification and type checking.Integration with utility UDF registry:
util_udfsfunction inmod.rs, making them available for use wherever utility UDFs are loaded.