-
Notifications
You must be signed in to change notification settings - Fork 0
3647: fix: Support on all-literal RLIKE expression #45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,7 +21,7 @@ use arrow::array::types::Int32Type; | |
| use arrow::array::{Array, BooleanArray, DictionaryArray, RecordBatch, StringArray}; | ||
| use arrow::compute::take; | ||
| use arrow::datatypes::{DataType, Schema}; | ||
| use datafusion::common::{internal_err, Result}; | ||
| use datafusion::common::{internal_err, Result, ScalarValue}; | ||
| use datafusion::physical_expr::PhysicalExpr; | ||
| use datafusion::physical_plan::ColumnarValue; | ||
| use regex::Regex; | ||
|
|
@@ -140,8 +140,24 @@ impl PhysicalExpr for RLike { | |
| let array = self.is_match(inputs); | ||
| Ok(ColumnarValue::Array(Arc::new(array))) | ||
| } | ||
| ColumnarValue::Scalar(_) => { | ||
| internal_err!("non scalar regexp patterns are not supported") | ||
| ColumnarValue::Scalar(scalar) => { | ||
| if scalar.is_null() { | ||
| return Ok(ColumnarValue::Scalar(ScalarValue::Boolean(None))); | ||
| } | ||
|
|
||
| let is_match = match scalar { | ||
| ScalarValue::Utf8(Some(s)) | ||
| | ScalarValue::LargeUtf8(Some(s)) | ||
| | ScalarValue::Utf8View(Some(s)) => self.pattern.is_match(s.as_str()), | ||
| _ => { | ||
| return internal_err!( | ||
| "RLike requires string type for input, got {:?}", | ||
| scalar.data_type() | ||
| ); | ||
| } | ||
| }; | ||
|
|
||
| Ok(ColumnarValue::Scalar(ScalarValue::Boolean(Some(is_match)))) | ||
| } | ||
| } | ||
| } | ||
|
|
@@ -165,3 +181,26 @@ impl PhysicalExpr for RLike { | |
| Display::fmt(self, f) | ||
| } | ||
| } | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { | ||
| use super::*; | ||
| use datafusion::physical_expr::expressions::Literal; | ||
|
|
||
| #[test] | ||
| fn test_rlike_scalar_utf8_literal() { | ||
| let expr = RLike::try_new( | ||
| Arc::new(Literal::new(ScalarValue::Utf8(Some("Rose".to_string())))), | ||
| "R[a-z]+", | ||
| ) | ||
| .unwrap(); | ||
| let result = expr | ||
| .evaluate(&RecordBatch::new_empty(Arc::new(Schema::empty()))) | ||
| .unwrap(); | ||
| let ColumnarValue::Scalar(result) = result else { | ||
| panic!("expected scalar result"); | ||
| }; | ||
|
|
||
| assert_eq!(result, ScalarValue::Boolean(Some(true))); | ||
| } | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The added unit test
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. value:useful; category:bug; feedback: The Gemini AI reviewer is correct! The new unit test tests only the happy path with Utf8(Some). The other Utf8 types (Large and View) are not covered. The NULL case is also not covered (Utf(None)). Testing with a non-Utf8 type would also cover the error handling logic. |
||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test validates the happy-path match, but it doesn’t cover the new NULL/non-match behavior in the scalar branch (e.g.,
ScalarValue::Utf8(None)or a literal that doesn’t match). Consider extending coverage so regressions inscalar.is_null()handling and false matches get caught.Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value:useful; category:bug; feedback: The Augment AI reviewer is correct! The new unit test tests only the happy path with Utf8(Some). The other Utf8 types (Large and View) are not covered. The NULL case is also not covered (Utf(None)). Testing with a non-Utf8 type would also cover the error handling logic.