Conversation
…ection Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
WalkthroughThe changes introduce a new ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Code ReviewThe implementation is clean and follows the established pattern of CorrectnessThe core logic is correct: The test table schema confirms the expected results:
Issues1. Missed optimization: no Since nullability is a planning-time property (like the type returned by Example sketch: fn simplify(
&self,
args: Vec<Expr>,
info: &dyn SimplifyInfo,
) -> Result<ExprSimplifyResult> {
let nullable = info.nullable(&args[0])?;
Ok(ExprSimplifyResult::Simplified(lit(nullable)))
}2. SQL example table has mismatched column widths (in both The separator line has 28/24 dashes per column, but the actual column headers are 38/36 characters wide, producing a misaligned table. They should be: Minor Observations
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
datafusion/functions/src/core/is_nullable.rs (1)
56-56: Consider documenting the immutability semantics.
Volatility::Immutableis correct here since schema-level nullability doesn't change during query execution. This allows the optimizer to potentially constant-foldis_nullable(col)expressions when the schema is known. The current choice is appropriate.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@datafusion/functions/src/core/is_nullable.rs` at line 56, Add a brief doc comment explaining why Volatility::Immutable is used for the is_nullable function/operator: note that schema-level nullability does not change during query execution, so Signature::any(1, Volatility::Immutable) allows the optimizer to constant-fold is_nullable(col) when the schema is known; place this comment near the is_nullable function (or the location where Signature::any is called) and reference Volatility::Immutable and Signature::any in the note.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@datafusion/functions/src/core/is_nullable.rs`:
- Line 56: Add a brief doc comment explaining why Volatility::Immutable is used
for the is_nullable function/operator: note that schema-level nullability does
not change during query execution, so Signature::any(1, Volatility::Immutable)
allows the optimizer to constant-fold is_nullable(col) when the schema is known;
place this comment near the is_nullable function (or the location where
Signature::any is called) and reference Volatility::Immutable and Signature::any
in the note.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6ab4f094-1ffc-4172-9a09-51048a40b447
📒 Files selected for processing (7)
.cursor/rules.mdAGENTS.mddatafusion/functions/src/core/is_nullable.rsdatafusion/functions/src/core/mod.rsdatafusion/sqllogictest/src/test_context.rsdatafusion/sqllogictest/test_files/is_nullable.sltdocs/source/user-guide/sql/scalar_functions.md
There was a problem hiding this comment.
Code Review
This pull request introduces the is_nullable scalar function to DataFusion, which returns whether an expression's field is nullable based on the schema. The implementation includes the function logic, registration, SQL logic tests, and documentation. Feedback includes removing the "Hash" trait from the "IsNullableFunc" struct to avoid compilation errors, correcting the ASCII table alignment in the documentation example, and avoiding manual edits to the auto-generated "scalar_functions.md" file.
| description = "Expression to evaluate. The expression can be a constant, column, or function, and any combination of operators." | ||
| ) | ||
| )] | ||
| #[derive(Debug, PartialEq, Eq, Hash)] |
There was a problem hiding this comment.
The Hash trait cannot be derived for IsNullableFunc because the signature field of type Signature does not implement Hash. Since ScalarUDFImpl does not require the implementation to be Hash, and ScalarUDF hashes based on the function name, you should remove Hash from the derive list to avoid a compilation error.
| #[derive(Debug, PartialEq, Eq, Hash)] | |
| #[derive(Debug, PartialEq, Eq)] |
| +----------------------------+------------------------+ | ||
| | is_nullable(table_with_metadata.name) | is_nullable(table_with_metadata.ts) | | ||
| +----------------------------+------------------------+ | ||
| | true | false | | ||
| +----------------------------+------------------------+ |
There was a problem hiding this comment.
The ASCII table in the SQL example is misaligned. The column headers are significantly longer than the separator lines and the data rows, which makes the documentation harder to read.
| +----------------------------+------------------------+ | |
| | is_nullable(table_with_metadata.name) | is_nullable(table_with_metadata.ts) | | |
| +----------------------------+------------------------+ | |
| | true | false | | |
| +----------------------------+------------------------+ | |
| +---------------------------------------+-------------------------------------+ | |
| | is_nullable(table_with_metadata.name) | is_nullable(table_with_metadata.ts) | | |
| +---------------------------------------+-------------------------------------+ | |
| | true | false | | |
| +---------------------------------------+-------------------------------------+ |
| - [arrow_try_cast](#arrow_try_cast) | ||
| - [arrow_typeof](#arrow_typeof) | ||
| - [get_field](#get_field) | ||
| - [is_nullable](#is_nullable) |
There was a problem hiding this comment.
This file is automatically generated by the dev/update_function_docs.sh script, as noted in the header of the file. You should avoid editing it manually. Instead, ensure the documentation in the ScalarUDFImpl implementation is correct and run the update script to regenerate this file. This ensures consistency and prevents your changes from being overwritten in the future.
🤖 Augment PR SummarySummary: This PR adds a new core scalar function Changes:
Technical Notes: The implementation reads nullability from 🤖 Was this summary useful? React with 👍 or 👎 |
| &self.signature | ||
| } | ||
|
|
||
| fn return_type(&self, _arg_types: &[DataType]) -> Result<DataType> { |
There was a problem hiding this comment.
datafusion/functions/src/core/is_nullable.rs:70: ScalarUDFImpl’s default return_field_from_args marks outputs as nullable, but is_nullable always returns a non-NULL boolean (ScalarValue::Boolean(Some(...))), so the planned output field metadata may be incorrect.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| arg1 | ||
| ),( | ||
| is_nullable, | ||
| "Returns whether the input expression is nullable", |
There was a problem hiding this comment.
datafusion/functions/src/core/mod.rs:126: The expr_fn docs say “Returns whether the input expression is nullable”, which could be read as runtime NULL-ness; the function actually reports schema/field nullability as documented elsewhere.
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
21387: To review by AI