[GLUTEN-10134][VL] Add ANSI mode support for cast string to boolean#11437
[GLUTEN-10134][VL] Add ANSI mode support for cast string to boolean#11437malinjawi wants to merge 2 commits intoapache:mainfrom
Conversation
Implements ANSI-compliant string to boolean casting that throws exceptions for invalid inputs instead of returning null. Changes: - Add CastStringToBooleanAnsi.h with ANSI-compliant cast logic - Register spark_cast_string_to_boolean_ansi function in Velox - Update CastTransformer to route ANSI casts to custom function - Add literal evaluation optimization for compile-time casts - Include validation test suites for both ANSI and non-ANSI modes Contributes to issue apache#10134 (ANSI mode support)
|
Run Gluten Clickhouse CI on x86 |
|
Run Gluten Clickhouse CI on x86 |
|
@malinjawi, thanks for drafting this PR. We generally implement Spark functions in Velox, not in Gluten. I suggest firstly evaluating Velox's existing implementation for cast to see if we can just add if/else branch for ANSI support or add a separate implementation (if most code is not shared). Velox is now aware of the Spark ANSI setting. So you may get the ANSI enabling state from Velox config to implement some branching logic if needed. With the cast implemented in Velox, I assume most test cases proposed for Gluten can be moved to Velox. |
|
Run Gluten Clickhouse CI on x86 |
Thanks for the reply @PHILO-HE Just to confirm from a Gluten integration perspective: once the cast behavior is implemented and tested in Velox, are there any additional changes or validation needed in Gluten itself (e.g., wiring, config propagation, or integration tests), or should the Velox src/test coverage be sufficient? I want to make sure I’m aligning with the expected ownership and test boundaries going forward. |
|
@PHILO-HE Thanks again for the help. I have gone ahead and raised a fix on velox PR: facebookincubator/velox#16059 Let me know if I should go ahead and close this PR or if anything else is needed from a gluten side. |
You can add |
@malinjawi, Gluten has a test framework that allows importing Spark unit tests for the integration validation. We have imported many Spark test suites, like |
|
This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days. |
What changes are proposed in this pull request?
This PR implements ANSI-compliant string to boolean casting for the Velox backend, addressing part of issue #10134 (ANSI mode support).
Key Changes:
CastStringToBooleanAnsi.hwith a custom Velox function that implements Spark's ANSI cast semantics for string-to-boolean conversionspark_cast_string_to_boolean_ansifunction in Velox's function registryCastTransformerto detect ANSI mode and route string-to-boolean casts to the custom functionBehavior:
t,true,y,yes,1(true) andf,false,n,no,0(false)VELOX_USER_FAILexception with descriptive error messageFixes #10134 (partial - cast string to boolean component)
How was this patch tested?
Test Coverage:
ANSI Mode Tests (
CastStringToBooleanAnsiValidateSuite.scala):Non-ANSI Mode Tests (
CastStringToBooleanValidateSuite.scala):Validation:
Manual Testing:
spark.sql.ansi.enabled=trueandfalsespark_cast_string_to_boolean_ansiin ANSI mode