[GLUTEN-11012][CH] Support Thai/Khmer digit dates in CH #11476

Merged
baibaichen merged 6 commits into apache:main from zhanglistar:fix/thai-numeral-date-main
Feb 2, 2026

@zhanglistar
Contributor

What changes are proposed in this pull request?

Scan UTF-8 strings for local digits before conversion and add regression queries for Thai and Khmer numeral date parsing in the function suite.
Fix #11012.
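
As a rough illustration of the idea in the description (not the PR's C++ implementation), the normalization step can be sketched in Python: local digits are mapped to ASCII before the string reaches a date parser. The names `normalize_local_digits` and `parse_date`, the set of scripts covered, and the `%Y-%m-%d` format are assumptions made for this sketch.

```python
from datetime import datetime

# Code point of digit zero in each script. Every block listed here holds the
# ten digits 0-9 consecutively. The choice of scripts is an assumption.
_ZERO_CODE_POINTS = {
    "arabic-indic": 0x0660,
    "devanagari": 0x0966,
    "bengali": 0x09E6,
    "thai": 0x0E50,
    "khmer": 0x17E0,
}

# Translation table: local digit code point -> ASCII digit code point.
_TRANSLATION = {
    zero + value: ord("0") + value
    for zero in _ZERO_CODE_POINTS.values()
    for value in range(10)
}


def normalize_local_digits(text: str) -> str:
    """Replace supported local digits with ASCII digits.

    All other characters are returned unchanged.
    """
    if text.isascii():
        # Pure ASCII input cannot contain local digits; nothing to convert.
        return text
    return text.translate(_TRANSLATION)


def parse_date(text: str, fmt: str = "%Y-%m-%d") -> datetime:
    """Hypothetical helper: normalize digits first, then parse the date."""
    return datetime.strptime(normalize_local_digits(text), fmt)
```

With this sketch, a date written in Thai digits such as `"\u0e52\u0e50\u0e52\u0e54-\u0e50\u0e51-\u0e51\u0e55"` normalizes to `"2024-01-15"` and then parses like any ASCII date string.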

How was this patch tested?

Unit tests (UT).

@github-actions

Run Gluten Clickhouse CI on x86

@zhanglistar zhanglistar requested a review from lgbo-ustc January 26, 2026 03:07
@zhanglistar zhanglistar force-pushed the fix/thai-numeral-date-main branch from e145b6c to d713a30 Compare January 26, 2026 09:59

Scan UTF-8 strings for local digits before conversion and add regression
queries for Thai and Khmer numeral date parsing in the function suite.
Add comments describing the base64-encoded local digit date fixtures used in local digit date tests.
Use SIMD-based ASCII detection, fast-path common UTF-8 digit ranges, and avoid double scans when converting local digits.
Preserve original UTF-8 bytes when no local digit is detected in multi-byte sequences, and downgrade logging to debug.
Map UTF-8 byte ranges to correct digit values for Devanagari and Bengali local digits.
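
The commit messages above describe a byte-level strategy: an ASCII fast path, a single pass over the buffer, preservation of multi-byte sequences that are not local digits, and per-script UTF-8 byte ranges. A minimal Python sketch of that strategy follows. It is not the PR's code: the function name `convert_local_digits`, the table layout, and the choice of scripts are assumptions, `bytes.isascii()` merely stands in for the SIMD-based ASCII check, and handling of malformed input is simplified.

```python
# Leading bytes of each script's digit block -> final byte of digit zero.
# Derived from the Unicode code points; digits 0-9 are contiguous in each block.
_DIGIT_PREFIXES = {
    b"\xd9": 0xA0,      # Arabic-Indic U+0660..U+0669 -> D9 A0..A9
    b"\xe0\xa5": 0xA6,  # Devanagari   U+0966..U+096F -> E0 A5 A6..AF
    b"\xe0\xa7": 0xA6,  # Bengali      U+09E6..U+09EF -> E0 A7 A6..AF
    b"\xe0\xb9": 0x90,  # Thai         U+0E50..U+0E59 -> E0 B9 90..99
    b"\xe1\x9f": 0xA0,  # Khmer        U+17E0..U+17E9 -> E1 9F A0..A9
}


def _sequence_length(lead: int) -> int:
    """Length of the UTF-8 sequence announced by a lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    return 1  # stray continuation or invalid lead byte: copy it as-is


def convert_local_digits(data: bytes) -> bytes:
    """Convert supported local digits to ASCII in one pass over the bytes."""
    if data.isascii():
        # Fast path: pure ASCII input is returned untouched.
        return data
    out = bytearray()
    i = 0
    while i < len(data):
        length = _sequence_length(data[i])
        seq = data[i:i + length]
        if length > 1 and len(seq) == length:
            zero = _DIGIT_PREFIXES.get(seq[:-1])
            if zero is not None and zero <= seq[-1] <= zero + 9:
                # Local digit: emit the matching ASCII digit.
                out.append(ord("0") + seq[-1] - zero)
                i += length
                continue
        # Not a local digit: preserve the original bytes exactly.
        out += seq
        i += length
    return bytes(out)
```

In this sketch the digit value comes from the offset of the final byte within its script's range, so Devanagari `E0 A5 A8` and Bengali `E0 A7 A8` both become `2`, while a non-digit sequence such as `C3 A9` is copied through unchanged.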
@zhanglistar zhanglistar force-pushed the fix/thai-numeral-date-main branch from 0601e08 to fc9ec4c Compare February 2, 2026 04:34

@baibaichen baibaichen merged commit 02c776f into apache:main Feb 2, 2026
7 checks passed
Development

Successfully merging this pull request may close these issues.

[CH] support arabic indic digits in unix_timestamp

2 participants