feat: add Unicode identifier support with major performance improvements #116
Hi @uksarkar! Thank you for this excellent contribution! I've reviewed the changes and the implementation looks great. The Unicode identifier support is well-designed with comprehensive test coverage across multiple scripts (Chinese, Japanese, Arabic, Bengali, Cyrillic, etc.). This PR currently has conflicts with the
(I apologize for merging this in this order. I didn't look ahead to see which would create conflicts for each other.) To resolve the conflicts, please rebase your branch:
# Update your local main branch
git checkout main
git pull upstream main
# Rebase your feature branch
git checkout <your-branch-name>
git rebase main
# Resolve the conflict in expressions/scanner.go
# The conflict is at the end of the file - just keep your isValidUnicodeIdentifier() function
# Continue the rebase
git add expressions/scanner.go
git rebase --continue
# Force push to update the PR
git push --force-with-lease origin <your-branch-name>
Reference: I've created a successfully rebased version at
Once rebased, this PR will be ready to merge! Great work on this feature.
As an alternative to rebasing, I can create a new PR from the rebased branch ( Your authorship will be fully preserved -- you'll still be credited as the author of the commit, and it will appear in your GitHub contribution history. Either way works! Let me know what you prefer.
I agree, remove
- Support Unicode letters in identifiers and properties across all scripts
- Full backward compatibility with existing ASCII templates
- Comprehensive test coverage for multiple scripts
The scanner now handles variables like {{ 用户名 }}, {{ ব্যবহারকারী }},
{{ المستخدم }} and properties like {{ 用户.姓名 }} efficiently.
Includes benchmark automation script for performance validation.
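To illustrate what accepting identifiers "across all scripts" involves, here is a minimal sketch in Go. It is not the code from this PR: the helper names (`isIdentStart`, `isIdentPart`, `looksLikeIdentifier`) and the exact rule (letter or underscore first, then letters, digits, underscores, or combining marks) are assumptions made for illustration. The one point worth noting is that combining marks have to be allowed in continuation position, because a word such as ব্যবহারকারী contains vowel signs and a virama that are not classified as letters.

```go
package main

import (
	"fmt"
	"unicode"
)

// isIdentStart reports whether r may begin an identifier.
// Assumed rule for this sketch: any Unicode letter or an underscore.
func isIdentStart(r rune) bool {
	return r == '_' || unicode.IsLetter(r)
}

// isIdentPart reports whether r may continue an identifier.
// Combining marks (category M) are accepted so that scripts such as
// Bengali, whose words contain vowel signs and viramas, stay in one token.
func isIdentPart(r rune) bool {
	return isIdentStart(r) || unicode.IsDigit(r) || unicode.IsMark(r)
}

// looksLikeIdentifier is a hypothetical helper: it checks a whole string
// against the assumed rule above. Invalid UTF-8 decodes to U+FFFD, which
// is neither a letter nor a mark, so such input is rejected.
func looksLikeIdentifier(s string) bool {
	if s == "" {
		return false
	}
	for i, r := range s {
		if i == 0 {
			if !isIdentStart(r) {
				return false
			}
			continue
		}
		if !isIdentPart(r) {
			return false
		}
	}
	return true
}

func main() {
	samples := []string{"user_name", "用户名", "ব্যবহারকারী", "المستخدم", "描述", "1abc", "a b"}
	for _, s := range samples {
		fmt.Printf("%q -> %v\n", s, looksLikeIdentifier(s))
	}
}
```

A real scanner would apply a rule like this while tokenizing rather than on whole strings, and would treat the dot in {{ 用户.姓名 }} as a property separator between two identifiers.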
Removed the
This test verifies that Unicode variable names, specifically the Chinese variable '描述' mentioned in issue #63, now work correctly after the fix in PR #116 which added Unicode identifier support. The test uses the exact example from the issue report to ensure the previously failing case now passes. Refs #63 Co-authored-by: Claude <noreply@anthropic.com>
Related Issues
Related to #63 - Unicode identifier support
Performance Improvements
Benchmark results show significant optimizations:
Checklist
- make test passes.
- make lint passes.