Skip to content

Conversation

@uksarkar
Copy link
Contributor

@uksarkar uksarkar commented Oct 30, 2025

  • Support Unicode letters in identifiers and properties across all scripts
  • Full backward compatibility with existing ASCII templates
  • Comprehensive test coverage for multiple scripts

The scanner now handles variables like {{ 用户名 }}, {{ ব্যবহারকারী }}, {{ المستخدم }} and properties like {{ 用户.姓名 }} efficiently.

Includes benchmark automation script for performance validation.

Related Issues

Related to #63 - Unicode identifier support

Performance Improvements

Benchmark results show significant optimizations:

  • Unicode scanning: 60% faster (5.4µs → 2.1µs)
  • Unicode memory: 76% less (47.5KB → 11.5KB)
  • Overall performance: 35% improvement (geomean)
  • ASCII performance is maintained with minimal impact (5.73% slower)

Checklist

  • I have read the contribution guidelines.
  • make test passes.
  • make lint passes.
  • New and changed code is covered by tests.
  • Performance improvements include benchmarks.
  • Changes match the documented (not just the implemented) behavior of Shopify.

@osteele
Copy link
Owner

osteele commented Nov 6, 2025

Hi @uksarkar!

Thank you for this excellent contribution! I've reviewed the changes and the implementation looks great. The Unicode identifier support is well-designed with comprehensive test coverage across multiple scripts (Chinese, Japanese, Arabic, Bengali, Cyrillic, etc.).

This PR currently has conflicts with the main branch due to two PRs that were merged after your branch was created:

(I apologize for merging this in this order. I didn't look ahead to see which would create conflicts for each other.)

To resolve the conflicts, please rebase your branch:

# Update your local main branch
git checkout main
git pull upstream main

# Rebase your feature branch
git checkout <your-branch-name>
git rebase main

# Resolve the conflict in expressions/scanner.go
# The conflict is at the end of the file - just keep your isValidUnicodeIdentifier() function

# Continue the rebase
git add expressions/scanner.go
git rebase --continue

# Force push to update the PR
git push --force-with-lease origin <your-branch-name>

Reference: I've created a successfully rebased version at claude/rebase-review-liquid-pr-011CUrTC3VYbQLMokpZCLa84 that you can reference or reset to if needed. The only conflict was adding the isValidUnicodeIdentifier() function after the Error() method in expressions/scanner.go.

Once rebased, this PR will be ready to merge! Great work on this feature.

@osteele
Copy link
Owner

osteele commented Nov 6, 2025

As an alternative, to rebasing I can create a new PR from the rebased branch (claude/rebase-review-liquid-pr-011CUrTC3VYbQLMokpZCLa84). Then we'd close this PR without merging.

Your authorship will be fully preserved -- you'll still be credited as the author of the commit, and it will appear in your GitHub contribution history.

Either way works! Let me know what you prefer.

@uksarkar
Copy link
Contributor Author

uksarkar commented Nov 6, 2025

Thank you, @osteele . I’ve resolved the conflicts. I’m also considering removing this file, as it doesn’t appear to be needed in the main branch. Would you have any recommendations?

@osteele
Copy link
Owner

osteele commented Nov 7, 2025

I’m also considering removing this file, as it doesn’t appear to be needed in the main branch. Would you have any recommendations?

I agree, remove bench.sh. It was useful for validating the work, but shouldn't go into the main branch.

- Support Unicode letters in identifiers and properties across all scripts
- Full backward compatibility with existing ASCII templates
- Comprehensive test coverage for multiple scripts

The scanner now handles variables like {{ 用户名 }}, {{ ব্যবহারকারী }},
{{ المستخدم }} and properties like {{ 用户.姓名 }} efficiently.

Includes benchmark automation script for performance validation.
@uksarkar
Copy link
Contributor Author

uksarkar commented Nov 7, 2025

Removed the bench.sh file, please review and merge the PR.

@osteele osteele merged commit 4237459 into osteele:main Nov 8, 2025
8 checks passed
osteele pushed a commit that referenced this pull request Nov 8, 2025
This test verifies that Unicode variable names, specifically the Chinese
variable '描述' mentioned in issue #63, now work correctly after the fix
in PR #116 which added Unicode identifier support.

The test uses the exact example from the issue report to ensure the
previously failing case now passes.

Refs #63
osteele added a commit that referenced this pull request Nov 8, 2025
This test verifies that Unicode variable names, specifically the Chinese
variable '描述' mentioned in issue #63, now work correctly after the fix
in PR #116 which added Unicode identifier support.

The test uses the exact example from the issue report to ensure the
previously failing case now passes.

Refs #63

Co-authored-by: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants