fix(builtins): prevent awk parser panic on multi-byte UTF-8 #476

Merged
chaliy merged 1 commit into main from claude/fix-395-Y2nIj
Mar 2, 2026

@chaliy chaliy commented Mar 2, 2026

Summary

  • Add current_char() and advance() helpers for char-boundary safe parsing
  • Replace all 66+ chars().nth(self.pos) calls with current_char()
  • Fix string, regex, identifier, and comment parsing for multi-byte chars
  • Fix matches_keyword lookahead to use slice-based char access
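The helpers named in the summary could look roughly like the sketch below. The PR text does not show the actual code, so the `Parser` struct, its fields, and the bodies of `current_char()`, `advance()`, and `matches_keyword()` are assumptions for illustration only; the identifier-character check in particular is a guess at what the lookahead needs.

```rust
/// Hypothetical stand-in for the awk parser state (assumed, not from the PR).
struct Parser<'a> {
    input: &'a str,
    /// Byte offset into `input`; the helpers keep it on a char boundary.
    pos: usize,
}

impl<'a> Parser<'a> {
    fn new(input: &'a str) -> Self {
        Parser { input, pos: 0 }
    }

    /// Char that starts at the current byte offset, or None at end of input.
    /// `str::get` returns None instead of panicking on a bad offset.
    fn current_char(&self) -> Option<char> {
        self.input.get(self.pos..).and_then(|rest| rest.chars().next())
    }

    /// Step past the current char by its UTF-8 width (1 to 4 bytes).
    fn advance(&mut self) {
        if let Some(c) = self.current_char() {
            self.pos += c.len_utf8();
        }
    }

    /// Slice-based lookahead: true if `kw` starts at the current position
    /// and is not immediately followed by an identifier character.
    fn matches_keyword(&self, kw: &str) -> bool {
        let rest = match self.input.get(self.pos..) {
            Some(rest) => rest,
            None => return false,
        };
        match rest.strip_prefix(kw) {
            Some(after) => !after
                .chars()
                .next()
                .map_or(false, |c| c.is_alphanumeric() || c == '_'),
            None => false,
        }
    }
}
```

Because `advance()` moves by `len_utf8()` rather than by 1, `pos` stays a valid slice index even after a multi-byte character, which is the property the slice-based lookahead relies on.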

Closes #395

The awk parser tracked its position as a byte offset but read characters
with chars().nth(), which takes a char index. The mismatch caused panics
when multi-byte UTF-8 appeared in comments, strings, or regex patterns.
This change adds current_char()/advance() helpers for safe char-boundary
handling and replaces all chars().nth(byte_offset) patterns with
byte-safe slicing.
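To illustrate how a byte offset and a char index drift apart, here is a minimal sketch. The function names `char_by_index` and `char_by_slice` are hypothetical and do not come from the PR; they only contrast the two access patterns the description mentions.

```rust
/// Old pattern: passes a byte offset where `nth` expects a char index.
/// After any multi-byte char the two no longer agree.
fn char_by_index(input: &str, byte_pos: usize) -> Option<char> {
    input.chars().nth(byte_pos)
}

/// Byte-safe pattern: slice at the byte offset, then take the first char.
/// `str::get` yields None if `byte_pos` is out of range or not on a
/// char boundary, so this never panics.
fn char_by_slice(input: &str, byte_pos: usize) -> Option<char> {
    input.get(byte_pos..)?.chars().next()
}
```

For the input `"é+1"`, the `é` occupies bytes 0 and 1, so the `+` sits at byte offset 2. `char_by_slice(input, 2)` returns `'+'`, while `char_by_index(input, 2)` returns `'1'` because it counts chars rather than bytes. Indexing directly with `&input[1..]` would panic, since offset 1 falls inside the `é`.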

@chaliy chaliy force-pushed the claude/fix-395-Y2nIj branch from 5f75382 to 49a1814 Compare March 2, 2026 18:05
@chaliy chaliy merged commit f527d64 into main Mar 2, 2026
17 checks passed
@chaliy chaliy deleted the claude/fix-395-Y2nIj branch March 12, 2026 03:44

Development

Successfully merging this pull request may close these issues.

fix(builtins): awk parser panics on Unicode char boundaries