Description
Describe the bug
Sorry, @muellerj2, I missed a spot when I wrote that regex test suite a while ago.
ECMAScript 5.1, section 15.10.2.11 (https://262.ecma-international.org/5.1/#sec-15.10.2.11), says: "It is an error if n is greater than the total number of left capturing parentheses in the entire regular expression."
Therefore, (a)(\3)(c) is valid. (The \3 will only match the empty string, because the closing ) of group 3 hasn't been reached yet.) (a)(\4)(c), however, is not, because the whole regex contains only three left capturing parentheses.
Yes, that rule is absurd. It would be much better to accept backrefs only up to the number of left parens seen so far, or better still, to accept only backrefs that could be non-null at that point (i.e. reject (a)|\1 and (?!(a))\1). But if the spec says so, then the spec says so.
Personally, I'd implement it by compiling such impossible backrefs into absolutely nothing and just tracking the highest backref number seen. Then check that counter at the end of compilation, once the entire regex has been analyzed.
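A minimal sketch of that "count now, check at the end" idea, written as a standalone scan rather than as a patch to the actual parser. The name backrefs_in_range is hypothetical, and the scan is deliberately simplified: it assumes the ECMAScript grammar, treats any "(?" group as non-capturing, skips the contents of character classes, and does not validate anything else about the pattern.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper for illustration only; not part of any standard library.
// Returns true if every backreference \n in the pattern satisfies
// n <= total number of left capturing parentheses in the ENTIRE pattern.
constexpr bool backrefs_in_range(std::string_view pattern) {
    std::size_t group_count = 0; // left capturing parentheses seen so far
    std::size_t max_backref = 0; // highest backreference number seen so far
    bool in_class = false;       // currently inside [...]

    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const char c = pattern[i];
        if (c == '\\') {
            ++i; // move to the escaped character
            if (i >= pattern.size()) {
                break; // trailing backslash; out of scope for this sketch
            }
            if (!in_class && pattern[i] >= '1' && pattern[i] <= '9') {
                std::size_t n = 0;
                while (i < pattern.size() && pattern[i] >= '0' && pattern[i] <= '9') {
                    n = n * 10 + static_cast<std::size_t>(pattern[i] - '0');
                    ++i;
                }
                --i; // the loop increment steps past the last digit
                if (n > max_backref) {
                    max_backref = n; // only record it; do not reject yet
                }
            }
        } else if (in_class) {
            if (c == ']') {
                in_class = false;
            }
        } else if (c == '[') {
            in_class = true;
        } else if (c == '(') {
            // "(?" introduces (?:...), (?=...), (?!...); none of these capture.
            const bool non_capturing = i + 1 < pattern.size() && pattern[i + 1] == '?';
            if (!non_capturing) {
                ++group_count;
            }
        }
    }
    // The range check happens only after the whole pattern has been scanned.
    return max_backref <= group_count;
}

// The two patterns from this report:
static_assert(backrefs_in_range("(a)(\\3)(c)"), "forward reference to group 3 is in range");
static_assert(!backrefs_in_range("(a)(\\4)(c)"), "only three groups exist");
```

Deferring the comparison to the end is what lets a forward reference like \3 pass even though group 3 has not been opened when the backref is parsed.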
Luckily, we don't need that JS compat hack where (a)(\4)(c) looks for a \x04 byte.
Command-line test case
Full test suite - https://godbolt.org/z/bffnYj7xT
Just this issue - https://godbolt.org/z/r6xqPKrWb
Expected behavior
See bug description
STL version
x64 msvc v19.50 VS18.2 (Godbolt)
Additional context
🦭