<regex>: (a)(\3)(c) should be accepted #6091

@Alcaro

Describe the bug

Sorry, @muellerj2, I missed a spot when I wrote that regex test suite a while ago.

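ECMAScript 5.1, section 15.10.2.11 (https://262.ecma-international.org/5.1/#sec-15.10.2.11) says: "It is an error if n is greater than the total number of left capturing parentheses in the entire regular expression."

Therefore, (a)(\3)(c) is valid: the whole expression has three left capturing parentheses, so \3 is within range. (The \3 will only match the empty string, because group 3's closing parenthesis hasn't been reached yet.) (a)(\4)(c), however, is not valid, since there is no fourth group.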

Yes, that rule is absurd (much better to accept only up to the number of left parens seen - or even better, only accept backrefs that could be nonnull at this point, i.e. reject (a)|\1 and (?!(a))\1), but if the spec says so, then the spec says so.

Personally I'd implement it by compiling such impossible backrefs into absolutely nothing, just set a counter for highest backref seen. Then check that variable at compilation end, when the entire regex is analyzed.
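The counter idea above can be sketched as a standalone pre-pass. This is only an illustration, not the STL's parser: `BackrefScan`, `scan_backrefs`, and `backrefs_are_valid` are hypothetical names, and the scan assumes ES5.1-style syntax where every `(?` group is non-capturing and a `\N` inside a character class is not a backreference.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper type; not part of any STL implementation.
struct BackrefScan {
    std::size_t capture_groups = 0;   // left capturing parentheses in the whole pattern
    std::size_t highest_backref = 0;  // largest \N seen outside a character class
};

// Walk the pattern once, counting capturing groups and remembering the
// highest backreference number. Nothing is rejected during the walk.
BackrefScan scan_backrefs(std::string_view pattern) {
    BackrefScan result;
    bool in_class = false;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const char ch = pattern[i];
        if (ch == '\\') {
            ++i;  // step onto the escaped character
            if (i >= pattern.size()) break;
            if (!in_class && pattern[i] >= '1' && pattern[i] <= '9') {
                std::size_t n = 0;  // assumption: the number is small enough not to overflow
                while (i < pattern.size() && pattern[i] >= '0' && pattern[i] <= '9') {
                    n = n * 10 + static_cast<std::size_t>(pattern[i] - '0');
                    ++i;
                }
                --i;  // the for loop advances past the last digit
                if (n > result.highest_backref) result.highest_backref = n;
            }
            continue;
        }
        if (in_class) {
            if (ch == ']') in_class = false;
            continue;
        }
        if (ch == '[') {
            in_class = true;
            continue;
        }
        // "(?" introduces a non-capturing group or a lookahead in ES5.1.
        if (ch == '(' && !(i + 1 < pattern.size() && pattern[i + 1] == '?')) {
            ++result.capture_groups;
        }
    }
    return result;
}

// The check runs only after the entire pattern has been analyzed.
bool backrefs_are_valid(std::string_view pattern) {
    const BackrefScan scan = scan_backrefs(pattern);
    return scan.highest_backref <= scan.capture_groups;
}
```

With this sketch, `backrefs_are_valid("(a)(\\3)(c)")` is true and `backrefs_are_valid("(a)(\\4)(c)")` is false. A real parser would do the same bookkeeping inline while compiling, rather than in a separate pass.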

Luckily, we don't need that JS compat hack where (a)(\4)(c) looks for a \x04 byte.

Command-line test case

Full test suite - https://godbolt.org/z/bffnYj7xT
Just this issue - https://godbolt.org/z/r6xqPKrWb
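
For readers who cannot open the links, a minimal probe might look like the following. It is not copied from the Godbolt test cases; `library_accepts` is a hypothetical helper that only reports whether `std::regex` construction throws `std::regex_error`, and the result for (a)(\3)(c) depends on the standard library in use.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the library compiles the pattern as ECMAScript.
bool library_accepts(const std::string& pattern) {
    try {
        const std::regex re(pattern, std::regex::ECMAScript);
        (void)re;  // only construction matters here
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}
```

Per the spec text quoted above, `library_accepts("(a)(\\3)(c)")` should be true and `library_accepts("(a)(\\4)(c)")` should be false.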

Expected behavior

See bug description

STL version

x64 msvc v19.50 VS18.2 (Godbolt)

Additional context

🦭

Labels

bug (Something isn't working), regex (meow is a substring of homeowner)