Returning an error in an sdt action produces the "wrong" ErrorToken #121

@kfsone

note: not using any of my branches for this

Scenario: An sdt action that returns an error produces an error token corresponding to the end of the match. This is frequently undesirable:

newline : '\n';
ident : 'a'-'z' { 'a'-'z' };

<<import "fmt">>

Rule: P1 newline P2 newline P3;

P1 : ident << func()interface{} {fmt.Printf("P1 %#+v\n", $0); return $0}(), nil >>;
P2 : ident << $0, func() error { fmt.Printf("P2 %#+v\n", $0); return nil }() >>;
P3 : ident << nil, fmt.Errorf("should be line 3 col 1") >>;

And then a simple parser wrapper:

func main() {
        l := lexer.NewLexer([]byte("a\nb\nc"))
        p := parser.NewParser()
        _, err := p.Parse(l)
        fmt.Printf("%+v\n", err)
}

the output you get is:

> go run .
P1 &token.Token{Type:3, Lit:[]uint8{0x61}, Pos:token.Pos{Offset:0, Line:1, Column:1}}
P2 &token.Token{Type:3, Lit:[]uint8{0x62}, Pos:token.Pos{Offset:2, Line:2, Column:1}}
Error in S7: $(1,), Pos(offset=5, line=3, column=2): should be line 3 col 1

I'm not sure whether this is by design or a bug; I can sort of see how choosing a token when the production has 0 or many might also be "wrong".

Perhaps a solution would be to allow the user to return a token with the error and have that be the ErrorToken?

P3: "*" identifier "*" << $1, errors.New("can't use identifier between asterisks, that's just rude") >>;

and then the parser would use the first return value if it passes a *token.Token type switch?

  switch t := attr.(type) {
  case *token.Token:
    e.ErrorToken = t
  default:
    /* untouched */
  }
