Grammar railroad diagram #3

@mingodad

It would be nice if TameParse could also generate an EBNF grammar in the form understood by https://www.bottlecaps.de/rr/ui, which renders railroad diagrams (https://en.wikipedia.org/wiki/Syntax_diagram).

I have extended bison, byacc, lemon and btyacc to do so; the results can be seen at https://github.com/mingodad/lalr-parser-test , along with CocoR at https://github.com/mingodad/CocoR-Java , unicc at https://github.com/mingodad/unicc , and peg/leg at https://github.com/mingodad/peg .

It would also be nice to have it output a consolidated EBNF that gives a full, global view of the final grammar, because with inheritance the final grammar can be composed from several pieces.
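To make the idea of a consolidated view concrete, here is a minimal Python sketch of merging rules from a base language and a derived language into one EBNF listing. Everything in it is an assumption for illustration: the function names `consolidate` and `to_ebnf`, the tuple layout, and the merge semantics (`=` and `replace` redefine a rule, `|=` appends alternatives) are inferred from the operators in the grammar and are not taken from TameParse's implementation.

```python
def consolidate(layers):
    """Merge rule definitions, ordered from base language to most derived.

    Each layer is a list of (name, op, alternatives) tuples, where op is
    '=', '|=' or 'replace'. The semantics are an assumption: '|=' appends
    alternatives, while '=' and 'replace' both (re)define the rule.
    """
    merged = {}
    for layer in layers:
        for name, op, alternatives in layer:
            if op == "|=":
                merged.setdefault(name, []).extend(alternatives)
            else:
                merged[name] = list(alternatives)
    return merged


def to_ebnf(merged):
    """Render the merged rules as `name ::= alt | alt` lines."""
    return "\n".join(
        f"{name} ::= {' | '.join(alts)}" for name, alts in merged.items()
    )


if __name__ == "__main__":
    # Hypothetical base and derived languages.
    base = [("Expr", "=", ["Term"]), ("Term", "=", ["number"])]
    derived = [
        ("Expr", "|=", ["Expr '+' Term"]),
        ("Term", "replace", ["number", "identifier"]),
    ]
    print(to_ebnf(consolidate([base, derived])))
```

The result is a single flat rule list, which is the form the railroad diagram tool expects.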

Below is a partial manual conversion of TameParse/Language/definition.tp to an EBNF understood by https://www.bottlecaps.de/rr/ui .

Copy and paste the EBNF shown below into https://www.bottlecaps.de/rr/ui on the Edit Grammar tab, then click the View Diagram tab to see/download a navigable railroad diagram.

//
// The top-level definitions
//

Parser-Language		::= (TopLevel-Block)*

TopLevel-Block		::= Language-Block
						| Import-Block
						| Parser-Block
						| Test-Block

Language-Block		::= language identifier/*[name]*/ (Language-Inherits)? '{' (Language-Definition)* '}'

Import-Block			::= import string/*[filename]*/

Language-Inherits		::= ':' identifier/*[inherit-from]*/

//
// The language block
//

Language-Definition	::= Lexer-Symbols-Definition
						| Lexer-Definition
						| Ignore-Definition
						| Keywords-Definition
						| Grammar-Definition
						| Precedence-Definition

//
// Basic language items
//

Lexer-Symbols-Definition	::= Lexer-Symbols-Modifier*/*[modifiers]*/ lexer-symbols '{' (Lexeme-Definition)*/*[definitions]*/ '}'

Lexer-Definition 			::= /*[=> Lexer-Modifier* lexer]*/ Lexer-Modifier*/*[modifiers]*/ lexer '{' (Lexeme-Definition)*/*[definitions]*/ '}'

Ignore-Definition			::= ignore '{' (Keyword-Definition)*/*[definitions]*/ '}'

Keywords-Definition		::= /*[=> Lexer-Modifier* keywords]*/ Lexer-Modifier*/*[modifiers]*/ keywords '{' (Keyword-Definition)*/*[definitions]*/ '}'

Lexer-Modifier			::= weak
							| case sensitive
							| case insensitive

Lexer-Symbols-Modifier	::= case sensitive
							| case insensitive

Keyword-Definition		::= identifier/*[literal]*/
							| Lexeme-Definition/*[lexeme]*/

Lexeme-Definition 		::= identifier/*[name]*/ ('=' | "|=") (regex | string | character)
							| /*[=> replace identifier '=']*/ replace identifier/*[name]*/ '=' (regex | string | character)
							| identifier/*[name]*/ '=' identifier/*[source-language]*/ '.' identifier/*[source-name]*/

//
// Defining grammars
//

Grammar-Definition		::= grammar '{' (Nonterminal-Definition)*/*[nonterminals]*/ '}'

Nonterminal-Definition	::= /*[=> nonterminal ('=' | "|=")]*/ nonterminal ('=' | "|=") Production ('|' Production)*
							| /*[=> replace nonterminal '=']*/ replace nonterminal '=' Production ('|' Production)*

// Top level is just a simple EBNF term, as the '|' operator creates a new production at this point
Production				::= (Simple-Ebnf-Item)*/*[items]*/

Ebnf-Item					::= (Simple-Ebnf-Item)*/*[items]*/
							| (Simple-Ebnf-Item)*/*[items]*/ '|' Ebnf-Item/*[or-item]*/

Simple-Ebnf-Item			::= Nonterminal Semantic-Specification?
							| Terminal Semantic-Specification?
							| Guard Semantic-Specification?
							| Simple-Ebnf-Item '*' Semantic-Specification?
							| Simple-Ebnf-Item '+' Semantic-Specification?
							| Simple-Ebnf-Item '?' Semantic-Specification?
							| '(' Ebnf-Item ')' Semantic-Specification?

Guard						::= "[=>" Ebnf-Item ']'
							| "[=>" '[' can-clash ']' Ebnf-Item ']'

Nonterminal				::= nonterminal
							| identifier/*[source-language]*/ '.' nonterminal

Terminal					::= Basic-Terminal
							| identifier/*[source-language]*/ '.' Basic-Terminal

Basic-Terminal			::= identifier/*[lexeme-name]*/
							| string
							| character

//
// Semantics
//

Semantic-Specification	::= '[' Semantic-Item/*[first-item]*/ (',' Semantic-Item)*/*[more-items]*/ ']'

Semantic-Item				::= identifier/*[name]*/
							| conflict '=' shift
							| conflict '=' reduce
							| conflict '=' weak reduce

//
// Defining precedence
//

Precedence-Definition		::= precedence '{' Precedence-Item*/*[items]*/ '}'

Precedence-Item			::= left Equal-Precedence-Items
							| right Equal-Precedence-Items
							| non-associative Equal-Precedence-Items
							| non-assoc Equal-Precedence-Items

Equal-Precedence-Items	::= Simple-Ebnf-Item
							| '{' Simple-Ebnf-Item*/*[terminals]*/ '}'

//
// The parser declaration block
//

Parser-Block				::= parser identifier/*[name]*/ ':' identifier/*[language-name]*/ '{' (Parser-StartSymbol)+/*[start-symbols]*/ '}'

Parser-StartSymbol		::= Nonterminal

//
// Test definition block
//

Test-Block				::= test identifier/*[language-name]*/ '{' Test-Definition*/*[tests]*/ '}'

Test-Definition			::= Nonterminal '=' Test-Specification+
							| Nonterminal "!=" Test-Specification+
							| Nonterminal from Test-Specification+

Test-Specification		::= string
							| /*[=> identifier '(']*/ identifier '(' string ')'


/// Weak keywords
/// Declared here to suppress warnings
//weak keywords {
	//\(\S+\) -> \1 ::= "\1"
language ::= "language"
import ::= "import"
lexer-symbols ::= "lexer-symbols"
lexer ::= "lexer"
ignore ::= "ignore"
weak ::= "weak"
keywords ::= "keywords"
grammar ::= "grammar"
replace ::= "replace"
parser ::= "parser"
test ::= "test"
from ::= "from"
case ::= "case"
sensitive ::= "sensitive"
insensitive ::= "insensitive"
precedence ::= "precedence"

left ::= "left"
right ::= "right"
non-associative ::= "non-associative"
non-assoc ::= "non-assoc"

conflict ::= "conflict"
shift ::= "shift"
reduce ::= "reduce"

can-clash ::= "can-clash"
//}
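
The substitution noted in the comment above (`\(\S+\) -> \1 ::= "\1"`) turns each weak keyword into a rule of its own. A minimal Python sketch of the same transformation follows, assuming the keywords arrive as a whitespace-separated string; the function name `keyword_rules` is hypothetical and not part of TameParse.

```python
import re


def keyword_rules(keywords: str) -> list[str]:
    """Turn whitespace-separated keywords into `name ::= "name"` rules."""
    # Same substitution as the editor regex: capture the word, emit it twice.
    return [re.sub(r"(\S+)", r'\1 ::= "\1"', word) for word in keywords.split()]


if __name__ == "__main__":
    for rule in keyword_rules("language import lexer-symbols can-clash"):
        print(rule)
```

Hyphenated keywords such as `can-clash` pass through unchanged because `\S+` matches any run of non-whitespace characters.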
