AST Format
The source code is parsed to a syntax tree, which is stored as a flat two-dimensional array. Each row of the array corresponds to a branch or a leaf, depending on the branch type. The total number of elements is stored in the first element, st[0][0].
The columns of the array are as follows:
- AP_STI_BRTYPE - The branch type, one of the AP_BR_* constants.
- AP_STI_VALUE - The value associated with this branch/leaf. Meaning depends on AP_STI_BRTYPE.
- AP_STI_LEFT - Typically the index of the left child branch. Meaning depends on AP_STI_BRTYPE.
- AP_STI_RIGHT - Typically the index of the right child branch. Meaning depends on AP_STI_BRTYPE.
- AP_STI_TOK_ABS - The absolute character offset of the branch within the source file.
- AP_STI_TOK_LINE - The line the branch appears at within the source file.
- AP_STI_TOK_COL - The column the branch appears at within the source file.
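As a rough illustration, the layout described above can be modelled in Python. This is only a sketch: the numeric column positions, the contents of row 0 beyond st[0][0], the sample offsets, and the use of strings for branch types are all assumptions made for the example, not values taken from the parser.

```python
# Hypothetical column positions. The real AP_STI_* values are defined by the parser.
AP_STI_BRTYPE, AP_STI_VALUE, AP_STI_LEFT, AP_STI_RIGHT = 0, 1, 2, 3
AP_STI_TOK_ABS, AP_STI_TOK_LINE, AP_STI_TOK_COL = 4, 5, 6


def make_row(brtype, value="", left="", right="", tok_abs=0, line=1, col=1):
    """Build one row of the flat tree in column order."""
    return [brtype, value, left, right, tok_abs, line, col]


# A hand-built tree for the expression "1 + 2".
# Row 0 is a header row: st[0][0] holds the number of branches/leaves.
# The remaining header cells are left blank here (assumption).
st = [
    [3, "", "", "", "", "", ""],
    make_row("AP_BR_OPERATOR", "+", 2, 3, tok_abs=2, col=3),
    make_row("AP_BR_NUMBER", "1", tok_abs=0, col=1),
    make_row("AP_BR_NUMBER", "2", tok_abs=4, col=5),
]


def describe(st, index):
    """Summarise one row: its type, value and source position."""
    row = st[index]
    return "%s %r at line %d, col %d" % (
        row[AP_STI_BRTYPE],
        row[AP_STI_VALUE],
        row[AP_STI_TOK_LINE],
        row[AP_STI_TOK_COL],
    )


if __name__ == "__main__":
    for i in range(1, st[0][0] + 1):
        print(i, describe(st, i))
```

Because children are referenced by row index, walking the tree is a matter of following AP_STI_LEFT and AP_STI_RIGHT from one row to another.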
Terminals are values that have no children, such as constants, variables, macros and keywords. A terminal has one of the following branch types:
- AP_BR_NUMBER - A numeric constant.
- AP_BR_STR - A string constant.
- AP_BR_VARIABLE - A variable.
- AP_BR_MACRO - A macro.
- AP_BR_PREPROC - A preprocessor statement. #include'd files may be automatically parsed into the same file.
- AP_BR_KEYWORD - A keyword (True, False, Default).
For all terminals, the AP_STI_VALUE column is filled with the text representation of the constant/variable. For variables and macros, the leading $ or @ is included in the value.
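A small Python sketch of how a consumer might recognise terminals and strip the sigil from variable and macro names. The column positions and the string stand-ins for the AP_BR_* constants are assumptions for illustration, and the helper names are hypothetical.

```python
# Hypothetical column positions and string stand-ins for the AP_BR_* constants.
AP_STI_BRTYPE, AP_STI_VALUE = 0, 1

TERMINALS = {
    "AP_BR_NUMBER",
    "AP_BR_STR",
    "AP_BR_VARIABLE",
    "AP_BR_MACRO",
    "AP_BR_PREPROC",
    "AP_BR_KEYWORD",
}


def is_terminal(row):
    """True if the row is a leaf (it has no child branches)."""
    return row[AP_STI_BRTYPE] in TERMINALS


def bare_name(row):
    """Return the value without the leading $ (variables) or @ (macros)."""
    value = row[AP_STI_VALUE]
    if row[AP_STI_BRTYPE] == "AP_BR_VARIABLE" and value.startswith("$"):
        return value[1:]
    if row[AP_STI_BRTYPE] == "AP_BR_MACRO" and value.startswith("@"):
        return value[1:]
    return value
```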
Represents a file within the AST. The AP_STI_VALUE column holds the file name, and AP_STI_LEFT contains a comma separated list of the top level statements within the file.
Represents an operator. The name/symbol of the operator is in the AP_STI_VALUE column, and the left and right hand sides of the operator are in AP_STI_LEFT and AP_STI_RIGHT respectively. Unary operators only have a left hand side, even if the child branch would appear on the right hand side in the source (e.g. Not True would have True on the LHS of the Not branch).
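The following Python sketch renders an operator subtree back to text, including the unary case where the only operand sits in the left column. It assumes hypothetical column positions, string stand-ins for the branch type constants, and that an unused right column is stored as an empty string.

```python
# Hypothetical column positions. Branch types are plain strings here.
AP_STI_BRTYPE, AP_STI_VALUE, AP_STI_LEFT, AP_STI_RIGHT = 0, 1, 2, 3


def render(st, index):
    """Recursively render the expression rooted at row `index`."""
    row = st[index]
    if row[AP_STI_BRTYPE] != "AP_BR_OPERATOR":
        # Terminals carry their text directly in the value column.
        return row[AP_STI_VALUE]
    left = render(st, row[AP_STI_LEFT])
    if row[AP_STI_RIGHT] == "":
        # Unary operator: the operand is stored on the left (assumes "" means unused).
        return "(%s %s)" % (row[AP_STI_VALUE], left)
    right = render(st, row[AP_STI_RIGHT])
    return "(%s %s %s)" % (left, row[AP_STI_VALUE], right)


# Hand-built tree for: Not True And $ready
st = [
    [4, "", "", ""],
    ["AP_BR_OPERATOR", "And", 2, 4],
    ["AP_BR_OPERATOR", "Not", 3, ""],
    ["AP_BR_KEYWORD", "True", "", ""],
    ["AP_BR_VARIABLE", "$ready", "", ""],
]

if __name__ == "__main__":
    print(render(st, 1))
```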
Identical to operators, except that they represent an assignment.
The definition of an ENUM set. The VALUE column holds the flags (see VARF_FLAGS) for the variables, and the RIGHT column is a comma separated list of AP_BR_DECL statements. A declaration in this list only has a value if one was explicitly assigned.
A variable declaration/definition statement. The VALUE column holds the flags (see VARF_FLAGS), the left child is the variable and the right hand side is the value.
A function definition. The VALUE column is the name of the function, the left side is a comma separated list of DECL branches for the parameters of the function, and the right side is a comma separated list of the statements making up the body of the function.
An IF statement. The VALUE column holds the condition (always the index of another branch), and the left child is a comma separated list of AP_BR_IF branches that are the ElseIfs of the statement. One of these might have a blank condition (VALUE) column, in which case it is the Else statement.
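A Python sketch of separating the ElseIf branches from the Else branch. It assumes hypothetical column positions, that the comma separated list is stored as a string of row indexes, and that a blank condition is an empty string. The helper names are made up for this example.

```python
# Hypothetical column positions.
AP_STI_BRTYPE, AP_STI_VALUE, AP_STI_LEFT, AP_STI_RIGHT = 0, 1, 2, 3


def split_indexes(cell):
    """Turn a comma separated cell such as "3,5" into [3, 5]."""
    return [int(part) for part in str(cell).split(",") if part.strip()]


def elseif_branches(st, if_index):
    """Row indexes of the ElseIf branches (those with a condition)."""
    children = split_indexes(st[if_index][AP_STI_LEFT])
    return [i for i in children if st[i][AP_STI_VALUE] != ""]


def else_branch(st, if_index):
    """Row index of the Else branch, or None if the statement has none."""
    for i in split_indexes(st[if_index][AP_STI_LEFT]):
        if st[i][AP_STI_VALUE] == "":
            return i
    return None


# Hand-built rows for: If $a ... ElseIf $b ... Else ... EndIf (bodies omitted)
st = [
    [5, "", "", ""],
    ["AP_BR_IF", 2, "3,5", ""],
    ["AP_BR_VARIABLE", "$a", "", ""],
    ["AP_BR_IF", 4, "", ""],
    ["AP_BR_VARIABLE", "$b", "", ""],
    ["AP_BR_IF", "", "", ""],
]
```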
A While loop or Do ... Until loop. The left child is the condition, and the right side a comma separated list of statements in the body.
A for loop. The VALUE column is the index of the loop variable leaf. The left side is a comma separated list that stores the range to be iterated over. The format of this is Start,End,StepOp,Step. Start and End are indexes of expression branches, the StepOp is stored literally as an operator symbol (one of +, -, * or /). Step is the index of an expression branch. The right child is a comma separated list of statements in the body.
For example, the following loop:
For $i = 1 To 8 step *2 ... Next
Would be stored as the following:
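A sketch of that layout in Python, using hypothetical row indexes (the real indexes depend on where the parser places each branch). The column positions and the "for-loop" branch type string are placeholders, since the actual constants are not given here.

```python
# Hypothetical column positions.
AP_STI_BRTYPE, AP_STI_VALUE, AP_STI_LEFT, AP_STI_RIGHT = 0, 1, 2, 3

# Rows for: For $i = 1 To 8 step *2 ... Next (body omitted)
st = [
    [5, "", "", ""],
    ["for-loop", 2, "3,4,*,5", ""],     # VALUE -> loop variable, LEFT -> range
    ["AP_BR_VARIABLE", "$i", "", ""],
    ["AP_BR_NUMBER", "1", "", ""],      # Start
    ["AP_BR_NUMBER", "8", "", ""],      # End
    ["AP_BR_NUMBER", "2", "", ""],      # Step
]


def parse_range(cell):
    """Split "Start,End,StepOp,Step" into two indexes, a symbol and an index."""
    start, end, step_op, step = cell.split(",")
    return int(start), int(end), step_op, int(step)


def describe_for(st, index):
    """Resolve the loop variable and range indexes to their leaf values."""
    row = st[index]
    start, end, step_op, step = parse_range(row[AP_STI_LEFT])
    return {
        "var": st[row[AP_STI_VALUE]][AP_STI_VALUE],
        "start": st[start][AP_STI_VALUE],
        "end": st[end][AP_STI_VALUE],
        "step_op": step_op,  # stored literally, not as a row index
        "step": st[step][AP_STI_VALUE],
    }
```

In this sketch the range cell reads 3,4,*,5: rows 3, 4 and 5 hold the Start, End and Step expressions, while * is kept as a literal symbol.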
A For ... In ... Next loop. The VALUE column is the loop variable, left child the range expression, and right is a comma separated list of statements in the body.
A select statement. The right hand side is a comma separated list of AP_BR_CASE branches.
A switch statement. The left side is the expression to switch, the right hand side is a comma separated list of AP_BR_CASE branches.
A case statement. The left side is the condition expression, and the right side a comma separated list of statements in the body. In Switch statements the left side can also be a comma separated list of multiple expressions to match.
Case Else is stored by setting the condition to "" and the VALUE column to "Else". It's probably better to use the former for determining the default case, rather than the latter.
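A short Python sketch of locating the default case by its empty condition, as recommended above. The column positions, the "switch" branch type placeholder, and the helper names are assumptions for illustration. The list of cases is assumed to be a string of row indexes.

```python
# Hypothetical column positions.
AP_STI_BRTYPE, AP_STI_VALUE, AP_STI_LEFT, AP_STI_RIGHT = 0, 1, 2, 3


def case_indexes(st, index):
    """Row indexes of the AP_BR_CASE branches listed in the right column."""
    cell = str(st[index][AP_STI_RIGHT])
    return [int(part) for part in cell.split(",") if part.strip()]


def default_case(st, index):
    """Row index of Case Else (empty condition), or None if absent."""
    for i in case_indexes(st, index):
        if st[i][AP_STI_LEFT] == "":
            return i
    return None


# Hand-built rows for: Switch $x / Case 1 / Case Else / EndSwitch (bodies omitted)
st = [
    [5, "", "", ""],
    ["switch", "", 2, "3,5"],
    ["AP_BR_VARIABLE", "$x", "", ""],
    ["AP_BR_CASE", "", "4", ""],
    ["AP_BR_NUMBER", "1", "", ""],
    ["AP_BR_CASE", "Else", "", ""],
]
```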
A redim statement. The left side is an array lookup expression.
An array lookup statement. The left side is the array expression (normally a variable, but it doesn't have to be), and the right side is a comma separated list of indexes.
A function call. The left column is the expression for the function, the right column the comma separated list of arguments.
Same as AP_BR_OPERATOR except that the right hand side is a comma separated pair of expressions, the first if the condition evaluates to true, the second if false.
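A minimal Python sketch of unpacking such a branch. It assumes that the condition sits in the left column (inferred from "same as AP_BR_OPERATOR", not stated explicitly), that the pair is stored as a string of two row indexes, and it uses hypothetical column positions and a "ternary" placeholder for the branch type.

```python
# Hypothetical column positions.
AP_STI_BRTYPE, AP_STI_VALUE, AP_STI_LEFT, AP_STI_RIGHT = 0, 1, 2, 3


def ternary_parts(st, index):
    """Return (condition, if_true, if_false) as row indexes."""
    row = st[index]
    if_true, if_false = (int(part) for part in str(row[AP_STI_RIGHT]).split(","))
    return row[AP_STI_LEFT], if_true, if_false


# Hand-built rows for: $ok ? "yes" : "no"
st = [
    [4, "", "", ""],
    ["ternary", "?", 2, "3,4"],
    ["AP_BR_VARIABLE", "$ok", "", ""],
    ["AP_BR_STR", "yes", "", ""],
    ["AP_BR_STR", "no", "", ""],
]
```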