This is a toy language inspired by the C ternary operator.
It originated from the thought that it would be nice to have an equivalent of the C ternary operator for the switch statement. This then expanded to: why not make everything an operator and eliminate keywords?
From this I arrived at the following implementation guidelines:
- Operators not keywords
- Binary operators only
- Only use Standard C
- Use it as a sandbox for other ideas
Windows 7
jEdit Editor
CodeBlocks (to build and debug)
Git Source Control Management
FreeCommander File Manager
ConEmu Console Terminal
Implemented as an executable syntax tree interpreter.
UTF-8 encoding.
Comments are started with #
If followed by one of (, [, {, the comment is terminated by the corresponding ), ], }. The active brackets can be nested.
Otherwise comments are terminated at the end of line.
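The comment rules above can be sketched as a small scanner. This is a Python illustration, not the interpreter's own code: `skip_comment` is a hypothetical name, and it assumes that only the bracket type that opened the comment takes part in nesting, and that an unterminated bracketed comment runs to the end of input.

```python
BRACKETS = {"(": ")", "[": "]", "{": "}"}

def skip_comment(text, pos):
    """Return the index just past the comment that starts at text[pos] == '#'."""
    assert text[pos] == "#"
    pos += 1
    if pos < len(text) and text[pos] in BRACKETS:
        opener = text[pos]
        closer = BRACKETS[opener]
        depth = 0
        while pos < len(text):
            ch = text[pos]
            if ch == opener:
                depth += 1          # nested bracket of the active type
            elif ch == closer:
                depth -= 1
                if depth == 0:
                    return pos + 1  # just past the matching closer
            pos += 1
        return pos                  # assumption: unterminated comment ends at end of input
    # Line comment: stop at (not past) the end-of-line character.
    while pos < len(text) and text[pos] not in "\r\n":
        pos += 1
    return pos
```

For example, `skip_comment("#( a (b) c ) x", 0)` stops just after the outer `)`, leaving ` x` to be scanned.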
[0-9]+([Ee][0-9]+)?
(0x|0X)[0-9]+([Pp][0-9]+)?
[0-9]+\.[0-9]*([Ee][-+]?[0-9]+)?
(0x|0X)[0-9A-Fa-f]+\.[0-9A-Fa-f]*([Pp][-+]?[0-9]+)?
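As a quick way to try the four numeric patterns, here is a Python sketch that applies them verbatim with `re.fullmatch`. The form names ("decimal integer" and so on) and the function `classify_number` are labels invented for this illustration.

```python
import re

# Patterns copied verbatim from the four forms listed above.
NUMBER_FORMS = [
    ("decimal integer", r"[0-9]+([Ee][0-9]+)?"),
    ("hexadecimal integer", r"(0x|0X)[0-9]+([Pp][0-9]+)?"),
    ("decimal float", r"[0-9]+\.[0-9]*([Ee][-+]?[0-9]+)?"),
    ("hexadecimal float", r"(0x|0X)[0-9A-Fa-f]+\.[0-9A-Fa-f]*([Pp][-+]?[0-9]+)?"),
]

def classify_number(lexeme):
    """Return the name of the first form that matches the whole lexeme, or None."""
    for name, pattern in NUMBER_FORMS:
        if re.fullmatch(pattern, lexeme):
            return name
    return None
```

With the patterns exactly as written, `3.` counts as a decimal float, while `.5` matches nothing because every form requires a leading digit. Likewise the second pattern accepts only the digits 0-9 after the prefix, so `0x10` matches it but `0x1F` does not.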
Delimited by ".
Can contain escape sequences (see Characters section)
Delimited by '.
Can contain an escape sequence.
Initiated by \, followed by:
- 0 inserts a nul character.
- n inserts a new-line character.
- t inserts a horizontal-tab character.
- U or u followed by up to 8 hexadecimal digits specifying a Unicode code-point.
- W or w followed by up to 4 hexadecimal digits specifying a Unicode code-point.
- X or x followed by up to 2 hexadecimal digits specifying a Unicode code-point.
- End-of-Line characters; these are elided, including CR-LF and LF-CR pairings.
- for other characters, acts as a quoting mechanism.
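The escape rules can be modelled roughly as follows. This Python sketch is an illustration only: `decode_escapes` is a hypothetical name, and the handling of a hexadecimal escape with no digits (treated here as a quoted character) and of a trailing lone backslash (kept as-is) are assumptions, since the text does not cover them.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"
MAX_HEX = {"u": 8, "w": 4, "x": 2}          # maximum digit counts from the list above
SIMPLE = {"0": "\0", "n": "\n", "t": "\t"}

def decode_escapes(body):
    """Decode the escape sequences in the body of a string or character literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        i += 1
        if ch != "\\" or i >= len(body):
            out.append(ch)                   # ordinary character (or trailing backslash)
            continue
        esc = body[i]
        i += 1
        if esc in SIMPLE:
            out.append(SIMPLE[esc])
        elif esc.lower() in MAX_HEX:
            start = i
            limit = MAX_HEX[esc.lower()]
            while i < len(body) and i - start < limit and body[i] in HEX_DIGITS:
                i += 1
            if i == start:
                out.append(esc)              # assumption: no digits, so just quote the letter
            else:
                # chr() raises ValueError above U+10FFFF; left unhandled in this sketch.
                out.append(chr(int(body[start:i], 16)))
        elif esc in "\r\n":
            # End-of-line is elided, together with the other half of a CR-LF or LF-CR pair.
            if i < len(body) and body[i] in "\r\n" and body[i] != esc:
                i += 1
        else:
            out.append(esc)                  # quoting mechanism for any other character
    return "".join(out)
```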
Initiated by one of (, [, {, terminated by the corresponding ), ], }.
( ) are elided, replaced in the syntax tree by the bracketed sub-expression.
[ ] and { } are represented in the syntax tree by distinct operators.
[ ] is used to define arrays/environments.
{ } is used to designate an evaluation block.
Where the bracketed expression is an operand-less operator sans space, then this forms a distinct operator.
An array can be indexed, associative, or a mix of both. An array can also act as an environment (aka name-space or scope).
Indexes are zero based. Assigning to the last + 1 index appends a new entry.
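To make the indexing rule concrete, here is a rough Python model of a mixed indexed/associative array. `MixedArray` is a hypothetical class for illustration; in particular, raising an error for an index beyond last + 1 is an assumption, as the text does not say what happens in that case.

```python
class MixedArray:
    """Toy model: zero-based indexed entries plus named (associative) entries."""

    def __init__(self):
        self.items = []   # indexed entries
        self.named = {}   # associative entries

    def set(self, key, value):
        if isinstance(key, int):
            if key == len(self.items):
                self.items.append(value)      # last + 1 appends a new entry
            elif 0 <= key < len(self.items):
                self.items[key] = value       # overwrite an existing entry
            else:
                raise IndexError(key)         # assumption: gaps are not allowed
        else:
            self.named[key] = value

    def get(self, key):
        return self.items[key] if isinstance(key, int) else self.named[key]
```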
The following environments are predefined:
- local, which is the default scope within a function. It can be specifically invoked using the (:) operator. There are no predefined identifiers in the local environment.
- static, which is the default scope within a source file. It can be specifically invoked using the {:} operator. There are no predefined identifiers in the static environment.
- global, which is available to all. It can be specifically invoked using the [:] operator. Unless oboe is invoked with the --math option, there are no predefined identifiers in the global environment.
- system, which can be accessed via the sigil operator.
When an environment is applied to an expression or expression-list, it is automatically linked to the current environment.
An anonymous environment can be utilized to limit the scope of variables.
Used to demarcate a block of code. Primarily this is used with conditional expressions, to isolate a block of code and avoid unwanted interaction with the ; operator, which designates alternate program flow paths.
- applicate, has no lexical representation, but is invoked by adjacency.
- , sequence, creates a list of expressions.
- ; assemblage, creates a list of sequences/expressions.
See lex.h for permitted lexeme characters.
Where an operand-less operator is bracketed sans space, then this forms a distinct operator.
User-defined operators can be named by prefixing an identifier with '`' and can also be terminated with another back-tick.
All operators are inherently binary; when used as a unary operator, the operator is still parsed at the same precedence level; therefore, when an operator is used as a unary operator in a sub-expression, the sub-expression should be parenthesized.
See lex.h for permitted lexeme characters.
Expressions are evaluated left to right.
Precedence levels, in decreasing order, are:
- Primary (Values, Identifiers, Sub-expressions)
- Applicate
- Binding
- Exponential
- Multiplicative
- Additive
- Bitwise
- Relational
- Logical
- Conditional
- Assigning
- Declarative
- Interstitial
- Sequence
- Assemblage
Although the goal is for only binary operators, the simplicity of the implementation of parsing gives us unary operators for free - it would require more code to enforce binary only. However, in the syntax tree all operators are binary, unary operations being represented by having a non-value operand (internally this is the Zen type - Zero/Empty/Null). The empty parenthesis () operator can be used to specify Zen explicitly.
The more detailed grammar (e.g. declaration, selection, iteration) is handled at runtime; but is built from binary operators.
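The idea that unary use falls out of a binary-only parser can be illustrated with a small precedence-climbing sketch in Python. It is not the interpreter's parser: the operator subset, the numeric precedence values, and the names `tokenize`, `parse`, and `parse_source` are assumptions made for the example, and `None` stands in for Zen.

```python
import re

ZEN = None  # stand-in for the Zen (Zero/Empty/Null) operand

# Assumed subset of operators; a larger number binds tighter.
PRECEDENCE = {"*": 3, "/": 3, "//": 3, "+": 2, "-": 2, "<": 1, ">": 1, "==": 1}

TOKEN = re.compile(r"\s*(//|==|[-+*/<>()]|[A-Za-z_]\w*|[0-9]+)")

def tokenize(source):
    source = source.rstrip()
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError("unexpected character near offset %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse_primary(tokens):
    tok = tokens.pop(0)
    if tok == "(":
        inner = parse(tokens, 0)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("missing )")
        return inner                 # ( ) is elided from the tree
    return tok                       # value or identifier

def parse(tokens, min_prec=0):
    """Return nested (operator, left, right) tuples; every node is binary."""
    if not tokens or tokens[0] == ")" or tokens[0] in PRECEDENCE:
        left = ZEN                   # missing operand: the slot holds Zen
    else:
        left = parse_primary(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        right = parse(tokens, PRECEDENCE[op] + 1)   # left-associative
        left = (op, left, right)
    return left

def parse_source(source):
    tokens = tokenize(source)
    tree = parse(tokens)
    if tokens:
        raise SyntaxError("unexpected " + tokens[0])
    return tree
```

Here `parse_source("-a * b")` gives `("-", None, ("*", "a", "b"))`: the `-` node is still binary, with Zen on the left. In this sketch `a * -b` comes out as `(a * Zen) - b`, whereas `a * (-b)` gives the intended grouping, which is one way to see why a unary sub-expression should be parenthesized.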
left-operand ; right-operand
An assemblage may be evaluated differently when used as an operand, but is otherwise evaluated thus:
left-operand is evaluated, then right-operand is evaluated, the result of evaluating the right-operand is returned.
left-operand , right-operand
A sequence may be evaluated differently when used as an operand, but is otherwise evaluated thus:
left-operand is evaluated, then right-operand is evaluated, and a new sequence of the results is created. Individual operators (e.g. conditional, iteration, or selection) may handle sequences differently in certain instances.
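The default evaluation of the two list operators can be sketched in Python as below. Operands are modelled as zero-argument callables so that evaluation order is visible. The function names are hypothetical, and flattening a left-nested chain such as `a , b , c` into one list is an assumption about what "a new sequence of the results" means.

```python
def eval_assemblage(left, right):
    """left ; right : evaluate both in order, return the right-hand result."""
    left()
    return right()

def eval_sequence(left, right):
    """left , right : evaluate both in order, return a new sequence of the results."""
    first = left()
    second = right()
    # Assumption: a left operand that is already a sequence is extended, not nested.
    head = first if isinstance(first, list) else [first]
    return head + [second]
```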
operand .. operand
either:
referent : operand
or:
referent :: operand
or:
referent :^ reference
or:
referent ( parameter? (, parameter)* ) : operand
or:
[precedence-operator-string]? operator-string ( parameter? (, parameter)* ) : operand
or:
operator-string : operator-string
Normally, non-operator declarations are made in the static environment (source-file, function, ...); if the global environment operator [:] is applied to the declaration then it is made in the global environment. Operator declarations are always made in the global environment.
Declarations within a non-static scope (e.g. within a function) can be made static by applying the static environment operator (:). They will be visible to all functions that share the same static scope, e.g. within the same source file.
either:
reference = operand
or:
reference =^ reference
either:
condition ? true-operand
condition is evaluated, and if the result, when cast to a boolean value, evaluates to true, then true-operand is evaluated.
or:
condition ? (true-operand ; false-operand)
condition is evaluated, and if the result, when cast to a boolean value, evaluates to true, then true-operand is evaluated, otherwise false-operand is evaluated.
Sequences in condition are evaluated as a simple list of expressions, each evaluated in turn; the result of evaluating the final expression in the sequence determines the condition.
operand ? Zen
Zen ? operand
evaluates operand and returns its boolean value.
The ! operator is as above, except the condition is inverted.
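The conditional forms above can be sketched as a tiny tree evaluator in Python. Nodes are `(operator, left, right)` tuples and `None` stands in for Zen. The names are hypothetical, and two points are assumptions not stated in the text: a false condition with no false-operand yields Zen, and the Zen forms of `!` return the inverted boolean.

```python
ZEN = None  # stand-in for Zen

def evaluate(node, env):
    if not isinstance(node, tuple):
        # Identifiers are looked up in env; anything else is a literal.
        return env.get(node, node) if isinstance(node, str) else node
    op, left, right = node
    if op == ";":                      # default assemblage evaluation
        evaluate(left, env)
        return evaluate(right, env)
    if op in ("?", "!"):
        return conditional(left, right, env, invert=(op == "!"))
    raise ValueError("unsupported operator: " + op)

def conditional(cond, branches, env, invert):
    if cond is ZEN or branches is ZEN:
        # operand ? Zen  and  Zen ? operand : return the boolean value.
        operand = branches if cond is ZEN else cond
        return bool(evaluate(operand, env)) != invert
    truth = bool(evaluate(cond, env)) != invert
    if isinstance(branches, tuple) and branches[0] == ";":
        # Here ; separates the alternate paths instead of sequencing them.
        _, on_true, on_false = branches
        return evaluate(on_true if truth else on_false, env)
    return evaluate(branches, env) if truth else ZEN   # assumption: Zen when false
```

For instance, `evaluate(("?", "x", (";", "yes", "no")), {"x": 0})` takes the false path and returns `"no"`.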
either:
condition ?: ( (case-expression : action-expression ;)+ default-action-expression?)
or:
Zen ?: ( (case-expression : action-expression ;)+ default-action-expression?)
Sequences in condition are evaluated as a simple list of expressions, each evaluated in turn; the result of evaluating the final expression in the sequence determines the condition.
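A rough Python model of the first selection form follows. It covers only `condition ?: (...)`, not the Zen form, and it rests on assumptions: cases are matched by equality, the `;` chain is left-nested, and an entry that is not a `case : action` pair is taken as the default. `flatten` and `select` are hypothetical names, and values are plain literals rather than evaluated expressions.

```python
ZEN = None  # stand-in for Zen

def flatten(node):
    """Unroll a nested chain of ';' nodes into a flat list of entries."""
    if isinstance(node, tuple) and node[0] == ";":
        return flatten(node[1]) + flatten(node[2])
    return [node]

def select(value, cases):
    """Return the action of the first matching case, else the default, else Zen."""
    for entry in flatten(cases):
        if isinstance(entry, tuple) and entry[0] == ":":
            _, case, action = entry
            if case == value:          # assumption: cases match by equality
                return action
        else:
            return entry               # default action (Zen if it was omitted)
    return ZEN
```

A tree such as `(";", (";", (":", 1, "one"), (":", 2, "two")), "other")` corresponds to `1 : "one" ; 2 : "two" ; "other"` under these assumptions.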
either:
iteration-control ?* iteration-operand
or:
iteration-control ?* ( iteration-expression ; no-iteration-expression )
no-iteration-expression is evaluated if the controlling condition never evaluates true
or:
iteration-control ?* Zen
where iteration-control is either:
condition
or:
( initialization ; condition )
or:
( initialization ; condition ; recalculation )
or:
( identifier : range [ && condition] )
or:
( identifier = range [ && condition] )
or:
( identifier : sequence [ && condition] )
or:
( identifier = sequence [ && condition] )
or:
( identifier : array[range] [ && condition] )
or:
( identifier = array[range] [ && condition] )
or:
( identifier : [initializer] [ && condition] )
or:
( identifier = [initializer] [ && condition] )
The !* operator is as above, except the condition is inverted; does not apply to ranges.
Sequences in initialization, condition, and recalculation are evaluated as a simple list of expressions, each evaluated in turn; in the case of condition, the result of evaluating the final expression in the sequence determines the condition.
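The condition-based forms of iteration-control can be sketched in Python as a loop with an optional alternate path. This is an illustration with hypothetical names; the range, sequence, and array forms are not modelled, and returning the result of the last iteration is an assumption.

```python
def iterate(condition, body, initialization=None, recalculation=None,
            no_iteration=None, invert=False):
    """Model of ( initialization ; condition ; recalculation ) ?* body.

    All arguments are zero-argument callables. invert=True models the !* operator.
    """
    if initialization is not None:
        initialization()
    result = None
    ran = False
    while bool(condition()) != invert:
        ran = True
        result = body()                  # assumption: the last body result is returned
        if recalculation is not None:
            recalculation()
    if not ran and no_iteration is not None:
        result = no_iteration()          # the condition never evaluated true
    return result
```

Usage: with a counter held in a dict `s`, `iterate(lambda: s["i"] < 3, body, recalculation=step)` runs `body` three times when `step` increments `s["i"]` from 0.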
left-operand operator right-operand
The following operators are built-in:
&& logical AND
|| logical OR
< less than
<= less than or equal
== equal
<> not equal
>= greater than or equal
> greater than
& bitwise AND
| bitwise OR
~ bitwise XOR
+ add
- subtract
* multiply
/ divide
// modulo
<< shift left
>> shift right
<<< extract left
>>> extract right
<<> rotate left
<>> rotate right
The non-comparative operators also have a self-assigning form: e.g.
reference += operand
either:
left-operand right-operand
or:
operator right-operand
operand @ identifier
Used to access the attributes and functions of operand (e.g. type query, type conversion).
When Zen is the operand, it provides access to the system library.