Skip to content

Context-free grammar is directly rooted in systems language as a Control Flow Graph. And the semantics of traversal are specified as a Control Flow Graph too.

Notifications You must be signed in to change notification settings

Antares007/tword

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

26 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

In the beginning was the Step

Ambiguous Step Graph

Context-sensitive parsing without a lexer hack

The Problem

The same word if is a keyword in one position and an identifier in another. That role exists only in context, and context is only known mid-walk. The decision must be made during traversal, and when it turns out wrong, the traversal must swap on the spot to retry with a different role assignment. A static traversal committed upfront has no recovery path.

The Idea

Turn the grammar into a computation that is pausable, incremental, and traversal-pluggable at every step. Represent your grammar as an Ambiguous Step Graph: a bipartite metagraph where pure void tail-recursive functions receive their entire instruction set from the traversal as parameters.

S -> "b" | S "a"

void S(ο o, τε op0, τγγ op1, ταγ op2, τγγ op3) {
  op1.step(o, (γ){unit18}, (γ){S_1});
}

Each step declares its role to the traversal by selecting an opcode and presenting exactly two continuations: what I am, what I hold, and what comes after me. The traversal listens and decides what to do with what it has been offered. The grammar step does not know who traverses it; the traversal does not know the grammar's intent. This mutual delegation is what makes them composable.

The opcodes presented by traversal to grammar step

opcode trversal role meaning a continuation b continuation
op0 descend end of rules
op0 walk end of rule
op1 descend choice point alternative branch next step in current rule
op1 walk end of rule alternative branch next step in current rule
op2 descend/walk action node action to execute next grammar step
op3 descend/walk reference entry of another rule next step in current rule

The same grammar steps (S0, S0_1, ..., a, b) can then be walked by entirely different traversal strategies — backtracking, exhaustive forward, type-check, interpret — all without touching the grammar.

I think the clean distinctions you are trying to draw between languages and layers of compilation don't actually exist in practice and are, in truth, a convenient lie that we tell ourselves in compiler courses and books. - Gerald Jinx

The Machine is the universal boundary. It separates what the Machine is from what interacts with it. Inside the boundary is only the mechanism of transition; outside the boundary is the specification of behavior. This boundary is what makes composition, substitution, and independence possible.

Every jmp, no ret: eliminating the return address attack surface by construction

The ROP attack vulnerability surface is the implicit "what happens next." Typed tail-recursive steps make "what happens next" explicit in the type, so there's nothing left to corrupt.

When we force our thinking to solve the problem with only a typed, void, tail-recursive function, essentially with typed steps, we will get a circuit-like, organically grown control flow graph.

To preserve natural structure (modularity, composability, scalability, and encapsulation), pass only flat values and typed step pointers.

A returnless cyclic circuit model in C

This program implements a complete computational substrate using only forward-passing continuations.

Execution emerges as a cyclic continuation-passing between structurally distinct roles.

ξ = state locus, accepts(observation operator)
ρ = observation operator, accepts(left mutation operator, state, and right mutation operator)
ω = mutation operator, accepts(state prime and transition operator)
δ = topology transition operator, accepts(state locus)

ξ → ρ → ω → δ → ξ

The observer ρ receives a state and an instruction set containing two options: left ω and right ω.

Essentially, the observer ρ receives the state and possible continuations offered by the state locus ξ.

So, state locus ξ is a typed ambiguous step, i.e., it carries forward more than one possible continuation.

Ambiguity can be composed, for example, between two abstract ambiguous steps, Yin and Yang:

Yin defines admissible continuations and selects one defined by Yang. Yang's chosen continuation defines its admissible continuations and chooses one defined by Yin.

This is what "no attack surface" looks like in assembly:

img

Security

Type is security; we don't need to check the whole control flow graph/circuit every time, only the type of machine we want to interact with.

In practice, it means we can have compiled typed machines separately. We can enforce a structural invariant for each cell to lay out in binary like a structured type, i.e., ωξω,ωξω,ωξω,ωξω,...ωξω. That is a crystallized/hashable form of the 10-cell circular tape machine. We can index into it as a universal interface.

By structurally aligning and composing little machines, we can have a binary hash of richer operational semantics, preserving universal interface.

Compilers need to align with the new reality; To produce typed machine binaries with a guaranteed universal interface.

How will produced typed machines change the compilation pipeline and growth of operational semantics, that is, the most engaging road to explore.

Status

These ideas are under active development. The roles of grammar steps, actions, and traversals are being shaped in real battle.

About

Context-free grammar is directly rooted in systems language as a Control Flow Graph. And the semantics of traversal are specified as a Control Flow Graph too.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages