This document provides an in-depth look at the Dim compiler's architecture and implementation details.
The Dim compiler is a multi-phase compiler written in Python 3.6.8+. It follows a traditional compiler pipeline:
Source (.dim)
↓ Lexer (dim_lexer.py)
↓ Parser (dim_parser.py)
↓ AST
↓ Semantic Analysis (dim_semantic.py)
↓ Type Checker (dim_type_checker.py)
↓ MIR (dim_mir_lowering.py)
↓ Borrow Checker (dim_borrow_checker.py)
↓ LLVM IR (dim_mir_to_llvm.py)
File: dim_lexer.py
The lexer converts source code into a stream of tokens. Key features:
- Indentation tracking: Injects INDENT/DEDENT tokens for Python-style blocks
- Keyword recognition: Distinguishes keywords from identifiers
- Span tracking: Every token includes source location information
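The indentation tracking described above can be sketched with a stack of indentation widths. This is a minimal illustration and not the code in dim_lexer.py: the `Token` shape, the `LINE` token kind (a stand-in for real per-word tokens), and the `indent_tokens` name are all assumptions made for the example.

```python
from typing import Iterator, List, NamedTuple


class Token(NamedTuple):
    kind: str   # "INDENT", "DEDENT", or "LINE" (hypothetical stand-in for real tokens)
    text: str
    line: int   # 1-based source line, a minimal stand-in for a full span


def indent_tokens(source: str) -> Iterator[Token]:
    """Yield one LINE token per non-blank line, wrapped in INDENT/DEDENT tokens."""
    stack: List[int] = [0]  # indentation widths of the currently open blocks
    last_line = 0
    for lineno, raw in enumerate(source.splitlines(), start=1):
        if not raw.strip():
            continue  # blank lines never change block structure
        last_line = lineno
        width = len(raw) - len(raw.lstrip(" "))
        if width > stack[-1]:
            # Deeper than the current block: open a new one.
            stack.append(width)
            yield Token("INDENT", "", lineno)
        while width < stack[-1]:
            # Shallower: close blocks until the widths line up again.
            stack.pop()
            yield Token("DEDENT", "", lineno)
        if width != stack[-1]:
            raise SyntaxError(f"line {lineno}: inconsistent dedent")
        yield Token("LINE", raw.strip(), lineno)
    # Close any blocks still open at end of input.
    while len(stack) > 1:
        stack.pop()
        yield Token("DEDENT", "", last_line)
```

Keeping the widths on a stack means a single line can close several nested blocks at once, and a dedent to a width that was never opened is reported as an error instead of being silently accepted.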
Key token types in dim_token.py:
- IDENTIFIER - Variable/function names
- KEYWORD - Language keywords (fn, let, if, etc.)
- INTEGER, FLOAT, STRING - Literals
- INDENT, DEDENT - Block structure
- AT - Decorators (@)
File: dim_parser.py
The parser uses recursive descent, with the Pratt parser pattern handling operator precedence in expressions. Expression parsing descends through the following call chain:
_parse_expression() → _parse_binary() → _parse_unary() → _parse_postfix() → _parse_primary()
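To show how binding powers decide precedence, here is a small generic Pratt-style loop. It is a sketch under stated assumptions, not the code in dim_parser.py: the binding powers, the `tokenize` helper, the tuple-based expression shape, and the `PrattParser` class are all hypothetical, and the real parser splits the work across the functions named above.

```python
import re
from typing import List, Optional, Tuple, Union

# int literal, identifier, ("neg", operand), or (op, lhs, rhs)
Expr = Union[int, str, Tuple]

# Hypothetical binding powers; Dim's actual precedence table may differ.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}
UNARY_POWER = 30


def tokenize(text: str) -> List[str]:
    """Split text into integer, identifier, operator, and parenthesis tokens."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", text)


class PrattParser:
    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self) -> str:
        if self.pos >= len(self.tokens):
            raise SyntaxError("unexpected end of input")
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def parse_expression(self, min_power: int = 0) -> Expr:
        # Prefix position: unary minus, parenthesised group, literal, or name.
        tok = self.advance()
        if tok == "-":
            left: Expr = ("neg", self.parse_expression(UNARY_POWER))
        elif tok == "(":
            left = self.parse_expression(0)
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
        elif tok.isdigit():
            left = int(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            left = tok
        else:
            raise SyntaxError(f"unexpected token {tok!r}")
        # Infix position: keep consuming operators that bind tighter than
        # the caller's minimum; "<=" makes binary operators left-associative.
        while True:
            op = self.peek()
            power = BINDING_POWER.get(op)
            if power is None or power <= min_power:
                return left
            self.advance()
            right = self.parse_expression(power)
            left = (op, left, right)
```

With these assumed powers, `PrattParser(tokenize("1 + 2 * 3")).parse_expression()` returns `("+", 1, ("*", 2, 3))`, because `*` binds tighter than `+`.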
Top-level statements are parsed based on keywords:
- fn / async → _parse_function()
- @ → _parse_decorated_function()
- let → _parse_let()
- if → _parse_if()
- etc.
Defined in dim_ast.py:
- FunctionDef - Function definitions
- LetStmt - Variable bindings
- ReturnStmt - Return statements
- IfStmt - Conditional statements
- BinaryOp, UnaryOp - Operations
- Call - Function calls
- BorrowExpr - Borrow expressions
- etc.
File: dim_types.py
Type (base)
├── PrimType (i32, f64, bool, string, unit)
├── RefType (&T, &mut T)
├── TensorType (Tensor[T, shape])
├── PromptType (prompt definitions)
├── FnType (function types)
├── GenericType (generic parameters)
└── TypeVar (for inference)
The type checker uses unification for type inference:
- Match concrete types directly
- Instantiate type variables with concrete types
- Propagate constraints through expressions
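The three unification steps above can be sketched as follows. This is a minimal illustration and not the code in dim_types.py: the `Prim`, `Ref`, `Fn`, and `Var` classes are simplified, hypothetical stand-ins for PrimType, RefType, FnType, and TypeVar, and the substitution is a plain dictionary. The sketch uses `dataclasses`, which needs Python 3.7 or later.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Prim:
    name: str


@dataclass(frozen=True)
class Ref:
    inner: object
    mutable: bool = False


@dataclass(frozen=True)
class Fn:
    params: Tuple[object, ...]
    ret: object


@dataclass(frozen=True)
class Var:
    id: int


Subst = Dict[int, object]  # type variable id -> type it was bound to


def resolve(t: object, subst: Subst) -> object:
    """Follow variable bindings until reaching an unbound variable or a concrete type."""
    while isinstance(t, Var) and t.id in subst:
        t = subst[t.id]
    return t


def occurs(var: Var, t: object, subst: Subst) -> bool:
    """True if var appears inside t; binding it would create an infinite type."""
    t = resolve(t, subst)
    if t == var:
        return True
    if isinstance(t, Ref):
        return occurs(var, t.inner, subst)
    if isinstance(t, Fn):
        return any(occurs(var, p, subst) for p in t.params) or occurs(var, t.ret, subst)
    return False


def unify(a: object, b: object, subst: Subst) -> None:
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return  # identical concrete types (or the same variable) match directly
    if isinstance(a, Var):
        if occurs(a, b, subst):
            raise TypeError("infinite type")
        subst[a.id] = b  # instantiate the type variable
        return
    if isinstance(b, Var):
        unify(b, a, subst)
        return
    # Propagate constraints into the components of compound types.
    if isinstance(a, Ref) and isinstance(b, Ref) and a.mutable == b.mutable:
        unify(a.inner, b.inner, subst)
        return
    if isinstance(a, Fn) and isinstance(b, Fn) and len(a.params) == len(b.params):
        for pa, pb in zip(a.params, b.params):
            unify(pa, pb, subst)
        unify(a.ret, b.ret, subst)
        return
    raise TypeError(f"E0030: type mismatch: {a} vs {b}")
```

Unifying `Fn((Var(0),), Var(1))` with `Fn((Prim("i32"),), Ref(Var(0)))` binds variable 0 to `Prim("i32")` and variable 1 to `Ref(Var(0))`; `resolve` follows only the top-level binding, so nested variables are looked up on demand.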
File: dim_type_checker.py
Implements Hindley-Milner type inference with extensions for Dim-specific features.
- Literals: Infer from value (42 → i32, 3.14 → f32)
- Variables: Look up declared type
- Let bindings: Infer from expression, constrain to annotation
- Functions: Infer return type from body
- Borrows: Create RefType wrapper
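A few of these rules can be illustrated on a toy expression form. The sketch below is hypothetical and much simpler than dim_type_checker.py: expressions are plain tuples, types are strings, and function return inference is left out. The `infer` and `check_let` names are assumptions made for the example.

```python
from typing import Dict, Optional, Tuple

# ("int", 42) | ("float", 3.14) | ("var", name) | ("borrow", mutable, expr)
Expr = Tuple


def infer(expr: Expr, env: Dict[str, str]) -> str:
    """Return the type of expr as a string, using env for declared variables."""
    kind = expr[0]
    if kind == "int":
        return "i32"   # literal rule: 42 -> i32
    if kind == "float":
        return "f32"   # literal rule: 3.14 -> f32
    if kind == "var":
        name = expr[1]
        if name not in env:
            raise NameError(f"E0020: undefined variable '{name}'")
        return env[name]
    if kind == "borrow":
        _, mutable, inner = expr
        # Borrow rule: wrap the inner type in a reference type.
        return ("&mut " if mutable else "&") + infer(inner, env)
    raise ValueError(f"unsupported expression kind: {kind}")


def check_let(name: str, annotation: Optional[str], value: Expr,
              env: Dict[str, str]) -> str:
    """Infer the bound expression, then constrain it to the annotation if present."""
    inferred = infer(value, env)
    if annotation is not None and annotation != inferred:
        raise TypeError(f"E0030: type mismatch: expected {annotation}, got {inferred}")
    env[name] = inferred
    return inferred
```

In a full Hindley-Milner checker the equality test in `check_let` would be a unification call, so that an annotation could also pin down a type variable that is still unresolved.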
- E0020: Undefined variable
- E0021: Undefined function
- E0030: Type mismatch
- E0040: Use of moved value
- E0041: Double mutable borrow
File: dim_mir_lowering.py
Converts AST to Mid-Level IR (MIR) - a Control Flow Graph (CFG) representation.
MIRModule
└── MIRFunction
    ├── params: List[Local]
    ├── locals: List[Local]
    ├── return_type: Type
    └── blocks: List[BasicBlock]
        ├── stmts: List[MIRStatement]
        └── terminator: Terminator
MIR statements:
- StorageLive(local) - Mark local as live
- StorageDead(local) - Mark local as dead
- Assign(place, rvalue) - Assignment
- Borrow(dest, place) - Create borrow
- Drop(place) - Drop value

Terminators:
- Return(value) - Return from function
- Goto(target) - Unconditional jump
- Branch(condition, true, false) - Conditional branch
- Call(callee, args, dest, next_block) - Function call
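One way to picture this structure is as a handful of dataclasses, with terminators supplying the edges of the control flow graph. The sketch below is an assumption-laden illustration and not the definitions in dim_mir_lowering.py: types, places, and rvalues are plain strings, block targets are list indices, only some statements and terminators are modelled, and the `name` field and `successors` helper are hypothetical. It needs Python 3.7 or later for `dataclasses`.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Local:
    name: str
    ty: str


@dataclass
class StorageLive:
    local: str


@dataclass
class Assign:
    place: str
    rvalue: str


@dataclass
class Return:
    value: Optional[str]


@dataclass
class Goto:
    target: int  # index into MIRFunction.blocks (an assumption of this sketch)


@dataclass
class Branch:
    condition: str
    true: int
    false: int


Statement = Union[StorageLive, Assign]
Terminator = Union[Return, Goto, Branch]


@dataclass
class BasicBlock:
    stmts: List[Statement] = field(default_factory=list)
    terminator: Optional[Terminator] = None


@dataclass
class MIRFunction:
    name: str
    params: List[Local]
    locals: List[Local]
    return_type: str
    blocks: List[BasicBlock] = field(default_factory=list)


def successors(block: BasicBlock) -> List[int]:
    """Return the indices of the blocks this block can jump to."""
    term = block.terminator
    if isinstance(term, Goto):
        return [term.target]
    if isinstance(term, Branch):
        return [term.true, term.false]
    return []  # Return (or a missing terminator) ends the path


# A diamond-shaped CFG: bb0 branches to bb1 or bb2, and both rejoin at bb3.
example = MIRFunction(
    name="pick",
    params=[Local("c", "bool")],
    locals=[Local("r", "i32")],
    return_type="i32",
    blocks=[
        BasicBlock([StorageLive("r")], Branch("c", 1, 2)),
        BasicBlock([Assign("r", "1")], Goto(3)),
        BasicBlock([Assign("r", "2")], Goto(3)),
        BasicBlock([], Return("r")),
    ],
)
```

Because every block ends in exactly one terminator, later passes such as the borrow checker can walk the graph through `successors` alone.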
File: dim_borrow_checker.py
Implements Polonius-inspired borrow checking using loan lifetime analysis.
- Loans: Each borrow creates a loan with a lifetime
- Paths: Dereferences and field accesses create paths
- Violations: Detects use-after-move, double mutable borrow
Liveness analysis is used for precise borrow checking:
- Computes which locals are live at each point
- Loans outlive their borrowers
- Invalidates loans on move
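The interplay of loans, liveness, and moves can be sketched on straight-line code. This is a deliberately simplified illustration and not the analysis in dim_borrow_checker.py: it ignores control flow, treats a loan as live until the last use of the reference that holds it, and uses a hypothetical tuple encoding for statements. Only the error codes E0040 and E0041 come from this document.

```python
from typing import Dict, List, Set, Tuple

# ("borrow_mut", dest, place) | ("move", place) | ("use", local)
Stmt = Tuple[str, ...]


def last_uses(stmts: List[Stmt]) -> Dict[str, int]:
    """Map each local to the index of its final use (a crude liveness result)."""
    last: Dict[str, int] = {}
    for index, stmt in enumerate(stmts):
        if stmt[0] == "use":
            last[stmt[1]] = index
    return last


def check(stmts: List[Stmt]) -> List[str]:
    last = last_uses(stmts)
    errors: List[str] = []
    moved: Set[str] = set()
    loans: Dict[str, str] = {}  # reference local -> borrowed place, live loans only
    for index, stmt in enumerate(stmts):
        # A loan expires once its reference is no longer used from here on.
        loans = {d: p for d, p in loans.items() if last.get(d, -1) >= index}
        if stmt[0] == "borrow_mut":
            _, dest, place = stmt
            if place in moved:
                errors.append(f"E0040: use of moved value '{place}' (statement {index})")
            if place in loans.values():
                errors.append(f"E0041: double mutable borrow of '{place}' (statement {index})")
            loans[dest] = place
        elif stmt[0] == "move":
            place = stmt[1]
            if place in moved:
                errors.append(f"E0040: use of moved value '{place}' (statement {index})")
            moved.add(place)
            # Moving a value invalidates every loan taken on it.
            loans = {d: p for d, p in loans.items() if p != place}
        elif stmt[0] == "use":
            if stmt[1] in moved:
                errors.append(f"E0040: use of moved value '{stmt[1]}' (statement {index})")
    return errors
```

Two mutable borrows of `x` are accepted when the first reference is last used before the second borrow, and rejected with E0041 when the first reference is still used afterwards. A checker over the full CFG would replace `last_uses` with a backward dataflow pass across basic blocks.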
File: dim_mir_to_llvm.py
Generates LLVM IR from MIR.
| Dim Type | LLVM Type |
|---|---|
| i32 | i32 |
| i64 | i64 |
| f32 | float |
| f64 | double |
| bool | i1 |
| string | i8* |
| unit | void |
| Tensor[T, n] |
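The scalar rows of the table translate naturally into a lookup. The sketch below is hypothetical and not the code in dim_mir_to_llvm.py: the `LLVM_TYPES` dictionary and `llvm_type` function are names invented for the example, Dim types are passed as strings, and tensor lowering is left unimplemented because its LLVM type is not given in the table.

```python
from typing import Dict

# Scalar mappings copied from the table; "unit" follows the spelling used
# in the type hierarchy in dim_types.py.
LLVM_TYPES: Dict[str, str] = {
    "i32": "i32",
    "i64": "i64",
    "f32": "float",
    "f64": "double",
    "bool": "i1",
    "string": "i8*",
    "unit": "void",
}


def llvm_type(dim_type: str) -> str:
    """Return the LLVM type name for a scalar Dim type."""
    try:
        return LLVM_TYPES[dim_type]
    except KeyError:
        # Tensor and other compound types are deliberately not guessed here.
        raise NotImplementedError(f"no LLVM mapping for Dim type {dim_type!r}") from None
```

For example, `llvm_type("f64")` returns `"double"`, while an unmapped type raises `NotImplementedError` instead of silently producing invalid IR.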
Function calls generate:
- call instruction with result
- br to continuation block
Example:
%result = call i32 @add(i32 %a, i32 %b)
br label %bb1
File: dim_cli.py
- dim lex <file> - Lex only
- dim parse <file> - Parse and print AST
- dim check <file> - Type check
- dim mir <file> - Lower to MIR and print
- dim borrow <file> - Run borrow checker
- dim build <file> - Full pipeline
- dim test - Run test suite
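A subcommand layout like this maps directly onto `argparse` subparsers. The sketch below only shows one plausible way to wire up the command names listed above; how dim_cli.py actually parses its arguments is not shown in this document, and `FILE_COMMANDS` and `build_parser` are hypothetical names.

```python
import argparse

# Commands that take a source file, with the descriptions listed above.
FILE_COMMANDS = {
    "lex": "Lex only",
    "parse": "Parse and print AST",
    "check": "Type check",
    "mir": "Lower to MIR and print",
    "borrow": "Run borrow checker",
    "build": "Full pipeline",
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="dim")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.required = True  # set as an attribute so Python 3.6 also accepts it
    for name, help_text in FILE_COMMANDS.items():
        sub = subparsers.add_parser(name, help=help_text)
        sub.add_argument("file", help="path to a .dim source file")
    subparsers.add_parser("test", help="Run test suite")
    return parser
```

Parsing `["check", "main.dim"]` with this parser yields a namespace whose `command` is `"check"` and whose `file` is `"main.dim"`, which a dispatcher could then route to the matching pipeline stage.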
File: dim_tests.py
Uses a simple test framework with:
- @test decorator for test registration
- Tag-based filtering
- Assertion helpers
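A framework with these three pieces fits in a few lines. The following is a hypothetical sketch and not the contents of dim_tests.py: the registry, the `assert_eq` helper, and the `run` function are assumed names, and the decorator here takes tags as arguments, which may differ from the real `@test` signature.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# (test name, tags, test function) in registration order
_REGISTRY: List[Tuple[str, Set[str], Callable[[], None]]] = []


def test(*tags: str) -> Callable[[Callable[[], None]], Callable[[], None]]:
    """Register the decorated function as a test carrying the given tags."""
    def decorator(fn: Callable[[], None]) -> Callable[[], None]:
        _REGISTRY.append((fn.__name__, set(tags), fn))
        return fn
    return decorator


def assert_eq(actual: object, expected: object) -> None:
    """Assertion helper that reports both values on failure."""
    if actual != expected:
        raise AssertionError(f"expected {expected!r}, got {actual!r}")


def run(tag: Optional[str] = None) -> Dict[str, bool]:
    """Run all registered tests, or only those with the given tag."""
    results: Dict[str, bool] = {}
    for name, tags, fn in _REGISTRY:
        if tag is not None and tag not in tags:
            continue
        try:
            fn()
            results[name] = True
        except AssertionError:
            results[name] = False
    return results


@test("lexer")
def lexer_splits_words() -> None:
    assert_eq("let x".split(), ["let", "x"])


@test("parser")
def failing_example() -> None:
    assert_eq(1 + 1, 3)  # fails on purpose to show how failures are reported
```

Calling `run("lexer")` executes only the first test, which mirrors what the `--tag lexer` option selects on the command line.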
Run tests:
python dim_tests.py
python dim_tests.py --tag lexer
Planned features:
- Native binary emission via LLVM
- WASM target
- Link-time optimization
- Async/await runtime
- Actor message passing implementation
- Typed prompts with model adapters
- Structured output validation
- Taint analysis
- Capability-based security
- Z3 integration for verification