A toy programming language compiler built with LLVM, following the official LLVM Kaleidoscope Tutorial.
Kaleidoscope is a simple procedural language that supports:
- Function definitions and declarations
- Basic arithmetic operations
- Function calls
- Variables
- Comments
The compiler generates LLVM IR, which can then be compiled to native machine code.
- LLVM (version 14 or later recommended)
- Clang or GCC with C++17 support
- CMake (optional, for build automation)
Ubuntu/Debian:
sudo apt-get install llvm llvm-dev clang
macOS (Homebrew):
brew install llvm
Fedora:
sudo dnf install llvm llvm-devel clang
Build with Clang:
clang++ -g -O3 one.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core` -o kaleidoscope
Or build with GCC:
g++ -g -O3 one.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core` -o kaleidoscope
Run the compiler to start the interactive REPL:
./kaleidoscope
You'll see the ready> prompt where you can enter Kaleidoscope code.
ready> def add(x y) x + y;
Output:
Parsed a function definition.
define double @add(double %x, double %y) {
entry:
%addtmp = fadd double %x, %y
ret double %addtmp
}
ready> extern sin(x);
ready> 2 + 3;
ready> 4 * (2 + 3);
ready> add(1, 2);
| Operator | Description |
|---|---|
| `+` | Addition |
| `-` | Subtraction |
| `*` | Multiplication |
| `<` | Less than comparison |
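The official tutorial resolves precedence among these four operators with a small lookup table (`<` at 10, `+` and `-` at 20, `*` at 40) and operator-precedence parsing. The sketch below assumes this project keeps those values. To stay self-contained it evaluates doubles directly instead of building an AST or emitting IR, and `ExprEvaluator` is a hypothetical helper that does not come from one.cpp.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical helper for illustration only: evaluates an expression directly
// so the precedence logic is visible without any LLVM dependency.
class ExprEvaluator {
public:
  explicit ExprEvaluator(const std::string &Source) : Src(Source) {}

  double evaluate() {
    double LHS = parsePrimary();
    return parseBinOpRHS(0, LHS);
  }

private:
  std::string Src;
  std::size_t Pos = 0;
  // Precedence values as in the LLVM tutorial (assumed to match this project).
  const std::map<char, int> BinopPrecedence{
      {'<', 10}, {'+', 20}, {'-', 20}, {'*', 40}};

  void skipSpaces() {
    while (Pos < Src.size() &&
           std::isspace(static_cast<unsigned char>(Src[Pos])))
      ++Pos;
  }

  // Precedence of the pending operator, or -1 if the next char is not one.
  int peekPrecedence() {
    skipSpaces();
    if (Pos >= Src.size())
      return -1;
    auto It = BinopPrecedence.find(Src[Pos]);
    return It == BinopPrecedence.end() ? -1 : It->second;
  }

  // primary ::= number | '(' expression ')'
  double parsePrimary() {
    skipSpaces();
    if (Pos < Src.size() && Src[Pos] == '(') {
      ++Pos; // eat '('
      double Inner = parsePrimary();
      double V = parseBinOpRHS(0, Inner);
      skipSpaces();
      if (Pos < Src.size() && Src[Pos] == ')')
        ++Pos; // eat ')'
      return V;
    }
    const char *Begin = Src.c_str() + Pos;
    char *End = nullptr;
    double V = std::strtod(Begin, &End);
    Pos += static_cast<std::size_t>(End - Begin);
    return V;
  }

  // Consumes operators whose precedence is at least MinPrec.
  double parseBinOpRHS(int MinPrec, double LHS) {
    while (true) {
      int Prec = peekPrecedence();
      if (Prec < MinPrec)
        return LHS;
      char Op = Src[Pos++];
      double RHS = parsePrimary();
      // If the next operator binds tighter, let it take RHS first.
      if (Prec < peekPrecedence())
        RHS = parseBinOpRHS(Prec + 1, RHS);
      LHS = apply(Op, LHS, RHS);
    }
  }

  static double apply(char Op, double L, double R) {
    switch (Op) {
    case '+': return L + R;
    case '-': return L - R;
    case '*': return L * R;
    case '<': return L < R ? 1.0 : 0.0; // comparison yields 0.0 or 1.0
    default:  return 0.0;
    }
  }
};
```

With this table, `2 + 3 * 4` groups as `2 + (3 * 4)`, and `1 < 2 + 3` compares `1` against the sum because `<` binds most loosely.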
Lines starting with # are treated as comments:
# This is a comment
def foo(x) x * 2;
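As a rough illustration of how a lexer can drop such comments, the sketch below tokenizes a string and discards everything from `#` to the end of the line. It is loosely modeled on the tutorial's lexer but reads from a `std::string` rather than standard input. The names `TokenKind`, `Token`, and `tokenize` are hypothetical and may not match one.cpp.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds for illustration; one.cpp may use different names.
enum class TokenKind { Def, Extern, Identifier, Number, Symbol };

struct Token {
  TokenKind Kind;
  std::string Text;
};

static bool isSpace(char C) { return std::isspace(static_cast<unsigned char>(C)) != 0; }
static bool isAlpha(char C) { return std::isalpha(static_cast<unsigned char>(C)) != 0; }
static bool isAlnum(char C) { return std::isalnum(static_cast<unsigned char>(C)) != 0; }
static bool isDigit(char C) { return std::isdigit(static_cast<unsigned char>(C)) != 0; }

// Splits Kaleidoscope source into tokens, dropping whitespace and comments.
std::vector<Token> tokenize(const std::string &Src) {
  std::vector<Token> Tokens;
  std::size_t I = 0;
  while (I < Src.size()) {
    char C = Src[I];
    if (isSpace(C)) {
      ++I;
      continue;
    }
    if (C == '#') {
      // Comment: skip everything up to (not including) the end of the line.
      while (I < Src.size() && Src[I] != '\n' && Src[I] != '\r')
        ++I;
      continue;
    }
    if (isAlpha(C)) {
      // identifier ::= [a-zA-Z][a-zA-Z0-9]*
      std::size_t Start = I;
      while (I < Src.size() && isAlnum(Src[I]))
        ++I;
      std::string Word = Src.substr(Start, I - Start);
      TokenKind Kind = Word == "def"      ? TokenKind::Def
                       : Word == "extern" ? TokenKind::Extern
                                          : TokenKind::Identifier;
      Tokens.push_back({Kind, Word});
      continue;
    }
    if (isDigit(C) || C == '.') {
      // number ::= [0-9.]+ (deliberately naive, with no validation)
      std::size_t Start = I;
      while (I < Src.size() && (isDigit(Src[I]) || Src[I] == '.'))
        ++I;
      Tokens.push_back({TokenKind::Number, Src.substr(Start, I - Start)});
      continue;
    }
    // Anything else (operators, parentheses, ';') is a one-character symbol.
    Tokens.push_back({TokenKind::Symbol, std::string(1, C)});
    ++I;
  }
  return Tokens;
}
```

Feeding it the two lines above yields only the tokens of the `def foo(x) x * 2;` line; the comment line contributes nothing.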
ready> def multiply(a b) a * b;
Parsed a function definition.
define double @multiply(double %a, double %b) {
entry:
%multmp = fmul double %a, %b
ret double %multmp
}
ready> def square(x) multiply(x, x);
Parsed a function definition.
define double @square(double %x) {
entry:
%calltmp = call double @multiply(double %x, double %x)
ret double %calltmp
}
ready> extern cos(x);
Parsed an extern.
declare double @cos(double)
ready> cos(1.0);
Parsed a top-level expr
define double @__anon_expr() {
entry:
%calltmp = call double @cos(double 1.000000e+00)
ret double %calltmp
}
Press Ctrl+D (EOF) to exit the REPL and print the complete LLVM IR module.
Kaleidoscope/
├── one.cpp # Main compiler source code
├── tests/ # Unit tests
│ ├── lexer_test.cpp
│ └── README.md
├── LICENSE # MIT License
└── README.md # This file
The compiler consists of several components:
- Lexer - Tokenizes input into tokens (numbers, identifiers, keywords)
- Parser - Builds an Abstract Syntax Tree (AST) using recursive descent parsing
- AST Nodes - Represent expressions, function definitions, and prototypes
- Code Generator - Traverses the AST and generates LLVM IR
- NumberExprAST - Numeric literals
- VariableExprAST - Variable references
- BinaryExprAST - Binary operations (+, -, *, <)
- CallExprAST - Function calls
- PrototypeAST - Function prototypes
- FunctionAST - Function definitions
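A minimal sketch of how these node types might fit together, following the shape used in the LLVM tutorial. The class names come from the list above. The member names, accessors, the `ExprAST` base class, and the `makeAddFunction` helper are assumptions for illustration, and the code-generation methods are omitted so the example compiles without LLVM.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Assumed common base class for all expression nodes.
class ExprAST {
public:
  virtual ~ExprAST() = default;
};

class NumberExprAST : public ExprAST {
  double Val;
public:
  explicit NumberExprAST(double Val) : Val(Val) {}
  double getValue() const { return Val; }
};

class VariableExprAST : public ExprAST {
  std::string Name;
public:
  explicit VariableExprAST(std::string Name) : Name(std::move(Name)) {}
  const std::string &getName() const { return Name; }
};

class BinaryExprAST : public ExprAST {
  char Op;
  std::unique_ptr<ExprAST> LHS, RHS;
public:
  BinaryExprAST(char Op, std::unique_ptr<ExprAST> LHS,
                std::unique_ptr<ExprAST> RHS)
      : Op(Op), LHS(std::move(LHS)), RHS(std::move(RHS)) {}
  char getOp() const { return Op; }
};

class CallExprAST : public ExprAST {
  std::string Callee;
  std::vector<std::unique_ptr<ExprAST>> Args;
public:
  CallExprAST(std::string Callee, std::vector<std::unique_ptr<ExprAST>> Args)
      : Callee(std::move(Callee)), Args(std::move(Args)) {}
  const std::string &getCallee() const { return Callee; }
};

// A prototype records a function's name and its argument names.
class PrototypeAST {
  std::string Name;
  std::vector<std::string> Args;
public:
  PrototypeAST(std::string Name, std::vector<std::string> Args)
      : Name(std::move(Name)), Args(std::move(Args)) {}
  const std::string &getName() const { return Name; }
  std::size_t getNumArgs() const { return Args.size(); }
};

// A function definition pairs a prototype with a body expression.
class FunctionAST {
  std::unique_ptr<PrototypeAST> Proto;
  std::unique_ptr<ExprAST> Body;
public:
  FunctionAST(std::unique_ptr<PrototypeAST> Proto,
              std::unique_ptr<ExprAST> Body)
      : Proto(std::move(Proto)), Body(std::move(Body)) {}
  const PrototypeAST &getProto() const { return *Proto; }
};

// Hypothetical helper: builds by hand the tree a parser would
// produce for `def add(x y) x + y;`.
std::unique_ptr<FunctionAST> makeAddFunction() {
  auto Proto = std::make_unique<PrototypeAST>(
      "add", std::vector<std::string>{"x", "y"});
  auto Body = std::make_unique<BinaryExprAST>(
      '+', std::make_unique<VariableExprAST>("x"),
      std::make_unique<VariableExprAST>("y"));
  return std::make_unique<FunctionAST>(std::move(Proto), std::move(Body));
}
```

Owning children through `std::unique_ptr` keeps the tree's lifetime tied to its root, so destroying a `FunctionAST` frees its whole body.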
This project is licensed under the MIT License - see the LICENSE file for details.