A high-performance compiler for a dynamic language, built in C++17 with direct LLVM IR generation and native machine code output. It's a true compiler, not a transpiler, so it implements a full multi-stage pipeline including custom lexical analysis, recursive descent parsing, AST construction, type inference, and LLVM-based optimization.
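To make the "recursive descent parsing" and "AST construction" stages concrete, here is a minimal sketch in C++17 that handles only arithmetic expressions. It is not taken from this project's sources: `Expr`, `Parser`, and `eval` are hypothetical names, and the grammar in the comments is an assumption chosen to give the arithmetic operators their usual precedence. The real compiler lowers its AST to LLVM IR, whereas this sketch simply evaluates the tree.

```cpp
#include <cassert>
#include <cctype>
#include <cmath>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical AST node: a number literal (op == 0) or a binary operation.
struct Expr {
    char op = 0;                      // 0 for a literal, else one of + - * / %
    double value = 0.0;               // only meaningful when op == 0
    std::unique_ptr<Expr> lhs, rhs;   // only set for binary operations
};

class Parser {
public:
    explicit Parser(std::string src) : src_(std::move(src)) {}

    std::unique_ptr<Expr> parse() {
        auto e = expression();
        if (peek() != '\0') throw std::runtime_error("unexpected trailing input");
        return e;
    }

private:
    std::string src_;
    std::size_t pos_ = 0;

    // Skip whitespace and return the next character, or '\0' at end of input.
    char peek() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
        return pos_ < src_.size() ? src_[pos_] : '\0';
    }

    static std::unique_ptr<Expr> literal(double v) {
        auto e = std::make_unique<Expr>();
        e->value = v;
        return e;
    }

    static std::unique_ptr<Expr> binary(char op, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
        auto e = std::make_unique<Expr>();
        e->op = op;
        e->lhs = std::move(l);
        e->rhs = std::move(r);
        return e;
    }

    // expression := term (('+' | '-') term)*
    std::unique_ptr<Expr> expression() {
        auto left = term();
        for (char c = peek(); c == '+' || c == '-'; c = peek()) {
            ++pos_;
            auto right = term();
            left = binary(c, std::move(left), std::move(right));
        }
        return left;
    }

    // term := factor (('*' | '/' | '%') factor)*
    std::unique_ptr<Expr> term() {
        auto left = factor();
        for (char c = peek(); c == '*' || c == '/' || c == '%'; c = peek()) {
            ++pos_;
            auto right = factor();
            left = binary(c, std::move(left), std::move(right));
        }
        return left;
    }

    // factor := NUMBER | '(' expression ')' | '-' factor
    std::unique_ptr<Expr> factor() {
        char c = peek();
        if (c == '(') {
            ++pos_;
            auto inner = expression();
            if (peek() != ')') throw std::runtime_error("expected ')'");
            ++pos_;
            return inner;
        }
        if (c == '-') {
            ++pos_;
            return binary('-', literal(0.0), factor());  // unary minus modelled as 0 - x
        }
        if (std::isdigit(static_cast<unsigned char>(c))) {
            std::size_t used = 0;
            double v = std::stod(src_.substr(pos_), &used);
            pos_ += used;
            return literal(v);
        }
        throw std::runtime_error("expected a number or '('");
    }
};

// Walk the tree; a compiler would emit IR here instead of computing a value.
double eval(const Expr& e) {
    if (e.op == 0) return e.value;
    assert(e.lhs && e.rhs);  // binary nodes always carry both children
    double a = eval(*e.lhs), b = eval(*e.rhs);
    switch (e.op) {
        case '+': return a + b;
        case '-': return a - b;
        case '*': return a * b;
        case '/': return a / b;
        default:  return std::fmod(a, b);  // '%'
    }
}
```

Each grammar rule maps to one member function, which is the defining trait of recursive descent. For example, `eval(*Parser("1 + 2 * 3").parse())` builds a tree in which the multiplication is nested under the addition.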
- Variable Declarations: `let`, `var`, and `const` keywords
- Data Types:
  - Numbers (integers and floating-point)
  - Arrays (dynamic, homogeneous)
  - Strings (with escape sequences)
  - Booleans (`true`/`false`)
  - Null values
- Operators:
  - Arithmetic: `+`, `-`, `*`, `/`, `%`
  - Comparison: `==`, `!=`, `<`, `>`, `<=`, `>=`
  - Logical: `&&`, `||`, `!`
  - Assignment: `=`
- Control Flow:
  - `if`/`else` statements
  - `while` loops
  - `for` loops with C-style syntax
- Built-in Functions:
  - `print(value)`: Print value with newline
  - `input()`: Read input from user
  - `str(number)`: Convert number to string
  - `num(string)`: Convert string to number
  - `int(string)`: Convert string to integer
- String Functions:
  - `len(string)`: Return the length of a string
  - `upper(string)`: Convert string to uppercase
  - `lower(string)`: Convert string to lowercase
  - `includes(haystack, needle)`: Check if haystack contains needle (returns 1.0 or 0.0)
  - `replace(haystack, old, new)`: Replace first occurrence of old with new in haystack
- Array Functions:
  - `len(array)`: Return the length of an array
  - `append(array, value)`: Append value to array and return new array
  - `includes(haystack, needle)`: Check if haystack array contains needle (returns 1.0 or 0.0)
- Math Functions:
  - `abs(x)`: Absolute value
  - `round(x, [decimals])`: Round to nearest integer or decimal place
  - `floor(x)`: Round down to nearest integer
  - `ceil(x)`: Round up to nearest integer
  - `sin(x)`: Sine of x (radians)
  - `cos(x)`: Cosine of x (radians)
  - `tan(x)`: Tangent of x (radians)
  - `min(a, b, ...)`: Find minimum value
  - `max(a, b, ...)`: Find maximum value
  - `pow(x, y)`: Raise x to power of y
  - `sqrt(x)`: Calculate square root
  - `random()`: Generate random number from 0 to 1
- Custom Functions:
  - User-defined functions with parameters and return values
  - Full recursion support (including mutual recursion)
  - Local variable scope within functions
  - Integration with built-in functions
- Comments:
  - Single-line: `// comment`
  - Multi-line: `/* comment */`
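A few of the built-ins listed above have semantics worth pinning down: `replace` touches only the first occurrence, `includes` returns a number rather than a boolean, and `round` takes an optional decimal count. The C++ sketch below shows one way such runtime helpers could behave. The names `replaceFirst`, `includes`, and `roundTo` are hypothetical, and the handling of empty search strings, halfway cases, and negative `decimals` is an assumption rather than something this README specifies.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>

// Replace only the first occurrence of `from` in `text`; later matches are left alone.
// Assumption: an empty `from` leaves the text unchanged.
std::string replaceFirst(std::string text, const std::string& from, const std::string& to) {
    if (from.empty()) return text;
    std::size_t at = text.find(from);
    if (at != std::string::npos) text.replace(at, from.size(), to);
    return text;
}

// Membership test that reports its result as 1.0 or 0.0, matching the documented return values.
double includes(const std::string& haystack, const std::string& needle) {
    return haystack.find(needle) != std::string::npos ? 1.0 : 0.0;
}

// Round to `decimals` places (0 means nearest integer).
// Assumptions: halfway cases round away from zero (std::round), and decimals is non-negative.
double roundTo(double x, int decimals = 0) {
    assert(decimals >= 0);
    double scale = std::pow(10.0, decimals);
    return std::round(x * scale) / scale;
}
```

Returning `1.0`/`0.0` from `includes` keeps the result in the same numeric type as other values, which is presumably why the language documents it that way.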
- Complete Pipeline: Lexer → Parser → AST → LLVM IR → Native Code
- Error Handling: Syntax error reporting with line/column information
- Optimization: Automatic optimization via LLVM's opt tool
- Multiple Output Formats: Can emit LLVM IR, assembly, object files, or executables
- Verbose Mode: See each compilation step
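Line/column error reporting starts in the lexer, which has to track its position as it consumes characters. The sketch below illustrates the idea in C++17. It is a simplified illustration, not this project's lexer: `Token`, `TokenKind`, and `tokenize` are hypothetical names, multi-character operators such as `==` come out as two separate symbols, and strings and comments are not handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

enum class TokenKind { Number, Identifier, Symbol };

struct Token {
    TokenKind kind;
    std::string text;
    int line;    // 1-based line of the token's first character
    int column;  // 1-based column of the token's first character
};

std::vector<Token> tokenize(const std::string& src) {
    static const std::string symbols = "+-*/%=<>!&|(){}[];,";
    std::vector<Token> out;
    int line = 1, column = 1;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (c == '\n') { ++line; column = 1; ++i; continue; }   // a newline resets the column
        if (std::isspace(c)) { ++column; ++i; continue; }

        std::size_t start = i;
        TokenKind kind;
        if (std::isdigit(c)) {
            kind = TokenKind::Number;
            while (i < src.size() &&
                   (std::isdigit(static_cast<unsigned char>(src[i])) || src[i] == '.')) ++i;
        } else if (std::isalpha(c) || c == '_') {
            kind = TokenKind::Identifier;  // keywords such as let/var/const would be classified later
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
        } else if (symbols.find(static_cast<char>(c)) != std::string::npos) {
            kind = TokenKind::Symbol;
            ++i;
        } else {
            throw std::runtime_error("unexpected character '" + std::string(1, static_cast<char>(c)) +
                                     "' at line " + std::to_string(line) +
                                     ", column " + std::to_string(column));
        }
        assert(i > start);  // every branch consumes input, so the loop terminates
        out.push_back({kind, src.substr(start, i - start), line, column});
        column += static_cast<int>(i - start);
    }
    return out;
}
```

Because every token records where it started, later stages (the parser, for instance) can attach the same position to their own diagnostics without rescanning the source.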
- C++ Compiler:
  - GCC 7.0+ or Clang 5.0+ with C++17 support
  - On Windows: MinGW-w64 recommended
- LLVM:
  - Version 10.0 or higher (tested with 20.1.8)
  - Must be in PATH or specify LLVM_DIR for CMake
- Build Tools:
  - CMake 3.10+ (optional but recommended)
  - GNU Make or MinGW Make (for CMake builds)
- Linker:
  - GNU ld or compatible (usually comes with GCC)
g++ --version # C++ compiler
llvm-config --version # LLVM
llc --version # LLVM
cmake --version # CMake (optional)
ld --version # Linker

.\build.bat # Output: twine.exe

chmod +x build.sh
./build.sh # Output: ./twine

mkdir build
cd build
cmake ..
make # Output: build/bin/twine

LLVM_FLAGS=$(llvm-config --cxxflags --ldflags --system-libs --libs core support irreader codegen mc mcparser option target)
g++ -std=c++17 -o twine main.cpp lexer.cpp parser.cpp ast.cpp codegen.cpp $LLVM_FLAGS

# Compile a .tw file to executable
twine input.tw
# This creates:
# - input.ll (LLVM IR - deleted unless --verbose)
# - input.s (Assembly - deleted unless --verbose)
# - input.o (Object file - deleted unless --verbose)
# - input.exe (Windows) or input (Unix)

twine <input.tw> [options]
Options:
  -o <output>    Specify output executable name
  --emit-ir      Output LLVM IR only (.ll file)
  --emit-asm     Output assembly only (.s file)
  --emit-obj     Output object file only (.o file)
  --verbose      Show all compilation steps and keep intermediate files
  --help         Display help message
  --version      Show version information

# Compile with custom output name
twine fibonacci.tw -o fib
# Generate and inspect LLVM IR
twine program.tw --emit-ir
cat program.ll
# See all compilation steps
twine program.tw --verbose
# Generate optimized assembly
twine program.tw --emit-asm

This project is provided as-is for all purposes, and was really just for me - it's not amazing code, but if you want to use it, it's yours. Feel free to use, modify, and distribute as needed.
- The LLVM Project - For providing excellent compiler infrastructure and tools.
- The Dragon Book - For foundational compiler theory, and lots on lexing and parsing.
- Wisp by Adam McDaniel - A minimal Lisp-like language built with LLVM.
- LLVM IR with the C++ API - Tutorial by Mukul Rathi
- “Hello, LLVM” - Tutorial by James Hamilton
- Awesome LLVM - A curated list of useful LLVM resources.
- Jesse Squires' Blog - A great roundup of compiler and LLVM learning materials.
- Building an Optimizing Compiler - Old, but still very good, helpful with high level design.