
paulhondola/atomc_compiler


AtomC-Compiler

AtomC-Compiler is an educational compiler project designed to translate and execute AtomC, a simplified subset of the C programming language. The project follows the fundamental stages of compiler construction, culminating in a stack-based virtual machine (VM) that executes the generated code.


Compiler Architecture

The compilation process is divided into the five stages below, each handling a specific layer of translation and analysis:

1. Lexical Analysis (ALEX)

  • Purpose: Groups individual characters from the source code into atomic units called tokens (e.g., numbers, identifiers, operators).
  • Implementation: Uses a tokenize function to iterate through the source code and build a linked list of Token structures.
  • Cleanup: Discards whitespace and comments, which carry no meaning for the later stages.
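
As a rough illustration of the lexical stage described above, here is a minimal tokenizer sketch. It is not the project's code: the `Token` variants, the operator set, and the error format are assumptions made for the example, and it collects tokens in a `Vec` instead of the linked list mentioned above, since that is the more idiomatic container in Rust.

```rust
/// Token kinds for a tiny, hypothetical slice of AtomC.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Id(String),
    Int(i64),
    Double(f64),
    Op(char),
    End,
}

/// Groups characters into tokens, skipping whitespace and `//` comments.
pub fn tokenize(src: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = src.chars().collect();
    let mut i = 0;
    let mut tokens = Vec::new();
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c == '/' && chars.get(i + 1) == Some(&'/') {
            // Line comment: skip everything up to the end of the line.
            while i < chars.len() && chars[i] != '\n' {
                i += 1;
            }
        } else if c.is_ascii_alphabetic() || c == '_' {
            let start = i;
            while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            tokens.push(Token::Id(chars[start..i].iter().collect()));
        } else if c.is_ascii_digit() {
            let start = i;
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            // A '.' followed by a digit turns the literal into a double.
            let mut is_double = false;
            if i + 1 < chars.len() && chars[i] == '.' && chars[i + 1].is_ascii_digit() {
                is_double = true;
                i += 1;
                while i < chars.len() && chars[i].is_ascii_digit() {
                    i += 1;
                }
            }
            let text: String = chars[start..i].iter().collect();
            if is_double {
                tokens.push(Token::Double(text.parse::<f64>().map_err(|e| e.to_string())?));
            } else {
                tokens.push(Token::Int(text.parse::<i64>().map_err(|e| e.to_string())?));
            }
        } else if "+-*/=<>;(){},".contains(c) {
            tokens.push(Token::Op(c));
            i += 1;
        } else {
            return Err(format!("unexpected character '{}' at offset {}", c, i));
        }
    }
    tokens.push(Token::End);
    Ok(tokens)
}
```

Calling `tokenize("x = 3 + 4.5; // sum")` in this sketch yields an identifier, three operators, an int, a double, and a final `End` marker; the comment and the spaces produce no tokens.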

2. Syntactic Analysis (ASDR)

  • Algorithm: Implements Recursive Descent Analysis (ASDR).
  • Process: Translates each rule of the formal grammar into a boolean function that consumes tokens when the rule matches.
  • Refinement: Includes the elimination of left recursion to prevent infinite loops during parsing.
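
The sketch below shows the general shape of such a parser on a toy expression grammar; the grammar, the `Tok` kinds, and the `Parser` type are hypothetical and not taken from the project. The left-recursive rule `expr ::= expr ADD term | term` would make `expr` call itself forever, so it is rewritten as `expr ::= term exprPrim` with `exprPrim ::= (ADD | SUB) term exprPrim | ε`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tok { Num, Add, Sub, Mul, LPar, RPar, End }

pub struct Parser {
    tokens: Vec<Tok>,
    pos: usize,
}

impl Parser {
    pub fn new(mut tokens: Vec<Tok>) -> Self {
        // Make sure the stream is terminated by an End marker.
        if tokens.last() != Some(&Tok::End) {
            tokens.push(Tok::End);
        }
        Parser { tokens, pos: 0 }
    }

    /// Consumes the current token if it has the requested kind.
    fn consume(&mut self, kind: Tok) -> bool {
        if self.tokens.get(self.pos) == Some(&kind) {
            self.pos += 1;
            true
        } else {
            false
        }
    }

    // expr ::= term exprPrim
    fn expr(&mut self) -> bool {
        let start = self.pos;
        if self.term() && self.expr_prim() {
            return true;
        }
        self.pos = start; // restore position on failure
        false
    }

    // exprPrim ::= (ADD | SUB) term exprPrim | ε
    fn expr_prim(&mut self) -> bool {
        let start = self.pos;
        if self.consume(Tok::Add) || self.consume(Tok::Sub) {
            if self.term() && self.expr_prim() {
                return true;
            }
            // An operator without a right operand is a syntax error.
            self.pos = start;
            return false;
        }
        true // ε: nothing to consume
    }

    // term ::= factor termPrim
    fn term(&mut self) -> bool {
        let start = self.pos;
        if self.factor() && self.term_prim() {
            return true;
        }
        self.pos = start;
        false
    }

    // termPrim ::= MUL factor termPrim | ε
    fn term_prim(&mut self) -> bool {
        let start = self.pos;
        if self.consume(Tok::Mul) {
            if self.factor() && self.term_prim() {
                return true;
            }
            self.pos = start;
            return false;
        }
        true
    }

    // factor ::= NUM | LPAR expr RPAR
    fn factor(&mut self) -> bool {
        let start = self.pos;
        if self.consume(Tok::Num) {
            return true;
        }
        if self.consume(Tok::LPar) && self.expr() && self.consume(Tok::RPar) {
            return true;
        }
        self.pos = start;
        false
    }

    /// Accepts the input only if a full expression is followed by End.
    pub fn parse(&mut self) -> bool {
        self.expr() && self.consume(Tok::End)
    }
}
```

Each grammar rule maps to one function, and every function restores the token position when it fails, so a caller can try another alternative.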

3. Domain Analysis (AD)

  • Symbol Table (TS): Organized as a stack of domains to handle nested scopes.
  • Scope Management: Tracks global and local variables, function parameters, and structure members.
  • Validation: Detects incorrect symbol redefinitions.
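
One way to picture a symbol table organized as a stack of domains is sketched below. The `SymbolTable`, `Symbol`, and `SymbolKind` names and the method set are illustrative assumptions, not the project's actual types.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum SymbolKind { Var, Param, Fn, Struct }

#[derive(Debug, Clone, PartialEq)]
pub struct Symbol {
    pub name: String,
    pub kind: SymbolKind,
}

/// A stack of domains; index 0 is the global domain.
pub struct SymbolTable {
    domains: Vec<HashMap<String, Symbol>>,
}

impl SymbolTable {
    pub fn new() -> Self {
        SymbolTable { domains: vec![HashMap::new()] }
    }

    /// Opens a nested scope (function body, block, struct definition).
    pub fn push_domain(&mut self) {
        self.domains.push(HashMap::new());
    }

    /// Closes the innermost scope; the global domain is never dropped.
    pub fn drop_domain(&mut self) {
        if self.domains.len() > 1 {
            self.domains.pop();
        }
    }

    /// Adds a symbol to the current domain, rejecting redefinitions there.
    pub fn add(&mut self, name: &str, kind: SymbolKind) -> Result<(), String> {
        let current = self.domains.last_mut().expect("global domain always exists");
        if current.contains_key(name) {
            return Err(format!("symbol redefinition: {}", name));
        }
        current.insert(name.to_string(), Symbol { name: name.to_string(), kind });
        Ok(())
    }

    /// Looks a name up from the innermost domain outwards.
    pub fn find(&self, name: &str) -> Option<&Symbol> {
        self.domains.iter().rev().find_map(|d| d.get(name))
    }
}
```

In this sketch a redefinition is only an error within the same domain, so a local or parameter may shadow a global of the same name, and the global becomes visible again once the inner domain is dropped.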

4. Type Analysis (AT)

  • Semantic Rules: Verifies that expressions and instructions adhere to AtomC semantic constraints.
  • L-values vs. R-values: Distinguishes between expressions that represent memory addresses (left-values) and those that represent simple values (right-values).
  • Type Synthesis: Propagates data types through expressions to ensure compatibility (e.g., operations between int and double).
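
To make the l-value/r-value distinction and type synthesis concrete, here is a small sketch. The `Type` and `Ret` types, the promotion rule (the wider scalar wins, as in C), and the error messages are assumptions for illustration; the project's real rules may differ.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Type { Char, Int, Double, Struct }

/// What type analysis synthesizes for an expression:
/// its type and whether it denotes a memory location.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Ret {
    pub ty: Type,
    pub lval: bool,
}

/// Result type of an arithmetic operation, or None if it is not allowed.
pub fn arith_result(a: Type, b: Type) -> Option<Type> {
    match (a, b) {
        (Type::Struct, _) | (_, Type::Struct) => None,
        (Type::Double, _) | (_, Type::Double) => Some(Type::Double),
        (Type::Int, _) | (_, Type::Int) => Some(Type::Int),
        _ => Some(Type::Char),
    }
}

/// `left + right`: both operands must be scalars; the result is an r-value.
pub fn check_add(left: Ret, right: Ret) -> Result<Ret, String> {
    match arith_result(left.ty, right.ty) {
        Some(ty) => Ok(Ret { ty, lval: false }),
        None => Err("invalid operand type for +".to_string()),
    }
}

/// `dst = src`: the destination must be an l-value and both sides scalar.
pub fn check_assign(dst: Ret, src: Ret) -> Result<Ret, String> {
    if !dst.lval {
        return Err("the assignment destination must be a left-value".to_string());
    }
    if arith_result(dst.ty, src.ty).is_none() {
        return Err("the assignment operands must be scalars".to_string());
    }
    // The assignment itself yields an r-value of the destination's type.
    Ok(Ret { ty: dst.ty, lval: false })
}
```

Under these assumed rules, adding an `int` variable to a `double` constant synthesizes a `double` r-value, while assigning to that sum is rejected because it is not an l-value.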

5. Code Generation

  • Minimalist Approach: Injects semantic actions into the syntactic rules to generate instructions for the VM.
  • Stack Logic: Generates instructions for loading addresses (FPADDR), dereferencing (LOAD), and storing values (STORE).
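
The interplay of FPADDR, LOAD, and STORE can be sketched as follows. The `Instr` and `Expr` types, the frame offsets, the `AddI` instruction, and the operand order expected by `Store` are assumptions for the example; only the three instruction names come from the description above.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Instr {
    PushI(i64),  // push an int constant
    FpAddr(i32), // push the address of a local at a frame-pointer offset
    Load,        // replace the address on top of the stack with its value
    Store,       // store the top value at the address below it
    AddI,        // hypothetical int addition
}

/// A tiny expression tree, assumed to have passed type analysis already.
pub enum Expr {
    Const(i64),
    Local(i32),             // local variable, identified by frame offset
    Add(Box<Expr>, Box<Expr>),
    Assign(i32, Box<Expr>), // destination offset, source expression
}

/// Emits code for `e`. Returns true when the result left on the stack
/// is an address (l-value) rather than a plain value.
pub fn emit(e: &Expr, code: &mut Vec<Instr>) -> bool {
    match e {
        Expr::Const(v) => {
            code.push(Instr::PushI(*v));
            false
        }
        Expr::Local(offset) => {
            code.push(Instr::FpAddr(*offset));
            true
        }
        Expr::Add(left, right) => {
            emit_rval(left, code);
            emit_rval(right, code);
            code.push(Instr::AddI);
            false
        }
        Expr::Assign(offset, src) => {
            code.push(Instr::FpAddr(*offset));
            emit_rval(src, code);
            code.push(Instr::Store);
            false
        }
    }
}

/// Emits code for `e` and dereferences the result if it is an address.
pub fn emit_rval(e: &Expr, code: &mut Vec<Instr>) {
    if emit(e, code) {
        code.push(Instr::Load);
    }
}
```

For `x = y + 1` with `x` at offset 1 and `y` at offset 2, this sketch emits `FpAddr(1)`, `FpAddr(2)`, `Load`, `PushI(1)`, `AddI`, `Store`: the variable `y` is loaded because it is used as a value, while `x` stays an address because it is the store target.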

The Virtual Machine (VM)

The execution environment is a stack-based virtual machine.

  • Instruction Set: Includes operations such as OP_PUSH, OP_POP, OP_CALL, and OP_HALT, as well as comparison instructions like LESS.i.
  • Universal Values: The stack uses a Val structure capable of storing any supported AtomC data type.
  • External Functions: Supports predefined external functions (e.g., put_i, put_d) to facilitate I/O operations.
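
A stack machine of this kind can be sketched in a few dozen lines. The `Op` and `Val` definitions below are simplified assumptions (an enum stands in for the universal Val structure, and `Jf`, `Jmp`, and `AddI` are hypothetical), and the `put_i`/`put_d` stand-ins collect their output in a vector instead of printing, which keeps the example easy to check.

```rust
/// A value that can hold any type supported by this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Val { Int(i64), Double(f64) }

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    PushI(i64),
    PushD(f64),
    Pop,
    AddI,      // hypothetical int addition
    LessI,     // pushes 1 if a < b, else 0
    Jf(usize), // hypothetical: jump if the popped int is 0
    Jmp(usize),
    PutI,      // stand-in for the external function put_i
    PutD,      // stand-in for the external function put_d
    Halt,
}

fn pop(stack: &mut Vec<Val>) -> Result<Val, String> {
    stack.pop().ok_or_else(|| "stack underflow".to_string())
}

fn pop_int(stack: &mut Vec<Val>) -> Result<i64, String> {
    match pop(stack)? {
        Val::Int(v) => Ok(v),
        other => Err(format!("expected int, found {:?}", other)),
    }
}

/// Runs `code` until Halt and returns the lines written by PutI/PutD.
pub fn run(code: &[Op]) -> Result<Vec<String>, String> {
    let mut stack: Vec<Val> = Vec::new();
    let mut out = Vec::new();
    let mut ip = 0;
    loop {
        let op = *code.get(ip).ok_or("instruction pointer out of range")?;
        ip += 1;
        match op {
            Op::PushI(v) => stack.push(Val::Int(v)),
            Op::PushD(v) => stack.push(Val::Double(v)),
            Op::Pop => {
                pop(&mut stack)?;
            }
            Op::AddI => {
                let b = pop_int(&mut stack)?;
                let a = pop_int(&mut stack)?;
                stack.push(Val::Int(a.wrapping_add(b)));
            }
            Op::LessI => {
                let b = pop_int(&mut stack)?;
                let a = pop_int(&mut stack)?;
                stack.push(Val::Int((a < b) as i64));
            }
            Op::Jf(target) => {
                if pop_int(&mut stack)? == 0 {
                    ip = target;
                }
            }
            Op::Jmp(target) => ip = target,
            Op::PutI => out.push(pop_int(&mut stack)?.to_string()),
            Op::PutD => match pop(&mut stack)? {
                Val::Double(v) => out.push(v.to_string()),
                other => return Err(format!("expected double, found {:?}", other)),
            },
            Op::Halt => return Ok(out),
        }
    }
}
```

The loop fetches one instruction, advances the instruction pointer, and then either manipulates the value stack or overwrites the pointer to jump; errors such as stack underflow are reported as `Err` rather than panicking.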

Features

  • Recursive Descent Parser: Manual implementation without external generator tools.
  • Scope & Type Safety: Full validation of variable visibility and operation compatibility.
  • Execution: Runs the generated code on the built-in VM, with instruction tracing for debugging.

Running & Testing

Prerequisites

  • Rust (stable toolchain)

Build

cargo build

For an optimized release build:

cargo build --release

Run

cargo run
cargo run -- <input_file> [output_file]
cargo run -- parser/test_parser_code_example.c parser/lexer_output.txt

The release binary is available at target/release/atomc_compiler after a release build.

Test

cargo test

To see output from passing tests as well:

cargo test -- --nocapture

Lint & Format

cargo fmt          # auto-format code
cargo clippy       # lint and catch common mistakes

