This project is a simple compiler, written in C, that implements a lexical analyzer for a subset of the C programming language. It reads multi-line user input, categorizes tokens as keywords, identifiers, constants, operators, or punctuation, and prints the number of tokens in each category.
Compiler/
│
├── codes/                  # C source files for lexical analyzer, parser, and other logic
│   ├── lex.c               # Lexical analyzer implementation
│   ├── parse.c             # Parsing logic
│   ├── run.c               # Code execution logic
│   └── main.c              # Main entry point
│
├── examples/               # Example input files to test the compiler
│   ├── filename.txt
│   ├── countdigits.txt
│   ├── factorial.txt
│   ├── fibonacci.txt
│   ├── grades.txt
│   ├── largestof3.txt
│   ├── power.txt
│   ├── printeven.txt
│   ├── reverse.txt
│   ├── smallestof3.txt
│   ├── sum.txt
│   ├── swap.txt
│   ├── tables.txt
│   ├── array1.txt
│   ├── array2.txt
│   └── array3.txt
│
├── obj/                    # Object files directory (Created through commands)
│
├── outputs/                # Outputs directory (Created through commands)
│
└── project report.pdf
- Keyword recognition: Identifies keywords like int, float, if, else, switch
- Identifier recognition: Supports identifiers starting with alphabetic characters or underscores
- Constant recognition: Identifies numeric constants, including decimals
- Operator recognition: Supports arithmetic, comparison, and logical operators like +, -, *, /, ==, !=, &&, ||
- Punctuation recognition: Recognizes common punctuation such as ; , { } ( )
- Finite State Machine (FSM): Used for efficient token detection
- Token classification: Categorizes tokens and stores them in separate lists
- Token statistics: Outputs the number of tokens for each category
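The features above can be illustrated with a minimal sketch of an FSM-based scanner. This is not the code in codes/lex.c: the names TokenStats, State, scan, and print_stats are hypothetical, the keyword, operator, and punctuation sets are limited to the examples listed above, and tokens are only counted rather than stored in per-category lists.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Token categories from the feature list (enum names are hypothetical). */
typedef enum { TOK_KEYWORD, TOK_IDENTIFIER, TOK_CONSTANT,
               TOK_OPERATOR, TOK_PUNCTUATION, TOK_CATEGORIES } TokenCategory;

/* FSM states: idle, inside an identifier, inside the integer part of a
   number, inside the fractional part of a number. */
typedef enum { S_START, S_IDENT, S_INT, S_FRAC } State;

typedef struct { int count[TOK_CATEGORIES]; } TokenStats;

/* Returns 1 if the n characters at s spell one of the sample keywords. */
static int is_keyword(const char *s, size_t n) {
    static const char *const kw[] = {"int", "float", "if", "else", "switch"};
    for (size_t i = 0; i < sizeof kw / sizeof kw[0]; i++)
        if (strlen(kw[i]) == n && strncmp(kw[i], s, n) == 0) return 1;
    return 0;
}

/* Returns the length of the operator starting at s, or 0 if there is none. */
static size_t operator_length(const char *s) {
    static const char *const two[] = {"==", "!=", "&&", "||"};
    for (size_t i = 0; i < sizeof two / sizeof two[0]; i++)
        if (strncmp(s, two[i], 2) == 0) return 2;
    return (*s != '\0' && strchr("+-*/", *s) != NULL) ? 1 : 0;
}

/* Scans src one character at a time and counts tokens per category. */
void scan(const char *src, TokenStats *stats) {
    State state = S_START;
    const char *start = src; /* first character of the identifier in progress */
    memset(stats, 0, sizeof *stats);

    for (const char *p = src;; p++) {
        unsigned char c = (unsigned char)*p;
        switch (state) {
        case S_IDENT:
            if (isalnum(c) || c == '_') continue; /* still inside the word */
            stats->count[is_keyword(start, (size_t)(p - start))
                             ? TOK_KEYWORD : TOK_IDENTIFIER]++;
            state = S_START;
            break;
        case S_INT:
            if (isdigit(c)) continue;
            if (c == '.') { state = S_FRAC; continue; }
            stats->count[TOK_CONSTANT]++;
            state = S_START;
            break;
        case S_FRAC:
            if (isdigit(c)) continue;
            stats->count[TOK_CONSTANT]++;
            state = S_START;
            break;
        case S_START:
            break;
        }
        /* The FSM is idle here: c either ends the input or starts a token. */
        if (c == '\0') return;
        size_t op_len;
        if (isalpha(c) || c == '_') { start = p; state = S_IDENT; }
        else if (isdigit(c)) { state = S_INT; }
        else if ((op_len = operator_length(p)) > 0) {
            stats->count[TOK_OPERATOR]++;
            p += op_len - 1; /* skip the second character of ==, !=, &&, || */
        } else if (strchr(";,{}()", c) != NULL) {
            stats->count[TOK_PUNCTUATION]++;
        }
        /* Whitespace and unrecognized characters are skipped in this sketch. */
    }
}

/* Prints one line per category with its token count. */
void print_stats(const TokenStats *stats) {
    static const char *const names[TOK_CATEGORIES] = {
        "keywords", "identifiers", "constants", "operators", "punctuation"};
    for (int i = 0; i < TOK_CATEGORIES; i++)
        printf("%-12s %d\n", names[i], stats->count[i]);
}
```

For example, calling scan("int n; if (n != 10) n + 2.5;", &stats) and then print_stats(&stats) should report 2 keywords, 3 identifiers, 2 constants, 2 operators, and 4 punctuation tokens. Keeping the state in a single enum makes it easy to add further states later, for instance for string literals or comments.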
- GCC or any standard C compiler
- Basic familiarity with command-line tools
Open your terminal and run the following commands in the project root:
mkdir -p obj
gcc -std=gnu11 -Wall -Werror -c codes/lex.c -o obj/lex.o
gcc -std=gnu11 -Wall -Werror -c codes/parse.c -o obj/parse.o
gcc -std=gnu11 -Wall -Werror -c codes/run.c -o obj/run.o
gcc -std=gnu11 -Wall -Werror -c codes/main.c -o obj/main.o
gcc -o interpret obj/lex.o obj/parse.o obj/run.o obj/main.o
Then run the interpreter on any of the example files:
./interpret examples/filename.txt
./interpret examples/countdigits.txt
./interpret examples/factorial.txt
./interpret examples/fibonacci.txt
./interpret examples/grades.txt
./interpret examples/largestof3.txt
./interpret examples/power.txt
./interpret examples/printeven.txt
./interpret examples/reverse.txt
./interpret examples/smallestof3.txt
./interpret examples/sum.txt
./interpret examples/swap.txt
./interpret examples/tables.txt
./interpret examples/array1.txt
./interpret examples/array2.txt
./interpret examples/array3.txt
This project covers:
- Basics of compiler design and lexical analysis
- Practical implementation of tokenization and FSM in C
- Understanding string manipulation, character classification, and categorizing tokens