A parser is a function that parses a stream of tokens and returns a result denoting whether the stream is syntactically valid or not.
The parser function used for the classic compiler is generated by Bison. classical.y contains all the configuration (i.e. the grammar, production rules) that is used by Bison to generate the parser function.
Although the modules of the compiler are written in C++, Bison's C-Language interface is used, as it is more extensively documented. This means the parser function module is a C file.
To generate the parser with Bison run make bison, this command will create classical.tab.c and states.bison. The former is the implementation file of the parser function and the latter is a file containing all states and items of the parser automaton. The states file is particularly handy for debugging, you can use it to examine state items and validate that your grammar defines the language as you expect.
To build the parser run make parser. If compilation fails you need to examine why. Keep in mind that the flag -Werror is passed to the compiler; this flag promotes all warnings to errors and compilation fails when errors occur.
Unit tests are written for the ast/nodes and ast/builder modules. You can run these tests with make test-nodes and make test-builder, respectively.
Integration tests are written for the whole parser module, where its output is examined given a fixed input stream of tokens. You can run the parser integration tests with make test-parser.
Run make run-parser to run the parser. This will by default feed the tests/test.classic file to the parser. You can modify this file as you'd like, to observe the parser's output and experiment with different programs written in classic.
The Abstract Syntax Tree (AST) is a tree consisting of nodes that represent grammar symbols.
Non-terminal symbols are nodes with children and terminal symbols are leaf nodes.
In the case of the production rule assign-stm = ids , "=" , exp here's an example of the corresponding AST for the statement:
price = before_tax + tax;
According to the below rules, an assign-stm is a statement, just as a compound-stm or an exp can be.
statement = [ non-empty-stm ]
non-empty-stm = compound-stm | assign-stm | exp
So, statement is a generic representation of a statement and compound-stm, assign-stm and exp are specific statement types.
Similarly, exp is a generic representation of an expression.
The concept of generic representation is expressed through generic base type AST node classes. e.g. there is one generic Expression_ class and the more specific ParenthesesExpression_, BinaryOperationExpression_, FunctionCallExpression_, LiteralExpression_ and VariableExpression_ classes.
All AST node classes are suffixed with an underscore. Pointers to those classes do not have suffices, e.g. Expression is equal to Expression_*.
The below class diagram depicts the aforementioned classes.
The generic class Expression_ has a type attribute and a downcast method; these enable the caller to create a specific expression object from the generic. While parsing, objects of the specific classes are created, but what gets pushed to the stack is a generic object; this
is accomplished through the upcast method that specific classes implement.
In summary:
- A generic object is converted to a specific one with the
downcastmethod - A specific object is converted to a generic one with the
upcastmethod - The AST nodes that are pushed to the stack are generic objects
Here's a picture of the parsing process this compiler implements.


