A simple, educational BNF (Backus-Naur Form) parser and matching engine written in Go.
This project implements a parser capable of loading BNF grammar definitions and verifying if an input text matches defined patterns. It supports modern features like Packrat parsing (memoization), left recursion handling, and AST (Abstract Syntax Tree) generation.
- BNF Parsing: generic parser for standard BNF syntax (
<rule> ::= expression). - Pattern Matching: Supports Sequences, Choices (
|), Repetitions (+,*,?), grouping(), and literals. - Packrat Engine: Efficient matching using memoization to ensure performance even with complex backtracking.
- Left Recursion: Direct support for left-recursive rules (e.g.,
Expr ::= Expr "+" Term | Term). - AST Generation: Generate and visualize parse trees for successful matches.
- Grammar Validation: "Pre-flight" checks to ensure all referenced rules are defined.
- CLI Utility: Robust command-line tool with colored error reporting and visual pointers.
make install
# or
go installChecking if the whole content of input.txt matches grammar.bnf:
bnf -g grammar.bnf -i input.txtVisualize the structure of the match:
bnf -g examples/numbers.bnf -i input.txt -pOnly check if the grammar file itself is valid (no undefined rules):
bnf -g grammar.bnf -vOverride the entry point defined in the grammar (useful for testing sub-rules):
bnf -g examples/numbers.bnf -i input.txt -s "digit"Useful for piping content:
echo "123" | bnf -g examples/numbers.bnfCheck each line independently (useful for log files or lists):
bnf -g grammar.bnf -i input.txt -lStandard <rule> ::= ... syntax. Literals can use " or '.
<number> ::= <digit>+
<digit> ::= "0" | "1" | "2" | "3" | "4"
<expr> ::= <expr> "+" <number> | <number> // Left recursion supported!
In a simplified form, the syntax supported by this tool is:
<grammar> ::= <rule>+
<rule> ::= <identifier> "::=" <expression>
<expression> ::= <sequence> ( "|" <sequence> )*
<sequence> ::= <factor>*
<factor> ::= <atom> ( "*" | "+" | "?" )?
<atom> ::= <identifier> | <string> | "(" <expression> ")"
Build and verify the project using the included Makefile:
make build: Compile the binary.make test: Run unit tests and API tests.make integration-test: Run CLI tests against examples.make lint: Run code quality checks.make docs: Generate API documentation in thedocs/folder.
The engine uses a modern Packrat parsing approach:
- Grammar Parser: Reads the
.bnffile into a raw AST, then builds a linked execution graph. - Memoization: Every (Node, Position) match result is cached to prevent exponential time complexity in ambiguous grammars.
- Recursive Growth: Handles left-recursive rules by iteratively "growing" the match from a seed failure until a fixed point is reached.
- Node Interface:
type node interface { match(ctx *context, pos int) ([]MatchResult, error) Expect() []string }