For example, take the following example grammar:
expression : arithmetic-expression
| access-expression;
access-expression : expression '[' expression ']';
arithmetic-expression : expression '+' expression;
Of course, whilst this is the intent of the code, then actually this is thoroughly ambiguous and a simple input will yield the problem:
1 + 2[3]
Is it a:
(1 + 2)[3]
or b:
1 + (2[3])
This means that the need for disambiguating parenthesis is confusingly implied and the recursion has to be specified. The expression syntax instead has to look something like this:
expression : '(' arithmetic_expression ')'
| access_expression;
access_expression : arithmetic_expression '[' arithmetic_expression ']';
arithmetic_expression : bitwise_expression '+' bitwise_expression;
You get the picture. A pity that this process cannot be automated. The grammar for even a relatively simple language like C is actually hundreds of lines long, a lot more complex than I had imagined. However, should I successfully complete a grammar, then both the parser and the lexer will be generated for me, which is nice, and the first step towards realizing my meta-programming type-inferring dream.
No comments:
Post a Comment