Sunday 17 July 2011

Unparsable

So I have a horrifically context-sensitive grammar. Maybe this is why I had so many problems with my grammar in ANTLR and the like.

Consider this simple example:

GetTypes()[10] identifier;

Is this an array of ten objects of type "GetTypes()", or a single object whose type is the tenth element of GetTypes()?

This gets even worse when you start introducing pointer specifiers.

GetTypes()[10]* identifier;

Is that a pointer to an array of ten elements, a pointer to the tenth element, or even just a regular old multiply? Hence I've decided to change the grammar and remove all type modifiers. They will now be member functions on "type". For example,

GetTypes().array(10).pointer() identifier;

This should be distinguishable from

GetTypes()[10]* identifier;

even if it could be fairly verbose. What I'm really looking to do in my custom parser is find the semicolon first and work backwards. The trick is that no two expressions can sit next to each other without some syntax between them, like an operator or a comma. If I find such a case, it must be a variable definition, because it's not a valid expression. Of course, telling any of the above cases apart from a statement that isn't an expression will be easy: none of them looks like a for loop or anything like that.
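To make the backwards scan concrete, here is a rough sketch in Python rather than in the real parser. Everything in it is an assumption for illustration: the tokenizer, the helper names (tokenize, is_name, ends_expression, is_definition), and the simplification that keyword statements like for loops have already been ruled out before this check runs.

```python
import re

# Hypothetical tokenizer: names, integer literals, and single punctuation characters.
def tokenize(source):
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)

def is_name(tok):
    return tok[0].isalpha() or tok[0] == "_"

def ends_expression(tok):
    # A name, a literal, or a closing bracket can be the last token of an expression.
    return is_name(tok) or tok.isdigit() or tok in (")", "]")

def is_definition(statement):
    tokens = tokenize(statement)
    if not tokens or tokens[-1] != ";":
        raise ValueError("expected a statement terminated by ';'")
    # Start at the token just before the semicolon and walk towards the front.
    for i in range(len(tokens) - 2, 0, -1):
        # A name directly after something that ends an expression, with no
        # operator or comma in between, cannot be a valid expression.
        if is_name(tokens[i]) and ends_expression(tokens[i - 1]):
            return True
    return False

print(is_definition("GetTypes().array(10).pointer() identifier;"))
print(is_definition("GetTypes()[10]* identifier;"))
```

Under these assumptions the member-function form is flagged as a definition because ")" is followed directly by a name, while the "*" version has an operator between the two operands and so falls through as an ordinary multiply.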
