for(a=[1];!a[999];)/^(11+)\1+$/.test(a+=1)||print(a.length)
Friday 23 March 2018
Thursday 8 March 2018
Vexatious Parses in C++
As part of my work on the egg computer language specification, I've been looking into parsing curly-brace-type languages. There are a number of cul-de-sacs in these language specifications. Here's one from C++ I've been struggling with today:
int a = 1;
int b = 2;
int c = a-b;
What's the value of "c"? Obviously, it's minus one. But what about this:
c = a--b;
My Microsoft compiler tells me that this is a malformed expression:
syntax error: missing ';' before identifier 'b'
But the following is fine:
c = a---b;
This sets "c" to minus one and decrements "a". Honest.
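For anyone who wants to check that claim, here is a minimal sketch. The function name `triple_minus` is made up for illustration; it just wraps the snippet above and hands back both "c" and the final value of "a".

```cpp
#include <utility>

// Hypothetical helper: evaluates c = a---b with a = 1 and b = 2,
// then returns {c, a} so both effects can be inspected.
std::pair<int, int> triple_minus() {
    int a = 1;
    int b = 2;
    int c = a---b;  // tokenised as a -- - b, i.e. (a--) - b
    return {c, a};  // c is 1 - 2; a has been decremented afterwards
}
```

Calling it should give a pair holding -1 (the value of "c") and 0 (the decremented "a").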
Here's a list of parses:
a-b      // Parsed as "a - b"
a--b     // Fails to compile: missing ';' before identifier 'b'
a---b    // Parsed as "a-- - b"
a----b   // Fails to compile: '--' needs l-value
a-----b  // Fails to compile: '--' needs l-value
a- -b    // Parsed as "a - -b"
a- --b   // Parsed as "a - --b"
a-- -b   // Parsed as "a-- -b"
a- - -b  // Parsed as "a - - -b"
The compiler is obviously "greedy" when tokenising operators: it always takes the longest operator it can (the so-called "maximal munch" rule). So, in the absence of white-space, it never considers an alternative interpretation:
a--b     // COULD be parsed as "a - -b"
a----b   // COULD be parsed as "a-- - -b"
a-----b  // COULD be parsed as "a-- - --b"
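The greedy behaviour can be sketched with a toy tokeniser. This is not how any real compiler is written; `greedy_tokens` is a hypothetical helper that only understands identifiers, spaces, "-" and "--", and it assumes the longest-match rule described above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy tokeniser: splits the input into identifiers, "-" and
// "--", always preferring "--" whenever two minus signs are adjacent.
std::vector<std::string> greedy_tokens(const std::string& src) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < src.size()) {
        char ch = src[i];
        if (ch == ' ') {
            ++i;  // white-space only separates tokens
        } else if (ch == '-') {
            if (i + 1 < src.size() && src[i + 1] == '-') {
                out.push_back("--");  // longest match wins
                i += 2;
            } else {
                out.push_back("-");
                i += 1;
            }
        } else {
            // Any other run of characters is treated as one identifier.
            std::size_t start = i;
            while (i < src.size() && src[i] != ' ' && src[i] != '-') ++i;
            out.push_back(src.substr(start, i - start));
        }
    }
    return out;
}
```

Under these assumptions, "a--b" comes out as the three tokens a, --, b (so the parser sees a postfix decrement followed directly by an identifier), while "a- -b" comes out as a, -, -, b because the space stops the two minus signs from being merged.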
I expect the compiler-writers have their hands tied by the formal language specification. But, for a new language like egg, I don't have any such restrictions.
I decided that prefix and postfix increments/decrements as expressions are bad things. This is mainly due to problems associated with side-effects and evaluation ordering. Consider:
int a = p[++i] + p[i++]; // Not allowed
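To see why the ordering matters, here is a rough sketch that spells out the two obvious evaluation orders by hand. In C++ the original expression modifies "i" twice without sequencing, so the sketch deliberately avoids writing it directly; the function names `left_first` and `right_first` are invented for illustration.

```cpp
// Hypothetical: evaluate the left operand p[++i] before the right one.
int left_first(const int* p, int i) {
    int lhs = p[++i];  // i becomes i + 1, read p[i + 1]
    int rhs = p[i++];  // read p[i + 1] again, then bump i
    return lhs + rhs;
}

// Hypothetical: evaluate the right operand p[i++] before the left one.
int right_first(const int* p, int i) {
    int rhs = p[i++];  // read p[i], then bump i
    int lhs = p[++i];  // i becomes i + 2, read p[i + 2]
    return lhs + rhs;
}
```

With an array holding 1, 10, 100 and a starting index of zero, the first ordering adds 10 and 10 while the second adds 1 and 100, so the "same" expression can mean two different things.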
However, I think I will retain the prefix increment/decrement statements:
++i; // Allowed
--i; // Allowed
i++; // Not allowed
i--; // Not allowed
This permits the idiomatic counter-based loop:
for (i = 0; i < count; ++i) { ... }
The reasons for only allowing the prefix versions are two-fold:
- It makes the language specification much less ambiguous; and
- People still harp on about prefix increments/decrements being slightly faster than their postfix variants, which is why they are "preferred" for looping.
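For what it's worth, the "faster" folklore comes from C++ user-defined types, where the postfix form conventionally has to return a copy of the old value. The sketch below uses a made-up `Counter` type (with a `copies` field added purely to make the extra copy visible); for plain integers, compilers typically generate the same code for both forms when the result is unused.

```cpp
// Hypothetical counter type, used only to illustrate the two operator forms.
struct Counter {
    int value = 0;
    int copies = 0;  // how many times postfix had to copy the old state

    // Prefix: modify in place and return a reference to this object.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Postfix: remember the old state, modify, then return the old copy.
    Counter operator++(int) {
        Counter old = *this;
        ++value;
        ++copies;
        return old;
    }
};
```

After one prefix and one postfix increment on a fresh `Counter`, "value" is 2 and "copies" is 1: only the postfix form paid for a temporary.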
Whilst I was at it, I also decided I can probably do without the unary '+' operator. That gets rid of the truly vexatious:
c = a+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+b;