Consider the context-free grammar:
S -> S S + | S S * | a
aa+a*
can be generated by this grammar.S
-> S
S -> S
S + S -> a S
+ S -> a a + S
-> a a + a *What language is generated by the following grammars? In each case justify your answer.
Which of the grammars in Exercise 2.2.2 are ambiguous?
Yes
Yes
Yes
Construct unambiguous context-free grammars for each of the following languages. In each case show that your grammar is correct.
1. E -> E E op | num
2. list -> list , id | id
3. list -> id , list | id
4. expr -> expr + term | expr - term | term
term -> term * factor | term / factor | factor
factor -> id | num | (expr)
5. expr -> expr + term | expr - term | term
term -> term * unary | term / unary | unary
unary -> + factor | - factor | factor
factor - > id | num | (expr)
Show that all binary strings generated by the following grammar have values divisible by 3. Hint. Use induction on the number of nodes in a parse tree.
num -> 11 | 1001 | num 0 | num num
Does the grammar generate all binary strings with values divisible by 3?
Proof
Any string derived from the grammar can be considered to be a sequence consisting of 11 and 1001, where each sequence element is possibly suffixed with a 0.
Let n
be the set of positions where 11
is placed. 11
is said to be at position i
if the first 1
in 11
is at position i
, where i
starts at 0 and
grows from least significant to most significant bit.
Let m
be the equivalent set for 1001
.
The sum of any string produced by the grammar is:
sum
= Σn (21 + 20) 2 n + Σm (23 + 20) 2m
= Σn 3 2 n + Σm 9 2m
This is clearly divisible by 3.
No. Consider the string "10101", which is divisible by 3, but cannot be derived from the grammar.
Readers seeking a more formal proof can read about it below:
Proof:
Every number divisible by 3 can be written in the form 3k
. We will consider k > 0
(though it would be valid to consider k
to be an arbitrary integer).
Note that every part of num(11, 1001 and 0) is divisible by 3, if the grammar could generate all the numbers divisible by 3, we can get a production for binary k from num's production:
3k = num -> 11 | 1001 | num 0 | num num
k = num/3 -> 01 | 0011 | k 0 | k k
k -> 01 | 0011 | k 0 | k k
It is obvious that any value of k
that has more than 2 consecutive bits set to 1 can never be produced. This can be confirmed by the example given in the beginning:
10101 is 3*7, hence, k = 7 = 111 in binary. Because 111 has more than 2 consecutive 1's in binary, the grammar will never produce 21.
Construct a context-free grammar for roman numerals.
Note: we just consider a subset of roman numerals which is less than 4k.
via wikipedia, we can categorize the single roman numerals into 4 groups:
I, II, III | I V | V, V I, V II, V III | I X
then get the production:
digit -> smallDigit | I V | V smallDigit | I X
smallDigit -> I | II | III | ε
and we can find a simple way to map roman to arabic numerals. For example:
via the upper two rules, we can derive the production:
romanNum -> thousand hundred ten digit
thousand -> M | MM | MMM | ε
hundred -> smallHundred | C D | D smallHundred | C M
smallHundred -> C | CC | CCC | ε
ten -> smallTen | X L | L smallTen | X C
smallTen -> X | XX | XXX | ε
digit -> smallDigit | I V | V smallDigit | I X
smallDigit -> I | II | III | ε