Introduction
1. 1.Introduction
2. 2.A Simple Syntax-Directed Translator
3. 3.Lexical Analysis
4. 4.Syntax Analysis
5. 5.Syntax-Directed Translation
6. 6.Intermediate-Code Generation
7. 7.Run-Time Environments
8. 8.Code Generation
9. 12.InterProcedural Analysis
- 9.1. Exercises for Section 12.3
- 9.2. Exercises for Section 12.3
Published with GitBook

Compilers Principles, Techniques, & Tools (purple dragon book) second edition exercise answers

Exercises for Section 2.2

2.2.1

Consider the context-free grammar:

S -> S S + | S S * | a

Show how the string aa+a* can be generated by this grammar.
Construct a parse tree for this string.
What language does this grammar generate? Justify your answer.

Answer

S -> S S -> S S + S -> a S + S -> a a + S -> a a + a *
L = {Postfix expression consisting of digits, plus and multiple signs}

2.2.2

What language is generated by the following grammars? In each case justify your answer.

S -> 0 S 1 | 0 1
S -> + S S | - S S | a
S -> S ( S ) S | ε
S -> a S b S | b S a S | ε
S -> a | S + S | S S | S * | ( S )

Answer

L = {0ⁿ1ⁿ | n>=1}
L = {Prefix expression consisting of plus and minus signs}
L = {Matched brackets of arbitrary arrangement and nesting, includes ε}
L = {String has the same amount of a and b, includes ε}
L = {Regular expressions used to describe regular languages} refer to wiki

2.2.3

Which of the grammars in Exercise 2.2.2 are ambiguous?

Answer

2.2.4

Construct unambiguous context-free grammars for each of the following languages. In each case show that your grammar is correct.

Arithmetic expressions in postfix notation.
Left-associative lists of identifiers separated by commas.
Right-associative lists of identifiers separated by commas.
Arithmetic expressions of integers and identifiers with the four binary operators +, -, *, /.
Add unary plus and minus to the arithmetic operators of 4.

Answer

1. E -> E E op | num

2. list -> list , id | id

3. list -> id , list | id

4. expr -> expr + term | expr - term | term
   term -> term * factor | term / factor | factor
   factor -> id | num | (expr)

5. expr -> expr + term | expr - term | term
   term -> term * unary | term / unary | unary
   unary -> + factor | - factor | factor
   factor - > id | num | (expr)

2.2.5

Show that all binary strings generated by the following grammar have values divisible by 3. Hint. Use induction on the number of nodes in a parse tree.

num -> 11 | 1001 | num 0 | num num
Does the grammar generate all binary strings with values divisible by 3?

Answer

Proof

Any string derived from the grammar can be considered to be a sequence consisting of 11 and 1001, where each sequence element is possibly suffixed with a 0.

Let n be the set of positions where 11 is placed. 11 is said to be at position i if the first 1 in 11 is at position i, where i starts at 0 and grows from least significant to most significant bit.

Let m be the equivalent set for 1001.

The sum of any string produced by the grammar is:

sum

= Σ_n (2¹ + 2⁰) 2 ⁿ + Σ_m (2³ + 2⁰) 2^m

= Σ_n 3 2 ⁿ + Σ_m 9 2^m

This is clearly divisible by 3.

No. Consider the string "10101", which is divisible by 3, but cannot be derived from the grammar.

Readers seeking a more formal proof can read about it below:

Proof:

Every number divisible by 3 can be written in the form 3k. We will consider k > 0 (though it would be valid to consider k to be an arbitrary integer).

Note that every part of num(11, 1001 and 0) is divisible by 3, if the grammar could generate all the numbers divisible by 3, we can get a production for binary k from num's production:
```
 3k = num   -> 11 | 1001 | num 0 | num num
  k = num/3 -> 01 | 0011 | k 0   | k k
  k         -> 01 | 0011 | k 0   | k k
```
It is obvious that any value of k that has more than 2 consecutive bits set to 1 can never be produced. This can be confirmed by the example given in the beginning:

10101 is 3*7, hence, k = 7 = 111 in binary. Because 111 has more than 2 consecutive 1's in binary, the grammar will never produce 21.

2.2.6

Construct a context-free grammar for roman numerals.

Note: we just consider a subset of roman numerals which is less than 4k.

Answer

wikipedia: Roman_numerals

via wikipedia, we can categorize the single roman numerals into 4 groups:

  I, II, III | I V | V, V I, V II, V III | I X

then get the production:

  digit -> smallDigit | I V | V smallDigit | I X
  smallDigit -> I | II | III | ε

and we can find a simple way to map roman to arabic numerals. For example:
- XII => X, II => 10 + 2 => 12
- CXCIX => C, XC, IX => 100 + 90 + 9 => 199
- MDCCCLXXX => M, DCCC, LXXX => 1000 + 800 + 80 => 1880
via the upper two rules, we can derive the production:

romanNum -> thousand hundred ten digit

thousand -> M | MM | MMM | ε

hundred -> smallHundred | C D | D smallHundred | C M

smallHundred -> C | CC | CCC | ε

ten -> smallTen | X L | L smallTen | X C

smallTen -> X | XX | XXX | ε

digit -> smallDigit | I V | V smallDigit | I X

smallDigit -> I | II | III | ε