next up previous
Next: Left factoring Up: Context-free grammars Previous: Elimination of ambiguity.

Elimination of left recursion

LEFT RECURSION. Let G be a context-free grammar. A production of G is said left recursive if it has the form

A  $\displaystyle \longmapsto$  A$\displaystyle \alpha$ (20)

where A is a nonterminal and $ \alpha$ is a string of grammar symbols. A nonterminal A of G is said left recursive if there exists a string of grammar symbols $ \alpha$ such that we have

A  $\displaystyle \;\stackrel{{{\ast}}}{{\Rightarrow}}\;$  A$\displaystyle \alpha$ (21)

Such derivation is also said left recursive. The grammar G is left recursive if it has at least one left recursive nonterminal.

Remark 4   Top-down parsing is one of the methods that we will study for generating parse trees. This method cannot handle left recursive grammars. We present now an algorithm for transforming a left recursive grammar G into a grammar G' which is not left recursive and which generates the same language as G.

THE BASIC TRICK is to replace the production

A  $\displaystyle \longmapsto$  A$\displaystyle \alpha$  |  $\displaystyle \beta$    by    $\displaystyle \left\{\vphantom{ \begin{array}{lll} A & \longmapsto & {\beta} A'...
... A' & \longmapsto & {\alpha} A' \ \mid \ {\varepsilon} \\  \end{array} }\right.$$\displaystyle \begin{array}{lll} A & \longmapsto & {\beta} A' \\  A' & \longmapsto & {\alpha} A' \ \mid \ {\varepsilon} \\  \end{array}$ (22)

where $ \alpha$ $ \neq$ $ \varepsilon$ is assumed. Observe that this trick eliminates left recursive productions. Applying this trick is not enough to remove all derivations of the form A $ \;\stackrel{{{\ast}}}{{\Rightarrow}}\;$ A$ \alpha$. However this trick is sufficient for many grammars.

Example 10   The following grammar which generates arithmetic expressions

E $\displaystyle \longmapsto$ E + T  |  T
T $\displaystyle \longmapsto$ T*F  |  F
F $\displaystyle \longmapsto$ (E)  |  $\displaystyle \bf id$
(23)

has two left recursive productions. Applying the above trick leads to

E $\displaystyle \longmapsto$ TE'
E' $\displaystyle \longmapsto$ + TE'  |  $\displaystyle \varepsilon$
T $\displaystyle \longmapsto$ FT'
T' $\displaystyle \longmapsto$ *FT'  |  $\displaystyle \varepsilon$
F $\displaystyle \longmapsto$ (E)  |  $\displaystyle \bf id$
(24)

THE CASE OF SEVERAL LEFT RECURSIVE A-PRODUCTIONS. Assume that the set of all A-productions has the form

A  $\displaystyle \longmapsto$  A$\displaystyle \alpha_{{1}}^{{}}$  |  A$\displaystyle \alpha_{{2}}^{{}}$  |   ... A$\displaystyle \alpha_{{m}}^{{}}$  |  $\displaystyle \beta_{{1}}^{{}}$  |  $\displaystyle \beta_{{2}}^{{}}$  |   ...  $\displaystyle \beta_{{n}}^{{}}$ (25)

where no $ \beta_{{i}}^{{}}$ begins with an A and where $ \alpha_{{i}}^{{}}$ $ \neq$ $ \varepsilon$ holds for every i = 1 ... m. Then we repalce these A-productions by

A $\displaystyle \longmapsto$ $\displaystyle \beta_{{1}}^{{}}$A'  |  $\displaystyle \beta_{{2}}^{{}}$A'  |   ...  $\displaystyle \beta_{{n}}^{{}}$A'
A' $\displaystyle \longmapsto$ $\displaystyle \alpha_{{1}}^{{}}$A'  |  $\displaystyle \alpha_{{2}}^{{}}$A'  |   ...  $\displaystyle \alpha_{{m}}^{{}}$A'  |  $\displaystyle \varepsilon$
(26)

Remark 5   Let us consider the following grammar.

S $\displaystyle \longmapsto$ Aa  |  b
A $\displaystyle \longmapsto$ Ac  |   Sd  |  $\displaystyle \varepsilon$
(27)

The nonterminal S is left recursive since we have

S  $\displaystyle \Rightarrow$  Aa  $\displaystyle \Rightarrow$  Sda (28)

However there is no left recursive S-productions. We show now how to deal with these left recursive derivations.

REMOVING ALL LEFT RECURSIVE DERIVATIONS. Let us assume that non-terminals are numbered: A1, A2,..., An. The goal is to remove any possible circuit for all 1 $ \leq$ j < i $ \leq$ n

Ai$\displaystyle \;\stackrel{{{\ast}}}{{\Rightarrow}}\;$Aj$\displaystyle \delta_{{j}}^{{}}$$\displaystyle \;\stackrel{{{\ast}}}{{\Rightarrow}}\;$Ai$\displaystyle \delta_{{i}}^{{}}$. (29)

The idea is for i = 1 ... n The method in more detail: Observations:

Example 11   Consider the following grammar.

A3 $\displaystyle \longmapsto$ A2A1
A2 $\displaystyle \longmapsto$ $\displaystyle \varepsilon$
A1 $\displaystyle \longmapsto$ A2A1
(33)

Applying the previous algorithm for removing all possible left-recursive derivations fail on this grammar, because of the $ \varepsilon$-production.

Example 12   Consider again the following grammar.

S $\displaystyle \longmapsto$ Aa  |  b
A $\displaystyle \longmapsto$ Ac  |  Sd  |  $\displaystyle \varepsilon$
(34)

We order the nonterminals: S < A. There is no left recursive S-productions. So the next step is to remove S from the right-hand side of the A-productions of the form A $ \longmapsto$ S$ \alpha$. Hence we obtain

S $\displaystyle \longmapsto$ Aa  |  b
A $\displaystyle \longmapsto$ Ac  |  Aad  |  bd  |  $\displaystyle \varepsilon$
(35)

Then we remove the left recursive A-productions.

S $\displaystyle \longmapsto$ Aa  |  b
A $\displaystyle \longmapsto$ bdA'  |  A'
A' $\displaystyle \longmapsto$ cA'  |  adA'  |  $\displaystyle \varepsilon$
(36)


next up previous
Next: Left factoring Up: Context-free grammars Previous: Elimination of ambiguity.
Marc Moreno Maza
2004-12-02