
Non-recursive implementation of predictive parsing

THE IDEA. Predictive parsing can be performed using a pushdown stack, avoiding recursive calls.

Figure 8 shows the structure of non-recursive predictive parsers. For each nonterminal A and each token a, the entry M[A, a] of the parsing table contains either an A-production generating sentences starting with a, or an error entry.
Figure 8: The structure of non-recursive predictive parsers (input buffer, stack, parsing table M and the parsing program).

Example 17   Consider the grammar G given by:
    S → aAa | BAa | ε
    A → cA | bA | ε
    B → b
The parsing table is:
          a          b          c          $
    S     S → aAa    S → BAa               S → ε
    A     A → ε      A → bA     A → cA
    B                B → b
where the empty slots correspond to error entries.
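As a quick check of some entries: M[S, b] = S → BAa because b ∈ FIRST(BAa) = FIRST(B) = {b}, and M[A, c] = A → cA because c ∈ FIRST(cA) = {c}; the ε-productions are placed under the tokens that may follow S and A in a sentential form, namely $ and a respectively.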
Consider the parsing of the word w = bcba:
    Stack     Remaining input     Action
    $S        bcba$               choose S → BAa
    $aAB      bcba$               choose B → b
    $aAb      bcba$               match b
    $aA       cba$                choose A → cA
    $aAc      cba$                match c
    $aA       ba$                 choose A → bA
    $aAb      ba$                 match b
    $aA       a$                  choose A → ε
    $a        a$                  match a
    $         $                   accept


THE ALGORITHM.

Algorithm 5  

    Input:  a word w and the parsing table M of a grammar G.
    Output: a leftmost derivation of w if w ∈ L(G), an error otherwise.

    Initial configuration: the stack contains $S (with the start symbol S on top)
    and the input buffer contains w$.
    repeat
        let X be the top-of-stack symbol and a the current input token;
        if X is a terminal or $ then
            if X = a then
                pop X and advance the input pointer
            else
                error
        else    (X is a nonterminal)
            if M[X, a] is a production X → Y1 Y2 ... Yk then
                pop X; push Yk, ..., Y2, Y1 (so that Y1 is on top);
                output the production X → Y1 Y2 ... Yk
            else
                error
    until X = $
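To make the driver concrete, here is a minimal Python sketch of Algorithm 5 running on the table of Example 17. The encoding of the table as a dictionary and the names TABLE, NONTERMINALS and parse are conventions of this sketch only; ε-productions are represented by the empty string.

# Parsing table of Example 17: TABLE[(nonterminal, token)] = right-hand side.
# The empty string "" stands for an epsilon-production.
TABLE = {
    ("S", "a"): "aAa", ("S", "b"): "BAa", ("S", "$"): "",
    ("A", "a"): "",    ("A", "b"): "bA",  ("A", "c"): "cA",
    ("B", "b"): "b",
}
NONTERMINALS = {"S", "A", "B"}

def parse(word, start="S"):
    """Non-recursive predictive parsing (Algorithm 5): True iff word is accepted."""
    tokens = list(word) + ["$"]          # input followed by the end marker
    stack = ["$", start]                 # stack grows to the right; start symbol on top
    i = 0                                # index of the current input token
    while True:
        X, a = stack[-1], tokens[i]
        if X == "$":                     # bottom marker reached
            return a == "$"              # accept iff the input is exhausted as well
        if X not in NONTERMINALS:        # X is a terminal: it must match the input
            if X != a:
                return False             # T-conflict: error
            stack.pop()
            i += 1
        else:                            # X is a nonterminal: consult the table
            rhs = TABLE.get((X, a))
            if rhs is None:
                return False             # N-conflict: error entry
            stack.pop()
            stack.extend(reversed(rhs))  # push Y_k ... Y_1 so that Y_1 ends up on top

print(parse("bcba"))   # True, as in the trace above
print(parse("bcb"))    # False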

COMPUTING THE FIRST SETS. Recall that for any string α of grammar symbols, the set FIRST(α) satisfies the following conditions, for every terminal a and every string of symbols β:
    a ∈ FIRST(α) if and only if α ⇒* aβ for some β,
    ε ∈ FIRST(α) if and only if α ⇒* ε.

For a symbol X ∈ VT ∪ VN, the set FIRST(X) can be computed as follows.

Algorithm 6  

    Input:  a symbol X ∈ VT ∪ VN of a grammar G = (VT, VN, S, P).
    Output: FIRST(X).

    if X is a terminal then
        FIRST(X) := {X}
    else
        FIRST(X) := ∅
        for each production X → Y1 Y2 ... Yk do
            for i := 1, ..., k do
                if ε ∈ FIRST(Y1) ∩ ... ∩ FIRST(Yi-1) then     (this holds vacuously for i = 1)
                    FIRST(X) := FIRST(X) ∪ (FIRST(Yi) \ {ε})
            if ε ∈ FIRST(Yj) for every j = 1, ..., k then
                FIRST(X) := FIRST(X) ∪ {ε}
    if X → ε is a production then
        FIRST(X) := FIRST(X) ∪ {ε}

A comment about the computation of FIRST(X) with Algorithm 6: the rules refer to the FIRST sets of the symbols Yi appearing in the right-hand sides of the X-productions, so when nonterminals depend on each other the FIRST sets of all symbols are computed together, reapplying the rules until none of the sets changes.

Algorithm 7  

    Input:  a string X = X1 X2 ... Xn of grammar symbols, together with the sets FIRST(Xi).
    Output: FIRST(X).

    FIRST(X) := ∅
    for i := 1, ..., n do
        if ε ∈ FIRST(X1) ∩ ... ∩ FIRST(Xi-1) then     (this holds vacuously for i = 1)
            FIRST(X) := FIRST(X) ∪ (FIRST(Xi) \ {ε})
    if ε ∈ FIRST(Xj) for every j = 1, ..., n then
        FIRST(X) := FIRST(X) ∪ {ε}

The principle of Algorithm 7 is similar to that of Algorithm 6.
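The computation of the FIRST sets can be sketched in Python as follows. Instead of the per-symbol recursion of Algorithm 6, first_sets obtains FIRST(X) for all symbols at once by a fixed-point iteration (the usual way to handle mutually dependent nonterminals), and first_of_string follows Algorithm 7. The grammar encoding (a dictionary mapping each nonterminal to its right-hand sides, written as strings of one-character symbols) and the function names are choices made for this sketch, not notation from the notes.

EPS = "ε"   # stands for the empty word

def first_of_string(alpha, first):
    """Algorithm 7 (sketch): FIRST of a string alpha of grammar symbols."""
    result = set()
    for x in alpha:
        result |= first[x] - {EPS}
        if EPS not in first[x]:     # x cannot vanish: stop here
            return result
    result.add(EPS)                 # every symbol of alpha can derive ε (or alpha is empty)
    return result

def first_sets(grammar):
    """FIRST(X) for every grammar symbol X, by fixed-point iteration."""
    symbols = {x for rhss in grammar.values() for rhs in rhss for x in rhs} | set(grammar)
    first = {x: ({x} if x not in grammar else set()) for x in symbols}   # FIRST(a) = {a} for terminals
    changed = True
    while changed:                  # repeat until no FIRST set grows any more
        changed = False
        for A, rhss in grammar.items():
            for rhs in rhss:
                new = first_of_string(rhs, first)   # the empty rhs "" is an ε-production
                if not new <= first[A]:
                    first[A] |= new
                    changed = True
    return first

# Grammar of Example 17: S → aAa | BAa | ε ,  A → cA | bA | ε ,  B → b
G17 = {"S": ["aAa", "BAa", ""], "A": ["cA", "bA", ""], "B": ["b"]}
FIRST = first_sets(G17)
print(sorted(FIRST["S"]))                    # ['a', 'b', 'ε']
print(sorted(first_of_string("Aa", FIRST)))  # ['a', 'b', 'c']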


THE FOLLOW SETS. In order to give an algorithm for building the parsing table, we define, for every A ∈ VN,

    FOLLOW(A) = { a ∈ VT ∪ {$}  |  S$ ⇒* αAar  with  α ∈ (VT ∪ VN)*  and  r ∈ (VT ∪ VN)*{$} ∪ {ε} }        (42)

In other words FOLLOW(A) is the set of the terminals that can appear immediately to the right of the nonterminal A in some sentential form. Moreover $ belongs to FOLLOW(A) if A is the rightmost symbol in some sentential form.


COMPUTING THE FOLLOW SETS.

Algorithm 8  

    Input:  a grammar G = (VT, VN, S, P) together with the FIRST sets of its symbols.
    Output: FOLLOW(A) for every A ∈ VN.

    FOLLOW(S) := {$};  FOLLOW(A) := ∅ for every other A ∈ VN
    repeat
        for each production A → X1 X2 ... Xn and each i such that Xi ∈ VN do
            FOLLOW(Xi) := FOLLOW(Xi) ∪ (FIRST(Xi+1 ... Xn) \ {ε});
            if i = n or ε ∈ FIRST(Xi+1 ... Xn) then
                FOLLOW(Xi) := FOLLOW(Xi) ∪ FOLLOW(A)
        if no FOLLOW set was modified during this pass then
            done := true
    until done

The FOLLOW sets of all nonterminal symbols are computed together by the following process: start from FOLLOW(S) = {$} and FOLLOW(A) = ∅ for every other nonterminal A, then repeatedly apply the rules below until none of the sets changes:
    if A → αBβ, add FIRST(β) \ {ε} to FOLLOW(B);
    if A → αB, or A → αBβ with β ⇒* ε, add FOLLOW(A) to FOLLOW(B).

Example 18   Consider the following grammar (with terminals a, b, c and nonterminals S, A):
    S → aAa | bAa | ε
    A → cA | bA | ε
The FIRST sets of the right-hand sides of the productions are given by
    FIRST(aAa) = {a}
    FIRST(bAa) = {b}
    FIRST(cA) = {c}
    FIRST(bA) = {b}
    FIRST(ε) = {ε}
The FOLLOW sets of the left-hand sides of the productions are given by
    FOLLOW(S) = {$}
    FOLLOW(A) = {a}
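Continuing the previous sketch (it reuses EPS, first_sets and first_of_string defined there), here is a possible rendering of Algorithm 8 in Python; applied to the grammar of Example 18 it reproduces the FOLLOW sets above.

END = "$"

def follow_sets(grammar, start, first):
    """Algorithm 8 (sketch): FOLLOW(A) for every nonterminal A, by fixed-point iteration."""
    follow = {A: set() for A in grammar}
    follow[start].add(END)                      # $ follows the start symbol
    changed = True
    while changed:                              # repeat until no FOLLOW set grows
        changed = False
        for A, rhss in grammar.items():
            for rhs in rhss:
                for i, B in enumerate(rhs):
                    if B not in grammar:        # only nonterminals have FOLLOW sets
                        continue
                    new = first_of_string(rhs[i + 1:], first)
                    add = new - {EPS}           # rule: FIRST(β) \ {ε} ⊆ FOLLOW(B)
                    if EPS in new:              # β empty or β ⇒* ε:
                        add |= follow[A]        #   FOLLOW(A) ⊆ FOLLOW(B)
                    if not add <= follow[B]:
                        follow[B] |= add
                        changed = True
    return follow

# Grammar of Example 18: S → aAa | bAa | ε ,  A → cA | bA | ε
G18 = {"S": ["aAa", "bAa", ""], "A": ["cA", "bA", ""]}
F18 = first_sets(G18)
print(follow_sets(G18, "S", F18))   # {'S': {'$'}, 'A': {'a'}}, as in Example 18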


COMPUTING THE PARSING TABLE.

Algorithm 9  

    Input:  a grammar G = (VT, VN, S, P) together with the FIRST and FOLLOW sets of its symbols.
    Output: the parsing table M of G.

    (1) for each production A → α in P and each a ∈ FIRST(α) \ {ε} do
            M[A, a] := A → α
    (2) for each production A → α in P with ε ∈ FIRST(α) and each b ∈ FOLLOW(A) do
            M[A, b] := A → α        (this includes b = $ when $ ∈ FOLLOW(A))
    (3) for each remaining pair (A, a) with A ∈ VN and a ∈ VT ∪ {$} do
            M[A, a] := error

Algorithm 9 consists of three main steps: step (1) fills the entries prescribed by FIRST(α) for each production A → α; step (2) handles the case ε ∈ FIRST(α) using FOLLOW(A); step (3) marks every remaining entry as an error.
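Continuing the same sketch (it reuses EPS, first_of_string, first_sets and follow_sets from above), one possible rendering of Algorithm 9 in Python; as a sanity check it rebuilds the table of Example 17. The name build_table is again a convention of this sketch.

def build_table(grammar, start):
    """Algorithm 9 (sketch): LL(1) parsing table M[A, a]; absent entries are errors."""
    first = first_sets(grammar)
    follow = follow_sets(grammar, start, first)
    table = {}
    for A, rhss in grammar.items():
        for rhs in rhss:
            fs = first_of_string(rhs, first)
            for a in fs - {EPS}:                 # step (1): a ∈ FIRST(α) gives M[A, a] = A → α
                table[(A, a)] = rhs
            if EPS in fs:                        # step (2): ε ∈ FIRST(α) uses FOLLOW(A)
                for b in follow[A]:              #           (including $ when $ ∈ FOLLOW(A))
                    table[(A, b)] = rhs
    return table                                 # step (3): every missing entry is an error
    # (an entry assigned twice here would reveal that the grammar is not LL(1))

# Rebuilding the table of Example 17: S → aAa | BAa | ε ,  A → cA | bA | ε ,  B → b
G17 = {"S": ["aAa", "BAa", ""], "A": ["cA", "bA", ""], "B": ["b"]}
for (A, a), rhs in sorted(build_table(G17, "S").items()):
    print(f"M[{A}, {a}] = {A} → {rhs or 'ε'}")   # prints the seven non-error entries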

Example 19   Consider the following grammar (with terminals +, *, (, ), id and nonterminals E, E', T, T', F):
    E → TE'
    E' → +TE' | ε
    T → FT'
    T' → *FT' | ε
    F → (E) | id
The FIRST sets of the nonterminals and of the right-hand sides of the productions are given by
    FIRST(F) = {(, id}
    FIRST(T) = {(, id}
    FIRST(T') = {*, ε}
    FIRST(E) = {(, id}
    FIRST(E') = {+, ε}
    FIRST(TE') = {(, id}
    FIRST(FT') = {(, id}
    FIRST((E)) = {(}
    FIRST(+TE') = {+}
    FIRST(*FT') = {*}
The FOLLOW sets of the left-hand sides of the productions are given by
    FOLLOW(E) = {), $}
    FOLLOW(E') = {), $}
    FOLLOW(T) = {+, ), $}
    FOLLOW(T') = {+, ), $}
    FOLLOW(F) = {+, *, ), $}
          id          +            *            (           )           $
    E     E → TE'                               E → TE'
    E'                E' → +TE'                             E' → ε      E' → ε
    T     T → FT'                               T → FT'
    T'                T' → ε       T' → *FT'                T' → ε      T' → ε
    F     F → id                                F → (E)
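Under the same conventions, the expression grammar of Example 19 can be fed to the build_table sketch given after Algorithm 9; right-hand sides are written as lists here because id, E' and T' are multi-character symbols. The entries it prints agree with the table above.

# Expression grammar of Example 19; "id" is a single token here.
G19 = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
for (A, a), rhs in sorted(build_table(G19, "E").items()):
    print(f"M[{A}, {a}] = {A} → {' '.join(rhs) or 'ε'}")   # 13 non-error entries, as above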

Example 20   Consider the following grammar (with terminals a, b, e, i, t and nonterminals S, S', E):
    S → iEtSS' | a
    S' → eS | ε
    E → b
The FIRST sets of the right-hand sides of the productions are given by
    FIRST(iEtSS') = {i}
    FIRST(a) = {a}
    FIRST(eS) = {e}
    FIRST(ε) = {ε}
    FIRST(b) = {b}
Let's sketch the computation of the FOLLOW sets. Recall that the rules are:

    if A → αBβ then FOLLOW(B) := FOLLOW(B) ∪ (FIRST(β) \ {ε})
    if A → αB then FOLLOW(B) := FOLLOW(B) ∪ FOLLOW(A)
    if A → αBβ and β ⇒* ε then FOLLOW(B) := FOLLOW(B) ∪ FOLLOW(A)
                                                                        (43)

Applying these rules to the production S → iEtSS' in two passes (the second pass is needed because FOLLOW(S') is obtained from FOLLOW(S), which grows during the first pass) gives:
                   Initialization    S → iEtSS' (first pass)    S → iEtSS' (second pass)
    FOLLOW(S)      {$}               {e, $}                     {e, $}
    FOLLOW(S')     ∅                 ∅                          {e, $}
    FOLLOW(E)      ∅                 {t}                        {t}
This leads to the following parsing table.
          a          b          e             i               t      $
    S     S → a                               S → iEtSS'
    S'                          S' → ε                               S' → ε
                                S' → eS
    E                E → b
The entry M[S', e] contains both S' → ε and S' → eS. We know that this grammar is ambiguous, and the ambiguity shows up in this choice of productions when an e (else) is seen. In practice the conflict can be resolved by always choosing S' → eS, which associates each e (else) with the closest unmatched t (then).


ERROR RECOVERY IN PREDICTIVE PARSING. A predictive parser attempts to match the nonterminals and the terminals in the stack with the remaining input. Therefore two types of conflicts can occur.

T-conflict.
A terminal appearing on top of the stack does not match the following input token.
N-conflict.
For a nonterminal B on top of the stack and the lookahead token b the entry M[B, b] of the parsing table is empty.
Panic-mode recovery is based on the idea of skipping symbols on the input string until a token in a selected set of synchronizing tokens appears. These synchronizing sets should be chosen such that the parser recovers quickly from errors that are likely to occur in practice. Here are some strategies for the above conflict cases.
T-conflict.
Skip (= ignore and advance) the token in the input string. Hence the synchronizing set consists here of all other tokens.
N-conflict.
Possible solutions.
  1. Skip (= ignore and advance) the token b in the input string.
  2. If M[B, b] is a blank entry labeled synch then skip (= pop and ignore) the nonterminal B. (Strictly speaking and according to the above definition, this is not a panic-mode recovery, but quite close in the spirit.)
  3. Skip tokens from the input string until an element of FIRST(B) is reached, then continue parsing normally. So the synchronizing set here is FIRST(B).
  4. Skip tokens from the input string until an element of FOLLOW(B) is reached, then skip B and continue parsing normally. (Again this is a variation of panic-mode recovery; a sketch of this strategy is given after the list below.)
  5. Delimiters such as ; in C can be added to the two previous synchronizing sets.
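To illustrate the T-conflict strategy and strategy 4 for N-conflicts, here is one way the driver sketch of Algorithm 5 could be extended with panic-mode recovery, using FOLLOW(B) as the synchronizing set. This is only a sketch under the conventions of the earlier code fragments (table entries indexed as M[(A, a)], FOLLOW sets as a dictionary), not the recovery scheme of any particular compiler.

def parse_with_recovery(word, table, follow, nonterminals, start):
    """Predictive parsing with panic-mode recovery (sketch): True iff no error occurred."""
    tokens = list(word) + ["$"]
    stack = ["$", start]
    i, errors = 0, 0
    while stack[-1] != "$":
        X, a = stack[-1], tokens[i]
        if X not in nonterminals:                 # terminal on top of the stack
            if X == a:
                stack.pop()
                i += 1
            else:                                 # T-conflict: skip the offending input token
                errors += 1
                if a == "$":
                    stack.pop()                   # input exhausted: drop X instead of looping
                else:
                    i += 1
        else:
            rhs = table.get((X, a))
            if rhs is not None:
                stack.pop()
                stack.extend(reversed(rhs))
            else:                                 # N-conflict: panic mode with FOLLOW(X)
                errors += 1
                while tokens[i] not in follow[X] and tokens[i] != "$":
                    i += 1                        # skip input until a synchronizing token
                stack.pop()                       # then give up on X and resume parsing
    return errors == 0 and tokens[i] == "$"

# Example with the grammar of Example 17 (TABLE and NONTERMINALS as in the first sketch):
# parse_with_recovery("bxa", TABLE, {"S": {"$"}, "A": {"a"}, "B": {"a", "b", "c"}},
#                     NONTERMINALS, "S")   # skips the unexpected x, then finishes the parse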


Marc Moreno Maza
2004-12-02