next up previous
Next: LR Parsers Up: Parsing (II) Previous: Parsing (II)

Bottom-up parsing

KEY IDEAS.

Example 21   Consider the grammar (with terminals a, b, c, d, e and nonterminals S, A, B)
S $ \longmapsto$ aABe
A $ \longmapsto$ Abc  |  b
B $ \longmapsto$ d
Consider the input word w = abbcde. We reduce w to S by repeating the following algorithm.
  1. Choose a string $ \beta$ such that
    • $ \beta$ is the leftmost factor of w such that
    • there exists a production A $ \longmapsto$ $ \beta$.
  2. Replace $ \beta$ by A in w.
Warning! This algorithm is a first approach. (True algorithms for botton-up parsing do not follow exactly this scheme.) Well, our first-approach algorithms leads to the following sequence of transformations in the first column of the table below. The second column gives the production that could be used in order to rewrite the current word backwards. The last column gives the chosen production according to the above algorithm.
Current string Qualifying productions Chosen production
abbcde A $ \longmapsto$ b
B $ \longmapsto$ d A $ \longmapsto$ b
aAbcde A $ \longmapsto$ Abc
A $ \longmapsto$ b
B $ \longmapsto$ d A $ \longmapsto$ Abc
aAde B $ \longmapsto$ d B $ \longmapsto$ d
aABe S $ \longmapsto$ aABe S $ \longmapsto$ aABe
S
Observe that we have

S $\displaystyle \;\stackrel{{{\ rm }}}{{\Longrightarrow}}\;$ aABe $\displaystyle \;\stackrel{{{\ rm }}}{{\Longrightarrow}}\;$ aAde $\displaystyle \;\stackrel{{{\ rm }}}{{\Longrightarrow}}\;$ aAbcde $\displaystyle \;\stackrel{{{\ rm }}}{{\Longrightarrow}}\;$ abbcde (44)

where $ \alpha$ $ \;\stackrel{{{\ rm }}}{{\Longrightarrow}}\;$ $ \beta$ means that $ \beta$ derives from $ \alpha$ by a rightmost derivation.


BOTTOM-UP PARSING constructs a parse tree for an input string beginning at the leaves and working up towards the root.


HANDLE. Let G = (VT, VN, S, P) be a context-free grammar. Let n be a positive integer, let w,$ \beta$ be two strings of symbols and let A be a nonterminal such that A $ \longmapsto$ $ \beta$ is a production of G. We say that (A $ \longmapsto$ $ \beta$, n) is a handle for w if there exists a string of symbols $ \gamma$ such that

S $\displaystyle \;\stackrel{{{\ rm } \ {\ast}}}{{\Longrightarrow}}\;$ $\displaystyle \alpha$A$\displaystyle \gamma$ $\displaystyle \;\stackrel{{{\ rm }}}{{\Longrightarrow}}\;$ $\displaystyle \alpha$$\displaystyle \beta$$\displaystyle \gamma$  =  w (45)

and such that n = | $ \alpha$ | + 1.

More informally, a handle of w is a substring $ \beta$ that matches a right side of a production A $ \longmapsto$ $ \beta$ such that the reduction of the handle corresponds to a step of the reverse of a rightmost derivation.

Theorem 4   Let G = (VT, VN, S, P) be a context-free grammar. If G is unambiguous then for each string of symbols w such that S $ \;\stackrel{{{\ast}}}{{\Longrightarrow}}\;$ w there exists exactly one handle.

Example 22   Consider the grammar
S $ \longmapsto$ aSb  |  $ \varepsilon$

Example 23   Consider the grammar
S $ \longmapsto$ AC  |  BC
A $ \longmapsto$ a
B $ \longmapsto$ a
C $ \longmapsto$ c
The string ac has handles (A $ \longmapsto$ a, 1) and (B $ \longmapsto$ a, 1). So the grammar is ambiguous.

Example 24   Consider the grammar
S $ \longmapsto$ AB  |  A
A $ \longmapsto$ a
B $ \longmapsto$ b
N.B. In the sentential form Ab the handle is (B $ \longmapsto$ b, 2) and not (S $ \longmapsto$ A, 1).


HANDLE PRUNING is the general approach used in shift-and-reduce parsing. The implementation of handle pruning involves the following data-structures:

- a stack
to hold the grammar symbols,
- an input buffer
containing the remaining input,
- a table
to decide handles.
Initially the situation is
Stack Input
$ w$
where w is the sentence to parse. The possible actions of a shift-and-reduce parser are:
- shift
where the next symbol is shifted onto the top of the stack
- reduce
where the parser
- accept
where the parser announces successful completion of parsing.
- error
where the parser discovers that a syntax error has occured and calls an error recovery routine.
The final configuration must be
Stack Input
$S $

Example 25   Consider the grammar G
E $ \longmapsto$ E + E
E $ \longmapsto$ E*E
E $ \longmapsto$ (E)
E $ \longmapsto$ $ \bf id$
A rigthmost derivation of the sentence w = $ \bf id$ + $ \bf id$*$ \bf id$ is

S $\displaystyle \;\stackrel{{{\ rm }}}{{\Longrightarrow}}\;$ E + E
  $\displaystyle \;\stackrel{{{\ rm }}}{{\Longrightarrow}}\;$ E + E*E
  $\displaystyle \;\stackrel{{{\ rm }}}{{\Longrightarrow}}\;$ E + E*$\displaystyle \bf id$
  $\displaystyle \;\stackrel{{{\ rm }}}{{\Longrightarrow}}\;$ E + $\displaystyle \bf id$*$\displaystyle \bf id$
  $\displaystyle \;\stackrel{{{\ rm }}}{{\Longrightarrow}}\;$ $\displaystyle \bf id$ + $\displaystyle \bf id$*$\displaystyle \bf id$
(46)

Let us sketch the actions a shift-and-reduce parser might take in parsing w to obtain the above rigthmost derivation. (Another shift-and-reduce parser might obtain another rigthmost derivation of w w.r.t. G.)
Stack Input Action to take
$ $ \bf id$ + $ \bf id$*$ \bf id$$ shift
$$ \bf id$ + $ \bf id$*$ \bf id$$ reduce with E $ \longrightarrow$ $ \bf id$
$E + $ \bf id$*$ \bf id$$ shift
$E + $ \bf id$*$ \bf id$$ shift
$E + $ \bf id$ *$ \bf id$$ reduce with E $ \longrightarrow$ $ \bf id$
$E + E *$ \bf id$$ shift
$E + E* $ \bf id$$ shift
$E + E*$ \bf id$ $ reduce with E $ \longrightarrow$ $ \bf id$
$E + E*E $ reduce with E $ \longrightarrow$ E*E
$E + E $ reduce with E $ \longrightarrow$ E + E
$E $ accept
Of course the decisions (especially for the 4th shift) made in the above table are not explained at this point.

Handle pruning can be implemented using a stack because of the following fact.

Proposition 1   Let G = (VT, VN, S, P) be an unambiguous context-free grammar. In a shift-and-reduce parser for G implemented with a stack, the right end of the handle will always appear on top the stack.

Proof. This fact becomes clear when considering the possible forms of two successive steps in any rightmost derivation. These two steps can be of the form (type 1)

S $\displaystyle \;\stackrel{{{\ rm } \ {\ast}}}{{\Longrightarrow}}\;$ $\displaystyle \alpha$ A z $\displaystyle \;\stackrel{{{\ rm } \ {\ast}}}{{\Longrightarrow}}\;$ $\displaystyle \alpha$ $\displaystyle \beta$ B y z $\displaystyle \;\stackrel{{{\ rm } \ {\ast}}}{{\Longrightarrow}}\;$ $\displaystyle \alpha$ $\displaystyle \beta$ $\displaystyle \gamma$ y z (47)

or the form (type 2)

S $\displaystyle \;\stackrel{{{\ rm } \ {\ast}}}{{\Longrightarrow}}\;$ $\displaystyle \alpha$ B x A z $\displaystyle \;\stackrel{{{\ rm } \ {\ast}}}{{\Longrightarrow}}\;$ $\displaystyle \alpha$ B x y z $\displaystyle \;\stackrel{{{\ rm } \ {\ast}}}{{\Longrightarrow}}\;$ $\displaystyle \alpha$ $\displaystyle \gamma$ x y z (48)

We proceed by induction. Let us assume that the handle B $ \longrightarrow$ $ \gamma$ appears on top of the stack. Then let us show that the handle A $ \longrightarrow$ $ \beta$ B y in type 1 (and the handle A $ \longrightarrow$ y in type 2) appears on top also. Let us consider type 1 in reverse, where a shift-and-reduce parser has just reached the following configuration.
Stack Input
$ \alpha$ $ \beta$ $ \gamma$ y z
The parser now reduces the handle.
Stack Input
$ \alpha$ $ \beta$ B y z
Since B is the rightmost nonterminal, the right end of the handle of w = $ \alpha$ $ \beta$ Byz cannot be to the left of B. Therefore the parser can shift y.
Stack Input
$ \alpha$ $ \beta$ B y z
For case 2 in reverse, assume the following configuration has been reached
Stack Input
$ \alpha$$ \gamma$ x y z
Reduction leads to
Stack Input
$ \alpha$ B x y z
Again B is the rightmost nonterminal so it needs to be involved.

$ \qedsymbol$


next up previous
Next: LR Parsers Up: Parsing (II) Previous: Parsing (II)
Marc Moreno Maza
2004-12-02