next up previous
Next: Predictive parsing Up: Parsing Previous: Parsing

Top-down parsing

The construction of the parse tree relies on the following principles.

Create a tree reduced to the start-symbol of the grammar. The current token being scanned is called the lookahead symbol. Initially, the lookahead symbol is the leftmost token of the input string.
If the leftmost leaf L of the current tree is labeled by a nonterminal A then
select an A-production A $ \longmapsto$ $ \alpha$ and
create a parse tree for A $ \longmapsto$ $ \alpha$ rooted at L.
Advance to the leftmost leaf in the tree.
Otherwise if the terminal at node L matches the lookahead symbol then
Advance to the next leftmost leaf in the tree.
Advance the lookahead symbol in the input string to the next token.
Back-track in the tree to the last nonterminal A for which several A-productions could be chosen.
Choose another A-production and peforms steps (2.1.2) and (2.1.3).
Adjust the lookahead symbol accordingly.
Until the input string is read completely or the back-tracking process fails.

Example 14   Consider the following grammar G (with terminals a, b, c, d and nonterminals S, A)
S $ \longmapsto$ cAd
A $ \longmapsto$ ab  | a
and the input string w = cad. Figures 6 and 7 show the building of a parse tree for w w.r.t. G.
Figure 6: Sketch of the top-down parsing of w = cad before conflict.
Figure 7: Sketch of the top-down parsing of w = cad: conflct, back-track and termination.

Remark 6   A left-recursive grammar can cause an infinite loop: when we try to expand the nonterminal A we may be led to expand A without having consumed any input. Fortunetely, any left-recursive grammar G can be replaced by a grammar without left-recursion and generating the same language as G.

However avoiding backtraking requires more advanced techniques to be described in the next section.

next up previous
Next: Predictive parsing Up: Parsing Previous: Parsing
Marc Moreno Maza