next up previous
Next: Syntax Trees for Expressions Up: Compiler Theory: Syntax-Directed Translation Previous: Compiler Theory: Syntax-Directed Translation

Syntax-Directed Definitions

A SYNTAX-DIRECTED DEFINITION is a context-free grammar in which

It is usual to denote the attributes of a grammar symbol in the form X.name where name is an meaningful name for the attribute.

Example 1   The following syntax-directed definition generates binary strings where (one of) the attributes gives the number represented by the string. The nonterminals are B and D, the terminals are 0 and 1. The symbol B stands for binary expansion and the symbol D stands for digit.
Grammar symbol Synthesized attribute Inherited attribute
B B.pos, B.val
D D.val D.pow
Production Semantic Rule
B $ \longmapsto$ DB1 B.pos := B1.pos + 1
B.val := B1.val + D.val
D.pow := B1.pos
B $ \longmapsto$ D B.pos := 1
B.val := D.val
D.pow := 0
D $ \longmapsto$ 0 D.val := 0
D $ \longmapsto$ 1 D.val := 2D.pow


AT A PARSE TREE NODE, following the previous definition,

Example 2   Let us continue Example 1 and let us consider the input string w = 1010. Figure 1 shows a parse tree for w.
Figure 1: Parse tree for w = 1010.
\begin{figure}\htmlimage
\centering\includegraphics[scale=.35]{annotatedParseTree2.eps}
\end{figure}
Figure 2 shows the values of the attributes at each node.
Figure 2: Annotated parse tree for w = 1010.
\begin{figure}\htmlimage
\centering\includegraphics[scale=.35]{annotatedParseTree2b.eps}
\end{figure}
Production Semantic Rule
B $ \longmapsto$ DB1 B.pos := B1.pos + 1
B.val := B1.val + D.val
D.pow := B1.pos
B $ \longmapsto$ D B.pos := 1
B.val := D.val
D.pow := 0
D $ \longmapsto$ 0 D.val := 0
D $ \longmapsto$ 1 D.val := 2D.pow


SEMANTIC RULES set up dependencies between attributes that will be represented as a dependency graph.


ATTRIBUTE GRAMMARS. It is useful to allow side effects (printing a value, updating a global variable, ...) in semantic rules. Such semantic rules are written as procedure calls or program fragments like

p(c1,..., ck) (2)

However these semantic rules can be thought of as rules defining the values of dummy attributes. An attribute grammar is a syntax-directed definition where the semantic rules cannot have side effects.


A S-ATTRIBUTE GRAMMAR is an attribute grammar where all attributes are synthesized attributes. The interest of an S-attribute grammar is that any parse tree can always be annotated by evaluating the semantic rules for the attributes at each node bottom up, from the leaves to the root.


AN ANNOTATED PARSE TREE. is a parse tree showing the values of the attributes at each node. The process of computing the attribute values at the nodes is called annotating or decorating the parse tree.

Example 3   The following syntax-directed definition is from a desk calculator. This example is of course closely related to the detailed YACC example of the previous chapter. The terminals $ \bf number$ and $ \bf n$ stand respectively for an integer number and the newline character.

Grammar symbol Synthesized attribute
E E.val $ \in$ $ \mbox{${\mathbb Z}$}$
T T.val $ \in$ $ \mbox{${\mathbb Z}$}$
F F.val $ \in$ $ \mbox{${\mathbb Z}$}$
$ \bf number$ $ \bf number$.lexval $ \in$ $ \mbox{${\mathbb Z}$}$
Production Semantic Rule
L $ \longmapsto$ E$ \bf n$ print(E.val )
E $ \longmapsto$ E1 + T E.val := E1.val + T.val
E $ \longmapsto$ T E.val := T.val
T $ \longmapsto$ T1*F T.val := T1.val*F.val
T $ \longmapsto$ F T.val := F.val
F $ \longmapsto$ (E) F.val := E.val
F $ \longmapsto$ $ \bf number$ F.val := $ \bf number$.lexval
Observations.

Example 4   The following syntax-directed definition implements statements for declaring identifiers $ \bf id$ with type $ \bf real$ and $ \bf int$. Note that the comma , is also a terminal symbol for this grammar. The addtype procedure is used to add the type of an identifier in the symbol table (by accessing its entry in the symbol table with $ \bf id$.entry).
Grammar symbol Synthesized attribute Inherited attribute
L L.in (string)
T T.type (string)
$ \bf id$ $ \bf id$.entry $ \in$ $ \mbox{${\mathbb Z}$}$
Production Semantic Rule
D $ \longmapsto$ TL L.in := T.type
T $ \longmapsto$ $ \bf int$ T.type := $ \bf int$
T $ \longmapsto$ $ \bf real$ T.type := $ \bf real$
L $ \longmapsto$ L1$ \bf ,$$ \bf id$ L1.in := L.in
addtype ($ \bf id$.entry, L.in)
L $ \longmapsto$ $ \bf id$ addtype ($ \bf id$.entry, L.in)
Observations.
Figure 3: Parse tree for $ \bf real \ id, \ id, \ id$.
\begin{figure}\htmlimage
\centering\includegraphics[scale=.35]{annotatedParseTree1.eps}
\end{figure}

Remark 1   To compute the values of the attributes at each node of an annotated parse tree we need to define an evaluation order. This will be done by a dependency graph.


DEPENDENCY GRAPH. Let us a consider a syntax-directed definition. Let us assume that every semantic rule as the form

b  : =  f (c1,..., ck) (3)

Let T be an annotated tree for this syntax-directed definition. Let N be an internal node of T and let A $ \longmapsto$ $ \alpha$ be the associated production. Let b := f (c1,..., ck) be a semantic rule associated with this production. Then for i = 1 ... k we say that the attribute b depends on the attribute ci.

Let $ \cal {A}$ be the set of all attributes. The underlying idea guiding the construction of the dependency graph of T is: if an attribute b the node N of T depends on the attribute ci then ci must before evaluated before b. Thus, it is tempting to consider the graph of the relation

(bc) $\displaystyle \in$ $\displaystyle \cal {A}$×$\displaystyle \cal {A}$    $\displaystyle \longmapsto$     b  depends  on  c (4)

However a given grammar symbol X may label different nodes of T. Therefore the dependency graph of T is the directed graph constructed as follows.

Algorithm 1  

\fbox{
\begin{minipage}{12 cm}
\begin{description}
\item[{\bf Input:}] An annota...
...{\sf create} an arc from $Z.c(P)$\ to $Y.b(M)$\ \\
\end{tabbing}\end{minipage}}

Given a node N of the annotated tree T labeled by the grammar symbol X it is convenient to call attribute of N every attribute of X.

Example 5  

Figure 4: Parse tree for $ \bf real \ id, \ id, \ id$ with dummy synthesized attributes.
\begin{figure}\htmlimage
\centering\includegraphics[scale=.35]{annotatedParseTree1b.eps}
\end{figure}

Figure 5: Parse tree for $ \bf real \ id, \ id, \ id$ with dummy synthesized attributes and showing the attributes of each mode.
\begin{figure}\htmlimage
\centering\includegraphics[scale=.35]{annotatedParseTree1c.eps}
\end{figure}

Figure 6: Construction of the vertices of the dependency graph.
\begin{figure}\htmlimage
\centering\includegraphics[scale=.35]{annotatedParseTree1d.eps}
\end{figure}

Figure 7: Construction of the arcs of the dependency graph.
\begin{figure}\htmlimage
\centering\includegraphics[scale=.35]{annotatedParseTree1e.eps}
\end{figure}
Figures 4, 5, 6 and 7 show the construction of the vertices and the arcs of the dependency graph G of the parse tree of Figure 3. Observations.


DIRECTED GRAPHS. let G be a directed graph with vertex set X and arc set A. The following definitions are very useful.

SOURCES IN DIRECTED GRAPHS. A source in the directed graph G is a vertex x such that for every other vertex y the couple (y, x) is not an arc. In broad words, there is no path leading to a source. It is not hard to see that every DAG has at least one source. (Otherwise there would be a circuit and thus a cycle).


TOPOLOGICAL SORT OF A DAG. Let G = (X, A) be a DAG with n vertices. A topological sort of the DAG G is an enumeration of its vertices (a bijection from {1,..., n} to X) say x1, x2,..., xn such that there is no path from xj to xi for 1 $ \leq$ i < j $ \leq$ n.

By induction on n, it is not hard to see that every DAG has a topological sort. Indeed, the case n = 1 is trivial. For n > 1 let s be a source of G. Let G - s be the graph induced from G by the cancellation of s and every arc leaving from s. By the induction hypothesis, the graph G - s has a topological sort, say x1, x2,..., xn-1. Now observe that s, x1, x2,..., xn-1 is a topological sort of G.


THE TRANSITIVE CLOSURE OF A DIRECTED GRAPH


COMPUTING THE TRANSITIVE CLOSURE OF A DIRECTED GRAPH

Algorithm 2  

\fbox{
\begin{minipage}{12 cm}
\begin{description}
\item[{\bf Input:}] A directe...
...k) + L^{(k-1)}(k,j)).$\ \\
\> {\bf return} $L^(n).$\end{tabbing}\end{minipage}}


COMPUTING A TOPOLOGICAL SORT OF A DAG. In our simple example we obtain a topological sort easily by canceling the sources one after another. This leads to Figure 8.

Figure 8: Topological sort for the dependency graph.
\begin{figure}\htmlimage
\centering\includegraphics[scale=.35]{annotatedParseTree1f.eps}
\end{figure}
Coming back to the parse tree (where the nodes without attributes are removed, for clarity) we obtain an order for evaluating the attributes. This leads to the annotated parse tree on Figure 9 where the order for executing the semantic rules is shown by increasing numbers.
Figure 9: Annotated parse tree (where nodes without attributes are removed).
\begin{figure}\htmlimage
\centering\includegraphics[scale=.35]{annotatedParseTree1g.eps}
\end{figure}
Let us denote by ai the attribute associated with the i-th node in the topological sort of the dependency graph. We finally obtain the following program.
a4 := $ \bf real$
a5 := a4
a7 := a5
addtype ($ \bf id_{3}^{}$.entry, a5)
a9 := a7
addtype ($ \bf id_{2}^{}$.entry, a7)
addtype ($ \bf id_{1}^{}$.entry, a9)

Remark 2   A grammar which is suitable for parsing (as the one of the detailed Example 4) may not be suitable for translation (i.e. may lead to an expensive translation scheme). In the following example, which also implements statements for declaring identifiers, inherited attributes are avoided. Thus the construction of a dependency graph is not needed.
Production Semantic Rule
D $ \longmapsto$ L
T $ \longmapsto$ $ \bf int$ T.type := $ \bf int$
T $ \longmapsto$ $ \bf real$ T.type := $ \bf real$
L $ \longmapsto$ $ \bf id$$ \bf ,$ L1 L.type := L1.type
addtype ($ \bf id$.entry, L1.type)
L $ \longmapsto$ $ \bf :$ T L.type := T.type
If for the same language we use the following grammar
D $ \longmapsto$ L$ \bf :$ T
T $ \longmapsto$ $ \bf int$
T $ \longmapsto$ $ \bf real$
L $ \longmapsto$ L$ \bf ,$ $ \bf id$
L $ \longmapsto$ $ \bf id$
then we get into the same difficulties as in Example 4.

Example 6   The following syntax-directed definition implements statements for performing substitutions.
Grammar symbol Synthesized attribute Inherited attribute
S
E E.val $ \in$ $ \mbox{${\mathbb Z}$}$ E.env
There are three functions used in the semantic rules.
initialEnv
which returns the empty environment.
LookUp
which determines the value of an identifier by consulting the environment.
Update
which adds a binding to the environment.
In the above syntax-directed definition E1, E2 and E3 correspond to the same nonterminal E.
Production Semantic Rule
S $ \longmapsto$ E E.env := initialEnv()
E1 $ \longmapsto$ $ \bf let$ $ \bf id$ $ \bf =$ E2 $ \bf in$ E3 $ \bf end$ E2.env := E1.env
E3.env := Update( E1.env,$ \bf id$, E2.val)
E1.val := E3.val
E1 $ \longmapsto$ (E2 + E3) E1.val := E2.val + E3.val
E2.env := E1.env
E3.env := E1.env
E $ \longmapsto$ $ \bf id$ E.val := LookUp( $ \bf id$, E.env)
E $ \longmapsto$ $ \bf 1$ E.val := $ \bf 1$
E $ \longmapsto$ $ \bf0$ E.val := $ \bf0$
Exercise. Translate the following statement.
$ \bf let$ $ \bf id$ $ \bf =$ 1 $ \bf in$ $ \bf id$ + $ \bf id$ $ \bf end$

CONCLUSIONS


next up previous
Next: Syntax Trees for Expressions Up: Compiler Theory: Syntax-Directed Translation Previous: Compiler Theory: Syntax-Directed Translation
Marc Moreno Maza
2004-12-02