next up previous
Next: The design of an efficient Up: Lexical Analysis Previous: More examples

The role of the lexical analyzer in the compiler

Upon receiving a get-next-tohen command from the parser, the lexical analyzer reads input characters until it can identify the next token.
Figure 19: Interaction of Lexical analyzer with parser.
\begin{figure}\htmlimage
\centering\includegraphics[scale=.4]{lexerParser.eps}
\end{figure}
The lexical analyzer collects also information about tokens into their associated attributes:


INTERFACING LEX WITH YACC. One of the main uses of LEX is as a companion to the YACC parser-generator.

Example 5   The file calculator.y below specifies the GRAMMAR of the expressions of a desk calculator. This file can be processed by the YACC program.
%{
#include <ctype.h>
#include <stdio.h>

void yyerror(const char *str)
{
  fprintf(stderr,"error: %s\n",str);
}

int yywrap()
{
  return 1;
}

main()
{
  yyparse();
}


%}

%token TOK_NUMBER TOK_PLUS TOK_TIMES TOK_MINUS TOK_DIVIDE TOK_LP TOK_RP

%%
line        : line expr '\n' {printf("%d\n", $2); }
            | /* empty word */
            ;

expr        : expr TOK_PLUS term          {$$ = $1 + $3; }
            | term                  {$$ = $1; }


term        : factor TOK_TIMES term        {$$ = $1 * $3; }
            | factor               {$$ = $1; }


factor      : TOK_LP expr TOK_RP        {$$ = $2; }
            | TOK_NUMBER           {$$ = $1; }
             ;
%%

The file calculator.l specifies how the above tokens are detected

%{
#include "y.tab.h"
extern int yylval;
#include <stdlib.h>
%}

PLUS            [\+]
MINUS           [\-]
TIMES           [\*]
DIVIDE          [/]
DIGIT           [0-9]
NUMBER          [0-9]+
WS              [ \t]*
LP              "("
RP              ")"
RET             [\n]

%%

{WS}            {
                /* eat up white space */
                }
{NUMBER}        {
                yylval = atoi( yytext ); 
                return TOK_NUMBER;
                }
{PLUS}          {
                return TOK_PLUS;
                }
{TIMES}         {
                return TOK_TIMES;
                }
{MINUS}         {
                return TOK_MINUS;
                }
{DIVIDE}        {
                return TOK_DIVIDE;
                }
{LP}            {
                return TOK_LP;
                }
{RP}            {
                return TOK_RP;
                }
.               {
                }
{RET}           {
                return yytext[0];
                }

The corresponding calculator is built and run as follows:

[moreno@iguanodon yacc]$ yacc -d src/calculator.y
yacc: 16 shift/reduce conflicts
[moreno@iguanodon yacc]$ flex src/calculator.l 
[moreno@iguanodon yacc]$ gcc lex.yy.c y.tab.c -ll -o calculator
[moreno@iguanodon yacc]$ ./calculator 
1 + 3
4
2 * 4
8
(2 + 3) * 6
30
(2 + 3) * (4 + 5)
45


ABOUT THE SEPARATION OF THE LEXER FROM THE PARSER.

Other possible tasks of the lexical analyzer:


next up previous
Next: The design of an efficient Up: Lexical Analysis Previous: More examples
Marc Moreno Maza
2004-12-02