__ALPHABET, STRING, LANGUAGE.__ We call an *alphabet* any finite set of symbols.
Let be an alphabet.
A *string* or *word* over is any finite sequence
of symbols from .
The empty string is denoted by
.
A *language* over is any set of strings over .
Note that the above definitions do not ascribe any meaning
to the strings of the language.

__DETERMINISTIC FINITE AUTOMATA.__ A deterministic finite automaton (DFA) consists of five things:

- an input alphabet ,
- a finite set
*S*whose elements are called*states*, - a distinguished state
*s*_{0}*S*, called*starting state*, - a set
*F**S*of distinguished states, called*accepting states*(or*final states*), - a partial function from
*S*× to*S*called the*transition function*(hence for every state-symbol couple either maps it to a symbol or is not defined for this couple).

- to say ``let
= (,
*S*,*s*_{0},*F*,) be a DFA'' for ``let be a DFA with alphabet , set of states*S*, starting state*s*_{0}, set of accepting states*F*and transition function .`` - to represent a DFA by a
*transition diagram*using the rules shown on Figure 1.

__LANGUAGE RECOGNIZED BY A DFA.__ A DFA accepts an input string if when beginning the computation
in the starting state, after reading the entire string, the
automaton is in an accepting state.

__TRANSITION TABLE.__ A straightforward way to implement a DFA is to represent
the transition function as a *transition table.*
This table has a row for each state *s* and a column for
each input symbol *a* .
The intersection of row *s* and column *a* contains
(*s*, *a*).
Figure 3 shows the transition diagram
and the transition table of an automaton.

__COMPLETE DFA.__ A DFA
= (, *S*, *s*_{0}, *F*,)
is said *complete* if for every *s* *S* and for
every
*a* the transition
(*s*, *a*)
is defined.

The automaton of Figure 3 is complete whereas none of the automata of Figure 2 are.

As a model for a computer program, it is desirable for a DFA
to be complete.
To transform a non-complete DFA
= (, *S*, *s*_{0}, *F*,)
into a complete DFA one needs only

- to add a new state to
*S*, say*E*(for*error*) and then - for every
(
*s*,*a*)*S*× such that was not defined at this couple, define (*s*,*a*) =*E*.

__DFA RECOGNIZING THE UNION OF SEVERAL LANGUAGES.__ Each of the four DFA in Figure 2
recognizes a simple language over the alphabet

Let us denote these languages by

*L*_{1}is the language reduced to the single world*if*,*L*_{2}is the language of the identifiers made from letters and digits and starting with a letter,*L*_{3}is the language of the integer numbers,*L*_{4}is the language of the floating point numbers.

- Consider a word
*w*_{i}which belongs to some*L*_{i}and a word*w*_{j}such that*w*_{i}*w*_{j}belongs to some*L*_{j}. For instance with*w*_{i}= 123 and*w*_{j}= .456 should the combined DFA accept*w*_{i}or*w*_{i}*w*_{j}? The rule is to choose the longest initial substring that can match a pattern (think of integers). Hence*w*_{i}*w*_{j}is chosen. - It may happen that
*w*_{i}*w*_{j}(the longest match) belongs to several*L*_{j}. Hence we need to define some priorities. For instance the word*if*belongs to*L*_{1}and*L*_{2}. However we want the word*if*to be a reserved word, not an identifier. So we have to replace*L*_{2}by*L*_{2}*L*_{1}.

2004-12-02