Other Issues - § 3.9 – Not Discussed More advanced algorithm construction – regular expression to DFA directly

Page 1:

Other Issues - § 3.9 – Not Discussed

• More advanced algorithm construction – regular expression to DFA directly

Page 2:

Final Notes: R.E. to NFA Construction

• An NFA constructed using the previous techniques can be simulated directly by an algorithm.

• The algorithm's run time is proportional to |N| * |x|, where |N| is the number of NFA states and |x| is the length of the input.

• Alternatively, we can construct a DFA from the NFA and use the resulting transition table (Dtran) to recognize the input:

Automaton   Space required   Time to simulate
NFA         O(|r|)           O(|r| * |x|)
DFA         O(2^|r|)         O(|x|)

where |r| is the length of the regular expression.
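The set-of-states simulation behind the |N| * |x| bound can be sketched in Python as follows. This is a minimal sketch rather than the slides' own listing: the function name `simulate_nfa`, the dictionary encoding of the transition function, and the small ε-free NFA for (a|b)*abb are all assumptions made for illustration.

```python
def simulate_nfa(delta, start, accepting, text):
    """Track the set of states the NFA could be in after each input symbol."""
    current = {start}
    for ch in text:
        # At most one table lookup per current state, so O(|N|) lookups per symbol.
        current = {t for s in current for t in delta.get((s, ch), ())}
    return bool(current & accepting)


# Hypothetical epsilon-free NFA for (a|b)*abb: maps (state, symbol) to successor states.
DELTA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}

print(simulate_nfa(DELTA, 0, {3}, "aababb"))  # True
print(simulate_nfa(DELTA, 0, {3}, "abba"))    # False
```

The NFA itself stays small, but every input symbol touches up to |N| states. A DFA reverses the trade: one lookup per symbol, at the price of a table that can grow exponentially in |r|.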

Page 3:

Pulling Together Concepts

• Designing Lexical Analyzer Generator

Reg. Expr. → NFA construction
NFA → DFA conversion
DFA simulation for lexical analyzer

• Recall Lex Structure

Pattern   Action
Pattern   Action
…         …

- Each pattern recognizes lexemes
- Each pattern described by regular expression, e.g. (abc*)ab, (a | b*)abb, etc.

Recognizer!

Page 4:

Lex Specification → Lexical Analyzer

• Let P1, P2, …, Pn be Lex patterns (regular expressions for valid tokens in the programming language).

• Construct N(P1), N(P2), …, N(Pn).

• Note: the accepting state of N(Pi) will be marked by Pi.

• Construct the combined NFA: a new start state with ε-transitions to the start states of N(P1), N(P2), …, N(Pn).

• Lex applies the conversion algorithm to construct an equivalent DFA!
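The combination step can be sketched in Python as below. Everything here is illustrative: the function name `combine`, the tuple layout for a pattern NFA, the use of `None` as the ε label, and the three hand-built NFAs (for the patterns a, abb and a*b+) are assumptions of this sketch, not something Lex exposes.

```python
EPS = None  # label used for epsilon-transitions (an assumption of this sketch)


def combine(pattern_nfas, new_start=0):
    """Merge pattern NFAs that use disjoint state numbers into one NFA.

    Each entry is (start_state, edges, accepting_state), where edges maps
    (state, symbol) to a set of successor states.
    Returns (start, edges, tags); tags maps each accepting state to the
    number of the pattern it recognizes.
    """
    edges = {(new_start, EPS): set()}
    tags = {}
    for index, (start, pattern_edges, accept) in enumerate(pattern_nfas, start=1):
        edges[(new_start, EPS)].add(start)  # epsilon-edge into N(Pi)
        for key, targets in pattern_edges.items():
            edges[key] = set(targets)
        tags[accept] = index                # accepting state of N(Pi) marked by Pi
    return new_start, edges, tags


# Hypothetical hand-built NFAs for the patterns a, abb and a*b+.
N1 = (1, {(1, "a"): {2}}, 2)
N2 = (3, {(3, "a"): {4}, (4, "b"): {5}, (5, "b"): {6}}, 6)
N3 = (7, {(7, "a"): {7}, (7, "b"): {8}, (8, "b"): {8}}, 8)

start, edges, tags = combine([N1, N2, N3])
print(sorted(edges[(start, EPS)]))  # [1, 3, 7]
print(tags)                         # {2: 1, 6: 2, 8: 3}
```

Keeping the pattern number on each accepting state is what later lets the recognizer report which pattern matched.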

Page 5:

Pictorially

(a) Lex compiler: Lex Specification → Lex Compiler → Transition Table

(b) Schematic lexical analyzer: the FA Simulator reads the input buffer, consults the Transition Table, and delimits the current lexeme.

Page 6:

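Example

3 patterns:

P1 : a
P2 : abb
P3 : a*b+

NFA's:

P1: start → 1 -a→ 2 (accepting, P1)
P2: start → 3 -a→ 4 -b→ 5 -b→ 6 (accepting, P2)
P3: start → 7 -b→ 8 (accepting, P3), with a loop on a at state 7 and a loop on b at state 8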

Page 7:

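Example – continued (2)

Combined NFA: a new start state 0 with ε-transitions to states 1, 3 and 7, the start states of the NFAs for P1, P2 and P3. The accepting states are 2 (P1), 6 (P2) and 8 (P3).

Examples

Input a a b a:
state sets:        {0,1,3,7}   {2,4,7}   {7}   {8}   death
pattern matched:   -           P1        -     P3    -

Input a b b:
state sets:        {0,1,3,7}   {2,4,7}   {5,8}   {6,8}
pattern matched:   -           P1        P3      P2,P3

Break the tie in favor of P2, the pattern listed first.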
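The two traces can be reproduced with a short Python sketch of a longest-match scan over the combined NFA. The names `MOVES`, `ACCEPTS` and `longest_match` are hypothetical, and the transitions are read off the state sets in the traces rather than taken from a listing in the slides.

```python
# Combined NFA for P1 = a, P2 = abb, P3 = a*b+ (state numbering as on the slides).
EPS_START = frozenset({0, 1, 3, 7})  # epsilon-closure of start state 0
MOVES = {
    (1, "a"): {2},
    (3, "a"): {4}, (4, "b"): {5}, (5, "b"): {6},
    (7, "a"): {7}, (7, "b"): {8}, (8, "b"): {8},
}
ACCEPTS = {2: 1, 6: 2, 8: 3}  # accepting state -> pattern number


def longest_match(text):
    """Return (lexeme, pattern number) for the longest matching prefix, or None."""
    current, best = EPS_START, None
    for position, ch in enumerate(text, start=1):
        # Only state 0 has epsilon-moves, so no closure is needed after a step.
        current = {t for s in current for t in MOVES.get((s, ch), ())}
        if not current:  # "death": no live state remains
            break
        matched = [ACCEPTS[s] for s in current if s in ACCEPTS]
        if matched:
            # The lowest pattern number wins a tie, i.e. the pattern listed first.
            best = (text[:position], min(matched))
    return best


print(longest_match("aaba"))  # ('aab', 3)
print(longest_match("abb"))   # ('abb', 2)
```

The scan remembers the last position at which some pattern accepted, so on death it can fall back to the longest lexeme seen so far.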

Page 8:

Example – continued (3)

Alternatively, construct the DFA (keeping track of the correspondence between patterns and new accepting states):

             Input Symbol
STATE        a          b        Pattern
{0,1,3,7}    {2,4,7}    {8}      none
{2,4,7}      {7}        {5,8}    P1
{8}          -          {8}      P3
{7}          {7}        {8}      none
{5,8}        -          {6,8}    P3
{6,8}        -          {8}      P2

break tie in favor of P2
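A table like this one can be derived mechanically with the subset construction. The Python sketch below is one way to do it, under assumptions: the encoding of the combined NFA (`MOVES`, `ACCEPTS`, `START`) and the function name `subset_construction` are hypothetical, and ε-closure is applied only to the start state because no other state has ε-moves in this example.

```python
from collections import deque

# Combined NFA for P1 = a, P2 = abb, P3 = a*b+ (assumed encoding).
MOVES = {
    (1, "a"): {2},
    (3, "a"): {4}, (4, "b"): {5}, (5, "b"): {6},
    (7, "a"): {7}, (7, "b"): {8}, (8, "b"): {8},
}
ACCEPTS = {2: 1, 6: 2, 8: 3}         # accepting NFA state -> pattern number
START = frozenset({0, 1, 3, 7})      # epsilon-closure of NFA start state 0


def subset_construction(start, alphabet="ab"):
    """Build Dtran; each DFA state is a frozenset of NFA states."""
    dtran, pattern_of = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        state = queue.popleft()
        hits = [ACCEPTS[s] for s in state if s in ACCEPTS]
        # Ties go to the lowest pattern number (the pattern listed first).
        pattern_of[state] = min(hits) if hits else None
        for symbol in alphabet:
            target = frozenset(t for s in state for t in MOVES.get((s, symbol), ()))
            if not target:
                continue  # no transition: a "-" entry in the table
            dtran[(state, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dtran, pattern_of


dtran, pattern_of = subset_construction(START)
for state in pattern_of:
    row = [sorted(dtran[(state, c)]) if (state, c) in dtran else "-" for c in "ab"]
    print(sorted(state), row, pattern_of[state])
```

On this input the construction produces six DFA states, and state {6,8} is tagged with pattern 2 because P2 outranks P3.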

Page 9:

Minimizing the Number of States of DFA

1. Construct an initial partition Π of the set of states S with two groups: accepting and non-accepting states.

2. (Construct Πnew) For each group G of Π:

   1. Partition G into subgroups such that two states s, t of G are in the same subgroup iff, for all input symbols a, states s and t have transitions on a to states in the same group of Π.

   2. Replace G in Πnew by the set of all these subgroups.

3. Compare Πnew and Π. If they are equal, set Πfinal := Π and proceed to step 4; otherwise set Π := Πnew and go to step 2.

4. Aggregate the states belonging to each group of Πfinal into a single state.
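The refinement loop in steps 1 to 3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `minimize` and the dictionary encoding are hypothetical, and the sample DFA (a five-state DFA for (a|b)*abb) is chosen for illustration and is not the DFA drawn on the slides.

```python
def minimize(states, alphabet, delta, accepting):
    """Return the final partition of `states` as a list of sets."""
    accepting = set(accepting)
    # Step 1: initial partition into accepting and non-accepting states.
    partition = [g for g in (accepting, set(states) - accepting) if g]
    while True:
        group_of = {s: i for i, group in enumerate(partition) for s in group}
        new_partition = []
        # Step 2: split each group by the groups its members move to.
        for group in partition:
            subgroups = {}
            for s in sorted(group):
                signature = tuple(group_of.get(delta.get((s, a))) for a in alphabet)
                subgroups.setdefault(signature, set()).add(s)
            new_partition.extend(subgroups.values())
        # Step 3: splitting only ever adds groups, so equal size means no change.
        if len(new_partition) == len(partition):
            return new_partition
        partition = new_partition


# Hypothetical DFA for (a|b)*abb with accepting state E.
DELTA = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "B", ("B", "b"): "D",
    ("C", "a"): "B", ("C", "b"): "C",
    ("D", "a"): "B", ("D", "b"): "E",
    ("E", "a"): "B", ("E", "b"): "C",
}

groups = minimize("ABCDE", "ab", DELTA, {"E"})
print(sorted(sorted(g) for g in groups))  # [['A', 'C'], ['B'], ['D'], ['E']]
```

Step 4 then treats each returned group as one state of the minimized DFA. Here A and C collapse into a single state, which reduces five states to four.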

Page 10:

example

[Figure: a DFA over inputs a and b with states A, B, C, D and F, followed by the minimized DFA, whose states are labeled A,C,D and B,F.]

Page 11:

Using LEX

Lex Program Structure:

declarations
%%
translation rules
%%
auxiliary procedures

Name the file, e.g., test.lex. Then “lex test.lex” produces the file “lex.yy.c” (a C program).

Page 12:

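LEX

%{
/* definitions of all constants
   LT, LE, EQ, NE, GT, GE, IF, THEN, ELSE, ... */
%}
......
letter    [A-Za-z]
digit     [0-9]
id        {letter}({letter}|{digit})*
......
%%
if        { return(IF); }
then      { return(THEN); }
{id}      { yylval = install_id(); return(ID); }
......
%%
install_id()
{
    /* procedure to install the lexeme into the symbol table (ST) */
}

Section labels: the %{ ... %} block holds C declarations, the definitions before the first %% are declarations, the lines between the two %% markers are the rules, and everything after the second %% is auxiliary procedures.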

Page 13:

Example of a Lex Program

        int num_lines = 0, num_chars = 0;

%%
\n      { ++num_lines; ++num_chars; }
.       { ++num_chars; }
%%

int main( int argc, char **argv )
{
    ++argv, --argc;  /* skip over program name */
    if ( argc > 0 )
        yyin = fopen( argv[0], "r" );  /* read from the named file */
    else
        yyin = stdin;                  /* otherwise read standard input */
    yylex();
    printf( "# of lines = %d, # of chars = %d\n", num_lines, num_chars );
    return 0;
}

Page 14:

Another Example

%{
#include <stdio.h>
%}

WS      [ \t\n]+

%%
[0123456789]+           printf("NUMBER\n");
[a-zA-Z][a-zA-Z0-9]*    printf("WORD\n");
{WS}                    /* do nothing */
.                       printf("UNKNOWN\n");
%%

int main( int argc, char **argv )
{
    ++argv, --argc;  /* skip over program name */
    if ( argc > 0 )
        yyin = fopen( argv[0], "r" );
    else
        yyin = stdin;
    yylex();
    return 0;
}

Page 15:

Concluding Remarks

Focused on Lexical Analysis Process, Including

- Regular Expressions
- Finite Automaton
- Conversion
- Lex
- Interplay among all these various aspects of lexical analysis

Looking Ahead:

The next step in the compilation process is Parsing:

- Top-down vs. Bottom-up
- Relationship to Language Theory