Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für...

61
Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany
  • date post

    18-Dec-2015
  • Category

    Documents

  • view

    212
  • download

    0

Transcript of Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für...

Page 1: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Applied Computer Science IIChapter 1 : Regular

Languages

Prof. Dr. Luc De Raedt

Institut für InformatikAlbert-Ludwigs Universität Freiburg

Germany

Page 2: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Overview

• Deterministic finite automata• Regular languages• Nondeterministic finite automata• Closure operations• Regular expressions• Nonregular languages• The pumping lemma

Page 3: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Finite Automata

• An intuitive example : supermarket door controller

• Figures 1,2,3

• Probabilistic counterparts exist– Markov chains, bayesian nets, etc.– Not in this course

Page 4: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

A finite automaton

• Figure 1.4 • Formally

1 2 3

1

2

States : , ,

Startstate :

Acceptstate :

Transitions

Output : or

q q q

q

q

accept reject

0A is a 5-tuple ( , , , , )

1. is a finite set of states

2. is a finite set, the alphabet

3. : is the transition function

4. is the start stat

finite automaton

e

5. is the set of accept stateo

Q q F

Q

Q Q

q Q

F Q

s

Page 5: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

1

1 2 3

1 1 2

2 3 2

3 2 2

1

2

Describe

{ , , }

{0,1}

defined by

0 1

start state

{ }

M

Q q q q

q q q

q q q

q q q

q

F q

is the language of machine

we write ( )

{ | contains at least one 1 and an

even number of 0s follow the last 1 }

A M

L M A

A w w

Page 6: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.
Page 7: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Other examples

• 7,8,9

Page 8: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Another example

0

is the language of all strings where the sum of the numbers

is a multiple of except that the sum is reset to 0 whenever the symbol appears

A generalisation :

Auto

1.

maton =

{ ,..., }

2.

i

i i

i

A

i reset

Q q q

B

0

={0,1,2, }

3. ( ,0)

( ,1) where ( 1) mod

( , 2) where ( 2) mod

( , )

4. is start and accept state

j j

j k

j k

j

o

reset

q q

q q k j i

q q k j i

q reset q

q Q

Page 9: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Formal definition of computation

0

1

0

0 0

1 1

Let be a finite automaton ( , , , , )

Let .... be a string over

if a sequence of states ,..., exists in suaccepts

recogn

ch that

1.

2. ( , ) for all 0,..., 1

3

z i e

.

n

n

i i i

n

M Q q F

w w w

M w r r Q

r q

r w r i n

r F

M

s

regul

langu

ar

age if { | accepts }

A language is if some finite automaton recognizes it.

A A w M w

Page 10: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Designing finite automata

• Design automaton for language consisting of binary strings with an odd number of 1s

• Design first states• Then transitions• Accept and reject states• Fig. 12

Page 11: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Another example

• Design an automaton to recognize the language of binary strings containing the string 001 as substring

• Fig 13

Page 12: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

The regular operations

*1

*

Let and be languages

We define :

: { | or }

: { | and }

: { ,..., | 0 and

Union

Concaten

each }

note:

ation

St

always

Example

{ ,

{

r

,

a

}

}

n i

A B

A B x x A x B

A B xy x A y B

A x x n x A

A

A good bad

B boy girl

Page 13: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Regular languages are closed under …

• Proof

A set is under an operation if applying on

elements of yields elements of .

Example: multiplication on natural numbers

Counterexample :division of nat

closed

ural numbers

S o o

S S

1 2 1 2

1.12

The class of the regular languages is closed under the union operation.

In other words, if and are regular languages, so is A A A A

Theorem

Page 14: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

1 1 1 1 1 1 2 2 2 2 2 2

1 2 1 1 2 2

1 2

1 2 1 1 2 2

1 2

1 2 1 1 2 2

( , , , , )

constructed from ( , , , , ) and ( , , , , )

Define

1. {( , ) | and }

2.

3. (( , ), ) ( ( , ), ( , ))

4. ( , )

5. {( , ) | or }

M Q q F

M Q q F M Q q F

Q r r r Q r Q

r r a r a r a

q q q

F r r r F r F

Page 15: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.
Page 16: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

1 2 1 2

1.13

The class of the regular languages is closed under the concatenation operation.

In other words, if and are regular languages, so is A A A A

Theorem

Page 17: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Non deterministic finite automata

• Deterministic – One successor state transitions not allowed

• Non deterministic– Several successor states possible– transitions possible

• Figure 14

Page 18: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Deterministic versus non deterministic

computation• Figure 15

Page 19: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Example

Page 20: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Every NFA has an equivalent DFA

• Figures 17-18

Page 21: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Another NFA

Page 22: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Nondeterministic finite automaton

0A is a 5-tuple ( , , , , )

1. is a finite set of states

2. is a finite set, the alphabet

nondeterministic

P3. : ( ) is the transition function

4. is the start s

fini

tate

5.

te automa

is

th

ton

o

Q q F

Q

Q Q

q Q

F Q

include

e set of

s

P( ) the powerset o

accept

f

stat

es

Q Q

Page 23: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Example

• Example 18

Page 24: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Formal definition of computation

0

1

0

0 0

1 1

Let be a finite automaton ( , , , , )

Let .... be a string over

if a sequence of states ,..., exists in such that

1.

2. ( , ) for all 0,..

nondeterministic

., 1

accepts

n

n

i i i

M Q q F

w w w

M w r r Q

r q

r r w i n

3. nr F

Page 25: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Equivalence NFA and DFA

Two machines are if they recognize the same language

1.19

Every nondeterministic finite automaton has an equivalent finite automaton

1.20

A language is regular if and onl f

y i

equivalent

Theorem

Corollary

some nondeterministic finite

automaton recognizes it.

Page 26: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Proof

Page 27: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.
Page 28: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

An example

Page 29: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Closure under the regular operations

1 2 1 2

1.12/1.22

The class of the regular languages is closed under the union operation.

In other words, if and are regular languages, so is A A A A

Theorem

1.23

The class of the regular languages is closed under the concatenation operation.

Theorem

1.24

The class of the regular languages is closed under the star operation.

Theorem

Page 30: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Proof idea

• INSERT FIG 1.24

1 2 1 2

1.12/1.22

The class of the regular languages is closed under the union operation.

In other words, if and are regular languages, so is A A A A

Theorem

Page 31: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

• PROOF P 60 !

Page 32: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Proof idea

1.23

The class of the regular languages is closed under the concatenation operation.

Theorem

Page 33: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.
Page 34: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Proof idea 1.24

The class of the regular languages is closed under the star operation.

Theorem

Page 35: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.
Page 36: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Regular expressions

Page 37: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

R

R

R

R

Page 38: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Applications

• Design of compilers

• awk, grep, vi … in unix (search for strings)

• Perl programming language• Bioinformatics

– So called motifs (patterns occurring in sequences, e.g. proteins)

* * * *{ , , }( . . )

where {0,...,9}

DD DD D D DD

D

Page 39: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Proof through :

1.28

A language is regular if and only if some regular expression describes it

Theorem

1.29

If a language is described by some regular expression, then it is regular

Lemma

1.32

If a language is regular, then it is described by some regular expression

Lemma

Page 40: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

1.29

If a language is described by some regular expression, then it is regular

Lemma

Page 41: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.
Page 42: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.
Page 43: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.
Page 44: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

• Two steps– DFA into GNFA (generalized

nondeterministic finite automaton)– GNFA into regular expression

1.32

If a language is regular, then it is described by some regular expression

Lemma

Page 45: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

GNFAs

• Two states q and r are connected in both directions

• Exception : – Start state– Accept state– One direction only

• Labels are regular expressions

Page 46: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

FormallyA is a 5-tuple ( , , , , )

1. is a finite set of states

2. is a finite set, the

generalized nondeterministic finite au

al

tomat

phabe

: ( { }) ( { }) is the tran

t

3 sitio

on

n.

fun

star

accept

t accep

start

t

Q q

Q q q

Q

Q q R

4. is the start state

5. the accept stat

ction

e start

accept

q Q

q Q

*1

0

0

1

A GNFA ... where each

if a sequence of states ,..., exists in such that

1.

2.

3. for all 0,..., 1, we ha ( ) ve that

where ( , )

accepts k i

n

start

k accept

i i

i

i

i

w w w w

r r Q

r q

r q

i n

R r r

w L R

Page 47: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Convert DFA into GNFA

Add new start state, with arrow to old start state

Add new accept state, with arrows from old accept states

If any arrows have multiple labels and , replace by

Add arrows with label between sta

a b a b

tes where necessary

Page 48: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Convert GNFA into regular expression

Page 49: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.
Page 50: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.
Page 51: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.
Page 52: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Induction Proof

• Induction proof

1.34

For any GNFA , Convert( ) is equivalent to G G G

Claim

2; straightforward

Assume claim holds for 1 states

Suppose

k

k

Basis

Induction

Page 53: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Nonregular Languages

• Finite Automata have a finite memory• Are the following languages regular ?

• Mathematical proof necessary

{0 1 | 0}

{ | has an equal number of 0s and 1s}

{ | has an equal number of occurences of 01 and 10}

n nB n

C w w

D w w

Page 54: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

The pumping lemma

If is regular, then there is a number , for which

if is any string in of length at least

then may be divided into three pieces

such that

1. for each 0,

2. | |

0

3

i

A p

s A p

s s xyz

i xy z A

y

The pumping lemma

. | |

Note

xy p

y

Page 55: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Proof Idea

Page 56: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Nonregular languages

{0 1 | 0}n nB n

Choose 0 1

1. string consists only of 0s

2. string consists only of 1s

3. string consists of both 0s and 1s

p ps

y

y

y

Page 57: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

{ | has an equal number of 0s and 1s}C w w

Choose 0 1

Would seem possible without condition 3!

However, condition 3 of lemma states | |

Thus consists of 0s only

Then

Choic

p ps

xy p

y

xyyz C

* *

e of crucial. Consider (01)

Alternative proof :

is nonregular

If were regular, then 0 1 regular

Regular languages cl undeosed r inters io ect n

ps s

B

C C B

Page 58: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

*{ | {0,1} }F ww w

Choose 0 10 1

Condition 3 of lemma states | |

Thus consists of 0s only

Then

0 0 would not work !

p p

p p

s

xy p

y

xyyz F

Page 59: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

2

{1 | 0}nD n 2

1

1

2

Choose 1

Perfect squares : 0,1,4,9,16,25,36,49,...

Consider and

Prove that for large : and cannot both be squares

which should be true according to pumping lemma

Let

p

i i

i i

s

xy z xy z

i xy z xy z

m n

2 2

4

2 4

4

| |

Then ( 1) 2 1 2 1

Choose | | 2 1 2 | | 1

by setting

Indeed, observe

| | | |

<2 1

<2 | | 1

i

i

i

xy z

n n n m

y m xy z

i p

y s p p

p

xy z

Page 60: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

{ | 0 1 where }i jE w i j

1

0

Choose 0 1

Condition 3 of lemma states | |

Thus consists of 0s only

Then

p ps

xy p

y

xy z F

Page 61: Applied Computer Science II Chapter 1 : Regular Languages Prof. Dr. Luc De Raedt Institut für Informatik Albert-Ludwigs Universität Freiburg Germany.

Summary

• Deterministic finite automata• Regular languages• Nondeterministic finite automata• Closure operations• Regular expressions• Nonregular languages• The pumping lemma