Automata & languages · Automata & languages A primer on the Theory of Computation. Part 2 out of...
Transcript of Automata & languages · Automata & languages A primer on the Theory of Computation. Part 2 out of...
-
Laurent Vanbever
ETH Zürich (D-ITET)
24 September 20
nsg.ee.ethz.ch
Automata & languages
A primer on the Theory of Computation
-
Part 2 out of 5
-
Last week was all about
Deterministic Finite Automaton
-
Regular Language
Formal definition
Closure
We saw three main concepts
-
Regular Language
Formal definition
Closure
A language L is regular
if some finite automaton
recognizes it
-
Regular Language
Formal definition
Closure
A finite automaton is a 5-tuple
(Q,⌃, �, q0, F )
-
set of
states
alphabet
transition function
start
state
set of
accept
states
(Q,⌃, �, q0, F )
-
Regular Language
Formal definition
Closure
If and are regular,
then so are:
L1 [ L2
L1 L2
L1 \ L2L1 � L2L1 � L2
L1
-
Equivalence
Closure1
2
Finite Automata
Thu Sept 24
DFA
NFA
Regular Expression
-
1/42
BacktoNondeterministicFA
• Question:DrawanFAwhichacceptsthelanguageL1={xÎ {0,1}*|4th bitfromleftofxis0}
• FAforL1:
• Question:Whataboutthe4th bitfromtheright?
• Looksascomplicated:L2={xÎ {0,1}*|4th bitfromrightofxis0}
0,1
0,1
0,1 0,10
0,11
-
1/43
WeirdIdea
• NoticethatL2isthereverseL1.
• I.e.sayingthat0shouldbethe4th fromtheleftisreverseofsayingthat0shouldbe4th fromtheright.Canwesimplyreverse thepicture(reversearrows,swapstartandaccept)?!?
• Here’sthereversedversion:
0,1
0,1
0,1 0,10
0,11
0,1
0,1
0,1 0,10
0,11
-
1/44
Discussionofweirdidea
1. Sillyunreachablestate.Notpretty,butallowedinmodel.
2. Oldstartstatebecameacrashingacceptstate.Underdeterminism.Couldfixwithfailstate.
3. Oldacceptstatebecameastatefromwhichwedon’tknowwhattodowhenreading0.Overdeterminism.Trouble.
4. (Notinthisexample,but)Therecouldbemorethanonestartstate!Seeminglyoutsidestandarddeterministicmodel.
• Still,thereissomethingaboutourautomaton.ItturnsoutthatNFA’s(=NondeterministicFA)areactuallyquiteusefulandareembeddedinmanypracticalapplications.
• Idea,keepmorethan1activestate ifnecessary.
-
1/45
IntroductiontoNondeterministicFiniteAutomata
• ThestaticpictureofanNFAisasagraphwhoseedgesarelabeledbyS andbye (togethercalledSe)andwithstartvertexq0andacceptstatesF.
• Example:
• AnylabeledgraphyoucancomeupwithisanNFA,aslongasitonlyhasonestartstate.Later,eventhisrestrictionwillbedropped.
0,10
1e
1
-
1/46
NFA:FormalDefinition.
• Definition:Anondeterministicfiniteautomaton(NFA) isencapsulatedbyM=(Q,S,d,q0,F)inthesamewayasanFA,exceptthatd hasdifferentdomainandco- domain: !: #×Σ& → ( #
• Here,P(Q)isthepowersetofQsothatd(q,a) isthesetofallendpointsofedgesfromq whicharelabeledbya.
• Example,forNFAofthepreviousslide:
d(q0,0) = {q1,q3}, d(q0,1) = {q2,q3}, d(q0,e) = Æ,...d(q3,e) = {q2}.
q1q0
q2q3
0,10
1e
1
-
1/47
FormalDefinitionofanNFA:Dynamic
• JustaswithFA’s,thereisanimplicitauxiliarytapecontainingtheinputstringwhichisoperatedonbytheNFA.AsopposedtoFA’s,NFA’sareparallelmachines – abletobeinseveralstatesatanygiveninstant.TheNFAreadsthetapefromlefttorightwitheachnewcharactercausingtheNFAtogointoanothersetofstates.Whenthestringiscompletelyread,thestringisaccepteddependingonwhethertheNFA’sfinalconfigurationcontainsanacceptstate.
• Definition:Astringuisaccepted byanNFAM iff thereexistsapathstartingatq0whichislabeledbyuandendsinanacceptstate.ThelanguageacceptedbyM isthesetofallstringswhichareacceptedbyMandisdenotedbyL(M).• Followingalabele isforfree(withoutreadinganinputsymbol).In
computingthelabelofapath,youshoulddeletealle’s.• TheonlydifferenceinacceptanceforNFA’svs.FA’sarethewords“there
exists”.InFA’sthepathalwaysexistsandisunique.
-
1/48
Example
M4:
Question:Whichofthefollowingstringsisaccepted?1. e2. 03. 14. 0111
q1q0
q2q3
0,10
1e
1
-
1/50
NFA’svs.RegularOperations
• OnthefollowingfewslideswewillstudyhowNFA’sinteractwithregularoperations.
• WewillusethefollowingschematicdrawingforageneralNFA.
• Theredcirclestandsforthestartstateq0,thegreenportionrepresentstheacceptstatesF,theotherstatesaregray.
-
1/51
NFA:Union
• TheunionAÈB isformedbyputtingtheautomataAandBinparallel.Createanewstartstateandconnectittotheformerstartstatesusinge-edges:
-
1/52
UnionExample
• L ={xhasevenlength}È {xendswith11}
c
b
0,1 0,1
d e f
0
1
0
0
1
1
ae e
-
1/53
NFA:Concatenation
• TheconcatenationA•B isformedbyputtingtheautomatainserial.ThestartstatecomesfromA whiletheacceptstatescomefromB.A’sacceptstatesareturnedoffandconnectedviae-edgestoB’sstartstate:
-
1/54
ConcatenationExample
• L ={xhasevenlength}• {xendswith11}
• Remark:Thisexampleissomewhatquestionable…
c
b
0,1 0,1
d e f
0
1
0
01
1e
-
1/55
NFA’s:Kleene-+.
• TheKleene-+A+ isformedbycreatingafeedbackloop.Theacceptstatesconnecttothestartstateviae-edges:
-
1/56
Kleene-+Example
L={ }+
={ }
x is a streak of one or more 1’s followedby a streak of two or more 0’s
00c d f
1
10
e
e
x starts with 1, ends with 0, and alternatesbetween one or more consecutive 1’s
and two or more consecutive 0’s
-
1/57
NFA’s:Kleene-*
• TheconstructionfollowsfromKleene-+constructionusingthefactthatA*istheunionofA+ withtheemptystring.JustcreateKleene-+andaddanewstartacceptstateconnectingtooldstartstatewithane-edge:
-
1/58
ClosureofNFAunderRegularOperations
• TheconstructionsaboveallshowthatNFA’sareconstructively closedundertheregularoperations.Moreformally,
• Theorem:IfL1and L2areacceptedbyNFA’s,thensoareL1È L2, L1•L2,L1+ andL1*.Infact,theacceptingNFA’scanbeconstructedinlineartime.
• Thisisalmostwhatwewant.IfwecanshowthatallNFA’scanbeconvertedintoFA’sthiswillshowthatFA’s– andhenceregularlanguages– areclosedundertheregularoperations.
-
1/59
RegularExpressions(REX)
• Wearealreadyfamiliarwiththeregularoperations.Regularexpressionsgiveawayofsymbolizingasequenceofregularoperations,andthereforeawayofgeneratingnewlanguagesfromold.
• Forexample,togeneratetheregularlanguage{banana,nab}*fromtheatomiclanguages{a},{b}and{n}wecoulddothefollowing:
(({b}•{a}•{n}•{a}•{n}•{a})È({n}•{a}•{b}))*
Regularexpressionsspecifythesameinamorecompactform:
(bananaÈnab)*
-
1/60
RegularExpressions(REX)
• Definition:Thesetofregularexpressions overanalphabetS andthelanguagesinS*whichtheygeneratearedefinedrecursively:– BaseCases:EachsymbolaÎ S aswellasthesymbolse andÆ are
regularexpressions:• ageneratestheatomiclanguageL(a)={a}• e generatesthelanguageL(e)={e}• Æ generatestheemptylanguageL(Æ)={}=Æ
– InductiveCases:ifr1 andr2areregularexpressionssoarer1Èr2,(r1)(r2),(r1)*and(r1)+:• L(r1Èr2)=L(r1)ÈL(r2),sor1Èr2generatestheunion• L((r1)(r2))=L(r1)•L(r2),so(r1)(r2) istheconcatenation• L((r1)*)=L(r1)*, so(r1)* representstheKleene-*• L((r1)+)=L(r1)+,so(r1)+ representstheKleene-+
-
1/61
RegularExpressions:TableofOperationsincludingUNIX
Operation Notation Language UNIX
Union r1Èr2 L(r1)ÈL(r2) r1|r2
Concatenation (r1)(r2) L(r1)•L(r2) (r1)(r2)
Kleene-* (r )* L(r )* (r )*
Kleene-+ (r )+ L(r )+ (r )+
Exponentiation (r )n L(r )n (r ){n}
-
1/62
RegularExpressions:Simplifications
• Justasalgebraicformulascanbesimplifiedbyusinglessparentheseswhentheorderofoperationsisclear,regularexpressionscanbesimplified.Usingthepuredefinitionofregularexpressionstoexpressthelanguage{banana,nab}*wewouldbeforcedtowritesomethingnastylike
((((b)(a))(n))(((a)(n))(a))È(((n)(a))(b)))*
• Usingtheoperatorprecedenceordering*,• ,È andtheassociativityof• allowsustoobtainthesimpler:
(bananaÈnab)*
• Thisisdoneinthesamewayasonewouldsimplifythealgebraicexpressionwithre-orderingdisallowed:
((((b)(a))(n))(((a)(n))(a))+(((n)(a))(b)))4=(banana+nab)4
-
1/63
RegularExpressions:Example
• Question:Findaregularexpressionthatgeneratesthelanguageconsistingofallbit-stringswhichcontainastreakofseven0’sorcontaintwodisjointstreaksofthree1’s.– Legal:010000000011010,01110111001,111111– Illegal:11011010101,10011111001010,00000100000
• Answer:(0È1)*(07È13(0È1)*13)(0È1)*– Anevenbriefervalidansweris:S*(07È13S*13)S*– Theofficial answerusingonlythestandardregularoperationsis:
(0È1)*(0000000È111(0È1)*111)(0È1)*– AbriefUNIXansweris:
(0|1)*(0{7}|1{3}(0|1)*1{3})(0|1)*
-
1/64
RegularExpressions:Examples
1) 0*10*
2) (SS)*
3) 1*Ø
4) S ={0,1},{w|whasatleastone1}
5) S ={0,1},{w|wstartsandendswiththesamesymbol}
6) {w|wisanumericalconstantwithsignand/orfractionalpart}• E.g.3.1415,-.001,+2000
-
1/65
RegularExpressions:Adifferentview…
• Regularexpressionsarejuststrings.Consequently,thesetofallregularexpressionsisasetofstrings,sobydefinitionisalanguage.
• Question:Supposingthatonlyunion,concatenationandKleene-*areconsidered.Whatisthealphabetforthelanguageofregularexpressionsoverthebasealphabet S ?
• Answer:S È {(,),È,*}
-
1/66
REXà NFA
• SinceNFA’sareclosedundertheregularoperationsweimmediatelyget
• Theorem:GivenanyregularexpressionrthereisanNFANwhichsimulatesr.Thatis,thelanguageacceptedbyN ispreciselythelanguagegeneratedbyr sothatL(N)=L(r).Furthermore,theNFAisconstructibleinlineartime.
-
1/67
REXà NFA
• Proof:Theproofworksbyinduction,usingtherecursivedefinitionofregularexpressions.FirstweneedtoshowhowtoacceptthebasecaseregularexpressionsaÎS, e andÆ.ThesearerespectivelyacceptedbytheNFA’s:
• Finally,weneedtoshowhowtoinductivelyacceptregularexpressionsformedbyusingtheregularoperations.Thesearejusttheconstructionsthatwesawbefore,encapsulatedby:
q0 q0q1q0a
-
1/68
REXà NFAexercise:FindNFAfor )* ∪ ) ∗
-
1/69
REXà NFA:Example
• Question:FindanNFAfortheregularexpression(0È1)*(0000000È111(0È1)*111)(0È1)*
ofthepreviousexample.
-
1/70
REXà NFAà FA?!?
• ThefactthatregularexpressionscanbeconvertedintoNFA’smeansthatitmakessensetocallthelanguagesacceptedbyNFA’s“regular.”
• However,theregularlanguagesweredefinedtobethelanguagesacceptedbyFA’s,whicharebydefault,deterministic.ItwouldbeniceifNFA’scouldbe“determinized”andconvertedtoFA’s,forthenthedefinitionof“regular”languages,asbeingFA-acceptedwouldbejustified.
• Let’strythisnext.
-
1/71
NFA’shave3typesofnon-determinism
Nondeterminismtype
MachineAnalog
d -function Easy to fix? Formally
Under-determined Crash No output yes, fail-state |d(q,a)|= 0
Over-determined Random choiceMulti-valued no |d(q,a)|> 1
ePause reading
Redefine alphabet no |d(q,e)|> 0
-
1/72
DeterminizingNFA’s:Example
• Idea:Wemightkeeptrackofallparallelactivestatesastheinputisbeingcalledout.Ifattheendoftheinput,oneoftheactivestateshappenedtobeanacceptstate,theinputwasaccepted.
• Example,considerthefollowingNFA,anditsdeterministicFA.
1
2 3
a
a
e
a,b
b
-
1/73
One-Slide-RecipetoDerandomize
• InsteadofthestatesintheNFA,weconsiderthepower-states intheFA.(IftheNFAhasnstates,theFAhas2n states.)
• Firstwefigureoutwhichpower-stateswillreachwhichpower-statesintheFA.(UsingtherulesoftheNFA.)
• Thenwemustaddallepsilon-edges:Weredirectpointersthatareinitiallypointingtopower-state{a,b,c}topower-state{a,b,c,d,e,f},ifandonlyifthereisanepsilon-edge-only-pathpointingfromanyofthestatesa,b,ctostatesd,e,f(a.k.a.transitiveclosure).Wedotheverysameforthestartingstate:startingstateofFA={startingstateofNFA,allNFAstatesthatcanrecursivelybereachedfromthere}
• Acceptingstates oftheFAareallstatesthatincludeaacceptingNFAstate.
-
1/74
Remarks
• Thepreviousrecipecanbemadetotallyformal.Moredetailscanbefoundinthereadingmaterial.
• JustfollowingtherecipewilloftenproduceatoocomplicatedFA.Sometimesobvioussimplificationscanbemade.Ingeneralhowever,thisisnotaneasytask.
-
1/75
AutomataSimplification
• TheFAcanbesimplified.States{1,2}and{1},forexample,cannotbereached.StilltheresultisnotassimpleastheNFA.
-
Derandomization Exercise
• Exercise:Let’sderandomize thesimplifed two-stateNFAfromslide1/70whichwederivedfromregularexpression )* ∪ ) ∗
1/76
x na b
aa
-
1/77
REXà NFAà FA
• Summary:StartingfromanyNFA,wecanusesubsetconstructionandtheepsilon-transitive-closuretofindanequivalentFAacceptingthesamelanguage.Thus,
• Theorem:IfLisanylanguageacceptedbyanNFA,thenthereexistsaconstructible[deterministic]FAwhichalsoacceptsL.
• Corollary:Theclassofregularlanguagesisclosedundertheregularoperations.
• Proof:SinceNFA’sareclosedunderregularoperations,andFA’sarebydefaultalsoNFA’s,wecanapplytheregularoperationstoanyFA’sanddeterminizeattheendtoobtainanFAacceptingthelanguagedefinedbytheregularoperations.
-
1/78
REXà NFAà FAà REX…
• WeareonestepawayfromshowingthatFA’s» NFA’s» REX’s;i.e.,allthreerepresentationareequivalent.Wewillbedonewhenwecancompletethecircleoftransformations:
FA
NFA
REX
-
1/79
NFAà REXissimple?!?
• ThenFAà REXevensimpler!
• Pleasesolvethissimpleexample:
1
0
0
1
1
11
0
00
-
1/80
REXà NFAà FAà REX…
• InconvertingNFA’stoREX’swe’llintroducethemostgeneralizednotionofanautomaton,thesocalled“GeneralizedNFA”or“GNFA”.InconvertingintoREX’s,we’llfirstgothroughaGNFA:
FA
NFA
REX
GNFA
-
1/81
GNFA’s
• Definition:Ageneralizednondeterministicfiniteautomaton(GNFA) isagraphwhoseedgesarelabeledbyregularexpressions,– withauniquestartstatewithin-degree0,butarrowstoevery
otherstate– andauniqueacceptstatewithout-degree0,butarrowsfrom
everyotherstate(notethatacceptstate¹ startstate)– andanarrowfromanystatetoanyotherstate(includingself).
• AGNFAaccepts astringsifthereexistsapathp fromthestartstatetotheacceptstate suchthatw isanelementofthelanguagegeneratedbytheregularexpressionobtainedbyconcatenatingalllabelsoftheedgesinp.
• Thelanguageaccepted byaGNFAconsistsofalltheacceptedstringsoftheGNFA.
-
1/82
GNFAExample
• ThisisaGNFAbecauseedgesarelabeledbyREX’s,startstatehasnoin-edges,andtheunique acceptstatehasnoout-edges.
• Convinceyourselfthat000000100101100110isaccepted.
b
c
0Èe
000
a
(0110È1001)*
-
1/83
NFAà REXconversionprocess
1. ConstructaGNFAfromtheNFA.A. Iftherearemorethanonearrowsfromonestatetoanother,unify
themusing“È”B. Createauniquestartstatewithin-degree0C. Createauniqueacceptstateofout-degree0D. [Ifthereisnoarrowfromonestatetoanother,insertonewith
labelØ]
2. Loop:AslongastheGNFAhasstrictlymorethan2states:Ripoutarbitraryinteriorstateandmodifyedgelabels.
3. Theansweristheuniquelabelr.
acceptstartr
-
1/84
NFAà REX:RippingOut.
• Rippingoutisdoneasfollows.Ifyouwanttoripthemiddlestatevout(forallpairsofneighborsu,w)…
• …thenyou’llneedtorecreateallthelostpossibilitiesfromutow.I.e.,tothecurrentREXlabelr4oftheedge(u,w)youshouldaddtheconcatenationofthe(u,v )labelr1followedbythe(v,v )-looplabelr2repeatedarbitrarily,followedbythe(v,w )labelr3..Thenew(u,w)substitutewouldthereforebe:
v wur3
r2
r1
r4
wur4 È r1 (r2)*r3
-
1/85
FAà REX:Example
-
1/86
FAà REX:Exercise
-
1/87
Summary:FA≈ NFA≈ REX
• Thiscompletesthedemonstrationthatthethreemethodsofdescribingregularlanguagesare:
1. DeterministicFA’s2. NFA’s3. RegularExpressions
• Wehavelearntthatalltheseareequivalent.
-
1/88
RemarkaboutAutomatonSize
• Creatinganautomatonofsmallsizeisoftenadvantageous.– Allowsforsimpler/cheaperhardware,orbetterexamgrades.– Designing/Minimizingautomataisthereforeafunnysport.Example:
a
b
1
d
0,1
e
0,1
1
c
0,1
gf
0
0
0
01
1
-
1/89
Minimization
• Definition:Anautomatonisirreducible if– itcontainsnouselessstates,and– notwodistinctstatesareequivalent.
• Byjustfollowingthesetworules,youcanarriveatan“irreducible”FA.Generally,suchalocalminimumdoesnothavetobeaglobalminimum.
• Itcanbeshownhowever,thattheseminimizationrulesactuallyproducetheglobalminimumautomaton.
• Theideaisthattwoprefixesu,v areindistinguishableiff forallsuffixesx,ux Î Liff vx Î L.Ifuandvaredistinguishable,theycannotendupinthesamestate.Thereforethenumberofstatesmustbeatleastasmanyasthenumberofpairwisedistinguishableprefixes.
-
1/91
Pigeonholeprinciple
• ConsiderlanguageL,whichcontainswordw Î L.• ConsideranFAwhichacceptsL,withn<|w|states.• Then,whenacceptingw,theFAmustvisitatleastonestatetwice.
• Thisisaccordingtothepigeonhole(a.k.a.Dirichlet)principle:– Ifm>npigeonsareputintonpigeonholes,there'saholewith
morethanonepigeon.– That’saprettyfancynameforaboringobservation...
-
1/92
Languageswithunboundedstrings
• Consequently,regularlanguageswithunboundedstringscanonlyberecognizedbyFA(finite!bounded!)automataiftheselongstringsloop.
• TheFAcanenterthelooponce,twice,…,andnotatall.• Thatis,languageLcontainsall {xz,xyz,xy2z,xy3z,…}.
-
1/93
PumpingLemma
• Theorem:GivenaregularlanguageL,thereisanumberp (calledthepumpingnumber)suchthatanystringinLoflength³ pispumpablewithinitsfirstp letters.
• Inotherwords,forallu Î Lwith|u |³ pwecanwrite:– u=xyz (xisaprefix,zisasuffix)– |y|³ 1 (mid-portionyisnon-empty)– |xy|£ p (pumpingoccursinfirstpletters)– xyizÎ Lforalli ³ 0(canpumpy-portion)
• If,ontheotherhand,thereisnosuchp,thenthelanguageisnotregular.
-
1/94
PumpingLemmaExample
• LetLbethelanguage{0n1n |n³ 0}
• Assume(forthesakeofcontradiction)thatLisregular• Letp bethepumpinglength.Letu bethestring0p1p.• Let’scheckstringuagainstthepumpinglemma:
• “Inotherwords,forallu Î Lwith|u |³ pwecanwrite:– u=xyz (xisaprefix,zisasuffix)– |y|³ 1 (mid-portionyisnon-empty)– |xy|£ p (pumpingoccursinfirstpletters)– xyiz Î Lforalli ³ 0(canpumpy-portion)”
à y = 0+
à Then, xz or xyyz is not in L. Contradiction!
-
1/95
Let’smaketheexampleabitharder…
• LetLbethelanguage{w|whasanequalnumberof0sand1s}
• Assume(forthesakeofcontradiction)thatLisregular• Letp bethepumpinglength.Letu bethestring0p1p.• Let’scheckstringuagainstthepumpinglemma:
• “Inotherwords,forallu Î Lwith|u |³ pwecanwrite:– u=xyz (xisaprefix,zisasuffix)– |y|³ 1 (mid-portionyisnon-empty)– |xy|£ p (pumpingoccursinfirstpletters)– xyiz Î Lforalli ³ 0(canpumpy-portion)”
-
1/96
Harderexamplecontinued
• Again,ymustconsistof0sonly!• Pumpitthere!Clearlyagain,ifxyzÎ L,thenxz orxyyz arenotinL.
• There’sanotheralternativeproofforthisexample:– 0*1*isregular.– Ç isaregularoperation.– IfLregular,thenLÇ 0*1*isalsoregular.– However,LÇ 0*1*isthelanguagewestudiedinthepreviousexample(0n1n).
Acontradiction.
-
1/97
Nowyoutry…
• Is-. = 00 0 ∈ 0 ∪ 1 ∗} regular?
• Is-6 = 17 8beingaprimenumber} regular?