Administrivia CS388: Natural Language Processing Lecture ...gdurrett/courses/fa2019/lectures/lec12...
CS388: Natural Language Processing
Greg Durrett
Lecture 12: Dependency I (dependency syntax, coordination)
Administrivia
‣ Project 1 graded, discussion at end of lecture
‣ Mini 2 due tonight
‣ Final project proposals due next Tuesday
Recall: Constituency
‣ Tree-structured syntactic analyses of sentences
‣ Nonterminals (NP, VP, etc.) as well as POS tags (bottom layer)
‣ Structure is defined by a CFG
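Recall: CKY (Cocke-Kasami-Younger)
‣ Find $\arg\max_T P(T \mid x) = \arg\max_T P(T, x)$
‣ Dynamic programming: chart maintains the best way of building symbol X over span (i, j)
‣ Loop over all split points k, apply rules X -> Y Z to build X in every possible way
[Figure: parse of "He wrote a long report on Mars" showing NP and PP constituents; chart diagram of X over span (i, j) with split point k and children Y, Z]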
Recall: Top-down Parsing
‣ Dynamic programming version: the best way of building span (i, j) involves maxing over the split point and a single label
‣ Greedy top-down version: at each stage, predict split point k and label l
‣ Can score split points and also labels
Outline
‣ Dependency representation, contrast with constituency
‣ Projectivity
‣ Graph-based dependency parsers
Dependency Representation
Lexicalized Parsing
[Figure: lexicalized constituency tree for "the dog ran to the house" with nodes S(ran), NP(dog), VP(ran), PP(to), NP(house) over the preterminals DT(the) NN(dog) VBD(ran) TO(to) DT(the) NN(house)]
Dependency Parsing
[Figure: dependency arcs over "the dog ran to the house" (POS tags DT NN VBD TO DT NN), with a ROOT symbol]
‣ Dependency syntax: syntactic structure is defined by these arcs
‣ Head (parent, governor) connected to dependent (child, modifier)
‣ Each word has exactly one parent except for the ROOT symbol; dependencies must form a directed acyclic graph
‣ POS tags same as before; usually run a tagger first as preprocessing
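One common way to store such a structure is a list of parent indices, one per word. The sketch below assumes a hypothetical encoding (1-based word positions, 0 for ROOT) and a hypothetical helper `is_valid_tree`; it checks that every word reaches ROOT without hitting a cycle.

```python
# Assumed encoding (not from the slides): heads[i - 1] is the parent of word i,
# words are numbered from 1, and 0 stands for the ROOT symbol.
words = ["the", "dog", "ran", "to", "the", "house"]
heads = [2, 3, 0, 3, 6, 4]  # the->dog, dog->ran, ran->ROOT, to->ran, the->house, house->to


def is_valid_tree(heads):
    """Return True if every word reaches ROOT (0) by following parents, with no cycle."""
    n = len(heads)
    for i in range(1, n + 1):
        seen = set()
        node = i
        while node != 0:
            parent = heads[node - 1]
            if node in seen or not (0 <= parent <= n):
                return False  # cycle, or parent index out of range
            seen.add(node)
            node = parent
    return True
```

Storing exactly one parent per word enforces the "one parent" constraint by construction, so only acyclicity has to be checked.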
Dependency Parsing
[Figure: dependency tree for "the dog ran to the house" with POS tags DT NN VBD TO DT NN]
‣ Still a notion of hierarchy! Subtrees often align with constituents
Dependency Parsing
[Figure: arcs over "the dog ran to the house" labeled det, nsubj, prep, pobj, det]
‣ Can label dependencies according to syntactic function
‣ Major source of ambiguity is in the structure, so we focus on that more (labeling separately with a classifier works pretty well)
Dependency vs. Constituency: PP Attachment
‣ Constituency: several rule productions need to change
the children ate the cake with a spoon
‣ Dependency: one word (with) assigned a different parent
Dependency vs. Constituency: PP Attachment
‣ More predicate-argument focused view of syntax
‣ "What's the main verb of the sentence? What is its subject and object?" is easier to answer under dependency parsing
Dependency vs. Constituency: Coordination
‣ Constituency: ternary rule NP -> NP CC NP
dogs in houses and cats
‣ Dependency: first item is the head
Dependency vs. Constituency: Coordination
dogs in houses and cats
[dogs in houses] and cats    dogs in [houses and cats]
‣ Coordination is decomposed across a few arcs as opposed to being a single rule production as in constituency
‣ Can also choose "and" to be the head
‣ In both cases, the head word doesn't really represent the phrase; the constituency representation makes more sense
Stanford Dependencies
‣ Designed to be practically useful for relation extraction
[Figure: Standard vs. Collapsed dependencies for "Bills on ports and immigration were submitted by Senator Brownback, Republican of Kansas"]
Dependency vs. Constituency
‣ Dependency is often more useful in practice (models predicate-argument structure)
‣ Slightly different representational choices:
‣ PP attachment is better modeled under dependency
‣ Coordination is better modeled under constituency
‣ Dependency parsers are easier to build: no "grammar engineering", no unaries, easier to get structured discriminative models working well
‣ Dependency parsers are usually faster
‣ Dependencies are more universal cross-lingually
Universal Dependencies
‣ Annotate dependencies with the same representation in many languages
http://universaldependencies.org/
[Figure: example trees labeled English, Bulgarian, Czech, Swiss]
Projectivity
[Figure: dependency tree for "the dog ran to the house" with POS tags DT NN VBD TO DT NN]
‣ Any subtree is a contiguous span of the sentence <-> tree is projective
Projectivity
‣ Projective <-> no "crossing" arcs
dogs in houses and cats    the dog ran to the house
‣ Crossing arcs:
[Figure: example sentence with crossing arcs; credit: Language Log]
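The "no crossing arcs" condition can be checked directly. This is a minimal sketch assuming the tree is stored as a list of parent indices (1-based words, 0 for ROOT, with ROOT placed to the left of the sentence); `is_projective` is a hypothetical helper name.

```python
def is_projective(heads):
    """heads[i - 1] is the parent of word i; 0 is ROOT, positioned left of word 1."""
    # Represent each arc by its (left endpoint, right endpoint).
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for a, b in arcs:
        for c, d in arcs:
            # Arcs cross when exactly one endpoint of (c, d) lies strictly inside (a, b).
            if a < c < b < d:
                return False
    return True
```

For example, `is_projective([2, 3, 0, 3, 6, 4])` (the parse of "the dog ran to the house") finds no crossing pair, while a tree with arcs 3->1 and 4->2 does contain one. The double loop is quadratic in sentence length, which is fine for a check but is not how parsers enforce projectivity.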
Projectivity in other languages
‣ Swiss German example
[Figure: nonprojective Swiss German example; credit: Pitler et al. (2013)]
‣ (Swiss German also has famous non-context-free constructions)
Projectivity
‣ Many trees in other languages are nonprojective
‣ Number of trees producible under different formalisms
‣ Some other formalisms (that are harder to parse in); the most useful one is 1-Endpoint-Crossing
Pitler et al. (2013)
Graph-Based Parsing
Defining Dependency Graphs
‣ Words in sentence x, tree T is a collection of directed edges (parent(i), i) for each word i
‣ Parsing = identify parent(i) for each word
‣ Each word has exactly one parent. Edges must form a projective tree
‣ Log-linear CRF (discriminative):
$$P(T \mid x) \propto \exp\left( \sum_i w^\top f(i, \mathrm{parent}(i), x) \right)$$
‣ Example of a feature = I[head=to & modifier=house] (more in a few slides)
[Figure: arcs over "ROOT the dog ran to the house"]
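Generalizing CKY
[Figure: "wrote a long report on Mars"; an item over span (2, 5) headed by 4 = report combines with an item over span (5, 7) headed by 5 = on to give an item over span (2, 7) headed by 4]
‣ DP chart with three dimensions: start, end, and head, start <= head < end
‣ newscore = chart(2, 5, 4) + chart(5, 7, 5) + edgescore(4 -> 5)
‣ score(2, 7, 4) = max(score(2, 7, 4), newscore)
‣ Time complexity of this? (O(n^5): O(n^2) spans, O(n) split points, and O(n^2) pairs of heads)
‣ Many spurious derivations: can build the same tree in many ways… need a better algorithm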
Eisner's Algorithm: O(n^3)
[Figure: "the dog ran to the house" with POS tags and ROOT]
‣ Cubic-time algorithm
‣ Maintain two dynamic programming charts with dimension [n, n, 2]:
‣ Complete items: head is at "tall end", may be missing children on tall side
‣ Incomplete items: arc from "tall" to "short" end, word on short end may also be missing children
Eisner's Algorithm: O(n^3)
‣ Complete item: all children are attached, head is at the "tall end"
‣ Incomplete item: arc from "tall end" to "short end", may still expect children
‣ Take two adjacent complete items, add arc and build incomplete item
‣ Take an incomplete item, complete it (other case is symmetric)
[Figure: item-combination diagrams over "the dog ran to the house": complete + complete = incomplete (in either direction); incomplete + complete = complete]
Eisner's Algorithm: O(n^3)
[Figure: step-by-step derivation over "the dog ran to the house" with ROOT]
1) Build incomplete span
2) Promote to complete
3) Build incomplete span
4) Promote to complete
Eisner's Algorithm: O(n^3)
[Figure: "the dog ran to the house" with POS tags and ROOT]
‣ We've built left children and right children of ran as complete items
‣ Attaching to ROOT makes an incomplete item with left children, attaches with right children subsequently to finish the parse
Eisner's Algorithm
[Figure: chart items over "ROOT the dog ran to the house", labeled Right complete, Left complete, Right incomplete, Left incomplete]
Eisner's Algorithm
‣ Eisner's algorithm doesn't have split point ambiguities like CKY does
‣ Left and right children are built independently, heads are edges of spans
‣ Charts are n x n x 2 because we need to track arc direction / left vs right
[Figure: chart item diagrams, labeled "Eisner:" and "n^5"]
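The item combinations described on these slides can be sketched in code. What follows is one possible first-order implementation under stated assumptions, not the lecture's reference code: `scores[h][m]` is a hypothetical arc-score matrix with index 0 as ROOT, direction 1 means the head is at the left end of the span, direction 0 means it is at the right end, and ROOT is allowed to take more than one child.

```python
NEG = float("-inf")


def eisner(scores):
    """Best projective tree under arc scores; returns (heads, score), heads[0] = -1."""
    n = len(scores)
    # comp[s][t][d] / inc[s][t][d]: best complete / incomplete item over span (s, t).
    comp = [[[0.0 if s == t else NEG] * 2 for t in range(n)] for s in range(n)]
    inc = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]
    bp_comp = [[[None, None] for _ in range(n)] for _ in range(n)]
    bp_inc = [[None] * n for _ in range(n)]

    for width in range(1, n):
        for s in range(n - width):
            t = s + width
            # Two adjacent complete items plus a new arc give an incomplete item.
            best, best_r = NEG, None
            for r in range(s, t):
                val = comp[s][r][1] + comp[r + 1][t][0]
                if val > best:
                    best, best_r = val, r
            bp_inc[s][t] = best_r
            inc[s][t][1] = best + scores[s][t]      # arc s -> t
            if s > 0:                               # ROOT never receives a parent
                inc[s][t][0] = best + scores[t][s]  # arc t -> s
            # An incomplete item plus a complete item give a complete item.
            for r in range(s, t):                   # head at the right end t
                val = comp[s][r][0] + inc[r][t][0]
                if val > comp[s][t][0]:
                    comp[s][t][0], bp_comp[s][t][0] = val, r
            for r in range(s + 1, t + 1):           # head at the left end s
                val = inc[s][r][1] + comp[r][t][1]
                if val > comp[s][t][1]:
                    comp[s][t][1], bp_comp[s][t][1] = val, r

    heads = [-1] * n

    def backtrack(s, t, d, complete):
        """Follow backpointers and record the parent chosen for each word."""
        if s == t:
            return
        if not complete:
            if d == 1:
                heads[t] = s
            else:
                heads[s] = t
            r = bp_inc[s][t]
            backtrack(s, r, 1, True)
            backtrack(r + 1, t, 0, True)
        elif d == 1:
            r = bp_comp[s][t][1]
            backtrack(s, r, 1, False)
            backtrack(r, t, 1, True)
        else:
            r = bp_comp[s][t][0]
            backtrack(s, r, 0, True)
            backtrack(r, t, 0, False)

    backtrack(0, n - 1, 1, True)
    return heads, comp[0][n - 1][1]
```

The three nested loops over width, start, and split point give the cubic running time. Replacing max with log-sum-exp in the same recurrences would yield the partition function over projective trees rather than the best tree.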
Building Systems
‣ Can implement decoding and marginal computation using Eisner's algorithm to max/sum over projective trees
‣ Conceptually the same as inference/learning for sequential CRFs for NER, can also use margin-based methods
Features in Graph-Based Parsing
‣ Dynamic program exposes the parent and child indices
‣ McDonald et al. (2005): conjunctions of parent and child words + POS, POS of words in between, POS of surrounding words
f(i, parent(i), x)
[Figure: "the dog ran to the house" with POS tags and ROOT]
‣ HEAD=TO&MOD=NN
‣ HEAD=TO&MOD-1=the
‣ HEAD=TO&MOD=house
‣ ARC_CROSSES=DT
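As an illustration of how f(i, parent(i), x) might be computed, the sketch below generates the four example features listed on this slide for a single arc. The function names, the 0-based indexing, and the weight dictionary are assumptions made for the example; real feature sets in the style of McDonald et al. (2005) are much larger.

```python
def arc_features(head, mod, words, tags):
    """Indicator feature names for the arc head -> mod (0-based positions)."""
    feats = [
        f"HEAD={tags[head]}&MOD={tags[mod]}",   # POS of parent and child
        f"HEAD={tags[head]}&MOD={words[mod]}",  # POS of parent, word of child
    ]
    if mod > 0:
        # Word immediately to the left of the child.
        feats.append(f"HEAD={tags[head]}&MOD-1={words[mod - 1]}")
    lo, hi = min(head, mod), max(head, mod)
    # POS tags of the words the arc passes over.
    feats += [f"ARC_CROSSES={tags[k]}" for k in range(lo + 1, hi)]
    return feats


def arc_score(weights, feats):
    """w^T f for sparse indicator features; weights maps feature name -> float."""
    return sum(weights.get(name, 0.0) for name in feats)


words = ["the", "dog", "ran", "to", "the", "house"]
tags = ["DT", "NN", "VBD", "TO", "DT", "NN"]
to_house = arc_features(3, 5, words, tags)  # arc from "to" to "house"
```

Because each feature only looks at one arc and the sentence, these scores can be precomputed for every (head, modifier) pair and handed to the dynamic program.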
Higher-Order Parsing
‣ Track additional state during parsing so we can look at "grandparents" (and siblings). O(n^4) dynamic program or use approximate search
f(i, parent(i), parent(parent(i)), x)
[Figure: "the dog ran to the house" with POS tags and ROOT]
Koo and Collins (2009)
Biaffine Neural Parsing
‣ Neural CRFs for dependency parsing: let c = LSTM embedding of i, p = LSTM embedding of parent(i). score(i, parent(i), x) = $p^\top U c$
‣ LSTM looks at words and POS
[Figure: embedding matrix (num words x hidden size) mapped to a score matrix (num words x num words)]
Dozat and Manning (2017)
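The bilinear score p^T U c can be computed for all parent-child pairs at once. The sketch below uses plain Python lists so it runs without dependencies; `biaffine_scores` and its argument names are hypothetical, the embeddings are assumed to be given, and the full model of Dozat and Manning (2017) has additional components that are not shown here.

```python
def biaffine_scores(parent_embs, child_embs, U):
    """Return S with S[p][c] = parent_embs[p]^T U child_embs[c].

    parent_embs, child_embs: one vector (list of floats) per word.
    U: square matrix (list of rows) matching the hidden size.
    """
    hidden = len(U)
    # First compute parent_embs @ U, giving one row per candidate parent.
    pu = [
        [sum(p[j] * U[j][k] for j in range(hidden)) for k in range(hidden)]
        for p in parent_embs
    ]
    # Then take dot products with every child embedding.
    return [
        [sum(row[k] * c[k] for k in range(hidden)) for c in child_embs]
        for row in pu
    ]
```

With (num words x hidden size) embeddings the result is a (num words x num words) matrix, one score per candidate arc. In practice this is a pair of matrix multiplications in a tensor library.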
Evaluating Dependency Parsing
‣ UAS: unlabeled attachment score. Accuracy of choosing each word's parent (n decisions per sentence)
‣ LAS: additionally consider label for each edge
‣ Log-linear CRF parser, decoding with Eisner algorithm: 91 UAS
‣ Higher-order features from Koo parser: 93 UAS
‣ Best English results with neural CRFs (Dozat and Manning): 95-96 UAS
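Both metrics are per-word accuracies. A minimal sketch for a single sentence, assuming gold and predicted parents and labels are given as parallel lists (the function name and the example labels are illustrative only):

```python
def attachment_scores(gold_heads, pred_heads, gold_labels, pred_labels):
    """Return (UAS, LAS) as fractions of words in one sentence."""
    n = len(gold_heads)
    # UAS: the predicted parent matches the gold parent.
    uas_hits = sum(g == p for g, p in zip(gold_heads, pred_heads))
    # LAS: the parent and the edge label both match.
    las_hits = sum(
        gh == ph and gl == pl
        for gh, ph, gl, pl in zip(gold_heads, pred_heads, gold_labels, pred_labels)
    )
    return uas_hits / n, las_hits / n


gold_heads = [2, 3, 0, 3, 6, 4]
pred_heads = [2, 3, 0, 3, 6, 3]  # "house" attached to "ran" instead of "to"
gold_labels = ["det", "nsubj", "root", "prep", "det", "pobj"]
pred_labels = ["det", "nsubj", "root", "pobj", "det", "pobj"]  # "to" mislabeled
uas, las = attachment_scores(gold_heads, pred_heads, gold_labels, pred_labels)
```

Corpus-level scores are usually computed by pooling the per-word decisions over all sentences rather than averaging per-sentence scores.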
HPSG
‣ Head-driven phrase structure grammar (HPSG): very complex grammar formalism which annotates large feature structures over the tree
‣ Very little work on HPSG in NLP
Pollard and Sag (1994), Zhou and Zhao (2019)
Parsing with "HPSG"
‣ Joint model of constituency and dependency combining ideas from Dozat + Manning and Stern et al.
Zhou and Zhao (2019)
Parsing with "HPSG"
‣ Slightly stronger results than Dozat + Manning, significantly better results on Chinese
Zhou and Zhao (2019)
Takeaways
‣ Dependency parsing also has efficient dynamic programs for inference
‣ Dependency formalism provides an alternative to constituency, particularly useful in how portable it is across languages
‣ CRFs + neural CRFs (again) work well
Proj 1 Results
Jiaming Chen: 82.46 F1
Po-Yi Chen: 82.02 F1
Ting-Yu Yen: 81.57 F1
All others < 81
‣ Word pair features, larger window for POS tag extraction ([-2, 2])
‣ Also larger window and data shuffling in between epochs
‣ City gazetteer, generic date recognizer
Prakhar Singh: 81.54 F1
‣ Unregularized Adagrad worked best