TTIC 31190: Natural Language Processing
Kevin Gimpel
Winter 2016
Lecture 1: Introduction

What is natural language processing?
An experimental computer science research area that includes problems and solutions pertaining to the understanding of human language.

Text Classification
• spam / not spam
• priority level
• category (primary / social / promotions / updates)

Sentiment Analysis

Machine Translation
New Poll: Will you buy an Apple Watch?

Question Answering

Summarization
The Apple Watch has drawbacks. There are other smartwatches that offer more capabilities.

Dialog Systems
user: Schedule a meeting with Matt and David on Thursday.
computer: Thursday won’t work for David. How about Friday?
user: I’d prefer Monday then, but Friday would be ok if necessary.

Part-of-Speech Tagging

| Some | questioned | if | Tim | Cook | ’s | first | product |
| --- | --- | --- | --- | --- | --- | --- | --- |
| determiner | verb (past) | prep. | proper noun | proper noun | poss. | adj. | noun |

| would | be | a | breakaway | hit | for | Apple | . |
| --- | --- | --- | --- | --- | --- | --- | --- |
| modal | verb | det. | adjective | noun | prep. | proper noun | punc. |

Named Entity Recognition
Some questioned if [Tim Cook: PERSON]’s first product would be a breakaway hit for [Apple: ORGANIZATION].

Syntactic Parsing

Coreference Resolution and Entity Linking
Example: “Revenues of $14.5 billion were posted by Dell(1). The company(1) ...” (matching indices mark coreferent mentions)
• en.wikipedia.org/wiki/Dell (infobox type: company): ORGANIZATION
• en.wikipedia.org/wiki/Michael_Dell (infobox type: person): PERSON
Figure 1: Coreference can help resolve ambiguous cases of semantic types or entity links: propagating information across coreference arcs can inform us that, in this context, Dell is an organization and should therefore link to the article on Dell in Wikipedia.
figure credit: Durrett & Klein (2014)

“Winograd Schema” Coreference Resolution
The man couldn't lift his son because he was so weak. (he = man)
The man couldn't lift his son because he was so heavy. (he = son)

Reading Comprehension
Once there was a boy named Fritz who loved to draw. He drew everything. In the morning, he drew a picture of his cereal with milk. His papa said, “Don’t draw your cereal. Eat it!” After school, Fritz drew a picture of his bicycle. His uncle said, “Don't draw your bicycle. Ride it!” …
What did Fritz draw first?
A) the toothpaste
B) his mama
C) cereal and milk
D) his bicycle

Conspicuous by their absence…
• speech recognition (see TTIC 31110)
• information retrieval and web search
• knowledge representation
• recommender systems

Computational Linguistics vs. Natural Language Processing
• how do they differ?

Computational Biology vs. Bioinformatics
“Computational biology = the study of biology using computational techniques. The goal is to learn new biology, knowledge about living systems. It is about science.
Bioinformatics = the creation of tools (algorithms, databases) that solve problems. The goal is to build useful tools that work on biological data. It is about engineering.”
-- Russ Altman

Computational Linguistics vs. Natural Language Processing
• many people think of the two terms as synonyms
• computational linguistics is more inclusive; more likely to include sociolinguistics, cognitive linguistics, and computational social science
• NLP is more likely to use machine learning and involve engineering / system-building

Is NLP Science or Engineering?
• goal of NLP is to develop technology, which takes the form of engineering
• though we try to solve today’s problems, we seek principles that will be useful for the future
• if science, it’s not linguistics or cognitive science; it’s the science of computational processing of language
• so I like to think that we’re doing the science of engineering

Course Overview
• New course, first time being offered
• Aimed at first-year PhD students
• Instructor office hours: Mondays 3-4pm, TTIC 531
• Teaching assistant: Lifu Tu, TTIC PhD student

Prerequisites
• No course prerequisites, but I will assume:
– some programming experience (no specific language required)
– familiarity with basics of probability, calculus, and linear algebra
• Undergraduates with relevant background are welcome to take the course. Please bring an enrollment approval form to me if you can’t enroll online.

Grading
• 3 assignments (15% each)
• midterm exam (15%)
• course project (35%):
– preliminary report and meeting with instructor (10%)
– class presentation (5%)
– final report (20%)
• class participation (5%)
• no final

Assignments
• Mixture of formal exercises, implementation, experimentation, analysis
• “Choose your own adventure” component based on your interests, e.g.:
– exploratory data analysis
– machine learning
– implementation / scalability
– model and error analysis
– visualization

Project
• Replicate [part of] a published NLP paper, or define your own project.
• The project may be done individually or in a group of two. Each group member will receive the same grade.
• More details to come.

Collaboration Policy
• You are welcome to discuss assignments with others in the course, but solutions and code must be written individually

Textbooks
• All are optional
• Speech and Language Processing, 2nd Ed.
– some chapters of 3rd edition are online
• The Analysis of Data, Volume 1: Probability
– freely available online
• Introduction to Information Retrieval
– freely available online

Roadmap
• classification
• words
• lexical semantics
• language modeling
• sequence labeling
• syntax and syntactic parsing
• neural network methods in NLP
• semantic compositionality
• semantic parsing
• unsupervised learning
• machine translation and other applications

Why is NLP hard?
• ambiguity and variability of linguistic expression:
– variability: many forms can mean the same thing
– ambiguity: one form can mean many things
• there are many different kinds of ambiguity
• each NLP task has to address a distinct set of kinds

Word Sense Ambiguity
• many words have multiple meanings
(examples shown as images; credit: A. Zwicky)

Attachment Ambiguity

Meaning Ambiguity

Text Classification
• simplest user-facing NLP application
• email (spam, priority, categories)
• sentiment
• topic classification
• others?

What is a classifier?
• a function from inputs x to classification labels y
• one simple type of classifier:
– for any input x, assign a score to each label y, parameterized by vector $\boldsymbol{\theta}$: $\mathrm{score}(x, y, \boldsymbol{\theta})$
– classify by choosing the highest-scoring label: $\mathrm{classify}(x, \boldsymbol{\theta}) = \operatorname*{argmax}_{y} \mathrm{score}(x, y, \boldsymbol{\theta})$
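
The classifier described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the names `classify`, `toy_score`, and `theta`, the label set, and the toy scoring rule are all assumptions introduced here.

```python
from typing import Callable, Dict, Sequence

# Hypothetical parameter type: a mapping from feature names to weights.
Params = Dict[str, float]


def classify(x: str, labels: Sequence[str],
             score: Callable[[str, str, Params], float],
             theta: Params) -> str:
    """Return the label y with the highest score(x, y, theta)."""
    return max(labels, key=lambda y: score(x, y, theta))


def toy_score(x: str, y: str, theta: Params) -> float:
    """Toy scoring rule (an assumption): one weight per (word, label) pair."""
    return sum(theta.get(f"{word}|{y}", 0.0) for word in x.lower().split())


# Made-up weights, only to exercise the code.
theta = {"free|spam": 2.0, "meeting|not spam": 1.5}
labels = ["spam", "not spam"]
print(classify("Free prizes inside", labels, toy_score, theta))
```

Keeping `score` as a parameter of `classify` separates the choice of scoring function from the search for the best label.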

Course Philosophy
• From reading papers, one gets the idea that machine learning concepts are monolithic, opaque objects
– e.g., naïve Bayes, logistic regression, SVMs, CRFs, neural networks, LSTMs, etc.
• Nothing is opaque
• Everything can be dissected, which reveals connections
• The names above are useful shorthand, but not useful for gaining understanding
• We will draw from machine learning, linguistics, and algorithms, but technical material will be (mostly) self-contained; we won’t use many black boxes
• We will focus on declarative (rather than procedural) specifications, because they highlight connections and differences

Modeling, Inference, Learning
$$\mathrm{classify}(x, \boldsymbol{\theta}) = \operatorname*{argmax}_{y} \mathrm{score}(x, y, \boldsymbol{\theta})$$
• Modeling: How do we assign a score to an (x, y) pair using parameters $\boldsymbol{\theta}$? (modeling: define the score function)
• Inference: How do we efficiently search over the space of all labels? (inference: solve the argmax)
• Learning: How do we choose $\boldsymbol{\theta}$? (learning: choose $\boldsymbol{\theta}$)
• We will use this same paradigm throughout the course, even when the output space size is exponential in the size of the input or is unbounded (e.g., machine translation)
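
To make the remark about exponentially large output spaces concrete, here is a small sketch of brute-force inference for a sequence labeling problem. Everything in it is hypothetical and chosen for illustration only: the tag set `TAGS`, the weights in `theta`, and the per-word scoring rule. It enumerates every candidate tag sequence, which is exactly what stops being feasible as inputs grow.

```python
from itertools import product
from typing import Dict, Sequence, Tuple

# Hypothetical tag set and weights, chosen only for illustration.
TAGS = ["noun", "verb", "det."]
theta: Dict[Tuple[str, str], float] = {
    ("a", "det."): 1.0,
    ("hit", "noun"): 1.0,
    ("hit", "verb"): 0.5,
}


def score(x: Sequence[str], y: Sequence[str],
          theta: Dict[Tuple[str, str], float]) -> float:
    """Modeling: sum one weight per (word, tag) pair."""
    return sum(theta.get((word, tag), 0.0) for word, tag in zip(x, y))


def brute_force_inference(x: Sequence[str]) -> Tuple[str, ...]:
    """Inference: enumerate every tag sequence and keep the best one.

    There are len(TAGS) ** len(x) candidates, so this only works for tiny inputs.
    """
    return max(product(TAGS, repeat=len(x)), key=lambda y: score(x, y, theta))


x = ["a", "hit"]
print(len(TAGS) ** len(x))        # number of candidate tag sequences
print(brute_force_inference(x))   # best-scoring tag sequence under theta
```

With 3 tags the candidate count triples for each added word, which is why later topics in the roadmap need inference methods smarter than enumeration.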

Notation
• We’ll use boldface for vectors: $\boldsymbol{\theta}$
• Individual entries will use subscripts and no boldface, e.g., for entry i: $\theta_i$

Modeling: Linear Models
• Score function is linear in $\boldsymbol{\theta}$: $\mathrm{score}(x, y, \boldsymbol{\theta}) = \boldsymbol{\theta}^{\top} \mathbf{f}(x, y) = \sum_i \theta_i f_i(x, y)$
• $\mathbf{f}$: feature function vector
• $\boldsymbol{\theta}$: weight vector
• How do we define $\mathbf{f}$?
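
A linear score of this kind is a dot product between a weight vector and a feature vector. The sketch below assumes a two-entry feature function built from the made-up cue words "good" and "bad"; the function names and the weights are hypothetical and only illustrate the shape of the computation.

```python
from typing import List


def features(x: List[str], y: str) -> List[float]:
    """Hypothetical feature function vector f(x, y) with two entries."""
    return [
        float(y == "positive" and "good" in x),  # entry 1
        float(y == "negative" and "bad" in x),   # entry 2
    ]


def score(x: List[str], y: str, theta: List[float]) -> float:
    """Linear score: the sum over i of theta[i] * f_i(x, y)."""
    return sum(t * f for t, f in zip(theta, features(x, y)))


x = ["a", "good", "film"]
theta = [1.5, 0.5]  # made-up weights
print(score(x, "positive", theta))
print(score(x, "negative", theta))

# Linearity in theta: doubling every weight doubles the score.
doubled = [2 * t for t in theta]
print(score(x, "positive", doubled) == 2 * score(x, "positive", theta))
```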
54
![Page 55: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/55.jpg)
DefiningFeatures• ThisisalargepartofNLP• Last20years:featureengineering• Last2years:representationlearning
• Inthiscourse,wewilldoboth• Learningrepresentationsdoesn’tmeanthatwedon’thavetolookatthedataortheoutput!
• There’sstillplentyofengineeringrequiredinrepresentationlearning
55
![Page 56: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/56.jpg)
DefiningFeatures• ThisisalargepartofNLP• Last20years:featureengineering• Last2years:representationlearning
• Inthiscourse,we’lldoboth• Learningrepresentationsdoesn’tmeanthatwedon’thavetolookatthedataortheoutput!
• There’sstillplentyofengineeringrequiredinrepresentationlearning
56
![Page 57: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/57.jpg)
FeatureEngineering• Oftendecriedas“costly,hand-crafted,expensive,domain-specific”,etc.
• Butinpractice,simplefeaturestypicallygivethebulkoftheperformance
• Let’sgetconcrete:howshouldwedefinefeaturesfortextclassification?
57
![Page 58: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/58.jpg)
FeatureEngineeringforTextClassification
58
![Page 59: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/59.jpg)
FeatureEngineeringforTextClassification
59
isnowavectorbecauseitisasequenceofwords
![Page 60: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/60.jpg)
FeatureEngineeringforTextClassification
60
isnowavectorbecauseitisasequenceofwords
let’sconsidersentimentanalysis:
![Page 61: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/61.jpg)
FeatureEngineeringforTextClassification
61
isnowavectorbecauseitisasequenceofwords
so,hereisoursentimentclassifierthatusesalinearmodel:
let’sconsidersentimentanalysis:
![Page 62: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/62.jpg)
Feature Engineering for Text Classification
• Two features:
• What should the weights be?
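The feature definitions on this slide were lost in the transcript. Since the slides that follow evaluate the features on sentences containing “great”, one plausible reconstruction (an assumption, not verified slide content) pairs that word with each label using indicator functions. The names $f_1$, $f_2$, and $\mathbb{I}$ are assumed notation.

```latex
\[
f_1(x, y) = \mathbb{I}[\,y = \text{positive}\,] \cdot \mathbb{I}[\,x \text{ contains ``great''}\,]
\]
\[
f_2(x, y) = \mathbb{I}[\,y = \text{negative}\,] \cdot \mathbb{I}[\,x \text{ contains ``great''}\,]
\]
\[
\text{where } \mathbb{I}[S] = 1 \text{ if } S \text{ is true, and } 0 \text{ otherwise.}
\]
```

Under this reading, if $\theta_1$ and $\theta_2$ are the weights on $f_1$ and $f_2$, any setting with $\theta_1 > \theta_2$ makes the classifier predict positive whenever “great” appears.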
![Page 65: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/65.jpg)
Feature Engineering for Text Classification
• Two features:
• Let's say we set
• On sentences containing “great” in the Stanford Sentiment Treebank training data, this would get us an accuracy of 69%
• But “great” only appears in 83/6911 examples (about 1.2%)
variability: many other words can indicate positive sentiment
ambiguity: “great” can mean different things in different contexts
![Page 66: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/66.jpg)
• Usually, “great” indicates positive sentiment:
– The most wondrous love story in years, it is a great film.
– A great companion piece to other Napoleon films.
• Sometimes not. Why?
– Negation: It's not a great monster movie.
– Different sense: There's a great deal of corny dialogue and preposterous moments.
– Multiple sentiments: A great ensemble cast can't lift this heartfelt enterprise out of the familiar.
• And there are many other words that indicate positive sentiment
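To see the ambiguity problem concretely, the example sentences above can be run through a toy version of a “great”-based linear classifier. This is a minimal sketch: the two feature functions and the weight values are assumptions (the originals were lost in the transcript), and `tokenize` is a crude hypothetical helper.

```python
LABELS = ["positive", "negative"]

def tokenize(sentence):
    # Crude tokenizer: lowercase, split on whitespace, trim edge punctuation.
    return [tok.strip(".,!?\"'") for tok in sentence.lower().split()]

# Assumed features: each pairs the word "great" with one label.
def f1(words, label):
    return 1.0 if label == "positive" and "great" in words else 0.0

def f2(words, label):
    return 1.0 if label == "negative" and "great" in words else 0.0

FEATURES = [f1, f2]
WEIGHTS = [1.0, 0.0]  # assumed values; any setting with the first larger behaves the same

def score(words, label):
    return sum(w * f(words, label) for w, f in zip(WEIGHTS, FEATURES))

def classify(words):
    # Ties (e.g. no "great" at all) fall back to the first label in LABELS.
    return max(LABELS, key=lambda label: score(words, label))

EXAMPLES = [
    "The most wondrous love story in years, it is a great film.",
    "It's not a great monster movie.",
    "There's a great deal of corny dialogue and preposterous moments.",
]
predictions = {s: classify(tokenize(s)) for s in EXAMPLES}
```

Every sentence is labeled positive, including the negation and different-sense examples, because the only evidence the model can see is the presence of “great”.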
![Page 69: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/69.jpg)
Feature Engineering for Text Classification
• What about a feature like the following?
• What should its weight be?
• Doesn't matter.
• Why?
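One reading of this slide, assuming the elided feature depends only on the input and not on the label (for example, an indicator that the sentence contains “great”): such a feature adds the same amount to every label's score, so it can never change which label wins. The sketch below checks this numerically; the function names and weight values are hypothetical.

```python
LABELS = ["positive", "negative"]

def base_score(words, label):
    # Label-dependent part (assumed toy feature): rewards "positive" when "great" is present.
    return 1.0 if (label == "positive" and "great" in words) else 0.0

def label_independent_feature(words, label):
    # Ignores the label entirely: fires whenever "great" is present.
    return 1.0 if "great" in words else 0.0

def classify(words, extra_weight):
    def score(label):
        return base_score(words, label) + extra_weight * label_independent_feature(words, label)
    return max(LABELS, key=score)

words = "it is a great film".split()
# Try very different weights on the label-independent feature.
results = [classify(words, w) for w in (-100.0, 0.0, 3.5, 100.0)]
```

The prediction is the same for every weight tried, since the extra term shifts both labels' scores by an identical amount.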
![Page 70: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/70.jpg)
Text Classification
our linear sentiment classifier:
![Page 72: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/72.jpg)
Inference for Text Classification
inference: solve the argmax over labels
• trivial (loop over labels)
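The “loop over labels” remark can be sketched as follows. The function `predict` and the toy scorer are hypothetical names used for illustration; any function that maps an (input, label) pair to a number could be passed as `score`.

```python
def predict(x, labels, score):
    """Return the label with the highest score by checking each label once."""
    best_label, best_score = None, float("-inf")
    for label in labels:
        s = score(x, label)
        if s > best_score:
            best_label, best_score = label, s
    return best_label

def toy_score(x, label):
    # Hypothetical scorer: counts label-specific cue words in the input.
    cues = {
        "positive": {"great", "wondrous"},
        "negative": {"corny", "preposterous"},
    }
    return sum(1 for word in x if word in cues[label])

labels = ["positive", "negative"]
first = predict("a great film".split(), labels, toy_score)
second = predict("corny and preposterous".split(), labels, toy_score)
```

The cost is one scoring call per label, which is why inference is cheap when the label set is small.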
![Page 73: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/73.jpg)
Text Classification
![Page 75: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/75.jpg)
Learning for Text Classification
learning: choose the weights
• There are many ways to choose the weights
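As one illustration of how weights might be chosen from data, here is a minimal perceptron-style sketch. It is only one of the many options the slide alludes to and is not necessarily the method this course goes on to use. The feature map, the function names, and the two-example toy dataset are all invented for illustration.

```python
from collections import defaultdict

LABELS = ["positive", "negative"]

def features(words, label):
    # Hypothetical feature map: one indicator per (word, label) pair.
    return {(w, label): 1.0 for w in set(words)}

def score(weights, words, label):
    return sum(weights[k] * v for k, v in features(words, label).items())

def predict(weights, words):
    return max(LABELS, key=lambda label: score(weights, words, label))

def train_perceptron(data, epochs=5):
    weights = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            guess = predict(weights, words)
            if guess != gold:
                # Move weights toward the gold label's features and away from the guess's.
                for k, v in features(words, gold).items():
                    weights[k] += v
                for k, v in features(words, guess).items():
                    weights[k] -= v
    return weights

# Invented toy data, for illustration only.
TOY_DATA = [
    ("a great film".split(), "positive"),
    ("a dull film".split(), "negative"),
]
weights = train_perceptron(TOY_DATA)
```

After a few passes the learned weights separate the two toy examples; on real data the choice of features, number of epochs, and learning method would all need to be evaluated.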
![Page 76: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/76.jpg)
Experimental Practice
• in the beginning, we just had data
• first innovation: split into train and test
– motivation: simulate conditions of applying system in practice
• but, there's a problem with this…
– we need to explore and evaluate methodological choices
– after multiple evaluations on test, it is no longer a simulation of real-world conditions
![Page 81: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/81.jpg)
Experimental Practice
• we need to explore/evaluate methodological choices
• what should we do?
– some use cross validation on train, but this is slow and doesn't quite simulate real-world settings (why?)
• second innovation: divide data into train, test, and a third set called development (dev) or validation (val)
– use dev/val to evaluate choices
– then, when ready to write the paper, evaluate the best model on test
• are we done yet? no! there's still a problem:
– overfitting to dev/val
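For reference, the cross validation option mentioned above can be sketched as an index generator. The name `k_fold_indices` is hypothetical. Each fold is held out once while a model is trained on the rest, so k models must be trained, which is one reason the approach is slow.

```python
def k_fold_indices(n_examples, k):
    """Yield (train_idx, heldout_idx) pairs, holding out each fold exactly once."""
    indices = list(range(n_examples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_examples // k + (1 if i < n_examples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        heldout = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, heldout
        start += size

folds = list(k_fold_indices(10, 3))
```

This sketch does not shuffle; in practice the examples would usually be shuffled once before the folds are cut.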
![Page 84: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/84.jpg)
Experimental Practice
• best practice: split data into train, development (dev), development test (devtest), and test
– train model on train, tune hyperparameter values on dev, do preliminary testing on devtest, do final testing on test a single time when writing the paper
– Even better to have even more test sets! test1, test2, etc.
• experimental credibility is a huge component of doing useful research
• when you publish a result, it had better be replicable without tuning anything on test
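The four-way split described above can be sketched as a single shuffle followed by cuts. The function name `split_dataset`, the 70/10/10/10 fractions, and the fixed seed are assumptions chosen for illustration; the slides do not prescribe split sizes.

```python
import random

def split_dataset(examples, fractions=(0.7, 0.1, 0.1, 0.1), seed=0):
    """Shuffle once, then cut into train / dev / devtest / test."""
    names = ("train", "dev", "devtest", "test")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n = len(shuffled)
    splits, start = {}, 0
    for i, (name, frac) in enumerate(zip(names, fractions)):
        # The last split takes whatever remains, so no example is dropped.
        end = n if i == len(names) - 1 else start + round(frac * n)
        splits[name] = shuffled[start:end]
        start = end
    return splits

splits = split_dataset(range(100))
```

Fixing the split once and saving it to disk helps keep results replicable: train on `train`, tune on `dev`, check on `devtest`, and touch `test` a single time at the end.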
![Page 85: TTIC 31190: Natural Language Processingkgimpel/teaching/31190/...• Speech and Language Processing, 2nd Ed. – some chapters of 3rd edition are online • The Analysis of Data, Volume](https://reader034.fdocuments.us/reader034/viewer/2022042606/5fb1bad58825762f186715c2/html5/thumbnails/85.jpg)
Don't Cheat!