Text Classification and Naïve Bayes - ecology lab…
Text Classification and Naïve Bayes
The Task of Text Classification
Many slides are adapted from slides by Dan Jurafsky

Is this spam?

Who wrote which Federalist papers?
• 1787-8: anonymous essays try to convince New York to ratify the U.S. Constitution: Jay, Madison, Hamilton.
• Authorship of 12 of the letters in dispute
• 1963: solved by Mosteller and Wallace using Bayesian methods
[Portraits: James Madison, Alexander Hamilton]

Male or female author?
1. By 1925 present-day Vietnam was divided into three parts under French colonial rule. The southern region embracing Saigon and the Mekong delta was the colony of Cochin-China; the central area with its imperial capital at Hue was the protectorate of Annam…
2. Clara never failed to be astonished by the extraordinary felicity of her own name. She found it hard to trust herself to the mercy of fate, which had managed over the years to convert her greatest shame into one of her greatest assets…
S. Argamon, M. Koppel, J. Fine, A. R. Shimoni, 2003. “Gender, Genre, and Writing Style in Formal Written Texts,” Text, volume 23, number 3, pp. 321–346

Positive or negative movie review?
• unbelievably disappointing
• Full of zany characters and richly applied satire, and some great plot twists
• this is the greatest screwball comedy ever filmed
• It was pathetic. The worst part about it was the boxing scenes.

What is the subject of this article?
MEDLINE Article → ? → MeSH Subject Category Hierarchy
• Antagonists and Inhibitors
• Blood Supply
• Chemistry
• Drug Therapy
• Embryology
• Epidemiology
• …

Text Classification
• Assigning subject categories, topics, or genres
• Spam detection
• Authorship identification
• Age/gender identification
• Language identification
• Sentiment analysis
• …

Text Classification: definition
• Input:
– a document $d$
– a fixed set of classes $C = \{c_1, c_2, \ldots, c_J\}$
• Output: a predicted class $c \in C$

Classification Methods: Hand-coded rules
• Rules based on combinations of words or other features
– spam: black-list-address OR (“dollars” AND “have been selected”)
• Accuracy can be high
– If rules carefully refined by expert
• But building and maintaining these rules is expensive
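The spam rule above can be written directly as a boolean function. This is a minimal sketch only: the function name `is_spam_by_rule`, the blacklist contents, and the sample messages are hypothetical and not part of the slides.

```python
def is_spam_by_rule(sender, body, blacklist):
    """Hand-coded rule: black-list-address OR ("dollars" AND "have been selected")."""
    text = body.lower()
    on_blacklist = sender.lower() in blacklist
    phrase_match = "dollars" in text and "have been selected" in text
    return on_blacklist or phrase_match


# Hypothetical blacklist and messages, for illustration only.
blacklist = {"promo@example.com"}
print(is_spam_by_rule("promo@example.com", "Hello", blacklist))
print(is_spam_by_rule("friend@example.org", "You have been selected to win dollars", blacklist))
print(is_spam_by_rule("friend@example.org", "Lunch tomorrow?", blacklist))
```

Every new spam pattern needs another hand-written clause, which is the maintenance cost the last bullet refers to.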

Classification Methods: Supervised Machine Learning
• Input:
– a document $d$
– a fixed set of classes $C = \{c_1, c_2, \ldots, c_J\}$
– a training set of $m$ hand-labeled documents $(d_1, c_1), \ldots, (d_m, c_m)$
• Output:
– a learned classifier $\gamma: d \rightarrow c$

Classification Methods: Supervised Machine Learning
• Any kind of classifier
– Naïve Bayes
– Logistic regression, maxent
– Support-vector machines
– k-Nearest Neighbors
– …

Text Classification and Naïve Bayes
Text Classification: Evaluation

The 2-by-2 contingency table

| | correct | not correct |
| --- | --- | --- |
| selected | tp | fp |
| not selected | fn | tn |

Precision and recall
• Precision: % of selected items that are correct
• Recall: % of correct items that are selected

| | correct | not correct |
| --- | --- | --- |
| selected | tp | fp |
| not selected | fn | tn |

A combined measure: F
• A combined measure that assesses the P/R tradeoff is F measure (weighted harmonic mean):

$$F = \frac{1}{\alpha \frac{1}{P} + (1 - \alpha) \frac{1}{R}} = \frac{(\beta^2 + 1) P R}{\beta^2 P + R}$$

• People usually use balanced F1 measure
– i.e., with $\beta = 1$ (that is, $\alpha = \tfrac{1}{2}$): $F = 2PR/(P+R)$
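As a small sketch of the definitions above, precision, recall, and the F measure can be computed from the contingency-table counts. The counts `tp`, `fp`, `fn` below are made-up numbers for illustration, and the functions assume non-zero denominators.

```python
def precision(tp, fp):
    """Fraction of selected items that are correct."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of correct items that are selected."""
    return tp / (tp + fn)


def f_measure(p, r, beta=1.0):
    """Weighted harmonic mean of precision p and recall r."""
    b2 = beta ** 2
    return (b2 + 1) * p * r / (b2 * p + r)


# Hypothetical counts, not taken from the slides.
tp, fp, fn = 8, 2, 8
p = precision(tp, fp)
r = recall(tp, fn)
print(f"P={p:.2f} R={r:.2f} F1={f_measure(p, r):.3f}")
```

With `beta=1` the function reduces to 2PR/(P+R); a larger `beta` weights recall more heavily.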

Confusion matrix c
• For each pair of classes <c1, c2>, how many documents from c1 were incorrectly assigned to c2?
– $c_{3,2}$: 90 wheat documents incorrectly assigned to poultry

| Docs in test set | Assigned UK | Assigned poultry | Assigned wheat | Assigned coffee | Assigned interest | Assigned trade |
| --- | --- | --- | --- | --- | --- | --- |
| True UK | 95 | 1 | 13 | 0 | 1 | 0 |
| True poultry | 0 | 1 | 0 | 0 | 0 | 0 |
| True wheat | 10 | 90 | 0 | 1 | 0 | 0 |
| True coffee | 0 | 0 | 0 | 34 | 3 | 7 |
| True interest | - | 1 | 2 | 13 | 26 | 5 |
| True trade | 0 | 0 | 2 | 14 | 5 | 10 |

Per class evaluation measures (Sec. 15.2.4)

Recall: Fraction of docs in class i classified correctly:

$$\frac{c_{ii}}{\sum_j c_{ij}}$$

Precision: Fraction of docs assigned class i that are actually about class i:

$$\frac{c_{ii}}{\sum_j c_{ji}}$$

Accuracy: (1 - error rate) Fraction of docs classified correctly:

$$\frac{\sum_i c_{ii}}{\sum_j \sum_i c_{ij}}$$
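The per-class measures can be checked against the confusion matrix from the earlier slide. The sketch below copies that table, with rows as true classes and columns as assigned classes. One assumption is made: the "-" entry (true interest, assigned UK) is treated as 0.

```python
# Rows: true class; columns: assigned class.
# Order: UK, poultry, wheat, coffee, interest, trade.
# Assumption: the "-" cell in the slide's table is read as 0.
LABELS = ["UK", "poultry", "wheat", "coffee", "interest", "trade"]
CONFUSION = [
    [95, 1, 13, 0, 1, 0],
    [0, 1, 0, 0, 0, 0],
    [10, 90, 0, 1, 0, 0],
    [0, 0, 0, 34, 3, 7],
    [0, 1, 2, 13, 26, 5],
    [0, 0, 2, 14, 5, 10],
]


def recall(c, i):
    """c[i][i] over the sum of row i (all docs truly in class i)."""
    return c[i][i] / sum(c[i])


def precision(c, i):
    """c[i][i] over the sum of column i (all docs assigned class i)."""
    return c[i][i] / sum(row[i] for row in c)


def accuracy(c):
    """Sum of the diagonal over the sum of all cells."""
    return sum(c[i][i] for i in range(len(c))) / sum(sum(row) for row in c)


for i, label in enumerate(LABELS):
    print(f"{label:9s} recall={recall(CONFUSION, i):.3f} precision={precision(CONFUSION, i):.3f}")
print(f"accuracy={accuracy(CONFUSION):.3f}")
```

Wheat illustrates why per-class numbers matter: none of its documents sit on the diagonal, so its recall is 0 even though overall accuracy is far from 0.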

Micro- vs. Macro-Averaging (Sec. 15.2.4)
– If we have more than one class, how do we combine multiple performance measures into one quantity?
• Macroaveraging: Compute performance for each class, then average. Average on classes
• Microaveraging: Collect decisions for each instance from all classes, compute contingency table, evaluate. Average on instances

Micro- vs. Macro-Averaging: Example (Sec. 15.2.4)

Class 1

| | Truth: yes | Truth: no |
| --- | --- | --- |
| Classifier: yes | 10 | 10 |
| Classifier: no | 10 | 970 |

Class 2

| | Truth: yes | Truth: no |
| --- | --- | --- |
| Classifier: yes | 90 | 10 |
| Classifier: no | 10 | 890 |

Micro Ave. Table

| | Truth: yes | Truth: no |
| --- | --- | --- |
| Classifier: yes | 100 | 20 |
| Classifier: no | 20 | 1860 |

• Macroaveraged precision: (0.5 + 0.9)/2 = 0.7
• Microaveraged precision: 100/120 = .83
• Microaveraged score is dominated by score on common classes
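The two averages in this example can be reproduced in a few lines. The sketch uses only the (tp, fp) pairs from the "Classifier: yes" rows of the Class 1 and Class 2 tables; the function names are illustrative.

```python
# (tp, fp) per class, read from the "Classifier: yes" row of each table.
per_class = {"Class 1": (10, 10), "Class 2": (90, 10)}


def macro_precision(counts):
    """Average the per-class precisions (average on classes)."""
    precisions = [tp / (tp + fp) for tp, fp in counts.values()]
    return sum(precisions) / len(precisions)


def micro_precision(counts):
    """Pool the counts into one table first (average on instances)."""
    total_tp = sum(tp for tp, _ in counts.values())
    total_fp = sum(fp for _, fp in counts.values())
    return total_tp / (total_tp + total_fp)


print(f"macro={macro_precision(per_class):.2f} micro={micro_precision(per_class):.2f}")
```

Class 2 contributes 100 of the 120 pooled "yes" decisions, which is why the microaverage sits close to its precision of 0.9.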

Development Test Sets and Cross-validation
• Metric: P/R/F1 or Accuracy
• Unseen test set
– avoid overfitting (‘tuning to the test set’)
– more conservative estimate of performance
– Cross-validation over multiple splits
• Handle sampling errors from different datasets
– Pool results over each split
– Compute pooled dev set performance
[Figure: one split into Training Set, Development Test Set, and Test Set; below it, several cross-validation splits, each with a Training Set and a Dev Test portion, plus a Test Set]
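A rough sketch of cross-validation with pooled dev-set results follows. It assumes contiguous folds and that the final test set has already been set aside, so only training and dev data are passed in. The names `k_fold_splits`, `pooled_dev_accuracy`, and the majority-class trainer are hypothetical helpers, not from the slides.

```python
from collections import Counter


def k_fold_splits(n_items, k):
    """Yield (train_indices, dev_indices) for k contiguous folds."""
    fold_sizes = [n_items // k + (1 if f < n_items % k else 0) for f in range(k)]
    start = 0
    for size in fold_sizes:
        dev = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, dev
        start += size


def pooled_dev_accuracy(docs, labels, k, train_fn):
    """Train on each split, pool the dev decisions, then score once."""
    correct = 0
    for train, dev in k_fold_splits(len(docs), k):
        classify = train_fn([docs[i] for i in train], [labels[i] for i in train])
        correct += sum(classify(docs[i]) == labels[i] for i in dev)
    return correct / len(docs)


def train_majority(train_docs, train_labels):
    """Toy trainer: always predict the most frequent training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda doc: majority


# Hypothetical data: eight documents, six of class "a" and two of class "b".
docs = [f"doc{i}" for i in range(8)]
labels = ["a"] * 6 + ["b"] * 2
print(pooled_dev_accuracy(docs, labels, k=4, train_fn=train_majority))
```

Any real classifier can replace `train_majority`, as long as `train_fn` returns a function that maps a document to a label.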

Text Classification and Naïve Bayes
Formalizing the Naïve Bayes Classifier

Naïve Bayes Intuition
• Simple (“naïve”) classification method based on Bayes rule
• Relies on very simple representation of document
– Bag of words

Bayes’ Rule Applied to Documents and Classes
• For a document $d$ and a class $c$

$$P(c \mid d) = \frac{P(d \mid c)\,P(c)}{P(d)}$$

Naïve Bayes Classifier (I)

$$
\begin{aligned}
c_{MAP} &= \underset{c \in C}{\operatorname{argmax}}\; P(c \mid d) && \text{MAP is “maximum a posteriori” = most likely class} \\
&= \underset{c \in C}{\operatorname{argmax}}\; \frac{P(d \mid c)\,P(c)}{P(d)} && \text{Bayes Rule} \\
&= \underset{c \in C}{\operatorname{argmax}}\; P(d \mid c)\,P(c) && \text{Dropping the denominator}
\end{aligned}
$$

Naïve Bayes Classifier (II)

$$
\begin{aligned}
c_{MAP} &= \underset{c \in C}{\operatorname{argmax}}\; P(d \mid c)\,P(c) \\
&= \underset{c \in C}{\operatorname{argmax}}\; P(x_1, x_2, \ldots, x_n \mid c)\,P(c) && \text{Document } d \text{ represented as features } x_1..x_n
\end{aligned}
$$

Naïve Bayes Classifier (III)

$$c_{MAP} = \underset{c \in C}{\operatorname{argmax}}\; P(x_1, x_2, \ldots, x_n \mid c)\,P(c)$$

• $P(c)$: How often does this class occur? We can just count the relative frequencies in a corpus.
• $P(x_1, x_2, \ldots, x_n \mid c)$: $O(|X|^n \cdot |C|)$ parameters. Could only be estimated if a very, very large number of training examples was available.

The bag of words representation
γ(
I love this movie! It's sweet, but with satirical humor. The dialogue is great and the adventure scenes are fun… It manages to be whimsical and romantic while laughing at the conventions of the fairy tale genre. I would recommend it to just about anyone. I've seen it several times, and I'm always happy to see it again whenever I have a friend who hasn't seen it yet.
) = c

The bag of words representation
γ(

| word | count |
| --- | --- |
| great | 2 |
| love | 2 |
| recommend | 1 |
| laugh | 1 |
| happy | 1 |
| ... | ... |

) = c
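A bag of words is just a table of word counts with the order thrown away. The sketch below builds one with `collections.Counter`; the regular-expression tokenizer and the sample sentence are simplifying assumptions, not the preprocessing used for the slide.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case the text, split it into word tokens, and count them."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Hypothetical sentence, for illustration only.
bag = bag_of_words("I love this movie! I would recommend it. Great dialogue, great fun.")
print(bag["great"], bag["love"], bag["recommend"])
```

Two texts with the same words in a different order produce the same bag, which is exactly the information the classifier gives up.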

Bag of words for document classification

Test document: parser, language, label, translation, … → ?

| Class | Words |
| --- | --- |
| Machine Learning | learning, training, algorithm, shrinkage, network, ... |
| NLP | parser, tag, training, translation, language, ... |
| Garbage Collection | garbage, collection, memory, optimization, region, ... |
| Planning | ..., planning, temporal, reasoning, plan, language, ... |
| GUI | ... |

Multinomial Naïve Bayes Independence Assumptions
• Bag of Words assumption: Assume position doesn’t matter
• Conditional Independence: Assume the feature probabilities $P(x_i \mid c_j)$ are independent given the class $c$.

$$P(x_1, x_2, \ldots, x_n \mid c) = P(x_1 \mid c) \cdot P(x_2 \mid c) \cdot P(x_3 \mid c) \cdot \ldots \cdot P(x_n \mid c)$$

Applying Multinomial Naive Bayes Classifiers to Text Classification

positions ← all word positions in test document

$$c_{NB} = \underset{c_j \in C}{\operatorname{argmax}}\; P(c_j) \prod_{i \in positions} P(x_i \mid c_j)$$

Text Classification and Naïve Bayes
Naïve Bayes: Learning

Learning the Multinomial Naïve Bayes Model (Sec. 13.3)
• First attempt: maximum likelihood estimates
– simply use the frequencies in the data

$$\hat{P}(c_j) = \frac{doccount(C = c_j)}{N_{doc}}$$

$$\hat{P}(w_i \mid c_j) = \frac{count(w_i, c_j)}{\sum_{w \in V} count(w, c_j)}$$

Parameter estimation
• Create mega-document for topic j by concatenating all docs in this topic
– Use frequency of w in mega-document

$$\hat{P}(w_i \mid c_j) = \frac{count(w_i, c_j)}{\sum_{w \in V} count(w, c_j)}$$

fraction of times word $w_i$ appears among all words in documents of topic $c_j$
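The maximum likelihood estimates amount to counting. Below is a minimal sketch, assuming documents arrive as lists of tokens; the function name `train_mle` and the three toy documents are hypothetical.

```python
from collections import Counter, defaultdict


def train_mle(docs, labels):
    """Return (priors, cond) using unsmoothed relative frequencies."""
    n_doc = len(docs)
    priors = {c: n / n_doc for c, n in Counter(labels).items()}
    mega = defaultdict(Counter)  # class -> word counts of its mega-document
    for words, c in zip(docs, labels):
        mega[c].update(words)
    cond = {}
    for c, counts in mega.items():
        total = sum(counts.values())
        cond[c] = {w: n / total for w, n in counts.items()}
    return priors, cond


# Hypothetical toy corpus, for illustration only.
docs = [["great", "fun"], ["great", "plot"], ["boring", "plot"]]
labels = ["pos", "pos", "neg"]
priors, cond = train_mle(docs, labels)
print(priors["pos"], cond["pos"]["great"], cond["pos"].get("boring", 0.0))
```

A word that never occurs in a class's mega-document, such as "boring" for `pos` here, has no entry at all, so its unsmoothed estimate is 0.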

Problem with Maximum Likelihood (Sec. 13.3)
• What if we have seen no training documents with the word fantastic and classified in the topic positive (thumbs-up)?

$$\hat{P}(\text{“fantastic”} \mid \text{positive}) = \frac{count(\text{“fantastic”}, \text{positive})}{\sum_{w \in V} count(w, \text{positive})} = 0$$

• Zero probabilities cannot be conditioned away, no matter the other evidence!

$$c_{MAP} = \underset{c}{\operatorname{argmax}}\; \hat{P}(c) \prod_i \hat{P}(x_i \mid c)$$

Laplace (add-1) smoothing: unknown words
• Add one extra word to the vocabulary, the “unknown word” $w_u$

$$\hat{P}(w_u \mid c) = \frac{count(w_u, c) + 1}{\left( \sum_{w \in V} count(w, c) \right) + |V| + 1} = \frac{1}{\left( \sum_{w \in V} count(w, c) \right) + |V| + 1}$$
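The smoothed estimate is a one-line change to the counting formula. The sketch below follows the slide's variant, adding 1 to every count and reserving one extra vocabulary slot for the unknown word. The class counts and vocabulary are invented for illustration.

```python
from collections import Counter


def smoothed_prob(word, class_counts, vocab):
    """Add-1 estimate with one extra vocabulary slot for the unknown word."""
    total = sum(class_counts.values())
    return (class_counts.get(word, 0) + 1) / (total + len(vocab) + 1)


# Hypothetical counts for one class and a four-word vocabulary.
class_counts = Counter({"great": 2, "fun": 1, "plot": 1})
vocab = {"great", "fun", "plot", "boring"}

print(smoothed_prob("great", class_counts, vocab))      # seen in this class
print(smoothed_prob("boring", class_counts, vocab))     # in vocab, unseen in this class
print(smoothed_prob("fantastic", class_counts, vocab))  # unknown word
```

The probabilities of the vocabulary words plus the single unknown-word slot sum to 1, and no word is ever assigned probability 0.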

Underflow Prevention: log space
• Multiplying lots of probabilities can result in floating-point underflow.
• Since log(xy) = log(x) + log(y)
– Better to sum logs of probabilities instead of multiplying probabilities.
• Class with highest un-normalized log probability score is still most probable.

$$c_{NB} = \underset{c_j \in C}{\operatorname{argmax}} \left[ \log P(c_j) + \sum_{i \in positions} \log P(x_i \mid c_j) \right]$$

• Model is now just max of sum of weights
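The following sketch shows both the underflow and the log-space fix. It assumes every word in the document has a stored log probability for every class (for example, after smoothing); the parameter values are invented for illustration.

```python
import math


def classify_log_space(words, log_prior, log_cond):
    """Return the class maximizing log P(c) + sum_i log P(x_i | c)."""
    def score(c):
        return log_prior[c] + sum(log_cond[c][w] for w in words)
    return max(log_prior, key=score)


# Multiplying 200 small probabilities underflows to 0.0 in double precision,
# while the sum of their logs stays an ordinary finite number.
product = math.prod([1e-3] * 200)
log_sum = sum(math.log(1e-3) for _ in range(200))
print(product, log_sum)

# Hypothetical two-class model with made-up probabilities.
log_prior = {"pos": math.log(0.5), "neg": math.log(0.5)}
log_cond = {
    "pos": {"great": math.log(0.6), "boring": math.log(0.4)},
    "neg": {"great": math.log(0.2), "boring": math.log(0.8)},
}
print(classify_log_space(["great", "great", "boring"], log_prior, log_cond))
```

Because the logarithm is monotonic, the class with the largest log score is the same class that would win on the raw product.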

Text Classification and Naïve Bayes
Multinomial Naïve Bayes: A Worked Example

| | Doc | Words | Class |
| --- | --- | --- | --- |
| Training | 1 | Chinese Beijing Chinese | c |
| | 2 | Chinese Chinese Shanghai | c |
| | 3 | Chinese Macao | c |
| | 4 | Tokyo Japan Chinese | j |
| Test | 5 | Chinese Chinese Chinese Tokyo Japan | ? |

Estimators (the vocabulary has 6 word types; the extra +1 is the unknown word, so the smoothing term is 7):

$$\hat{P}(c) = \frac{N_c}{N} \qquad \hat{P}(w \mid c) = \frac{count(w, c) + 1}{count(c) + |V| + 1}$$

Priors:
P(c) = 3/4
P(j) = 1/4

Conditional Probabilities:
P(Chinese|c) = (5+1)/(8+7) = 6/15
P(Tokyo|c) = (0+1)/(8+7) = 1/15
P(Japan|c) = (0+1)/(8+7) = 1/15
P(Chinese|j) = (1+1)/(3+7) = 2/10
P(Tokyo|j) = (1+1)/(3+7) = 2/10
P(Japan|j) = (1+1)/(3+7) = 2/10

Choosing a class:
P(c|d5) ∝ 3/4 * (6/15)^3 * 1/15 * 1/15 ≈ 0.0002
P(j|d5) ∝ 1/4 * (2/10)^3 * 2/10 * 2/10 ≈ 0.00008
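The numbers in this worked example can be reproduced with exact fractions. The sketch follows the slide's smoothing, with +1 in the numerator and |V| + 1 in the denominator for the unknown-word slot; the variable and function names are illustrative.

```python
from collections import Counter
from fractions import Fraction

training = [
    ("Chinese Beijing Chinese", "c"),
    ("Chinese Chinese Shanghai", "c"),
    ("Chinese Macao", "c"),
    ("Tokyo Japan Chinese", "j"),
]
test_doc = "Chinese Chinese Chinese Tokyo Japan".split()

vocab = {w for text, _ in training for w in text.split()}
doc_counts = Counter(label for _, label in training)
word_counts = {label: Counter() for label in doc_counts}
for text, label in training:
    word_counts[label].update(text.split())


def prior(c):
    """Fraction of training documents labeled c."""
    return Fraction(doc_counts[c], len(training))


def cond(w, c):
    """Add-1 estimate; the extra +1 in the denominator is the unknown-word slot."""
    return Fraction(word_counts[c][w] + 1, sum(word_counts[c].values()) + len(vocab) + 1)


def score(c):
    """Unnormalized posterior: prior times the product over word positions."""
    result = prior(c)
    for w in test_doc:
        result *= cond(w, c)
    return result


for c in ("c", "j"):
    print(c, score(c), float(score(c)))
print("predicted:", max(doc_counts, key=score))
```

The scores are proportional to the posteriors rather than equal to them, since P(d) was dropped; only their ranking matters for choosing the class.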

Summary: Naive Bayes is Not So Naive
• Robust to Irrelevant Features
– Irrelevant Features cancel each other without affecting results
• Very good in domains with many equally important features
– Decision Trees suffer from fragmentation in such cases, especially if little data
• Optimal if the independence assumptions hold: If assumed independence is correct, then it is the Bayes Optimal Classifier for problem
• A good dependable baseline for text classification
– But we will see other classifiers that give better accuracy