Stacking With Auxiliary Features: Improved Ensembling for ... · • Bipartite Graph-based...

Stacking With Auxiliary Features: Improved Ensembling for Natural Language and Vision

NazneenRajaniPhDProposal

November7,2016Committeemembers:RayMooney,KatrinErk,GregDurrettandKenBarker

Outline• Introduction• Background&RelatedWork• CompletedWork

– StackedEnsemblesofInformationExtractorsforKnowledgeBasePopulation(ACL2015)– StackingWithAuxiliaryFeatures(Underreview)–CombiningSupervisedandUnsupervisedEnsemblesforKnowledgeBasePopulation(EMNLP2016)

• ProposedWork– Short-termproposals– Long-termproposals

Introduction

System 1

System 2

System N-1

System N

output

• Ensembling:Usedbythe$1MwinningteamfortheNetflixcompetition

Introduction• Makeauxiliaryinformationaccessibletotheensemble

System 1

System 2

System N-1

System N

output

Auxiliary information about task and systems

Background and Related Work

Cold Start Slot Filling (CSSF)• KnowledgeBasePopulation(KBP)isataskofdiscoveringentityfactsandaddingtoaKB

• Relationextraction,aKBPsub-task,usingfixedontologyisslotfilling

• CSSFisanannualNISTevaluationofbuildingKBfromscratch- queryentitiesandpre-definedslots- textcorpus 6

Cold Start Slot Filling (CSSF)• Someslotsaresingle-valued(per:age)whilesomearelist-valued(per:children)

• Entitytypes:PER,ORG,GPE• Alongwithfills,systemsmustprovide- confidencescore- provenance—docid:startoffset-endoffset

Cold Start Slot Filling (CSSF)

1. city_of_headquarters:2. website:3. subsidiaries:4. employees:5. shareholders:

Microsoftisatechnologycompany,headquarteredinRedmond,Washingtonthatdevelops…

city_of_headquarters:Redmondprovenance:

confidencescore:1.0

org:Microsoft

Cold Start Slot Filling (CSSF)

Distant Supervision

Bootstrapping

Universal Schema

Multi-Instance Multi-Learning

Training Data

Source Corpus

Query Expansion

Document level IR

Aliasing

Answer 9

Entity Discovery and Linking (EDL)• KBPsub-taskinvolvingtwoNLPproblems- NamedEntityRecognition(NER)- Disambiguation

• EDLisanannualNISTevaluationin3languages:English,SpanishandChinese

• Tri-lingualEntityDiscoveryandLinking(TEDL)10

Tri-lingual Entity Discovery and Linking (TEDL)

• Detectallentitymentionsincorpus• LinkmentionstoEnglishKB(FreeBase)• IfnoKBentryfound,clusterintoaNILID• Entitytypes—PER,ORG,GPE,FAC,LOC• Systemsmustalsoprovideconfidencescore

FreeBase entry: Hillary Diane Rodham Clinton is a US Secretary of State, U.S. Senator, and First Lady of the United States. From 2009 to 2013, she was the 67th Secretary of State, serving under President Barack Obama. She previously represented New York in the U.S. Senate. Source Corpus Document:

Hillary Clinton Not Talking About ’92 Clinton-Gore Confederate Campaign Button.. FreeBase entry:

WilliamJefferson"Bill"ClintonisanAmericanpoli5cianwhoservedasthe42ndPresidentoftheUnitedStatesfrom1993to2001.ClintonwasGovernorofArkansasfrom1979to1981and1983to1992,andArkansasAJorneyGeneralfrom1977to1979.

Unsupervised Similarity

Supervised Classification

Graph Based

Joint Approach

FreeBase KB

Query Expansion

Answer

Candidate Generation and Ranking

ImageNet Object Detection• WidelyknownannualcompetitioninCVforlarge-scaleobjectrecognition

• Objectdetection- detectallinstancesofobjectcategories(total200)inimages

- localizeusingaxis-alignedBoundingBoxes(BB)• ObjectcategoriesareWordNetsynsets• Systemsalsoprovideconfidencescores 14

ImageNet Object Detection

Ensemble Algorithms(Wolpert, 1992)

• StackingSystem 1

System 2

System N-1

System N

Trained classifier

Accept?

conf 1

conf 2

conf N-1

conf N

Ensemble Algorithms• BipartiteGraph-basedConsensusMaximization(BGCM)(Gaoetal.,2009)- ensembling->optimizationoverbipartitegraph- combiningsupervisedandunsupervisedmodels

• MixturesofExperts(ME)(Jacobsetal.,1991)- partitiontheproblemintosub-spaces- learntoswitchexpertsbasedoninputusingagatingnetwork- DeepMixturesofExperts(Eigenetal.,2013)

Completed Work:I. Stacked Ensembles of Information Extractors for

Knowledge Base Population (ACL2015)

Stacking(Wolpert, 1992)

Foragivenproposedslot-fill,e.g.spouse(Barack,Michelle),combineconfidencesfrommulgplesystems:

System 1

System 2

System N-1

System N

Trained linear SVM

conf 1

conf 2

conf N-1

conf N Accept? 19

Stacking with FeaturesForagivenproposedslot-fill,e.g.spouse(Barack,Michelle),combineconfidencesfrommulgplesystems:

System 1

System 2

System N-1

System N

conf 2

conf N-1

conf N

Trained linear SVM

Accept?

Slot Type conf 1

Stacking with FeaturesForagivenproposedslot-fill,e.g.spouse(Barack,Michelle),combineconfidencesfrommulgplesystems:

System 1

System 2

System N-1

System N

Trained linear SVM

Accept?

Slot Type Provenance conf 1

conf 2

conf N-1

conf N 21

Document Provenance Feature• Foragivenqueryandslot,foreachsystem,i,thereisafeatureDPi:- Nsystemsprovideafillfortheslot.- Ofthese,ngivesameprovenancedocidasi.- DPi=n/Nisthedocumentprovenancescore.

• Measuresextenttowhichsystemsagreeondocumentprovenanceoftheslotfill.

Offset Provenance Feature• Degreeofoverlapbetweensystems’provenancestrings.• UsesJaccardsimilaritycoefficient.

• SystemswithdifferentdocidhavezeroOP

Offset Provenance Feature

Offsets System 1 System 2 System 3

Start Offset 1 4 5

End Offset 9 7 12

OP1 =12×49+512

1 2 3 4 5 6 7 8 9 10 11 12 13

System 2

System 3

Results

Approach Precision Recall F1

Union 0.176 0.647 0.277

Voting 0.694 0.256 0.374

Best ESF system in 2014 (Stanford) 0.585 0.298 0.395

Stacking 0.606 0.402 0.483

Stacking + Relation 0.607 0.406 0.486

Stacking + Provenance + Relation 0.541 0.466 0.501

• Usingthe10commonsystemsbetween2013and2014

Takeaways• Stackedmeta-classifierbeatsthebestperforming2014KBPSF

systembyanF1gainof11points.• Featuresthatutilizeauxiliaryinformationimprovestacking

performance.• Ensemblinghasclearadvantagesbutnaiveapproachessuchas

votingdonotperformaswell.• Althoughsystemschangeeveryyear,thereareadvantagesin

trainingonpastdata.26

Completed Work:II. Stacking With Auxiliary Features (under review)

Stacking With Auxiliary Features (SWAF)

System1

System2

SystemN

TrainedMeta-classifier

ProvenanceFeatures

confN Accept?

SystemN-1 confN-1

AuxiliaryFeatures

InstanceFeatures

• Stackingusingtwotypesofauxiliaryfeatures:

Instance Features• Enablesstackertodiscriminatebetweeninputinstancetypes

• Somesystemsarebetteratcertaininputtypes• CSSF—slottype(per:age)• TEDL—entitytype(PER/ORG/GPE/FAC/LOC)• Objectdetection—objectcategoryandSIFTfeaturedescriptors

Provenance Features• Enablesthestackertodiscriminatebetweensystems

• Outputisreliableifsystemsagreeonsource• CSSFsameasslotfilling• TEDL—measuresoverlapofamention

Provenance Features• Objectdetection—measureBBoverlap

Post-processing• CSSF

- singlevaluedslotfills—resolveconflicts- listvaluesslotfills—alwaysinclude

• TEDL- KBID—includeinoutput- *NILID—mergeacrosssystemsifatleastoneoverlap

• Objectdetection- Foreachsystem,measuremaximumsumoverlapwithothersystems- Union/intersection—penalizedbyevaluationmetric

Results• 2015CSSF—10sharedsystems

Approach Precision Recall F1ME(Jacobsetal.,1991) 0.479 0.184 0.266Oraclevoting(>=3) 0.438 0.272 0.336

Toprankedsystem(Angelietal.,2015) 0.399 0.306 0.346Stacking 0.497 0.282 0.359

Stacking+instancefeatures 0.498 0.284 0.360

Stacking+provenancefeatures 0.508 0.286 0.366

SWAF 0.466 0.331 0.38733

Results• 2015TEDL—6sharedsystems

Approach Precision Recall F1Oraclevoting(>=4) 0.514 0.601 0.554

ME(Jacobsetal.,1991) 0.721 0.494 0.587Toprankedsystem(Siletal.,2015) 0.693 0.547 0.611

Stacking 0.729 0.528 0.613Stacking+instancefeatures 0.783 0.511 0.619

Stacking+provenancefeatures 0.814 0.508 0.625

SWAF 0.814 0.515 0.63034

Results• 2015ImageNetobjectdetection—3sharedsystems

Approach MeanAP MedianAPOraclevoting(>=1) 0.366 0.368

Beststandalonesystem(VGG+selectivesearch) 0.434 0.430Stacking 0.451 0.441

Stacking+instancefeatures 0.461 0.45MixturesofExperts(Jacobsetal.,1991) 0.494 0.489

Stacking+provenancefeatures 0.502 0.494

SWAF 0.506 0.497 35

Results on object detection

Takeaways• SWAFproducedSOTAonCSSFandTEDL;significant

improvementsonobjectdetection• OurapproachismorerobustthanMEintermsofnumberof

componentsystems• Workswellforimageswithmultipleinstancesofthesame

object

Completed Work:III. Combining Supervised and Unsupervised Ensembles for

Knowledge Base Population (EMNLP2016)

Combining supervised & unsupervised ensembles

SupSystem1

SupSystem2

SupSystemN

UnsupSystem1

TrainedlinearSVM

AuxiliaryFeatures

UnsupSystem2 Calibratedconf

UnsupSystemM

ConstrainedOp@miza@on(Wengetal,2013)

Accept?

Unsupervised ensemble(Wang et al., 2013)

• Approachtoaggregaterawconfidencevalues• Re-weighttheconfidencescoreofaninstance- numberofsystemsthatproduceit- rankofthosesystems

• Uniformweightsforallsystems• Ourworkextendstoentitylinking

Results• 2015CSSF—#supsystems=10,#unsupsystems=13

Approach Precision Recall F1Constrainedoptimization 0.1712 0.3998 0.2397

Oraclevoting(>=3) 0.4384 0.2720 0.3357Toprankedsystem(Angelietal.,2015) 0.3989 0.3058 0.3462

SWAF 0.4656 0.3312 0.3871BGCMforcombiningsup+unsup 0.4902 0.3363 0.3989Stackingforcombiningsup+unsup

(BGCM) 0.5901 0.3021 0.3996

Stackingforcombiningsup+unsup(constrainedoptimization) 0.4676 0.4314 0.4489

Results• 2015TEDL—#supsystems=6,#unsupsystems=4

Approach Precision Recall F1Constrainedoptimization 0.176 0.445 0.252

Oraclevoting(>=4) 0.514 0.601 0.554Toprankedsystem(Siletal.,2015) 0.693 0.547 0.611

SWAF 0.813 0.515 0.630BGCMforcombiningsup+unsup 0.810 0.517 0.631Stackingforcombiningsup+unsup

(BGCM) 0.803 0.525 0.635

Stackingforcombiningsup+unsup(constrainedoptimization) 0.686 0.624 0.653

Takeaways• Manyhighrankingsystemsw/otrainingdata• Approximately1/3ofpossibleoutputsproducedbyunsupervisedensemble

• Combinationimprovesrecallsubstantially

Proposed Work:I. Short-term proposals — Semantic Instance-level Features

Instance-level features• Completedworkincludedonlysuperficialinstancefeatures

• Focusmoreontheinstancefeatures—taskspecific• Specifically,moresemanticfeatures• Basedontheresults,thesefeatures:- helpimproveperformancebythemselves,- usedalongwithprovenance

EDL instance-level features(Francis et al., 2016)

• UsedcontextualinformationtodisambiguateentitymentionsusingCNNsforEDL

• Computessimilaritiesbetweenamention'ssourcedocumentanditspotentialentitytargetsatmultiplegranularities.

• CNNs:textblocktopicvector46

EDL instance-level features

Men$onHillaryClintonContextForahostofreasons,HillaryClintonsomehowhasfailedtodeveloptherhetoricalorinterpersonalskillsthatmadeherhusband,andBarackObama,soappealingonthecampaigntrail.DocumentIdon'tdisagreewiththegistofDowd'sarBcle.Forahostofreasons,HillaryClintonsomehowhasfailedtodeveloptherhetoricalorinterpersonalskillsthatmadeherhusband,andBarackObama,soappealingonthecampaigntrail.Clintonhasalso,forreasonsgoodandbad,madeanumberoferrorsinjudgementinherrun-uptohercurrentcampaign...

TitleHillaryDianeRodhamClintonAr$cleHillaryClintonisaformerUnitedStatesSecretaryofState,U.S.Senator,andFirstLadyoftheUnitedStates.From2009to2013,shewasthe67thSecretaryofState,servingunderPresidentBarackObama.ShepreviouslyrepresentedNewYorkintheU.S.Senate.Beforethat,asthewifeofPresidentBillClinton,shewasFirstLadyfrom1993to2001.Inthe2008elecBon,ClintonwasaleadingcandidatefortheDemocraBcpresidenBalnominaBon.AnaBveofIllinois,HillaryRodhamwasthefirststudentcommencementspeakeratWellesleyCollegein1969.ShethenearnedaJ.D.fromYaleLawSchoolin1973.AXerabriefsBntasaCongressionallegalcounsel,shemovedtoArkansasandmarriedBillClintonin1975.RodhamcofoundedtheArkansasAdvocatesforChildrenandFamiliesin1977..

smenBon tBtle

tarBcle

scontext

sdocument

• Examplesourceandtargetgranularitiesforaninstanceinthe2016NISTKBPdataset.

Object detection instance-level features• ImageNetprovidesattributesdatasetforcertaincategories• Annotatedwithpre-definedsetsofattributes:- Color:black,blue,brown,gray,green,orange,pink,red,violet,white,yellow

- Pattern:spotted,striped- Shape:long,round,rectangular,square- Texture:furry,smooth,rough,shiny,metallic,vegetation,wooden,wet

Proposed Work:I. Short-term proposals — Improve Foreign Language KBP

Foreign language features• ThisworkwillonlyapplytotheKBPtasks• Resultsonthe2016TEDLtask

Language Precision Recall F1

English 0.805 0.508 0.623

Spanish 0.79 0.443 0.568

Chinese 0.792 0.495 0.609

Combined 0.789 0.481 0.59750

Foreign language features• TEDL-foreignlanguagetrainingdata• AuxiliaryfeaturesdonottranslatetoChineseandSpanish

• Straightforwardfeature—languageindicator• Uselanguageindependentfeatures- non-lexical

Language Independent Entity Linking (LIEL) solution to TEDL

(Sil and Florian, 2016)

• EntitycategoryPMI• Categoricalrelationfrequency• Titleco-occurrencefrequency

Proposed Work:II. Long-term proposals — Visual Question Answering

Visual Question Answering (VQA)(Antol et al., 2015)

• UnderstandhowDNNsdoobjectdetection

Visual Question Answering (VQA)• VQAinvolvesbothlanguageandvision• DemonstrateSWAFonVQA• Ensemblebasedontheanswers- Multiplechoicequestions- Openendedanswers—90%one-wordanswers

• Useexplanationsasauxiliaryfeatures 55

Proposed Work:II. Long-term proposals — Explanations as auxiliary features

Explanation as auxiliary features• Completedworkfocusedonusingprovenance• Captured“where”aspectoftheoutput• RecentworkongeneratingexplanationstointerpretDNNs:- TowardsTransparentAIsystems- Generatingvisualexplanations- VisualQuestionAnswering(VQA)

• DARPAprogramforexplainableAI(XAI) 57

Explanation as auxiliary features• Useexplanationsasauxiliaryfeatures• Capture“why”aspectoftheoutput• Twotypesofexplanations:- Textual- Visual

Text as Explanation(Hendricks et al., 2016)

• Generatingvisualexplanations• Jointlypredictvisualclassandgeneratetextasexplanation

• Usesdescriptivepropertiesvisibleintheimage

Text as Explanation

ThisisaKentuckywarblerbecausethisisayellowbirdwithashorttail

ThisisaKentuckywarblerbecausethisisayellowbirdwithablackcheekpatchandablackcrown

SystemA(Berkeley) SystemBInputimage

Text as Explanation• Trustagreementbetweensystemswithsimilarexplanations

• MTmetrics—BLEU/METEORforsimilarity• MinimumBayesRisk(MBR)decoding• Embeddingsofwordsintheexplanation

Images as Explanation• DNNsattendtorelevantpartsofimagewhiledoingVQA(Goyaletal.,2016)

• Heat-maptovisualizeattentioninimages• Humanstrustsystemswithbetterexplanationsmoreevenwhentheyallpredictthesameoutput(Selvarajuetal.,2016)

• Enablethestackertolearntorelyonsystemsthat“look”attherightregionoftheimagewhilepredictingtheanswer

Images as Explanation

SystemA SystemBInputimage

Q:Whatcoloristhecat? A:Brown A:Brown

Images as Explanation• UsevisualexplanationtoimproveVQA• Measureagreementbetweensystems’heat-maps- KL-divergence- Measurecorrelation

• Usingvisualexplanation- improveperformance- modelwithbetterexplanations 64

Conclusion

Conclusion• Generalproblemofcombiningoutputsfromdiversesystems

• SWAFonthreedifficulttasks• Provenancecaptures“where”oftheoutput• Combiningsupervisedandunsupervisedensemblesimprovesrecall

• Short-term:betterauxiliaryfeatures• Long-term:focuson“why”oftheoutput

System1

System2

System3

Meta-classifier

You!Questions?

ThankYou!

Questions?

ThankYou!Questions?

Backup slides

Results on CSSF

Results on TEDL

Number of systems in 2016Supervised Unsupervised

English Chinese Spanish English Chinese Spanish

TEDL 5 4 4 7 3 3

CSSF 8 2 3 8 1 0

Learning Curve• Systemschangeeachyear.• Stillusefultotrainonpastdata.

Incremental Training on Systems• Sortthecommonsystemsbasedontheirperformance.• Traintheclassifieraddingonesystemateachstep.• Teston2014data.

Unsupervised ensemble• Mutualexclusionproperty

• Listvaluedslotfillreplace1by

• Forentity-linking,1isreplacedwith

avg no. of correct slot fills

total no. of slot fills

avg no. of correct mentions for an entity type

total no. of mentions for that entity type

Ratio of sup and unsup systems• Unsupervised~1/3ofthecombination• Commonoutput:22%forCSSFand15%forTEDL

KBP instance-level features• Embedthewordsinad-dimensionalspace- d=300withwindowsize=21

• Wordsvectorusingaconv-netfilterMg

• SimilarsemanticfeaturesbetweenquerydocumentandprovenancedocumentfortheCSSFtask

Language Independent Entity Linking (LIEL) solution to TEDL (Sil and Florian, 2016)

• EntitycategoryPMI- CalculatesthePMIbetweenpairofentities(e1,e2)thatco-occurinadocument

• Categoricalrelationfrequency- CountthenumberofKBrelationsthatexistsbetweenpairofentities(e1,e2)

• Titleco-occurrencefrequency- Foreverypairofconsecutiveentities(e,e’),computesthenumberoftimese’appearsasalinkintheKBpagefore 77

Stacking With Auxiliary Features: Improved Ensembling for ... · • Bipartite Graph-based...

Documents

Transcript of Stacking With Auxiliary Features: Improved Ensembling for ... · • Bipartite Graph-based...

Ensembling Medium Range Forecast MOS GUIANCE

Model Comparison, Ensembling and Bayesian Workﬂow

1 Weighted Bipartite Matching Lecture 4: Jan 18. 2 Weighted Bipartite Matching Given a weighted bipartite graph, find a matching with maximum total weight.

Ensembling Neural Networks: Many Could Be Better Than All · 2019. 11. 8. · Artificial Intelligence, 2002, vol.137, no.1-2, pp.239-263. @Elsevier Ensembling Neural Networks: Many

BIPARTITE SETTLEMENT 9

Data Augmentation and Ensembling for FriendsQA

bipartite settlement 1

bipartite settlement 2

Bipartite Networks - I

Bipartite Entanglement,Topological Order, and the ... · Bipartite Entanglement,Topological Order, and the ... “Quantum Information Meets Many-Body ... Bipartite Entanglement,Topological

Workplace Bipartite Cooperation

Yangjun Chen 1 Bipartite Graphs What is a bipartite graph? Properties of bipartite graphs Matching and maximum matching - alternative paths - augmenting.

AusDM 09 Analytic Challenge ‘Ensembling’

Non-bipartite -commongraphs

bipartite settlement 6

Skin Diseases Detection using LBP and WLD- An Ensembling ... · Skin Diseases Detection using LBP and WLD- An Ensembling Approach Arnab Banerjeea, Nibaran Das, Mita Nasipurib aDepartment

Package 'bipartite'

BIPARTITE SETTLEMENT 7

BIPARTITE SETTLEMENT 8

BiNE: Bipartite Network Embedding