Introduction to Triangulated GraphsRec-I-DCM3 significantly improves performance Comparison of TNT...
Transcript of Introduction to Triangulated GraphsRec-I-DCM3 significantly improves performance Comparison of TNT...
-
IntroductiontoTriangulatedGraphs
TandyWarnow
-
Topicsfortoday
• Triangulatedgraphs:theoremsandalgorithms(Chapters11.3and11.9)
• Examplesoftriangulatedgraphsinphylogenyestimation(Chapters4.8,11.3-11.5)
-
Triangulated(i.e.,Chordal)Graphs
• Definition:Agraphistriangulatedifithasnosimplecyclesofsizefourormore.
-
DCMsareDivide-and-Conquerstrategies!
-
DCMsforphylogenyreconstruction
• Defineatriangulatedgraphsothatitsverticescorrespondto
theinputtaxa(orsequences)• Decomposethegraphintooverlappingsubgraphs,thus
decomposingthetaxaintooverlappingsubsets.• Applythe“basemethod” toeachsubsetoftaxa,toconstruct
asubsettree• Applyasupertreemethodtothesubsettreestoobtaina
singletreeonthefullsetoftaxa.
-
DCMs(Disk-CoveringMethods)
• DCMsforpolynomialtimemethodsimprovetopologicalaccuracy(empiricalobservation)andhaveprovabletheoreticalguaranteesunderMarkovmodelsofevolution.
• DCMsforhardoptimizationproblemsreducerunningtimeneededtoachievegoodlevelsofaccuracy(empiricalobservation)
-
DecomposingTriangulatedGraphs
Max Clique Decomposition Separator-component Decomposition
Given:TriangulatedgraphG=(V,E)Output:DecompositionoftheverticesintooverlappingsubsetsRequire:Polynomialtime!Technique:Usespecialpropertiesabouttriangulatedgraphs
-
SimplicialVertices
Definition:LetG=(V,E)beagraph,andletvbeavertexinV.Thenvissimplicialifitssetofneighbors(i.e.,Γ(v))isaclique.Todo:• Giveexampleofagraphthathasnosimplicialvertices.
• Giveexampleofagraphwhereeveryvertexissimplicial.
-
PerfectEliminationOrdering
Definition:LetG=(V,E)beagraphonnvertices.Aperfecteliminationorderingisanorderingoftheverticesv1,v2,…,vnofGsothateachvertexviissimplicialinthegraphinducedon{vi+1,vi+2,…,vn}.Theorems:• Everytriangulatedgraphhasasimplicialvertex.• Infact,everytriangulatedgraphthatisnotacliquehastwonon-adjacentsimplicialvertices.
-
PerfectEliminationOrdering
Definition:LetG=(V,E)beagraphonnvertices.Aperfecteliminationorderingisanorderingoftheverticesv1,v2,…,vnofGsothateachvertexviissimplicialinthegraphinducedon{vi+1,vi+2,…,vn}.Theorems(Rose1970):AgraphGistriangulatedifandonlyifithasaperfecteliminationordering.Furthermore,givenatriangulatedgraph,aperfecteliminationorderingcanbefoundinpolynomialtime.
-
Somepropertiesofchordalgraphs
• Theorem:EverychordalgraphG=(V,E)hasatmost|V|maximalcliques,andthesecanbefoundinpolynomialtime:– Maxcliquedecomposition.
• Provethisusingtheexistenceofaperfecteliminationordering.
-
Somepropertiesofchordalgraphs
• Theorem:EverychordalgraphG=(V,E)hasatmost|V|maximalcliques,andthesecanbefoundinpolynomialtime:– Maxcliquedecomposition.
• Provethisusingtheexistenceofaperfecteliminationordering.
-
Somepropertiesofchordalgraphs
• Everychordalgraphthatisnotacliquehasavertexseparatorthatisamaximalclique,anditcanbefoundinpolynomialtime:– Separator-componentdecomposition.
-
Somepropertiesofchordalgraphs
• Everychordalgraphhasatmostnmaximalcliques,andthesecanbefoundinpolynomialtime:Maxcliquedecomposition.
• Everychordalgraphthatisnotacliquehasavertexseparatorthatisamaximalclique,anditcanbefoundinpolynomialtime:Separator-componentdecomposition.
-
DecomposingTriangulatedGraphs
Max Clique Decomposition Separator-component Decomposition
Given:TriangulatedgraphG=(V,E)Output:DecompositionoftheverticesintooverlappingsubsetsRequire:Polynomialtime!Technique:Usespecialpropertiesabouttriangulatedgraphs
-
DCMsareDivide-and-Conquerstrategies!
-
Howtocombinesubsettrees?
• Everytriangulatedgraphhasaperfecteliminationordering:– enablesustomergecorrectsubtreesandgetacorrectsupertreeback,ifsubtreesarebigenough(sothattheycontainalltheshortquartettrees).
-
TriangulatedGraphsandTrees
Theorem(Gravil1974,Buneman1974):AgraphGistriangulatedifandonlyifGistheintersectiongraphofasetofsubtreesofatree.Proof:Onedirectioniseasy,andtheotherisnot…
-
ExamplesofTriangulatedGraphs
• ThresholdgraphsTG(D,q):Disadditiveandqisanyrealnumber,and(x,y)isanedgeifandonlyifD[x,y]
-
ExamplesofTriangulatedGraphs
• ThresholdgraphsTG(D,q):Disadditiveandqisanyrealnumber,and(x,y)isanedgeifandonlyifD[x,y]
-
DCM1-boostingdistance-basedmethods[Nakhlehetal.ISMB2001]
• Theorem(Warnowetal.,SODA2001):DCM1-NJconvergestothetruetreefrompolynomiallengthsequences
NJ DCM1-NJ
0 400 800 1600 1200 No. Taxa
0
0.2
0.4
0.6
0.8
Erro
r Rat
e
-
ExamplesofTriangulatedGraphs
• ShortSubtreeGraphsSSG(T,w):Tisatreewithedge-weightingw,andeveryshortquartetcontributesa4-clique.
• Theorem:ForalltreesTwithedgeweightingw,SSG(T,w)isadditive.
• Proof:Usethefactthatallsubtreeintersectiongraphsaretriangulated.
-
Rec-I-DCM3 significantly improves performance
Comparison of TNT to Rec-I-DCM3(TNT) on one large dataset
0
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
0.18
0.2
0 4 8 12 16 20 24
Hours
Average MP score above
optimal, shown as a percentage of
the optimal
Current best techniques (TNT)
Rec-I-DCM3(TNT)
-
ExamplesofTriangulatedGraphs
• Character-stateintersectiongraphsfromperfectphylogenies.
• Theorem:Forallperfectphylogenies,thecharacterstateintersectiongraph(wherenodescorrespondtocharacterstatesandedgescorrespondtoanytwostatesatanynodeinthetree–internalandleaf)istriangulated.
• Proof:Usethefactthatallsubtreeintersectiongraphsaretriangulated.
-
“Homoplasy-Free”Evolution(perfectphylogenies)
YESNO
-
PerfectPhylogeny
• AphylogenyTforasetSoftaxaisaperfectphylogenyifeachstateofeachcharacteroccupiesasubtree(nocharacterhasback-mutationsorparallelevolution)
30
-
Perfectphylogenies,cont.
• A=(0,0),B=(0,1),C=(1,3),D=(1,2)hasaperfectphylogeny!
• A=(0,0),B=(0,1),C=(1,0),D=(1,1)doesnothaveaperfectphylogeny!
-
Aperfectphylogeny
• A=00• B=01• C=13• D=12• E=03• F=13
A
B
C
D
-
Aperfectphylogeny
• A=00• B=01• C=13• D=12• E=03• F=13
A
B
C
D
E F
-
ThePerfectPhylogenyProblem
• GivenasetSoftaxa(species,languages,etc.)determineifaperfectphylogenyTexistsforS.
• TheproblemofdeterminingwhetheraperfectphylogenyexistsisNP-hard(McMorrisetal.1994,Steel1991).
-
TriangulatedGraphs
• Agraphistriangulatedifithasnosimplecyclesofsizefourormore.
-
TriangulatedGraphs
Theorem(Gravil1974,Buneman1974):AgraphGistriangulatedifandonlyifGistheintersectiongraphofasetofsubtreesofatree.Proof:Onedirectioniseasy,andtheotherisnot…
-
PerfectPhylogeniesandTriangulatedColoredGraphs
• SupposeMisacharactermatrixandTisaperfectphylogenyforM.
• ThenletM’betheextensionofMtoincludetheadditional“species”addedattheinternalnodes.
• CharacterStateIntersectionGraphGbasedonM’:– ForeachcharacteralphaandstateiinM’,giveavertexv(alpha,i)andcolorthevertexwiththecolorforalpha.
– Putedgesbetweentwoverticesiftheyshareanyspecies.– Gistriangulatedandproperlycolored.
-
PerfectPhylogeniesandTriangulatedColoredGraphs
• SupposeMisacharactermatrixandTisaperfectphylogenyforM.
• ThenletM’betheextensionofMtoincludetheadditional“species”addedattheinternalnodes.
• ThecharacterstateintersectiongraphGbasedonM’istriangulatedandproperlycolored.(Why?)
• ButifwehadbaseditonMitmightnothavebeentriangulated.(Why?)
-
Aperfectphylogeny
• A=00• B=01• C=13• D=12• E=03• F=13Drawthecharacterstateintersectiongraph.
A
B
C
D
E F
-
Matrixwithaperfectphylogeny
c1c2c3s1321s2122s3113s4211
Draw the perfect phylogeny and compute the sequences at the internal nodes.
-
Matrixwithaperfectphylogeny
c1c2c3s1321s2122s3113s4211
Draw the character state intersection graph for the extended matrix (including the sequences at the internal nodes).
-
Thepartitionintersectiongraph
“Yes”InstanceofPP:c1c2c3s1321s2122s3113s4211
-
Triangulatingcoloredgraphs
• LetG=(V,E)beagraphandcbeavertexcoloringofG.ThenGcanbec-triangulatedifasupergraphG’=(V,E’)existsthatistriangulatedandwherethecoloringcisproper.
• Inotherwords,Gcanbec-triangulatedifandonlyifwecanaddedgestoGtomakeittriangulatedwithoutaddingedgesbetweenverticesofthesamecolor.
-
Agraphthatcanbec-triangulated
-
Agraphthatcanbec-triangulated
A vertex-colored graph G=(V,E) can be c-triangulated if a supergraph G’=(V,E’) exists that is triangulated and where the coloring is proper.
-
Agraphthatcannotbec-triangulated
-
TriangulatingColoredGraphs(TCG)
TriangulatingColoredGraphs:givenavertex-coloredgraphG,determineifGcanbec-triangulated.
-
ThePPandTCGProblems
• Buneman’sTheorem:AperfectphylogenyexistsforasetSifandonlyiftheassociatedcharacterstateintersectiongraphcanbec-triangulated.
• ThePPandTCGproblemsarepolynomiallyequivalentandNP-hard.
-
Ano-instanceofPerfectPhylogeny
• A=00• B=01• C=10• D=11
0 1
0
1
Aninputtoperfectphylogeny(left)offoursequencesdescribedbytwocharacters,anditscharacterstateintersectiongraph.Notethatthecharacterstateintersectiongraphis2-colored.
-
SolvingthePPProblemUsingBuneman’sTheorem
“Yes”InstanceofPP:c1c2c3s1321s2122s3113s4211
-
SolvingthePPProblemUsingBuneman’sTheorem
“Yes”InstanceofPP:c1c2c3s1321s2122s3113s4211
-
Somespecialcasesareeasy
• Binarycharacterperfectphylogenysolvableinlineartime
• r-statecharacterssolvableinpolynomialtimeforeachr(combinatorialalgorithm)
• Twocharacterperfectphylogenysolvableinpolynomialtime(produces2-coloredgraph)
• k-characterperfectphylogenysolvableinpolynomialtimeforeachk(producesk-coloredgraphs--connectionstoRobertson-Seymourgraphminortheory)
-
EarlyHistory• LeQuesne(1969,1972,1974,1977):initialformulationofperfectphylogenies• Estabrook(1972),Estabrooketal.(1975):mathematicalfoundationsofperfect
phylogenies• McMorris(1997):binarycharactercompatibility• Felsenstein(1984):reviewpaper• Estabrook&Landrum,Fitch1975:compatibilityoftwocharacters• Steel(1992)andBodlaenderetal.(1992):NP-hardness• Buneman(1974):reducedperfectphylogenytotriangulatingcoloredgraphs• KannanandWarnow(1992):establishedequivalenceofTCGandPP
SeeT.Warnow,1993.Constructingphylogenetictreesefficientlyusingcompatibilitycriteria.NewZealandJournalofBotany,31:3,pp.239-248(linkedoffmyhomepage)forasurveyoftheearlyliteratureandthefullcitations.
-
Literaturesample• R.AgarwalaandD.Fernandez-Baca,1994.Apolynomial-timealgorithmfortheperfectphylogenyproblem
whenthenumberofcharacterstatesisfixed.SIAMJournalonComputing,23,1216–1224.• R.AgarwalaandD.Fernandez-Baca,1996.Simplealgorithmsforperfectphylogenyandtriangulating
coloredgraphs.InternationalJournalofFoundationsofComputerScience,7,11–21.• H.L.Bodlaender,M.R.Fellows,MichaelT.Hallett,H.ToddWareham,andT.Warnow.2000.Thehardnessof
perfectphylogeny,feasibleregisterassignmentandotherproblemsonthincoloredgraphs,TheoreticalComputerScience244(2000)167-188
• H.L.BodlaenderandT.KIoks,1993:Asimplelineartimealgorithmfortriangulatingthree-coloredgraphs.JournalofalgorithmsJ5:160-172.
• D.Fernandez-Baca,2000.Theperfectphylogenyproblem.Pages203–234of:Du,D.-Z.,andCheng,X.(eds),SteinerTreesinIndustries.KluwerAcademicPublishers.
• D.Gusfield,1991.Efficientalgorithmsforinferringevolutionarytrees.Networks21,19–28.• R.IduryandA.Schaffer.1993:Triangulatingthree-coloredgraphsinlineartimeandlinearspace.SIAM
journalondiscretemathematics6:289-294.• S.KannanandT.Warnow,1992.Triangulating3-coloredgraphs.SIAMJ.onDiscreteMathematics,Vol.5
No.2,pp.249-258(alsoSODA1991)• S.KannanandT.Warnow,1997.Afastalgorithmforthecomputationandenumerationofperfect
phylogenieswhenthenumberofcharacterstatesisfixed.SIAMJ.Computing,Vol.26,No.6,pp.1749-1763(alsoSODA1995)
• F.R.McMorris,T.Warnow,andT.Wimer,1994.TriangulatingVertexColoredGraphs.SIAMJ.onDiscreteMathematics,Vol.7,No.2,pp.296-306(alsoSODA1993).
-
ApplicationsofPerfectPhylogeny
• TumorPhylogenetics(MohammedEl-KebirwilltalkthisonApril10-17,2018)
• HistoricalLinguistics(IwilltalkaboutthisonMarch15,2018)
• PopulationgeneticsandHaplotypeinference