Phylogenetic invariants versus classical phylogenetics
Marta Casanellas Rius (joint work with Jesús Fernández-Sánchez)
Departament de Matemàtica Aplicada I, Universitat Politècnica de Catalunya
Algebraic Statistics 2014, IIT, Chicago
May 2014
M. Casanellas (UPC) www.ma1.upc.edu/∼casanellas Chicago 2014 1 / 55
Outline
1 Evolutionary Markov models
2 Phylogenetic invariants
3 Eriksson’s method revisited
Phylogenetic trees
Phylogenetic reconstruction
Given n current species, e.g. n = 3, HUMAN, GORILLA and CHIMP, we want to reconstruct their ancestral relationships (phylogeny):
Input: DNA sequences (part of their genome) and an alignment
AACTTCGAGGCTTACCGCTG...
AAGGTCGATGCTCACCGATG...
AACGTCTATGCTCACCGATG...
Phylogenetic reconstruction goals
Goals:
Reconstruct the correct tree topology (the topology of the graph with species labels at the leaves). E.g. the 3 different topologies of unrooted trees on four leaves: 12|34, 13|24 and 14|23.
Estimate branch lengths: the number of substitutions per site that have occurred between the two species at the ends of the branch.
Modelling evolution
Assume that all sites of the alignment evolve independently and according to the same process (i.i.d.).
Model the evolution of a single site on a tree T.
At each node of the tree T one puts a random variable taking values in {A, C, G, T}.
Variables at the leaves are observed and variables at the interior nodes are hidden.
At each branch i one writes a substitution matrix $S_i$ with the conditional probabilities of a nucleotide at the parent node being substituted by another at its child (rows: parent state, columns: child state, both ordered A, C, G, T):

$$
S_i = \begin{pmatrix}
P(A|A,i) & P(C|A,i) & P(G|A,i) & P(T|A,i) \\
P(A|C,i) & P(C|C,i) & P(G|C,i) & P(T|C,i) \\
P(A|G,i) & P(C|G,i) & P(G|G,i) & P(T|G,i) \\
P(A|T,i) & P(C|T,i) & P(G|T,i) & P(T|T,i)
\end{pmatrix}
$$
Evolutionary Markov models
The entries of Si and the root distribution are unknown parameters.
According to the shape of $S_i$ (or the group of symmetries in the symmetric group $S_4$ acting on it) one has different evolutionary models: general Markov model, strand symmetric model, Kimura 3-parameter, Kimura 2-parameter, Jukes-Cantor model.
Evolutionary Markov models
Jukes-Cantor model
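Root node has uniform distribution: $p_A = \dots = p_T = 0.25$

$$
S_i = \begin{pmatrix}
a_i & b_i & b_i & b_i \\
b_i & a_i & b_i & b_i \\
b_i & b_i & a_i & b_i \\
b_i & b_i & b_i & a_i
\end{pmatrix},
\qquad a_i + 3b_i = 1
$$

(rows and columns ordered A, C, G, T)
Only 2 different parameters (one for mutation and one for non-mutation), only 1 free parameter.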
Evolutionary Markov models
Kimura 3-parameter model
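Root node has uniform distribution: $p_A = \dots = p_T = 0.25$

$$
S_i = \begin{pmatrix}
a_i & b_i & c_i & d_i \\
b_i & a_i & d_i & c_i \\
c_i & d_i & a_i & b_i \\
d_i & c_i & b_i & a_i
\end{pmatrix},
\qquad a_i + b_i + c_i + d_i = 1
$$

(rows and columns ordered A, C, G, T)
Kimura 2-parameter: $b_i = d_i$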
Evolutionary Markov models
Strand Symmetric model
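Root node distribution satisfies: $p_A = p_T$, $p_C = p_G$

$$
S_i = \begin{pmatrix}
a_i & b_i & c_i & d_i \\
e_i & f_i & g_i & h_i \\
h_i & g_i & f_i & e_i \\
d_i & c_i & b_i & a_i
\end{pmatrix}
$$

(rows and columns ordered A, C, G, T)
Strand symmetry in a DNA molecule: the complementary base pairs are A–T and C–G.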
Evolutionary Markov models
General Markov model
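Root node distribution: arbitrary

$$
S_i = \begin{pmatrix}
a_i & b_i & c_i & d_i \\
e_i & f_i & g_i & h_i \\
j_i & k_i & l_i & m_i \\
n_i & o_i & p_i & q_i
\end{pmatrix}
$$

(rows and columns ordered A, C, G, T)
All these models are known as equivariant models: their substitution matrices are invariant under simultaneous permutation of rows and columns by the elements of a certain subgroup G of the symmetric group $S_4$ (permutations of A, C, G, T).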
Hidden Markov process
$p_{x_1x_2x_3} = \mathrm{Prob}(X_1 = x_1, X_2 = x_2, X_3 = x_3)$ is the joint probability of the observed variables $X_1, X_2, X_3$. Then,

$$
p_{x_1x_2x_3} = \sum_{y_4,\,y_r \in \{A,C,G,T\}} p_{y_r}\, S_1(y_r, x_1)\, S_4(y_r, y_4)\, S_2(y_4, x_2)\, S_3(y_4, x_3)
$$

$p_{x_1x_2x_3}$ is a homogeneous polynomial in the parameters (entries of the matrices $S_i$ and root distribution).
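The sum over the hidden states can be evaluated directly. The following is a minimal sketch in Python, assuming Jukes-Cantor matrices on every branch and exact rational arithmetic. The helper names (`jukes_cantor`, `joint_distribution`) and the parameter values are illustrative assumptions, not taken from the talk.

```python
from fractions import Fraction
from itertools import product

STATES = "ACGT"

def jukes_cantor(b):
    """Jukes-Cantor matrix as a dict: off-diagonal entries b, diagonal a = 1 - 3b."""
    b = Fraction(b)
    a = 1 - 3 * b
    return {(x, y): (a if x == y else b) for x in STATES for y in STATES}

def joint_distribution(root, S1, S2, S3, S4):
    """p[x1x2x3] for the tree: root r -> leaf 1, r -> node 4, node 4 -> leaves 2 and 3."""
    p = {}
    for x1, x2, x3 in product(STATES, repeat=3):
        total = Fraction(0)
        for yr, y4 in product(STATES, repeat=2):  # sum over the hidden states
            total += root[yr] * S1[yr, x1] * S4[yr, y4] * S2[y4, x2] * S3[y4, x3]
        p[x1 + x2 + x3] = total
    return p

# Hypothetical parameter values, chosen only for illustration.
root = {x: Fraction(1, 4) for x in STATES}
S1, S2, S3, S4 = (jukes_cantor(Fraction(1, k)) for k in (10, 20, 25, 50))
p = joint_distribution(root, S1, S2, S3, S4)
print(sum(p.values()))  # the 64 joint probabilities sum to 1
```

Each coordinate of `p` is a sum of products of one root probability and four matrix entries, which is the degree-5 homogeneity stated above.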
An evolutionary model M (with d free parameters) on a tree topology T on n leaves defines a map

$$
\varphi^M_T : \prod_{e \in E(T)} \Delta^{d} \longrightarrow \Delta^{4^n - 1}
$$

$$
\varphi^M_T : \prod_{e \in E(T)} \mathbb{C}^{d+1} \longrightarrow \mathbb{C}^{4^n}
$$

Phylogenetic variety: $V_M(T) = \overline{\mathrm{Im}\,\varphi^M_T}$ (closure in the Zariski topology).
Given an alignment, we can estimate the joint probability $p_{x_1x_2\dots x_n}$ by the relative frequency $f_{x_1x_2\dots x_n}$ of column $x_1x_2\dots x_n$ in the alignment.
In the theoretical model, this would be a point $p = (f_{x_1x_2\dots x_n})_{x_1x_2\dots x_n}$ on the variety $V_M(T)$.
In the real world, it is only close to $V_M(T)$ for some tree topology T.
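Estimating the point p from data amounts to counting alignment columns. A minimal sketch, assuming the alignment is given as equal-length strings; the function name `column_frequencies` is hypothetical, and the toy data are only the first 20 columns of the three-species alignment shown earlier (the remainder is elided there).

```python
from collections import Counter
from fractions import Fraction

def column_frequencies(alignment):
    """Relative frequency of each column pattern x1x2...xn in an alignment."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    counts = Counter("".join(col) for col in zip(*alignment))
    return {pattern: Fraction(c, length) for pattern, c in counts.items()}

# Toy data: the visible prefix of the alignment, one string per species.
alignment = [
    "AACTTCGAGGCTTACCGCTG",
    "AAGGTCGATGCTCACCGATG",
    "AACGTCTATGCTCACCGATG",
]
f = column_frequencies(alignment)
print(f["AAA"])  # 1/5: the column AAA occurs 4 times in 20 sites
```

Patterns that never occur are simply absent from the dictionary and correspond to coordinates equal to 0.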
Phylogenetic mixtures
Replace the i.i.d. hypothesis with “all positions evolve independently and following the same evolutionary model”.
Same model M, same tree topology $T = T_1$ but different substitution parameters. Then,

$$
p = (p_{AA\dots A}, \dots, p_{TT\dots T}) = \lambda\,\varphi^M_{T_1}(S_1, \dots, S_4) + (1 - \lambda)\,\varphi^M_{T_1}(S'_1, \dots, S'_4)
$$

That is, $p \in \mathrm{Sec}\,V_M(T_1)$ (the secant variety of $V_M(T_1)$).
Same model M, different tree topologies and different substitution parameters. Then,

$$
p = (p_{AA\dots A}, \dots, p_{TT\dots T}) = \lambda\,\varphi^M_{T_1}(S_1, \dots, S_4) + (1 - \lambda)\,\varphi^M_{T_2}(S'_1, \dots, S'_4)
$$

That is, $p \in V_M(T_1) * V_M(T_2)$ (join of two varieties).
Which generators of $I(V_M(T))$ are useful for recovering the tree topology?
Some generators depend on the model chosen (not on the tree topology). E.g. for Jukes-Cantor, n = 3:
$\sum p_{x_1x_2x_3} = 1$ and

$$
\begin{gathered}
p_{AAA} - p_{CCC},\quad p_{AAA} - p_{GGG},\quad p_{AAA} - p_{TTT}, \\
p_{AAC} - p_{AAG},\quad p_{AAC} - p_{AAT},\quad p_{CCA} - p_{CCG},\ \dots, \\
p_{ACA} - p_{AGA},\quad p_{ACA} - p_{ATA},\quad p_{CAC} - p_{CGC},\ \dots \\
p_{CAA} - p_{GAA},\quad p_{CAA} - p_{TAA},\quad p_{ACC} - p_{GCC},\ \dots
\end{gathered}
$$

For this model, the polynomials that detect the topology of the three species have degree 3.
Definition (Cavender–Felsenstein ’87)
The elements of $I(V_M(T))$ are called invariants. They are called phylogenetic invariants if they distinguish between different tree topologies.
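The linear binomials listed for Jukes-Cantor all equate two coordinates whose column patterns differ only by a renaming of nucleotides, which is what invariance of the model under all permutations of A, C, G, T (with uniform root) gives. A small sketch that groups the 64 coordinates for n = 3 accordingly; the function name `pattern` is hypothetical.

```python
from itertools import product

STATES = "ACGT"

def pattern(word):
    """Canonical form of a column up to renaming nucleotides, e.g. 'CCA' -> (0, 0, 1)."""
    seen = {}
    return tuple(seen.setdefault(x, len(seen)) for x in word)

# Group the coordinates p_{x1x2x3} by pattern.
classes = {}
for word in product(STATES, repeat=3):
    classes.setdefault(pattern(word), []).append("".join(word))

for pat, words in sorted(classes.items()):
    print(pat, len(words), words[:3])
```

Under these assumptions the 64 coordinates fall into 5 classes, so a class with m members contributes m - 1 independent binomials of the kind listed above.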
Description of the ideal
Computational algebra software: fails for ≥ 5 leaves.
Model M fixed.
Theorem (Allman-Rhodes, Draisma-Kuttler, Sturmfels, Sullivant, C)
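There is an algorithm for obtaining the generators of the ideal of an n-leaved tree T from those of an unrooted tree of 3 leaves and the edges of the tree:

$$
I(V(T)) = \sum_{v \in \mathrm{Int}(T)} I_v + \sum_{e \in E(T)} I_e
$$

Here $I_v$ is the ideal attached to the interior node v and $I_e$ the ideal attached to the edge e.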
Invariants from edges
Example: M = general Markov model.
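16 × 16 matrix obtained by flattening $p = (p_{AAAA}, p_{AAAC}, \dots, p_{TTTT})$ in $\otimes^4 \mathbb{C}^4 \cong (\otimes^2 \mathbb{C}^4) \otimes (\otimes^2 \mathbb{C}^4)$, with rows indexed by the states at leaves 1, 2 and columns by the states at leaves 3, 4:

$$
\mathrm{flat}_e\, p = \begin{pmatrix}
p_{AAAA} & p_{AAAC} & p_{AAAG} & \dots & p_{AATT} \\
p_{ACAA} & p_{ACAC} & p_{ACAG} & \dots & p_{ACTT} \\
p_{AGAA} & p_{AGAC} & p_{AGAG} & \dots & p_{AGTT} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
p_{TTAA} & p_{TTAC} & p_{TTAG} & \dots & p_{TTTT}
\end{pmatrix}
$$

$I_e$ := ideal generated by all 5 × 5 minors.
(Allman-Rhodes, Eriksson) The elements of $I_e$ are phylogenetic invariants for the general Markov model.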
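Vanishing of all 5 × 5 minors is the same as the flattening having rank at most 4, which can be checked numerically. The sketch below builds an exact 4-leaf distribution on the tree 12|34 and compares the ranks of the three possible flattenings. Jukes-Cantor matrices are used only to keep the example short (they are a special case of the general Markov model); all function names and parameter values are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

STATES = "ACGT"

def jukes_cantor(b):
    """Jukes-Cantor matrix as a dict, off-diagonal entries b."""
    b = Fraction(b)
    return {(x, y): (1 - 3 * b if x == y else b) for x in STATES for y in STATES}

def quartet_distribution(S1, S2, S3, S4, S5):
    """Joint distribution on the tree 12|34: leaves 1, 2 hang from node u,
    leaves 3, 4 hang from node v, and S5 sits on the interior edge u -> v.
    The root is placed at u with uniform distribution."""
    p = {}
    for x in product(STATES, repeat=4):
        total = Fraction(0)
        for u, v in product(STATES, repeat=2):  # hidden states
            total += (Fraction(1, 4) * S1[u, x[0]] * S2[u, x[1]]
                      * S5[u, v] * S3[v, x[2]] * S4[v, x[3]])
        p[x] = total
    return p

def flattening(p, rows, cols):
    """16 x 16 matrix: rows indexed by the states at leaves `rows`,
    columns by the states at leaves `cols` (0-based leaf positions)."""
    pairs = list(product(STATES, repeat=2))
    matrix = []
    for r in pairs:
        line = []
        for c in pairs:
            x = [None] * 4
            x[rows[0]], x[rows[1]] = r
            x[cols[0]], x[cols[1]] = c
            line.append(p[tuple(x)])
        matrix.append(line)
    return matrix

def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [list(r) for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

# Hypothetical branch parameters for the four pendant edges and the interior edge.
S1, S2, S3, S4 = (jukes_cantor(Fraction(1, k)) for k in (10, 20, 25, 50))
S5 = jukes_cantor(Fraction(1, 8))
p = quartet_distribution(S1, S2, S3, S4, S5)

print(rank(flattening(p, (0, 1), (2, 3))))  # 4  (split 12|34, an edge of the tree)
print(rank(flattening(p, (0, 2), (1, 3))))  # 16 (split 13|24, not an edge)
print(rank(flattening(p, (0, 3), (1, 2))))  # 16 (split 14|23, not an edge)
```

Only the flattening along the split induced by the interior edge drops to rank 4, so its 5 × 5 minors vanish while those of the other two flattenings do not.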
Phylogenetic invariants
We are just interested in phylogenetic invariants, i.e., generators that vanish on some tree topologies but not on all.
Ingredients:
1 n given species
2 evolutionary model M is fixed
3 want tree topology relating them
We call
$\mathcal{T}$ = {trivalent tree topologies on n labelled leaves}
Assuming that our data p comes from a distribution on some tree $T \in \mathcal{T}$ under the model M, find the correct tree $T_0$.
In algebraic geometry this is: assume $p \in \bigcup_{T \in \mathcal{T}} V_M(T)$ and find $T_0$ such that $p \in V_M(T_0)$.
Phylogenetic invariants
Goal: for any particular tree $T_0 \in \mathcal{T}$, how is the variety $V_M(T_0)$ defined inside $\bigcup_{T \in \mathcal{T}} V_M(T)$? (At least in a biologically meaningful open subset.)
Main Theorem
Theorem
Let M = Jukes-Cantor, Kimura (2 or 3 parameters), Strand Symmetric, General Markov model (or any equivariant model). For topology reconstruction purposes, it is enough to consider invariants from edges.
More precisely, for each tree topology $T \in \mathcal{T}$ there exists a dense open subset $U_T \subseteq V(T)$ such that if a point p belongs to $\bigcup_{T \in \mathcal{T}} U_T$, then

$$
p \in V(T_0) \iff p \in Z\Bigl(\sum_{e \in E(T_0)} I_e\Bigr).
$$

Theorem (Buneman’s Splits Equivalence Theorem)
A tree can be recovered by knowing all bipartitions induced by its edges.
Key: Representation theory to include all models
Draisma-Kuttler framework: representation theory
Example: Jukes-Cantor model. Substitution matrices (rows and columns ordered A, C, G, T):

$$
S = \begin{pmatrix}
a & b & b & b \\
b & a & b & b \\
b & b & a & b \\
b & b & b & a
\end{pmatrix}
$$

Applying any permutation simultaneously to rows and columns leaves the matrix invariant.
Consider $G = S_4$, permutations of A, C, G, T.
Consider the representation of G, $\rho : G \longrightarrow GL(W)$, $W = \langle A, C, G, T \rangle_{\mathbb{C}} \cong \mathbb{C}^4$, corresponding to the action of G on 4 × 4 matrices. For instance,

$$
(A, C) \mapsto \begin{pmatrix}
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
$$
New description of the parametrization
We associate the vector space $W = \langle A, C, G, T \rangle_{\mathbb{C}}$ to each node in the tree and then each substitution matrix belongs to $\mathrm{Hom}_G(W, W)$.
Therefore the substitution matrices lie in $\mathrm{par}_G(T) := \prod_{e \in E(T)} \mathrm{Hom}_G(W, W)$ and the map $\varphi_T$ parameterizing $V_M(T)$ can be seen as

$$
\varphi_T : \mathrm{par}_G(T) \longrightarrow (\otimes^n W)^G
$$
Other models: subgroups of S4
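$G = \langle (A, C)(G, T),\ (A, G)(C, T) \rangle \cong \mathbb{Z}_2 \times \mathbb{Z}_2$ for the Kimura 3-parameter model.
$G = \langle (A, T)(C, G) \rangle$ for the strand symmetric model.
$G = \{\mathrm{id}\}$ for the general Markov model.
In all cases $W = \langle A, C, G, T \rangle_{\mathbb{C}}$ and $\rho : G \longrightarrow GL(W)$ is the permutation representation.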
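The correspondence between models and groups can be checked mechanically: a substitution matrix belongs to the model when permuting its rows and columns by each generator of G leaves it unchanged. A minimal sketch, with matrix entries written as symbolic labels (equal labels mean equal parameters); the helper names are hypothetical.

```python
STATES = "ACGT"

def permutation(*cycles):
    """Permutation of A, C, G, T given by disjoint cycles, e.g. permutation("AC", "GT")."""
    sigma = {x: x for x in STATES}
    for cycle in cycles:
        for k, x in enumerate(cycle):
            sigma[x] = cycle[(k + 1) % len(cycle)]
    return sigma

def act(sigma, matrix):
    """Apply sigma simultaneously to the rows and columns of a 4 x 4 matrix."""
    idx = {x: i for i, x in enumerate(STATES)}
    out = [[None] * 4 for _ in range(4)]
    for x in STATES:
        for y in STATES:
            out[idx[sigma[x]]][idx[sigma[y]]] = matrix[idx[x]][idx[y]]
    return out

def is_equivariant(matrix, generators):
    """True if the matrix is fixed by every generator (hence by the whole group)."""
    return all(act(sigma, matrix) == matrix for sigma in generators)

def symbolic(*rows):
    """Build a matrix of one-letter parameter labels from row strings."""
    return [list(r) for r in rows]

jukes_cantor = symbolic("abbb", "babb", "bbab", "bbba")
kimura3 = symbolic("abcd", "badc", "cdab", "dcba")
strand_symmetric = symbolic("abcd", "efgh", "hgfe", "dcba")
general_markov = symbolic("abcd", "efgh", "jklm", "nopq")

S4 = [permutation("AC"), permutation("ACGT")]          # generate all of S4
K3 = [permutation("AC", "GT"), permutation("AG", "CT")]
SSM = [permutation("AT", "CG")]

print(is_equivariant(jukes_cantor, S4))       # True
print(is_equivariant(kimura3, K3))            # True
print(is_equivariant(kimura3, S4))            # False
print(is_equivariant(strand_symmetric, SSM))  # True
print(is_equivariant(general_markov, SSM))    # False
```

The general Markov matrix is fixed only by the trivial group, for which the list of generators is empty.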
Jukes-Cantor
If $G = S_4$ (Jukes-Cantor), then G has 5 irreducible representations $M_0, \dots, M_4$.
(Maschke): Any G-representation E decomposes into isotypic components, $E \cong \bigoplus_{i=0}^{4} M_i \otimes \mathbb{C}^{m_i}$, and we associate to it a 5-tuple of multiplicities $m(E) = (m_0, \dots, m_4)$.
Our particular representation $\rho : G \longrightarrow GL(W)$ for the Jukes-Cantor model decomposes as $W \cong M_0 \oplus M_3$ and it has $m(W) = (1, 0, 0, 1, 0)$.
Thin flattening
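Let $L_1 \mid L_2$ be a bipartition of the set of leaves $\{1, \dots, n\}$, and denote $l_1 = |L_1|$, $l_2 = |L_2|$.
Let $p \in (\otimes^{n} W)^G$. The matrix of the thin flattening of $p$ along $L_1 \mid L_2$, $\mathrm{Tflat}_{L_1|L_2}\,p = (P_0, \dots, P_4)$, is obtained from $p$ via the isomorphism
$$(\otimes^{l_1} W \otimes \otimes^{l_2} W)^G \cong \mathrm{Hom}_G(\otimes^{l_1} W, \otimes^{l_2} W) \cong \bigoplus_{i=0}^{4} \mathrm{Hom}_{\mathbb{C}}\big(\mathbb{C}^{m(\otimes^{l_1} W)_i}, \mathbb{C}^{m(\otimes^{l_2} W)_i}\big)$$
(the last isomorphism is due to Schur's Lemma).
Write $\mathrm{rk}\,\mathrm{Tflat}_{L_1|L_2}\,p := (\mathrm{rk}\,P_0, \dots, \mathrm{rk}\,P_4)$.
Rank conditions:
$$\mathrm{rk}\,\mathrm{Tflat}_{L_1|L_2}\,p \le m(W) = (1, 0, 0, 1, 0)$$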
Example (Jukes-Cantor)
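$n = 4$, $L_1 = \{1, 2\}$, $L_2 = \{3, 4\}$.
$$\mathrm{Hom}_G(\otimes^{2} W, \otimes^{2} W) \cong \bigoplus_{i=0}^{4} \mathrm{Hom}_{\mathbb{C}}\big(\mathbb{C}^{m(\otimes^{2} W)_i}, \mathbb{C}^{m(\otimes^{2} W)_i}\big)$$
$W \otimes W \cong M_0^{2} \oplus M_2 \oplus M_3^{3} \oplus M_4$, hence $m(\otimes^{2} W) = (2, 0, 1, 3, 1)$ and $\mathrm{Tflat}_{L_1|L_2}\,p$ can be written as a block matrix in a certain basis:
$$\mathrm{Tflat}_{L_1|L_2}\,p = \begin{pmatrix}
\begin{smallmatrix} * & * \\ * & * \end{smallmatrix} & 0 & 0 & 0 \\
0 & * & 0 & 0 \\
0 & 0 & \begin{smallmatrix} * & * & * \\ * & * & * \\ * & * & * \end{smallmatrix} & 0 \\
0 & 0 & 0 & *
\end{pmatrix}$$
$\mathrm{rk}\,\mathrm{Tflat}_{L_1|L_2}\,p \le m(W) = (1, 0, 0, 1, 0)$.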
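The multiplicities quoted for $W$ and $W \otimes W$ can be checked with characters of $S_4$. The following is a minimal sketch, not part of the talk: it assumes the ordering $M_0, \dots, M_4$ = trivial, sign, 2-dimensional, standard, standard ⊗ sign (an ordering chosen here because it reproduces the tuples on the slides), and the names `multiplicities` and `tensor_power_character` are hypothetical.

```python
# Sizes of the conjugacy classes of S4: identity, transpositions,
# double transpositions, 3-cycles, 4-cycles.
CLASS_SIZES = [1, 6, 3, 8, 6]

# Character table of S4. The ordering M0..M4 is an assumption made for this
# sketch; it is the one that reproduces m(W) = (1, 0, 0, 1, 0).
IRREDUCIBLE_CHARACTERS = [
    [1, 1, 1, 1, 1],    # M0: trivial
    [1, -1, 1, 1, -1],  # M1: sign
    [2, 0, 2, -1, 0],   # M2: 2-dimensional
    [3, 1, -1, 0, -1],  # M3: standard
    [3, -1, -1, 0, 1],  # M4: standard tensor sign
]

# Character of the permutation representation W = <A, C, G, T>:
# the number of letters fixed by a permutation of each class.
CHI_W = [4, 2, 0, 1, 0]


def multiplicities(chi):
    """Return m(E) = (m0, ..., m4) for a representation with character chi."""
    order = sum(CLASS_SIZES)  # |S4| = 24
    result = []
    for irreducible in IRREDUCIBLE_CHARACTERS:
        inner = sum(s * a * b for s, a, b in zip(CLASS_SIZES, chi, irreducible))
        assert inner % order == 0  # multiplicities are integers
        result.append(inner // order)
    return tuple(result)


def tensor_power_character(chi, power):
    """Character of the power-fold tensor product: entrywise power of chi."""
    return [value ** power for value in chi]


print(multiplicities(CHI_W))                             # (1, 0, 0, 1, 0)
print(multiplicities(tensor_power_character(CHI_W, 2)))  # (2, 0, 1, 3, 1)
```

The same function applied to `tensor_power_character(CHI_W, l)` gives the block sizes of the thin flattening for a side with $l$ leaves.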
Invariants for mixtures on the same tree
If $p$ is a $k$-mixture distribution on a tree topology $T$, $p = \sum_{i=1}^{k} \lambda_i \varphi^{M}_{T}(\theta_i)$, and $A|B$ is a bipartition induced by an edge of $T$, then
$$\mathrm{rk}(\mathrm{flat}_{A|B}(p)) \le 4k$$
(for the general Markov model; analogous bounds hold for other equivariant models).
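The rank bound can be observed numerically on a quartet. The sketch below is an illustration under assumptions of its own: it builds a 2-mixture on the tree $12|34$ from randomly drawn general Markov parameters (the helper names and the way parameters are sampled are hypothetical) and computes the rank of the $16 \times 16$ flattening along the edge split.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_transition_matrix():
    """Random 4 x 4 row-stochastic matrix (hypothetical sampling scheme)."""
    m = np.eye(4) + 0.3 * rng.random((4, 4))
    return m / m.sum(axis=1, keepdims=True)


def quartet_distribution():
    """Joint distribution p[i, j, k, l] on the quartet tree 12|34."""
    pi = rng.dirichlet(np.ones(4))  # distribution at one interior node
    m1, m2, me, m3, m4 = (random_transition_matrix() for _ in range(5))
    # Sum over the hidden states x, y at the two interior nodes.
    return np.einsum("x,xi,xj,xy,yk,yl->ijkl", pi, m1, m2, me, m3, m4)


def flattening(p, split):
    """16 x 16 flattening of p along '12|34', '13|24' or '14|23'."""
    axes = {"12|34": (0, 1, 2, 3), "13|24": (0, 2, 1, 3), "14|23": (0, 3, 1, 2)}
    return p.transpose(axes[split]).reshape(16, 16)


k = 2
weights = rng.dirichlet(np.ones(k))
mixture = sum(w * quartet_distribution() for w in weights)

# Rank along the edge split 12|34; the bound says it is at most 4k.
rank_edge = np.linalg.matrix_rank(flattening(mixture, "12|34"), tol=1e-10)
# Rank along a split that is not induced by an edge, for comparison.
rank_other = np.linalg.matrix_rank(flattening(mixture, "13|24"), tol=1e-10)
print(rank_edge, rank_other)
```

The tolerance `1e-10` is an arbitrary choice for exact (noise-free) distributions; with empirical frequencies the rank is full and one measures a distance to low rank instead.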
Some results on simulations
Phylogenetic invariants have been tested on simulated data with not-so-bad results.
They have also been tested in quartet-based methods (Rusinko-Hipp, Sumner et al.).
But... there is no canonical way of using them.
Model invariants have been used successfully for model selection (+Kedzierska, Guigo).
Invariants have also been useful for proving identifiability of certain models (Allman-Rhodes et al.).
Are we ready for application to real data???
Outline
1 Evolutionary Markov models
2 Phylogenetic invariants
3 Eriksson’s method revisited
Singular value decomposition (SVD)
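The distance (in Frobenius or 2-norm) of an $m \times n$ matrix $M$ to $R_4 = \{ m \times n \text{ matrices of rank} \le 4 \}$ can be computed easily:
SVD: $M = U D V^t$ with $U \in M_{m \times m}$, $D \in M_{m \times n}$, $V \in M_{n \times n}$, $U, V$ orthogonal, $D$ "almost" diagonal,
$$D = \begin{pmatrix}
\sigma_1 & & & & 0 & \cdots & 0 \\
 & \sigma_2 & & & \vdots & & \vdots \\
 & & \ddots & & \vdots & & \vdots \\
 & & & \sigma_m & 0 & \cdots & 0
\end{pmatrix}, \qquad \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_m \ge 0.$$
Then $\min_{X \in R_4} \|M - X\| = \|M - X_4\|$ where
$$X_4 = U \begin{pmatrix}
\sigma_1 & & & & & & & 0 & \cdots & 0 \\
 & \sigma_2 & & & & & & \vdots & & \vdots \\
 & & \sigma_3 & & & & & \vdots & & \vdots \\
 & & & \sigma_4 & & & & \vdots & & \vdots \\
 & & & & 0 & & & \vdots & & \vdots \\
 & & & & & \ddots & & \vdots & & \vdots \\
 & & & & & & 0 & 0 & \cdots & 0
\end{pmatrix} V^t$$
and $d_2(M, R_4) = \sigma_5$, $d_F(M, R_4) = \sqrt{\sum_{i \ge 5} \sigma_i^2}$.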
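As a quick numerical check of these two distance formulas, here is a minimal sketch using NumPy's SVD. The function names are hypothetical, and the random $16 \times 16$ test matrix is only a stand-in for a flattening matrix.

```python
import numpy as np


def distances_to_rank_at_most(matrix, r=4):
    """Return (d_2, d_F): 2-norm and Frobenius distances to {rank <= r}."""
    sigma = np.linalg.svd(matrix, compute_uv=False)  # descending order
    tail = sigma[r:]                                 # sigma_{r+1}, sigma_{r+2}, ...
    d_2 = float(tail[0]) if tail.size else 0.0
    d_f = float(np.sqrt(np.sum(tail ** 2)))
    return d_2, d_f


def best_rank_approximation(matrix, r=4):
    """X_r = U diag(sigma_1, ..., sigma_r, 0, ..., 0) V^t."""
    u, sigma, vt = np.linalg.svd(matrix, full_matrices=False)
    truncated = np.where(np.arange(sigma.size) < r, sigma, 0.0)
    return (u * truncated) @ vt  # scales the columns of u, then multiplies by V^t


rng = np.random.default_rng(1)
m = rng.random((16, 16))
d_2, d_f = distances_to_rank_at_most(m)
x4 = best_rank_approximation(m)

# Both distances are attained at the truncated SVD X_4.
print(np.isclose(np.linalg.norm(m - x4, 2), d_2))
print(np.isclose(np.linalg.norm(m - x4, "fro"), d_f))
```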
Eriksson’s SVD method
Use the SVD to evaluate how close a bipartition $A|B$ is to being an edge split: $sc(A|B) = d_F(\mathrm{flat}_{A|B}(p), R_4)$.
Find the best cherry in the tree by looking at all bipartitions of 2 taxa against the others.
Treat this cherry as an indivisible pair of taxa, and proceed recursively.
Example (∗ marks the selected bipartition):
12|345, 13|245, 14|235∗, 15|234, 23|145, 24|135, 25|134, 34|125, 35|124, 45|123
⇒ (14)2|35, (14)3|25, (14)5|23∗
⇒ (1, 4), 5, (2, 3)
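The candidate bipartitions examined at each round of the example can be enumerated mechanically. The sketch below covers only this combinatorial step (scoring each candidate would use the SVD distance); `cherry_bipartitions` and `merge_cherry` are hypothetical helper names, not taken from Eriksson's implementation.

```python
from itertools import combinations


def cherry_bipartitions(taxa):
    """All bipartitions {pair} | {rest} of taxa, each listed once."""
    taxa = list(taxa)
    seen = set()
    splits = []
    for pair in combinations(taxa, 2):
        rest = tuple(t for t in taxa if t not in pair)
        key = frozenset([frozenset(pair), frozenset(rest)])
        if key not in seen:  # with 4 items, pair|rest and rest|pair coincide
            seen.add(key)
            splits.append((pair, rest))
    return splits


def merge_cherry(taxa, pair):
    """Replace the two members of the chosen cherry by one grouped taxon."""
    return [tuple(pair)] + [t for t in taxa if t not in pair]


first_round = cherry_bipartitions([1, 2, 3, 4, 5])
second_round = cherry_bipartitions(merge_cherry([1, 2, 3, 4, 5], (1, 4)))
print(len(first_round), len(second_round))  # 10 3
```

With 5 taxa there are 10 candidates; after grouping (1, 4) only 3 remain, matching the two lists in the example.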
Problems of Eriksson’s method:
1 Low performance in the presence of long branches
2 Is it fair to compare bipartitions of different size with this score?
3 What happens when we have small sample size?
Long branch attraction
[Figure: the correct tree versus the tree that is usually inferred.]
Flattening matrix for the wrong topology
| 13\|24 | AA | AC | ··· | CA | CC | CG | ··· | GC | GG | GT | ··· | TG | TT | + |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AA | 0.128 | 0.001 | ··· | 0.001 | 0 | 0.001 | ··· | 0.001 | 0.021 | 0.001 | ··· | 0 | 0.011 | 0.190 |
| AC | 0.043 | 0.001 | ··· | 0 | 0.002 | 0 | ··· | 0.002 | 0.027 | 0.001 | ··· | 0 | 0.013 | 0.100 |
| AG | 0.008 | 0 | ··· | 0 | 0 | 0 | ··· | 0 | 0.009 | 0 | ··· | 0 | 0.002 | 0.023 |
| AT | 0.011 | 0 | ··· | 0 | 0 | 0 | ··· | 0.001 | 0.001 | 0.002 | ··· | 0 | 0.039 | 0.058 |
| CA | 0.023 | 0 | ··· | 0.001 | 0.025 | 0.004 | ··· | 0.003 | 0.028 | 0.002 | ··· | 0.001 | 0.019 | 0.114 |
| CC | 0.005 | 0 | ··· | 0.003 | 0.140 | 0.011 | ··· | 0.001 | 0.017 | 0 | ··· | 0.001 | 0.029 | 0.213 |
| CG | 0 | 0 | ··· | 0.001 | 0.026 | 0.008 | ··· | 0.002 | 0.013 | 0 | ··· | 0 | 0.006 | 0.059 |
| CT | 0 | 0 | ··· | 0 | 0.012 | 0.001 | ··· | 0 | 0.001 | 0.004 | ··· | 0 | 0.055 | 0.084 |
| GA | 0.001 | 0 | ··· | 0 | 0 | 0 | ··· | 0 | 0.009 | 0 | ··· | 0 | 0.001 | 0.011 |
| GC | 0.001 | 0 | ··· | 0 | 0 | 0 | ··· | 0.001 | 0.005 | 0 | ··· | 0 | 0.003 | 0.011 |
| GG | 0.001 | 0 | ··· | 0 | 0 | 0 | ··· | 0 | 0.003 | 0 | ··· | 0 | 0 | 0.004 |
| GT | 0 | 0 | ··· | 0 | 0 | 0 | ··· | 0 | 0.001 | 0.001 | ··· | 0 | 0.004 | 0.006 |
| TA | 0.010 | 0 | ··· | 0 | 0 | 0 | ··· | 0.004 | 0.006 | 0 | ··· | 0 | 0.010 | 0.035 |
| TC | 0.001 | 0 | ··· | 0 | 0.005 | 0 | ··· | 0.001 | 0.010 | 0 | ··· | 0 | 0.020 | 0.039 |
| TG | 0.001 | 0 | ··· | 0.001 | 0 | 0 | ··· | 0 | 0.003 | 0.002 | ··· | 0 | 0.001 | 0.009 |
| TT | 0 | 0 | ··· | 0 | 0 | 0 | ··· | 0 | 0.001 | 0.003 | ··· | 0 | 0.036 | 0.043 |
| + | 0.233 | 0.002 | ··· | 0.007 | 0.210 | 0.025 | ··· | 0.016 | 0.155 | 0.016 | ··· | 0.002 | 0.249 | 1 |
Eriksson’s revisited: Erik+2
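Given a bipartition $A|B$, consider the transition matrices $M_{A \to B}$ and $M_{B \to A}$ instead of the flattening matrix:
$$M_{12 \to 34}(p) = \begin{pmatrix}
\frac{p_{AAAA}}{p_{AA++}} & \frac{p_{AAAC}}{p_{AA++}} & \cdots & \frac{p_{AATT}}{p_{AA++}} \\
\frac{p_{ACAA}}{p_{AC++}} & \frac{p_{ACAC}}{p_{AC++}} & \cdots & \frac{p_{ACTT}}{p_{AC++}} \\
\frac{p_{AGAA}}{p_{AG++}} & \frac{p_{AGAC}}{p_{AG++}} & \cdots & \frac{p_{AGTT}}{p_{AG++}} \\
\vdots & \vdots & & \vdots
\end{pmatrix}, \qquad M_{34 \to 12}(p) = \big(\text{column sums} = 1\big)$$
These matrices have the same rank as the flattening. We redefine the score as:
Erik+2: $sc(A|B) = \big(d_F(M_{A \to B}, R_4) + d_F(M_{B \to A}, R_4)\big)/2$.
The lower the score, the more likely it is that the bipartition comes from an edge of $T$.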
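The Erik+2 score can be sketched in a few lines on top of the SVD distance. This is an illustration, not the authors' implementation: it assumes every row and column of the flattening has a positive sum (zero marginals in real alignments would need separate handling), and the test distribution is an exact quartet distribution generated from hypothetical random parameters.

```python
import numpy as np

rng = np.random.default_rng(2)


def random_transition_matrix():
    """Random 4 x 4 row-stochastic matrix (hypothetical sampling scheme)."""
    m = np.eye(4) + 0.3 * rng.random((4, 4))
    return m / m.sum(axis=1, keepdims=True)


def frobenius_distance_to_rank4(matrix):
    sigma = np.linalg.svd(matrix, compute_uv=False)
    return float(np.sqrt(np.sum(sigma[4:] ** 2)))


def normalize_rows(matrix):
    """Divide each row by its sum (assumes all row sums are positive)."""
    return matrix / matrix.sum(axis=1, keepdims=True)


def erik2_score(flat):
    """Erik+2 score of a flattening (rows: states on A, columns: states on B)."""
    m_a_to_b = normalize_rows(flat)      # row sums equal 1
    m_b_to_a = normalize_rows(flat.T).T  # column sums equal 1
    return (frobenius_distance_to_rank4(m_a_to_b)
            + frobenius_distance_to_rank4(m_b_to_a)) / 2


# Exact distribution on the quartet tree 12|34 under the general Markov model.
pi = np.full(4, 0.25)
m1, m2, me, m3, m4 = (random_transition_matrix() for _ in range(5))
p = np.einsum("x,xi,xj,xy,yk,yl->ijkl", pi, m1, m2, me, m3, m4)

axes = {"12|34": (0, 1, 2, 3), "13|24": (0, 2, 1, 3), "14|23": (0, 3, 1, 2)}
scores = {split: erik2_score(p.transpose(order).reshape(16, 16))
          for split, order in axes.items()}
best = min(scores, key=scores.get)
print(best, scores)
```

On an exact distribution the score of the true split is zero up to rounding, so the minimum identifies $12|34$; with finite alignments all three scores are positive and only their ordering is informative.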
Assessing performance
Huelsenbeck '95: assesses the performance of different phylogenetic reconstruction methods on the following tree space.
For each $(a, b)$, simulate 100 alignments under topology $12|34$. For each method, represent the percentage of success in a gray scale: white = 0%, black = 100% success.
Performance of Eriksson and Erik+2
Performance against other methods
old invariants
Erik+2 needs more data but... works under the most general model (under i.i.d. assumptions).
Differences between the models:
Continuous-time models: $S_i = \exp(Q_i t_i)$. Parameters: $Q_i$ rate matrix, $t_i$ branch length. They assume local homogeneity: $S_i'(t) = Q_i S_i(t)$. Usually, global homogeneity is also assumed: the same rate matrix $Q$ throughout the tree, with a different $t_i$ for each edge $i$.
(Algebraic) Markov models: the entries of the matrices $S_i$ are the parameters of the model. Therefore they allow for non-homogeneous data, both local and global (different mutation rates at different lineages of the tree and on a single edge).
The results we've seen so far were on globally homogeneous data. Now...
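To make the contrast concrete, the sketch below computes $S = \exp(Qt)$ for a Jukes-Cantor rate matrix and compares it with a stochastic matrix that is a valid parameter of the algebraic model but cannot equal $\exp(Qt)$ for any rate matrix, since $\det \exp(Qt) = \exp(t \cdot \mathrm{tr}\,Q) > 0$. The rate value, the branch length, and the matrix `s_algebraic` are made-up illustrations, and the eigendecomposition shortcut assumes a symmetric $Q$.

```python
import numpy as np


def jukes_cantor_rate_matrix(alpha):
    """Q with off-diagonal entries alpha and rows summing to 0."""
    return alpha * (np.ones((4, 4)) - 4 * np.eye(4))


def transition_matrix(q, t):
    """exp(Q t) via eigendecomposition; valid here because Q is symmetric."""
    eigenvalues, eigenvectors = np.linalg.eigh(q)
    return (eigenvectors * np.exp(eigenvalues * t)) @ eigenvectors.T


alpha, t = 0.1, 0.5  # hypothetical rate and branch length
s_continuous = transition_matrix(jukes_cantor_rate_matrix(alpha), t)

# Closed form for Jukes-Cantor, used as a check.
decay = np.exp(-4 * alpha * t)
s_closed_form = np.full((4, 4), (1 - decay) / 4) + decay * np.eye(4)
print(np.allclose(s_continuous, s_closed_form))

# A row-stochastic matrix with negative determinant: admissible as a
# parameter of the algebraic model, but not of the form exp(Q t).
s_algebraic = np.array([
    [0.1, 0.7, 0.1, 0.1],
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
])
print(np.linalg.det(s_continuous) > 0, np.linalg.det(s_algebraic) < 0)
```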
[Figure: six panels over the tree space, horizontal axis "parameter a" and vertical axis "parameter b", both ranging from 0.0 to 1.4, with 33% and 95% contour lines.]
Erik+2, Length=1000: mean = 0.80343; sd = 0.16955
Erik+2, Length=10000: mean = 0.97102; sd = 0.04373
NJ, Length=1000: mean = 0.79745; sd = 0.18404
NJ, Length=10000: mean = 0.94313; sd = 0.08771
ML, Length=1000: mean = 0.73608; sd = 0.17409
ML, Length=10000: mean = 0.754; sd = 0.16967
Erik+2 is more robust: rates across sites
Erik+2 on 2-mixtures
Kolaczkowski, Thornton. Nature 2004
How about larger trees?
Scores are not normalized, so we cannot compare bipartitions of different size.
Eriksson's method involves comparing different bipartitions.
Also, some penalization should be given: for $n = 6$, requiring a $4^2 \times 4^4$ matrix to have rank $\le 4$ is not the same as requiring a $4^3 \times 4^3$ matrix to have rank $\le 4$.
Indeed, the codimension of $R_{4,\,4^2 \times 4^4}$ is $(4^2 - 4)(4^4 - 4) = 3024$, while the codimension of $R_{4,\,4^3 \times 4^3}$ is $3600$.
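Both numbers follow from the standard formula $(m - r)(n - r)$ for the codimension of the variety of $m \times n$ matrices of rank at most $r$. A small sketch (the function name is hypothetical) that tabulates it for the split sizes available with $n = 6$ leaves:

```python
def codimension_rank_at_most(rows, cols, r=4):
    """Codimension of {rows x cols matrices of rank <= r}: (rows - r)(cols - r)."""
    return (rows - r) * (cols - r)


n = 6
for size_a in range(2, n // 2 + 1):
    size_b = n - size_a
    codim = codimension_rank_at_most(4 ** size_a, 4 ** size_b)
    print(f"{size_a}|{size_b} split: {4 ** size_a} x {4 ** size_b} matrix, codimension {codim}")
# 2|4 split: 16 x 256 matrix, codimension 3024
# 3|3 split: 64 x 64 matrix, codimension 3600
```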
Normalizing the score
But in the meanwhile...
Weighted quartet-based methods
Weight each quartet tree $T_1 = 12|34$, $T_2 = 13|24$, $T_3 = 14|23$ according to the reconstruction method (weight ≡ reliability):
Erik+2: $w(T_i) = sc(T_i)^{-1} / \big(sc(T_1)^{-1} + sc(T_2)^{-1} + sc(T_3)^{-1}\big)$
ML: $w(T_i) = \log L(T_i)^{-1} / \big(\log L(T_1)^{-1} + \log L(T_2)^{-1} + \log L(T_3)^{-1}\big)$
Barycentric coordinates: $(w(T_1), w(T_2), w(T_3))$
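The Erik+2 weights are inverse scores normalized to sum to 1, which is what makes them usable as barycentric coordinates. A minimal sketch, with made-up score values and a hypothetical function name:

```python
def quartet_weights(scores):
    """w(T_i) = sc(T_i)^-1 / (sc(T_1)^-1 + sc(T_2)^-1 + sc(T_3)^-1)."""
    inverses = [1.0 / s for s in scores]  # assumes strictly positive scores
    total = sum(inverses)
    return tuple(value / total for value in inverses)


# Hypothetical Erik+2 scores for T1 = 12|34, T2 = 13|24, T3 = 14|23.
example_scores = (0.01, 0.05, 0.05)
weights = quartet_weights(example_scores)
print(weights)
```

The topology with the lowest score receives the largest weight, and three equal scores give the centre of the triangle, $(1/3, 1/3, 1/3)$.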
Barycentric plots
(w(T1),w(T2),w(T3))
Quartet-based methods (Weight optimization)
Where do we put leaf 5?
Gascuel-Ranwez 2001
Topology evaluated; b: internal branch length, leaf edges proportional to it
| Method | Quartets | b = 0.005 | b = 0.015 | b = 0.05 | b = 0.1 | Average |
|---|---|---|---|---|---|---|
| QP | Erik+2 | 0.05 (100) | 0.07 (100) | 1.69 (100) | 4.57 (100) | 1.6 |
| QP | ML | 9 (100) | 9.08 (93) | 9.65 (93) | - (0) | - |
| QP | NJ | 9.7 (100) | 9.65 (100) | 9.91 (100) | 9.91 (100) | 9.79 |
| WIL | Erik+2 | 0.02 (100) | 0.07 (100) | 0.33 (100) | 1.52 (100) | 0.48 |
| WIL | ML | 9.97 (100) | 9.28 (93) | 9.27 (93) | - (0) | - |
| WIL | NJ | 7.66 (100) | 7.41 (100) | 8.25 (100) | 9.61 (100) | 8.23 |
| WO | Erik+2 | 0.02 (100) | 0.06 (100) | 0.2 (100) | 1.66 (100) | 0.48 |
| WO | ML | 11.03 (100) | 11.18 (93) | 11.83 (93) | - (0) | - |
| WO | NJ | 9.46 (100) | 9.22 (100) | 9.2 (100) | 9.73 (100) | 9.4 |
| Global NJ | | 0 | 0 | 1.54 | 10.19 | |

Table: Average Robinson-Foulds distance to the correct tree, alignment length 5000, GMM data.
| Method | Quartets | b = 0.005 | b = 0.015 | b = 0.05 | b = 0.1 | Average |
|---|---|---|---|---|---|---|
| QP | Erik+2 | 0 (100) | 0 (100) | 0.44 (100) | 2.92 (100) | 0.84 |
| QP | ML | 9 (100) | 9.11 (96) | 9.6 (91) | 9 (1) | 9.18 |
| QP | NJ | 9.69 (100) | 9.97 (100) | 9.92 (100) | 9.95 (100) | 9.88 |
| WIL | Erik+2 | 0 (100) | 0.02 (100) | 0.09 (100) | 0.44 (100) | 0.14 |
| WIL | ML | 9.33 (100) | 9 (96) | 8.98 (91) | 9 (1) | 9.08 |
| WIL | NJ | 7.46 (100) | 7.44 (100) | 8.03 (100) | 9.48 (100) | 8.1 |
| WO | Erik+2 | 0 (100) | 0 (100) | 0.08 (100) | 0.51 (100) | 0.15 |
| WO | ML | 10.97 (100) | 11.01 (96) | 11.69 (91) | 13 (1) | 11.67 |
| WO | NJ | 9.41 (100) | 9.18 (100) | 9.33 (100) | 9.62 (100) | 9.38 |
| Global NJ | | 0 | 0 | 1.16 | 9.53 | |

Table: Average Robinson-Foulds distance to the correct tree, alignment length 10000, GMM data.
Open problems
Algorithm for larger trees
Incorporate Allman-Rhodes-Taylor semialgebraic conditions
Look for closest rank ≤ 4 nonnegative matrix
Statistical analysis of the method
Thanks!