Post on 23-Feb-2016
Distance matrix methods calculate a measure of distance between each pair of species, then find a tree that predicts the observed set of distances.
Branch lengths and times
In distance matrix methods, branch lengths reflect the expected amount of evolution along the different branches of the tree:
branch length = $r_i \cdot t_i$
where $r_i$ is the rate of evolution and $t_i$ is the elapsed time on branch i.
The least squares method
Observed matrix

     A     B     C     D     E
A    0     Dab   Dac   Dad   Dae
B    Dab   0     Dbc   Dbd   Dbe
C    Dac   Dbc   0     Dcd   Dce
D    Dad   Dbd   Dcd   0     Dde
E    Dae   Dbe   Dce   Dde   0
Minimise the difference between the observed matrix of distances and the matrix of distances predicted by the tree.

Expected matrix

     A     B     C     D     E
A    0     dab   dac   dad   dae
B    dab   0     dbc   dbd   dbe
C    dac   dbc   0     dcd   dce
D    dad   dbd   dcd   0     dde
E    dae   dbe   dce   dde   0

[Figure: example tree with tips a, b, c, d and e and branch lengths 0.08, 0.05, 0.10, 0.07, 0.06, 0.05 and 0.03.]

Each expected distance is the sum of the branch lengths on the path between the two species. For example, dab = 0.08 + 0.05 + 0.10 = 0.23.
Expected matrix, with every cell filled in

     A      B      C      D      E
A    0      0.23   0.16   0.20   0.17
B    0.23   0      0.23   0.17   0.24
C    0.16   0.23   0      0.20   0.11
D    0.20   0.17   0.20   0      0.21
E    0.17   0.24   0.11   0.21   0
$$Q = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \, (D_{ij} - d_{ij})^2$$

where $D_{ij}$ is the observed distance between species i and j, $d_{ij}$ is the expected distance between species i and j, and $w_{ij}$ is a weight ($1$, $1/D_{ij}^2$ or $1/D_{ij}$).
Q is a measure of the discrepancy between the observed and the expected matrix. The distances can be weighted or not.
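As a rough illustration of the criterion, the sketch below evaluates Q for a pair of matrices. The function name `least_squares_q`, the `weight` callback and the small 3 x 3 matrices are hypothetical and only serve the example; the sum runs over all i ≠ j, as in the double sum on the slide, so each pair is counted twice.

```python
def least_squares_q(observed, expected, weight=lambda d: 1.0):
    """Q = sum over i != j of w_ij * (D_ij - d_ij)^2."""
    n = len(observed)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:  # diagonal terms are zero, and 1/D is undefined there
                diff = observed[i][j] - expected[i][j]
                q += weight(observed[i][j]) * diff ** 2
    return q


# Hypothetical example data (not taken from the slides).
OBSERVED = [[0, 0.3, 0.5], [0.3, 0, 0.4], [0.5, 0.4, 0]]
EXPECTED = [[0, 0.2, 0.5], [0.2, 0, 0.6], [0.5, 0.6, 0]]

Q_UNWEIGHTED = least_squares_q(OBSERVED, EXPECTED)                        # w = 1
Q_WEIGHTED = least_squares_q(OBSERVED, EXPECTED, lambda d: 1.0 / d ** 2)  # w = 1/D^2
print(Q_UNWEIGHTED, Q_WEIGHTED)
```

Passing the weight as a function keeps the three weighting schemes named on the slide (1, 1/D², 1/D) interchangeable without touching the summation.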
[Figure: the example tree with tips a, b, c, d and e; its seven branches are labelled v1 to v7.]

$x_{ij,k}$ is a handy variable:
$x_{ij,k} = 1$ if branch k is on the path between species i and j
$x_{ij,k} = 0$ if branch k is not on the path between species i and j

For example, for the path between a and b: $x_{ab,1} = 1$, $x_{ab,7} = 1$ and $x_{ab,3} = 0$.
Rewrite $d_{ij}$, the expected values, as a sum over the branches of the tree:

$$d_{ij} = \sum_{k} x_{ij,k} \, v_k$$
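The indicator variable can be sketched in a few lines of Python. The topology and the assignment of the seven branch lengths to v1..v7 below are assumptions: they are inferred from the worked equations for this tree rather than stated explicitly on the slides, and the names `PATH_TO_CENTRE`, `V`, `x` and `expected_distance` are hypothetical.

```python
from itertools import combinations

# Assumed topology: for each tip, the branches on its path to the central node.
PATH_TO_CENTRE = {
    "A": {1},
    "B": {2, 7},
    "C": {3, 6},
    "D": {4, 7},
    "E": {5, 6},
}
# Assumed assignment of the seven branch lengths to v1..v7.
V = {1: 0.08, 2: 0.10, 3: 0.05, 4: 0.07, 5: 0.06, 6: 0.03, 7: 0.05}


def x(i, j, k):
    """x_ij,k: 1 if branch k lies on the path between tips i and j, else 0."""
    # Branches shared by both tip-to-centre paths cancel (symmetric difference).
    return 1 if k in PATH_TO_CENTRE[i] ^ PATH_TO_CENTRE[j] else 0


def expected_distance(i, j):
    """d_ij = sum over k of x_ij,k * v_k."""
    return sum(x(i, j, k) * V[k] for k in V)


for i, j in combinations("ABCDE", 2):
    print(i, j, round(expected_distance(i, j), 2))
```

Under these assumptions the a-b path uses branches 1, 2 and 7, which gives 0.08 + 0.10 + 0.05 = 0.23.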
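Substituting the expected values into Q gives

$$Q = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \left( D_{ij} - \sum_{k} x_{ij,k} \, v_k \right)^2$$

Differentiate Q with respect to each branch length $v_k$ and equate the derivative to zero:

$$\frac{\partial Q}{\partial v_k} = -2 \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \, x_{ij,k} \left( D_{ij} - \sum_{m} x_{ij,m} \, v_m \right) = 0$$

For the unweighted case ($w_{ij} = 1$):

$$\frac{\partial Q}{\partial v_k} = -2 \sum_{i=1}^{n} \sum_{j=1}^{n} x_{ij,k} \left( D_{ij} - \sum_{m} x_{ij,m} \, v_m \right) = 0$$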
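For branch $v_1$:

$$\frac{\partial Q}{\partial v_1} = -2 \sum_{i=1}^{n} \sum_{j : j \neq i} x_{ij,1} \left( D_{ij} - \sum_{k} x_{ij,k} \, v_k \right) = 0$$

Written in full, listing each pair once and dropping the constant factor:

$$\begin{aligned}
& x_{AB,1} \Big( D_{AB} - \sum_k x_{AB,k} v_k \Big) + x_{AC,1} \Big( D_{AC} - \sum_k x_{AC,k} v_k \Big) + x_{AD,1} \Big( D_{AD} - \sum_k x_{AD,k} v_k \Big) + x_{AE,1} \Big( D_{AE} - \sum_k x_{AE,k} v_k \Big) && (i = 1;\ j = 2, 3, 4, 5) \\
& + x_{BC,1} \Big( D_{BC} - \sum_k x_{BC,k} v_k \Big) + x_{BD,1} \Big( D_{BD} - \sum_k x_{BD,k} v_k \Big) + x_{BE,1} \Big( D_{BE} - \sum_k x_{BE,k} v_k \Big) && (i = 2;\ j = 3, 4, 5) \\
& + x_{CD,1} \Big( D_{CD} - \sum_k x_{CD,k} v_k \Big) + x_{CE,1} \Big( D_{CE} - \sum_k x_{CE,k} v_k \Big) && (i = 3;\ j = 4, 5) \\
& + x_{DE,1} \Big( D_{DE} - \sum_k x_{DE,k} v_k \Big) = 0 && (i = 4;\ j = 5)
\end{aligned}$$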
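Branch v1 lies only on the paths that start at A, so the values of $x_{ij,1}$ are:

x_ij,1   A   B   C   D   E
A        -   1   1   1   1
B            -   0   0   0
C                -   0   0
D                    -   0
E                        -

Many terms are zero:

$$\begin{aligned}
& 1 \cdot \Big( D_{AB} - \sum_k x_{AB,k} v_k \Big) + 1 \cdot \Big( D_{AC} - \sum_k x_{AC,k} v_k \Big) + 1 \cdot \Big( D_{AD} - \sum_k x_{AD,k} v_k \Big) + 1 \cdot \Big( D_{AE} - \sum_k x_{AE,k} v_k \Big) \\
& + 0 \cdot \Big( D_{BC} - \sum_k x_{BC,k} v_k \Big) + 0 \cdot \Big( D_{BD} - \sum_k x_{BD,k} v_k \Big) + 0 \cdot \Big( D_{BE} - \sum_k x_{BE,k} v_k \Big) \\
& + 0 \cdot \Big( D_{CD} - \sum_k x_{CD,k} v_k \Big) + 0 \cdot \Big( D_{CE} - \sum_k x_{CE,k} v_k \Big) \\
& + 0 \cdot \Big( D_{DE} - \sum_k x_{DE,k} v_k \Big) = 0
\end{aligned}$$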
Only the non-zero terms remain:

$$\Big( D_{AB} - \sum_k x_{AB,k} v_k \Big) + \Big( D_{AC} - \sum_k x_{AC,k} v_k \Big) + \Big( D_{AD} - \sum_k x_{AD,k} v_k \Big) + \Big( D_{AE} - \sum_k x_{AE,k} v_k \Big) = 0$$

Each sum is then expanded along the corresponding path in the tree. For the pairs A-B and A-C:

$$\sum_k x_{AB,k} v_k = 1 \cdot v_1 + 1 \cdot v_2 + 0 \cdot v_3 + 0 \cdot v_4 + 0 \cdot v_5 + 0 \cdot v_6 + 1 \cdot v_7$$

$$\sum_k x_{AC,k} v_k = 1 \cdot v_1 + 0 \cdot v_2 + 1 \cdot v_3 + 0 \cdot v_4 + 0 \cdot v_5 + 1 \cdot v_6 + 0 \cdot v_7$$
Expanding all four sums and collecting terms:

$$D_{AB} + D_{AC} + D_{AD} + D_{AE} - 4v_1 - v_2 - v_3 - v_4 - v_5 - 2v_6 - 2v_7 = 0$$

Rearranging gives the equation for v1:

$$D_{AB} + D_{AC} + D_{AD} + D_{AE} = 4v_1 + v_2 + v_3 + v_4 + v_5 + 2v_6 + 2v_7$$
Mutatis mutandis for v2 and all other branches, which gives one equation per branch:

$$\begin{aligned}
D_{AB} + D_{AC} + D_{AD} + D_{AE} &= 4v_1 + v_2 + v_3 + v_4 + v_5 + 2v_6 + 2v_7 && (v_1) \\
D_{AB} + D_{BC} + D_{BD} + D_{BE} &= v_1 + 4v_2 + v_3 + v_4 + v_5 + 2v_6 + 3v_7 && (v_2) \\
D_{AC} + D_{BC} + D_{CD} + D_{CE} &= v_1 + v_2 + 4v_3 + v_4 + v_5 + 3v_6 + 2v_7 && (v_3) \\
D_{AD} + D_{BD} + D_{CD} + D_{DE} &= v_1 + v_2 + v_3 + 4v_4 + v_5 + 2v_6 + 3v_7 && (v_4) \\
D_{AE} + D_{BE} + D_{CE} + D_{DE} &= v_1 + v_2 + v_3 + v_4 + 4v_5 + 3v_6 + 2v_7 && (v_5) \\
D_{AC} + D_{AE} + D_{BC} + D_{BE} + D_{CD} + D_{DE} &= 2v_1 + 2v_2 + 3v_3 + 2v_4 + 3v_5 + 6v_6 + 4v_7 && (v_6) \\
D_{AB} + D_{AD} + D_{BC} + D_{CD} + D_{BE} + D_{DE} &= 2v_1 + 3v_2 + 2v_3 + 3v_4 + 2v_5 + 4v_6 + 6v_7 && (v_7)
\end{aligned}$$
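The seven equations can also be generated mechanically. The sketch below builds the indicator matrix X (one row per species pair, one column per branch) and forms the normal equations XᵀX v = XᵀD, whose coefficient rows should match the right-hand sides above. The topology in `PATH_TO_CENTRE` and the branch lengths in `TRUE_V` are assumptions inferred from the worked example, and the plain Gauss-Jordan solver is just one convenient way to solve the system, not necessarily the one the author had in mind.

```python
from itertools import combinations

TIPS = "ABCDE"
# Assumed topology: for each tip, the branches on its path to the central node.
PATH_TO_CENTRE = {"A": {1}, "B": {2, 7}, "C": {3, 6}, "D": {4, 7}, "E": {5, 6}}
PAIRS = list(combinations(TIPS, 2))

# Design matrix X: one row per species pair, one column per branch v1..v7.
X = [[1 if k in PATH_TO_CENTRE[i] ^ PATH_TO_CENTRE[j] else 0 for k in range(1, 8)]
     for i, j in PAIRS]


def normal_equations(distances):
    """Return (X^T X, X^T D) for a dict mapping each pair to its observed distance."""
    d = [distances[p] for p in PAIRS]
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(7)] for a in range(7)]
    xtd = [sum(row[a] * dk for row, dk in zip(X, d)) for a in range(7)]
    return xtx, xtd


def solve(a, b):
    """Solve a v = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [p - f * q for p, q in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Assumed branch lengths v1..v7, used here only to generate test distances.
TRUE_V = [0.08, 0.10, 0.05, 0.07, 0.06, 0.03, 0.05]
DISTANCES = {p: sum(c * v for c, v in zip(row, TRUE_V)) for p, row in zip(PAIRS, X)}

XTX, XTD = normal_equations(DISTANCES)
print(XTX[0])  # coefficients of the equation for v1
print([round(v, 2) for v in solve(XTX, XTD)])
```

Because the test distances are generated from the tree itself, solving the system should return the branch lengths that went in; with real observed distances the same code returns the least squares estimates for this fixed topology.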
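The least squares method: solving linear equations with matrices

$$\begin{aligned} x + 2y &= 4 \\ 3x - 5y &= 1 \end{aligned}$$

$$A X = B, \qquad A = \begin{pmatrix} 1 & 2 \\ 3 & -5 \end{pmatrix}, \quad X = \begin{pmatrix} x \\ y \end{pmatrix}, \quad B = \begin{pmatrix} 4 \\ 1 \end{pmatrix}$$

$$A^{-1} = \frac{1}{|A|} \begin{pmatrix} -5 & -2 \\ -3 & 1 \end{pmatrix} = \frac{1}{1 \cdot (-5) - 3 \cdot 2} \begin{pmatrix} -5 & -2 \\ -3 & 1 \end{pmatrix} = -\frac{1}{11} \begin{pmatrix} -5 & -2 \\ -3 & 1 \end{pmatrix}$$

$$X = A^{-1} B = -\frac{1}{11} \begin{pmatrix} -5 & -2 \\ -3 & 1 \end{pmatrix} \begin{pmatrix} 4 \\ 1 \end{pmatrix} = -\frac{1}{11} \begin{pmatrix} -22 \\ -11 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$$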
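The same 2 x 2 calculation can be checked with a short sketch. The helper name `solve_2x2` is hypothetical, and exact fractions are used so that the factor 1/|A| = -1/11 is not rounded.

```python
from fractions import Fraction


def solve_2x2(a, b):
    """Solve A X = B for a 2 x 2 matrix A via X = A^-1 B."""
    (a11, a12), (a21, a22) = a
    det = Fraction(a11 * a22 - a21 * a12)  # |A|
    if det == 0:
        raise ValueError("matrix is singular, no unique solution")
    # Inverse of a 2 x 2 matrix: swap the diagonal, negate the off-diagonal, divide by |A|.
    inv = [[a22 / det, -a12 / det], [-a21 / det, a11 / det]]
    return [inv[0][0] * b[0] + inv[0][1] * b[1],
            inv[1][0] * b[0] + inv[1][1] * b[1]]


SOLUTION = solve_2x2([[1, 2], [3, -5]], [4, 1])
print(SOLUTION)
```

For the seven branch-length equations the idea is the same, only with a 7 x 7 matrix, where elimination is more practical than an explicit inverse.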
Clustering algorithms
Clustering methods have no optimality criterion; they apply an algorithm to come up with a tree.
Clustering algorithms: UPGMA
UPGMA stands for Unweighted Pair Group Method with Arithmetic mean. It assumes that evolutionary rates are the same in all lineages, so the tree it produces is ultrametric.
            dog   bear  raccoon  weasel  seal  sea lion  cat   monkey
dog         0     32    48       51      50    48        98    148
bear        32    0     26       34      29    33        84    136
raccoon     48    26    0        42      44    44        92    152
weasel      51    34    42       0       44    38        86    142
seal        50    29    44       44      0     24        89    142
sea lion    48    33    44       38      24    0         90    142
cat         98    84    92       86      89    90        0     148
monkey      148   136   152      142     142   142       148   0

1. Find species i and j with the smallest distance.
2. Calculate branch length between i and j.

Seal and sea lion have the smallest distance (24), so each of the two branches is 24/2 = 12.
[Figure: seal and sea lion joined, branch length 12.]
3. Lump i and j into a new group.
4. Compute the distance between the new group and all other groups (weigh for the number of species in each group).

Seal and sea lion are lumped into the group SS. The distance from SS to dog is (50 + 48)/2 = 49, the distance from SS to bear is (29 + 33)/2 = 31, and so on:

            dog   bear  raccoon  weasel  SS    cat   monkey
dog         0     32    48       51      49    98    148
bear        32    0     26       34      31    84    136
raccoon     48    26    0        42      44    92    152
weasel      51    34    42       0       41    86    142
SS          49    31    44       41      0     89.5  142
cat         98    84    92       86      89.5  0     148
monkey      148   136   152      142     142   148   0
The four steps are then repeated on the reduced matrix.

Round 2: bear and raccoon have the smallest distance (26), so each of the two branches is 26/2 = 13. They are lumped into the group BR.
[Figure: seal and sea lion joined, branch length 12; raccoon and bear joined, branch length 13.]

          dog   BR    weasel  SS    cat   monkey
dog       0     40    51      49    98    148
BR        40    0     38      37.5  88    144
weasel    51    38    0       41    86    142
SS        49    37.5  41      0     89.5  142
cat       98    88    86      89.5  0     148
monkey    148   144   142     142   148   0
Round 3: BR and SS have the smallest distance (37.5), so the new node lies at 37.5/2 = 18.75. The branch from this node down to SS is 18.75 - 12 = 6.75 and the branch down to BR is 18.75 - 13 = 5.75. The two groups are lumped into BRSS.
[Figure: the SS and BR clusters joined at height 18.75, with internal branches 6.75 and 5.75.]

          dog   BRSS   weasel  cat    monkey
dog       0     44.5   51      98     148
BRSS      44.5  0      39.5    88.75  143
weasel    51    39.5   0       86     142
cat       98    88.75  86      0      148
monkey    148   143    142     148    0
Round 4: BRSS and weasel have the smallest distance (39.5), so the weasel branch is 39.5/2 = 19.75. They are lumped into BRSSW.
[Figure: weasel added to the tree, branch length 19.75.]

The distances to the new group are weighted by group size: there are 4 species in BRSS and 1 species in weasel. For dog:
(4*44.5 + 1*51)/5 = 45.8

          dog   BRSSW  cat   monkey
dog       0     45.8   98    148
BRSSW     45.8  0      88.2  142.8
cat       98    88.2   0     148
monkey    148   142.8  148   0
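The size-weighted average in step 4 is small enough to write as a one-line function. The name `merged_distance` and its parameters are hypothetical; the numbers are those of the BRSS + weasel merge.

```python
def merged_distance(d_a, size_a, d_b, size_b):
    """Distance from a third group to the union of groups a and b,
    weighting each old distance by the number of species in its group."""
    return (size_a * d_a + size_b * d_b) / (size_a + size_b)


# BRSS (4 species) and weasel (1 species) are lumped into BRSSW.
TO_DOG = merged_distance(44.5, 4, 51, 1)
TO_CAT = merged_distance(88.75, 4, 86, 1)
TO_MONKEY = merged_distance(143, 4, 142, 1)
print(TO_DOG, TO_CAT, TO_MONKEY)
```

Weighting by group size is what makes the result equal to the plain arithmetic mean over all species pairs between the two groups.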
Round 5: dog and BRSSW have the smallest distance (45.8), so the dog branch is 45.8/2 = 22.9. They are lumped into BRSSWD.
[Figure: dog added to the tree, branch length 22.9.]

There are 5 species in BRSSW and 1 species in dog. For cat:
(5*88.2 + 1*98)/6 = 89.833

          BRSSWD   cat     monkey
BRSSWD    0        89.833  143.667
cat       89.833   0       148
monkey    143.667  148     0
Round 6: cat and BRSSWD have the smallest distance (89.833), so the cat branch is 89.833/2 = 44.9167 and the branch from the new node down to BRSSWD is 44.9167 - 22.9 = 22.0167. They are lumped into BRSSWDC.
[Figure: cat added to the tree, branch length 44.9167; internal branch 22.0167.]

There are 6 species in BRSSWD and 1 species in cat. For monkey:
(6*143.667 + 1*148)/7 = 144.2857

           BRSSWDC   monkey
BRSSWDC    0         144.2857
monkey     144.2857  0
Round 7: only BRSSWDC and monkey are left. The monkey branch is 144.2857/2 = 72.1429 and the branch from the root down to BRSSWDC is 72.1429 - 44.9167 = 27.2262.
[Figure: the complete UPGMA tree. Tip branch lengths: seal and sea lion 12, raccoon and bear 13, weasel 19.75, dog 22.9, cat 44.9167, monkey 72.1429.]
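The whole UPGMA run can be reproduced with a minimal sketch of the four steps. This is an illustration written for this example rather than a reference implementation: the function name `upgma`, the tuple-based cluster representation and the variable names are all hypothetical.

```python
from itertools import combinations

NAMES = ["dog", "bear", "raccoon", "weasel", "seal", "sea lion", "cat", "monkey"]
ROWS = [
    [0, 32, 48, 51, 50, 48, 98, 148],
    [32, 0, 26, 34, 29, 33, 84, 136],
    [48, 26, 0, 42, 44, 44, 92, 152],
    [51, 34, 42, 0, 44, 38, 86, 142],
    [50, 29, 44, 44, 0, 24, 89, 142],
    [48, 33, 44, 38, 24, 0, 90, 142],
    [98, 84, 92, 86, 89, 90, 0, 148],
    [148, 136, 152, 142, 142, 142, 148, 0],
]


def upgma(names, rows):
    """Return the merges as (group_a, group_b, node_height) in the order they happen."""
    # A group is a tuple of species names; dist maps an unordered pair of groups to a distance.
    groups = [(name,) for name in names]
    dist = {frozenset((groups[i], groups[j])): rows[i][j]
            for i, j in combinations(range(len(names)), 2)}
    merges = []
    while len(groups) > 1:
        # Step 1: find the two groups with the smallest distance.
        a, b = min(combinations(groups, 2), key=lambda pair: dist[frozenset(pair)])
        # Step 2: the new node sits at half that distance.
        merges.append((a, b, dist[frozenset((a, b))] / 2))
        # Step 3: lump a and b into a new group.
        new = a + b
        groups = [g for g in groups if g not in (a, b)]
        # Step 4: size-weighted average distance to every remaining group.
        for g in groups:
            dist[frozenset((new, g))] = (
                len(a) * dist[frozenset((a, g))] + len(b) * dist[frozenset((b, g))]
            ) / len(new)
        groups.append(new)
    return merges


MERGES = upgma(NAMES, ROWS)
for a, b, height in MERGES:
    print("+".join(a), "|", "+".join(b), "|", round(height, 4))
```

The node heights printed by the loop correspond to the tip branch lengths in the worked example; the length of an internal branch is the difference between the heights of the two nodes it connects.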
Clustering algorithms: Neighbour-joining
1. Calculate $S_x = \left( \sum_{i} D_{xi} \right) / (n - 2)$ for every species x.

Using the same eight-species distance matrix as in the UPGMA example (n = 8):

species     S
dog         79.2
bear        62.3
raccoon     74.7
weasel      72.8
seal        70.3
sea lion    69.8
cat         114.5
monkey      168.3
2. Calculate $M_{ij} = D_{ij} - S_i - S_j$ and select the pair with the smallest $M_{ij}$.

For example, for dog and bear: 32 - 79.2 - 62.3 = -109.5.

            dog      bear     raccoon  weasel   seal     sea lion  cat      monkey
dog         -        -109.50  -105.83  -101.00  -99.50   -101.00   -95.67   -99.50
bear        -109.50  -        -111.00  -101.17  -103.67  -99.17    -92.83   -94.67
raccoon     -105.83  -111.00  -        -105.50  -101.00  -100.50   -97.17   -91.00
weasel      -101.00  -101.17  -105.50  -        -99.17   -104.67   -101.33  -99.17
seal        -99.50   -103.67  -101.00  -99.17   -        -116.17   -95.83   -96.67
sea lion    -101.00  -99.17   -100.50  -104.67  -116.17  -         -94.33   -96.17
cat         -95.67   -92.83   -97.17   -101.33  -95.83   -94.33    -        -134.83
monkey      -99.50   -94.67   -91.00   -99.17   -96.67   -96.17    -134.83  -

The smallest value is -134.83, for cat and monkey.
3. Create a node that joins this pair and calculate the branch lengths as $D_{ij}/2 + (S_i - S_j)/2$.

Cat and monkey are joined at a new node cm:
branch length cat-cm = 148/2 + (114.5 - 168.33)/2 = 47.08
branch length monkey-cm = 148/2 + (168.33 - 114.5)/2 = 100.92
4. Join the two species and arrange all other taxa in the form of a star.
[Figure: star tree in which cat (branch length 47.08) and monkey (branch length 100.92) are joined at the new node cm; dog, bear, raccoon, weasel, seal and sea lion still radiate from the centre.]
5. Create a new matrix. Calculate the distance between the new node (ij) and every other taxon x as $D_{(ij),x} = (D_{ix} + D_{jx} - D_{ij})/2$.

For example, for cm and dog: (98 + 148 - 148)/2 = 49.

            dog   bear  raccoon  weasel  seal  sea lion  cm
dog         0     32    48       51      50    48        49
bear        32    0     26       34      29    33        36
raccoon     48    26    0        42      44    44        48
weasel      51    34    42       0       44    38        40
seal        50    29    44       44      0     24        41.5
sea lion    48    33    44       38      24    0         42
cm          49    36    48       40      41.5  42        0
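The five steps of one neighbour-joining round can be sketched as a single function that takes a matrix and returns the joined pair, the two branch lengths and the reduced matrix. The function name `nj_step`, the `"+"` naming of new nodes and the decision to stop after four rounds are choices made for this illustration only.

```python
NAMES = ["dog", "bear", "raccoon", "weasel", "seal", "sea lion", "cat", "monkey"]
ROWS = [
    [0, 32, 48, 51, 50, 48, 98, 148],
    [32, 0, 26, 34, 29, 33, 84, 136],
    [48, 26, 0, 42, 44, 44, 92, 152],
    [51, 34, 42, 0, 44, 38, 86, 142],
    [50, 29, 44, 44, 0, 24, 89, 142],
    [48, 33, 44, 38, 24, 0, 90, 142],
    [98, 84, 92, 86, 89, 90, 0, 148],
    [148, 136, 152, 142, 142, 142, 148, 0],
]


def nj_step(names, d):
    """Run one neighbour-joining round on distance matrix d."""
    n = len(names)
    # Step 1: S_x = (sum of distances from x) / (n - 2).
    s = [sum(row) / (n - 2) for row in d]
    # Step 2: pick the pair with the smallest M_ij = D_ij - S_i - S_j.
    i, j = min(((a, b) for a in range(n) for b in range(a + 1, n)),
               key=lambda p: d[p[0]][p[1]] - s[p[0]] - s[p[1]])
    # Step 3: branch lengths from i and j to the new node.
    len_i = d[i][j] / 2 + (s[i] - s[j]) / 2
    len_j = d[i][j] - len_i
    # Steps 4 and 5: replace i and j by the new node and rebuild the matrix.
    keep = [k for k in range(n) if k not in (i, j)]
    new_row = [(d[i][k] + d[j][k] - d[i][j]) / 2 for k in keep]
    new_names = [names[k] for k in keep] + [names[i] + "+" + names[j]]
    new_d = [[d[a][b] for b in keep] + [new_row[idx]] for idx, a in enumerate(keep)]
    new_d.append(new_row + [0])
    return (names[i], names[j], len_i, len_j), new_names, new_d


JOINS = []
names, d = NAMES, ROWS
for _ in range(4):  # the first four rounds of the example
    join, names, d = nj_step(names, d)
    JOINS.append(join)
    print(join[0], "+", join[1], round(join[2], 2), round(join[3], 2))
```

Unlike UPGMA, the two branch lengths of a joined pair are allowed to differ, which is why cat and monkey end up with very different branches to their shared node.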
Round 2 repeats the same steps on the reduced matrix (n = 7).

1. Calculate $S_x = \left( \sum_{i} D_{xi} \right) / (n - 2)$:

species     S
dog         55.6
bear        38
raccoon     50.4
weasel      49.8
seal        46.5
sea lion    45.8
cm          51.3

2. Calculate $M_{ij} = D_{ij} - S_i - S_j$ and select the pair with the smallest $M_{ij}$:

            dog     bear    raccoon  weasel  seal    sea lion  cm
dog         -       -61.60  -58.00   -54.40  -52.10  -53.40    -57.90
bear        -61.60  -       -62.40   -53.80  -55.50  -50.80    -53.30
raccoon     -58.00  -62.40  -        -58.20  -52.90  -52.20    -53.70
weasel      -54.40  -53.80  -58.20   -       -52.30  -57.60    -61.10
seal        -52.10  -55.50  -52.90   -52.30  -       -68.30    -56.30
sea lion    -53.40  -50.80  -52.20   -57.60  -68.30  -         -55.10
cm          -57.90  -53.30  -53.70   -61.10  -56.30  -55.10    -

The smallest value is -68.30, for seal and sea lion.
3. Create a node that joins this pair and calculate the branch lengths as $D_{ij}/2 + (S_i - S_j)/2$. Seal and sea lion are joined at a new node ss:
branch length seal-ss = 24/2 + (46.5 - 45.8)/2 = 12.35
branch length sea lion-ss = 24/2 + (45.8 - 46.5)/2 = 11.65

4. Join the two species and arrange all other taxa in the form of a star.
[Figure: star tree with the nodes cm (cat + monkey) and ss (seal + sea lion); dog, bear, raccoon and weasel still radiate from the centre.]

5. Create a new matrix. Calculate the distance between the new node (ij) and every other taxon x as $D_{(ij),x} = (D_{ix} + D_{jx} - D_{ij})/2$:

           dog   bear  raccoon  weasel  ss     cm
dog        0     32    48       51      37     49
bear       32    0     26       34      19     36
raccoon    48    26    0        42      32     48
weasel     51    34    42       0       29     40
ss         37    19    32       29      0      29.75
cm         49    36    48       40      29.75  0
The remaining rounds repeat the same five steps:
Round 3: bear + raccoon (node br)
Round 4: (bear + raccoon) + dog (node brd)
Round 5: (cat + monkey) + weasel (node cmw)
Round 6: (seal + sea lion) + (bear + raccoon + dog) (node bdrss)
[Figure: with each round the star tree gains one internal node, until the tree is fully resolved.]
[Figure: the neighbour-joining tree and the UPGMA tree for the same eight species (cat, sea lion, seal, monkey, weasel, bear, raccoon, dog), shown side by side.]