Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering...

46
Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department of Computer Science and Engineering, University of Connecticut, USA July 9, 2014

Transcript of Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering...

Page 1: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Deciphering Reticulate Evolution UsingPhylogenetic Reconciliation

Mukul S. Bansal

Department of Computer Science and Engineering,University of Connecticut,

USA

July 9, 2014

Page 2: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Gene Family Evolution

ProblemHow did any given gene family evolve?

I Gene families evolve inside species trees.I Affected by evolutionary events such as gene duplication,

horizontal gene transfer, and gene loss.

A B C D A B C D

y

x

y

z

a1 c1 b1 d1a2 c2

Species Tree S

x

z

Evolution of Gene Family G

Duplication

Loss

Transfer

Gene Tree on G

Page 3: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Definition: DTL Reconciliation

A B C D

y

Species Tree S

x

z

a1 c1 b1 d1a2 c2

Gene Tree on G

a1 c1 b1 d1a2 c2

G with evolutionary history

x,Σ

y,Δ

y,Σ z,Σ D,Θ

A B C D

x

y

z

Duplication

Loss

Transfer

Page 4: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Definition: DTL Reconciliation

A B C D

y

Species Tree S

x

z

a1 c1 b1 d1a2 c2

Gene Tree on G

a1 c1 b1 d1a2 c2

G with evolutionary history

x,Σ

y,Δ

y,Σ z,Σ D,Θ

Input: A gene tree for that gene family, and a trusted rootedspecies tree.Output: An evolutionary history of that gene family showinghorizontal gene transfers, gene duplications, losses, and speciationevents.

Page 5: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

DTL Reconciliation Problem Formulation

Typical formulation:

I Costs are assigned to duplications, transfers, and losses.

I Goal: Find the reconciliation that minimizes the total cost.

I Easy to compute cost for a given reconciliation.

a1 c1 b1 d1a2 c2

x,Σ

y,Δ

y,Σ z,Σ D,Θ

Costs: L=1, Δ=2, Θ=3

2 + 3 + 2x1 = 7

1Δ, 1Θ, 2L

Different reconciliation could have different cost.

Page 6: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Applications of DTL Reconciliation

A B C D

y

Species Tree S

x

z

a1 c1 b1 d1a2 c2

Gene Tree on G

a1 c1 b1 d1a2 c2

G with evolutionary history

x,Σ

y,Δ

y,Σ z,Σ D,Θ

I Understanding how gene families evolve.

I Dating gene birth.

I Inferring orthologs/paralogs/xenologs.

I Accurate gene tree reconstruction.

I Whole genome species tree construction.

I Constructing species phylogenetic networks.

Page 7: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Traditional Bottlenecks of DTL Reconciliation

1. Slow algorithms: Fastest algorithms had cubic timecomplexity.

I Could not study large gene families.I No gene tree reconstruction or species tree reconstruction.

2. Inaccurate gene trees: Accuracy of reconciliation deeplyimpacted by accuracy of gene trees.

3. Multiple optima: There are often multiple optimal solutions.I Unclear how to interpret single reconciliation.I Algorithms to output all optimal reconciliations have

exponential time complexity.

4. Event costs: Event costs impact reconciliations.I What is the “correct” event cost assignment.

Page 8: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Recent Advances

Addressing the bottlenecks.

(a) Asymptotically faster algorithms.

(b) Method for accurate prokaryotic gene tree reconstruction.

(c) Handling multiple optima by uniform sampling.

(d) Techniques and tools for studying impact of event costs.

Page 9: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

(a) Faster Algorithms

Page 10: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Our Results

For undated species tree:

I O(mn)-time algorithm.

I Factor of n speedup.

For dated species trees:

I O(mn log n)-time algorithm for the fully dated version.(Transfers restricted to co-existing species).

I Factor of n/ log n speedup.

For improved accuracy:

I General O(mn2)-time algorithm that can handledistance-dependent transfer costs.

I Factor of n speedup.

Page 11: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Dynamic Programming Algorithm

A B C D

Species Tree S

a1 c1 b1 d1a2 c2

Gene Tree G

g s

Given any g ∈ V (G ) and s ∈ V (S), let

I c(g , s) = cost of an optimal reconciliation of G (g) with Ssuch that g is mapped to s.

Similarly, define:

I cΣ(g , s): with restriction that g is speciation.

I c∆(g , s): with restriction that g is duplication.

I cΘ(g , s): with restriction that g is transfer.

Page 12: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Dynamic Programming Algorithm

A B C D

Species Tree S

a1 c1 b1 d1a2 c2

Gene Tree G

gs

0inf

c(g , s) =

0 if g ∈ Le(G ) and s = M(g)∞ if g ∈ Le(G ) and s 6= M(g)min{cΣ(g , s), c∆(g , s), cΘ(g , s)} otherwise.

Page 13: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

DP Algorithm: Nested Post-Order Traversal

I Nested post-order traversal of the gene tree and species treeto compute all values (cΣ(g , s),c∆(g , s),cΘ(g , s),c(g , s)).

A B C D

Species Tree S

a1 c1 b1 d1a2 c2

Gene Tree G

g sΣ

I Source of speedup is to compute each value in constant time.

Page 14: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

DP Algorithm: Termination

A B C D

Species Tree S

a1 c1 b1 d1a2 c2

Gene Tree G g

11

7

I The optimal reconciliation cost of G and S is simply:mins∈V (S) c(rt(G ), s)

Page 15: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Dynamic Programming Algorithm

A B C D

Species Tree S

A B C D

Species Tree S

5

3

Dated species trees:

I Places constraints on transfers. Breaks clean structure.

I Constant time slows to log n time.

Distance dependant transfer costs:

I Each potential transfer event must be evaluated separately.

I Constant time slows to O(n) time.

Page 16: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Drastic Improvement in Runtime and Scalability

Dataset Type Dataset Size RANGER-DTL-U AnGST Mowgli

Simulated50 taxa (100 datasets) 2s 3m:26s 28m:30s

100 taxa (100 datasets) 3s 15m:4s 3h:52m200 taxa (100 datasets) 9s 1h:2m 29h:43m500 taxa (100 datasets) 35s >800h >400h

1,000 taxa (100 datasets) 2m:57s – >6,000h10,000 taxa (1 dataset) 4h:7m – –

Biological 4,733 gene trees, 100 taxa 1m:03s 3h:45m 41h:36m

I 50 taxa: 3m/28m → 2s

I 100 taxa: 15m/3h → 3s

I 200 taxa: 1h/29h → 9s

I 500 taxa: >800h/>400h → 35s

I 1000 taxa: –/>6000h → 3m

I 10000 taxa: –/– → 4h.

Page 17: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Faster Algorithms Enable Many New Applications

I Efficient genome-scale analysis.

I Much larger species tree.

I Species tree reconstruction.

I Accurate gene tree reconstruction.

I Phylogenetic network inference.

→ M. S. Bansal, E. J. Alm, and M. Kellis, “Efficient Algorithms for the Reconciliation Problem with GeneDuplication, Horizontal Transfer, and Loss”. Twentieth Annual International Conference on IntelligentSystems for Molecular Biology (ISMB 2012); Bioinformatics 2012, 28: i283–i291.

Page 18: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

(b) Accurate Gene Tree Reconstruction

Page 19: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Gene Tree Reconstruction

I Gene trees are hard to reconstruct accurately.I Eukaryotes: TreeBest, SPIDIR, PrIME-GSR, SPIMAP,

NOTUNG, tt, TreeFix.I Prokaryotes: Recent efforts: AnGST and MowgliNNI. (Also

ALE in probabilistic framework.)

GA ..... CA

AT ..... CA

GT ..... CC

A B C D

Species Tree S

a1 c1 b1 d1a2 c2

Gene Tree

Page 20: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Developing a Principled Approach

Goal:

I Use fast algorithms to develop an efficient and principledapproach for prokaryotic gene tree reconstruction.

Outcome:

I TreeFix-DTL: A highly accurate method for prokaryotic genetree reconstruction.

Page 21: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

TreeFix-DTL: Algorithm Overview

Input: ML gene tree, multiple sequence alignment, and rootedspecies tree.Output: Reconstructed (error-corrected) gene tree.

I Statistically informed, fast and scalable, easy to use.

Statistically Equivalent Gene Trees

Species Tree

104

76

91

Page 22: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Dataset Generation

Basic simulation setup:

I 50 taxa Yule species trees.

I Gene trees with low, medium, and high rates of D, T, L.

I Mutation rates (substitutions per site) 1, 3, 5, and 10.

I Simulated amino acid sequences of length 173 and 333.

I Reconstructed using RAxML.

Total of 24 different datasets, each with 100 gene trees.

Page 23: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Performance: TreeFix-DTL is Highly Accurate

RAxML NOTUNG Treefix MowgliNNI AnGST TreeFix-DTL

0

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

0.09

0.1

Rate 1 Rate 3 Rate 5 Rate 10

No

rmal

ize

d R

F d

ista

nce

Reconstruction accuracy for different mutation rates(Low rate of DTL, Sequence Length: 333)

Page 24: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Performance: Tree Reconstruction Accuracy

Method Normalized RF distance Perfect reconstructionRAxML 0.097 3.04%

NOTUNG 0.088 13.08%TreeFix 0.079 10.29%

MowgliNNI 0.039 22.17%AnGST 0.032 29.08%

TreeFix-DTL 0.028 38.21%

1. RAxML and eukaryotic methods error prone and ineffective.

2. TreeFix-DTL is highly accurate.

Page 25: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Performance: Greatly Improved Reconciliation Accuracy

Accuracy of inferred duplications and transfers: Averaged across allsimulated datasets.

% p

reci

sion

020

6010

0

80 81 8194

% s

ensi

tivity

020

6010

0

8189 90

98

% p

reci

sion

020

6010

0

3871 76

96

% s

ensi

tivity

020

6010

0

68 68 73

92

RAxML AnGST TreeFix-DTL True

Duplications Transfers Losses

% p

reci

sion

020

6010

0

31

73 7685

% s

ensi

tivity

020

6010

0

7687 87 92

Page 26: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Impact of TreeFix-DTL

I Now possible to reconstruct gene trees very accurately.

I Large impact on accuracy of reconciliation and on anydownstream analyses.

→ M. S. Bansal, Y. Wu, E. J. Alm, and M. Kellis, “Improved Gene Tree Error-Correction for DecipheringMicrobial Evolution”. Under review.

→ Y. Wu, M. D. Rasmussen, M. S. Bansal, M. Kellis. TreeFix: statistically informed gene tree errorcorrection using species trees. Systematic Biology 62(1): 110-120, 2013.

Page 27: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

(c) Handling Multiple Optima

Page 28: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Optimal Reconciliations Need Not Be Unique

Costs: L=1, Δ=1, Θ=3

A B C D

y

Species Tree S

x

z

a1 c1 b1 d1a2 c2

Gene Tree on G

A B C D

x

y

z

Duplication

Loss

Transfer

A B C D

x

zTransfer

Transfer

y

I Which one is the “correct” reconciliation?

Page 29: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Number of Optimal Solutions

1 17%

[2,9] 31%

[10,99] 20%

[100, 10k] 17%

[10k, 100k] 5%

[100k, 100Mil] 7%

>=100Mil 3%

Number of optimal solutions (4699 trees)

Page 30: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Exponential Increase with Gene Tree Size

110

1001000

10000100000

100000010000000

1000000001E+091E+101E+111E+121E+131E+141E+151E+16

0 20 40 60 80 100 120 140 160 180 200

Num

ber o

f Sol

utio

ns

Gene Tree Size

Number of Optimal Solutions by Gene Tree Size

Page 31: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Dealing With Non-Uniqueness

Fundamental question:

I How different are the different optimal reconciliations?

Costs: L=1, Δ=1, Θ=3

A B C D

y

Species Tree S

x

z

a1 c1 b1 d1a2 c2

Gene Tree on G

A B C D

x

y

z

Duplication

Loss

Transfer

A B C D

x

zTransfer

Transfer

y

x,Σ

z,Σ

y

D,Θ

In this case, most node mappings and event assignments areconsistent!

Page 32: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Goals

Goals:

I Understand the space of optimal reconciliations.

I Specifically, understand how different the mapping and eventassignments can be in the different optima.

Solution: Sample the space of optimal reconciliations uniformly atrandom.

I Enables systematic study of space of optima for any inputinstance.

I Enables quick estimates of “support” for any event ormapping assignment.

Page 33: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Sample Optimal Reconciliations Uniformly

I We show that this is possible to do using a modification ofthe DP algorithm.

I Idea is to keep track of number of optimal solutions for eachsubproblem, and then weigh different equally optimal choicesduring backtracking step.

I Complexity increases from O(mn) to O(mn2).

Page 34: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Application to Biological Dataset

Dataset:>4700 gene trees from a set of 100 species broadly sampled acrossthe tree of life.

I Ran 500 times on each gene tree and aggregated the 500reconciliations.

Aggregation: For each internal node in the gene tree:

I How consistent are the 500 mapping assignments? (Frequencyof most frequent mapping).

I How consistent are the 500 event assignments? (Frequency ofmost frequent event assignment).

Page 35: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Mapping Assignment Consistency

73.15

0.90 1.29

19.10

5.55

0

10

20

30

40

50

60

70

80

90

100

100 [90, 99] [80, 89] [50, 79] [0, 49]

Frac

tion

of n

odes

(%)

Consistency (%)

I For many nodes, the mapping assignments are 100%consistent.

I But significant fraction of nodes have inconsistent mappingassignments.

Page 36: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Event Assignment Consistency

93.31

0.63 0.79 4.97

0.25 0

10

20

30

40

50

60

70

80

90

100

100 [90, 99] [80, 89] [50, 79] [0, 49]

Frac

tion

of n

odes

(%)

Consistency (%)

I Good news: Event assignments are highly consistent!

Page 37: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Impact of Uniform Sampling on Handling Multiple Optima

I Can efficiently study the space of optimal reconciliations.

I Can be used to determine consistency of the mapping andevent assignment for any gene tree node.

I Event assignments are highly consistent: Great for functionalgenomic studies.

A B C D

y

Species Tree S

x

z

a1 c1 b1 d1a2 c2

Gene Tree on G

a1 c1 b1 d1a2 c2

x,Σ

y,Δ

y,Σ z,Σ D,Θ(80%,60%)

→ M. S. Bansal, E. J. Alm, and M. Kellis, “Reconciliation Revisited: Handling Multiple Optima whenReconciling with Duplication, Transfer, and Loss”. Journal of Computational Biology (JCB), 20(10):738-754, 2013.

Page 38: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

(d) Impact of Event Costs

Page 39: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Event Costs Define Optimal Reconciliations

I Which one is the “correct” reconciliation?

Page 40: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Uncertainty in Event Cost Assignments

Fundamental question:I What are the “correct” event costs?

I Difficult to estimate.I “Correct” event costs may not even exist.

I How do reconciliations vary as we change event costs?

Page 41: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Pareto-optimal Reconciliations

I Keep track of all Pareto-optimal Reconciliations.I No other reconciliation is better in all three event counts.I Represents reconciliations that could be optimal for some

assignment of event costs.

I O(m5n logm) time.

I Use to partition event cost space into equivalence regions.

Page 42: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Equivalence RegionsT

rans

fer

cost

rel

ativ

e to

dup

licat

ion

5

4

3

2

1

54321

<6, 1, 2, 3> Count = 3<6, 0, 3, 1> Count = 2<4, 0, 5, 0> Count = 8<4, 1, 4, 0> Count = 2<6, 2, 1, 5> Count = 1<5, 4, 0, 10> Count = 1

Loss cost relative to duplication

Tra

nsfe

r co

st r

elat

ive

to d

uplic

atio

n

Loss cost relative to duplication1 2 3 4 5

1

2

3

4

5(a) (b) (c)

Page 43: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Impact in Practice

I Can visualize equivalence regions.

I Can choose suitable event cost assignments.

I Can be used to determine “consensus” mappings and eventassignments.

→ R. Libeskind-Hadas, Y. Wu, M. S. Bansal, and M. Kellis, “Pareto-Optimal Phylogenetic TreeReconciliation”. ISMB 2014; Bioinformatics 30: i87-i95, 2014.

Page 44: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

In Summary

I There now exist effective algorithms and methods to handlethe major bottlenecks:

1. Slow algorithms → Asymptotically faster algorithms2. Inaccurate gene trees → TreeFix-DTL3. Multiple optima → Uniformly random sampling4. Impact of event costs → Pareto-optimality, equivalence regions

I Easy to use, Fast, Accurate, Scalable.

Page 45: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Constructing Phylogenetic Networks

Using existing software:

I Build accurate gene trees.

I Obtain a species tree (can be subset of network).

I Use DTL reconciliation to reconcile individual gene trees.

I Aggregate transfers inferred for gene trees onto species tree.

I Advantages: Scalable, can handle all gene families, not fooledby duplication/loss and ILS.

Future Work:

I Program to aggregate reticulations onto species tree to drawnetworks and infer highways. Use global view to refineindividual reconciliations.

I Reconciliation against species networks.

Page 46: Deciphering Reticulate Evolution Using Phylogenetic Reconciliation · 2018-05-29 · Deciphering Reticulate Evolution Using Phylogenetic Reconciliation Mukul S. Bansal Department

Thank You!

Questions!