Shifted stochastic processes evolving on trees: application to...

Shifted stochastic processes evolving on trees: application to models of adaptive evolution on phylogenies
Paul Bastide (1,2), Mahendra Mariadassou (2), Stéphane Robin (1)
(1) UMR 518 MIA - AgroParisTech/INRA, Paris
(2) UR 1404 MaIAGE - INRA, Jouy en Josas
2 June 2015


Page 1: Shifted stochastic processes evolving on trees: application to …pbastide.github.io/docs/20150602_JdS.pdf · 2020-06-29 · Stochastic Processes on Trees Identi ability Problems



Stochastic Processes on Trees
Identifiability Problems and Counting Issues
Statistical Inference
Chelonia Data Set
References

Introduction

[Figure: Chelonian phylogenetic tree with habitats (Jaffe et al., 2011), time axis from 200 to 0. Photos: Dermochelys coriacea, Homopus areolatus.]

How can we explain the diversity, while accounting for the phylogenetic correlations?

Modelling: a shifted stochastic process on the phylogeny.

PB, MM, SR Shifted stochastic processes on trees 2/17


Principle of the Modeling and Notations
Stochastic Processes Used
Shifts

Stochastic Process on a Tree (Felsenstein, 1985)

[Figure: tree with root R and tips A to E, annotated with the times t and t_AB; second panel: phenotype against time (0 to 800) along the tree.]

Brownian Motion:

$\mathrm{Var}[A \mid R] = \sigma^2 t$

$\mathrm{Cov}[A; B \mid R] = \sigma^2 t_{AB}$
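A minimal sketch of these two formulas: the tip covariance matrix under Brownian motion, computed from a tree stored as parent pointers and branch lengths. The four-tip tree, its branch lengths and the helper names `root_path` and `bm_covariance` are hypothetical; only the relations Var = σ² t and Cov = σ² t_AB come from the slide, with t_AB read as the length of the branches shared by the two root-to-tip paths.

```python
def root_path(node, parent):
    """Return the branches (named by their child node) from `node` up to the root."""
    path = []
    while parent[node] is not None:
        path.append(node)
        node = parent[node]
    return path


def bm_covariance(tips, parent, length, sigma2):
    """Covariance of tip values given the root value under Brownian motion.

    Cov[A, B | R] = sigma2 * t_AB, where t_AB is the total length of the
    branches shared by the root-to-A and root-to-B paths.
    """
    cov = {}
    for a in tips:
        branches_a = set(root_path(a, parent))
        for b in tips:
            shared = branches_a & set(root_path(b, parent))
            t_ab = sum(length[n] for n in shared)
            cov[(a, b)] = sigma2 * t_ab
    return cov


# Hypothetical ultrametric tree: R -> (N1, N2), N1 -> (A, B), N2 -> (C, D).
parent = {"R": None, "N1": "R", "N2": "R",
          "A": "N1", "B": "N1", "C": "N2", "D": "N2"}
length = {"N1": 300.0, "N2": 500.0,
          "A": 500.0, "B": 500.0, "C": 300.0, "D": 300.0}
cov = bm_covariance(["A", "B", "C", "D"], parent, length, sigma2=2.0)
```

Tips that only share the root, such as A and C here, get zero covariance, which is the phylogenetic correlation structure the model accounts for.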


BM vs OU (Hansen, 1997; Butler and King, 2004)

[Figures 1 and 2 of Butler and King (2004): simulated BM trajectories for increasing drift intensity, and simulated OU trajectories for varying selection strength and optimum value.]

BM
Equation: $dW(t) = \sigma\, dB(t)$
Stationary state: none.
Variance: $\sigma_{ij} = \gamma^2 + \sigma^2 t_{ij}$

OU
Equation: $dW(t) = \sigma\, dB(t) + \alpha[\beta(t) - W(t)]\, dt$
Stationary state: $\mu = \beta_0$, $\gamma^2 = \frac{\sigma^2}{2\alpha}$
Variance: $\sigma_{ij} = \frac{\sigma^2}{2\alpha} e^{-\alpha d_{ij}}$ (root in stationary state)
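The two variance formulas of this slide can be sketched as small functions. The function names are hypothetical, and d_ij is read here as the distance between tips i and j along the tree, which the slide does not spell out.

```python
import math


def bm_cov(t_ij, sigma2, gamma2):
    """BM: sigma_ij = gamma^2 + sigma^2 * t_ij."""
    return gamma2 + sigma2 * t_ij


def ou_stationary_cov(d_ij, sigma2, alpha):
    """OU with the root in the stationary state:
    sigma_ij = sigma^2 / (2 alpha) * exp(-alpha * d_ij)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive for the OU process")
    gamma2 = sigma2 / (2.0 * alpha)  # stationary variance
    return gamma2 * math.exp(-alpha * d_ij)


# Hypothetical parameter values, for illustration only.
variance = ou_stationary_cov(0.0, sigma2=2.0, alpha=1.0)
near = ou_stationary_cov(1.0, sigma2=2.0, alpha=1.0)
far = ou_stationary_cov(5.0, sigma2=2.0, alpha=1.0)
```

Under OU the covariance decays exponentially with the distance between tips, whereas under BM it grows linearly with the shared time.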


Shifts

[Figure: tree with root R and tips A to E, with a shift δ on one branch; phenotype against time (0 to 800) under BM and under OU.]

BM: shifts in the mean:

$m_j = m_{pa(j)} + \sum_k \mathbb{I}\{\tau_k = b_j\}\, \delta_k$

OU: shifts in the optimal value:

$\beta_j = \beta_{pa(j)} + \sum_k \mathbb{I}\{\tau_k = b_j\}\, \delta_k$
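The recursion above can be sketched as a single pass from the root to the tips. The tree, the shift values and the function name `node_means` are hypothetical; each branch b_j is named by its child node j, and each shift is stored as a (branch, δ) pair.

```python
def node_means(parent, order, root_value, shifts):
    """Propagate m_j = m_pa(j) + sum_k 1{tau_k = b_j} delta_k from the root down.

    parent: dict node -> parent (None for the root)
    order:  nodes listed so that every parent precedes its children
    shifts: list of (branch, delta) pairs; a branch is named by its child node
    """
    means = {}
    for j in order:
        if parent[j] is None:
            means[j] = root_value
            continue
        on_branch = sum(delta for branch, delta in shifts if branch == j)
        means[j] = means[parent[j]] + on_branch
    return means


# Hypothetical tree: R -> (N1, C), N1 -> (A, B), with one shift on the branch to N1.
parent = {"R": None, "N1": "R", "C": "R", "A": "N1", "B": "N1"}
order = ["R", "N1", "C", "A", "B"]
means = node_means(parent, order, root_value=0.0, shifts=[("N1", 2.5)])
```

The same function applies to the OU optimal values β_j, since the recursion has the same form.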


Identifiability Problems
Number of Parsimonious Solutions
Number of Models with K Shifts


Equivalencies

K fixed, several equivalent solutions.

[Figure: several allocations of shifts on the same tree, with labels µ, δ1, δ2, δ2 − δ1, µ + δ1 and µ + δ2, all giving the tip means µ + δ1 and µ + δ2.]

Problem of over-parametrization: parsimonious configurations.


Parsimonious Solution: Definition

Definition

Given a coloring of the tips, a parsimonious allocation of the shifts is one that has a minimum number of shifts.


Equivalent Parsimonious Allocations

Definition

Two allocations are said to be equivalent (denoted ∼) if they are both parsimonious and give the same colors at the tips.

Find one solution: several existing dynamic programming algorithms (see Felsenstein, 2004).

Enumerate all solutions: new recursive algorithm, adapted from previous ones (and implemented in R).
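To illustrate the "find one solution" step, here is a sketch of the classical Sankoff recursion with unit costs, one of the standard dynamic programming approaches to parsimony. It is a stand-in and not the authors' enumeration algorithm; the tree, the tip colors and the function name `min_shifts` are hypothetical.

```python
def min_shifts(children, tip_color, node):
    """Sankoff recursion with unit costs.

    Returns a dict color -> minimal number of color changes in the subtree
    below `node` when `node` itself carries that color.
    """
    colors = sorted(set(tip_color.values()))
    if node in tip_color:  # tip: its color is imposed
        return {c: (0 if c == tip_color[node] else float("inf")) for c in colors}
    cost = {c: 0 for c in colors}
    for child in children[node]:
        child_cost = min_shifts(children, tip_color, child)
        for c in colors:
            # Cheapest color k for the child, paying 1 if it differs from c.
            cost[c] += min(child_cost[k] + (k != c) for k in colors)
    return cost


# Hypothetical tree: Z1 -> (Z4, Z2), Z4 -> (Y1, Y2), Z2 -> (Y3, Z3), Z3 -> (Y4, Y5).
children = {"Z1": ["Z4", "Z2"], "Z4": ["Y1", "Y2"],
            "Z2": ["Y3", "Z3"], "Z3": ["Y4", "Y5"]}
tip_color = {"Y1": "a", "Y2": "b", "Y3": "c", "Y4": "d", "Y5": "d"}
n_shifts = min(min_shifts(children, tip_color, "Z1").values())
```

The minimum over root colors gives the number of shifts of any parsimonious allocation; enumerating all such allocations requires keeping track of every optimal choice rather than only the cost.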


Number of Models with K Shifts

Hypothesis "No Homoplasy": 1 shift = 1 new color.

"K shifts ⟺ K + 1 colors"

Bijection

$S_K^{PI} = S_K^{P} / \sim$; $S_K^{P}$ = {Parsimonious allocations of K shifts}

$S_K^{PI} \simeq$ {Tree-compatible colorings of the tips in K + 1 colors}

Problem: size of $S_K^{PI}$?

Proposition

$|S_K^{PI}| \le \binom{m + n - 1}{K}$

$|S_K^{PI}|$ depends on the topology of the tree. It can be computed with a recursive algorithm.

For a binary tree: $|S_K^{PI}| = \binom{2n - 2 - K}{K}$.
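As a quick numerical check of the proposition, the two binomial expressions can be evaluated directly. The function names are hypothetical, and the example assumes that n is the number of tips and that a rooted binary tree has m = n − 1 internal nodes, which the slide does not state.

```python
from math import comb


def n_models_binary(n, k):
    """|S_K^PI| for a binary tree: C(2n - 2 - K, K)."""
    return comb(2 * n - 2 - k, k)


def n_models_upper_bound(m, n, k):
    """General upper bound: C(m + n - 1, K)."""
    return comb(m + n - 1, k)


# Assumed example: binary tree with n = 5 tips, hence m = 4 internal nodes.
n, m = 5, 4
counts = {k: n_models_binary(n, k) for k in range(4)}
bounds = {k: n_models_upper_bound(m, n, k) for k in range(4)}
```

With these assumptions the count stays below the bound for every K, and the gap widens as K grows.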


Incomplete Data Model: EM
Linear Regression Model
Model Selection

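Incomplete Data Model: EM

[Figure: tree with observed tips Y1 to Y5, unobserved internal nodes Z1 to Z4, and a shift δk on one branch.]

$X_j \mid X_{pa(j)} \sim \mathcal{N}\Big(q_j X_{pa(j)} + r_j + s_j \sum_k \mathbb{I}\{\tau_k = b_j\}\, \delta_k,\ \sigma_j^2\Big)$

EM algorithm: maximize $\mathbb{E}_\theta[\log p_\theta(Z, Y) \mid Y]$.

E step: "Upward-Downward" algorithm.

M step: OU: increase objective function (GM).

Initialization: LASSO regression (see next).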
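The conditional law above can be sketched as a simulation from the root to the tips, drawing each node given its parent. The tree, the parameter values and the function name `simulate_tree` are hypothetical. The example uses the BM case, assumed here to correspond to q_j = 1, r_j = 0 and s_j = 1; it does not implement the EM algorithm itself.

```python
import random


def simulate_tree(parent, order, root_value, q, r, s, sigma, shifts, rng):
    """Draw X_j | X_pa(j) ~ N(q_j X_pa(j) + r_j + s_j * shift_j, sigma_j^2).

    order:  nodes listed so that every parent precedes its children
    shifts: list of (branch, delta) pairs; a branch is named by its child node
    sigma:  dict node -> standard deviation sigma_j
    """
    values = {}
    for j in order:
        if parent[j] is None:
            values[j] = root_value
            continue
        shift = sum(delta for branch, delta in shifts if branch == j)
        mean = q[j] * values[parent[j]] + r[j] + s[j] * shift
        values[j] = rng.gauss(mean, sigma[j])
    return values


# Hypothetical tree: Z1 -> (Z2, Y3), Z2 -> (Y1, Y2), one shift on the branch to Z2.
parent = {"Z1": None, "Z2": "Z1", "Y3": "Z1", "Y1": "Z2", "Y2": "Z2"}
order = ["Z1", "Z2", "Y3", "Y1", "Y2"]
nodes = order[1:]
q = {j: 1.0 for j in nodes}      # BM case (assumption)
r = {j: 0.0 for j in nodes}
s = {j: 1.0 for j in nodes}
sigma = {j: 0.5 for j in nodes}  # would be sqrt(sigma^2 * branch length) under BM
sample = simulate_tree(parent, order, 0.0, q, r, s, sigma,
                       shifts=[("Z2", 3.0)], rng=random.Random(1))
```

In the incomplete data model only the tip values Y would be kept as observations, while the internal values Z are the latent variables handled by the E step.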


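Linear Regression Model

[Figure: tree with tips Y1 to Y5 and internal nodes Z1 to Z4, with shifts δ1, δ2 and δ3 on three branches.]

$\Delta = (\mu, \delta_1, 0, 0, \delta_2, 0, \delta_3, 0, 0)^T$ (entries ordered as Z1, Z2, Z3, Z4, Y1, Y2, Y3, Y4, Y5)

$T\Delta = (\mu + \delta_2,\ \mu,\ \mu + \delta_1 + \delta_3,\ \mu + \delta_1,\ \mu + \delta_1)^T$

T =

      Z1  Z2  Z3  Z4  Y1  Y2  Y3  Y4  Y5
Y1     1   0   0   1   1   0   0   0   0
Y2     1   0   0   1   0   1   0   0   0
Y3     1   1   0   0   0   0   1   0   0
Y4     1   1   1   0   0   0   0   1   0
Y5     1   1   1   0   0   0   0   0   1

BM: $Y = T\Delta + E$

OU: $Y = T W(\alpha) \Delta + E$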
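The matrix T can be rebuilt from the tree and the product T∆ checked numerically. In this sketch the parent relations are read off the rows of T, the numerical values of µ and the δ's are hypothetical, and `ancestry_matrix` is an illustrative helper name.

```python
def ancestry_matrix(tips, nodes, parent):
    """T[i][j] = 1 if node j lies on the path from the root to tip i (tip included)."""
    rows = []
    for tip in tips:
        lineage = set()
        node = tip
        while node is not None:
            lineage.add(node)
            node = parent[node]
        rows.append([1 if n in lineage else 0 for n in nodes])
    return rows


nodes = ["Z1", "Z2", "Z3", "Z4", "Y1", "Y2", "Y3", "Y4", "Y5"]
tips = nodes[4:]
# Parent relations consistent with the matrix T of the slide.
parent = {"Z1": None, "Z2": "Z1", "Z3": "Z2", "Z4": "Z1",
          "Y1": "Z4", "Y2": "Z4", "Y3": "Z2", "Y4": "Z3", "Y5": "Z3"}
T = ancestry_matrix(tips, nodes, parent)

# Hypothetical values for the root mean and the three shifts.
mu, d1, d2, d3 = 1.0, 2.0, 3.0, 5.0
delta = [mu, d1, 0.0, 0.0, d2, 0.0, d3, 0.0, 0.0]
tip_means = [sum(t * x for t, x in zip(row, delta)) for row in T]
```

Each tip mean is the root value plus the shifts met along its lineage, which is what makes the BM model a linear regression in ∆.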


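Model Selection on K

Proposition (Form of the penalty and guarantees (α known))

Under our setting:

$Y = R\Delta + \gamma E$ with $E \sim \mathcal{N}(0, V)$ and $S = \{S_\eta, \eta \in M\}$, $M = \bigcup_{K \ge 0} S_K^{PI}$

Define the following penalty:

$\mathrm{pen}(K) = A\, \frac{n - K - 1}{n - K - 2}\, \mathrm{EDkhi}[K + 2,\ n - K - 2,\ e^{-L_K}]$, $L_K = \log |S_K^{PI}| + 2 \log(K + 2)$

and the estimator:

$\hat{\eta} = \operatorname{argmin}_{\eta \in M} \|Y - \hat{s}_\eta\|_V^2 \left(1 + \frac{\mathrm{pen}(K_\eta)}{n - K_\eta - 1}\right)$

Under some reasonable technical hypotheses, we get the non-asymptotic bound:

$\mathbb{E}\left[\frac{\|s - \hat{s}_{\hat{\eta}}\|_V^2}{\gamma^2}\right] \le C(A, \kappa) \left[\inf_{\eta \in M} \left\{\frac{\|s - s_\eta\|_V^2}{\gamma^2} + D_\eta (3 + \log(n))\right\} + 1 + \log(n)\right]$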
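A partial sketch of how the criterion could be evaluated once the penalty is known. The EDkhi function of Baraud et al. (2009) is not implemented here: the penalty values are passed in as given numbers. The residual sums of squares and penalties below are made-up inputs for illustration, the function names are hypothetical, and L_K uses the binary-tree count of the previous proposition.

```python
from math import comb, log


def log_complexity(n, k):
    """L_K = log|S_K^PI| + 2 log(K + 2), using the binary-tree count C(2n - 2 - K, K)."""
    return log(comb(2 * n - 2 - k, k)) + 2 * log(k + 2)


def select_k(rss, pen, n):
    """Return the K minimizing rss[K] * (1 + pen[K] / (n - K - 1)).

    rss[K] is the smallest weighted residual sum of squares among models with
    K shifts; because the penalty depends on the model only through K, the
    minimum over all models reduces to a minimum over K.
    """
    crit = {k: rss[k] * (1 + pen[k] / (n - k - 1)) for k in rss}
    return min(crit, key=crit.get), crit


# Hypothetical inputs: n = 10 observations, best fits for K = 0, 1, 2.
rss = {0: 10.0, 1: 4.0, 2: 3.5}
pen = {0: 1.0, 1: 2.0, 2: 6.0}  # stand-ins for the EDkhi-based penalty
best_k, crit = select_k(rss, pen, n=10)
```

In this toy example the second shift reduces the residuals too little to offset its penalty, so K = 1 is selected.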


Model Selection: Important Points

Based on Baraud et al. (2009): non-asymptotic bound; unknown variance; no constant to be calibrated.

Novelties: non-iid variance; penalty depends on the tree topology (through $|S_K^{PI}|$).


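Chelonia Dataset

[Figure: Chelonian phylogenetic tree, time axis from 200 to 0. Colors: habitats (Freshwater, Island, Mainland, Saltwater). Boxes: selected EM regimes. Species pictured on successive overlays: Chelonia mydas, Geochelone nigra abingdoni, Chitra indica.]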


                   Habitat       EM
No. of shifts           16        5
No. of regimes           4        6
lnL                -135.56   -97.59
ln 2/α (%)            7.83     5.43
γ²                    0.35     0.22
CPU time (min)        1.25   134.49


Conclusion and Perspectives

A general inference framework for trait evolution models.

Conclusions: Some problems of identifiability arise. An EM can be written to maximize likelihood. Adaptation of model selection results to non-iid framework.

R codes: Available on GitHub: https://github.com/pbastide/Phylogenetic-EM

Perspectives: Multivariate traits. Deal with uncertainty (tree, data). Use fossil records.


Bibliography

Y. Baraud, C. Giraud, and S. Huet. Gaussian model selection with an unknown variance. Annals of Statistics, 37(2):630–672, Apr. 2009. doi: 10.1214/07-AOS573.

M. A. Butler and A. A. King. Phylogenetic Comparative Analysis: A Modeling Approach for Adaptive Evolution. The American Naturalist, 164(6):683–695, 2004. ISSN 00030147.

J. Felsenstein. Phylogenies and the Comparative Method. The American Naturalist, 125(1):1–15, Jan. 1985. ISSN 00030147.

J. Felsenstein. Inferring Phylogenies. Sinauer Associates, Sunderland, USA, 2004.

T. F. Hansen. Stabilizing selection and the comparative analysis of adaptation. Evolution, 51(5):1341–1351, Oct. 1997.

A. L. Jaffe, G. J. Slater, and M. E. Alfaro. The evolution of island gigantism and body size variation in tortoises and turtles. Biology Letters, 2011.

Photo Credits:

- "Parrot-beaked Tortoise Homopus areolatus CapeTown 8" by Abu Shawka - Own work. Licensed under CC0 via Wikimedia Commons

- "Leatherback sea turtle Tinglar, USVI (5839996547)" by U.S. Fish and Wildlife Service Southeast Region - Leatherback sea turtle/ Tinglar, USVI. Uploaded by AlbertHerring. Licensed under CC BY 2.0 via Wikimedia Commons

- "Hawaii turtle 2" by Brocken Inaglory. Licensed under CC BY-SA 3.0 via Wikimedia Commons

- "Dudhwalive chitra" by Krishna Kumar Mishra - Own work. Licensed under CC BY 3.0 via Wikimedia Commons

- "Lonesome George in profile" by Mike Weston - Flickr: Lonesome George 2. Licensed under CC BY 2.0 via Wikimedia Commons


Thank you for listening
