Considerate Approaches to ABC Model Selection

Michael P.H. Stumpf, Christopher Barnes, Sarah Filippi, Thomas Thorne. Theoretical Systems Biology Group. 26/06/2012.


Talk given at ISBA 2012 in the Approximate Bayesian Computation Special Topic Session

Transcript of Considerate Approaches to ABC Model Selection

Page 1: Considerate Approaches to ABC Model Selection

Considerate Approaches to ABC Model Selection

Michael P.H. Stumpf, Christopher Barnes, Sarah Filippi, Thomas Thorne

Theoretical Systems Biology Group

26/06/2012

Considerate Approaches to ABC Model Selection Stumpf et al. 1 of 15

Page 2: Considerate Approaches to ABC Model Selection

Evolving Networks

[Figure, four panels: (a) Duplication attachment; (b) Duplication attachment with complementarity; (c) Linear preferential attachment; (d) General scale-free. Labels in the diagram: w_i, w_j.]

Page 3: Considerate Approaches to ABC Model Selection

Inference and Model Selection

We have observed data, $D$, that was generated by some system that we seek to describe by a mathematical model. In principle we can have a model-set, $\mathcal{M} = \{M_1, \ldots, M_\nu\}$, where each model $M_i$ has an associated parameter $\theta_i$.

We may know the different constituent parts of the system, $X_i$, and have measurements for some or all of them under some experimental designs, $T$.

\[
\underbrace{\Pr(M_i \mid T, D)}_{\text{Model Posterior}}
= \frac{\overbrace{\Pr(D \mid M_i, T)}^{\text{Likelihood}} \; \overbrace{\pi(M_i)}^{\text{Prior}}}
       {\underbrace{\sum_{j=1}^{\nu} \Pr(D \mid M_j, T)\, \pi(M_j)}_{\text{Evidence}}}
\]

For complicated models and/or detailed data the likelihood evaluation can become prohibitively expensive.

Approximate Inference
We can approximate the likelihood and/or the models. The "true" model is unlikely to be in $\mathcal{M}$ anyway.
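
The model posterior above is a normalisation of likelihood times prior over the model set. A minimal sketch, assuming the per-model log likelihoods Pr(D | M_i, T) are already available; the function name and the numbers are hypothetical and only illustrate the arithmetic.

```python
import math

def model_posterior(log_likelihoods, priors):
    """Return Pr(M_i | T, D) given log Pr(D | M_i, T) and pi(M_i) for each model."""
    if len(log_likelihoods) != len(priors):
        raise ValueError("need exactly one prior per model")
    # Work in log space and subtract the maximum to avoid numerical underflow.
    log_joint = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    shift = max(log_joint)
    weights = [math.exp(v - shift) for v in log_joint]
    total = sum(weights)  # proportional to the evidence (the denominator)
    return [w / total for w in weights]

# Hypothetical log likelihoods for three candidate models, equal priors.
log_likelihoods = [-1052.3, -1050.1, -1060.8]
priors = [1 / 3, 1 / 3, 1 / 3]
posterior = model_posterior(log_likelihoods, priors)
print([round(p, 4) for p in posterior])
```

Working with logarithms matters here because raw likelihoods of detailed data are usually far too small to represent directly.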

Page 8: Considerate Approaches to ABC Model Selection

Approximate Bayesian Computation

We can define the posterior as

\[
p(\theta_i \mid x) = \frac{f(x \mid \theta_i)\, \pi(\theta_i)}{p(x)}
\]

Here $f(x \mid \theta_i)$ is the likelihood, which is often hard to evaluate; consider for example

\[
y = \max[0,\; y + g_1 + y \times g_2] \quad \text{with } g_1, g_2 \sim N(0, \sigma_{1/2}) \quad \text{and} \quad \frac{dy}{dt} = g(y; \theta).
\]

But we can still simulate from the data-generating model, whence

\[
p(\theta_i \mid x)
= \int_{\mathcal{X}} \mathbb{1}(y = x)\, \frac{f(y \mid \theta_i)\, \pi(\theta_i)}{p(x)}\, dy
\approx \int_{\mathcal{X}} \mathbb{1}\big(\Delta(y, x) < \epsilon\big)\, \frac{f(y \mid \theta_i)\, \pi(\theta_i)}{p(x)}\, dy
\]

Solutions for Complex Problems (?)
Approximate (i) data, (ii) model or (iii) distance.
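
The two integrals above correspond to a rejection sampler: draw θ from the prior, simulate y, and keep θ when y is close enough to x. A minimal sketch on a toy problem that is not from the talk (a binomial count with a Uniform(0, 1) prior, chosen because the exact-match case is feasible for discrete data). The acceptance test uses `<=` rather than the strict `<` of the formula so that `epsilon=0` reproduces the indicator 1(y = x).

```python
import random

def simulate(theta, n, rng):
    """Simulate the data-generating model: number of successes in n trials."""
    return sum(1 for _ in range(n) if rng.random() < theta)

def abc_rejection(x_obs, n, epsilon, n_draws, rng):
    """Keep prior draws whose simulated data lie within epsilon of x_obs."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()            # draw theta from the Uniform(0, 1) prior
        y = simulate(theta, n, rng)     # simulate y ~ f(y | theta)
        if abs(y - x_obs) <= epsilon:   # distance Delta(y, x) = |y - x|
            accepted.append(theta)
    return accepted

def mean(values):
    return sum(values) / len(values)

rng = random.Random(1)
n, x_obs = 20, 14
exact = abc_rejection(x_obs, n, epsilon=0, n_draws=40000, rng=rng)
loose = abc_rejection(x_obs, n, epsilon=3, n_draws=40000, rng=rng)

# For this toy model the true posterior is Beta(15, 7), with mean 15/22.
print(len(exact), round(mean(exact), 3))
print(len(loose), round(mean(loose), 3))
```

A larger ε accepts more draws at the price of a posterior that is broader than the true one, which is the trade-off behind the approximation sign.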

Page 11: Considerate Approaches to ABC Model Selection

ABC with Summary Statistics

If the data, $D$, are very complex and detailed, direct comparison between real and simulated data becomes prohibitive. In such situations, which originally motivated ABC approaches, summary statistics of the data are compared. We then have

\[
p_{S,\epsilon}(\theta_i \mid D) \propto \int_{\mathcal{X}} \mathbb{1}\big(\Delta(S(x), S(y_\theta)) < \epsilon\big)\, f(y \mid \theta_i)\, \pi(\theta_i)\, dy
\]

Sufficient Statistics
This only works if the statistic $S(\cdot)$ is sufficient, i.e. if for $s = S(x)$ we have

\[
p(x \mid s, \theta) = p(x \mid s)
\]

Sufficiency for Model Selection
If $S(\cdot)$ is sufficient for parameter estimation (in all models $i$ considered) it is not necessarily sufficient for model selection (Robert et al., PNAS (2011)).

Page 14: Considerate Approaches to ABC Model Selection

ABC with Summary Statistics

Generate data $X \sim N(1, 1)$ and use ABC to infer µ (assuming that $\sigma^2 = 1$ is known).

[Figure, four panels labelled mean, min, var and max: p(θ) plotted against θ over the range −4 to 4.]

Role of Summary Statistics
• Mean (sufficient) correctly infers µ.
• Max/Min capture some information on µ.
• Var fails to capture any information on µ.

We need a way of constructing sets of statistics that together are (approximately) sufficient.
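
The contrast between the mean and the variance as summary statistics can be reproduced with a small rejection sampler. This is a sketch under assumptions that are not on the slide: a Uniform(−4, 4) prior on µ (suggested only by the plotted range), 50 observations, and a tolerance of 0.1.

```python
import random
import statistics

def sample_mean(v):
    return sum(v) / len(v)

def sample_var(v):
    m = sample_mean(v)
    return sum((t - m) ** 2 for t in v) / (len(v) - 1)

def abc_posterior(x_obs, stat, epsilon, n_draws, rng):
    """ABC rejection for mu with sigma^2 = 1 known, comparing stat(y) with stat(x)."""
    s_obs = stat(x_obs)
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-4.0, 4.0)                          # assumed prior
        y = [rng.gauss(mu, 1.0) for _ in range(len(x_obs))]  # simulate data
        if abs(stat(y) - s_obs) < epsilon:
            accepted.append(mu)
    return accepted

rng = random.Random(2012)
x_obs = [rng.gauss(1.0, 1.0) for _ in range(50)]

post_mean = abc_posterior(x_obs, sample_mean, 0.1, 10000, rng)
post_var = abc_posterior(x_obs, sample_var, 0.1, 10000, rng)

# Spread of the accepted mu values under each statistic.
print("mean:", len(post_mean), round(statistics.pstdev(post_mean), 3))
print("var: ", len(post_var), round(statistics.pstdev(post_var), 3))
```

With the mean, the accepted values concentrate near the sample mean; with the variance, whose distribution does not depend on µ, they remain spread over the whole prior range.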

Page 16: Considerate Approaches to ABC Model Selection

A Closer Look at Summary Statistics

We interpret a summary statistic as a function,

\[
S : \mathbb{R}^d \longrightarrow \mathbb{R}^w, \qquad S(x) = s.
\]

If $S$ is sufficient then (we include the model indicator variable in θ)

\[
p(\theta \mid x) = p(\theta \mid s)
\]

Information Theoretical Perspective
A summary statistic is an information compression device. Now let $S$ be a set of statistics which together are sufficient. Then the mutual information satisfies

\[
I(\Theta; X) = \int_{\Omega} \int_{\mathcal{X}} p(\theta, x) \log \frac{p(\theta, x)}{p(\theta)\, p(x)}\, d\theta\, dx = I(\Theta; S)
\]

Constructing Minimally Sufficient Summary Statistics
We seek the set $U \subseteq S$ with minimal cardinality such that $I(\Theta; S) = I(\Theta; U)$.
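
The identity I(Θ; X) = I(Θ; S) for a sufficient S can be checked exactly on a small discrete problem. The setting below is an assumption made for illustration and is not from the talk: θ takes the values 0.2 or 0.8 with equal prior probability, and x consists of three Bernoulli(θ) trials, so the number of successes is sufficient while the first trial alone is not.

```python
import math
from collections import defaultdict
from itertools import product

THETAS = {0.2: 0.5, 0.8: 0.5}  # assumed two-point prior on theta
N_TRIALS = 3

def joint_theta_stat(stat):
    """Joint distribution p(theta, stat(x)) for x = N_TRIALS Bernoulli(theta) trials."""
    joint = defaultdict(float)
    for theta, prior in THETAS.items():
        for x in product((0, 1), repeat=N_TRIALS):
            k = sum(x)
            p_x = theta ** k * (1 - theta) ** (N_TRIALS - k)
            joint[(theta, stat(x))] += prior * p_x
    return joint

def mutual_information(joint):
    """I(Theta; S) = sum p(theta, s) log[p(theta, s) / (p(theta) p(s))], in nats."""
    p_theta, p_s = defaultdict(float), defaultdict(float)
    for (theta, s), p in joint.items():
        p_theta[theta] += p
        p_s[s] += p
    return sum(p * math.log(p / (p_theta[theta] * p_s[s]))
               for (theta, s), p in joint.items() if p > 0)

i_full = mutual_information(joint_theta_stat(lambda x: x))      # full data
i_sum = mutual_information(joint_theta_stat(sum))               # sufficient statistic
i_first = mutual_information(joint_theta_stat(lambda x: x[0]))  # first trial only
print(round(i_full, 4), round(i_sum, 4), round(i_first, 4))
```

The sum retains all the information in the full data, whereas the first trial alone loses part of it, which is the sense in which a summary statistic acts as a compression device.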

Page 19: Considerate Approaches to ABC Model Selection

Constructing Sufficient Statistics

Proposition

Let $X$ be a random variable generated according to $f(\cdot \mid \theta)$. Let $S$ be a set of summary statistics and $U$ and $T$ two subsets of $S$ such that $U = U(X)$, $T = T(X)$ and $S = S(X)$ satisfy $U \subset T \subset S$. We have

\[
I(\Theta; S \mid T) = I(\Theta; S \mid U) - I(\Theta; T \mid U).
\]

In order to construct a subset $T$ of $S$ such that $I(\Theta; S \mid T) = 0$, it is thus sufficient to add statistics from $S$ one by one until the condition holds. If we denote by $S^{(k)}$ the $k$-th statistic to be added (with $k \le w$) we have $S^{(k)} = S^{(k)}(X)$, and then

\[
I(\Theta; S \mid S^{(1)}, \ldots, S^{(k+1)}) \le I(\Theta; S \mid S^{(1)}, \ldots, S^{(k)}).
\]
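
One way to see why the proposition holds is the chain rule for conditional mutual information. The following is a sketch of that argument, not necessarily the authors' proof; it assumes that $U \subset T \subset S$ means $T$ is determined by $S$ and $U$ is determined by $T$.

```latex
\begin{align*}
I(\Theta; S \mid U)
  &= I(\Theta; S, T \mid U)
     && \text{($T$ is a subset of $S$)} \\
  &= I(\Theta; T \mid U) + I(\Theta; S \mid T, U)
     && \text{(chain rule)} \\
  &= I(\Theta; T \mid U) + I(\Theta; S \mid T)
     && \text{($U$ is a subset of $T$)}
\end{align*}
```

Rearranging gives the stated identity, and since $I(\Theta; T \mid U) \ge 0$ the remaining information $I(\Theta; S \mid \cdot)$ cannot increase when a statistic is added.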

Page 20: Considerate Approaches to ABC Model Selection

Constructing Sufficient Statistics

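\[
\begin{aligned}
I(\Theta; S \mid U)
&= \int_{\Omega} \int_{\mathcal{X}} p(\theta, S(x), U(x)) \log \frac{p(\theta, S(x) \mid U(x))}{p(\theta \mid U(x))\, p(S(x) \mid U(x))}\, dx\, d\theta \\
&= \int_{\mathcal{X}} p(S(x)) \big[ \mathrm{KL}\big(p(\Theta \mid S(x)) \,\|\, p(\Theta \mid U(x))\big) \big]\, dx \\
&= \mathbb{E}_{p(X)} \big[ \mathrm{KL}\big(p(\Theta \mid S(X)) \,\|\, p(\Theta \mid U(X))\big) \big]
\end{aligned}
\]

An Impossible Algorithm
• for all subsets $u^* \subseteq s^*$, perform ABC to obtain estimates $p_\epsilon(\Theta \mid u^*)$
• determine the set $A = \{ u^* \subset s^* \text{ such that } \mathrm{KL}(p_\epsilon(\Theta \mid s^*) \,\|\, p_\epsilon(\Theta \mid u^*)) = 0 \}$
• the desired subset is $\arg\min_{u^* \in A} |u^*|$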

Page 21: Considerate Approaches to ABC Model Selection

Constructing Sufficient Statistics

input: a sufficient set of statistics whose values on the dataset are s* = {s*_1, ..., s*_w}, a threshold δ
output: a subset v* of s*

choose randomly u* in s*
v* ← u*
q* ← s* \ v*
repeat
    repeat
        if q* = Ø then
            return v*
        end if
        choose randomly u* in q*
        q* ← q* \ u*
        perform ABC to obtain p_ε(Θ | v*, u*)
    until KL(p_ε(Θ | v*, u*) || p_ε(Θ | v*)) > δ
    optionally: v* ← OrderDependency(v*, u*)
    v* ← v* ∪ u*
    q* ← s* \ v*
until q* = Ø
return v*
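
The control flow of the pseudocode above can be sketched in Python. The expensive step (running ABC with and without a candidate statistic and estimating the KL divergence between the two posteriors) is abstracted into a callable `kl_gain`, which is a hypothetical stand-in; the `INFO` table and `toy_gain` below are likewise invented for illustration, and the optional OrderDependency step is omitted.

```python
import random

def select_statistics(candidates, kl_gain, delta, rng):
    """Greedy subset selection following the pseudocode above.

    kl_gain(selected, u) stands in for: perform ABC with and without the
    statistic u and estimate KL(p(Theta | selected, u) || p(Theta | selected)).
    """
    pool = list(candidates)
    first = rng.choice(pool)              # the first statistic is chosen at random
    selected = [first]
    remaining = [s for s in pool if s != first]
    while remaining:
        rng.shuffle(remaining)            # visit the candidates in random order
        for u in remaining:
            if kl_gain(selected, u) > delta:
                selected.append(u)        # u changes the posterior: keep it
                break
        else:
            return selected               # no remaining candidate adds information
        remaining = [s for s in pool if s not in selected]
    return selected

# Toy stand-in for the ABC/KL step: each statistic "knows about" some parameters.
INFO = {"mean": {"mu"}, "S2": {"sigma"}, "range": {"sigma"},
        "max": {"mu", "sigma"}, "random": set()}

def toy_gain(selected, u):
    covered = set().union(*(INFO[s] for s in selected))
    return float(len(INFO[u] - covered))

chosen = select_statistics(list(INFO), toy_gain, delta=0.5, rng=random.Random(3))
print(chosen)
```

Because the starting statistic is drawn at random, different runs can return different subsets, which is why selection frequencies over many runs are informative.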

Page 22: Considerate Approaches to ABC Model Selection

Examples: Normal Distributions

$y_1, \ldots, y_d \sim N(\mu, \sigma_1^2)$ and $y_1, \ldots, y_d \sim N(\mu, \sigma_2^2)$

[Figure, two panels: candidate statistics (mean, S2, range, max, random) on the horizontal axis against run number (20 to 100) on the vertical axis.]

Page 23: Considerate Approaches to ABC Model Selection

Examples: Normal Distributions

$y_1, \ldots, y_d \sim N(\mu, \sigma_1^2)$ and $y_1, \ldots, y_d \sim N(\mu, \sigma_2^2)$

[Figure, two panels: log(BF) obtained by ABC (vertical axis) against the predicted log(BF) (horizontal axis).]

Page 24: Considerate Approaches to ABC Model Selection

Examples: Population Genetics

[Figure, three panels: Constant Population Size; Exponential Population Growth; Two-Island Model with Migration. Each panel shows statistics S1 to S11 on the horizontal axis against run number (20 to 100) on the vertical axis.]

[S1] Number of Segregating Sites
[S2] Number of Distinct Haplotypes
[S3] Haplotype Homozygosity
[S4] Average SNP Homozygosity
[S5] Number of occurrences of most common haplotype
[S6] Mean number of pair-wise differences between haplotypes
[S7] Number of Singleton Haplotypes
[S8] Number of Singleton SNPs
[S9] Linkage Disequilibrium

Summary Statistic Choice
The choice of summary statistics appears to depend subtly on the true data-generating model. In light of coalescent processes this is, however, to be expected.

Page 27: Considerate Approaches to ABC Model Selection

Examples: Random Walks

[Figure, three panels: Classical Random Walk; Persistent Random Walk; Biased Random Walk. Each panel shows statistics S1 to S5 on the horizontal axis against run number (20 to 100) on the vertical axis.]

[S1] Mean square displacement
[S2] Mean x and y displacement
[S3] Mean square x and y displacement
[S4] Straightness index
[S5] Eigenvalues of gyration tensor

Parameter Sufficiency for Complex Problems
Here all statistics that have been chosen for parameter estimation are also chosen for model selection.

Page 28: Considerate Approaches to ABC Model Selection

Conditioning on Information

[Diagram with nodes labelled s1, s2, s3, x and Θ.]

Statistics
Sufficient: Implicates same area as full data.
Ancillary: Implicates all values of θ equally.

What is the meaning of $p(\theta \mid s_0, s_1, \ldots, s_n)$?

Let $s = (s_0, s_1, \ldots, s_n)$, and assume $I(\Theta; s) < I(\Theta; x)$ but $\epsilon \to 0$. This can happen for sufficient and ancillary $s$. In the latter case we obtain

\[
p(\theta \mid s) = \pi(\theta).
\]

How about

\[
p(t \mid s)
\]

if $s$ is not (quite) sufficient?

Page 32: Considerate Approaches to ABC Model Selection

Model Selection vs. Model Checking

Model Selection: Several models $M \in \mathcal{M}$ are compared and one or more are chosen in light of the data: Find models which are better than others.

Model Checking: The quality of a model $M_i$ is assessed against the available data: Determine if a model is actually 'good'.

Alternative Approach: ABCµ [Ratmann et al., PNAS].

Posterior Predictive Checks
We are interested in the posterior predictive distribution,

\[
p(t(X) \mid s(X)) = \int_{\Theta} p(t(X) \mid \theta)\, p(\theta \mid s(X))\, d\theta.
\]

In particular we have

\[
p(s(X) \mid s(X)) \neq p(s(X) \mid X)
\]

unless t(X) is sufficient.
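
The posterior predictive integral can be approximated by Monte Carlo: draw θ from the posterior given s(X), simulate replicate data, and evaluate t on each replicate. The sketch below assumes a toy setting that is not from the talk: the model is N(µ, 1), s is the sample mean, t is the range, and the exact conjugate posterior N(mean, 1/n) under a flat prior stands in for an ABC posterior sample.

```python
import random

def simulate(mu, n, rng):
    """Assumed model: n independent draws from N(mu, 1)."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]

def data_range(x):
    """Checking statistic t(X): the range, which s(X) = mean does not capture."""
    return max(x) - min(x)

def predictive_tail(x_obs, n_rep, rng):
    """Monte Carlo estimate of Pr(t(X_rep) >= t(x_obs) | s(x_obs))."""
    n = len(x_obs)
    s_obs = sum(x_obs) / n
    t_obs = data_range(x_obs)
    hits = 0
    for _ in range(n_rep):
        # Stand-in for an ABC posterior draw: mu | mean ~ N(s_obs, 1/n).
        mu = rng.gauss(s_obs, (1.0 / n) ** 0.5)
        if data_range(simulate(mu, n, rng)) >= t_obs:
            hits += 1
    return hits / n_rep

rng = random.Random(7)
x_good = [rng.gauss(1.0, 1.0) for _ in range(30)]  # consistent with the model
x_bad = [rng.gauss(1.0, 3.0) for _ in range(30)]   # three times too dispersed
p_good = predictive_tail(x_good, 2000, rng)
p_bad = predictive_tail(x_bad, 2000, rng)
print(p_good, p_bad)
```

A tail probability near zero signals that the model fails to reproduce a feature of the data, here the spread, that the conditioning statistic did not capture.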

Page 34: Considerate Approaches to ABC Model Selection

ABC on Network Data

[Figure, four panels: (e) Duplication attachment; (f) Duplication attachment with complementarity; (g) Linear preferential attachment; (h) General scale-free. Labels in the diagram: w_i, w_j.]

Page 35: Considerate Approaches to ABC Model Selection

ABC on Network Data

Summarizing Networks
• Data are noisy and incomplete.
• We can simulate models of network evolution, but this does not allow us to calculate likelihoods for any but very trivial models.
• There is also no sufficient statistic that would allow us to summarize networks, so ABC approaches require some thought.
• Many possible summary statistics of networks are expensive to calculate.

Full likelihood: Wiuf et al., PNAS (2006).
ABC: Ratmann et al., PLoS Comp. Biol. (2008).
ABC (better): Thorne & Stumpf, J. Roy. Soc. Interface (2012).
Stumpf & Wiuf, J. Roy. Soc. Interface (2010).

Page 36: Considerate Approaches to ABC Model Selection

Spectral Distances

[Figure: example graph with five nodes a, b, c, d, e.]

With rows and columns ordered a, b, c, d, e, its adjacency matrix is

\[
A = \begin{pmatrix}
0 & 1 & 1 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 \\
1 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}
\]

Graph Spectra
Given a graph $G$ with nodes $N$ and edges $(i, j) \in E$ with $i, j \in N$, the adjacency matrix, $A$, of the graph is defined by

\[
a_{i,j} = \begin{cases} 1 & \text{if } (i, j) \in E, \\ 0 & \text{otherwise.} \end{cases}
\]

The eigenvalues, λ, of this matrix provide one way of defining the graph spectrum.
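
The adjacency matrix of the five-node example can be built from an edge list. In the sketch below the edge list is read off the matrix shown on the slide and the helper names are hypothetical. Two standard facts about the spectrum can be checked without computing any eigenvalue: the eigenvalues sum to trace(A), and their squares sum to trace(A²), which for an undirected graph without self-loops equals twice the number of edges.

```python
NODES = ["a", "b", "c", "d", "e"]
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("d", "e")]

def adjacency_matrix(nodes, edges):
    """a[i][j] = 1 if (i, j) is an edge, 0 otherwise (undirected graph)."""
    index = {name: k for k, name in enumerate(nodes)}
    a = [[0] * len(nodes) for _ in nodes]
    for i, j in edges:
        a[index[i]][index[j]] = 1
        a[index[j]][index[i]] = 1  # symmetric entry for the undirected edge
    return a

def trace(m):
    return sum(m[k][k] for k in range(len(m)))

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = adjacency_matrix(NODES, EDGES)
sum_eigenvalues = trace(A)              # sum of the eigenvalues of A
sum_squared = trace(matmul(A, A))       # sum of the squared eigenvalues of A
for row in A:
    print(row)
print(sum_eigenvalues, sum_squared, 2 * len(EDGES))
```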

Page 37: Considerate Approaches to ABC Model Selection

Spectral Distances

A simple distance measure between graphs having adjacency matrices $A$ and $B$, known as the edit distance, is to count the number of edges that are not shared by both graphs,

\[
D(A, B) = \sum_{i,j} (a_{i,j} - b_{i,j})^2.
\]

However for unlabelled graphs we require some mapping $h$ from $i \in N_A$ to $i' \in N_B$ that minimizes the distance

\[
D(A, B) \ge D'_h(A, B) = \sum_{i,j} (a_{i,j} - b_{h(i),h(j)})^2.
\]

Given a spectrum (which is relatively cheap to compute) we have

\[
D'(A, B) = \sum_{l} \left( \lambda^{(\alpha)}_l - \lambda^{(\beta)}_l \right)^2
\]
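
The three distances can be compared on the five-node example graph and a relabelled copy of it. This is a sketch under assumptions: the eigenvalues are computed with a plain cyclic Jacobi iteration (a swapped-in textbook method, chosen to avoid external dependencies), the spectra are matched in sorted order, and the minimisation over h is done by brute force, which is feasible only for very small graphs.

```python
import math
from itertools import permutations

def eigenvalues(a, sweeps=100):
    """Sorted eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    n = len(a)
    m = [[float(v) for v in row] for row in a]
    for _ in range(sweeps):
        off = sum(m[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-22:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if m[p][q] == 0.0:
                    continue
                # Angle that zeroes m[p][q]: tan(2t) = 2 m_pq / (m_qq - m_pp).
                if m[p][p] == m[q][q]:
                    t = math.pi / 4
                else:
                    t = 0.5 * math.atan(2 * m[p][q] / (m[q][q] - m[p][p]))
                c, s = math.cos(t), math.sin(t)
                for k in range(n):  # rotate columns p and q
                    m[k][p], m[k][q] = c * m[k][p] - s * m[k][q], s * m[k][p] + c * m[k][q]
                for k in range(n):  # rotate rows p and q
                    m[p][k], m[q][k] = c * m[p][k] - s * m[q][k], s * m[p][k] + c * m[q][k]
    return sorted(m[i][i] for i in range(n))

def edit_distance(a, b):
    """D(A, B) = sum_ij (a_ij - b_ij)^2 for a fixed node labelling."""
    n = len(a)
    return sum((a[i][j] - b[i][j]) ** 2 for i in range(n) for j in range(n))

def relabel(a, h):
    """Matrix with entries a[h(i)][h(j)]."""
    n = len(a)
    return [[a[h[i]][h[j]] for j in range(n)] for i in range(n)]

def spectral_distance(a, b):
    """D'(A, B) = sum_l (lambda_l(A) - lambda_l(B))^2 on sorted spectra."""
    return sum((x - y) ** 2 for x, y in zip(eigenvalues(a), eigenvalues(b)))

A = [[0, 1, 1, 1, 0], [1, 0, 1, 1, 0], [1, 1, 0, 0, 0], [1, 1, 0, 0, 1], [0, 0, 0, 1, 0]]
B = relabel(A, [4, 1, 2, 3, 0])  # the same graph with nodes a and e swapped

d_fixed = edit_distance(A, B)                                        # labelling-dependent
d_best = min(edit_distance(A, relabel(B, h)) for h in permutations(range(5)))
d_spec = spectral_distance(A, B)                                     # labelling-free
print(d_fixed, d_best, d_spec)
```

The search over mappings grows factorially with the number of nodes, whereas the spectral distance needs only one eigenvalue computation per graph and no node matching at all.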

Page 40: Considerate Approaches to ABC Model Selection

Protein Interaction Network Data

Species           Proteins  Interactions  Genome size  Sampling fraction
S. cerevisiae         5035         22118         6532               0.77
D. melanogaster       7506         22871        14076               0.53
H. pylori              715          1423         1589               0.45
E. coli               1888          7008         5416               0.35

[Figure: bar chart of model probability (0.0 to 0.5) for the models DA, DAC, LPA, SF, DACL and DACR, with one bar per organism: S. cerevisiae, D. melanogaster, H. pylori, E. coli.]

Model Selection
• Inference here was based on all the data, not summary statistics.
• Duplication models receive the strongest support from the data.
• Several models receive support and no model is chosen unambiguously.

Page 43: Considerate Approaches to ABC Model Selection

Protein Interaction Network Data

[Figure: panels for the parameters δ and α (models DA and DAC) and δ, α, p and m (models DACL and DACR), with curves for S. cerevisiae, D. melanogaster, H. pylori and E. coli.]

Page 44: Considerate Approaches to ABC Model Selection

Considerate Use of ABC

• ABC is a tool for situations where conventional statistical approaches fail or are too cumbersome.

• If all the data are used then this is (relatively) unproblematic; if the data are compressed/corrupted then caution is required.

• Some of the issues arising in ABC mirror those also encountered in "conventional" statistics:

    "Any Bayesian inference uses the data only via the minimal sufficient statistic. This is because the calculation of the posterior distribution involves multiplying the likelihood by the prior and normalizing. Any factor of the likelihood that is a function of y alone will disappear after normalization."
    D. Cox (2006).

• In other cases it seems prudent to accept the additional (and considerable) computational cost of constructing suitable summary statistics (such as in Barnes et al., Stat&Comp 2012).

Page 45: Considerate Approaches to ABC Model Selection

Acknowledgements
