2019 SISG Module 8: Bayesian Statistics for Genetics

Lecture 3: Binomial Sampling

Jon Wakefield

Departments of Statistics and Biostatistics, University of Washington
1 / 67
Outline
Introduction and Motivating Example
Bayesian Analysis of Binomial Data
  The Beta Prior
  Bayes Factors
Analysis of ASE Data
Conclusions
Introduction
Introduction
In this lecture we will consider the Bayesian modeling of binomial data.

Two motivations for a binomial model:
I a so-called allele specific expression (ASE) experiment will be considered;
I a time series of counts, in order to model the prevalence of a condition.

Conjugate priors will be described in detail.

Sampling from the posterior will be emphasized as a method for flexible inference.
Motivating Example: Allele Specific Expression
I Gene expression variation is an important contribution to phenotypic variation within and between populations.
I Expression variation may be due to genetic or environmental sources.
I Genetic variation may be due to cis- (local) or trans- (distant) acting mechanisms.
I Polymorphisms that act in cis affect expression in an allele-specific manner.
I RNA-Seq is a high-throughput technology that allows allele-specific expression (ASE) to be measured.
Motivating Example: An Example of ASE
I The data we consider are from yeast, from a controlled experiment in which two strains, BY and RM, are hybridized.
I Consider a gene with one exon and five SNPs within that exon.
I Suppose the BY allele of the gene is expressed at a high level.
I In contrast, the RM allele has a mutation in a transcription factor binding site upstream of the gene that greatly reduces expression of this allele.
I Then, in the mRNA isolated from the yeast, when we look just at this gene, there are many more BY mRNA molecules than RM mRNA molecules.
Example of ASE
[Figure omitted: schematic of transcription for the BY and RM alleles.]
Figure: In the top figure the transcription factor (blue) leads to high transcription. In the bottom figure an upstream polymorphism (red star) prevents the transcription factor from binding.
Specifics of ASE Experiment

Details of the data:
I Two “individuals” from genetically divergent yeast strains, BY and RM, are mated to produce a diploid hybrid.
I Three replicate experiments: same individuals, but separate samples of cells.
I Two technologies: Illumina and ABI SOLiD.
I Each of a few trillion cells is processed.
I Pre- and post-processing steps are followed by fragmentation to give millions of 200–400 base pair long molecules, with short reads obtained by sequencing.
I SNPs are needed since otherwise the reference sequences are identical and so we cannot tell which strain a read arises from.
I Strict criteria to call each read as a match are used, to reduce read-mapping bias.
I Data from 25,652 SNPs within 4,844 genes.
I More details in Skelly et al. (2011).
The Data
Table: First few rows of ASE data.
BY Count   Total Count   MLE θ̂
62         107           0.58
33         59            0.56
658        1550          0.42
14         61            0.23
57         153           0.37
218        451           0.48
10         19            0.53
...        ...           ...
Simple Approach to Testing for ASE

For a generic gene:
I Let N be the total number of counts at a particular gene, and Y the number of reads that map to the BY strain.
I Let θ be the probability of a map to BY.
I A simple approach is to assume

Y | θ ∼ Binomial(N, θ),

and carry out a test of H0 : θ = 0.5, which corresponds to no allele-specific expression.
I A non-Bayesian approach might use an exact test, i.e. enumerate the probability, under the null, of all the outcomes that are equal to or more extreme than that observed.
I Issues:
  I p-values are not uniform under the null, due to the discreteness of Y.
  I How to pick a threshold? In general, and when there are multiple tests.
  I Do we really want a point null, i.e. θ = 0.5?
I How would a Bayesian perform inference for this problem?
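The exact test just described can be sketched in a few lines of self-contained Python (the course materials use R; the helper name exact_test_pvalue below is purely illustrative):

```python
from math import comb

def exact_test_pvalue(y, N, theta0=0.5):
    """Two-sided exact binomial test: sum the null probabilities of every
    outcome that is no more likely than the observed count y."""
    probs = [comb(N, k) * theta0**k * (1 - theta0)**(N - k) for k in range(N + 1)]
    p_obs = probs[y]
    return sum(p for p in probs if p <= p_obs + 1e-12)

# First row of the ASE table: 62 BY reads out of 107 total.
print(round(exact_test_pvalue(62, 107), 3))
```

Because only N + 1 outcomes are possible, the attainable p-values form a finite set, which is exactly why they are not uniform under the null.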
[Figure omitted: histogram of the 4,844 exact-test p-values.]
Figure: p-values from 4,844 exact tests.
Motivating Example: Smoothing/Penalization
I When looking at estimates over space or time, we want to know if the differences we see are “real”, or simply reflect sampling variability.
I In data-sparse situations, when one expects similarity, smoothing local patterns (in time, space, or both) can be highly beneficial.
I This can equivalently be thought of as penalization, in which large deviations from “neighbors”, suitably defined, are discouraged.
I In the examples that follow we will generically think of modeling prevalence.
I We give an example of temporal modeling.
Motivation for Smoothing: Temporal Case
I Temporal setting: even if the underlying prevalence is the same over time, we will see differences in the empirical estimates.
I Figure 3 demonstrates this: we sampled binomial data with n = 10, 20, 200 and p = 0.2 (shown in blue) in all cases.
I In the top plot in particular, we might conclude there is large temporal variation, but all we are seeing is sampling variation.
I Figure 4 summarizes estimates from a second simulation in which there is a real temporal pattern – here we would not want to oversmooth and remove the trend.
I Later (Lecture 5) I will apply temporal smoothing models to these two sets of data.
[Figure omitted: three panels of prevalence estimates against time (months), for n1 = 10, n2 = 20 and n3 = 200.]
Figure: Prevalence estimates over time from simulated data with true prevalence of p = 0.2 (blue solid lines).
[Figure omitted: three panels of prevalence estimates against time (months), for n1 = 10, n2 = 20 and n3 = 200.]
Figure: Prevalence estimates over time from simulated data; the true prevalence corresponds to the curved blue solid line.
Bayesian Analysis of Binomial Data
Bayes Theorem Recap
I We derive the posterior distribution via Bayes theorem:

p(θ|y) = Pr(y|θ) × p(θ) / Pr(y).   (1)

I The denominator,

Pr(y) = ∫ Pr(y|θ) × p(θ) dθ = E[Pr(y|θ)],

is a normalizing constant that ensures the RHS of (1) integrates to 1 (we assume a continuous parameter θ).
I More colloquially:

Posterior ∝ Likelihood × Prior = Pr(y|θ) × p(θ),

since in considering the posterior we only need to worry about terms that depend on the parameter θ.
Overview of Bayesian Inference
Simply put, to carry out a Bayesian analysis one must specify a likelihood (a probability distribution for the data) and a prior (beliefs about the parameters of the model).

And then do some computation... and interpretation...

The approach is therefore model-based, in contrast to approaches in which only the mean and the variance of the data are specified (e.g., weighted least squares, quasi-likelihood).
Overview of Bayesian Inference
To carry out inference, integration is required, and a large fraction of the Bayesian research literature focusses on this aspect. Bayesian approaches to:
1. Estimation: marginal posterior distributions on parameters of interest.
2. Hypothesis testing: Bayes factors give the evidence in the data with respect to two or more hypotheses, and provide one approach.
3. Prediction: via the predictive distribution.

These three endeavors will now be described in the context of a binomial model.
Elements of Bayes Theorem for a Binomial Model

We assume independent responses with a common “success” probability θ.

In this case, the contribution of the data is through the binomial probability distribution:

Pr(Y = y|θ) = (N choose y) θ^y (1 − θ)^(N−y),   (2)

which tells us the probability of seeing Y = y, y = 0, 1, . . . , N, given the probability θ.

For fixed y, we may view (2) as a function of θ – this is the likelihood function.

The maximum likelihood estimate (MLE) is the value

θ̂ = y/N

that gives the highest probability to the observed data, i.e. maximizes the likelihood function.
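As a quick numerical check (a Python sketch, not part of the original slides), evaluating the likelihood on a grid recovers the MLE y/N:

```python
from math import comb

def binomial_likelihood(theta, y, N):
    # Equation (2) viewed as a function of theta for fixed data (y, N).
    return comb(N, y) * theta**y * (1 - theta)**(N - y)

y, N = 7, 10
grid = [i / 1000 for i in range(1, 1000)]
mle = max(grid, key=lambda t: binomial_likelihood(t, y, N))
print(mle)  # 0.7, i.e. y/N
```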
[Figure omitted: binomial probability distributions for N = 10, with θ = 0.5 (left) and θ = 0.3 (right).]
Figure: Binomial distributions for two values of θ with N = 10.
[Figure omitted: binomial likelihood functions of θ for N = 10, with y = 5 (left) and y = 3 (right).]
Figure: Binomial likelihoods for y = 5 (left) and y = 3 (right), with N = 10. The MLEs are indicated in red.
The Beta Distribution as a Prior Choice for Binomial θ
I Bayes theorem requires the likelihood, which we have already specified as binomial, and the prior.
I For a probability 0 < θ < 1 an obvious candidate prior is the uniform distribution on (0,1): but this is too restrictive in general.
I The beta distribution, Beta(a, b), is more flexible and so may be used for θ, with a and b specified in advance, i.e., a priori. The uniform distribution is the special case with a = b = 1.
I The form of the beta distribution is

p(θ) = [Γ(a + b) / (Γ(a)Γ(b))] θ^(a−1) (1 − θ)^(b−1)

for 0 < θ < 1, where Γ(·) is the gamma function [1].
I The distribution is valid [2] for a > 0, b > 0.

[1] Γ(z) = ∫₀^∞ t^(z−1) e^(−t) dt.
[2] A distribution is valid if it is non-negative and integrates to 1.
The Beta Distribution as a Prior Choice for Binomial θ
How can we think about specifying a and b?

For the normal distribution the parameters µ and σ² are just the mean and variance, but for the beta distribution a and b have no such simple interpretation.

The mean and variance are:

E[θ] = a / (a + b)
var(θ) = E[θ](1 − E[θ]) / (a + b + 1).

Hence, increasing a and/or b concentrates the distribution about the mean.

The quantiles, e.g. the median or the 10% and 90% points, are not available as a simple formula, but are easily obtained within software such as R using the function qbeta(p, a, b).
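These summaries are easy to check numerically. The sketch below uses only the Python standard library, approximating qbeta by sorting random draws (random.betavariate), so the quantile is Monte Carlo rather than exact:

```python
import random

def beta_mean(a, b):
    return a / (a + b)

def beta_var(a, b):
    m = beta_mean(a, b)
    return m * (1 - m) / (a + b + 1)

def beta_quantile(p, a, b, n=100_000, seed=1):
    # Monte Carlo stand-in for R's qbeta(p, a, b).
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(a, b) for _ in range(n))
    return draws[int(p * n)]

print(beta_mean(2, 3), round(beta_var(2, 3), 3))  # 0.4 0.04
print(beta_quantile(0.5, 1, 1))  # approximately 0.5, the median of the uniform
```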
[Figure omitted: six beta densities, for (a, b) = (1,1), (1,2), (1,5), (2,2), (4,2) and (5,5).]
Figure: Beta distributions, Beta(a, b); the red lines indicate the means.
Samples to Summarize Beta Distributions
Probability distributions can be investigated by generating samples and then examining histograms, moments and quantiles.

In Figure 8 we show histograms of samples from beta distributions for different choices of a and b.
[Figure omitted: histograms of random samples from six beta distributions, (a, b) = (1,1), (1,2), (1,5), (2,2), (4,2) and (5,5).]
Figure: Random samples from beta distributions; sample means as red lines.
Samples for Describing Weird Parameters
I So far the samples we have generated have produced summaries we can easily obtain anyway.
I But what about functions of the probability θ, such as the odds θ/(1 − θ)?
I Once we have samples for θ we can simply transform the samples to the functions of interest.
I We may have clearer prior opinions about the odds than about the probability.
[Figure omitted: histogram of the odds with θ drawn from a Beta(10,10).]
Figure: Samples from the prior on the odds θ/(1 − θ) with θ ∼ Beta(10, 10); the red line indicates the sample mean.
Issues with Uniformity

We might think that if we have little prior opinion about a parameter then we can simply assign a uniform prior, i.e. a prior

p(θ) ∝ const.

There are two problems with this strategy:
I We can’t be uniform on all scales since, if φ = g(θ), the prior for φ is

p_φ(φ) = p_θ(g⁻¹(φ)) × |dθ/dφ|,

the product of the prior for θ (evaluated at g⁻¹(φ)) and the Jacobian |dθ/dφ|; if g(·) is a nonlinear function, the Jacobian will be a function of φ and hence the prior for φ is not uniform.
I If the parameter is not on a finite range, an improper distribution will result (that is, the form will not integrate to 1). This can lead to an improper posterior distribution, and without a proper posterior we can’t do inference.
Are Priors Really Uniform?
I We illustrate the first (non-uniform on all scales) point.
I In the binomial example a uniform prior for θ seems a natural choice.
I But suppose we are going to model on the logistic scale, so that

φ = log( θ / (1 − θ) )

is a quantity of interest.
I A uniform prior on θ produces the very non-uniform distribution on φ in Figure 10.
I Not being uniform on all scales is not necessarily a problem, and is correct probabilistically, but one should be aware of this characteristic.
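A quick simulation makes the point concrete (a Python sketch; the induced density of φ here is the standard logistic distribution):

```python
import math
import random

rng = random.Random(0)
theta = [rng.random() for _ in range(100_000)]  # uniform Beta(1,1) draws
phi = [math.log(t / (1 - t)) for t in theta]    # induced log-odds

# Far from uniform: the draws pile up near phi = 0.
near_zero = sum(1 for p in phi if abs(p) < 1) / len(phi)
far_out = sum(1 for p in phi if 4 < abs(p) < 5) / len(phi)
print(round(near_zero, 2), round(far_out, 3))  # near 0.46 versus near 0.02
```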
[Figure omitted: histogram of the log-odds with θ drawn from a Beta(1,1).]
Figure: Samples from the prior on the log-odds φ = log[θ/(1 − θ)] with θ ∼ Beta(1, 1); the red line indicates the sample mean.
Posterior Derivation: The Quick Way
I When we want to identify a particular probability distribution we only need to concentrate on terms that involve the random variable.
I For example, if the random variable is X and we see a density of the form

p(x) ∝ exp(c1 x² + c2 x),

for constants c1 and c2, then we know that the random variable X must have a normal distribution.
Posterior Derivation: The Quick Way
I For the binomial-beta model we concentrate on terms that involve θ only.
I The posterior is

p(θ|y) ∝ Pr(y|θ) × p(θ)
       = θ^y (1 − θ)^(N−y) × θ^(a−1) (1 − θ)^(b−1)
       = θ^(y+a−1) (1 − θ)^(N−y+b−1).

I We recognize this as the important part of a Beta(y + a, N − y + b) distribution.
I We know what the normalizing constant must be, because we have a distribution which must integrate to 1.
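The conjugate update is therefore just parameter bookkeeping; a minimal sketch:

```python
def posterior_params(y, N, a, b):
    """Beta-binomial conjugate update: a Beta(a, b) prior combined with y
    successes out of N trials gives a Beta(y + a, N - y + b) posterior."""
    return y + a, N - y + b

print(posterior_params(7, 10, 1, 1))  # (8, 4)
```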
Posterior Derivation: The Long (Unnecessary) Way
I The posterior can also be calculated by keeping in all the normalizing constants:

p(θ|y) = Pr(y|θ) × p(θ) / Pr(y)
       = [1 / Pr(y)] × (N choose y) θ^y (1 − θ)^(N−y) × [Γ(a + b)/(Γ(a)Γ(b))] θ^(a−1) (1 − θ)^(b−1).   (3)

I The normalizing constant is

Pr(y) = ∫₀¹ Pr(y|θ) × p(θ) dθ
      = (N choose y) × [Γ(a + b)/(Γ(a)Γ(b))] × ∫₀¹ θ^(y+a−1) (1 − θ)^(N−y+b−1) dθ
      = (N choose y) × [Γ(a + b)/(Γ(a)Γ(b))] × [Γ(y + a)Γ(N − y + b)/Γ(N + a + b)].

I The integrand on the second line is a Beta(y + a, N − y + b) density, up to a normalizing constant, and so we know what this constant has to be.
Posterior Derivation: The Long (and Unnecessary) Way

I The normalizing constant is therefore:

Pr(y) = (N choose y) × [Γ(a + b)/(Γ(a)Γ(b))] × [Γ(y + a)Γ(N − y + b)/Γ(N + a + b)].

I This is a probability distribution, i.e. Σ_{y=0}^{N} Pr(y) = 1 with Pr(y) > 0.
I For a particular y value, this expression tells us the probability of that value given the model, i.e. the likelihood and prior we have selected: this will reappear later in the context of hypothesis testing.
I Substitution of Pr(y) into (3) and canceling the terms that appear in both the numerator and denominator gives the posterior:

p(θ|y) = [Γ(N + a + b)/(Γ(y + a)Γ(N − y + b))] θ^(y+a−1) (1 − θ)^(N−y+b−1),

which is a Beta(y + a, N − y + b).
The Posterior Mean: A Summary of the Posterior

I Recall that the mean of a Beta(a, b) is a/(a + b).
I The posterior mean of a Beta(y + a, N − y + b) is therefore

E[θ|y] = (y + a) / (N + a + b)
       = y/(N + a + b) + a/(N + a + b)
       = (y/N) × N/(N + a + b) + [a/(a + b)] × (a + b)/(N + a + b)
       = MLE × W + Prior Mean × (1 − W).

I The weight is

W = N / (N + a + b).

I As N increases, the weight W tends to 1, so that the posterior mean gets closer and closer to the MLE.
I Notice that the uniform prior a = b = 1 gives a posterior mean of

E[θ|y] = (y + 1) / (N + 2).
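The shrinkage identity above is easy to verify numerically (a Python sketch with hypothetical helper names):

```python
def posterior_mean(y, N, a, b):
    return (y + a) / (N + a + b)

def weighted_form(y, N, a, b):
    W = N / (N + a + b)  # weight on the MLE
    return (y / N) * W + (a / (a + b)) * (1 - W)

y, N, a, b = 7, 10, 1, 1
print(posterior_mean(y, N, a, b))  # 8/12, i.e. about 0.667
print(weighted_form(y, N, a, b))   # the same value, via the weighted form
```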
The Posterior Mode

I First, note that the mode of a Beta(a, b) is

mode(θ) = (a − 1) / (a + b − 2).

I As with the posterior mean, the posterior mode takes a weighted form:

mode(θ|y) = (y + a − 1) / (N + a + b − 2)
          = (y/N) × N/(N + a + b − 2) + [(a − 1)/(a + b − 2)] × (a + b − 2)/(N + a + b − 2)
          = MLE × W* + Prior Mode × (1 − W*).

I The weight is

W* = N / (N + a + b − 2).

I Notice that the uniform prior a = b = 1 gives a posterior mode of

mode(θ|y) = y/N,

the MLE. Which makes sense, right?
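The same numerical check works for the mode (a sketch; the formula requires y + a > 1 and N − y + b > 1):

```python
def posterior_mode(y, N, a, b):
    # Mode of the Beta(y + a, N - y + b) posterior.
    return (y + a - 1) / (N + a + b - 2)

# Under the uniform prior a = b = 1 the posterior mode is the MLE y/N.
print(posterior_mode(7, 10, 1, 1))  # 0.7
```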
Other Posterior Summaries
I We will rarely want to report a point estimate alone, whether it be a posterior mean or posterior median.
I Interval estimates are obtained in the obvious way.
I A simple way of performing testing of particular parameter values of interest is via examination of interval estimates.
I For example, does a 95% interval contain the value θ0 = 0.5?
Other Posterior Summaries
I In our beta-binomial running example, a 90% posterior credible interval (θL, θU) results from the points satisfying

0.05 = ∫₀^{θL} p(θ|y) dθ,
0.95 = ∫₀^{θU} p(θ|y) dθ.

I The quantiles of a beta are not available in closed form, but are easy to evaluate in R:

y <- 7; N <- 10; a <- b <- 1
qbeta(c(0.05, 0.5, 0.95), y + a, N - y + b)
[1] 0.4356258 0.6761955 0.8649245

I The 90% credible interval is (0.44, 0.86) and the posterior median is 0.68.
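Without R's qbeta to hand, the same interval can be approximated by sorting posterior draws; this Monte Carlo Python sketch reproduces the numbers above to about two decimal places:

```python
import random

def beta_credible_interval(y, N, a, b, level=0.90, n=100_000, seed=3):
    # Empirical quantiles of the Beta(y + a, N - y + b) posterior.
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(y + a, N - y + b) for _ in range(n))
    lo = draws[int((1 - level) / 2 * n)]
    med = draws[n // 2]
    hi = draws[int((1 + level) / 2 * n)]
    return lo, med, hi

lo, med, hi = beta_credible_interval(7, 10, 1, 1)
print(round(lo, 2), round(med, 2), round(hi, 2))  # roughly 0.44 0.68 0.86
```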
Prior Sensitivity
I For small datasets in particular, it is a good idea to examine the sensitivity of inference to the prior choice, particularly for those parameters for which there is little information in the data.
I An obvious way to identify such parameters is to compare the prior with the posterior, but experience often aids the process.
I Sometimes one may wish to specify a prior that reduces the impact of the prior on inference.
I In some situations, priors can be found that produce point and interval estimates that mimic a standard non-Bayesian analysis, i.e. have good frequentist properties.
I Such priors provide a baseline against which to compare analyses with more substantive priors.
I Other names for such priors are objective, reference and non-subjective.
I We now describe another approach to specification, via subjective priors.
Choosing a Prior, Approach One
I To select a beta prior, we need to specify two quantities, a and b.
I The posterior mean is

E[θ|y] = (y + a) / (N + a + b).

I Viewing the denominator as a sample size suggests a method for choosing a and b within the prior.
I We need to specify two numbers, but rather than a and b, which are difficult to interpret, we may specify the mean m_prior = a/(a + b) and the prior sample size N_prior = a + b.
I We then solve for a and b via

a = N_prior × m_prior,
b = N_prior × (1 − m_prior).

I Intuition: a is like a prior number of successes and b like the prior number of failures.
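In code this elicitation is two lines (a sketch; the helper name is hypothetical):

```python
def beta_from_mean_and_size(m_prior, n_prior):
    # Solve a = n_prior * m_prior, b = n_prior * (1 - m_prior).
    return n_prior * m_prior, n_prior * (1 - m_prior)

print(beta_from_mean_and_size(2 / 5, 5))  # (2.0, 3.0): 2 prior successes, 3 prior failures
```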
An Example
I Suppose we set N_prior = 5 and m_prior = 2/5.
I It is as if we saw 2 successes out of 5.
I Suppose we obtain data with N = 10 and y/N = 7/10.
I Hence W = 10/(10 + 5) and

E[θ|y] = (7/10) × 10/(10 + 5) + (2/5) × 5/(10 + 5) = 9/15 = 3/5.

I Solving:

a = N_prior × m_prior = 5 × 2/5 = 2,
b = N_prior × (1 − m_prior) = 5 × 3/5 = 3.

I This gives a Beta(y + a, N − y + b) = Beta(7 + 2, 3 + 3) posterior.
Beta Prior, Likelihood and Posterior
[Figure omitted: prior, likelihood and posterior densities for θ.]
Figure: The prior is Beta(2,3), the likelihood is proportional to θ⁷(1 − θ)³, and the posterior is Beta(7+2, 3+3).
Choosing a Prior, Approach Two
I An alternative convenient way of choosing a and b is to specify two quantiles for θ with associated (prior) probabilities.
I For example, we may wish Pr(θ < 0.1) = 0.05 and Pr(θ > 0.6) = 0.05.
I The values of a and b may be found numerically. For example, with quantiles q1, q2 and probabilities p1, p2, we may solve

      [p1 − Pr(θ < q1 | a, b)]² + [p2 − Pr(θ < q2 | a, b)]² = 0

  for a, b.
Figure: Beta(2.73, 5.67) prior with 5% and 95% quantiles highlighted.
43 / 67
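Approach two can be carried out numerically. Here is a hedged scipy sketch (the slides use R; `fit_beta_from_quantiles` is an illustrative name, not from the course code), matching the 5% and 95% conditions above:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import beta

def fit_beta_from_quantiles(q1, p1, q2, p2, start=(1.0, 1.0)):
    """Find (a, b) so that Pr(theta < q1) ~ p1 and Pr(theta < q2) ~ p2."""
    def loss(log_ab):
        a, b = np.exp(log_ab)  # optimize on the log scale so a, b stay positive
        return (p1 - beta.cdf(q1, a, b)) ** 2 + (p2 - beta.cdf(q2, a, b)) ** 2
    res = minimize(loss, np.log(start), method="Nelder-Mead",
                   options={"xatol": 1e-8, "fatol": 1e-12})
    return tuple(np.exp(res.x))

# Pr(theta < 0.1) = 0.05 and Pr(theta > 0.6) = 0.05, as on the slide
a, b = fit_beta_from_quantiles(0.1, 0.05, 0.6, 0.95)
print(round(a, 2), round(b, 2))  # close to the slide's Beta(2.73, 5.67)
```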
![Page 44: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/44.jpg)
Bayesian Sequential Updating
I We show how probabilistic beliefs are updated as we receive more data.
I Suppose the data arrive sequentially via two experiments:
  1. Experiment 1: (y1, N1).
  2. Experiment 2: (y2, N2).
I Prior 1: θ ∼ beta(a, b).
I Likelihood 1: y1|θ ∼ binomial(N1, θ).
I Posterior 1: θ|y1 ∼ beta(a + y1, b + N1 − y1).
I This posterior forms the prior for experiment 2.
I Prior 2: θ ∼ beta(a*, b*) where a* = a + y1, b* = b + N1 − y1.
I Likelihood 2: y2|θ ∼ binomial(N2, θ).
I Posterior 2: θ|y1, y2 ∼ beta(a* + y2, b* + N2 − y2).
I Substituting for a*, b*:

      θ|y1, y2 ∼ beta(a + y1 + y2, b + N1 − y1 + N2 − y2).
44 / 67
![Page 45: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/45.jpg)
Bayesian Sequential Updating
I Schematically:

      (a, b) → (a + y1, b + N1 − y1) → (a + y1 + y2, b + N1 − y1 + N2 − y2)

I Suppose we obtain the data in one go as y* = y1 + y2 successes from N* = N1 + N2 trials.
I The posterior is

      θ|y* ∼ beta(a + y*, b + N* − y*),

  which is the same as when the data are received in two separate batches.
45 / 67
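The sequential-equals-batch identity is easy to check directly; a small sketch with made-up counts (the specific y and N values are just for illustration):

```python
# Sequential conjugate updating matches a single batch update.
a, b = 1, 1            # Beta(1, 1) prior
y1, N1 = 3, 10         # experiment 1 (illustrative counts)
y2, N2 = 5, 12         # experiment 2

a1, b1 = a + y1, b + N1 - y1        # posterior 1, which becomes prior 2
a2, b2 = a1 + y2, b1 + N2 - y2      # posterior 2, updated in sequence

a_batch = a + (y1 + y2)             # one update with the pooled data
b_batch = b + (N1 + N2) - (y1 + y2)

print((a2, b2) == (a_batch, b_batch))  # True
```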
![Page 46: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/46.jpg)
Predictive Distribution
I Suppose we see y successes out of N trials, and now wish to obtain a predictive distribution for a future experiment with M trials.
I Let Z = 0, 1, . . . , M be the number of successes.
I Predictive distribution:

      Pr(z|y) = ∫₀¹ p(z, θ|y) dθ
              = ∫₀¹ Pr(z|θ, y) p(θ|y) dθ
              = ∫₀¹ Pr(z|θ) × p(θ|y) dθ,
                    binomial    posterior

  where we move between lines 2 and 3 because z is conditionally independent of y given θ.
46 / 67
![Page 47: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/47.jpg)
Predictive Distribution

Continuing with the calculation:

      Pr(z|y) = ∫₀¹ Pr(z|θ) × p(θ|y) dθ

              = ∫₀¹ (M choose z) θ^z (1 − θ)^(M−z)
                    × [Γ(N + a + b) / (Γ(y + a) Γ(N − y + b))] θ^(y+a−1) (1 − θ)^(N−y+b−1) dθ

              = (M choose z) [Γ(N + a + b) / (Γ(y + a) Γ(N − y + b))]
                    × ∫₀¹ θ^(y+a+z−1) (1 − θ)^(N−y+b+M−z−1) dθ

              = (M choose z) [Γ(N + a + b) / (Γ(y + a) Γ(N − y + b))]
                    × [Γ(a + y + z) Γ(b + N − y + M − z) / Γ(a + b + N + M)]

for z = 0, 1, . . . , M.

A likelihood approach would take the predictive distribution as binomial(M, θ̂) with θ̂ = y/N: this does not account for estimation uncertainty.

In general, we have sampling uncertainty (which we can't get away from) and estimation uncertainty.
47 / 67
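The closed form above is the beta-binomial distribution. A scipy sketch checking the algebra, using the y = 2, N = 20, M = 10, a = b = 1 setting shown in the next figure:

```python
import numpy as np
from scipy.special import comb, gammaln
from scipy.stats import betabinom, binom

y, N, M, a, b = 2, 20, 10, 1, 1

def predictive_pmf(z):
    """Pr(z|y) from the derivation above, computed on the log scale."""
    logp = (np.log(comb(M, z))
            + gammaln(N + a + b) - gammaln(y + a) - gammaln(N - y + b)
            + gammaln(a + y + z) + gammaln(b + N - y + M - z) - gammaln(a + b + N + M))
    return np.exp(logp)

z = np.arange(M + 1)
bayes = predictive_pmf(z)
plug_in = binom.pmf(z, M, y / N)  # likelihood approach: plug in theta-hat = y/N

# The derived form is exactly a BetaBinomial(M, a + y, b + N - y) pmf
print(np.allclose(bayes, betabinom.pmf(z, M, a + y, b + N - y)))  # True

var = lambda p: (p * z**2).sum() - (p * z).sum() ** 2
print(var(bayes) > var(plug_in))  # True: the Bayesian predictive is more dispersed
```

The extra dispersion of the Bayesian predictive relative to the plug-in binomial is exactly the estimation uncertainty that the likelihood approach ignores.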
![Page 48: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/48.jpg)
Predictive Distribution
Figure: Likelihood and Bayesian predictive distributions of seeing z = 0, 1, . . . , M = 10 successes, after observing y = 2 successes out of N = 20 trials (with a = b = 1).
48 / 67
![Page 49: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/49.jpg)
Predictive Distribution
The posterior and sampling distributions won't usually combine so conveniently.

In general, we may form a Monte Carlo estimate of the predictive distribution:

      p(z|y) = ∫ p(z|θ) p(θ|y) dθ
             = E_{θ|y}[p(z|θ)]
             ≈ (1/S) Σ_{s=1}^S p(z|θ^(s)),

where θ^(s) ∼ p(θ|y), s = 1, . . . , S, is a sample from the posterior.

This provides an estimate of the predictive distribution at the point z.
49 / 67
![Page 50: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/50.jpg)
Predictive Distribution
I Alternatively, we may sample from p(z|θ^(s)) a large number of times to reconstruct the predictive distribution.
I First sample from the posterior:

      θ^(s)|y ∼ p(θ|y).

I Next sample from the likelihood:

      z^(s)|θ^(s) ∼ p(z|θ^(s)),

  for s = 1, . . . , S.
I This gives a sample z^(1), . . . , z^(S) from the predictive distribution; the result is illustrated to the right.
Figure: Sampling version of the prediction in Figure 13, based on S = 10,000 samples.
50 / 67
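The two-step sampling scheme above takes only a few lines (again a Python sketch rather than the course's R, with the same y = 2, N = 20, M = 10, a = b = 1 setting as the earlier predictive example):

```python
import numpy as np

rng = np.random.default_rng(0)
y, N, M, a, b = 2, 20, 10, 1, 1
S = 10_000

theta = rng.beta(a + y, b + N - y, size=S)  # step 1: draw from the posterior
z = rng.binomial(M, theta)                  # step 2: one z per posterior draw

# Relative frequencies estimate the predictive distribution Pr(z|y)
est = np.bincount(z, minlength=M + 1) / S
print(est.round(3))
```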
![Page 51: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/51.jpg)
Difference in Binomial Proportions
I It is straightforward to extend the methods presented for a singlebinomial sample to a pair of samples.
I Suppose we carry out two binomial experiments:

      Y1|θ1 ∼ binomial(N1, θ1) for sample 1
      Y2|θ2 ∼ binomial(N2, θ2) for sample 2

I Interest focuses on θ1 − θ2, and often on examining the possibility that θ1 = θ2.
I With a sampling-based methodology, and independent beta priors on θ1 and θ2, it is straightforward to examine the posterior p(θ1 − θ2|y1, y2).
51 / 67
![Page 52: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/52.jpg)
Difference in Binomial Proportions
I Savage et al. (2008) give data on allele frequencies within a gene that has been linked with skin cancer.
I It is of interest to examine differences in allele frequencies between populations.
I We examine one SNP and extract data on Northern European (NE) and United States (US) populations.
I Let θ1 and θ2 be the allele frequencies in the NE and US populations from which the samples were drawn, respectively.
I The allele frequencies were 10.69% and 13.21%, with sample sizes of 650 and 265, in the NE and US samples, respectively.
I We assume independent Beta(1,1) priors on each of θ1 and θ2.
I The posterior probability that θ1 − θ2 is greater than 0 is 0.12 (computed as the proportion of the samples θ1^(s) − θ2^(s) that are greater than 0), so there is little evidence of a difference in allele frequencies between the NE and US samples.
52 / 67
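This two-sample calculation can be sketched as follows. The slides quote only the frequencies, so the counts below (69 of 650, 35 of 265) are approximations implied by those percentages:

```python
import numpy as np

rng = np.random.default_rng(1)
y1, N1 = 69, 650   # NE: ~10.69% of 650 (count approximated from the quoted frequency)
y2, N2 = 35, 265   # US: ~13.21% of 265
S = 100_000

# Independent Beta(1, 1) priors give independent beta posteriors
theta1 = rng.beta(1 + y1, 1 + N1 - y1, size=S)
theta2 = rng.beta(1 + y2, 1 + N2 - y2, size=S)

post_prob = np.mean(theta1 - theta2 > 0)
print(round(post_prob, 2))  # roughly 0.12, as reported on the slide
```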
![Page 53: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/53.jpg)
Binomial Two Sample Example
Figure: Histogram representations of p(θ1|y1), p(θ2|y2) and p(θ1 − θ2|y1, y2).The red line in the right plot is at the reference point of zero.
53 / 67
![Page 54: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/54.jpg)
Bayes Factors for Hypothesis Testing
I The Bayes factor provides a summary of the evidence for a particular hypothesis (model) as compared to another.
I The Bayes factor is

      BF = Pr(y|H0) / Pr(y|H1)

  and so is simply the probability of the data under H0 divided by the probability of the data under H1.
I Values of BF > 1 favor H0, while values of BF < 1 favor H1.
I Note the similarity to the likelihood ratio

      LR = Pr(y|H0) / Pr(y|θ̂),

  where θ̂ is the MLE under H1.
I If there are no unknown parameters in H0 and H1 (for example, H0 : θ = 0.5 versus H1 : θ = 0.3), then the Bayes factor is identical to the likelihood ratio.
54 / 67
![Page 55: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/55.jpg)
Calibration of Bayes Factors
I Kass and Raftery (1995) suggest intervals of Bayes factors for reporting:

      1/Bayes Factor   Evidence Against H0
      1 to 3.2         Not worth more than a bare mention
      3.2 to 20        Positive
      20 to 150        Strong
      >150             Very strong

I These provide a guideline, but should not be followed without question.
55 / 67
![Page 56: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/56.jpg)
Example: Bayes Factors for Binomial Data
For each gene in the ASE dataset we may be interested in H0 : θ = 0.5 versus H1 : θ ≠ 0.5.

The numerator and denominator of the Bayes factor are:

      Pr(y|H0) = (N choose y) 0.5^y 0.5^(N−y)

      Pr(y|H1) = ∫₀¹ (N choose y) θ^y (1 − θ)^(N−y)
                     × [Γ(a + b) / (Γ(a) Γ(b))] θ^(a−1) (1 − θ)^(b−1) dθ

               = (N choose y) [Γ(a + b) / (Γ(a) Γ(b))]
                     × [Γ(y + a) Γ(N − y + b) / Γ(N + a + b)]

We have already seen the denominator calculation, when we normalized the posterior.
56 / 67
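These two marginal likelihoods give the Bayes factor directly. A sketch on the log scale (the binomial coefficients cancel in the ratio, and `log_bayes_factor` is an illustrative name):

```python
import numpy as np
from scipy.special import gammaln

def log_bayes_factor(y, N, a=1.0, b=1.0):
    """log BF = log Pr(y|H0) - log Pr(y|H1) for H0: theta = 0.5."""
    log_m0 = N * np.log(0.5)  # the C(N, y) terms cancel in the ratio
    log_m1 = (gammaln(a + b) - gammaln(a) - gammaln(b)
              + gammaln(y + a) + gammaln(N - y + b) - gammaln(N + a + b))
    return log_m0 - log_m1

# y near N/2 supports the null (BF > 1); extreme y counts against it (BF < 1)
print(log_bayes_factor(10, 20) > 0, log_bayes_factor(0, 20) < 0)  # True True
```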
![Page 57: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/57.jpg)
Values Taken by the Negative Log Bayes Factor, as aFunction of y
Figure: Negative log Bayes factor as a function of y, for y|θ ∼ Binomial(20, θ), y = 0, 1, . . . , 20 and a = b = 1. High values indicate evidence against the null; horizontal lines mark the "bare mention", "positive", "strong" and "very strong" thresholds.
57 / 67
![Page 58: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/58.jpg)
Analysis of ASE Data
58 / 67
![Page 59: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/59.jpg)
Three Approaches to Inference for the ASE Data
1. Posterior Probabilities:
   I A simple approach to testing is to calculate the posterior probability that θ < 0.5.
   I We can then pick a threshold for indicating a gene worthy of further study, e.g. Pr(θ < 0.5|y) < 0.01 or Pr(θ < 0.5|y) > 0.99.
2. Bayes Factors:
   I Calculate the Bayes factor.
   I Pick a threshold for indicating a gene worthy of further study, e.g. the reciprocal of the Bayes factor is greater than 150.
3. Decision theory:
   I Place priors on the null and alternative hypotheses.
   I Calculate the posterior odds:

         Pr(H0|y) / Pr(H1|y) = [Pr(y|H0) / Pr(y|H1)] × [Pr(H0) / Pr(H1)]

         Posterior Odds = Bayes Factor × Prior Odds

   I Pick a threshold R, so that if the Posterior Odds < R we choose H1.
59 / 67
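The decision-theory step is just arithmetic on odds; a minimal sketch (the prior probability `pi0` = Pr(H0) is an assumption to be chosen by the analyst):

```python
def posterior_odds(bayes_factor, pi0=0.5):
    """Posterior odds of H0 versus H1 = Bayes factor x prior odds."""
    prior_odds = pi0 / (1 - pi0)
    return bayes_factor * prior_odds

# With even prior odds, the posterior odds equal the Bayes factor
print(posterior_odds(0.1))        # 0.1
print(posterior_odds(2.0, 0.2))   # 0.5: a BF favoring H0, offset by prior odds against it
```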
![Page 60: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/60.jpg)
Bayesian Analysis of the ASE Data
I Here we give a histogram of the posterior probabilities Pr(θ < 0.5|y); we see large numbers of genes have probabilities close to 0 and 1, indicating allele specific expression (ASE).
Figure: Histogram of 4,844 posteriorprobabilities of θ < 0.5.
60 / 67
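Each of the 4,844 posterior probabilities is just a Beta CDF evaluated at one-half; a sketch assuming the Beta(1,1) prior used in this analysis (the example counts are taken from the ASE output table later in the lecture):

```python
from scipy.stats import beta

def post_prob_lt_half(y, N, a=1, b=1):
    """Pr(theta < 0.5 | y) under a Beta(a, b) prior."""
    return beta.cdf(0.5, a + y, b + N - y)

# A gene with very few alternate-allele reads: mass essentially all below 0.5
print(post_prob_lt_half(6, 437) > 0.999)    # True
# A perfectly balanced gene sits at 0.5 by symmetry
print(round(post_prob_lt_half(10, 20), 2))  # 0.5
```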
![Page 61: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/61.jpg)
Bayesian Analysis of the ASE Data
I To the left we plot Pr(θ < 0.5|y) versus the p-values; the general pattern is what we would expect: small p-values have posterior probabilities close to 0 and 1.
I The weird lines are due to the discreteness of the data.
Figure: Posterior probabilities of θ < 0.5 andp-values from exact tests.
61 / 67
![Page 62: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/62.jpg)
Bayesian Analysis of the ASE Data
I Here we plot the negative log Bayes factor against Pr(θ < 0.5|y).
I Large values of the former correspond to strong evidence of ASE.
I Again we see agreement in inference, with large values of the negative log Bayes factor corresponding to Pr(θ < 0.5|y) close to 0 and 1.
Figure: Negative Log Bayes factor versus posteriorprobabilities of θ < 0.5.
62 / 67
![Page 63: 2019 SISG Module 8: Bayesian Statistics for Genetics ...faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf · 2.Hypothesis Testing:Bayes factorsgive the evidence in the](https://reader033.fdocuments.us/reader033/viewer/2022060405/5f0f26df7e708231d442be44/html5/thumbnails/63.jpg)
ASE Example
Applying a Bonferroni correction to control the family-wise error rate at 0.05 gives a p-value threshold of 0.05/4844 ≈ 10⁻⁵ and 111 rejections. More on this later!

There were 278 genes with Pr(θ < 0.5|y) < 0.01 and 242 genes with Pr(θ < 0.5|y) > 0.99.

Following the guideline of requiring very strong evidence, there were 197 genes with the reciprocal Bayes factor greater than 150.

Requiring less stringent evidence, i.e. strong and very strong (reciprocal BF greater than 20), there were 359 genes.

We later consider a formal decision theory approach to testing.

In this example, the rankings of the different approaches are similar, but the calibration, i.e., picking a threshold, is not straightforward.
ASE Output Data
I Below are some summaries from the ASE analysis – we order with respect to the variable logBFr, which is the reciprocal Bayes factor (so that high numbers correspond to strong evidence against the null).
I The postprob variable is the posterior probability of θ < 0.5.
allvals <- data.frame(Nsum, ysum, pvals, postprob, logBFr)
oBF <- order(-logBFr)
orderallvals <- allvals[oBF, ]
head(orderallvals)
     Nsum ysum         pvals     postprob   logBFr
4751  437    6 5.340324e-119 1.000000e+00 267.9572
4041  625   97  1.112231e-72 1.000000e+00 161.1355
2370  546  468  8.994944e-69 2.621622e-69 152.2517
2770  256  245  1.127211e-58 2.943484e-59 129.6198
tail(orderallvals)
     Nsum ysum     pvals  postprob    logBFr
824   761  382 0.9422103 0.4567334 -2.086604
2163  776  390 0.9142477 0.4429539 -2.091955
3153  769  384 1.0000000 0.5143722 -2.097079
2860 1076  546 0.6474878 0.3129473 -2.146555
Conclusions
Conclusions
Monte Carlo sampling provides flexibility of inference.
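As an illustration of that flexibility, a Python sketch: draw from the conjugate Beta posterior (assuming a Beta(1, 1) prior and the counts of one gene from the table) and summarize any functional of θ, here the odds θ/(1 − θ), which has no simple closed-form posterior.

```python
import numpy as np

rng = np.random.default_rng(2019)

y, N, a, b = 97, 625, 1.0, 1.0  # gene 4041 from the table; Beta(1,1) prior assumed
theta = rng.beta(a + y, b + N - y, size=100_000)  # posterior draws

# Any functional of theta is now easy to summarize from the draws:
prob_lt_half = np.mean(theta < 0.5)  # Monte Carlo estimate of Pr(theta < 0.5 | y)
odds_interval = np.quantile(theta / (1 - theta), [0.025, 0.975])  # 95% interval
```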
All of this lecture considered binomial sampling, for which there is only a single parameter. For more parameters, prior specification and computation become more interesting...as we shall see.
Multiple testing is considered in Lecture 9.
For estimation with moderate to large sample sizes, conclusions from Bayesian and non-Bayesian approaches often coincide.
For testing it’s more complex, as discussed in Lecture 9.
Conclusions
Benefits of a Bayesian approach:
I Inference is based on probability and output is very intuitive.
I Framework is flexible, and so complex models can be built.
I Can incorporate prior knowledge!
I If the sample size is large, prior choice is less crucial.
Challenges of a Bayesian analysis:
I Require a likelihood and a prior, and inference is only as good as the appropriateness of these choices.
I Computation can be daunting, though software is becoming more user friendly and flexible; later we will describe and illustrate a number of approaches including INLA and Stan.
I One should be wary of the model becoming too complex – we have the technology to contemplate complicated models, but do the data support complexity?
References

Kass, R. and Raftery, A. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773–795.

Savage, S. A., Gerstenblith, M. R., Goldstein, A., Mirabello, L., Fargnoli, M. C., Peris, K., and Landi, M. T. (2008). Nucleotide diversity and population differentiation of the melanocortin 1 receptor gene, MC1R. BMC Genetics, 9, 31.

Skelly, D., Johansson, M., Madeoy, J., Wakefield, J., and Akey, J. (2011). A powerful and flexible statistical framework for testing hypotheses of allele-specific gene expression from RNA-Seq data. Genome Research, 21, 1728–1737.