Essential Statistics in Biology: Getting the Numbers Right
![Page 1: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/1.jpg)
Essential Statistics in Biology: Getting the Numbers Right
Raphael Gottardo
Clinical Research Institute of Montreal (IRCM)
[email protected]
http://www.rglab.org
![Page 2: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/2.jpg)
Day 1 2
Outline
•Exploratory Data Analysis
•1-2 sample t-tests, multiple testing
•Clustering
•SVD/PCA
•Frequentists vs. Bayesians
![Page 3: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/3.jpg)
PCA and SVD (Multivariate analysis)
![Page 4: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/4.jpg)
Day 1 - Section 4 4
Outline
•What is SVD? Mathematical definition
•Relation to Principal Component Analysis (PCA)
•Applications of PCA and SVD
•Illustration with gene expression data
![Page 5: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/5.jpg)
Day 1 - Section 4 5
SVD
Let X be a matrix of size m×n (m ≥ n) and rank r ≤ n; then we can decompose X as
$$X = U S V^T$$
where X is m×n, U is m×n, S is n×n, and V is n×n.
- U is the matrix of left singular vectors
- V is the matrix of right singular vectors
- S is a diagonal matrix whose diagonal entries are the singular values
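A minimal sketch in R of the decomposition, assuming a small simulated matrix (the matrix `X` and its dimensions are illustrative and not from the slides). Base R's `svd()` returns the singular values in `d` and the singular vectors in `u` and `v`.

```r
set.seed(1)
m <- 6; n <- 3
X <- matrix(rnorm(m * n), m, n)    # illustrative m x n matrix

s <- svd(X)                        # s$u: m x n, s$d: length n, s$v: n x n
X.rebuilt <- s$u %*% diag(s$d) %*% t(s$v)

max(abs(X - X.rebuilt))            # numerically zero
crossprod(s$u)                     # t(U) %*% U, close to the identity
crossprod(s$v)                     # t(V) %*% V, close to the identity
```

The singular values in `s$d` are returned in decreasing order, which is why the first singular vector carries the largest amplitude.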
![Page 7: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/7.jpg)
Day 1 - Section 4 7
SVD
Let X be a matrix of size m×n (m ≥ n) and rank r ≤ n; then we can decompose X as
$$X = U S V^T$$
The singular vectors give the directions and the singular values (the diagonal of S) give the amplitudes.
![Page 8: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/8.jpg)
Day 1 - Section 4 8
Relation to PCA
Assume that X is centered so that each column (variable) has mean zero; then $X^T X$ is (up to a constant) the empirical covariance matrix, and SVD is equivalent to PCA:
$$X^T X = V S^2 V^T$$
The columns of V (the rows of $V^T$) are the right singular vectors, or principal components.
- V: new variables
- $S^2$: variance
Gene expression: Eigengenes or eigenassays
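The equivalence between SVD and PCA can be checked numerically. The sketch below assumes simulated data with 50 observations in rows and 3 variables in columns (all names and numbers are illustrative) and compares `svd()` on the column-centered matrix with base R's `prcomp()`.

```r
set.seed(1)
X <- matrix(rnorm(50 * 3), 50, 3) %*% matrix(c(2, 1, 0, 0, 1, 0, 0, 0, 0.5), 3, 3)
Xc <- scale(X, center = TRUE, scale = FALSE)   # center each column

s  <- svd(Xc)
pc <- prcomp(X, center = TRUE, scale. = FALSE)

# Standard deviations of the PCs are the singular values over sqrt(m - 1)
cbind(svd = s$d / sqrt(nrow(Xc) - 1), prcomp = pc$sdev)

# Loadings agree up to the sign of each column
max(abs(abs(s$v) - abs(pc$rotation)))

# Eigenvalues of the covariance matrix are the squared singular values over m - 1
eigen(cov(X))$values - s$d^2 / (nrow(Xc) - 1)
```

Each singular vector is only defined up to sign, so the comparison of loadings uses absolute values.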
![Page 9: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/9.jpg)
Day 1 - Section 4 9
Applications of SVD and PCA
•Dimension reduction (simplify a dataset)
•Clustering
•Discriminant analysis
•Exploratory data analysis tool
•Find the most important signal in data
•2D projections
![Page 10: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/10.jpg)
Day 1 - Section 4 10
Toy example
### Singular values: s=(13.47,1.45)
### First group
set.seed(100)
x1<-rnorm(100,0,1)
y1<-rnorm(100,1,1)
var0.5<-matrix(c(1,-.5,-.5,.1),2,2)
data1<-t(var0.5%*%t(cbind(x1,y1)))
### Second group
set.seed(100)
x2<-rnorm(100,2,1)
y2<-rnorm(100,2,1)
var0.5<-matrix(c(1,.5,.5,1),2,2)
data2<-t(var0.5%*%t(cbind(x2,y2)))
data<-rbind(data1,data2)
### Lines through the origin along the first (col=2) and second (col=3) right singular vectors
svd1<-svd(data1)
plot(data1,xlab="x",ylab="y",xlim=c(-6,6),ylim=c(-6,6))
abline(coef=c(0,svd1$v[2,1]/svd1$v[1,1]),col=2)
abline(coef=c(0,svd1$v[2,2]/svd1$v[1,2]),col=3)
![Page 11: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/11.jpg)
Day 1 - Section 4 11
Toy example
### Singular values: s=(47.79,13.25)
svd2<-svd(data2)
plot(data2,xlab="x",ylab="y",xlim=c(-6,6),ylim=c(-6,6))
abline(coef=c(0,svd2$v[2,1]/svd2$v[1,1]),col=2)
abline(coef=c(0,svd2$v[2,2]/svd2$v[1,2]),col=3)
### Both groups together
svd<-svd(data)
plot(data,xlab="x",ylab="y",xlim=c(-6,6),ylim=c(-6,6))
abline(coef=c(0,svd$v[2,1]/svd$v[1,1]),col=2)
abline(coef=c(0,svd$v[2,2]/svd$v[1,2]),col=3)
![Page 12: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/12.jpg)
Day 1 - Section 4 12
Toy example
### Projection
data.proj<-svd$u%*%diag(svd$d)
svd.proj<-svd(data.proj)
plot(data.proj,xlab="x",ylab="y",xlim=c(-6,6),ylim=c(-6,6))
abline(coef=c(0,svd.proj$v[2,1]/svd.proj$v[1,1]),col=2)
### svd.proj$v[1,2]=0
abline(v=0,col=3)
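The projection `u %*% diag(d)` gives the coordinates of the data in the basis of the right singular vectors, which is the same as multiplying the data by V. A minimal sketch, assuming a small simulated data set (the names `X`, `proj.us` and `proj.xv` are illustrative):

```r
set.seed(100)
X <- cbind(rnorm(100, 2, 1), rnorm(100, 2, 1))   # illustrative 100 x 2 data

s <- svd(X)
proj.us <- s$u %*% diag(s$d)   # projection written as U S
proj.xv <- X %*% s$v           # same coordinates, computed as X V

max(abs(proj.us - proj.xv))    # numerically zero
# The projected columns are orthogonal, with squared lengths d^2
crossprod(proj.us)
```

Because the projected columns are orthogonal, the singular vectors of the projected data line up with the coordinate axes, which is consistent with the vertical line drawn with `abline(v=0,col=3)`.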
![Page 13: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/13.jpg)
Day 1 - Section 4 13
Toy example
s=(47.17,11.88)
New coordinates
Projected data
![Page 14: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/14.jpg)
Day 1 - Section 4 14
Toy example
### New data
set.seed(100)
x1<-rnorm(100,-1,1)
y1<-rnorm(100,1,1)
var0.5<-matrix(c(1,-.5,-.5,1),2,2)
data1<-t(var0.5%*%t(cbind(x1,y1)))
set.seed(100)
x2<-rnorm(100,1,1)
y2<-rnorm(100,1,1)
var0.5<-matrix(c(1,.5,.5,1),2,2)
data2<-t(var0.5%*%t(cbind(x2,y2)))
data<-rbind(data1,data2)
svd1<-svd(data1)
plot(data1,xlab="x",ylab="y",xlim=c(-6,6),ylim=c(-6,6))
abline(coef=c(0,svd1$v[2,1]/svd1$v[1,1]),col=2)
abline(coef=c(0,svd1$v[2,2]/svd1$v[1,2]),col=3)
svd2<-svd(data2)
plot(data2,xlab="x",ylab="y",xlim=c(-6,6),ylim=c(-6,6))
abline(coef=c(0,svd2$v[2,1]/svd2$v[1,1]),col=2)
abline(coef=c(0,svd2$v[2,2]/svd2$v[1,2]),col=3)
svd<-svd(data)
plot(data,xlab="x",ylab="y",xlim=c(-6,6),ylim=c(-6,6))
abline(coef=c(0,svd$v[2,1]/svd$v[1,1]),col=2)
abline(coef=c(0,svd$v[2,2]/svd$v[1,2]),col=3)
![Page 15: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/15.jpg)
Day 1 - Section 4 15
Toy example
s=(26.48,24.98)
![Page 16: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/16.jpg)
Day 1 - Section 4 16
Application to microarrays
•Dimension reduction (simplify a dataset)
•Clustering (too many samples)
•Discriminant analysis (find a group of genes)
•Exploratory data analysis tool
•Find the most important signal in data
•2D projections (clusters?)
![Page 17: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/17.jpg)
Day 1 - Section 4 17
Application to microarrays
Cho cell cycle data set
384 genes
We have standardized the data
cho.data<-as.matrix(read.table("logcho_237_4class.txt",skip=1)[,3:19])
### Standardize each gene (row) to mean 0 and standard deviation 1
cho.mean<-apply(cho.data,1,"mean")
cho.sd<-apply(cho.data,1,"sd")
cho.data.std<-(cho.data-cho.mean)/cho.sd
svd.cho<-svd(cho.data.std)
### Contribution of each PC
barplot(svd.cho$d/sum(svd.cho$d),col=heat.colors(17))
### First three singular vectors (PCA)
plot(svd.cho$v[,1],xlab="time",ylab="Expression profile",type="b")
plot(svd.cho$v[,2],xlab="time",ylab="Expression profile",type="b")
plot(svd.cho$v[,3],xlab="time",ylab="Expression profile",type="b")
### Projection
plot(svd.cho$u[,1]*svd.cho$d[1],svd.cho$u[,2]*svd.cho$d[2],xlab="PCA 1 ",ylab="PCA 2")
plot(svd.cho$u[,1]*svd.cho$d[1],svd.cho$u[,3]*svd.cho$d[3],xlab="PCA 1 ",ylab="PCA 3")
plot(svd.cho$u[,2]*svd.cho$d[2],svd.cho$u[,3]*svd.cho$d[3],xlab="PCA 2 ",ylab="PCA 3")
### Select a cluster
ind<-(svd.cho$u[,2]*svd.cho$d[2])^2+(svd.cho$u[,3]*svd.cho$d[3])^2>5 & svd.cho$u[,2]*svd.cho$d[2]>0 & svd.cho$u[,3]*svd.cho$d[3]<0
plot(svd.cho$u[,2]*svd.cho$d[2],svd.cho$u[,3]*svd.cho$d[3],xlab="PCA 2 ",ylab="PCA 3")
points(svd.cho$u[ind,2]*svd.cho$d[2],svd.cho$u[ind,3]*svd.cho$d[3],col=2)
matplot(t(cho.data.std[ind,]),xlab="time",ylab="Expression profiles",type="l")
![Page 18: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/18.jpg)
Day 1 - Section 4 18
Application to microarrays
Singular values
Relative contribution
Why?
Main contribution
![Page 19: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/19.jpg)
Day 1 - Section 4 19
Application to microarrays
PC1
![Page 20: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/20.jpg)
Day 1 - Section 4 20
Application to microarrays
PC2
![Page 21: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/21.jpg)
Day 1 - Section 4 21
Application to microarrays
PC3
![Page 22: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/22.jpg)
Day 1 - Section 4 22
Application to microarrays
Projection onto PC1 and PC2
![Page 23: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/23.jpg)
Day 1 - Section 4 23
Application to microarrays
Projection onto PC1 and PC3
![Page 24: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/24.jpg)
Day 1 - Section 4 24
Application to microarrays
Projection onto PC2 and PC3
![Page 25: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/25.jpg)
Day 1 - Section 4 25
Application to microarrays
Projection onto PC2 and PC3
24 genes
![Page 27: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/27.jpg)
Day 1 - Section 4 27
Conclusion
•SVD is a powerful tool
•Can be very useful in gene expression data
•SVD of genes (eigen-genes)
•SVD of samples (eigen-assays)
•Mostly an EDA tool
![Page 28: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/28.jpg)
Overview of Statistical Inference: Bayes vs. Frequentists
(If time permits)
![Page 29: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/29.jpg)
Day 1 - Section 5 29
Introduction
•Parametric statistical model
•Observations y are drawn from a probability distribution $f(y|\theta)$, where $\theta$ is the parameter vector
Likelihood function → $L(\theta|y) = f(y|\theta)$ (Inverted density)
![Page 31: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/31.jpg)
Day 1 - Section 5 31
Introduction
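Normal distribution
Probability distribution for one observation $y_i$ is
$$f(y_i|\mu,\sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_i-\mu)^2}{2\sigma^2}\right)$$
If independence
$$f(y|\mu,\sigma^2) = \prod_{i=1}^{n} f(y_i|\mu,\sigma^2)$$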
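Under independence the likelihood is a product of normal densities, so the log-likelihood is a sum. The sketch below assumes 15 simulated observations from N(1,1) with known standard deviation 1 (the function `loglik` and the grid `mu.grid` are illustrative names) and evaluates the log-likelihood of the mean on a grid.

```r
set.seed(100)
y <- rnorm(15, mean = 1, sd = 1)   # 15 observations from N(1,1)

# Log-likelihood of mu for known sigma, using independence (sum of log densities)
loglik <- function(mu, y, sigma = 1) sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))

mu.grid <- seq(-1, 3, by = 0.01)
ll <- sapply(mu.grid, loglik, y = y)

plot(mu.grid, ll, type = "l", xlab = "mu", ylab = "log-likelihood")
abline(v = mean(y), col = 2)       # the maximum is at the sample mean

mu.grid[which.max(ll)]             # grid point closest to mean(y)
```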
![Page 33: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/33.jpg)
Day 1 - Section 5 33
Introduction
15 observations from N(1,1)
True probability distribution
![Page 34: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/34.jpg)
Day 1 - Section 5 34
Inference
•The parameters are unknown
•“Learn” something about the parameter vector θ from the data
•Make inference about θ
‣ Estimate θ
‣ Confidence region
‣ Test a hypothesis (θ=0)
![Page 35: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/35.jpg)
Day 1 - Section 5 35
The frequentist approach
•The parameters are fixed but unknown
•Inference is based on the relative frequency of occurrence when repeating the experiment
•For example, one can look at the variance of an estimator to evaluate its efficiency
![Page 36: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/36.jpg)
Day 1 - Section 5 36
The Normal Example: Estimation
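Normal distribution $N(\mu,\sigma^2)$
$\mu$ is the mean and $\sigma^2$ is the variance
$$\hat\mu = \bar y = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad \hat\sigma^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i-\bar y)^2$$
(Sample mean and sample variance)
Numerical example, 15 obs. from N(1,1)
Use the theory of repeated samples to evaluate the estimators.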
![Page 37: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/37.jpg)
Day 1 - Section 5 37
The Normal Example: Estimation
In our toy example, the data are normal, and we can derive the sampling distribution of the estimators.
For example, we know that the sample mean $\hat\mu$ is normal with mean $\mu$ and variance $\sigma^2/n$. The standard deviation of an estimator is called the standard error.
What if we can't derive the sampling distribution?
Use the bootstrap!
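The "repeated samples" idea can be simulated directly. The sketch below assumes the toy setting of 15 observations from N(1,1) and repeats the experiment many times (the number of repetitions `R` is an arbitrary choice) to compare the empirical standard deviation of the sample mean with the theoretical standard error.

```r
set.seed(100)
n <- 15; mu <- 1; sigma <- 1
R <- 10000                          # number of repeated experiments (illustrative)

mu.hat <- replicate(R, mean(rnorm(n, mu, sigma)))

mean(mu.hat)                        # close to mu
sd(mu.hat)                          # close to the standard error
sigma / sqrt(n)                     # theoretical standard error, about 0.258
```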
![Page 38: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/38.jpg)
Day 1 - Section 5 38
The Bootstrap
- Basic idea is to resample the data we have observed and compute a new value of the statistic/estimator for each resampled data set.
- Then one can assess the estimator by looking at the empirical distribution across the resampled data sets.
### Bootstrap standard error of the mean
set.seed(100)
x<-rnorm(15)
mu.hat<-mean(x)
sigma.hat<-sd(x)
B<-100
mu.hatNew<-rep(0,B)
for(i in 1:B){
  x.new<-sample(x,replace=TRUE)
  mu.hatNew[i]<-mean(x.new)
}
se<-sd(mu.hatNew)
### Same procedure for the median
set.seed(100)
x<-rnorm(15)
mu.hat<-mean(x)
sigma.hat<-sd(x)
B<-100
mu.hatNew<-rep(0,B)
for(i in 1:B){
  x.new<-sample(x,replace=TRUE)
  mu.hatNew[i]<-median(x.new)
}
se<-sd(mu.hatNew)
![Page 39: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/39.jpg)
Day 1 - Section 5 39
The Normal Example: CI
Confidence interval for the mean $\mu$:
$$\bar y \pm t_{n-1,\,1-\alpha/2}\,\frac{\hat\sigma}{\sqrt{n}}$$
The quantile $t_{n-1,\,1-\alpha/2}$ depends on n, but when n is large it is close to the normal quantile $z_{1-\alpha/2}$, and usually $z_{1-\alpha/2} \approx 1.96$ where $\alpha = 0.05$.
Numerical example, 15 obs. from N(1,1)
What does this mean?
> set.seed(100)
> x<-rnorm(15)
> t.test(x,mu=0)
One Sample t-test
data: x
t = 0.3487, df = 14, p-value = 0.7325
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -0.2294725 0.3185625
sample estimates:
mean of x
 0.044545
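The interval reported by `t.test()` can be reproduced by hand from the sample mean, the estimated standard error, and a t quantile. A minimal sketch, assuming the same simulated sample as on the slide (`se`, `tq` and `ci` are illustrative names):

```r
set.seed(100)
x <- rnorm(15)
n <- length(x)

se <- sd(x) / sqrt(n)               # estimated standard error of the mean
tq <- qt(0.975, df = n - 1)         # t quantile for a 95% interval
ci <- mean(x) + c(-1, 1) * tq * se

ci
t.test(x, mu = 0)$conf.int          # same interval from t.test
qnorm(0.975)                        # large-n limit of the quantile, about 1.96
```

Note that the hypothesized mean is passed to `t.test()` through the `mu` argument.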
![Page 40: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/40.jpg)
Day 1 - Section 5 40
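The Normal Example: Testing
Test a hypothesis about the mean: $H_0: \mu = \mu_0$
t-test
$$t = \frac{\bar y - \mu_0}{\hat\sigma/\sqrt{n}}$$
If $\mu = \mu_0$, t follows a t-distribution with n-1 degrees of freedom
p-value
$$p = P(|T_{n-1}| \ge |t|)$$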
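The t statistic and its two-sided p-value can be computed by hand and compared with `t.test()`. A minimal sketch, assuming a simulated sample of 15 standard normal observations and a hypothesized mean of 0 (`mu0`, `t.obs` and `p.val` are illustrative names):

```r
set.seed(100)
x <- rnorm(15)
n <- length(x)
mu0 <- 0                                     # hypothesized mean

t.obs <- (mean(x) - mu0) / (sd(x) / sqrt(n))
p.val <- 2 * pt(-abs(t.obs), df = n - 1)     # two-sided p-value

c(t = t.obs, p.value = p.val)
tt <- t.test(x, mu = mu0)                    # reference computation
c(t = unname(tt$statistic), p.value = tt$p.value)
```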
![Page 41: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/41.jpg)
Day 1 - Section 5 41
The Bayesian Approach
•Parametric statistical model
•Observations y are drawn from a probability distribution $f(y|\theta)$, where $\theta$ is the parameter vector
•The parameters are unknown but random
•The uncertainty on the parameter vector is modeled through a prior distribution $\pi(\theta)$
![Page 42: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/42.jpg)
Day 1 - Section 5 42
The Bayesian Approach
A Bayesian statistical model is made of
1. A parametric statistical model $f(y|\theta)$
2. A prior distribution $\pi(\theta)$
Q: How can we combine the two?
A: Bayes' theorem!
![Page 43: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/43.jpg)
Day 1 - Section 5 43
The Bayesian Approach
Bayes' theorem ↔ Inversion of probability
If A and E are events such that P(E)≠0 and P(A)≠0 then P(A|E) and P(E|A) are related by
$$P(A|E) = \frac{P(E|A)\,P(A)}{P(E)}$$
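A small numerical illustration of the inversion of probability. All probabilities below are hypothetical values chosen for illustration only; P(E) is obtained from the law of total probability.

```r
# Hypothetical numbers, for illustration only
p.A      <- 0.01    # P(A): prior probability of the event A
p.E.A    <- 0.95    # P(E | A)
p.E.notA <- 0.05    # P(E | not A)

p.E   <- p.E.A * p.A + p.E.notA * (1 - p.A)   # law of total probability
p.A.E <- p.E.A * p.A / p.E                    # Bayes' theorem

p.A.E                                         # about 0.16
```

Even though P(E|A) is high, P(A|E) stays modest here because A is rare a priori.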
![Page 44: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/44.jpg)
Day 1 - Section 5 44
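The Bayesian Approach
From prior to posterior:
$$\pi(\theta|y) = \frac{f(y|\theta)\,\pi(\theta)}{\int f(y|\theta)\,\pi(\theta)\,d\theta}$$
- $f(y|\theta)$: information on θ contained in the observation y
- $\pi(\theta)$: prior information
- $\int f(y|\theta)\,\pi(\theta)\,d\theta$: normalizing constant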
![Page 45: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/45.jpg)
Day 1 - Section 5 45
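The Bayesian Approach
Sequential nature of Bayes' theorem (for observations $y_1$ and $y_2$ that are independent given θ):
$$\pi(\theta|y_1,y_2) \propto f(y_2|\theta)\,\pi(\theta|y_1)$$
The posterior is the new prior!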
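The sequential property can be checked in a case where the posterior is available in closed form. The sketch below assumes a normal mean with known standard deviation and a normal prior (the helper `post.normal` and the prior values are illustrative, not from the slides): updating on all the data at once gives the same posterior as updating in two steps.

```r
# Posterior for a normal mean with known sigma and a N(m, v) prior (v is a variance)
post.normal <- function(y, m, v, sigma = 1) {
  n <- length(y)
  v.post <- 1 / (1 / v + n / sigma^2)
  m.post <- v.post * (m / v + sum(y) / sigma^2)
  c(mean = m.post, var = v.post)
}

set.seed(100)
y <- rnorm(15, 1, 1)

all.at.once <- post.normal(y, m = 0, v = 10)
step1 <- post.normal(y[1:5], m = 0, v = 10)
# The posterior after the first 5 observations is the prior for the next 10
step2 <- post.normal(y[6:15], m = step1[["mean"]], v = step1[["var"]])

all.at.once
step2
```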
![Page 46: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/46.jpg)
Day 1 - Section 5 46
The Bayesian Approach
Justifications:
•Actualization (updating) of the information about θ by extracting the information about θ from the data
•Conditions upon the observations (Likelihood principle)
•Avoids averaging over the unobserved values of y
•Provides a complete unified inferential scope
![Page 47: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/47.jpg)
Day 1 - Section 5 47
The Bayesian Approach
Practical aspect:
•Calculation of the normalizing constant can be difficult
•Conjugate priors (exact calculation is possible)
•Markov chain Monte Carlo
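Markov chain Monte Carlo only needs the posterior up to its normalizing constant. The sketch below is a basic random-walk Metropolis sampler for a normal mean with known standard deviation 1 and a normal prior; the prior values, proposal standard deviation, chain length and burn-in are all illustrative assumptions. The conjugate posterior is computed alongside as a check.

```r
set.seed(100)
y <- rnorm(15, 1, 1)                 # data, known sd = 1
mu0 <- 0; tau2 <- 10                 # illustrative N(mu0, tau2) prior

# Log of the unnormalized posterior: log-likelihood + log-prior
log.post <- function(theta) {
  sum(dnorm(y, theta, 1, log = TRUE)) + dnorm(theta, mu0, sqrt(tau2), log = TRUE)
}

n.iter <- 20000
theta <- numeric(n.iter)
theta[1] <- 0
for (i in 2:n.iter) {
  prop <- theta[i - 1] + rnorm(1, 0, 0.5)             # random-walk proposal
  log.ratio <- log.post(prop) - log.post(theta[i - 1])
  theta[i] <- if (log(runif(1)) < log.ratio) prop else theta[i - 1]
}
draws <- theta[-(1:2000)]            # discard burn-in

# Exact conjugate posterior, for comparison
post.var  <- 1 / (1 / tau2 + length(y))
post.mean <- post.var * (mu0 / tau2 + sum(y))
c(mcmc = mean(draws), exact = post.mean)
c(mcmc = var(draws), exact = post.var)
```

In practice MCMC is used when no closed form exists; the conjugate case is chosen here only because the answer can be verified.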
![Page 48: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/48.jpg)
Day 1 - Section 5 48
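The Bayesian Approach
Conjugate priors:
Example: Normal mean, one observation
$y|\theta \sim N(\theta,\sigma^2)$ and $\theta \sim N(\mu_0,\tau^2)$
Likelihood + prior → posterior:
$$\theta|y \sim N\left(\frac{\sigma^2\mu_0+\tau^2 y}{\sigma^2+\tau^2},\ \frac{\sigma^2\tau^2}{\sigma^2+\tau^2}\right)$$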
![Page 49: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/49.jpg)
Day 1 - Section 5 49
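The Bayesian Approach
Conjugate priors:
Example: Normal mean, n observations
$y_1,\dots,y_n|\theta \sim N(\theta,\sigma^2)$ and $\theta \sim N(\mu_0,\tau^2)$
Likelihood + prior → posterior:
$$\theta|y_1,\dots,y_n \sim N\left(\frac{\sigma^2\mu_0+n\tau^2\bar y}{\sigma^2+n\tau^2},\ \frac{\sigma^2\tau^2}{\sigma^2+n\tau^2}\right)$$
Shrinkage: the posterior mean is a weighted average of $\bar y$ and $\mu_0$, so the sample mean is pulled toward the prior mean.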
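Shrinkage can be made concrete for a normal mean with known variance and a normal prior. In the sketch below the symbols `sigma2` (sampling variance), `mu0` (prior mean) and `tau2` (prior variance) and their values are assumptions for illustration: the posterior mean is a weighted average of the sample mean and the prior mean, with a weight on the data that grows with n.

```r
sigma2 <- 1     # known sampling variance
mu0    <- 0     # prior mean (illustrative)
tau2   <- 0.5   # prior variance (illustrative)

set.seed(100)
y <- rnorm(15, 1, sqrt(sigma2))
n <- length(y)

w <- n * tau2 / (n * tau2 + sigma2)        # weight given to the data
post.mean <- w * mean(y) + (1 - w) * mu0
post.var  <- sigma2 * tau2 / (n * tau2 + sigma2)

c(ybar = mean(y), post.mean = post.mean, post.var = post.var)

# The weight on the data grows with n, so shrinkage matters most for few replicates
n.grid <- c(1, 2, 5, 15, 100)
round(n.grid * tau2 / (n.grid * tau2 + sigma2), 3)
```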
![Page 50: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/50.jpg)
Day 1 - Section 5 50
Introduction
15 observations from N(1,1)
[Figure: standardized likelihood, prior, and posterior]
![Page 53: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/53.jpg)
Day 1 - Section 5 53
Introduction
15 observations from N(1,1)
[Figure: standardized likelihood, prior, and posterior]
![Page 55: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/55.jpg)
Day 1 - Section 5 55
The Bayesian Approach
Criticism of the Bayesian choice:
•Many!
•Subjectivity of the prior (most critical)
•The prior distribution is the key to Bayesian inference
![Page 56: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/56.jpg)
Day 1 - Section 5 56
The Bayesian Approach
Response:
•Prior information is (almost) always available
•There is no such thing as the prior distribution
•The prior is a tool summarizing available information as well as the uncertainty related to this information
•The use of your prior is ok as long as you can justify it
![Page 57: Essential Statistics in Biology: Getting the Numbers Right](https://reader036.fdocuments.us/reader036/viewer/2022062309/56815887550346895dc5e8bf/html5/thumbnails/57.jpg)
Day 1 - Section 5 57
The Bayesian Approach
Bayesian statistics and Bioinformatics
•Make the best of available prior information
•Unified framework
•The prior information can be used to regularize noisy estimates (few replicates)
•Computationally demanding?