Standing Between a Bayesian and a Frequentist: An Empirical Bayes Exploration of Movies, Baseball, and Williams College
Arthur Berg
Pennsylvania State University
Introduction Bayes Estimation Empirical Bayes Books Books Summary
Bayesian and Frequentist Representatives
Rev. Thomas Bayes FRS (1702-1761), English mathematician and Presbyterian minister
$P(H \mid E) = \dfrac{P(E \mid H)\,P(H)}{P(E)}$
Sir Ronald Fisher FRS (1890-1962), English statistician, evolutionary biologist, and geneticist
"Let the data speak for itself."
Arthur Berg Standing Between a Bayesian and a Frequentist 2 / 27
Bayes Estimator as a Convex Combination
1st Goal: List the top 250 movies of all time.
Movies are rated on a scale of 1 to 10.
Some movies are rated by many people, and some by only a few.
Movies with fewer than 3000 votes are not considered.
The grand mean rating across all movies is C = 6.9.
⋆ $\mu_i$ represents the mean rating by everyone who has seen movie i.
⋆ The real goal is to construct the best estimate of $\mu_i$ for every movie, then pick the top 250.
The frequentist approach uses only $\bar{X}_i$, the observed average rating for movie i.
$\hat{\mu}_i^{(\mathrm{Fisher})} = \bar{X}_i$
The Bayesian approach shrinks $\bar{X}_i$ toward C, with more shrinking applied when the number of votes for movie i is small.
$\hat{\mu}_i^{(\mathrm{Bayes})} = \alpha_i \bar{X}_i + (1 - \alpha_i) C$, where $\alpha_i \in (0, 1)$
Internet Movie Database: Top 250

Rank  WR   R    Title                                    Votes
1     9.2  9.2  The Shawshank Redemption (1994)          546,155
2     9.1  9.2  The Godfather (1972)                     427,961
3     9.0  9.0  The Godfather: Part II (1974)            257,643
4     8.9  9.0  The Good, the Bad and the Ugly (1966)    170,045
5     8.9  9.0  Pulp Fiction (1994)                      436,456
6     8.9  8.9  Inception (2010)                         265,531
7     8.9  8.9  Schindler’s List (1993)                  289,170
8     8.9  8.9  12 Angry Men (1957)                      126,983
9     8.8  8.9  One Flew Over the Cuckoo’s Nest (1975)   225,419
10    8.8  8.9  The Dark Knight (2008)                   487,800
...
85    8.5  8.7  Black Swan (2010)                        20,326
...
142   8.2  8.3  Avatar (2009)                            285,005
...
240   8.0  8.5  True Grit (2010)                         6,444
IMDb Weighted Ranking—“a true Bayesian estimate”
$$WR_i = \frac{v_i R_i + m C}{v_i + m} = \underbrace{\frac{v_i}{v_i + m}}_{\alpha_i}\,\underbrace{R_i}_{\bar{X}_i} + \underbrace{\frac{m}{v_i + m}}_{1 - \alpha_i}\,C$$
▸ Ri = average rating of the movie i (Xi)
▸ vi = total number of votes from regular voters
▸ m = minimum # of votes to make the list = 3000
▸ C = grand mean across all movies in the database = 6.9
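The weighted ranking can be sketched in a few lines of Python. The function name `weighted_rating` is made up for this illustration; the defaults m = 3000 and C = 6.9 and the two example movies come from the slides.

```python
def weighted_rating(R, v, m=3000, C=6.9):
    """Shrink a movie's average rating R toward the grand mean C.

    R: average rating of the movie, v: number of votes,
    m: minimum number of votes to make the list, C: grand mean.
    """
    alpha = v / (v + m)              # weight on the movie's own average
    return alpha * R + (1 - alpha) * C


# (title, R, votes) copied from the Top 250 table
movies = [
    ("Black Swan (2010)", 8.7, 20326),
    ("True Grit (2010)", 8.5, 6444),
]

for title, R, v in movies:
    print(f"{title}: WR = {weighted_rating(R, v):.1f}")
```

With only 6,444 votes, True Grit is pulled from 8.5 down to about 8.0, while Black Swan, with roughly three times as many votes, moves only from 8.7 to about 8.5. Both agree with the WR column of the table.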
A Bayesian Calculation
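$X_i = (X_{i,1}, \ldots, X_{i,v_i})$ represents the $v_i$ ratings of movie i.
prior: $\mu_i \sim N(\mu_0, \sigma_0^2)$
conditional: $X_{i,j} \mid \mu_i \overset{\text{iid}}{\sim} N(\mu_i, \sigma^2)$, $j = 1, \ldots, v_i$
$$\hat{\mu}_i^{(\mathrm{Bayes})} = E[\mu_i \mid X_i] = \left(\frac{v_i}{v_i + \sigma^2/\sigma_0^2}\right)\bar{X}_i + \left(\frac{\sigma^2/\sigma_0^2}{v_i + \sigma^2/\sigma_0^2}\right)\mu_0$$
$$= \frac{v_i}{v_i + m} R_i + \frac{m}{v_i + m} C \;\Rightarrow\; \mu_0 = C,\quad m = \sigma^2/\sigma_0^2$$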
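As a sanity check on this identity, the sketch below computes the normal-normal posterior mean in two ways: in the usual precision-weighted form, and as the convex combination with m = σ²/σ0². The variance values are hypothetical, chosen only so that m = 3000; the function names are made up for this example.

```python
import math


def posterior_mean_precision(xbar, v, mu0, sigma2, sigma0_2):
    """Posterior mean as a precision-weighted average of prior mean and data."""
    prior_precision = 1.0 / sigma0_2
    data_precision = v / sigma2
    return (prior_precision * mu0 + data_precision * xbar) / (
        prior_precision + data_precision
    )


def posterior_mean_shrinkage(xbar, v, mu0, sigma2, sigma0_2):
    """The same posterior mean written as alpha * xbar + (1 - alpha) * mu0."""
    m = sigma2 / sigma0_2
    alpha = v / (v + m)
    return alpha * xbar + (1 - alpha) * mu0


# Hypothetical variances (not from the slides), chosen so that m = 3000.
sigma2, sigma0_2 = 3000.0, 1.0
xbar, v, mu0 = 8.5, 6444, 6.9        # True Grit's R and votes, with C as prior mean

a = posterior_mean_precision(xbar, v, mu0, sigma2, sigma0_2)
b = posterior_mean_shrinkage(xbar, v, mu0, sigma2, sigma0_2)
print(a, b, math.isclose(a, b))
```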
1. Does shrinking really help?
2. How much to shrink by?
Prediction Error = $\sum_{i=1}^{n} (\hat{\mu}_i - \mu_i)^2$
Standing Between a Bayesian and a Frequentist
▸ In 1956, Charles Stein proved that, under certain assumptions, an estimator better than the sample mean exists.
▸ In 1961, Willard James and Charles Stein explicitly constructed such an estimator.
The James-Stein Estimator (n ≥ 4)
$\mu_i \sim N(\mu_0, \sigma_0^2)$, $X_i \mid \mu_i \overset{\text{iid}}{\sim} N(\mu_i, \sigma^2)$, $i = 1, \ldots, n$
$$\hat{\mu}_i^{(\mathrm{Bayes})} = E[\mu_i \mid X_i] = \underbrace{\left(\frac{\sigma^2}{\sigma_0^2 + \sigma^2}\right)}_{\alpha}\mu_0 + \underbrace{\left(\frac{\sigma_0^2}{\sigma_0^2 + \sigma^2}\right)}_{1 - \alpha} X_i$$
$$\hat{\mu}_i^{(\mathrm{JS})} = \underbrace{\left(\frac{(n - 3)\sigma^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2}\right)}_{\alpha}\bar{X} + \underbrace{\left(1 - \frac{(n - 3)\sigma^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2}\right)}_{1 - \alpha} X_i$$
In practice, if σ2 is unknown, an estimate is used.
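A small simulation can illustrate the first question, whether shrinking really helps. The sketch below draws data from the normal-normal model above and compares the total squared error of the unshrunken X_i with that of the James-Stein estimator. The settings (n = 20, unit variances, 2000 replications, seed 0) are arbitrary choices for the illustration, and the estimator is the plain version from the slide with no positive-part correction.

```python
import numpy as np


def james_stein(x, sigma2):
    """James-Stein estimate that shrinks each x[i] toward the grand mean."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xbar = x.mean()
    alpha = (n - 3) * sigma2 / np.sum((x - xbar) ** 2)   # weight on the grand mean
    return alpha * xbar + (1 - alpha) * x


rng = np.random.default_rng(0)
n, reps = 20, 2000                  # arbitrary simulation settings
sigma2, sigma0_2 = 1.0, 1.0         # sampling and prior variances (assumed known)

err_mle = 0.0
err_js = 0.0
for _ in range(reps):
    mu = rng.normal(0.0, np.sqrt(sigma0_2), size=n)     # true means
    x = rng.normal(mu, np.sqrt(sigma2))                 # one observation per mean
    err_mle += np.sum((x - mu) ** 2)
    err_js += np.sum((james_stein(x, sigma2) - mu) ** 2)

print("prediction error ratio (JS / unshrunken):", err_js / err_mle)
```

A ratio below 1 means the shrunken estimates have smaller total squared error, averaged over the replications.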
Predicting Batting Averages
2nd Goal: Predict final batting averages from pre-season performances.
Pre-season batting averages for 18 major league players are provided.
Season final batting averages for the same players are also recorded.
The data are from the 1970 season and were published by Efron and Morris in JASA (1975) and Scientific American (1977).
The frequentist approach uses only $X_i$, the pre-season batting average for player i.
$\hat{p}_i^{(\mathrm{Fisher})} = X_i$
The Empirical Bayes approach shrinks $X_i$ toward the grand mean $\bar{X}$ by an empirically determined amount.
$\hat{p}_i^{(\mathrm{Stein})} = \alpha X_i + (1 - \alpha)\bar{X}$, where $\alpha \in (0, 1)$
 #  Name        hits/AB  pre-season ($\hat{\mu}^{(\mathrm{ML})}$)  season final ($\mu$)
 1  Clemente    18/45    0.400       0.346
 2  Robinson    17/45    0.378       0.298
 3  Howard      16/45    0.356       0.276
 4  Johnstone   15/45    0.333       0.222
 5  Berry       14/45    0.311       0.273
 6  Spencer     14/45    0.311       0.270
 7  Kessinger   13/45    0.289       0.263
 8  Alvarado    12/45    0.267       0.210
 9  Santo       11/45    0.244       0.269
10  Swoboda     11/45    0.244       0.230
11  Unser       10/45    0.222       0.264
12  Williams    10/45    0.222       0.256
13  Scott       10/45    0.222       0.303
14  Petrocelli  10/45    0.222       0.264
15  Rodriguez   10/45    0.222       0.226
16  Campaneris   9/45    0.200       0.286
17  Munson       8/45    0.178       0.316
18  Alvis        7/45    0.156       0.200
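The table is enough to try the shrinkage estimator on real numbers. The sketch below applies the James-Stein weight to the 18 pre-season averages and compares prediction errors against the season-final column. One assumption is made loudly here: the sampling variance σ² is not given in the slides, so a binomial plug-in, X̄(1 − X̄)/45, stands in for it. Efron and Morris worked on a transformed scale, so this is an approximation rather than their exact calculation.

```python
import numpy as np

# Hits in the first 45 at-bats and season-final averages, copied from the table
hits = np.array([18, 17, 16, 15, 14, 14, 13, 12, 11, 11, 10, 10, 10, 10, 10, 9, 8, 7])
at_bats = 45
final = np.array([0.346, 0.298, 0.276, 0.222, 0.273, 0.270, 0.263, 0.210, 0.269,
                  0.230, 0.264, 0.256, 0.303, 0.264, 0.226, 0.286, 0.316, 0.200])

x = hits / at_bats                       # pre-season averages X_i
n = x.size
xbar = x.mean()                          # grand mean, the shrinkage target

# ASSUMPTION: binomial plug-in for the common sampling variance sigma^2
sigma2 = xbar * (1 - xbar) / at_bats

# James-Stein weights: alpha is the weight kept on each player's own average
alpha = 1 - (n - 3) * sigma2 / np.sum((x - xbar) ** 2)
p_stein = alpha * x + (1 - alpha) * xbar

pe_fisher = np.sum((x - final) ** 2)     # prediction error of the raw averages
pe_stein = np.sum((p_stein - final) ** 2)

print(f"alpha = {alpha:.3f}")
print(f"prediction error ratio (Stein / Fisher) = {pe_stein / pe_fisher:.3f}")
```

Under this variance assumption each player keeps only about a fifth of the distance between his own average and the grand mean, and the shrunken predictions have a smaller total squared error than the raw pre-season averages.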
Batting Average Dataset
[Figure: 1977 Batting Averages Dataset (Efron). Batting average (y-axis, 0.0 to 0.4) for players 1 to 18 (x-axis); series: pre-season and season final.]
James-Stein Estimation of Batting Averages
[Figure: 1977 Batting Averages Dataset (Efron). Batting average (y-axis, 0.0 to 0.4) for players 1 to 18 (x-axis); series: pre-season and season final, plus one dash marker per player.]
Ranking Bias: Empirical Bayes + Order Statistics
▸ Genome-wide association studies
▸ SNPs: AA/Aa/aa or 0/1/2 (∼ 10^7)
▸ Estimated effects of the top SNPs are biased upward (the winner’s curse).
[Figure: 1977 Batting Averages Dataset (Efron). Batting average (y-axis, 0.0 to 0.4) for players 1 to 18 (x-axis); series: pre-season and season final.]
▸ Ranking bias estimator: part frequentist, part Bayesian, with robust properties
▸ Applied to 2 GWAS studies with 2,000 cases and 3,000 controls:
    Crohn’s Disease
    Type 1 Diabetes
Williams College Book Survey
In the summer of 2009, Williams faculty members were asked to list three books they felt that students should read.
150 faculty members responded.
25 departments are represented.
394 different books were recommended.
The original publication dates were added (from Wikipedia and openlibrary.org).
▶ Publication dates were approximated for the 13 books whose dates are unknown.
The Top Picks
Most Picked Authors (4+ hits)
▸ Fyodor Dostoyevsky (6)
    The Brothers Karamazov (4)
    Crime and Punishment (1)
    Notes from the Underground (1)
▸ Gabriel García Márquez (5)
    One Hundred Years of Solitude (5)
▸ Leo Tolstoy (5)
    Anna Karenina (4)
    War and Peace (1)
▸ Bill Bryson (4)
    A Short History of Nearly Everything (3)
    In a Sunburned Country (1)
▸ George Eliot (4)
    Middlemarch (4)
▸ Henry David Thoreau (4)
    Walden (4)
▸ Vladimir Nabokov (4)
    Speak, Memory (3)
    Lolita (1)

Most Picked Titles (3+ hits)
▸ One Hundred Years of Solitude (5)
▸ Anna Karenina (4)
▸ Middlemarch (4)
▸ The Brothers Karamazov (4)
▸ Walden (4)
▸ Independent People (3)
▸ Speak, Memory (3)
▸ The Death and Life of Great American Cities (3)
▸ The Things They Carried (3)
Average Publication Year Predictions
▸ Let $\mu_i$ represent the average publication year for department i, based on all books selected.
▸ Let $X_i$ be the average publication year for department i based on only the first book selected.
3rd Goal: Estimate µi with only Xi.
Observed Data: First Book (Red), “Truth”: All Books (Gray)
[Figure: average publication year (y-axis, 1200 to 2000) by department (x-axis), with numeric labels on the points. Departments, left to right: Classics, Asian Stud, Anth & Soc, Religion, Humanities, Political Sci, Philosophy, Geosciences, Music, Math & Stat, English, Art, Astronomy, Comp Sci, Psychology, History, Theater, Ger & Rus, Biology, Economics, Amer Stud, Physics, Comp Lit, Chemistry, Rom. Lang.]
Results
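$\mu_i \sim N(\mu_0, \sigma_0^2)$, $X_i \mid \mu_i \overset{\text{iid}}{\sim} N(\mu_i, \sigma_i^2)$, $i = 1, \ldots, 25$
Set
$$\sigma_i^2 = \frac{\frac{1}{n}\sum_{j=1}^{n} (X_j - \bar{X})^2}{n_i}$$
where $n_i$ = the number of observed books in department i.
1. $\hat{\mu}_i^{(1)} = X_i$
2. $\hat{\mu}_i^{(2)} = \alpha_i X_i + (1 - \alpha_i)\bar{X}$
3. $\hat{\mu}_i^{(3)} = \alpha_i X_i + (1 - \alpha_i)\tilde{X}$, where $\tilde{X}$ denotes the median of the $X_i$.
Prediction error: $\mathrm{pe}_j = \sum_{i=1}^{25} (\hat{\mu}_i^{(j)} - \mu_i)^2$
$\mathrm{pe}_2 / \mathrm{pe}_1 = 0.583$, $\mathrm{pe}_3 / \mathrm{pe}_1 = 0.543$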
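The slides do not spell out how the department-specific weights α_i are computed, so the following is only a sketch under an assumed form: the James-Stein weight with σ_i² plugged in for σ², clipped to [0, 1]. The five "departments" are hypothetical numbers invented for the illustration, not the survey data, and the function name `shrink_unequal` is made up.

```python
import numpy as np


def shrink_unequal(x, n_obs, target):
    """Shrink each x[i] toward `target` with a weight that depends on n_obs[i].

    sigma_i^2 follows the rule on the slide: (1/n) * sum((x - xbar)^2) / n_i.
    ASSUMPTION: alpha_i = 1 - (n - 3) * sigma_i^2 / sum((x - xbar)^2), clipped
    to [0, 1]; the slides do not give the exact formula for alpha_i.
    """
    x = np.asarray(x, dtype=float)
    n_obs = np.asarray(n_obs, dtype=float)
    n = x.size
    s = np.sum((x - x.mean()) ** 2)
    sigma2_i = (s / n) / n_obs
    alpha_i = np.clip(1 - (n - 3) * sigma2_i / s, 0.0, 1.0)
    return alpha_i * x + (1 - alpha_i) * target, alpha_i


# Hypothetical first-book averages and book counts for five departments
x = np.array([1850.0, 1960.0, 1400.0, 1990.0, 1925.0])
n_obs = np.array([1, 2, 1, 4, 2])

est_mean, alpha = shrink_unequal(x, n_obs, x.mean())         # estimator 2
est_median, _ = shrink_unequal(x, n_obs, np.median(x))       # estimator 3

print("alpha_i:           ", alpha)
print("shrunk to mean:    ", est_mean.round(1))
print("shrunk to median:  ", est_median.round(1))
```

Departments with fewer observed books get a smaller α_i and are pulled harder toward the target. Using the median as the target makes that target less sensitive to a single very old outlier, such as the 1400 entry in this toy input.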
James-Stein Shrinkage Toward the Median “Unequal Variances Case”
[Figure: average publication year (y-axis, 1200 to 2000) by department (x-axis), with numeric labels on the points plus one dot marker and one dash marker per department. Departments, left to right: Classics, Asian Stud, Anth & Soc, Religion, Humanities, Political Sci, Philosophy, Geosciences, Music, Math & Stat, English, Art, Astronomy, Comp Sci, Psychology, History, Theater, Ger & Rus, Biology, Economics, Amer Stud, Physics, Comp Lit, Chemistry, Rom. Lang.]
4th Goal: Investigate how the departments cluster based on the book survey.
Departments are classified into the following groups:
Natural Sciences: Astronomy, Biology, Chemistry, Geosciences, Physics
Social Sciences: American Studies, Anthropology & Sociology, Asian Studies, Economics, History, Political Science, Psychology
Formal Sciences: Computer Science, Mathematics & Statistics
Humanities: Art, Classics, Comparative Literature, English, German & Russian, Humanities, Music, Philosophy, Religion, Romance Languages, Theater
Departments Ranked by Publication Year
[Figure: publication year (y-axis, 1400 to 2000) by department (x-axis), with numeric labels on the points. Departments, left to right: Philosophy, Anth & Soc, Classics, Asian Stud, Political Sci, Religion, Math & Stat, Humanities, Astronomy, Geosciences, English, Economics, Comp Sci, Ger & Rus, Music, Art, Amer Stud, History, Psychology, Comp Lit, Theater, Rom. Lang, Physics, Biology, Chemistry.]
Distance Measures
▸ Author/Title data: Jaccard distance $= 1 - \dfrac{|A \cap B|}{|A \cup B|} = \dfrac{|A \cup B| - |A \cap B|}{|A \cup B|}$
▸ Year data: absolute value of the two-sample t-statistic (a non-metric distance measure)
Homework
Prove the Jaccard distance is a proper metric.
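Both distance measures are short to compute. In the sketch below, the two department book lists and the year samples are hypothetical inputs for illustration only, and the Welch (unequal-variance) form of the t-statistic is an assumption, since the slides do not say which two-sample version was used.

```python
from math import sqrt
from statistics import mean, variance


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two collections of authors or titles."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0                       # convention: two empty sets are identical
    return 1 - len(a & b) / len(union)


def abs_t_statistic(x, y):
    """|t| for two samples of publication years.

    ASSUMPTION: Welch's unequal-variance form of the two-sample t-statistic.
    """
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return abs(mean(x) - mean(y)) / se


# Hypothetical picks for two departments (titles borrowed from the top picks)
dept_a = {"Anna Karenina", "Middlemarch", "Walden"}
dept_b = {"Walden", "Lolita"}
print(jaccard_distance(dept_a, dept_b))          # one shared title out of four: 0.75

# Hypothetical publication-year samples
print(abs_t_statistic([1850, 1870, 1890], [1950, 1960, 1970]))
```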
Dendrogram of the Year Distances (Philosophy Removed)
[Figure: dendrogram with height axis from 0.0 to 2.0. Leaves, left to right: Chemistry, Anth & Soc, Math & Stat, Religion, Political Sci, Asian Stud, Classics, Economics, English, Humanities, Astronomy, Geosciences, Art, Music, Comp Sci, Ger & Rus, Psychology, Amer Stud, History, Biology, Physics, Comp Lit, Rom. Lang, Theater.]
Multidimensional Scaling of Author Distances
[Figure: two-dimensional scaling plot (x-axis ticks −0.5 to 0.5, y-axis ticks −0.6 to 0.6) with one labeled point for each of the 25 departments.]
Summary
▸ There are often multiple statistical approaches to a single problem.
▸ The complete statistician makes use of all available tools.
▸ When reporting the mean values of several related quantities, think about shrinkage!
Thank You!!
Williams.ArthurBerg.com