Ch15: Decision Theory & Bayesian Inference
15.1: INTRO:
We are back to some theoretical statistics:
1. Decision Theory – Make decisions in the presence of uncertainty
2. Bayesian Inference – Alternative to the traditional (“frequentist”) method
15.2: Decision Theory
New Terminology:
(true) state of nature = parameter θ
action a: choice based on the observation of data, or of a random variable X whose CDF depends on θ
(statistical) decision function d: data space → action space; the action taken is a = d(X)
loss function l(θ, a): the loss incurred when the state is θ and action a is taken
risk function = expected loss: R(θ, d) = E[l(θ, d(X))]

Example of a quadratic loss function:
l(θ, d(X)) = (θ − d(X))², where d(X) estimates θ;
R(θ, d) = E[(d(X) − θ)²] = Mean Square Error.
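As a sketch of the quadratic-loss risk above, the risk of a decision rule can be approximated by simulation. The model and rule here (X ~ N(θ, 1), d(X) = X) are illustrative assumptions, not from the notes:

```python
import random

# Illustrative sketch (assumed model): X ~ N(theta, 1),
# decision rule d(X) = X, quadratic loss l(theta, a) = (theta - a)^2.
# The risk R(theta, d) = E[(d(X) - theta)^2] is the Mean Square Error.
def estimated_risk(theta, d, n_sims=100_000, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        x = rng.gauss(theta, 1.0)      # observe the data X
        total += (d(x) - theta) ** 2   # quadratic loss for this draw
    return total / n_sims              # Monte Carlo estimate of the expected loss

risk = estimated_risk(theta=2.0, d=lambda x: x)
print(round(risk, 1))  # close to 1.0 = Var(X), since d(X) = X is unbiased
```

Since d(X) = X is unbiased here, its Mean Square Error equals the variance of X.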
Example: Game Theory. A = manager of an oil company vs B = opponent (Nature).
Situation: Is there any oil at a given location?
Each of the players A and B has the choice of 2 moves:
A has the choice between actions a₁, a₂ (to continue or to stop drilling);
B controls the choice between parameters θ₁, θ₂ (whether there is oil or not).
If A chooses action aᵢ and B chooses parameter θⱼ, then A pays B an amount l(θⱼ, aᵢ) (the loss function):

           a₁           a₂
θ₁     l(θ₁, a₁)    l(θ₁, a₂)
θ₂     l(θ₂, a₁)    l(θ₂, a₂)
15.2.1: Bayes & Minimax Rules
A “good decision” is one with smaller risk. What if
R(θ₁, d₁) < R(θ₁, d₂) and R(θ₂, d₁) > R(θ₂, d₂)? TROUBLE: neither decision rule dominates the other.
To get around this, use either a Minimax or a Bayes Rule:
• Minimax Rule (minimize the maximum risk): d* = argmin_d max_θ R(θ, d)
• Bayes Rule (minimize the Bayes risk): d* = argmin_d E_θ[R(θ, d)], where E_θ[R(θ, d)] is the Bayes risk (the risk averaged over a prior on θ).
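The two rules can disagree on the same problem. A sketch with made-up risk numbers for the TROUBLE case, where neither rule dominates (the risks and the prior are illustrative assumptions):

```python
# Illustrative sketch: two rules, two states, risks chosen so neither dominates.
risks = {                                  # risks[rule][state] = R(theta, d)
    "d1": {"theta1": 1.0, "theta2": 4.0},  # better when theta1 is true
    "d2": {"theta1": 3.0, "theta2": 2.0},  # better when theta2 is true
}
prior = {"theta1": 0.8, "theta2": 0.2}     # assumed prior, needed for the Bayes rule

# Minimax: smallest worst-case risk.
minimax_rule = min(risks, key=lambda d: max(risks[d].values()))
# Bayes: smallest prior-weighted average risk (Bayes risk).
bayes_rule = min(risks, key=lambda d: sum(prior[t] * r for t, r in risks[d].items()))

print(minimax_rule)  # d2 (max risk 3.0 vs 4.0)
print(bayes_rule)    # d1 (Bayes risk 0.8*1 + 0.2*4 = 1.6 vs 0.8*3 + 0.2*2 = 2.8)
```

With a prior that puts most mass on θ₁, the Bayes rule prefers d₁ while minimax still picks d₂.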
Classical Stat. vs Bayesian Stat.
Classical (or Frequentist): unknown but fixed parameters to be estimated from the data.
Bayesian: parameters are random variables; data and prior are combined to estimate the posterior.
[Diagram: Model + Data → Inference and/or Prediction, the same picture as above with Prior Information now feeding the Model.]
15.2.2: Posterior Analysis
Bayesians look at the parameter θ as a random variable with a prior distribution g(θ) and a posterior distribution
h(θ | X = x) ∝ f(x | θ) g(θ)   (posterior is proportional to likelihood × prior).

Theorem A: If d(x) = argmin_d E[l(θ, d(x)) | x], where
E[l(θ, d(x)) | x] = ∫ l(θ, d(x)) h(θ | x) dθ   (continuous case), or
E[l(θ, d(x)) | x] = Σ l(θ, d(x)) h(θ | x)   (discrete case),
is the posterior risk of an action a = d(x), i.e. its expected loss given the data, then d(x) is the Bayes rule.
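The proportionality h(θ | x) ∝ f(x | θ) g(θ) can be checked numerically on a grid. The model below is an illustrative assumption (Exponential(1) prior, Poisson likelihood), not one from the notes; here the posterior is a Gamma(4, 2), with mean 2:

```python
import math

# Illustrative sketch (assumed model): prior g(theta) = exp(-theta) on theta > 0,
# likelihood X | theta ~ Poisson(theta), observed x = 3.
# Posterior h(theta|x) ∝ f(x|theta) g(theta) ∝ theta^3 exp(-2 theta) = Gamma(4, 2).
step = 0.001
thetas = [i * step for i in range(1, 30_000)]          # grid over (0, 30)

x = 3
unnorm = [math.exp(-t) * t ** x / math.factorial(x)    # f(x | theta), Poisson pmf
          * math.exp(-t)                               # g(theta), Exponential(1) prior
          for t in thetas]
z = sum(unnorm) * step                                 # approximate normalizing integral
post_mean = sum(t * u for t, u in zip(thetas, unnorm)) * step / z
print(round(post_mean, 2))  # 2.0, the mean of Gamma(shape 4, rate 2)
```

Dividing by the grid sum z is exactly the "proportional" step: the normalizing constant never needs a closed form.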
15.2.3: Classification & Hypothesis Testing
Wish: classify an element as belonging to one of the classes partitioning a population of interest.
e.g. an utterance will be classified by a computer as one of the words in its dictionary via sound measurements.
Hypothesis testing can be seen as a classification matter with a constraint on the probability of misclassification (the probability of type I error).
Neyman–Pearson Lemma: Let d be a test for H₀: X ~ f(x | θ₀) vs H₁: X ~ f(x | θ₁) with acceptance region
f(x | θ₀) / f(x | θ₁) > c
and significance level α. Let d′ be another test at significance level α′ ≤ α. Then the power of d′ is less than or equal to the power of d.
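A sketch of the likelihood-ratio test in the lemma, for assumed illustrative densities (H₀: X ~ N(0, 1) vs H₁: X ~ N(1, 1); nothing here is from the notes):

```python
import math

# Illustrative sketch of a Neyman-Pearson likelihood-ratio test:
# H0: X ~ N(0, 1) vs H1: X ~ N(1, 1).
# Accept H0 when f(x | theta0) / f(x | theta1) > c.
def normal_pdf(x, mu):
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

def rejects_h0(x, c):
    likelihood_ratio = normal_pdf(x, 0.0) / normal_pdf(x, 1.0)  # = exp(0.5 - x)
    return likelihood_ratio <= c          # reject H0 when the ratio is small

# With c = 1 this rejects exactly when x >= 0.5, the midpoint between the two means.
print(rejects_h0(2.0, c=1.0))   # True  (ratio exp(-1.5) <= 1)
print(rejects_h0(-1.0, c=1.0))  # False (ratio exp(1.5) > 1)
```

For a normal location family the ratio is monotone in x, so the test reduces to a simple cutoff on x; the threshold c is tuned to hit the desired significance level α.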
15.2.4: Estimation
Theorem A: The Bayes rule estimate θ̂ is the mean of the posterior distribution:
Continuous case: θ̂ = ∫ θ h(θ | x) dθ
Discrete case: θ̂ = Σ θ h(θ | x)

Proof: Recall l(θ, θ̂) = (θ − θ̂)² is the square error loss. The posterior risk is
E[(θ − θ̂)² | X = x] = Var(θ | X = x) + (E[θ | X = x] − θ̂)²
= posterior variance (independent of θ̂) + squared bias,
which is minimized by θ̂ = E[θ | X = x]. Thus the Bayes rule is θ̂ = E[θ | X = x], the mean of the posterior distribution.
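Theorem A can be checked numerically: on a made-up discrete posterior (the probabilities below are illustrative assumptions), the posterior risk is minimized exactly at the posterior mean.

```python
# Illustrative sketch: the posterior risk E[(theta - that)^2 | x] is minimized
# at the posterior mean, as Theorem A states.
posterior = {0.2: 0.1, 0.4: 0.3, 0.6: 0.4, 0.8: 0.2}   # assumed h(theta | x)

post_mean = sum(t * p for t, p in posterior.items())    # E[theta | x]

def posterior_risk(estimate):
    # E[(theta - estimate)^2 | x] under the discrete posterior
    return sum(p * (t - estimate) ** 2 for t, p in posterior.items())

best = min((i / 100 for i in range(101)), key=posterior_risk)
print(round(post_mean, 2))  # 0.54
print(round(best, 2))       # 0.54: the grid minimizer matches the posterior mean
```

Because the risk is a quadratic in the estimate, the grid search has a unique minimizer, and it lands on E[θ | x].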
15.2.4: Estimation (example)
Example: A biased coin is thrown once. Let θ be the probability of heads. What is θ̂?
To reflect the fact that we have no idea about how biased the coin is, we'll put a uniform prior distribution on θ: g(θ) = 1, 0 ≤ θ ≤ 1.
Let X = 1 if a Head appears and X = 0 if a Tail appears. The distribution of X given θ is
f(x | θ) = θ if x = 1, and 1 − θ if x = 0,
and the posterior distribution is
h(θ | x) = f(x | θ) g(θ) / ∫₀¹ f(x | θ) · 1 dθ,
so h(θ | 1) = θ / ∫₀¹ θ dθ = 2θ and h(θ | 0) = (1 − θ) / ∫₀¹ (1 − θ) dθ = 2(1 − θ).
Finally, the Bayes estimate of θ is
θ̂ = ∫₀¹ θ h(θ | x) dθ = 2/3 if x = 1, and 1/3 if x = 0.
Caution: The classical MLEs are 1 and 0, different from Bayes'.
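The coin example can be verified numerically; a sketch that integrates the posterior mean on a grid (the grid size is an implementation choice, not from the notes):

```python
# Numeric check of the coin example: uniform prior, one Bernoulli observation.
# The Bayes (posterior-mean) estimate is 2/3 after a Head and 1/3 after a Tail.
def bayes_estimate(x, n_grid=100_000):
    step = 1.0 / n_grid
    thetas = [(i + 0.5) * step for i in range(n_grid)]     # midpoint grid on (0, 1)
    unnorm = [t if x == 1 else (1.0 - t) for t in thetas]  # f(x|theta) * g(theta), g = 1
    z = sum(unnorm) * step                                 # normalizing integral
    return sum(t * u for t, u in zip(thetas, unnorm)) * step / z

print(round(bayes_estimate(1), 3))  # 0.667 = 2/3 (the MLE would be 1)
print(round(bayes_estimate(0), 3))  # 0.333 = 1/3 (the MLE would be 0)
```

The Bayes estimates are pulled toward the prior mean 1/2, unlike the MLEs at the extremes 0 and 1.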
15.3.1: Bayesian Inference for the Normal Distribution
Theorem A: Assume θ ~ N(θ₀, σ₀²) and X | θ ~ N(θ, σ²). Then the posterior distribution of θ is
θ | X = x ~ N(θ₁, σ₁²), where
θ₁ = (θ₀/σ₀² + x/σ²) / (1/σ₀² + 1/σ²) and σ₁² = 1 / (1/σ₀² + 1/σ²).
Proof: read textbook page 589.
Note: the Posterior Mean θ₁ is a weighted average of the Prior Mean θ₀ and the Data x.
If the Experiment (Observation of X) is much more informative than the Prior Distribution (σ² ≪ σ₀²), then θ | X = x ≈ N(x, σ²).

How is the prior distribution altered by a random sample?
Theorem A (extended to a random sample): Assume θ ~ N(θ₀, σ₀²) and X₁, ..., Xₙ | θ iid N(θ, σ²). Then the posterior distribution of θ is
θ | X₁ = x₁, ..., Xₙ = xₙ ~ N(θ₁, σ₁²), where
θ₁ = (θ₀/σ₀² + n x̄/σ²) / (1/σ₀² + n/σ²) and σ₁² = 1 / (1/σ₀² + n/σ²).
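A sketch of the normal–normal update in the theorem above, with made-up numbers (prior θ ~ N(0, 4), data iid N(θ, 1); the values are illustrative assumptions):

```python
# Sketch of the normal-normal posterior update:
# theta1 = (theta0/sigma0^2 + n*xbar/sigma^2) / (1/sigma0^2 + n/sigma^2),
# sigma1^2 = 1 / (1/sigma0^2 + n/sigma^2).
def normal_posterior(theta0, sigma0_sq, sigma_sq, xs):
    n = len(xs)
    xbar = sum(xs) / n
    precision = 1 / sigma0_sq + n / sigma_sq               # = 1 / sigma1^2
    theta1 = (theta0 / sigma0_sq + n * xbar / sigma_sq) / precision
    return theta1, 1 / precision                           # posterior mean and variance

theta1, sigma1_sq = normal_posterior(theta0=0.0, sigma0_sq=4.0, sigma_sq=1.0,
                                     xs=[1.2, 0.8, 1.0, 1.4])
print(round(theta1, 3))     # 1.035: pulled toward xbar = 1.1, since the data dominate
print(round(sigma1_sq, 3))  # 0.235: smaller than both sigma0^2 = 4 and sigma^2/n = 0.25
```

Working in precisions (reciprocal variances) makes the weighted-average structure explicit: the posterior precision is the sum of the prior precision and the data precision.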
15.3.2: The Beta Dist'n is a conjugate prior to the Binomial
Definition: The Probability Density Function of the Beta Distribution with parameters a and b is
f(x | a, b) = [Γ(a + b) / (Γ(a) Γ(b))] x^(a−1) (1 − x)^(b−1), 0 < x < 1.
Theorem: If Y ~ Beta(a, b), then E(Y) = a / (a + b) and Var(Y) = ab / [(a + b)² (a + b + 1)].
Application: Assume p ~ Beta(a, b) (prior) and X | p ~ Bin(n, p). Then the posterior distribution of p is
p | X = x ~ Beta(a + x, b + n − x),
i.e. prior Beta(a, b) → posterior Beta(a + x, b + n − x).
Since E(p) = a / (a + b) and E(p | X = x) = (a + x) / (a + b + n), the posterior mean is a weighted average of the prior mean and the data.
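The conjugate update above is a one-line computation; a sketch with assumed numbers (prior Beta(2, 2), then x = 7 successes in n = 10 trials):

```python
# Sketch of the Beta-Binomial conjugate update:
# prior p ~ Beta(a, b), data X | p ~ Bin(n, p), posterior Beta(a + x, b + n - x).
def beta_binomial_update(a, b, n, x):
    return a + x, b + n - x

a, b = 2.0, 2.0                 # assumed prior, mean a/(a+b) = 0.5
n, x = 10, 7                    # observe 7 successes in 10 trials
a1, b1 = beta_binomial_update(a, b, n, x)

prior_mean = a / (a + b)        # 0.5
post_mean = a1 / (a1 + b1)      # (a + x) / (a + b + n) = 9/14
mle = x / n                     # 0.7
print(round(post_mean, 3))      # 0.643: between the prior mean 0.5 and the MLE 0.7
```

This shows the weighted-average claim concretely: the posterior mean sits between the prior mean and the observed frequency, and moves toward x/n as n grows.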