Ch15: Decision Theory & Bayesian Inference
15.1: INTRO:
We are back to some theoretical statistics:
1. Decision Theory – Make decisions in the presence of uncertainty
2. Bayesian Inference – Alternative to the traditional (“frequentist”) method
15.2: Decision Theory
New Terminology:
(true) state of nature = parameter θ
action a: choice based on the observation of data, or of a random variable X whose CDF depends on θ
(statistical) decision function d: data space → action space; the action taken is a = d(X)
loss function l(θ, a): the loss incurred when the state is θ and action a is taken
risk function = expected loss: R(θ, d) = E[l(θ, d(X))]

Example of a quadratic loss function:
l(θ, d(X)) = (θ − d(X))², where d(X) estimates θ;
R(θ, d) = E[(d(X) − θ)²] = Mean Square Error.
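As a sketch of the quadratic-loss risk above, the risk of a decision rule can be approximated by simulation. The model and rule here (X ~ N(θ, 1), d(X) = X) are illustrative assumptions, not from the notes:

```python
import random

# Illustrative sketch (assumed model): X ~ N(theta, 1),
# decision rule d(X) = X, quadratic loss l(theta, a) = (theta - a)^2.
# The risk R(theta, d) = E[(d(X) - theta)^2] is the Mean Square Error.
def estimated_risk(theta, d, n_sims=100_000, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        x = rng.gauss(theta, 1.0)      # observe the data X
        total += (d(x) - theta) ** 2   # quadratic loss for this draw
    return total / n_sims              # Monte Carlo estimate of the expected loss

risk = estimated_risk(theta=2.0, d=lambda x: x)
print(round(risk, 1))  # close to 1.0 = Var(X), since d(X) = X is unbiased
```

Since d(X) = X is unbiased here, its Mean Square Error equals the variance of X.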
Example: Game Theory. A = manager of an oil company vs B = opponent (Nature).
Situation: Is there any oil at a given location?
Each of the players A and B has the choice of 2 moves:
A has the choice between actions a₁, a₂ (to continue or to stop drilling);
B controls the choice between parameters θ₁, θ₂ (whether there is oil or not).
If A chooses action aᵢ and B chooses parameter θⱼ, then A pays B an amount l(θⱼ, aᵢ) (the loss function):

           a₁           a₂
θ₁     l(θ₁, a₁)    l(θ₁, a₂)
θ₂     l(θ₂, a₁)    l(θ₂, a₂)
15.2.1: Bayes & Minimax Rules
A “good decision” is one with smaller risk. What if
R(θ₁, d₁) < R(θ₁, d₂) and R(θ₂, d₁) > R(θ₂, d₂)? TROUBLE: neither decision rule dominates the other.
To get around this, use either a Minimax or a Bayes Rule:
• Minimax Rule (minimize the maximum risk): d* = argmin_d max_θ R(θ, d)
• Bayes Rule (minimize the Bayes risk): d* = argmin_d E_θ[R(θ, d)], where E_θ[R(θ, d)] is the Bayes risk (the risk averaged over a prior on θ).
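The two rules can disagree on the same problem. A sketch with made-up risk numbers for the TROUBLE case, where neither rule dominates (the risks and the prior are illustrative assumptions):

```python
# Illustrative sketch: two rules, two states, risks chosen so neither dominates.
risks = {                                  # risks[rule][state] = R(theta, d)
    "d1": {"theta1": 1.0, "theta2": 4.0},  # better when theta1 is true
    "d2": {"theta1": 3.0, "theta2": 2.0},  # better when theta2 is true
}
prior = {"theta1": 0.8, "theta2": 0.2}     # assumed prior, needed for the Bayes rule

# Minimax: smallest worst-case risk.
minimax_rule = min(risks, key=lambda d: max(risks[d].values()))
# Bayes: smallest prior-weighted average risk (Bayes risk).
bayes_rule = min(risks, key=lambda d: sum(prior[t] * r for t, r in risks[d].items()))

print(minimax_rule)  # d2 (max risk 3.0 vs 4.0)
print(bayes_rule)    # d1 (Bayes risk 0.8*1 + 0.2*4 = 1.6 vs 0.8*3 + 0.2*2 = 2.8)
```

With a prior that puts most mass on θ₁, the Bayes rule prefers d₁ while minimax still picks d₂.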
Classical Stat. vs Bayesian Stat.
Classical (or Frequentist): unknown but fixed parameters to be estimated from the data.
Bayesian: parameters are random variables; data and prior are combined to estimate the posterior.
[Diagram: Model + Data → Inference and/or Prediction, the same picture as above with Prior Information now feeding the Model.]
15.2.2: Posterior Analysis
Bayesians look at the parameter θ as a random variable with a prior distribution g(θ) and a posterior distribution
h(θ | X = x) ∝ f(x | θ) g(θ)   (posterior is proportional to likelihood × prior).

Theorem A: If d(x) = argmin_d E[l(θ, d(x)) | x], where
E[l(θ, d(x)) | x] = ∫ l(θ, d(x)) h(θ | x) dθ   (continuous case), or
E[l(θ, d(x)) | x] = Σ l(θ, d(x)) h(θ | x)   (discrete case),
is the posterior risk of an action a = d(x), i.e. its expected loss given the data, then d(x) is the Bayes rule.
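The proportionality h(θ | x) ∝ f(x | θ) g(θ) can be checked numerically on a grid. The model below is an illustrative assumption (Exponential(1) prior, Poisson likelihood), not one from the notes; here the posterior is a Gamma(4, 2), with mean 2:

```python
import math

# Illustrative sketch (assumed model): prior g(theta) = exp(-theta) on theta > 0,
# likelihood X | theta ~ Poisson(theta), observed x = 3.
# Posterior h(theta|x) ∝ f(x|theta) g(theta) ∝ theta^3 exp(-2 theta) = Gamma(4, 2).
step = 0.001
thetas = [i * step for i in range(1, 30_000)]          # grid over (0, 30)

x = 3
unnorm = [math.exp(-t) * t ** x / math.factorial(x)    # f(x | theta), Poisson pmf
          * math.exp(-t)                               # g(theta), Exponential(1) prior
          for t in thetas]
z = sum(unnorm) * step                                 # approximate normalizing integral
post_mean = sum(t * u for t, u in zip(thetas, unnorm)) * step / z
print(round(post_mean, 2))  # 2.0, the mean of Gamma(shape 4, rate 2)
```

Dividing by the grid sum z is exactly the "proportional" step: the normalizing constant never needs a closed form.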
15.2.3: Classification & Hypothesis Testing
Wish: classify an element as belonging to one of the classes partitioning a population of interest.
e.g. an utterance will be classified by a computer as one of the words in its dictionary via sound measurements.
Hypothesis testing can be seen as a classification matter with a constraint on the probability of misclassification (the probability of type I error).
Neyman–Pearson Lemma: Let d be a test for H₀: X ~ f(x | θ₀) vs H₁: X ~ f(x | θ₁) with acceptance region
f(x | θ₀) / f(x | θ₁) > c
and significance level α. Let d′ be another test at significance level α′ ≤ α. Then the power of d′ is less than or equal to the power of d.
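A sketch of the likelihood-ratio test in the lemma, for assumed illustrative densities (H₀: X ~ N(0, 1) vs H₁: X ~ N(1, 1); nothing here is from the notes):

```python
import math

# Illustrative sketch of a Neyman-Pearson likelihood-ratio test:
# H0: X ~ N(0, 1) vs H1: X ~ N(1, 1).
# Accept H0 when f(x | theta0) / f(x | theta1) > c.
def normal_pdf(x, mu):
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

def rejects_h0(x, c):
    likelihood_ratio = normal_pdf(x, 0.0) / normal_pdf(x, 1.0)  # = exp(0.5 - x)
    return likelihood_ratio <= c          # reject H0 when the ratio is small

# With c = 1 this rejects exactly when x >= 0.5, the midpoint between the two means.
print(rejects_h0(2.0, c=1.0))   # True  (ratio exp(-1.5) <= 1)
print(rejects_h0(-1.0, c=1.0))  # False (ratio exp(1.5) > 1)
```

For a normal location family the ratio is monotone in x, so the test reduces to a simple cutoff on x; the threshold c is tuned to hit the desired significance level α.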
15.2.4: Estimation
Theorem A: The Bayes rule estimate θ̂ is the mean of the posterior distribution:
Continuous case: θ̂ = ∫ θ h(θ | x) dθ
Discrete case: θ̂ = Σ θ h(θ | x)

Proof: Recall l(θ, θ̂) = (θ − θ̂)² is the square error loss. The posterior risk is
E[(θ − θ̂)² | X = x] = Var(θ | X = x) + (E[θ | X = x] − θ̂)²
= posterior variance (independent of θ̂) + squared bias,
which is minimized by θ̂ = E[θ | X = x]. Thus the Bayes rule is θ̂ = E[θ | X = x], the mean of the posterior distribution.
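Theorem A can be checked numerically: on a made-up discrete posterior (the probabilities below are illustrative assumptions), the posterior risk is minimized exactly at the posterior mean.

```python
# Illustrative sketch: the posterior risk E[(theta - that)^2 | x] is minimized
# at the posterior mean, as Theorem A states.
posterior = {0.2: 0.1, 0.4: 0.3, 0.6: 0.4, 0.8: 0.2}   # assumed h(theta | x)

post_mean = sum(t * p for t, p in posterior.items())    # E[theta | x]

def posterior_risk(estimate):
    # E[(theta - estimate)^2 | x] under the discrete posterior
    return sum(p * (t - estimate) ** 2 for t, p in posterior.items())

best = min((i / 100 for i in range(101)), key=posterior_risk)
print(round(post_mean, 2))  # 0.54
print(round(best, 2))       # 0.54: the grid minimizer matches the posterior mean
```

Because the risk is a quadratic in the estimate, the grid search has a unique minimizer, and it lands on E[θ | x].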
15.2.4: Estimation (example)
Example: A biased coin is thrown once. Let θ be the probability of heads. What is θ̂?
To reflect the fact that we have no idea about how biased the coin is, we'll put a uniform prior distribution on θ: g(θ) = 1, 0 ≤ θ ≤ 1.
Let X = 1 if a Head appears and X = 0 if a Tail appears. The distribution of X given θ is
f(x | θ) = θ if x = 1, and 1 − θ if x = 0,
and the posterior distribution is
h(θ | x) = f(x | θ) g(θ) / ∫₀¹ f(x | θ) · 1 dθ,
so h(θ | 1) = θ / ∫₀¹ θ dθ = 2θ and h(θ | 0) = (1 − θ) / ∫₀¹ (1 − θ) dθ = 2(1 − θ).
Finally, the Bayes estimate of θ is
θ̂ = ∫₀¹ θ h(θ | x) dθ = 2/3 if x = 1, and 1/3 if x = 0.
Caution: The classical MLEs are 1 and 0, different from Bayes'.
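The coin example can be verified numerically; a sketch that integrates the posterior mean on a grid (the grid size is an implementation choice, not from the notes):

```python
# Numeric check of the coin example: uniform prior, one Bernoulli observation.
# The Bayes (posterior-mean) estimate is 2/3 after a Head and 1/3 after a Tail.
def bayes_estimate(x, n_grid=100_000):
    step = 1.0 / n_grid
    thetas = [(i + 0.5) * step for i in range(n_grid)]     # midpoint grid on (0, 1)
    unnorm = [t if x == 1 else (1.0 - t) for t in thetas]  # f(x|theta) * g(theta), g = 1
    z = sum(unnorm) * step                                 # normalizing integral
    return sum(t * u for t, u in zip(thetas, unnorm)) * step / z

print(round(bayes_estimate(1), 3))  # 0.667 = 2/3 (the MLE would be 1)
print(round(bayes_estimate(0), 3))  # 0.333 = 1/3 (the MLE would be 0)
```

The Bayes estimates are pulled toward the prior mean 1/2, unlike the MLEs at the extremes 0 and 1.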
15.3.1: Bayesian Inference for the Normal Distribution
Theorem A: Assume θ ~ N(θ₀, σ₀²) and X | θ ~ N(θ, σ²). Then the posterior distribution of θ is
θ | X = x ~ N(θ₁, σ₁²), where
θ₁ = (θ₀/σ₀² + x/σ²) / (1/σ₀² + 1/σ²) and σ₁² = 1 / (1/σ₀² + 1/σ²).
Proof: read textbook page 589.
Note: the Posterior Mean θ₁ is a weighted average of the Prior Mean θ₀ and the Data x.
If the Experiment (Observation of X) is much more informative than the Prior Distribution (σ² ≪ σ₀²), then θ | X = x ≈ N(x, σ²).

How is the prior distribution altered by a random sample?
Theorem A (extended to a random sample): Assume θ ~ N(θ₀, σ₀²) and X₁, ..., Xₙ | θ iid N(θ, σ²). Then the posterior distribution of θ is
θ | X₁ = x₁, ..., Xₙ = xₙ ~ N(θ₁, σ₁²), where
θ₁ = (θ₀/σ₀² + n x̄/σ²) / (1/σ₀² + n/σ²) and σ₁² = 1 / (1/σ₀² + n/σ²).
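A sketch of the normal–normal update in the theorem above, with made-up numbers (prior θ ~ N(0, 4), data iid N(θ, 1); the values are illustrative assumptions):

```python
# Sketch of the normal-normal posterior update:
# theta1 = (theta0/sigma0^2 + n*xbar/sigma^2) / (1/sigma0^2 + n/sigma^2),
# sigma1^2 = 1 / (1/sigma0^2 + n/sigma^2).
def normal_posterior(theta0, sigma0_sq, sigma_sq, xs):
    n = len(xs)
    xbar = sum(xs) / n
    precision = 1 / sigma0_sq + n / sigma_sq               # = 1 / sigma1^2
    theta1 = (theta0 / sigma0_sq + n * xbar / sigma_sq) / precision
    return theta1, 1 / precision                           # posterior mean and variance

theta1, sigma1_sq = normal_posterior(theta0=0.0, sigma0_sq=4.0, sigma_sq=1.0,
                                     xs=[1.2, 0.8, 1.0, 1.4])
print(round(theta1, 3))     # 1.035: pulled toward xbar = 1.1, since the data dominate
print(round(sigma1_sq, 3))  # 0.235: smaller than both sigma0^2 = 4 and sigma^2/n = 0.25
```

Working in precisions (reciprocal variances) makes the weighted-average structure explicit: the posterior precision is the sum of the prior precision and the data precision.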
15.3.2: The Beta Dist'n is a conjugate prior to the Binomial
Definition: The Probability Density Function of the Beta Distribution with parameters a and b is
f(x | a, b) = [Γ(a + b) / (Γ(a) Γ(b))] x^(a−1) (1 − x)^(b−1), 0 < x < 1.
Theorem: If Y ~ Beta(a, b), then E(Y) = a / (a + b) and Var(Y) = ab / [(a + b)² (a + b + 1)].
Application: Assume p ~ Beta(a, b) (prior) and X | p ~ Bin(n, p). Then the posterior distribution of p is
p | X = x ~ Beta(a + x, b + n − x),
i.e. prior Beta(a, b) → posterior Beta(a + x, b + n − x).
Since E(p) = a / (a + b) and E(p | X = x) = (a + x) / (a + b + n), the posterior mean is a weighted average of the prior mean and the data.
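The conjugate update above is a one-line computation; a sketch with assumed numbers (prior Beta(2, 2), then x = 7 successes in n = 10 trials):

```python
# Sketch of the Beta-Binomial conjugate update:
# prior p ~ Beta(a, b), data X | p ~ Bin(n, p), posterior Beta(a + x, b + n - x).
def beta_binomial_update(a, b, n, x):
    return a + x, b + n - x

a, b = 2.0, 2.0                 # assumed prior, mean a/(a+b) = 0.5
n, x = 10, 7                    # observe 7 successes in 10 trials
a1, b1 = beta_binomial_update(a, b, n, x)

prior_mean = a / (a + b)        # 0.5
post_mean = a1 / (a1 + b1)      # (a + x) / (a + b + n) = 9/14
mle = x / n                     # 0.7
print(round(post_mean, 3))      # 0.643: between the prior mean 0.5 and the MLE 0.7
```

This shows the weighted-average claim concretely: the posterior mean sits between the prior mean and the observed frequency, and moves toward x/n as n grows.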