Week 8 Annotated


Transcript of Week 8 Annotated

Page 1: Week 8 Annotated

ACTL2002/ACTL5101 Probability and Statistics: Week 8

ACTL2002/ACTL5101 Probability and Statistics

© Katja Ignatieva

School of Risk and Actuarial Studies
Australian School of Business
University of New South Wales

[email protected]

Week 8
Probability: Week 1 Week 2 Week 3 Week 4

Estimation: Week 5 Week 6 Review

Hypothesis testing: Week 7 Week 9

Linear regression: Week 10 Week 11 Week 12

Video lectures: Week 1 VL Week 2 VL Week 3 VL Week 4 VL Week 5 VL

Page 2: Week 8 Annotated

First six weeks

- Introduction to probability;
- Moments: (non-)central moments, mean, variance (standard deviation), skewness & kurtosis;
- Special univariate (parametric) distributions (discrete & continuous);
- Joint distributions;
- Convergence, with applications LLN & CLT;
- Estimators (MME, MLE, and Bayesian);
- Evaluation of estimators;
- Interval estimation.

Page 3: Week 8 Annotated

Hypothesis testing

Overview

Hypothesis testing:
- Overview
- Exercise: testing a mean, variance unknown

Properties of hypothesis testing:
- Type I and Type II Errors, power of a test
- Example: type I and II errors and power
- p-value
- Relation between confidence intervals and hypothesis tests

Other tests:
- k-Sample tests
- Jarque-Bera test

Other testing methods:
- Introduction
- Conditional tests
- Bootstrap tests

Summary

Page 4: Week 8 Annotated

This & last week

Hypothesis testing:
- Null hypothesis: selection of the null hypothesis; simple v.s. composite.
- Statistical tests;
- Rejection region: Neyman-Pearson Lemma; uniformly most powerful; likelihood ratio test.
- Value of test;
- Accept the null or reject the null hypothesis.

Properties of hypothesis testing:
- Type I and II errors and power;
- p-value.

Using the asymptotic distribution of the LR for testing.

Page 5: Week 8 Annotated

Summary last week: Hypothesis tests

Testing procedure: when testing a hypothesis, use the following steps:

i. Define a statistical hypothesis. Note that this includes a significance level ($\alpha$);
ii. Define the test statistic $T$ (using past weeks' knowledge);
iii. Determine the rejection region $C^\star$;
iv. Calculate the value of the statistical test, given observed data $(x_1,\ldots,x_n)$;
v. Accept or reject $H_0$. Note: we assume that $H_0$ is true when testing! (see Type I and Type II errors)

Page 6: Week 8 Annotated

Summary last week: Best critical region

Neyman-Pearson Lemma: simple null / simple alternative test:

$$\Lambda(x;\theta_0,\theta_1) = \frac{L(x;\theta_0)}{L(x;\theta_1)} < k \quad \text{s.t.:} \quad \Pr(X \in C^\star \mid H_0) = \alpha$$

Uniformly most powerful test: simple/composite null / composite alternative test:

i) $\max_{\theta\in\Omega_0} \Pr((X_1,\ldots,X_n)\in C^\star\mid\theta) = \alpha$;

ii) for all $\theta \in \Omega - \Omega_0$ and all critical regions $C$ of size $\alpha$ we have:

$$\Pr((X_1,\ldots,X_n)\in C^\star\mid\theta) \ge \Pr((X_1,\ldots,X_n)\in C\mid\theta)$$

Generalized likelihood ratio test: simple/composite null / composite alternative test:

$$\Lambda(x;\theta_0,\theta) = \frac{L(x;\hat\theta_0)}{L(x;\hat\theta)} < k \quad \text{s.t.:} \quad \Pr(X \in C^\star \mid H_0) = \alpha.$$

Page 7: Week 8 Annotated

Hypothesis testing

Exercise: testing a mean, variance unknown

Page 8: Week 8 Annotated

Exercise: testing a mean, variance unknown

Exercise: Let $X_1,\ldots,X_n$ be a random sample from a normal distribution. Test at a level of significance $\alpha$:

$$H_0: \mu = \mu_0 \quad \text{v.s.} \quad H_1: \mu = \mu_1 > \mu_0.$$

Note: the variance $\sigma^2$ is unknown, but $s^2$ is known. Moreover, we have $\bar{x} > \mu_0$.

Question: Which (NP, UMP, LRT) procedure can you use?

The likelihood is given by:

$$L(x_1,\ldots,x_n;\mu,\sigma^2) = \prod_{i=1}^n f_{X_i}(x_i) = \left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)^n \cdot \exp\left(-\frac{1}{2}\sum_{k=1}^n \left(\frac{x_k-\mu}{\sigma}\right)^2\right).$$

Page 9: Week 8 Annotated

Exercise: testing a mean, variance unknown

Define the parameter sets:

$$\Omega_0 = \left\{(\mu,\sigma^2): \mu = \mu_0,\ \sigma^2 > 0\right\} \qquad \Omega = \left\{(\mu,\sigma^2): \mu \ge \mu_0,\ \sigma^2 > 0\right\}.$$

We know (week 5) that the MLEs are $\hat\mu = \bar{x}$ and $\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n (x_i-\bar{x})^2$ (note: biased estimator!).

The maximum under $\Omega$ is attained at (using ML-estimates):

$$\hat\mu = \bar{x} \quad \text{and} \quad \hat\sigma^2 = \frac{1}{n}\sum_{k=1}^n (x_k-\bar{x})^2.$$

The maximum under $\Omega_0$ is attained at (show this for yourself):

$$\hat\mu = \mu_0 \quad \text{and} \quad \hat\sigma_0^2 = \frac{1}{n}\sum_{k=1}^n (x_k-\mu_0)^2.$$

Page 10: Week 8 Annotated

Thus, the likelihood ratio is given by:

$$\Lambda(x_1,\ldots,x_n) = \frac{\max_{\theta\in\Omega_0} L(\mu,\sigma^2)}{\max_{\theta\in\Omega} L(\mu,\sigma^2)} = \frac{L(\mu_0,\hat\sigma_0^2)}{L(\hat\mu,\hat\sigma^2)} < k$$

$$\frac{\left(\frac{1}{\sqrt{2\pi}\,\hat\sigma_0}\right)^n \exp\left(-\frac{1}{2}\sum_{k=1}^n \frac{(x_k-\mu_0)^2}{\frac{1}{n}\sum_{l=1}^n (x_l-\mu_0)^2}\right)}{\left(\frac{1}{\sqrt{2\pi}\,\hat\sigma}\right)^n \exp\left(-\frac{1}{2}\sum_{k=1}^n \frac{(x_k-\bar{x})^2}{\frac{1}{n}\sum_{l=1}^n (x_l-\bar{x})^2}\right)} = \frac{\left(\frac{2\pi}{n}\sum_{k=1}^n (x_k-\mu_0)^2\right)^{-n/2} e^{-n/2}}{\left(\frac{2\pi}{n}\sum_{k=1}^n (x_k-\bar{x})^2\right)^{-n/2} e^{-n/2}} < k$$

$$\Rightarrow\quad \frac{\sum_{k=1}^n (x_k-\bar{x})^2}{\sum_{k=1}^n (x_k-\mu_0)^2} < k_1 \ \left(= k^{2/n}\right).$$

continues next slide.

Page 11: Week 8 Annotated

Simplifying, we have:

$$\frac{\sum_{k=1}^n (x_k-\bar{x})^2}{\sum_{k=1}^n \left((x_k-\bar{x}) + (\bar{x}-\mu_0)\right)^2} = \frac{\sum_{k=1}^n (x_k-\bar{x})^2}{\sum_{k=1}^n (x_k-\bar{x})^2 + \sum_{k=1}^n (\bar{x}-\mu_0)^2 + 2\sum_{k=1}^n \underbrace{(x_k-\bar{x})}_{\sum = n\bar{x}-n\bar{x}=0}\cdot\underbrace{(\bar{x}-\mu_0)}_{\text{constant}}}$$

$$= \frac{\sum_{k=1}^n (x_k-\bar{x})^2}{\sum_{k=1}^n (x_k-\bar{x})^2 + n\,(\bar{x}-\mu_0)^2} = \frac{1}{1 + n\,(\bar{x}-\mu_0)^2/\sum_{k=1}^n (x_k-\bar{x})^2} < k_1$$

$$\Rightarrow\quad \frac{n\,(\bar{x}-\mu_0)^2}{\sum_{k=1}^n (x_k-\bar{x})^2} \ge k_2 \ \Big(= \underbrace{\tfrac{1}{k_1}-1}_{\text{decreasing in }k_1}\Big).$$

Page 12: Week 8 Annotated

Exercise: testing a mean, variance unknown

Consider:

$$\frac{(\bar{x}-\mu_0)^2}{\sum_{k=1}^n (x_k-\bar{x})^2/(n-1)} \overset{*}{=} \left(\frac{\bar{x}-\mu_0}{s}\right)^2 \ge k_3 \ \left(= k_2\cdot\frac{n-1}{n}\right) \quad\Rightarrow\quad \left|\frac{\bar{x}-\mu_0}{s}\right| \ge k_4 \ \left(=\sqrt{k_3}\right)$$

* using $s^2 = \frac{\sum_{k=1}^n (x_k-\bar{x})^2}{n-1}$.

Thus we have that the best critical region is of the form:

$$C^\star = \left\{(x_1,\ldots,x_n): \left|\frac{\bar{x}-\mu_0}{s}\right| \ge k^\star\right\},$$

where $k^\star$ is such that:

$$\Pr\left(\left|\frac{\bar{X}-\mu_0}{S}\right| \ge k^\star \,\middle|\, H_0\right) = \alpha.$$

Page 13: Week 8 Annotated

We have $\bar{X} = \sum_{i=1}^n X_i/n \mid H_0 \sim N(\mu_0, \sigma^2/n)$, thus:

$$\frac{\bar{X}-\mu_0}{S} = \frac{1}{\sqrt{n}}\cdot\underbrace{\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}}_{=Z} \Bigg/ \underbrace{\sqrt{\frac{(n-1)S^2}{\sigma^2}\Big/(n-1)}}_{=\sqrt{\chi^2_{(n-1)}/(n-1)}} = \frac{1}{\sqrt{n}}\cdot\underbrace{T}_{\sim t(n-1)}.$$

Thus:

$$\Pr\left(\left|\frac{\bar{X}-\mu_0}{S}\right| \ge k^\star \,\middle|\, H_0\right) = \alpha \quad\Rightarrow\quad \Pr\left(|T| \ge \sqrt{n}\,k^\star\right) \overset{*}{=} \alpha \quad\text{and}\quad \Pr\left(|T| \ge t_{1-\alpha/2,n-1}\right) \overset{**}{=} \alpha.$$

Combining * and **:

$$t_{1-\alpha/2,n-1} = \sqrt{n}\,k^\star \quad\Rightarrow\quad k^\star = t_{1-\alpha/2,n-1}/\sqrt{n}.$$

Hence:

$$C^\star = \left\{(x_1,\ldots,x_n): \frac{\bar{x}-\mu_0}{s/\sqrt{n}} \le -t_{1-\alpha/2,n-1} \ \text{ or } \ \frac{\bar{x}-\mu_0}{s/\sqrt{n}} \ge t_{1-\alpha/2,n-1}\right\}.$$
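The critical region just derived is exactly the classical two-sided one-sample t-test. A minimal sketch of it in Python (the helper name `t_test_mean` and the simulated sample are illustrative, not from the slides):

```python
import numpy as np
from scipy import stats

def t_test_mean(x, mu0, alpha=0.05):
    """Two-sided t-test of H0: mu = mu0 with unknown variance.

    Rejects H0 when |x_bar - mu0| / (s / sqrt(n)) >= t_{1-alpha/2, n-1},
    i.e. when the sample falls in the critical region C* derived above.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    x_bar = x.mean()
    s = x.std(ddof=1)                                # s^2 uses the (n-1) divisor
    t_stat = (x_bar - mu0) / (s / np.sqrt(n))
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)    # t_{1-alpha/2, n-1}
    return t_stat, t_crit, abs(t_stat) >= t_crit

# Illustrative sample: true mean 0.5, so H0: mu = 0 should often be rejected.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.5, scale=1.0, size=30)
t_stat, t_crit, reject = t_test_mean(sample, mu0=0.0)
```

The statistic agrees with `scipy.stats.ttest_1samp`, which implements the same test via its p-value rather than a critical value.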

Page 14: Week 8 Annotated

Properties of hypothesis testing

Type I and Type II Errors, power of a test

Page 15: Week 8 Annotated

Type I and Type II Errors

A Type I error is the mistake committed when $H_0$ is true but is rejected; a Type II error is the mistake committed when $H_0$ is false but is not rejected.

The probability of a Type I error is often denoted by $\alpha$ (also called the level of significance) and the probability of a Type II error is often denoted by $\beta$.

We have the Type I error probability:

$$\alpha = \Pr(\text{Reject } H_0 \mid H_0 \text{ is true}),$$

and the Type II error probability:

$$\beta = \Pr(\text{Do not reject } H_0 \mid H_0 \text{ is false}).$$

Page 16: Week 8 Annotated

Type I, Type II errors

[Figure: densities $f(x\mid H_0)$ and $f(x\mid H_1)$, centred at $\mu_0$ and $\mu_1$, with critical value $c$ between them. $\alpha$ (Type I error) is the blue shaded area; $\beta$ (Type II error) is the green shaded area; $\pi(\mu_1)$ (power) is the purple shaded area.]

Page 17: Week 8 Annotated

Type I error

Type I error: reject $H_0$ when it is true, with probability $\alpha$:

- if $H_0$ is simple, this is the significance level of the test;
- if $H_0$ is composite, then this probability depends on the member of $H_0$ that is true - usually select the maximum of the probabilities.

Question: Test $H_0: \mu \le \mu_0$ v.s. $H_1: \mu > \mu_0$; what is the Type I error?

Solution: $\alpha = \Pr(\text{Reject } H_0 \mid \mu = \mu_0)$.

Page 18: Week 8 Annotated

Type II error & power

Type II error: accept $H_0$ when it is false; $\beta$ is the probability of a Type II error.

In case $H_1$ is composite, this depends on the particular member of $H_1$ that holds.

The power of a test is the probability that $H_0$ is rejected when it is false ($= 1-\beta$).

For a test of $H_0$ against $H_1$, the power function of a test is the function that gives the probability that, given that $H_1$ is true, the observable samples will fall within the critical region $C$.

The power function will be denoted by:

$$\pi(\theta) = \Pr((X_1,\ldots,X_n) \in C \mid \theta \in H_1).$$

The value of the power function at a specific parameter point is called the power of the test at that point.

Page 19: Week 8 Annotated

Properties of hypothesis testing

Example: type I and II errors and power

Page 20: Week 8 Annotated

Example: type I and II errors and power

Suppose $X_1, X_2, \ldots, X_n$ is a random sample from $N(\theta,1)$ and that we wish to test $H_0: \theta = 0$ versus $H_1: \theta \ne 0$. Consider the rejection (or critical) region:

$$C = \left\{(x_1, x_2, \ldots, x_n): |\bar{x}| > 2\right\}$$

The power function is equal to the probability that the sample will fall within the critical region:

$$\pi(\theta) = \Pr((X_1,\ldots,X_n)\in C \mid \theta\in H_1) = \Pr\left(|\bar{X}| > 2 \mid \theta\in H_1\right)$$
$$= \Pr\left(\bar{X} > 2 \mid \theta\in H_1\right) + \Pr\left(\bar{X} < -2 \mid \theta\in H_1\right)$$
$$\overset{*}{=} \Pr\left(Z > \sqrt{n}\,(2-\theta)\right) + \Pr\left(Z < \sqrt{n}\,(-2-\theta)\right)$$
$$= \Phi\left(-\sqrt{n}\,(2-\theta)\right) + \Phi\left(\sqrt{n}\,(-2-\theta)\right).$$

* using $\bar{X} = \sum_{i=1}^n X_i/n \sim N(\mu, \frac{\sigma^2}{n})$ (via the m.g.f. technique).

Page 21: Week 8 Annotated

Example: type I and II errors and power

We have that the level of significance is:

$$\alpha = \Pr((X_1,\ldots,X_n)\in C \mid H_0 \text{ is true}) = \Pr\left(|\bar{X}| > 2 \mid \theta = 0\right)$$
$$= \Pr\left(\bar{X} > 2 \mid \theta=0\right) + \Pr\left(\bar{X} < -2 \mid \theta=0\right)$$
$$= \Pr\left(\sqrt{n}\,\bar{X} > 2\sqrt{n}\right) + \Pr\left(\sqrt{n}\,\bar{X} < -2\sqrt{n}\right)$$
$$\overset{*}{=} \Pr\left(Z > 2\sqrt{n}\right) + \Pr\left(Z < -2\sqrt{n}\right) = 2\,\Phi\left(-2\sqrt{n}\right).$$

* using $\bar{X}\mid H_0 = \sum_{i=1}^n X_i/n \sim N(0, \frac{1}{n})$.

Page 22: Week 8 Annotated

Example: type I and II errors and power

The probability of a Type II error when $\theta = 1$ is given by:

$$\beta = \Pr(\text{Do not reject } H_0 \mid H_0 \text{ is false}) = 1 - \Pr((X_1,\ldots,X_n)\in C \mid \theta = 1)$$
$$= 1 - \Pr\left(|\bar{X}| > 2 \mid \theta = 1\right) = \Pr\left(|\bar{X}| \le 2 \mid \theta = 1\right)$$
$$= \Pr\left(-2 \le \bar{X} \le 2 \mid \theta = 1\right) = \Pr\left(-3\sqrt{n} \le (\bar{X}-1)\sqrt{n} \le \sqrt{n} \,\middle|\, \theta = 1\right)$$
$$\overset{*}{=} \Pr\left(-3\sqrt{n} \le Z \le \sqrt{n}\right) = \Phi\left(\sqrt{n}\right) - \left(1 - \Phi\left(3\sqrt{n}\right)\right) = \Phi\left(\sqrt{n}\right) + \Phi\left(3\sqrt{n}\right) - 1.$$

The complement, $1-\beta$, gives the power of the test when $\theta = 1$.

* using $\bar{X}\mid H_1 \sim N(1, 1/n)$.

Page 23: Week 8 Annotated

Exercise: power

We have $X \sim N(\mu, 1)$.

Test: $H_0: \mu = \mu_0$ v.s. $H_1: \mu > \mu_0$.

We know: a test at significance level $\alpha$ would reject $H_0$ if:

$$z_0 = \frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}} \ge z_{1-\alpha}$$

Exercise: Find the power of this test at any value $\mu$.

Solution:

$$\pi(\mu) = \Pr\left(\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} \ge z_{1-\alpha} \,\middle|\, \mu\right) = \Pr\left(\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \ge z_{1-\alpha} + \frac{\mu_0-\mu}{\sigma/\sqrt{n}} \,\middle|\, \mu\right) = 1 - \Phi\left(z_{1-\alpha} + \frac{\mu_0-\mu}{\sigma/\sqrt{n}}\right).$$
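The solution above gives the full power curve; the next slide plots it for two sample sizes. A minimal sketch of $\pi(\mu)$ (the helper name `power_one_sided` and the default parameter values are illustrative):

```python
import numpy as np
from scipy.stats import norm

def power_one_sided(mu, mu0=0.0, sigma=1.0, n=20, alpha=0.05):
    """Power pi(mu) = 1 - Phi(z_{1-alpha} + (mu0 - mu) / (sigma / sqrt(n)))
    of the one-sided z-test of H0: mu = mu0 vs H1: mu > mu0."""
    z = norm.ppf(1 - alpha)   # z_{1-alpha}
    return 1 - norm.cdf(z + (mu0 - mu) / (sigma / np.sqrt(n)))
```

At $\mu = \mu_0$ the power equals $\alpha$, and for any fixed $\mu > \mu_0$ the power increases with $n$, which is exactly the message of the power-function figure that follows.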

Page 24: Week 8 Annotated

Power function

[Figure: power functions $\pi(\mu)$ for $n = 20$ and $n = 100$, equal to $\alpha$ at $\mu = \mu_0$ and increasing towards 1 as $\mu$ grows.]

When $n$ increases, the power of the test is higher, i.e., it is more likely to reject $H_0$ if $H_1$ is correct.

Page 25: Week 8 Annotated

Exercise: Bernoulli random variable

Consider a reinsurer which is asked to give a quote when taking over, from an insurer, the financial risk of a major flood in the next La Niña.

Assume that major floods are well defined (using the SOI) and that, if they occur, they only occur during a La Niña.

La Niñas only occur (on average) once every 6 years.

Hence, we do not have much data; say we have a time period of 10 La Niñas.

The reinsurer has worldwide experience with La Niñas, and from that he knows that the probability of a major flood during a La Niña is 50%.

The reinsurer tests whether this also holds for Australia, or whether the probability is higher.

Page 26: Week 8 Annotated

Exercise: Bernoulli random variable

Bernoulli random variable $Y \sim \text{Ber}(p)$, $n = 10$ trials; test:

$$H_0: p = 0.5 \quad \text{v.s.} \quad H_1: p > 0.5,$$

with $\alpha = 0.05$. SOI > 10 in '50, '55, '71, '73, '75, '88, '10.

Question: What is the test statistic? Solution: the number of successes $X \sim \text{Bin}(n, p)$.

Question: What is the rejection region? Solution: large values of $X$. For $C = \{8, 9, 10\}$ we have $\Pr(\text{reject } H_0 \text{ when true}) = 0.0547 > \alpha$; for $C^\star = \{9, 10\}$: $\Pr(X \in C^\star \mid p = 0.5) = 0.0107 \le \alpha$.

Question: What is the power of the test? Solution: it depends on the true value of $p$, because $\Pr(X \in \{9, 10\} \mid p)$ depends on the true value of $p$.
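The binomial tail probabilities quoted above can be checked directly. A short sketch (the helper name `power_at` is illustrative):

```python
from scipy.stats import binom

n, p0 = 10, 0.5

# Size of the candidate rejection regions C = {8, 9, 10} and C* = {9, 10}:
size_C = binom.sf(7, n, p0)       # P(X >= 8 | p = 0.5) = 56/1024 ~ 0.0547
size_Cstar = binom.sf(8, n, p0)   # P(X >= 9 | p = 0.5) = 11/1024 ~ 0.0107

def power_at(p):
    """Power of the test with rejection region C* = {9, 10} at a true p."""
    return binom.sf(8, n, p)
```

Since a discrete statistic cannot hit $\alpha = 0.05$ exactly, the achievable size jumps from 0.0547 to 0.0107, which is why $C^\star = \{9, 10\}$ is chosen.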

Page 27: Week 8 Annotated

Properties of hypothesis testing

p-value

Page 28: Week 8 Annotated

p-value

When the rejection region is of the form $T > t$ (or $T < t$), then $t$ is called the critical value of the test (independent of the data).

The p-value is the smallest value of $\alpha$ for which the null hypothesis, given the data, will be rejected.

Note: the p-value and the critical value are both obtained using the test statistic. However, the critical value is computed given $\alpha$, whereas the p-value is calculated by determining the probability that a sample at least as extreme occurs given the null hypothesis (i.e., the inverse).

- If the p-value is smaller than the level of significance ⇔ reject the null.
- If the p-value is larger than the level of significance ⇔ accept the null.

Page 29: Week 8 Annotated

Example: p-value

(Continued from the example above.) Suppose $X_1, X_2, \ldots, X_n$ is a random sample from $N(\theta,1)$ and that we wish to test $H_0: \theta = 0$ versus $H_1: \theta \ne 0$ with constant $\alpha$ as level of significance.

Suppose the observed sample mean $\bar{x}$ turned out to be 3. We can calculate the p-value:

$$\text{p-value} = \Pr\left(|\bar{X}| \ge 3 \mid \theta = 0\right) = \Pr\left(\bar{X} \ge 3 \mid \theta = 0\right) + \Pr\left(\bar{X} \le -3 \mid \theta = 0\right)$$
$$= \Pr\left(Z \ge 3\sqrt{n}\right) + \Pr\left(Z \le -3\sqrt{n}\right) = 2\,\Phi\left(-3\sqrt{n}\right),$$

which is smaller than the level of significance (check this).

This implies that you would reject the null hypothesis $H_0$.
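The two-sided p-value formula above is a one-liner in code. A small sketch (the function name `p_value` is illustrative):

```python
import numpy as np
from scipy.stats import norm

def p_value(x_bar, n):
    """Two-sided p-value for H0: theta = 0 given observed mean x_bar,
    with X_i ~ N(theta, 1): 2 * Phi(-|x_bar| * sqrt(n))."""
    return 2 * norm.cdf(-abs(x_bar) * np.sqrt(n))
```

For $\bar{x} = 3$ the p-value $2\Phi(-3\sqrt{n})$ is tiny even for small $n$, so $H_0$ is rejected at any conventional level.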

Page 30: Week 8 Annotated

Properties of hypothesis testing

Relation between Confidence intervals and hypothesis tests

Page 31: Week 8 Annotated

Confidence Intervals and Hypothesis Tests

Recall the example LRT: testing the mean of a normal r.v.

A possible other test is given by the following: reject $H_0$ if the confidence interval does not contain $\mu_0$, i.e., reject $H_0$ if:

$$\bar{x} < \mu_0 - z_{1-\alpha/2}\cdot\frac{\sigma}{\sqrt{n}} \quad\text{or}\quad \bar{x} > \mu_0 + z_{1-\alpha/2}\cdot\frac{\sigma}{\sqrt{n}}.$$

This test clearly has size $\alpha$, because:

$$\Pr\left(\bar{X} - z_{1-\alpha/2}\cdot\frac{\sigma}{\sqrt{n}} < \mu_0 < \bar{X} + z_{1-\alpha/2}\cdot\frac{\sigma}{\sqrt{n}} \,\middle|\, H_0\right) = 1-\alpha,$$

or, equivalently,

$$\Pr\left(\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} < -z_{1-\alpha/2} \,\middle|\, H_0\right) + \Pr\left(\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} > z_{1-\alpha/2} \,\middle|\, H_0\right) = \alpha,$$

which is the probability of the rejection region under the null.

Page 32: Week 8 Annotated

Confidence Intervals and Hypothesis Tests

A confidence interval can be obtained by "inverting" a hypothesis test, and vice versa.

You would reject $H_0$ if:

$$\left|\frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}}\right| \ge z_{1-\alpha/2},$$

for a level of significance $\alpha$.

Therefore, you accept the null hypothesis if:

$$\left|\frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}}\right| < z_{1-\alpha/2}.$$

Next, consider the CI of $\mu_0$.

Page 33: Week 8 Annotated

Confidence Intervals and Hypothesis Tests

For the CI of $\mu_0$ we have:

$$-z_{1-\alpha/2} < \frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}} < z_{1-\alpha/2},$$

or

$$\bar{x} - z_{1-\alpha/2}\cdot\frac{\sigma}{\sqrt{n}} < \mu_0 < \bar{x} + z_{1-\alpha/2}\cdot\frac{\sigma}{\sqrt{n}},$$

which also gives a $100(1-\alpha)\%$ confidence interval for $\mu_0$.

We see that $\mu_0$ lies in this confidence interval if and only if you accept the null hypothesis.

In other words, the confidence interval consists of "all the values of $\mu_0$ for which the null hypothesis is accepted."
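The duality can be demonstrated numerically: for any candidate $\mu_0$, the z-test accepts exactly when $\mu_0$ falls inside the interval. A small sketch (the helper name `z_test_and_ci` is illustrative):

```python
import numpy as np
from scipy.stats import norm

def z_test_and_ci(x, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0 (known sigma) alongside the
    100(1 - alpha)% CI; the test accepts iff mu0 lies in the CI."""
    n, x_bar = len(x), np.mean(x)
    z = norm.ppf(1 - alpha / 2)            # z_{1-alpha/2}
    half = z * sigma / np.sqrt(n)          # half-width of the CI
    accept = abs((x_bar - mu0) / (sigma / np.sqrt(n))) < z
    in_ci = (x_bar - half < mu0) and (mu0 < x_bar + half)
    return accept, in_ci
```

Sweeping `mu0` over a grid, the two booleans agree at every point: the CI is the set of accepted nulls.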

Page 34: Week 8 Annotated

Other tests

k-Sample tests

Page 35: Week 8 Annotated

k-sample tests

Suppose that $k$ independent random samples from $N(\mu_j, \sigma_j^2)$ are obtained with sample sizes $n_j$, $j = 1,\ldots,k$.

We denote by $x_{ij}$ the observed value of the $i$-th random variable from the $j$-th random sample. Moreover, denote $N = n_1 + \ldots + n_k$:

$$\bar{x}_j = \sum_{i=1}^{n_j} \frac{x_{ij}}{n_j} \qquad s_j^2 = \sum_{i=1}^{n_j} \frac{(x_{ij}-\bar{x}_j)^2}{n_j-1}$$

$$\bar{x} = \sum_{j=1}^{k}\sum_{i=1}^{n_j} \frac{x_{ij}}{N} = \sum_{j=1}^{k} \frac{n_j\,\bar{x}_j}{N} \qquad s^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j} \frac{(x_{ij}-\bar{x}_j)^2}{N-1}$$

Test for equality of the distribution means, assuming an unknown but common variance $\sigma_j^2 = \sigma^2$.

Test: $H_0: \mu_1 = \ldots = \mu_k;\ \sigma_l^2 = \sigma^2$ for $l = 1,\ldots,k$ v.s. $H_1: \mu_i \ne \mu_j;\ \sigma_l^2 = \sigma^2$ for $l = 1,\ldots,k$, for at least one pair $i \ne j$.

Page 36: Week 8 Annotated

k-sample tests

We use the GLR. We need to find the ratio of the likelihood functions, and thus the ML-estimates in the restricted and "unrestricted" case. Note: in the "unrestricted" case we still have $\sigma_l^2 = \sigma^2$ for $l = 1,\ldots,k$.

Likelihood function:

$$L(x;\mu_1,\ldots,\mu_k,\sigma^2) = \prod_{j=1}^{k}\prod_{i=1}^{n_j} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x_{ij}-\mu_j)^2}{2\sigma^2}\right)$$

$$= (2\pi\sigma^2)^{-N/2}\cdot\exp\left(-\frac{1}{2\sigma^2}\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\mu_j)^2\right)$$

$$\log L = -\frac{N}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\mu_j)^2$$

Page 37: Week 8 Annotated

Parameter space: $\Omega = \{(\mu_1,\ldots,\mu_k,\sigma^2) \mid -\infty < \mu_j < \infty,\ \sigma^2 > 0\}$.

Compute the MLE by taking partial derivatives with respect to each of the $k+1$ parameters and equating them to zero:

$$\frac{\partial \log L}{\partial \mu_j} = -\frac{1}{2\sigma^2}\sum_{i=1}^{n_j} 2(x_{ij}-\mu_j)(-1) = 0, \quad j = 1,\ldots,k$$

$$\frac{\partial \log L}{\partial \sigma^2} = -\frac{N}{2\sigma^2} - \frac{1}{2(\sigma^2)^2}\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\mu_j)^2\,(-1) = 0$$

$$\Rightarrow\quad \hat\mu_j = \sum_{i=1}^{n_j} x_{ij}/n_j = \bar{x}_j \qquad \hat\sigma^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x}_j)^2/N$$

Under $H_0$ we have $\Omega_0 = \{(\mu,\sigma^2)\mid -\infty<\mu<\infty,\ \sigma^2>0\}$:

$$\frac{\partial \log L}{\partial \mu} = -\frac{1}{2\sigma^2}\sum_{j=1}^{k}\sum_{i=1}^{n_j} 2(x_{ij}-\mu)(-1) = 0$$

$$\frac{\partial \log L}{\partial \sigma^2} = -\frac{N}{2\sigma^2} - \frac{1}{2(\sigma^2)^2}\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\mu)^2\,(-1) = 0$$

$$\Rightarrow\quad \hat\mu_0 = \sum_{j=1}^{k} n_j\,\bar{x}_j/N = \bar{x} \qquad \hat\sigma_0^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x})^2/N$$

Page 38: Week 8 Annotated

GLR k-sample test

The GLR is:

$$\Lambda(x) = \frac{L(x;\hat\mu_0,\hat\sigma_0^2)}{L(x;\hat\mu_1,\ldots,\hat\mu_k,\hat\sigma^2)} = \frac{(2\pi\hat\sigma_0^2)^{-N/2}\cdot\exp\left(-\frac{1}{2\hat\sigma_0^2}\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x})^2\right)}{(2\pi\hat\sigma^2)^{-N/2}\cdot\exp\left(-\frac{1}{2\hat\sigma^2}\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x}_j)^2\right)}$$

$$\overset{*}{=} \frac{(2\pi\hat\sigma_0^2)^{-N/2}\cdot\exp(-N/2)}{(2\pi\hat\sigma^2)^{-N/2}\cdot\exp(-N/2)} = \left(\frac{\hat\sigma_0^2}{\hat\sigma^2}\right)^{-N/2} < k$$

$$\Rightarrow\quad \frac{\hat\sigma_0^2}{\hat\sigma^2} \overset{*}{=} \frac{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x})^2}{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x}_j)^2} \ge k_1 \ \left(=k^{-2/N}\right)$$

$$\overset{**}{=} 1 + \frac{\sum_{j=1}^{k} n_j(\bar{x}_j-\bar{x})^2}{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x}_j)^2} \ge k_1 \quad\Rightarrow\quad \frac{\sum_{j=1}^{k} n_j(\bar{x}_j-\bar{x})^2}{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x}_j)^2} \ge k^\star.$$

* using the MLE estimates $\hat\sigma^2$ and $\hat\sigma_0^2$ from the previous slide.

** see next slide.

Page 39: Week 8 Annotated

**: see tutorial exercise 5:

$$\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x})^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x}_j)^2 + \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{x})^2 \qquad (1)$$

Thus the GLR test is equivalent to rejecting $H_0$ if:

$$\frac{\sigma^2\,V_3}{\sigma^2\,V_2} = \frac{\sum_{j=1}^{k} n_j(\bar{x}_j-\bar{x})^2}{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x}_j)^2} = \frac{\sum_{j=1}^{k} n_j(\bar{x}_j-\bar{x})^2}{(N-k)\,s_p^2} \ge k^\star, \quad\text{where}$$

- $V_1 = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x})^2/\sigma^2 \sim \chi^2(N-1)$,
- $V_2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x}_j)^2/\sigma^2 \sim \chi^2(N-k)$,
- $V_3 = \sum_{j=1}^{k} n_j(\bar{x}_j-\bar{x})^2/\sigma^2 \overset{*}{\sim} \chi^2(k-1)$.

* using (1): $V_1 = V_2 + V_3$, thus $V_3 \sim \chi^2((N-1)-(N-k)) = \chi^2(k-1)$.

Define $F = \frac{V_3/(k-1)}{V_2/(N-k)} \sim F(k-1, N-k)$, thus: $k^\star = F_{1-\alpha}(k-1, N-k)\cdot\frac{k-1}{N-k}$.

Page 40: Week 8 Annotated

Analysis of variance

k-sample test: testing the effect of $k$ different experimental treatments.

Example: are the mean claim sizes in different countries equivalent?

We will see this again in Linear Regression.

Rationale of the name "Analysis of variance": recall (1):

$$\underbrace{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x})^2}_{SST} = \underbrace{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x}_j)^2}_{SSW} + \underbrace{\sum_{j=1}^{k} n_j(\bar{x}_j-\bar{x})^2}_{SSB}$$

where
- SST: total variability of the pooled sample data;
- SSW: variation "within" the individual samples (denominator of the test statistic);
- SSB: variation "between" the samples (numerator of the test statistic).

Page 41: Week 8 Annotated

Example: analysis of variance (past CT3 exam)

Summary of claim size data:

group            1       2       3       4
claim size y     0.11    0.52    1.48    1.52
                 0.46    1.43    2.05    2.36
                 0.71    1.84    2.38    2.95
                 1.45    2.47    3.31    4.08
sum assured x    1       2       3       4

Company              A        B        C        D        total
sum of y_i           2.73     6.26     9.22     10.91    29.12
sum of y_i^2         2.8303   11.8018  23.0134  33.2289  70.8744

Question: Can you test whether the sum assured has a significant effect on the claim size?

Page 42: Week 8 Annotated

Example: analysis of variance (past CT3 exam)

Solution:

$$SST = 70.8744 - 29.12^2/16 = 17.8760;$$
$$SSB = (2.73^2 + 6.26^2 + 9.22^2 + 10.91^2)/4 - 29.12^2/16 = 9.6709;$$
$$SSW = 17.8760 - 9.6709 = 8.2051.$$

The value of the test statistic is ($N = 16$ and $k = 4$):

$$F = \frac{SSB/(k-1)}{SSW/(N-k)} = \frac{9.6709/3}{8.2051/12} = 4.715$$

F&T pages 173-174: $F_{3,12}(4.474) = 2.5\%$ and $F_{3,12}(5.953) = 1\%$. Hence there is evidence against $H_0$ (p-value between 1% and 2.5%).
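The sums of squares and the F statistic above can be reproduced from the raw claim sizes in the table. A short sketch, using the data from the example:

```python
import numpy as np
from scipy import stats

# Claim size data by company (groups A-D), from the CT3 example above.
groups = [
    [0.11, 0.46, 0.71, 1.45],
    [0.52, 1.43, 1.84, 2.47],
    [1.48, 2.05, 2.38, 3.31],
    [1.52, 2.36, 2.95, 4.08],
]

pooled = np.concatenate(groups)
N, k = len(pooled), len(groups)
grand_mean = pooled.mean()

sst = ((pooled - grand_mean) ** 2).sum()                              # total
ssb = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups)    # between
ssw = sst - ssb                                                       # within
F = (ssb / (k - 1)) / (ssw / (N - k))
```

The resulting F agrees with `scipy.stats.f_oneway`, which performs this one-way ANOVA directly.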

Page 43: Week 8 Annotated

Other tests

Jarque-Bera test

Page 44: Week 8 Annotated

We often assume normality. Example: linear regression residuals (weeks 10-12).

We can match the mean and variance of the assumed distribution and the sample distribution.

How well does the assumed distribution fit the sample distribution? Use QQ-plots? How to formally test (see a test later this lecture)?

Other approach: look at the skewness and excess kurtosis ⇒ under normality they should be jointly zero. That is the Jarque-Bera test, see Jarque and Bera (1987).

Test statistic: $JB = \frac{n}{6}\cdot\left(\hat\gamma_X^2 + \frac{\hat\kappa_X^2}{4}\right) \overset{\text{approx.}}{\sim} \chi^2(2)$; reject $H_0$ if $JB > \chi^2_{1-\alpha}(2)$.
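The JB statistic is simple to compute from the biased sample moments. A minimal sketch (the helper name `jarque_bera` is illustrative):

```python
import numpy as np
from scipy import stats

def jarque_bera(x):
    """JB = n/6 * (skew^2 + excess_kurtosis^2 / 4), approximately chi^2(2)
    under H0: the data are normal. Uses biased (1/n) central moments."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    z = x - x.mean()
    m2, m3, m4 = (z**2).mean(), (z**3).mean(), (z**4).mean()
    skew = m3 / m2**1.5            # sample skewness gamma_X
    ex_kurt = m4 / m2**2 - 3       # sample excess kurtosis kappa_X
    jb = n / 6 * (skew**2 + ex_kurt**2 / 4)
    p = stats.chi2.sf(jb, df=2)    # asymptotic p-value
    return jb, p
```

This matches `scipy.stats.jarque_bera`, which implements the same statistic and asymptotic $\chi^2(2)$ p-value.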

Page 45: Week 8 Annotated

OPTIONAL: The proof consists of four parts:

1. Rewrite $\hat\gamma_X$ (and $\hat\kappa_X$) as functions $g(m_1,m_2,m_3)$, where $m_1 = \sum_{i=1}^n x_i/n$, $m_2 = \sum_{i=1}^n x_i^2/n$, and $m_3 = \sum_{i=1}^n x_i^3/n$.

2. Apply the (multivariate) CLT to find the joint distribution of $(m_1,\ m_2-1,\ m_3)$ (MVN):

$$\frac{\bar{Y}-\mu}{\sigma/\sqrt{n}} \overset{n\to\infty}{\to} Z \qquad \sqrt{n}\left(\frac{1}{n}\sum_{i=1}^n Y_i - \mu\right) \overset{n\to\infty}{\to} N(0,\sigma^2) \qquad \sqrt{n}\left(\frac{1}{n}\sum_{i=1}^n \underline{Y}_i - \underline{\mu}\right) \overset{n\to\infty}{\to} N(0,\Sigma)$$

3. Use the (multivariate) delta method (week 4) to find the variance of $\hat\gamma_X = g(m_1,m_2,m_3)$:

$$Y = g(\underline{X}) \approx g(\underline{\mu}_X) + (\underline{X}-\underline{\mu}_X)^\top \nabla g(\underline{\mu}_X)$$
$$\mathbb{E}[Y] \approx g(\underline{\mu}_X), \qquad \text{Var}(Y) \approx \nabla g^\top(\underline{\mu}_X)\,\Sigma\,\nabla g(\underline{\mu}_X)$$

4. Use $\frac{n}{6}\hat\gamma_X^2 \approx Z^2 \sim \chi^2(1)$ and $\frac{n}{24}\hat\kappa_X^2 \approx Z^2 \sim \chi^2(1)$; hence $JB \sim \chi^2(2)$.

Page 46: Week 8 Annotated

Proof: Jarque-Bera test (OPTIONAL)

1. Find the function (let $m_j = \sum_{i=1}^n x_i^j/n$):

$$\hat\gamma_X \equiv \frac{\frac{1}{n}\sum_{i=1}^n (x_i-\bar{x})^3}{\left(\frac{1}{n}\sum_{i=1}^n (x_i-\bar{x})^2\right)^{3/2}} = \frac{\frac{1}{n}\sum \left(x_i^3 - 3\bar{x}x_i^2 + 3\bar{x}^2 x_i - \bar{x}^3\right)}{\left(\frac{1}{n}\sum \left(x_i^2 - 2\bar{x}x_i + \bar{x}^2\right)\right)^{3/2}} = \frac{m_3 - 3m_1 m_2 + 2m_1^3}{\left(m_2 - m_1^2\right)^{3/2}} = g(m_1,m_2,m_3)$$

2. Note: under $H_0$ we have $\mathbb{E}\left[[m_1\ m_2\ m_3]^\top\right] = [0\ 1\ 0]^\top$. Hence $\sqrt{n}\,[m_1\ \ m_2-1\ \ m_3]^\top \to N(0,\Sigma)$, where:

$$\Sigma = \mathbb{E}\begin{bmatrix} x_i \\ x_i^2-1 \\ x_i^3 \end{bmatrix}\begin{bmatrix} x_i & x_i^2-1 & x_i^3 \end{bmatrix} = \mathbb{E}\begin{bmatrix} x_i^2 & x_i^3-x_i & x_i^4 \\ x_i^3-x_i & x_i^4-2x_i^2+1 & x_i^5-x_i^3 \\ x_i^4 & x_i^5-x_i^3 & x_i^6 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 3 \\ 0 & 2 & 0 \\ 3 & 0 & 15 \end{bmatrix}$$

Page 47: Week 8 Annotated

Proof: Jarque-Bera test (OPTIONAL)

3. Find $\mathbb{E}[\hat\gamma_X]$ and $\text{Var}(\hat\gamma_X)$ (* using the quotient rule & algebra):

$$\nabla g(m_1,m_2,m_3) = \begin{bmatrix} \partial g(m_1,m_2,m_3)/\partial m_1 \\ \partial g(m_1,m_2,m_3)/\partial m_2 \\ \partial g(m_1,m_2,m_3)/\partial m_3 \end{bmatrix} \overset{*}{=} \begin{bmatrix} 3(m_1 m_3 - m_2^2)/(m_2-m_1^2)^{5/2} \\ \frac{3}{2}(m_1 m_2 - m_3)/(m_2-m_1^2)^{5/2} \\ (m_2-m_1^2)/(m_2-m_1^2)^{5/2} \end{bmatrix}$$

Thus $g(\underline{\mu}_X) = g(0,1,0) = 0 = \mathbb{E}[\hat\gamma_X]$.

Thus $\nabla g(\underline{\mu}_X) = \nabla g(0,1,0) = [-3\ \ 0\ \ 1]^\top$, hence

$$\text{Var}(\hat\gamma_X) = \nabla g^\top(0,1,0)\,\Sigma\,\nabla g(0,1,0) = 6.$$

This implies: $\sqrt{n/6}\,\hat\gamma_X \overset{\text{approx.}}{\sim} Z \ \Rightarrow\ \frac{n}{6}\hat\gamma_X^2 \overset{\text{approx.}}{\sim} \chi^2(1)$.

Similarly for $\hat\kappa_X$, where $\nabla h(\underline{\mu}_X) = [0\ \ {-6}\ \ 0\ \ 1]^\top$.

4. Note: $\hat\kappa_X$ and $\hat\gamma_X$ are asymptotically independent (using the delta method).

Page 48: Week 8 Annotated

Jarque-Bera test: small sample sizes

The Jarque-Bera test is an approximate test (as $n\to\infty$). How to use it in "small" samples?

Calculated p-value equivalents to true $\alpha$ levels at given sample sizes ($\alpha_{JB} = \alpha$ in the approximate JB test):

True α level          n = 20    n = 30    n = 50    n = 70    n = 100
α_JB = 0.1            0.307     0.252     0.201     0.183     0.1560
α_JB = 0.05           0.1461    0.109     0.079     0.067     0.062
α_JB = 0.025          0.051     0.0303    0.020     0.016     0.0168
α_JB = 0.01           0.0064    0.0033    0.0015    0.0012    0.002

(Calculated using the bootstrap test method (see later this lecture); for other values, use interpolation.)

Hence: the approximate Jarque-Bera test rejects the null hypothesis more often than it should.

Page 49: Week 8 Annotated

Other testing methods

Introduction

Page 50: Week 8 Annotated

We have seen until now pivotal tests.

What if we cannot find a pivot?

Hypothesis testing techniques:

1. Pivotal tests: here we find a pivotal test statistic $T$ such that $T(X)$ has the same distribution for all values of $\theta \in \Omega_0$.

2. Conditional tests: here we convert the composite hypothesis into a simple one by conditioning on a sufficient statistic.

3. Bootstrap tests: here we replace the composite hypothesis $\Omega_0$ by the simple $\hat\theta$, $\hat\theta \in \Omega_0$.
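Technique 3 can be sketched concretely: fit the null model to the data, simulate repeatedly under that fitted (simple) null, and compare the observed statistic with its simulated distribution. A minimal parametric-bootstrap sketch for a normal mean (the helper name `bootstrap_test_mean` and the settings are illustrative, not from the slides):

```python
import numpy as np

def bootstrap_test_mean(x, mu0, n_boot=2000, seed=0):
    """Parametric bootstrap test of H0: mu = mu0 for a normal sample.

    Replaces the composite null {(mu0, sigma^2): sigma^2 > 0} by the simple
    fitted point (mu0, sigma_hat^2), simulates under it, and returns the
    bootstrap p-value of the observed t-type statistic.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    sigma_hat = x.std(ddof=1)
    t_obs = abs(x.mean() - mu0) / (sigma_hat / np.sqrt(n))
    t_boot = np.empty(n_boot)
    for b in range(n_boot):
        xb = rng.normal(mu0, sigma_hat, size=n)   # sample under the fitted H0
        t_boot[b] = abs(xb.mean() - mu0) / (xb.std(ddof=1) / np.sqrt(n))
    return (t_boot >= t_obs).mean()               # bootstrap p-value
```

The same recipe, applied to the JB statistic, is how the small-sample calibration table on the previous Jarque-Bera slide can be produced.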

Page 51: Week 8 Annotated

Other testing methods

Conditional tests

Page 52: Week 8 Annotated

Application

OPTIONAL

Consider the case in which a health insurer has two types of insured: young individuals and old individuals.

You observe the age, you know the proportion of your portfolio in young/old, and you observe the total number of claims each month.

You can approximate the number of claims by a Binomial distribution (why?).

Question: Can you test whether the probability of issuing a claim for young individuals is smaller than for old individuals?

Notation: $X \sim \text{Bin}(n_1, p_1)$ is the number of claims for young individuals, $Y \sim \text{Bin}(n_2, p_2)$ the number of claims for old individuals, and $S = X + Y$ the total number of claims.

Page 53: Week 8 Annotated


OPTIONAL: We test H0 : p1 = p2 = p v.s. H1 : p1 < p2.

1. The joint density under H0 is:

f_{X,Y}(x, y) = C(n1, x) · C(n2, y) · p^(x+y) · (1 − p)^(n1+n2−x−y).

2. The density of S = X + Y under H0 is S ∼ Bin(n1 + n2, p).

3. The density of Y | S = s under H0 is obtained using X = S − Y:

f_{Y|S}(y) = f_{S,Y}(s, y) / f_S(s) = f_{X,Y}(s − y, y) / f_S(s)

= [ C(n1, s − y) · C(n2, y) · p^s · (1 − p)^(n1+n2−s) ] / [ C(n1 + n2, s) · p^s · (1 − p)^(n1+n2−s) ]

= C(n1, s − y) · C(n2, y) / C(n1 + n2, s),

for y = 0, . . . , s and s = 0, . . . , n1 + n2. Hence Y | S = s follows a hypergeometric distribution, independent of p!


Page 54: Week 8 Annotated


Hence the best critical region: reject H0 if y ≥ k(s), where k(s) is the smallest value such that

∑_{i=k(s)}^{s} C(n2, i) · C(n1, s − i) / C(n1 + n2, s) ≤ α.
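The hypergeometric tail probability defining k(s) can be computed exactly with Python's standard library. A minimal sketch (the function name `conditional_p_value` is my own, not from the slides):

```python
from math import comb

def conditional_p_value(y, s, n1, n2):
    """Pr(Y >= y | S = s) under H0: upper tail of the hypergeometric
    pmf C(n2, i) * C(n1, s - i) / C(n1 + n2, s)."""
    total = comb(n1 + n2, s)
    # comb(n1, s - i) is 0 whenever s - i > n1, so impossible terms vanish.
    tail = sum(comb(n2, i) * comb(n1, s - i) for i in range(y, min(s, n2) + 1))
    return tail / total

# Reject H0 at level alpha when the observed y gives a p-value <= alpha.
print(conditional_p_value(2, 2, 2, 2))  # tail from y = 2 with n1 = n2 = 2, s = 2 → 1/6
```

This is the conditional (exact) test: the p-value does not depend on the nuisance parameter p, exactly as the derivation above shows.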

Relation with sufficient statistics:

If a sufficient statistic S exists for an unknown nuisance parameter θ, then the distribution of X | S will not depend on θ.

Exercise (OPTIONAL): Let Xi ∼ N(µ, σ²) and note that S = ∑_{i=1}^{n} Xi is sufficient if σ² is known. Test H0 : µ ≥ µ0 v.s. H1 : µ < µ0, when n = 2. Note: this can be done for general n, but it becomes messy.


Page 55: Week 8 Annotated


1. The joint density under H0 is:

f_X(x) = (2πσ²)^(−n/2) · exp( −∑_{i=1}^{n} (xi − µ)² / (2σ²) ).

2. The density of S = ∑_{i=1}^{n} Xi under H0 is S ∼ N(n·µ, n·σ²).

3. The density of X1, . . . , Xn−1 | S = s under H0 is obtained using Xn = S − ∑_{i=1}^{n−1} Xi:

f_{X1,...,Xn−1|S}(x1, . . . , xn−1) = f_{S,X1,...,Xn−1}(s, x1, . . . , xn−1) / f_S(s)

= f_{X1,...,Xn}(x1, . . . , xn−1, s − ∑_{i=1}^{n−1} xi) / f_S(s)

= [ (2πσ²)^(−n/2) · exp( −( ∑_{i=1}^{n−1} (xi − µ)² + (s − ∑_{i=1}^{n−1} xi − µ)² ) / (2σ²) ) ] / [ (2π·n·σ²)^(−1/2) · exp( −(s − n·µ)² / (2n·σ²) ) ].

Page 56: Week 8 Annotated


Consider n = 2; we have:

f_{X1|S}(x1) = √n · (2πσ²)^(−(n−1)/2) · exp( (s − n·µ)² / (2n·σ²) − [ (x1 − µ)² + (s − x1 − µ)² ] / (2σ²) )

∗= (2π·σ²/2)^(−1/2) · exp( −(x1 − s/2)² / (2·σ²/2) ).

* use: (x1 − µ)² + (s − x1 − µ)² − (s − 2µ)²/2

= (x1² + µ² − 2x1µ) + ((s − x1)² + µ² − 2(s − x1)µ) − s²/2 + 2sµ − 2µ²

= x1² + (s − x1)² − s²/2

= 2x1² + s²/2 − 2sx1 = (2x1 − s) · (x1 − s/2) = 2(x1 − s/2)².

Notice, this is the p.d.f. of N(s/2, σ²/2).

Hence the best critical region: reject H0 if x1 > k(s) = s/2 + z_{1−α} · σ/√2.
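The completing-the-square identity used in the step marked ∗ can be sanity-checked numerically. A small stdlib-only sketch (the test values are arbitrary):

```python
import random

# Verify (x1 - mu)^2 + (s - x1 - mu)^2 - (s - 2*mu)^2 / 2 == 2*(x1 - s/2)^2
# for random x1, s, mu: all mu-terms cancel exactly, as in the derivation.
random.seed(1)
for _ in range(1000):
    x1, s, mu = (random.uniform(-10, 10) for _ in range(3))
    lhs = (x1 - mu) ** 2 + (s - x1 - mu) ** 2 - (s - 2 * mu) ** 2 / 2
    rhs = 2 * (x1 - s / 2) ** 2
    assert abs(lhs - rhs) < 1e-9
print("identity verified for 1000 random triples")
```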


Page 58: Week 8 Annotated


Bootstrap tests

Introduction

Sometimes we have a test which allows for neither pivots nor conditioning.

Sometimes we only have an approximate test ⇒ what if n is small? Example: Jarque-Bera when n < 200.

Use bootstrap tests.

Idea:

1. Define H0 and H1;
2. Create a statistic T, based on H0 and H1;
3. Determine T∗ based on the sample;
4. Simulate m times from H0 (m is large), and calculate Tj for j = 1, . . . , m;
5. Calculate the probability that T∗ < Tj; use that for the critical region.


Page 59: Week 8 Annotated


Example: Testing goodness of fit

1. Test H0 : Xi ∼ Pareto(1, 2) (note: the mean does not exist!) v.s. H1 : not Pareto(1, 2) (see excel file, n = 5, m = 100).

2. Use the sum of the squared differences between the empirical distribution and the fitted Pareto distribution function, evaluated at the sampled points, i.e.,

T(x) = ∑_{i=1}^{n} ( (i − 0.5)/n − (1 − (2/(x_(i) + 2))¹) )²,

where (i − 0.5)/n is the empirical quantile and 1 − (2/(x_(i) + 2))¹ = F_X(x_(i)).

3. Calculate T∗ using the sample data.

4. Simulate m times n random variables from a Pareto(1, 2) distribution and calculate for each sample the value of T.

5. Reject H0 if T∗ is larger than the m · (1 − α) largest value of T in the simulation.
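The steps above can be sketched in Python using only the standard library. Simulation uses inverse-transform sampling, since F(x) = 1 − 2/(x + 2) gives F⁻¹(u) = 2/(1 − u) − 2; the function names are my own, not from the slides or the excel file:

```python
import random

def pareto12_cdf(x):
    """CDF of Pareto(1, 2): F(x) = 1 - (2 / (x + 2))^1."""
    return 1.0 - 2.0 / (x + 2.0)

def pareto12_sample(n, rng):
    """Inverse-transform sampling: X = F^{-1}(U) = 2 / (1 - U) - 2."""
    return [2.0 / (1.0 - rng.random()) - 2.0 for _ in range(n)]

def T(xs):
    """Sum of squared differences between empirical quantiles and F at x_(i)."""
    n = len(xs)
    return sum(((i - 0.5) / n - pareto12_cdf(x)) ** 2
               for i, x in enumerate(sorted(xs), start=1))

def bootstrap_gof_test(sample, m=1000, alpha=0.05, seed=0):
    rng = random.Random(seed)
    t_star = T(sample)
    # Simulate m samples of the same size under H0 and compute T for each.
    t_sim = [T(pareto12_sample(len(sample), rng)) for _ in range(m)]
    p_value = sum(t >= t_star for t in t_sim) / m
    return p_value, p_value <= alpha  # reject H0 when p-value <= alpha
```

Data genuinely drawn from Pareto(1, 2) should only rarely be rejected (about a fraction α of the time).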

Page 60: Week 8 Annotated


Example: Comparing models

1. Test H0 : Xi ∼ Gamma(αML, βML) v.s. H1 : Xi ∼ LN(µML, (σ²)ML).

2. Use the log-likelihood to compare the models (what do we know if n is large?):

T(x) = log(L1(µML, (σ²)ML)) − log(L0(αML, βML)).

3. Calculate T∗ using the sample data.

4. Simulate m times n random variables from a Gamma distribution and calculate for each sample the value of T.

5. Reject H0 if T∗ is larger than the m · (1 − α) largest value of T in the simulation.
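A sketch of this model-comparison bootstrap, assuming SciPy is available: the gamma ML fit uses `scipy.stats.gamma.fit` with the location fixed at 0, the lognormal ML estimates are the closed-form mean and (ML, ddof = 0) standard deviation of log X, and the function names are my own:

```python
import numpy as np
from scipy import stats

def loglik_diff(x):
    """T(x) = log L1(lognormal ML) - log L0(gamma ML)."""
    a, _, scale = stats.gamma.fit(x, floc=0)   # gamma ML, location fixed at 0
    logx = np.log(x)
    mu, sigma = logx.mean(), logx.std()        # closed-form lognormal ML
    l1 = stats.lognorm.logpdf(x, s=sigma, scale=np.exp(mu)).sum()
    l0 = stats.gamma.logpdf(x, a, loc=0, scale=scale).sum()
    return l1 - l0, (a, scale)

def bootstrap_model_test(x, m=100, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    t_star, (a, scale) = loglik_diff(x)
    # Simulate under H0 (the fitted gamma) and recompute T each time.
    t_sim = np.array([loglik_diff(rng.gamma(a, scale, size=len(x)))[0]
                      for _ in range(m)])
    p_value = (t_sim >= t_star).mean()
    return p_value, p_value <= alpha  # reject H0 when p-value <= alpha
```

For large n, T relates to the usual likelihood-ratio comparison of non-nested models; the bootstrap avoids relying on that asymptotic approximation.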


Page 61: Week 8 Annotated


Bootstrap tests

Advantages:

- Allows a wider variety of tests (i.e., does not require a pivotal test statistic);

- Is “exact”, i.e., no distributional approximations are made.

Disadvantages:

- Time consuming (programming / computer time);

- Depends on simulations: how many / how exact (quantifiable using a non-parametric hypothesis test: see week 9);

- More complicated to find the power of the test.



Page 63: Week 8 Annotated


Summary

Type I error: Pr (Reject H0|H0 is true) = α.

Type II error: Pr (Do not reject H0|H1 is true) = β.

Power: π = 1− β = Pr (Do reject H0|H1 is true).

p-value: for any level of significance α less than the p-value, you would not reject the null hypothesis.

k-sample test: given k independent normally distributed samples, test whether the means are all equal: use the F-statistic.

Jarque-Bera test: test whether a sample is normally distributed (more precisely: whether it is symmetric (skewness = 0) and its excess kurtosis equals zero).

Other testing methods: Conditional tests (using sufficient statistics) & Bootstrap tests.
