Quantitative Introduction to Risk and Uncertainty in Business
Module 5: Hypothesis Testing

M. Vidyasagar

Cecil & Ida Green Chair
The University of Texas at Dallas
Email: [email protected]

October 13, 2012


Outline

1 Hypothesis Testing

2 Hoeffding’s Inequalities

3 K-S Tests for Goodness of Fit

K-S (Kolmogorov-Smirnov) Tests: Objectives

Kolmogorov-Smirnov Tests: Statements

4 Student t Test

5 Chi-Squared Test





Hypothesis Testing: Basic Idea

‘Null’ hypothesis: What we believe in the absence of further evidence, e.g. a two-sided coin is ‘fair’, with heads and tails equally likely.

Think: Null hypothesis = default assumption.

Two kinds of testing:

There is only the null hypothesis, and we accept or reject it.

There is a null as well as an alternate hypothesis, and we choose one or the other.

The second kind of testing is easier: We choose whichever hypothesis is more likely under the data.

The first kind of testing is harder.





Choosing Between Alternatives: Example

We are given a coin. The null hypothesis is that the coin is ‘fair’ with equal probabilities of heads and tails. Call it H0.

The alternative hypothesis is that the coin is ‘biased’ with the probability of heads equal to 0.7. Call it H1.

Suppose we toss the coin 20 times and 12 heads result. Which hypothesis should we accept?





Choosing Between Alternatives: Example (Cont’d)

Let n = 20 (number of coin tosses), k = 12 (number of heads), p0 = 0.5 (probability of heads under hypothesis H0) and p1 = 0.7 (probability of heads under hypothesis H1).

The likelihood of the observed outcome under each hypothesis is computed.

$$L_0 = \binom{20}{12} p_0^{12} (1 - p_0)^{8} = 0.1201,$$

$$L_1 = \binom{20}{12} p_1^{12} (1 - p_1)^{8} = 0.1144.$$

So we accept hypothesis H0, that the coin is fair, but only because the alternative hypothesis is even less likely!
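The two likelihoods can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `binomial_likelihood` is an illustrative choice, not something defined in the lecture.

```python
from math import comb


def binomial_likelihood(n: int, k: int, p: float) -> float:
    """Probability of exactly k heads in n independent tosses with Pr(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, k = 20, 12
L0 = binomial_likelihood(n, k, 0.5)  # likelihood under H0 (fair coin)
L1 = binomial_likelihood(n, k, 0.7)  # likelihood under H1 (biased coin)

# Choose whichever hypothesis makes the observed data more likely.
chosen = "H0" if L0 >= L1 else "H1"
print(f"L0 = {L0:.4f}, L1 = {L1:.4f}, choose {chosen}")
```

The two values are close, which is why the preference for H0 here is weak.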





Connection to MLE

We choose the hypothesis that the coin is fair only because the alternate hypothesis is even more unlikely!

So what is the value of p that maximizes

$$L = \binom{20}{12} p^{12} (1 - p)^{8}?$$

Answer: pMLE = 12/20 = 0.6, the fraction of heads observed.

With MLE (maximum likelihood estimation), we need not choose between two competing hypotheses; MLE gives the most likely values for the parameters!
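One way to see where the value 12/20 comes from is to maximize the logarithm of the likelihood, which has the same maximizer as L itself. This is a short sketch of the standard calculus argument, not a derivation taken from the slides:

```latex
\log L(p) = \log\binom{20}{12} + 12 \log p + 8 \log (1 - p),
\qquad
\frac{d}{dp} \log L(p) = \frac{12}{p} - \frac{8}{1 - p} = 0
\;\Longrightarrow\; 12 (1 - p) = 8 p
\;\Longrightarrow\; p_{\mathrm{MLE}} = \frac{12}{20} = 0.6 .
```

The same steps with k heads in n tosses give the general formula k/n.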





Estimating Probabilities of Binary Outcomes

Suppose an event has only two outcomes, e.g. coin toss. Let p equal the true but unknown probability of ‘success’, e.g. that the coin comes up heads.

After n trials, suppose k successes result. Then p̂ := k/n is called the empirical probability of success. As we have seen, it is also the maximum likelihood estimate of p.

Question: How close is the empirical probability p̂ to the true but unknown probability p?

Hoeffding’s inequalities answer this question.





Hoeffding’s Inequalities: Statements

Let ε > 0 be any specified accuracy. Then

$$\Pr\{\hat{p} - p \ge \epsilon\} \le \exp(-2n\epsilon^2).$$

$$\Pr\{\hat{p} - p \le -\epsilon\} \le \exp(-2n\epsilon^2).$$

$$\Pr\{|\hat{p} - p| \le \epsilon\} \ge 1 - 2\exp(-2n\epsilon^2).$$





Hoeffding’s Inequalities: Interpretation

Interpretations of Hoeffding’s inequalities:

With confidence 1 − 2 exp(−2nε²), we can say that the true but unknown probability p lies in the interval (p̂ − ε, p̂ + ε). As we increase ε, the term δ := 2 exp(−2nε²) decreases, and we can be more sure of our (wider) interval.

The widely used 95% confidence interval corresponds to δ = 0.05.

The one-sided inequalities have similar interpretations.





An Example of Applying Hoeffding’s Inequality

Suppose we toss a coin 1000 times and it comes up heads 552 times. How sure can we be that the coin is biased?

n = 1000, k = 552, p̂ = 0.552. If p > 0.5 then we can say that the coin is biased. So let ε = p̂ − 0.5 = 0.052. Compute

$$\delta = \exp(-2n\epsilon^2) = 0.0045.$$

So with confidence 1 − δ = 0.9955, we can say that p > 0.5. In other words, we can be 99.55% sure that the coin is biased. Using the two-sided Hoeffding inequality, we can be 99.1% sure that p ∈ (0.5, 0.604).
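The arithmetic in this example is easy to reproduce. Below is a minimal sketch in standard-library Python; the helper names `hoeffding_one_sided` and `hoeffding_two_sided` are illustrative and not part of the lecture.

```python
from math import exp


def hoeffding_one_sided(n: int, eps: float) -> float:
    """Upper bound exp(-2 n eps^2) on Pr{p_hat - p >= eps}."""
    return exp(-2 * n * eps**2)


def hoeffding_two_sided(n: int, eps: float) -> float:
    """Upper bound 2 exp(-2 n eps^2) on Pr{|p_hat - p| >= eps}."""
    return 2 * hoeffding_one_sided(n, eps)


n, k = 1000, 552
p_hat = k / n
eps = p_hat - 0.5  # margin by which the estimate exceeds a fair coin

delta_one = hoeffding_one_sided(n, eps)
delta_two = hoeffding_two_sided(n, eps)

print(f"one-sided: delta = {delta_one:.4f}, confidence = {1 - delta_one:.4f}")
print(f"two-sided: delta = {delta_two:.4f}, "
      f"interval = ({p_hat - eps:.3f}, {p_hat + eps:.3f})")
```

The same functions apply to any binary-outcome sample, for example a poll with n respondents and a margin ε over 0.5.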





Another Example

An opinion poll of 750 voters (ignoring ‘don’t know’s) shows that 387 will vote for candidate A and 363 will vote for candidate B. How sure can we be that candidate A will win?

Let p denote the true but unknown fraction of voters who will vote for A, and p̂ = 387/750 = 0.5160 denote the empirical estimate of p. If p < 0.5 then A will lose. So the accuracy ε = 0.0160, and the number of samples n = 750. The one-sided bound is

$$\delta = \exp(-2n\epsilon^2) = 0.6811.$$

So we can be only 1 − δ ≈ 32% sure that A will win. In other words, the election cannot be ‘called’ with any confidence based on such a small margin of preference.





Relating Confidence, Accuracy and Number of Samples

For the two-sided Hoeffding inequality, the confidence parameter δ associated with n samples and accuracy ε is given by

$$\delta = 2\exp(-2n\epsilon^2).$$

We can turn this around and ask: Given an empirical estimate p̂ based on n samples, what is the accuracy corresponding to a given confidence parameter δ?

Solving the above equation for ε in terms of δ and n gives

$$\epsilon(n, \delta) = \left( \frac{1}{2n} \log \frac{2}{\delta} \right)^{1/2}.$$

So with confidence 1 − δ we can say that the true but unknown probability p is in the interval [p̂ − ε(n, δ), p̂ + ε(n, δ)].
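The relation between δ, ε and n can be solved for any one of the three quantities. The sketch below, in standard-library Python, computes the accuracy for a given n and δ, and also the number of samples needed for a given ε and δ. The second function is the same equation solved for n, which is an addition here rather than something stated on the slide; both function names are illustrative.

```python
from math import ceil, log, sqrt


def accuracy(n: int, delta: float) -> float:
    """epsilon(n, delta) = sqrt(log(2/delta) / (2n)), two-sided Hoeffding."""
    return sqrt(log(2 / delta) / (2 * n))


def samples_needed(eps: float, delta: float) -> int:
    """Smallest n with 2 exp(-2 n eps^2) <= delta."""
    return ceil(log(2 / delta) / (2 * eps**2))


eps_1000 = accuracy(1000, 0.05)       # half-width of the 95% interval for n = 1000
n_for_1pct = samples_needed(0.01, 0.05)  # samples for accuracy 0.01 at 95%

print(f"epsilon(1000, 0.05) = {eps_1000:.4f}")
print(f"n needed for eps = 0.01, delta = 0.05: {n_for_1pct}")
```

Note that halving ε multiplies the required number of samples by four, since n scales as 1/ε².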





Hoeffding’s Inequalities for More Than Two Outcomes

Suppose a random experiment has more than two possible outcomes (e.g. rolling a six-sided die). Say there are k outcomes, and in n trials, the i-th outcome appears n_i times (and of course $\sum_{i=1}^{k} n_i = n$).

We can define

$$\hat{p}_i = \frac{n_i}{n}, \quad i = 1, \ldots, k,$$

and as we have seen, these are the maximum likelihood estimates for each probability.

Question: How good are these estimates?





More Than Two Outcomes – 2

Fact: For any sample size n and any accuracy ε, it is the case that

$$\Pr\{\max_i |\hat{p}_i - p_i| > \epsilon\} \le 2k \exp(-2n\epsilon^2).$$

So with confidence 1 − 2k exp(−2nε²), we can assert that every empirical probability p̂i is within ε of the correct value.
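The factor k can be understood through the union bound: the maximum deviation exceeds ε only if at least one of the k individual deviations does, and each individual deviation obeys the two-sided Hoeffding inequality. The slides state the fact without proof, so the following is a sketch of one standard argument:

```latex
\Pr\Big\{ \max_i |\hat{p}_i - p_i| > \epsilon \Big\}
= \Pr\Big\{ \bigcup_{i=1}^{k} \{ |\hat{p}_i - p_i| > \epsilon \} \Big\}
\le \sum_{i=1}^{k} \Pr\{ |\hat{p}_i - p_i| > \epsilon \}
\le \sum_{i=1}^{k} 2 \exp(-2 n \epsilon^2)
= 2k \exp(-2 n \epsilon^2).
```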





More Than Two Outcomes: Example

Suppose we roll a six-sided die 1,000 times and observe the outcomes 1 through 6 with the following empirical frequencies:

p̂1 = 0.169, p̂2 = 0.165, p̂3 = 0.166, p̂4 = 0.165, p̂5 = 0.167, p̂6 = 0.168.

With what confidence can we say that the die is not fair, that is, that pi ≠ 1/6 for some i?





More Than Two Outcomes: Example (Cont’d)

Suppose that indeed the true probability is pi = 1/6 for all i. Then

$$\max_i |\hat{p}_i - p_i| = |\hat{p}_1 - 1/6| \approx 0.00233.$$

Take ε = 0.00233, n = 1000 and compute

$$\delta = 6 \times 2 \exp(-2n\epsilon^2) \approx 11.87!$$

How can a ‘probability’ be greater than one?

Note: This δ is just an upper bound for Pr{max_i |p̂i − pi| > ε}; so it can be larger than one.

So we cannot rule out the possibility that the die is fair (which is quite different from saying that it is fair).
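The bound for the die example can be recomputed directly from the six empirical frequencies. This is a minimal standard-library sketch; the variable names are illustrative.

```python
from math import exp

p_hat = [0.169, 0.165, 0.166, 0.165, 0.167, 0.168]  # empirical frequencies
n = 1000          # number of rolls
k = len(p_hat)    # number of outcomes

# Largest deviation from the fair-die probability 1/6.
eps = max(abs(p - 1 / k) for p in p_hat)

# Multi-outcome Hoeffding bound on Pr{max_i |p_hat_i - p_i| > eps}.
bound = 2 * k * exp(-2 * n * eps**2)

print(f"eps = {eps:.5f}, bound = {bound:.2f}")
if bound >= 1:
    print("The bound is vacuous: the data cannot rule out a fair die.")
```

A bound above one carries no information, which is the situation whenever the observed deviations are small relative to 1/sqrt(n).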






K-S Tests: Problem Formulations

There are two widely used tests. They should be called the Kolmogorov test and the Smirnov test, respectively. Unfortunately the erroneous names ‘one-sample K-S test’ and ‘two-sample K-S test’ have become popular.

Kolmogorov Test, or One-Sample K-S Test: We have a set of samples, and we have a candidate probability distribution. Question: How well does the distribution fit the set of samples?

Smirnov Test, or Two-Sample K-S Test: We have two sets of samples, say x1, . . . , xn and y1, . . . , ym. Question: How sure are we that both sets of samples came from the same (but unknown) distribution?






Empirical Distributions

Suppose X is a random variable for which we have generated n i.i.d. samples, call them x1, . . . , xn.

Then we define the empirical distribution of X, based on these observations, as follows:

$$\hat{\Phi}(a) = \frac{1}{n} \sum_{i=1}^{n} I_{\{x_i \le a\}},$$

where I denotes the indicator function: I = 1 if the condition in the subscript is satisfied and I = 0 otherwise.

So in this case Φ̂(a) is just the fraction of the n samples that are ≤ a. The diagram on the next slide illustrates this.
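The definition translates almost directly into code. The sketch below builds the empirical distribution as a Python function; sorting the samples once and using binary search is an implementation choice made here for efficiency, and the name `empirical_cdf` is illustrative.

```python
from bisect import bisect_right


def empirical_cdf(samples):
    """Return a function a -> fraction of samples that are <= a."""
    ordered = sorted(samples)
    n = len(ordered)

    def phi_hat(a: float) -> float:
        # bisect_right counts how many sorted samples are <= a.
        return bisect_right(ordered, a) / n

    return phi_hat


phi_hat = empirical_cdf([2.0, 0.5, 1.5, 0.5])  # samples may arrive in any order
for a in (0.4, 0.5, 1.5, 3.0):
    print(f"phi_hat({a}) = {phi_hat(a)}")
```

The resulting function is a staircase that jumps by 1/n at each sample (or by a multiple of 1/n where samples repeat).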






Empirical Distribution Depicted

Note: The diagram shows the samples occurring in increasing order, but they can be in any order. [1]

[1] Source: http://www.aiaccess.net/English/Glossaries/GlosMod/e gm distribution function.htm






Glivenko-Cantelli Lemma

Theorem: As n → ∞, the empirical distribution Φ̂(·) approaches the true distribution Φ(·).

Specifically, if we define the Kolmogorov-Smirnov distance

$$d_n = \max_u |\hat{\Phi}(u) - \Phi(u)|,$$

then dn → 0 as n → ∞.

At what rate does the convergence take place?






One-Sample Kolmogorov-Smirnov Statistic

Fix a ‘confidence level’ δ > 0 (usually δ is taken as 0.05 or 0.02). Define the threshold

$$\theta(n, \delta) = \left( \frac{1}{2n} \log \frac{2}{\delta} \right)^{1/2}.$$

Then with probability at least 1 − δ, we can say that

$$\max_u |\hat{\Phi}(u) - \Phi(u)| =: d_n \le \theta(n, \delta).$$






One-Sample Kolmogorov-Smirnov Test

Given samples x1, . . . , xn, fit them with some distribution F(·) (e.g. Gaussian). Compute the K-S statistic

$$d_n = \max_u |\hat{\Phi}(u) - F(u)|.$$

Compare dn with the threshold θ(n, δ). If dn > θ(n, δ), we ‘reject the null hypothesis’ at level δ. In other words, if dn > θ(n, δ), then we are 1 − δ sure that the data was not generated by the distribution F(·).
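The test procedure can be sketched in a few lines of standard-library Python. The code below assumes a continuous candidate distribution F, in which case the maximum deviation occurs at a sample point, just before or just after a jump of the empirical distribution. The function names, the synthetic Gaussian data and the random seed are all illustrative choices, not part of the lecture.

```python
import random
from math import log, sqrt
from statistics import NormalDist


def ks_threshold(n: int, delta: float) -> float:
    """theta(n, delta) = sqrt(log(2/delta) / (2n))."""
    return sqrt(log(2 / delta) / (2 * n))


def ks_statistic(samples, cdf) -> float:
    """d_n = max_u |empirical CDF(u) - cdf(u)| for a continuous cdf."""
    ordered = sorted(samples)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i-1)/n to i/n at the i-th sorted sample.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_test(samples, cdf, delta: float = 0.05):
    """Return (d_n, threshold, reject) for the one-sample test."""
    d_n = ks_statistic(samples, cdf)
    theta = ks_threshold(len(samples), delta)
    return d_n, theta, d_n > theta


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.gauss(0.0, 1.0) for _ in range(500)]

# Candidate 1: the distribution the data was drawn from.
d_good, theta, reject_good = ks_test(samples, NormalDist(0.0, 1.0).cdf)
# Candidate 2: a Gaussian with the wrong mean.
d_bad, _, reject_bad = ks_test(samples, NormalDist(1.0, 1.0).cdf)

print(f"threshold = {theta:.4f}")
print(f"N(0,1): d_n = {d_good:.4f}, reject = {reject_good}")
print(f"N(1,1): d_n = {d_bad:.4f}, reject = {reject_bad}")
```

In practice the candidate F is often fitted to the same samples (for example by using the sample mean and standard deviation), which makes the test somewhat more lenient than the threshold suggests.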





Student t Test: Motivation

The Student t test is used to test the null hypothesis that two sets of samples have the same mean, assuming that they have the same variance. The test has broad applicability even if the assumption of ‘same variance’ is not satisfied.

Problem: We are given two sets of samples x1, . . . , xm1 and xm1+1, . . . , xm1+m2. Determine whether the two sets of samples arise from distributions with the same mean.

Application: Most commonly used in quality control.





Student t Test: Theory

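Let x̄1, x̄2 denote the means of the two sample classes, that is,

$$\bar{x}_1 = \frac{1}{m_1} \sum_{i=1}^{m_1} x_i, \quad \bar{x}_2 = \frac{1}{m_2} \sum_{i=1}^{m_2} x_{m_1+i}.$$

Let S1, S2 denote the sample standard deviations of the two samples, defined through the unbiased variance estimates, that is,

$$S_1^2 = \frac{1}{m_1 - 1} \sum_{i=1}^{m_1} (x_i - \bar{x}_1)^2,$$

$$S_2^2 = \frac{1}{m_2 - 1} \sum_{i=1}^{m_2} (x_{m_1+i} - \bar{x}_2)^2.$$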





Student t Test: Theory – 2

Now define the ‘pooled’ standard deviation S12 by

$$S_{12}^2 = \frac{(m_1 - 1) S_1^2 + (m_2 - 1) S_2^2}{m_1 + m_2 - 2}.$$

Then the quantity

$$d_t = \frac{\bar{x}_1 - \bar{x}_2}{S_{12} \sqrt{(1/m_1) + (1/m_2)}}$$

follows the t distribution with m1 + m2 − 2 ‘degrees of freedom.’

As the number of d.o.f. becomes large, the t distribution approaches the normal distribution. The next slide shows the density of the t distribution for various d.o.f.
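The test statistic can be computed as follows. This is a minimal sketch in standard-library Python; `statistics.variance` uses the divisor m − 1, matching the unbiased estimates above. The function name and the two small data sets are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(xs, ys):
    """Return (d_t, degrees of freedom) for two samples, pooled variance."""
    m1, m2 = len(xs), len(ys)
    s1_sq = variance(xs)  # unbiased estimate, divisor m1 - 1
    s2_sq = variance(ys)  # unbiased estimate, divisor m2 - 1
    s12_sq = ((m1 - 1) * s1_sq + (m2 - 1) * s2_sq) / (m1 + m2 - 2)
    d_t = (mean(xs) - mean(ys)) / (sqrt(s12_sq) * sqrt(1 / m1 + 1 / m2))
    return d_t, m1 + m2 - 2


xs = [1.0, 2.0, 3.0, 4.0, 5.0]  # illustrative data, first sample
ys = [2.0, 3.0, 4.0, 5.0, 6.0]  # illustrative data, second sample
d_t, dof = pooled_t_statistic(xs, ys)
print(f"d_t = {d_t:.4f} with {dof} degrees of freedom")
```

To complete the test, |d_t| would be compared with a quantile of the t distribution with the given degrees of freedom, taken from a table or a statistics library.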





Density of the t Distribution





Chi-Squared Test: Motivation

The t test is to determine whether two samples have the same mean. The chi-squared test is to determine whether two samples have the same variance.

The application is again to quality control.





Chi-Squared Test: Theory

Given two sets of samples, say x1, . . . , xm1 and xm1+1, . . . , xm1+m2 (where usually m2 ≪ m1), compute the unbiased variance estimate V1 of the larger (first) sample

$$V_1 = \frac{1}{m_1 - 1} \sum_{i=1}^{m_1} (x_i - \bar{x}_1)^2,$$

and the sum of squares of the smaller (second) sample

$$S_2 = \sum_{i=1}^{m_2} (x_{m_1+i} - \bar{x}_2)^2 = (m_2 - 1) V_2,$$

where V2 is the unbiased variance estimate of the second sample.

Then the ratio S2/V1 follows the chi-squared (or χ²) distribution with m2 − 1 degrees of freedom.
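The ratio S2/V1 is straightforward to compute. The sketch below uses only the Python standard library; the function name and the two small data sets are invented for illustration, and the samples are far smaller than would be used in practice.

```python
from statistics import mean, variance


def chi_squared_statistic(large, small):
    """Return (S2 / V1, degrees of freedom) for the variance comparison."""
    v1 = variance(large)  # unbiased variance estimate of the larger sample
    x_bar2 = mean(small)
    s2 = sum((x - x_bar2) ** 2 for x in small)  # sum of squares, smaller sample
    return s2 / v1, len(small) - 1


large = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]  # illustrative data
small = [2.0, 4.0, 6.0]                                # illustrative data
stat, dof = chi_squared_statistic(large, small)
print(f"S2/V1 = {stat:.4f} with {dof} degrees of freedom")
```

The statistic would then be compared with lower and upper quantiles of the chi-squared distribution with the given degrees of freedom, obtained from a table or a statistics library.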





Distribution Function of the Chi-Squared Variable





Density Function of the Chi-Squared Variable





Application of the Chi-Squared Test

Note that the χ² r.v. is always nonnegative. So, given some confidence δ (usually δ = 0.05), we need to determine a confidence interval

$$x_l = \Phi^{-1}_{\chi^2, m_2-1}(\delta), \quad x_u = \Phi^{-1}_{\chi^2, m_2-1}(1 - \delta).$$

If the test statistic S2/V1 lies in the interval [xl, xu], then we accept the null hypothesis that both samples have the same variance.

