23 testing

23
Hadley Wickham Stat310 Hypothesis testing Wednesday, 21 April 2010

Transcript of 23 testing

Page 1: 23 testing

Hadley Wickham

Stat310Hypothesis testing

Wednesday, 21 April 2010

Page 2: 23 testing

1. Quick review

2. More about the interval on the sampling distribution

3. More complicated example

4. Introduction to hypothesis testing (aka the statistical justice system)

Wednesday, 21 April 2010

Page 3: 23 testing

ReviewIdentify distribution that connects estimator and true value.

Form confidence interval for known (sampling) distribution.

Write as probability statement.

Back transform.

Write as interval.

Wednesday, 21 April 2010

Page 4: 23 testing

StepsIdentify distribution that connects estimator and true value.

Form confidence interval for known (sampling) distribution.

Write as probability statement.

Back transform.

Write as interval.

Wednesday, 21 April 2010

Page 5: 23 testing

StepsWant P(a < Q < b) = 1 - α, and b - a to be as small as possible.

If Q is symmetric, P(-a < Q < a) = 1 - α. So a = F -1(α/2), and there is no interval smaller.

If Q isn’t symmetric, pick a = F -1(α/2), b = F -1(1 - α/2), but there might be a shorter interval.

Wednesday, 21 April 2010

Page 6: 23 testing

Example

We want a 90% confidence interval, then two possible ends for the interval are F-1(0.05) and F -1(0.95)

Wednesday, 21 April 2010

Page 7: 23 testing

VarianceXi iid Normal(5, σ2), i = 1, ..., 10

I ran the experiment and recorded the following results: 0.15 1.48 1.25 2.47 1.09 1.95 1.46 1.49 2.81 1.96 (var = 0.55)

Find a 95% confidence interval for the variance. (Hint: If X~χ2(9), FX(19) = 0.975 and FX(2.7) = 0.025).

EC: Can you also make a confidence interval for the standard deviation?

Wednesday, 21 April 2010

Page 8: 23 testing

Rest of chapter

The rest of chapter 6 just gives examples of the general method we learned this week. Any of it is fair game in the final.

But don’t memorise specifics: learn the general steps!

Wednesday, 21 April 2010

Page 9: 23 testing

So far

Given Xi iid SomeDist(a, b), and data, we can give estimates (including uncertainty) for a and b.

Next: We’ll learn how to answer questions like is a > 0, or is b in [5, 10].

Wednesday, 21 April 2010

Page 10: 23 testing

Your turn

The following values have generated from iid Normal(μ, 1):

2.9 2.1 3.0 3.2 1.2 3.0 3.3 1.2 2.3 1.5 (mean: 2.13)

Is it possible they came from a normal distribution with mean 1.5?

Wednesday, 21 April 2010

Page 11: 23 testing

http://www.flickr.com/photos/joegratz/117048243

Hypothesis testingThe statistical justice system

Wednesday, 21 April 2010

Page 12: 23 testing

A suspect is accused of a crime. The suspect is declared guilty or innocent based on a trial. Each trial has a defence and a prosecution. On the basis of how evidence compares to a standard, the judge makes a decision to convict or acquit.

Wednesday, 21 April 2010

Page 13: 23 testing

A dataset is accused of having a particular parameter value. The data is declared guilty or innocent based on the results of a statistical test. Each test has a null hypothesis and an alternative hypothesis. On the basis of how a test statistic compares to a standard distribution, we make the decision to reject the null or fail to reject the null hypothesis.

Wednesday, 21 April 2010

Page 14: 23 testing

Your turn

With a partner, write down the criminal equivalents to the following terms:

null hypothesis, alternative hypothesis, test statistic, test distribution, reject the null hypothesis, fail to reject the null hypothesis

Wednesday, 21 April 2010

Page 15: 23 testing

Hypthoses

Every case has two sides:

The null hypothesis (the defence). This is the default, or the status quo, what you need to argue against.

The alternative hypothesis (the prosecution). This is the interesting case, but you need to prove it’s true.

Wednesday, 21 April 2010

Page 16: 23 testing

In the statistical justice system evidence is based on the similarity between the accused and known innocents.

The population of innocents, called the null distribution, is generated by the combination of null hypothesis and test statistic.

To determine the guilt of the accused we compute the proportion of innocents who look more guilty than the accused. This is the p-value, the probability that the accused would look this guilty if they actually were innocent.

Wednesday, 21 April 2010

Page 17: 23 testing

The lady tasting tea

A thought experiment by R. A. Fisher (famous early statistician, 1890-1962)

A lady at a tea party claims that she can tell the difference between putting the milk in first and second.

How can we be sure?

Wednesday, 21 April 2010

Page 18: 23 testing

Experiment8 cups. 4 milk first, 4 milk second. Presented in random order.

What is the null hypothesis? (What is the position of the defence?)

What test-statistic might we use? (What can we measure to assess evidence?)

What is the null-distribution?(What do innocents look like?)

Wednesday, 21 April 2010

Page 19: 23 testing

Right Wrong # %

4 0 1 1%

3 1 16 23%

2 2 36 51%

1 3 16 23%

0 4 1 1%

70 100%

http://www.uwsp.edu/education/wkirby/t&m/5LadyTastingTea.htm

Wednesday, 21 April 2010

Page 20: 23 testing

1. Write down null and alternative hypotheses (positions of defence and prosecution)

2. Figure out good test statistic (what numeric summary captures useful evidence)

3. Work out null distribution (distribution of innocents)

4. Calculate p-value by comparing actual value to null distribution (what proportion of true innocents look more guilty than the suspect)

Wednesday, 21 April 2010

Page 21: 23 testing

Your turn

The following values have generated from iid Normal(μ, 1):

2.9 2.1 3.0 3.2 1.2 3.0 3.3 1.2 2.3 1.5 (mean: 2.13)

Does μ = 1.5 ?

Wednesday, 21 April 2010

Page 22: 23 testing

Wednesday, 21 April 2010

Page 23: 23 testing

Reading

Make sure you are familiar with the similarities and differences between the criminal and statistical justice systems.

Read Sections 7.1 and 7.41 for a classical take on the issue. Can you give the errors names to match the statistical justice system?

Wednesday, 21 April 2010