Today’s lesson (Chapter 13)
• variance(W-Y)
• Two sample (W, Y) standard normal test
• Confidence intervals for E(W-Y)
• Determining the sample size needed to have a specified probability of a Type II error and probability of a Type I error in a two sample test.
Var(W-Y)
• Definition: $\mathrm{Variance}(Y)=E[(Y-EY)^2]$
• Fact: Expectation is a linear operator.
• $\mathrm{Variance}(W-Y)=E[(W-Y-E(W-Y))^2]$
• $\mathrm{Variance}(W-Y)=E[(W-EW-(Y-EY))^2]$
• Expand right term out using high school algebra.
Var(W-Y)
• $E[(W-EW-(Y-EY))^2]=$
• $E[(W-EW)^2+(Y-EY)^2-2(W-EW)(Y-EY)]=$
• $E[(W-EW)^2]+E[(Y-EY)^2]-2E[(W-EW)(Y-EY)]=$
• $\mathrm{var}(W)+\mathrm{var}(Y)-2\,\mathrm{cov}(W,Y)$.
New Facts
• New Definition
– cov(W,Y)=cov(Y,W)=E[(W-EW)(Y-EY)]
– cov(W,Y) is the numerator of the correlation coefficient of W and Y
• New Fact
– var(W-Y)=var(W)+var(Y)-2cov(W,Y)
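The new fact can be checked numerically. The following is a minimal sketch, not part of the original slides: the data values and the helper names `mean`, `cov`, and `var` are made up for illustration, and population (divide-by-N) moments are assumed so that the identity holds exactly up to floating-point rounding.

```python
# Numerical check of var(W-Y) = var(W) + var(Y) - 2 cov(W,Y).
# The data below are hypothetical; any two lists of equal length work.

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Population covariance E[(X-EX)(Y-EY)] of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def var(xs):
    """Population variance, computed as cov(X, X)."""
    return cov(xs, xs)

w = [2.0, 4.0, 4.5, 7.0, 9.5]
y = [1.0, 3.5, 2.0, 8.0, 6.5]
diff = [a - b for a, b in zip(w, y)]

lhs = var(diff)
rhs = var(w) + var(y) - 2 * cov(w, y)
print(lhs, rhs)  # the two values agree up to rounding error
```

If W and Y were independent, cov(W,Y) would be 0 in expectation and the last term would drop out, which is the case used in the two sample problem.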
Two Sample Testing Problem
• Research team has two samples, n observations from W and m observations from Y.
• ASS-U-ME
– W sample and Y sample are independent
– W is normally distributed with mean E(W) and variance $\sigma_W^2$
– Y is normally distributed with mean E(Y) and variance $\sigma_Y^2$
Two Sample Testing Problem
• Null hypothesis: E(W)=E(Y)
– New parameter: E(W)-E(Y)
• Alternative hypothesis may be left-sided, right-sided, or two-sided.
• Test statistic:
– W mean (of n observations) - Y mean (of m observations)
Distribution of Test Statistic
• Distribution of W mean - Y mean
– either normal or approximately normal
– expected value is E(W-Y)
– variance is $(\sigma_W^2/n)+(\sigma_Y^2/m)$
• Under null hypothesis, expected value of difference of means is 0.
Deriving standard error of difference of two means
• var(W mean - Y mean) = var(W mean) + var(Y mean) - 2cov(W mean, Y mean)
• W sample is assumed to be independent of Y sample, so covariance of means is 0.
• var(W mean)=$\sigma_W^2/n$, where n is the number in the W sample.
• var(Y mean)=$\sigma_Y^2/m$, where m is the number in the Y sample.
Variances Known
• Find null distribution of test statistic using known variances and sample sizes.
• Standardize the test statistic, which is always the difference of the two sample means.
• Follow standard decision sequence.
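The known-variance procedure can be sketched in a few lines. This is an illustrative example only: the function name `two_sample_z` and the numbers passed to it are hypothetical and do not come from the lesson. It standardizes the difference of the two sample means using the known variances and sample sizes and reports a two-sided p-value.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(w_mean, y_mean, sigma_w, sigma_y, n, m):
    """Return (z, two-sided p-value) for H0: E(W) = E(Y), variances known."""
    # Standard error of (W mean - Y mean) for independent samples.
    se = sqrt(sigma_w**2 / n + sigma_y**2 / m)
    # Standardize: (observed difference - null value 0) / standard error.
    z = (w_mean - y_mean - 0.0) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

# Hypothetical inputs chosen so that the standard error is exactly 1.
z, p = two_sample_z(w_mean=10.0, y_mean=9.0, sigma_w=2.0, sigma_y=2.0, n=8, m=8)
print(round(z, 3), round(p, 4))
```

For a left-sided or right-sided alternative, the p-value would instead be the lower or upper tail area at z.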
Variances Unknown
• This is a Student’s t problem.
• Two possibilities
– ASS-U-ME var(W)=var(Y) is reasonable; this is the classic two independent sample t-test that is usually covered in the prerequisite class.
– Assumption var(W)=var(Y) is not reasonable; use unequal variance t-test.
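For reference, the two statistics are usually written as follows. These are the standard textbook forms rather than formulas stated on the slides; the symbols are assumptions introduced here: $\bar{W}$ and $\bar{Y}$ are the sample means, $s_W^2$ and $s_Y^2$ the sample variances, $s_p^2$ the pooled variance, and $\nu$ the approximate (Welch-Satterthwaite) degrees of freedom of the unequal variance test.

```latex
% Equal variance (pooled) t statistic, with n + m - 2 degrees of freedom
s_p^2 = \frac{(n-1)s_W^2 + (m-1)s_Y^2}{n+m-2}, \qquad
t = \frac{\bar{W}-\bar{Y}}{s_p\sqrt{\frac{1}{n}+\frac{1}{m}}}

% Unequal variance (Welch) t statistic, with approximate degrees of freedom \nu
t' = \frac{\bar{W}-\bar{Y}}{\sqrt{\frac{s_W^2}{n}+\frac{s_Y^2}{m}}}, \qquad
\nu \approx \frac{\left(\frac{s_W^2}{n}+\frac{s_Y^2}{m}\right)^2}
{\frac{(s_W^2/n)^2}{n-1}+\frac{(s_Y^2/m)^2}{m-1}}
```

Both denominators estimate the same standard error of the difference of means; they differ only in how the two sample variances are combined.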
Checking Assumption of Equal Variances
• Use SPSS
– statistics, compare means, independent sample t-test.
• SPSS uses Levene’s test for equality of variances.
– Sig. means p-value of Levene’s test
– Use it as you would any observed significance level.
Choosing Equal or Unequal Variance t-test
• Some statistics professors always want equal variance t-test. Answer their questions with the equal variance t-test. This typically includes Actuary Society questions.
• In AMS315 life, I will tell you which test to use (use the equal variance t-test if there is no specification).
Choosing Equal or Unequal Variance t-test
• In real life, I ALWAYS use the unequal variance t-test.
• Some people choose the unequal variance t-test if the p-value for Levene’s test of the equality of variances is very small.
Example Problem Group I
• I present you with a computer output on the comparison of average irresponsibility at time 5 for subjects who did not use marijuana at time 3 to the average irresponsibility at time 5 for subjects who did use marijuana at time 3.
Example Problem Group I
• You register that this is an A vs. B comparison, with the A group being those who did not use marijuana at time 3 and the B group being those who did use marijuana at time 3. The dependent variable is irresponsibility at time 5.
Example Problem Group I
• Reading the output, you learn that there were 215 subjects who did not use marijuana at time 3 and that their average irresponsibility was 10.7860, with a standard deviation of 2.5779. There were 151 subjects who did use marijuana at time 3, and their average irresponsibility was 10.8411. The standard deviation was 2.3526.
Example Problem Group I
• Levene’s test for the equality of variances had sig.=0.206.
• The 2-tailed sig for the equal variance test was 0.835, for the unequal variance test was 0.833.
First Problem
• Which of the following conclusions is correct about the test of the null hypothesis that expected irresponsibility at time 5 for a subject who did not use marijuana at time 3=expected irresponsibility at time 5 for a subject who did use marijuana at time 3
First Problem Continued
• against the alternative hypothesis that expected irresponsibility at time 5 for a subject who did not use marijuana at time 3 was not equal to expected irresponsibility at time 5 for a subject who did use marijuana at time 3?
• Usual options.
Solution
• Both p-values were approximately equal and were large (0.8).
• Hence, the correct decision is to accept at the 0.10 level of significance (last option).
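The mapping from an observed significance level to the usual options can be sketched as a small function. This is only an illustration: the name `usual_option` is made up, the option wording follows the list given later in this lesson, and the convention that a p-value exactly equal to the level counts as "accept" is an assumption.

```python
def usual_option(p_value):
    """Map a p-value to one of the four usual options."""
    if p_value < 0.01:
        return "reject at 0.01"
    if p_value < 0.05:
        return "accept at 0.01 and reject at 0.05"
    if p_value < 0.10:
        return "accept at 0.05 and reject at 0.10"
    return "accept at 0.10"

# The two 2-tailed sig values reported in the output.
for sig in (0.835, 0.833):
    print(sig, "->", usual_option(sig))
```

Both reported p-values fall in the last category, which is why the choice between the equal and unequal variance tests does not change the decision here.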
Second Problem
• What is the correct decision in the following? The null hypothesis is: Expected irresponsibility at time 5 for a subject who did not use marijuana at time 3 - expected irresponsibility at time 5 for a subject who did use marijuana at time 3 = 0, alpha=0.05;
Second Problem Continued
• the alternative is expected irresponsibility at time 5 for a subject who did not use marijuana at time 3 - expected irresponsibility at time 5 for a subject who did use marijuana at time 3 is not equal to 0.
Solution to Second Problem
• Read your output to find that the mean difference in the two groups was -5.50E-02.
– -5.50E-02 is a notation for $-5.50\times10^{-2}=-0.0550$.
• Check means of groups to confirm that the mean difference is that of irresponsibility for subjects who did not use marijuana at time 3 - irresponsibility for subjects who did use (10.7860-10.8411=-0.0551, which agrees with -0.0550 up to rounding of the reported means).
Solution to Second Problem
• Check that this order is the same as the order that I asked you about.
• Read your computer output to find that the 95 percent confidence interval for the mean difference is -0.5744 to 0.4644.
• Check whether or not the value 0 specified in the problem is in the confidence interval.
• It is, so accept the null hypothesis.
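The confidence interval check can be written out as a short sketch. The variable names are illustrative; the interval endpoints are the ones read from the output. As a sanity check on the reading, the midpoint of the interval should reproduce the reported mean difference.

```python
ci_low, ci_high = -0.5744, 0.4644   # 95 percent CI read from the output
null_value = 0.0                    # value specified by the null hypothesis

# The midpoint of a t-based interval is the point estimate of the difference.
midpoint = (ci_low + ci_high) / 2

# Two-sided test at alpha = 0.05: accept when the null value lies in the 95% CI.
decision = "accept" if ci_low <= null_value <= ci_high else "reject"
print(round(midpoint, 4), decision)
```

This works because a 95 percent confidence interval contains exactly those null values that a two-sided test at alpha=0.05 would accept.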
Third Problem
• What is the value of the t-test assuming equal variances?
Solution to the Third Problem
• The computer output has the mean difference (-5.50E-02).
• The computer output has the standard error of the mean difference (0.2641, on the line labeled “Equal variances assumed”)
• t-test is the standard score value of the test statistic (-0.0550-0)/0.2641=-0.21.
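The standard error on the "Equal variances assumed" line can be reproduced from the group sizes, means, and standard deviations in the output. The sketch below assumes the usual pooled-variance formula (which the slides do not spell out) and uses illustrative variable names.

```python
from math import sqrt

# Summary statistics read from the output (W = non-users, Y = users at time 3).
n, w_mean, w_sd = 215, 10.7860, 2.5779
m, y_mean, y_sd = 151, 10.8411, 2.3526

# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n - 1) * w_sd**2 + (m - 1) * y_sd**2) / (n + m - 2)

# Standard error of the difference of means under equal variances.
se_equal = sqrt(pooled_var) * sqrt(1 / n + 1 / m)

# Standard score of the test statistic under the null value 0.
t_equal = (w_mean - y_mean - 0) / se_equal
print(round(se_equal, 4), round(t_equal, 2))
```

The computed standard error agrees with the 0.2641 in the output to four decimal places, and the t value rounds to -0.21.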
Fourth Problem
• How many degrees of freedom does the equal variance independent sample t-test have in this problem?
Solution
• Read the computer output to find that there were 215 subjects who did not use marijuana at time 3.
• There were 151 subjects who did use marijuana at time 3.
• The number of degrees of freedom is n+m-2 in general.
• Here, 215+151-2=364.
Fifth Problem
• Which of the following is a correct decision about the test of the null hypothesis that variance of irresponsibility at time 5 for a subject who did not use marijuana at time 3 is equal to the variance of irresponsibility at time 5 for a subject who did use marijuana against the alternative that these two variances are not equal? Usual options.
Solution
• Read the computer output to find that the sig of Levene’s test is 0.206.
• This is larger than 0.10, the level of significance in the last option.
• The answer is to accept at the 0.10 level (choose the last option).
Example Problem Group II
• Each patient in a study will take a specified medicine, and the patient’s response to that medicine will be measured. Twenty patients will be randomly assigned to two groups of ten each.
Example Problem Group II
• Group 1 will receive an experimental medicine. The random variable X denotes a patient’s response to the experimental medicine and is normally distributed with unknown expected value E(X) and unknown standard deviation σ.
Example Problem Group II
• Group 2 will receive the best available medicine. The random variable B denotes a patient’s response to the best available medicine and is normally distributed with unknown expected value E(B) and unknown standard deviation σ. The null hypothesis in this experiment is that E(X-B)=0, and the alternative is that E(X-B)<0.
Example Problem Group II
• The experiment was run. The observed x sample average was 274.9, and the observed b sample average was 473.7. The observed X group standard deviation was 233.7, and the B group standard deviation was 348.0. The resulting pooled estimate of the standard deviation was 296.5.
Group II First Problem
• What is the standard deviation of the random variable X average - B average?
Solution
• Var(X average)=$\sigma^2/10$.
• Var(B average)=$\sigma^2/10$.
• Two averages are from independent samples, and so the covariance is zero.
• Var(X average-B average)=$(\sigma^2/10)+(\sigma^2/10)$
• sd(X average-B average)=$(0.2)^{0.5}\sigma=0.447\sigma$.
• The answer is 0.447σ. NOT 0.447(296.5)!
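The multiplier of σ can be confirmed with a one-line calculation. The function name `sd_of_mean_difference` is hypothetical; passing sigma=1.0 simply exposes the multiplier, since σ itself is unknown in this problem.

```python
from math import sqrt

def sd_of_mean_difference(sigma, n, m):
    """sd of (X average - B average) for independent samples with common sigma."""
    return sigma * sqrt(1 / n + 1 / m)

# With sigma = 1 the result is the factor that multiplies the unknown sigma.
factor = sd_of_mean_difference(sigma=1.0, n=10, m=10)
print(round(factor, 3))
```

Plugging in the pooled estimate 296.5 for σ gives an estimated standard error, which answers a different question from the one asked here.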
Group II, Second Problem
• Which of the following is a correct decision for accepting or rejecting the null hypothesis based on the sample averages and standard deviations given in the common paragraph?
• Usual options: reject at 0.01, accept at 0.01 and reject at 0.05, accept at 0.05 and reject at 0.10, and accept at 0.10.
Solution
• Calculate the t-statistic (standard score form of the test statistic).
– Difference of means is 274.9-473.7=-198.8
– Estimated standard error of test statistic is 0.447(296.5)=132.54.
– Standard units value=(-198.8-0)/132.54=-1.50.
• Find degrees of freedom.
– 10+10-2=18
Solution
• Determine side of test.
– Left sided test.
• Stretch normal distribution critical values to values appropriate for 18 degrees of freedom.
– Stretch -2.326 (0.01 level) to -2.552, -1.645 to -1.734, and -1.282 to -1.330
• Decide: Accept at 0.01; accept at 0.05; reject at 0.10. Option C is correct.
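The whole decision sequence for this problem can be sketched as follows. The variable names are illustrative, and the critical values are the ones quoted in the solution for 18 degrees of freedom rather than values computed by the code. Because the code uses the unrounded square root of 0.2, its standard error may differ in the later decimals from a hand calculation with 0.447.

```python
from math import sqrt

x_mean, b_mean = 274.9, 473.7
pooled_sd, n, m = 296.5, 10, 10

# Estimated standard error of (X average - B average).
se = pooled_sd * sqrt(1 / n + 1 / m)
# Standard score of the test statistic under the null value 0.
t = (x_mean - b_mean - 0) / se

# Left-sided critical values for 18 degrees of freedom, as quoted in the solution.
critical = [(0.01, -2.552), (0.05, -1.734), (0.10, -1.330)]

# Left-sided test: reject at a level when t falls below that level's critical value.
smallest_rejecting_level = next((a for a, c in critical if t < c), None)
print(round(t, 2), smallest_rejecting_level)
```

The smallest listed level at which the test rejects is 0.10, which corresponds to "accept at 0.05 and reject at 0.10".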
Today’s Class
• New fact about var(W-Y)
• Application to testing two independent samples.
• Making Student’s corrections.
Next Class
• Paired t-test.
• Finding smarter ways of making an A vs. B comparison.