Today’s lesson (Chapter 13)

• variance(W-Y)

• Two sample (W, Y) standard normal test

• Confidence intervals for E(W-Y)

• Determining the sample size needed to have a specified probability of a Type II error and probability of a Type I error in a two sample test.

Var(W-Y)

• Definition: Variance(Y)=E[(Y-EY)^2]

• Fact: Expectation is a linear operator.

• Variance(W-Y)=E[(W-Y-E(W-Y))^2]

• Variance(W-Y)=E[(W-EW-(Y-EY))^2], since E(W-Y)=EW-EY by linearity.

• Expand the squared term on the right using high school algebra.

Var(W-Y)

• E[(W-EW-(Y-EY))^2]=

• E[(W-EW)^2+(Y-EY)^2-2(W-EW)(Y-EY)]=

• E[(W-EW)^2]+E[(Y-EY)^2]-2E[(W-EW)(Y-EY)]=

• var(W)+var(Y)-2cov(W,Y).

New Facts

• New Definition

– cov(W,Y)=cov(Y,W)=E[(W-EW)(Y-EY)]

– cov(W,Y) is the numerator of the correlation coefficient of W and Y

• New Fact

– var(W-Y)=var(W)+var(Y)-2cov(W,Y)
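The new fact can be checked numerically. The sketch below is only an illustration: the two short data lists are made up (they are not from the lesson), and the helper names mean, cov, and var are hypothetical. The same identity holds exactly for sample variances and covariances as long as one divisor (here n-1) is used throughout.

```python
# Numerical check of var(W-Y) = var(W) + var(Y) - 2 cov(W,Y).
# The data are made-up illustration values, not taken from the lesson.

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Sample covariance with divisor n-1."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def var(xs):
    """Sample variance, written as the covariance of a list with itself."""
    return cov(xs, xs)

w = [2.0, 4.0, 4.0, 5.0, 7.0]
y = [1.0, 3.0, 2.0, 6.0, 5.0]
d = [wi - yi for wi, yi in zip(w, y)]   # the differences W - Y

lhs = var(d)
rhs = var(w) + var(y) - 2 * cov(w, y)
print(lhs, rhs)   # the two values agree up to floating-point rounding
```

Note that the lists are paired only so that W-Y can be formed observation by observation; with independent samples the covariance term is zero in expectation.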

Two Sample Testing Problem

• Research team has two samples, n observations from W and m observations from Y.

• ASS-U-ME

– W sample and Y sample are independent

– W is normally distributed with mean E(W) and variance σ_W^2

– Y is normally distributed with mean E(Y) and variance σ_Y^2

Two Sample Testing Problem

• Null hypothesis: E(W)=E(Y)

– New parameter: E(W)-E(Y)

• Alternative hypothesis may be left-sided, right-sided, or two-sided.

• Test statistic:

– W mean (of n observations) - Y mean (of m observations)

Distribution of Test Statistic

• Distribution of W mean - Y mean

– either normal or approximately normal

– expected value is E(W-Y)

– variance is (σ_W^2/n)+(σ_Y^2/m)

• Under null hypothesis, expected value of difference of means is 0.

Deriving standard error of difference of two means

• Var(W mean-Y mean)=var(W mean)+var(Y mean)-2cov(W mean, Y mean)

• W sample is assumed to be independent of Y sample, so covariance of means is 0.

• var(W mean)=σ_W^2/n, n is number in W sample.

• var(Y mean)=σ_Y^2/m, m is number in Y sample.

• Standard error of the difference of means is the square root of (σ_W^2/n)+(σ_Y^2/m).

Variances Known

• Find null distribution of test statistic using known variances and sample sizes.

• Standardize the test statistic, which is always the difference of the two sample means.

• Follow standard decision sequence.
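A minimal sketch of the known-variance case, assuming a two-sided alternative. The function name two_sample_z and all of the numbers passed to it are hypothetical; they are chosen only so the arithmetic is easy to follow.

```python
import math

def two_sample_z(mean_w, mean_y, sigma_w, sigma_y, n, m):
    """Return (standard error, z, two-sided p-value) for H0: E(W)-E(Y)=0
    when both population standard deviations are known."""
    # Standard error of W mean - Y mean for independent samples.
    se = math.sqrt(sigma_w**2 / n + sigma_y**2 / m)
    # Standardize: (observed difference - null value) / standard error.
    z = (mean_w - mean_y - 0.0) / se
    # Two-sided p-value: 2 * P(Z > |z|) = erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_two_sided

# Hypothetical inputs: variances 16 and 9, sample sizes 16 and 9,
# so the variance of the difference of means is 16/16 + 9/9 = 2.
se, z, p = two_sample_z(mean_w=52.0, mean_y=50.0,
                        sigma_w=4.0, sigma_y=3.0, n=16, m=9)
print(round(se, 4), round(z, 4), round(p, 4))
```

For a one-sided alternative, the p-value would be the single tail area in the direction of the alternative rather than twice the tail.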

Variances Unknown

• This is a Student’s t problem.

• Two possibilities

– ASS-U-ME var(W)=var(Y) is reasonable; this is the classic two independent sample t-test that is usually covered in the prerequisite class.

– Assumption var(W)=var(Y) is not reasonable; use unequal variance t-test.
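The lesson does not spell out the unequal variance t-test. A common form is Welch's test with the Welch-Satterthwaite approximation for the degrees of freedom; the sketch below assumes that version. The function name welch_t and the summary statistics are hypothetical.

```python
import math

def welch_t(mean_w, s_w, n, mean_y, s_y, m):
    """Unequal-variance t statistic and approximate degrees of freedom
    (Welch-Satterthwaite), computed from summary statistics."""
    a = s_w**2 / n            # estimated variance of W mean
    b = s_y**2 / m            # estimated variance of Y mean
    se = math.sqrt(a + b)     # no pooling of the two variances
    t = (mean_w - mean_y - 0.0) / se
    df = (a + b) ** 2 / (a**2 / (n - 1) + b**2 / (m - 1))
    return t, df

# Hypothetical summary statistics with clearly different spreads.
t_uneq, df_uneq = welch_t(mean_w=20.0, s_w=4.0, n=12,
                          mean_y=17.0, s_y=9.0, m=30)
print(round(t_uneq, 3), round(df_uneq, 2))
```

The approximate degrees of freedom never exceed n+m-2, which is the value used by the equal variance test, so the unequal variance test is the more cautious of the two.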

Checking Assumption of Equal Variances

• Use SPSS

– statistics, compare means, independent sample t-test.

• SPSS uses Levene’s test for equality of variances.

– Sig. means p-value of Levene’s test

– Use it as you would any observed significance level.

Choosing Equal or Unequal Variance t-test

• Some statistics professors always want the equal variance t-test. Answer their questions with the equal variance t-test. This typically includes Actuary Society questions.

• In AMS315 life, I will tell you which test to use (use the equal variance t-test if there is no specification).

Choosing Equal or Unequal Variance t-test

• In real life, I ALWAYS use the unequal variance t-test.

• Some people choose the unequal variance t-test if the p-value for Levene’s test of the equality of variances is very small.

Example Problem Group I

• I present you with a computer output on the comparison of average irresponsibility at time 5 for subjects who did not use marijuana at time 3 to the average irresponsibility at time 5 for subjects who did use marijuana at time 3.


• You register that this is an A vs. B comparison, with the A group being those who did not use marijuana at time 3 and the B group being those who did use marijuana at time 3. The dependent variable is irresponsibility at time 5.


• Reading the output, you learn that there were 215 subjects who did not use marijuana at time 3 and that their average irresponsibility was 10.7860, with a standard deviation of 2.5779. There were 151 subjects who did use marijuana at time 3, and their average irresponsibility was 10.8411. The standard deviation was 2.3526.

Example Problem 1

• Levene’s test for the equality of variances had sig.=0.206.

• The 2-tailed sig for the equal variance test was 0.835; for the unequal variance test it was 0.833.

First Problem

• Which of the following conclusions is correct about the test of the null hypothesis that expected irresponsibility at time 5 for a subject who did not use marijuana at time 3=expected irresponsibility at time 5 for a subject who did use marijuana at time 3

First Problem Continued

• against the alternative hypothesis that expected irresponsibility at time 5 for a subject who did not use marijuana at time 3 was not equal to expected irresponsibility at time 5 for a subject who did use marijuana at time 3?

• Usual options.

Solution

• Both p-values were approximately equal and were large (about 0.83).

• Hence, the correct decision is to accept at the 0.10 level of significance (last option).

Second Problem

• What is the correct decision in the following? The null hypothesis is: Expected irresponsibility at time 5 for a subject who did not use marijuana at time 3 - expected irresponsibility at time 5 for a subject who did use marijuana at time 3 = 0, alpha=0.05;

Second Problem Continued

• the alternative is expected irresponsibility at time 5 for a subject who did not use marijuana at time 3 - expected irresponsibility at time 5 for a subject who did use marijuana at time 3 is not equal to 0.

Solution to Second Problem

• Read your output to find that the mean difference in the two groups was -5.50E-02.

– -5.50E-02 is a notation for -5.50x10^-2=-0.0550.

• Check means of groups to confirm that the mean difference is irresponsibility for subjects who did not use marijuana at time 3 minus irresponsibility for subjects who did use (10.7860-10.8411=-0.0551, which agrees with -0.0550 up to rounding of the reported means).

Solution to Second Problem

• Check that this order is the same as the order that I asked you about.

• Read your computer output to find that the 95 percent confidence interval for the mean difference is -0.5744 to 0.4644.

• Check whether or not the value 0 specified in the problem is in the confidence interval.

• It is, so accept the null hypothesis.
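The confidence interval check can be sketched in a few lines. The numbers are the ones quoted from the computer output in this lesson; the variable names are hypothetical. The midpoint line is an extra sanity check: a t-based interval is symmetric about the point estimate, so its midpoint should reproduce the reported mean difference.

```python
# Values quoted from the computer output in the lesson.
ci_low, ci_high = -0.5744, 0.4644   # 95 percent interval for the mean difference
mean_diff = -0.0550                 # reported mean difference (-5.50E-02)
null_value = 0.0                    # value specified by the null hypothesis

# The interval is symmetric about the point estimate.
midpoint = (ci_low + ci_high) / 2

# Two-sided test at alpha=0.05: accept the null hypothesis when the
# hypothesized value lies inside the 95 percent confidence interval.
decision = "accept" if ci_low <= null_value <= ci_high else "reject"
print(round(midpoint, 4), decision)
```

This shortcut works only because the alternative is two-sided and the interval's confidence level (95 percent) matches alpha=0.05.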

Third Problem

• What is the value of the t-test assuming equal variances?

Solution to the Third Problem

• The computer output has the mean difference (-5.50E-02).

• The computer output has the standard error of the mean difference (0.2641, on the line labeled “Equal variances assumed”)

• The t statistic is the standard score value of the test statistic: (-0.0550-0)/0.2641=-0.21.
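The standard error on the "Equal variances assumed" line can be reproduced, at least approximately, from the group sizes, means, and standard deviations quoted earlier. The sketch below assumes the usual pooled-variance formula for the equal variance t-test; the variable names are hypothetical.

```python
import math

# Summary statistics quoted in the lesson.
n, mean_a, sd_a = 215, 10.7860, 2.5779   # did not use marijuana at time 3
m, mean_b, sd_b = 151, 10.8411, 2.3526   # did use marijuana at time 3

# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n - 1) * sd_a**2 + (m - 1) * sd_b**2) / (n + m - 2)

# Estimated standard error of the difference of the two sample means.
se_diff = math.sqrt(pooled_var * (1 / n + 1 / m))

# Standard score form of the test statistic under the null value 0.
t_stat = (mean_a - mean_b - 0) / se_diff
df = n + m - 2

print(round(se_diff, 4), round(t_stat, 2), df)
```

Small differences in the last digit relative to the computer output are expected, because the inputs here are already rounded.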

Fourth Problem

• How many degrees of freedom does the equal variance independent sample t-test have in this problem?

Solution

• Read the computer output to find that there were 215 subjects who did not use marijuana at time 3.

• There were 151 subjects who did use marijuana at time 3.

• The number of degrees of freedom is n+m-2 in general.

• Here, 215+151-2=364.

Fifth Problem

• Which of the following is a correct decision about the test of the null hypothesis that variance of irresponsibility at time 5 for a subject who did not use marijuana at time 3 is equal to the variance of irresponsibility at time 5 for a subject who did use marijuana against the alternative that these two variances are not equal? Usual options.

Solution

• Read the computer output to find that the sig of Levene’s test is 0.206.

• This is larger than 0.10, the level of significance in the last option.

• The answer is to accept at the 0.10 level (choose the last option).
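When the output reports a p-value (sig.), choosing among the usual options is mechanical. The sketch below assumes the four usual options are the ones listed later in Example Problem Group II, and that the null hypothesis is rejected when the p-value is at or below the level of significance. The function name usual_option is hypothetical.

```python
def usual_option(p_value):
    """Map a p-value to one of the four usual decision options."""
    if p_value <= 0.01:
        return "reject at 0.01"
    if p_value <= 0.05:
        return "accept at 0.01 and reject at 0.05"
    if p_value <= 0.10:
        return "accept at 0.05 and reject at 0.10"
    return "accept at 0.10"

# p-values quoted from the computer output in Example Problem Group I.
levene = usual_option(0.206)        # Levene's test for equality of variances
equal_var = usual_option(0.835)     # equal variance t-test, 2-tailed
unequal_var = usual_option(0.833)   # unequal variance t-test, 2-tailed
print(levene, "|", equal_var, "|", unequal_var)
```

All three p-values exceed 0.10, so each test leads to the last option.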

Example Problem Group II

• Each patient in a study will take a specified medicine, and the patient’s response to that medicine will be measured. Twenty patients will be randomly assigned to two groups of ten each.


• Group 1 will receive an experimental medicine. The random variable X denotes a patient’s response to the experimental medicine and is normally distributed with unknown expected value E(X) and unknown standard deviation σ.


• Group 2 will receive the best available medicine. The random variable B denotes a patient’s response to the best available medicine and is normally distributed with unknown expected value E(B) and unknown standard deviation σ. The null hypothesis in this experiment is that E(X-B)=0, and the alternative is that E(X-B)<0.


• The experiment was run. The observed x sample average was 274.9; and the observed b sample average was 473.7. The observed X group standard deviation was 233.7, and the B group standard deviation was 348.0. The resulting pooled estimate of the standard deviation was 296.5.

Group II First Problem

• What is the standard deviation of the random variable X average - B average?

Solution

• Var(X average)=σ^2/10.

• Var(B average)=σ^2/10.

• Two averages are from independent samples, and so the covariance is zero.

• Var(X average-B average)=(σ^2/10)+(σ^2/10)=0.2σ^2

• sd(X average-B average)=(0.2)^0.5 σ=0.447σ.

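• The answer is 0.447σ, NOT 0.447(296.5)! The question asks for the standard deviation of the random variable, which depends on the unknown σ; 0.447(296.5) is only an estimate of it.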
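The multiplier 0.447 depends only on the two sample sizes. A small sketch, with the hypothetical helper name sd_factor, assuming independent samples that share a common standard deviation σ:

```python
import math

def sd_factor(n, m):
    """Multiplier c such that sd(mean1 - mean2) = c * sigma for independent
    samples of sizes n and m with a common standard deviation sigma."""
    return math.sqrt(1.0 / n + 1.0 / m)

factor = sd_factor(10, 10)
print(round(factor, 3))

# Plugging in the pooled estimate 296.5 gives an ESTIMATED standard error,
# which is a different quantity from the true standard deviation factor*sigma.
estimated_se = factor * 296.5
print(round(estimated_se, 2))
```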

Group II, Second Problem

• Which of the following is a correct decision for accepting or rejecting the null hypothesis based on the sample averages and standard deviations given in the common paragraph?

• Usual options: reject at 0.01, accept at 0.01 and reject at 0.05, accept at 0.05 and reject at 0.10, and accept at 0.10.

Solution

• Calculate the t-statistic (standard score form of the test statistic).

– Difference of means is 274.9-473.7=-198.8

– Estimated standard error of test statistic is 0.447(296.5)=132.54.

– Standard units value=(-198.8-0)/132.54=-1.50.

• Find degrees of freedom.

– 10+10-2=18

Solution

• Determine side of test.

– Left sided test.

• Stretch normal distribution critical values to values appropriate for 18 degrees of freedom.

– Stretch -2.326 (0.01 level) to -2.552, -1.645 (0.05 level) to -1.734, and -1.282 (0.10 level) to -1.330

• Decide: Accept at 0.01; accept at 0.05; reject at 0.10. Option C is correct.
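The whole decision sequence for this problem can be sketched as follows. The sample averages, pooled standard deviation, and the critical values for 18 degrees of freedom are the ones quoted in the lesson; the variable names are hypothetical.

```python
import math

# Summary values quoted in the lesson.
mean_x, mean_b = 274.9, 473.7
pooled_sd, n, m = 296.5, 10, 10

# Estimated standard error of X average - B average.
se = pooled_sd * math.sqrt(1 / n + 1 / m)
t_stat = (mean_x - mean_b - 0) / se
df = n + m - 2

# Left-tail critical values for 18 degrees of freedom, as quoted in the lesson.
critical = {0.01: -2.552, 0.05: -1.734, 0.10: -1.330}

# Left-sided test: reject when the t statistic is at or below the critical value.
rejected = {alpha: t_stat <= c for alpha, c in critical.items()}

if rejected[0.01]:
    decision = "reject at 0.01"
elif rejected[0.05]:
    decision = "accept at 0.01 and reject at 0.05"
elif rejected[0.10]:
    decision = "accept at 0.05 and reject at 0.10"
else:
    decision = "accept at 0.10"

print(round(t_stat, 2), df, decision)
```

The t statistic falls between the 0.05 and 0.10 critical values, which is why the third of the usual options applies.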

Today’s Class

• New fact about var(W-Y)

• Application to testing two independent samples.

• Making Student’s corrections.

Next Class

• Paired t-test.

• Finding smarter ways of making an A vs. B comparison.
