Lecture 15: Tues., Mar. 2 Inferences about Linear Combinations of Group Means (Chapter 6.2)...
-
date post
20-Dec-2015 -
Category
Documents
-
view
217 -
download
0
Transcript of Lecture 15: Tues., Mar. 2 Inferences about Linear Combinations of Group Means (Chapter 6.2)...
Lecture 15: Tues., Mar. 2
• Inferences about Linear Combinations of Group Means (Chapter 6.2)
• Chi-squared test (Handout/Notes)
• Thursday: Simple Linear Regression (Chapter 7)
Review of One-way layout
• Assumptions of ideal model– All populations have same standard deviation.– Each population is normal– Observations are independent
• Planned comparisons: Usual t-test but use all groups to estimate . If many planned comparisons, use Bonferroni to adjust for multiple comparisons
• Test of vs. alternative that at least two means differ: one-way ANOVA F-test
• Unplanned comparisons: Use Tukey-Kramer procedure to adjust for multiple comparisons.
IH 210 :
Case Study 5.1.2: Spock Conspiracy Trial
• In 1968, Dr. Spock was tried in U.S. District Court of Boston on charges of conspiring to violate Selective Service Act by encouraging young men to resist being drafted into military service.
• Defense challenged method by which jurors were selected, claiming that women – many of whom had raised children according to popular methods developed by Dr. Spock - were underrepresented
• Venire for trial contained only one woman.• Defense argued that judge in trial had a history had a history
of venires in which women were systematically underrepresented.
Data for Spock Conspiracy Trial
• Percent of women in recent 30-juror venires for Spock Trial judge and six other Boston area district judges (A,B,C,D,E,F). Seven groups (judges) in one-way layout. Data in spock.JMP.
• Key question: How does the mean percentage of women for Spock Trial judge compare to the average of the mean percentage of women for the other six judges, i.e., what is
6FEDCBA
Spock
Inference about Linear Combinations of Group Means
• Parameter of interest: For Spock study, • Point estimate:• Standard Error:
• 95% Confidence Interval for :• Test of : For level .05 test,
reject if and only if does not belong to the 95% confidence interval.
IICCC 2211
6
1,1 7654321 CCCCCCC
IIYCYCYCg 2211
I
Ip n
C
n
C
n
CsgSE
2
2
22
1
21)(
)(*,975. gSEtg In*:*,:0 aHH
0H *
O n e w a y A n a l y s i s o f P E R C E N T B y J U D G E
PE
RC
EN
T
5
15
25
35
45
A B C D E F SPOCK'S
JUDGE
M e a n s a n d S t d D e v i a t i o n s L e v e l N u m b e r M e a n S t d D e v S t d E r r M e a n L o w e r 9 5 % U p p e r 9 5 %
A 5 3 4 . 1 2 0 0 1 1 . 9 4 1 8 5 . 3 4 0 5 1 9 . 2 9 4 8 . 9 4 8 B 6 3 3 . 6 1 6 7 6 . 5 8 2 2 2 . 6 8 7 2 2 6 . 7 1 4 0 . 5 2 4 C 9 2 9 . 1 0 0 0 4 . 5 9 2 9 1 . 5 3 1 0 2 5 . 5 7 3 2 . 6 3 0 D 2 2 7 . 0 0 0 0 3 . 8 1 8 4 2 . 7 0 0 0 - 7 . 3 1 6 1 . 3 0 7 E 6 2 6 . 9 6 6 7 9 . 0 1 0 1 3 . 6 7 8 4 1 7 . 5 1 3 6 . 4 2 2 F 9 2 6 . 8 0 0 0 5 . 9 6 8 9 1 . 9 8 9 6 2 2 . 2 1 3 1 . 3 8 8 S P O C K ' S 9 1 4 . 6 2 2 2 5 . 0 3 8 8 1 . 6 7 9 6 1 0 . 7 5 1 8 . 4 9 5 O n e w a y A n o v a S u m m a r y o f F i t
R s q u a r e 0 . 5 0 8 2 6 A d j R s q u a r e 0 . 4 3 2 6 0 8 R o o t M e a n S q u a r e E r r o r 6 . 9 1 4 2 0 9 M e a n o f R e s p o n s e 2 6 . 5 8 2 6 1 O b s e r v a t i o n s ( o r S u m W g t s ) 4 6 A n a l y s i s o f V a r i a n c e S o u r c e D F S u m o f S q u a r e s M e a n S q u a r e F R a t i o P r o b > F
J U D G E 6 1 9 2 7 . 0 8 0 8 3 2 1 . 1 8 0 6 . 7 1 8 4 < . 0 0 0 1 E r r o r 3 9 1 8 6 4 . 4 4 5 3 4 7 . 8 0 6 C . T o t a l 4 5 3 7 9 1 . 5 2 6 0
Spock Trial AnalysisLinear Combination of Interest:
6'FEDCBA
sSpock
Point Estimate:
42.176
62.1480.2697.2600.2710.2962.3312.3462.14
g
Standard Error:
62.29
)6/1(
9
)6/1(
6
)6/1(
2
)6/1(
9
)6/1(
6
)6/1(
9
191.6)(
2222222
gSE
95% Confidence Interval for : 62.2*)975(.42.17 645 t
Rounding the degrees of freedom down to nearest entry on Table A.6, 042.2)975(.30 t
Approximate 95% Confidence Interval: )77.22,07.12(62.2*042.242.17
Hypothesis test of 0:0 H vs. 0: aH at level 0.05: Reject 0H since 0 is not in the
95% confidence interval. Conclusion: There is evidence that the mean percent of women in Spock’s trial judge’s venire is less than the average of the mean percent of women in the other six trial judges’ venires. A 95% confidence interval for the difference between the mean percent of percent of women in Spock’s trial judge’s venire and the average of the mean percent of women in the other six trial judges’ venires is (-12.07, -22.77).
Linear Combinations: Comparing Rates
• In mice diet study, we are interested in the rate of increase in lifetime for each additional kilocalorie of reduced diet.
• For example we are interested in comparing rate of increase in lifetime associated with reduction from 50 to 40 kcal/wk vs. rate of increase in lifetime associated with reduction from 85 to 50 kcal/wk
•
))4050(
( 50/40/
RNRN
))5085(
( 85/50/
NNRN
40/85/50/
50/40/85/50/
10
1
35
1
350
451035
RNNNRN
RNRNNNRN
O n e w a y A n a l y s i s o f L I F E T I M E B y D I E T
O n e w a y A n o v a S u m m a r y o f F i t
R s q u a r e 0 . 4 5 4 2 7 5 A d j R s q u a r e 0 . 4 4 6 3 2 R o o t M e a n S q u a r e E r r o r 6 . 6 7 8 2 3 9 M e a n s a n d S t d D e v i a t i o n s L e v e l N u m b e r M e a n S t d D e v S t d E r r M e a n L o w e r 9 5 % U p p e r 9 5 %
N / N 8 5 5 7 3 2 . 6 9 1 2 5 . 1 2 5 3 0 0 . 6 7 8 8 6 3 1 . 3 3 1 3 4 . 0 5 1 N / R 4 0 6 0 4 5 . 1 1 6 7 6 . 7 0 3 4 1 0 . 8 6 5 4 1 4 3 . 3 8 5 4 6 . 8 4 8 N / R 5 0 7 1 4 2 . 2 9 7 2 7 . 7 6 8 1 9 0 . 9 2 1 9 2 4 0 . 4 5 8 4 4 . 1 3 6 N P 4 9 2 7 . 4 0 2 0 6 . 1 3 3 7 0 0 . 8 7 6 2 4 2 5 . 6 4 0 2 9 . 1 6 4 R / R 5 0 5 6 4 2 . 8 8 5 7 6 . 6 8 3 1 5 0 . 8 9 3 0 7 4 1 . 0 9 6 4 4 . 6 7 5 L o p r o 5 6 3 9 . 6 8 5 7 6 . 9 9 1 6 9 0 . 9 3 4 3 0 3 7 . 8 1 3 4 1 . 5 5 8
P a r a m e t e r o f I n t e r e s t : 40/85/50/ 10
1
35
1
350
45RNNNRN
P o i n t E s t i m a t e : 0057.01.45*10
17.32*
35
13.42*
350
45g m o n t h s / ( k c a l / w k )
S t a n d a r d E r r o r : 1359.060
)10/1(
57
)35/1(
71
)350/45(68.6)(
222
gSE
D e g r e e s o f F r e e d o m : 984.1)975(.)975(. 100343 tt
9 5 % C o n f i d e n c e I n t e r v a l : )6906.2,7020.2(1359.0*984.10057.0 m o n t h s / ( k c a l / w k )
H y p o t h e s i s t e s t o f 0:0 H v s . 0: aH d o e s n o t r e j e c t a t 0 . 0 5 l e v e l s i n c e 0 i s i n 9 5 %
c o n f i d e n c e i n t e r v a l . C o n c l u s i o n : N o e v i d e n c e o f a d i f f e r e n c e i n r a t e s o f i n c r e a s e i n l i f e t i m e a s s o c i a t e d w i t h r e d u c t i o n o f d i e t f r o m 8 5 k c a l / w k t o 5 0 k c a l / w k c o m p a r e d t o r e d u c t i o n i n d i e t f r o m 5 0 k c a l / w k t o 4 0 k c a l / w k .
Populations of Nominal Data• So far we have focused on comparing populations of
interval data (e.g., heights, scores, incomes)• We now consider comparing populations of nominal data.
Nominal data are data that are categories. Examples:– Candidate person voted for (Bush or Gore)– Color of M&Ms (brown, yellow, red, orange, green or blue)
• A population of nominal data with k categories can be described by the proportion in each category, in category 1, in category 2, …, in category k, ( ) , e.g., population of M&M’s is supposed to have
1p 2p kp
1.0,2.0 orangegreenblueredyellowbrown pppppp
k
i ip11
One Sample Test for Nominal Data
• Analogue of one sample problem with interval population: Take random sample of size n from a population of nominal data. We want to test whether population frequencies are *,*,*, 2211 kk pppppp
),...,1(* of oneleast at :
*,*,*,: 22110
kippH
ppppppH
iia
kk
SAT example
• People sometimes say that “b” and “c” answers occur most frequently on multiple choice tests. To see if there is any evidence that the answers do not occur with equal frequency, a random SAT exam was selected from The College Board, 10 SATs, New York: College Entrance Examination Board.
2.0,,,, of oneleast at :
2.0:0
edcbaa
edcba
pppppH
pppppH
Data (sat.JMP)
1. d 15. c 29. e 43. a 57. e 71. a 2. d 16. d 30. b 44. a 58. d 72. c 3. b 17. a 31. d 45. b 59. c 73. b 4. b 18. c 32. d 46. e 60. b 74. d 5. c 19. c 33. b 47. d 61. b 75. e 6. e 20. b 34. e 48. b 62. d 76. a 7. b 21. b 35. e 49. d 63. e 77. c 8. a 22. b 36. c 50. b 64. b 78. c 9. a 23. c 37. e 51. a 65. d 79. d 10. b 24. b 38. c 52. a 66. e 80. d 11. c 25. c 39. d 53. c 67. b 81. b 12. b 26. a 40. e 54. c 68. d 82. d 13. e 27. c 41. e 55. a 69. c 83. d 14. e 28. e 42. a 56. c 70. c 84. e 85. b
Chi-squared Test
• Chi-squared test statistic:• Reject for large values of . Critical value for
level .05 test is .95 quantile of distribution with k-1 degrees of freedom (Table A.3)
• Test is only valid if expected frequencies in each cell are 5 or more. When necessary, cells should be combined in order to satisfy this condition.
Category Observed Frequency Expected Frequency under
0H
1 1f 1e np**1
2 2f 2e np**
2 ... k
kf ke npk**
k
ii
ii
e
ef1
22 )(
0H2
2
Chi-Squared Test for SAT data
Letter Observed Frequency Expected Frequency (Observed-Expected)2/Expected
A 12 85*0.2=17 1.47 B 22 85*0.2=17 1.47 C 19 85*0.2=17 0.24 D 17 85*0.2=17 0.00 E 15 85*0.2=17 0.24 Total 3.42 The test statistic 42.32 . The critical value for rejecting at the 0.05 level is
49.9)95(.24 . Since 3.42<9.49, we do not reject 0H . There is no evidence that the
letters are not random on the SAT.
Chi-Squared Test in JMP
• (For the SAT example)• Method I (list all observations in sample): Create a column for
answer and list the sample. Then click Analyze, Distribution, put column with answer in Y, click OK, then click red triangle next to answer, click Test Probabilities and then input the hypothesized probabilities (0.2 for each category for SAT example). Then click OK. The row Pearson gives the chi-squared statistic and the p-value.
• Method II (list frequencies for each category): Create a column for each answer (a,b,c,d,e) and another column frequency which contains the frequency of each answer. Then click Analyze, Distribution, put column with answer in Y and put column with frequency in Freq and click OK. Follow above instructions.
D i s t r i b u t i o n s A n s w e r s
a
b
c
d
e
a
b
c
d
e
F r e q u e n c i e s L e v e l C o u n t P r o b
a 1 2 0 . 1 4 1 1 8 b 2 2 0 . 2 5 8 8 2 c 1 9 0 . 2 2 3 5 3 d 1 7 0 . 2 0 0 0 0 e 1 5 0 . 1 7 6 4 7 T o t a l 8 5 1 . 0 0 0 0 0 5 L e v e l s T e s t P r o b a b i l i t i e s L e v e l E s t i m P r o b H y p o t h P r o b
a 0 . 1 4 1 1 8 0 . 2 0 0 0 0 b 0 . 2 5 8 8 2 0 . 2 0 0 0 0 c 0 . 2 2 3 5 3 0 . 2 0 0 0 0 d 0 . 2 0 0 0 0 0 . 2 0 0 0 0 e 0 . 1 7 6 4 7 0 . 2 0 0 0 0 T e s t C h i S q u a r e D F P r o b > C h i s q
L i k e l i h o o d R a t i o 3 . 4 5 6 8 4 0 . 4 8 4 5 P e a r s o n 3 . 4 1 1 8 4 0 . 4 9 1 4
N o e v i d e n c e a g a i n s t 2.0:0 edcba pppppH , p - v a l u e = 0 . 4 9 1 4 .
Random numbers experiment
• When selecting random numbers (e.g., for a random sample or randomized experiment), you should always use a random number generator or a random number table. People are very bad at picking random numbers themselves.
• Experiment: Everybody pick a random whole number between 1 and 10. We’ll then survey the class and test whether people’s “random” numbers are really random.
Chi-squared test for random numbers experiment
N u m b e r O b s e r v e d E x p e c t e d 1 0 . 1 * n = 2 0 . 1 * n = 3 0 . 1 * n = 4 0 . 1 * n = 5 0 . 1 * n = 6 0 . 1 * n = 7 0 . 1 * n = 8 0 . 1 * n = 9 0 . 1 * n = 1 0 0 . 1 * n =
2 C r i t i c a l v a l u e : R e j e c t 1.0...: 10210 pppH i f 92.16)95(.2
92
M&M’s
• According to the M&M’s web site, the color distribution in peanut butter M&M’s is 20% brown, 20% yellow, 20% red, 20% blue, 10% green and 10% orange. Test
not true. is esprobabilti above of oneleast at :
1.0,2.0:0
a
orangegreenblueredyellowbrown
H
ppppppH
Chi-squared test for M&Ms
C o l o r O b s e r v e d E x p e c t e d B r o w n 0 . 2 * n = Y e l l o w 0 . 2 * n = R e d 0 . 2 * n = B l u e 0 . 2 * n = G r e e n 0 . 1 * n = O r a n g e 0 . 1 * n =
2 C r i t i c a l v a l u e : R e j e c t 1.0,2.0:0 orangegreenblueredyellowbrown ppppppH i f
07.11)95(.25
2