Experimental Statistics - week 4
Chapter 8: 1-factor ANOVA models
Using SAS
EXAM SCHEDULE:
Exam I – Take-home exam (handed out Thursday, March 3, due 8:00 AM Tuesday, March 8)
Exam II – Take-home exam (handed out Thursday, April 14, due 8:00 AM Tuesday, April 19)
Final Exam – optional (scheduled for 8:00 AM – 11:00 AM Friday, May 6)
GRADE COMPUTATION:
Exam Grades (75%)
Daily Assignments (25%)
ANOVA Table Output - hostility data - calculations done in class
Source             SS       df   MS       F      p-value
Between samples    767.17    2   383.58   16.7   <.001
Within samples     205.74    9    22.86
Totals             972.91
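As a check on the table, each mean square is its sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. The symbols $MS_B$ and $MS_W$ are shorthand introduced here for the Between and Within mean squares; they are not notation from the slides.

```latex
\begin{align*}
SS_B + SS_W &= 767.17 + 205.74 = 972.91 \\
MS_B &= \frac{SS_B}{df_B} = \frac{767.17}{2} = 383.585 \\
MS_W &= \frac{SS_W}{df_W} = \frac{205.74}{9} = 22.86 \\
F &= \frac{MS_B}{MS_W} = \frac{383.585}{22.86} \approx 16.78
\end{align*}
```

The table lists the mean square as 383.58 and the ratio as 16.7.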
SPSS ANOVA Table for Hostility Data
ANOVA Models
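Consider the random sample $y_1, y_2, \ldots, y_n$.
Example: $y_1 = 5.5$, $y_2 = 3.8$, $y_3 = 6.0$, etc.
Population has mean $\mu$.
If $y_1, y_2, \ldots, y_n$ is a sample from a population that is normal with mean $\mu$ and variance $\sigma^2$, then we can write
$y_i = \mu + \epsilon_i, \quad i = 1, \ldots, n$
Note: $\epsilon_i = y_i - \mu$ is the random error for observation $i$.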
For 1-factor ANOVA
We can write
$y_{11} = \mu_1 + \epsilon_{11}$
$y_{12} = \mu_1 + \epsilon_{12}$
etc.
Alternative form of the 1-Factor ANOVA Model
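General Form of Model: $y_{ij} = \mu_i + \epsilon_{ij}$
(pages 394-395)
$\epsilon_{ij}$'s are NID$(0, \sigma^2)$
- random errors follow a Normal (N) distribution, are independently distributed (ID), and have zero mean and constant variance
-- i.e. variability does not change from group to group
Alternative form: $y_{ij} = \mu + \alpha_i + \epsilon_{ij}$
where $\mu = \frac{1}{t} \sum_{i=1}^{t} \mu_i$ and $\alpha_i = \mu_i - \mu$
Note: $\sum_{i=1}^{t} \alpha_i = 0$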
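The sum-to-zero note for the alternative form follows directly from the definitions. The short derivation below takes the group effect to be $\alpha_i = \mu_i - \mu$, with $\mu$ the average of the $t$ group means; the symbol $\alpha_i$ is an assumed notation for the group effect.

```latex
\begin{align*}
\sum_{i=1}^{t} \alpha_i
  &= \sum_{i=1}^{t} (\mu_i - \mu)
   = \sum_{i=1}^{t} \mu_i - t\mu \\
  &= t\mu - t\mu = 0,
  \qquad \text{since } \mu = \frac{1}{t}\sum_{i=1}^{t} \mu_i .
\end{align*}
```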
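Testing the hypotheses:
$H_0: \mu_1 = \mu_2 = \cdots = \mu_t$
$H_a:$ at least 2 means are unequal
is equivalent to testing the hypotheses:
$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_t = 0$
$H_a:$ at least one $\alpha_i \neq 0$
(where $\alpha_i = \mu_i - \mu$, as in the alternative form of the model)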
Analysis of Variance Table
Recall: In our model:
$s_W^2$ estimates $\sigma^2$
$s_B^2$ estimates $\sigma^2$ + constant, where the constant is 0 if there are no factor effects and positive otherwise
- if no factor effects, we expect $F \approx 1$;
- if factor effects, we expect $F > 1$
We reject $H_0$ at significance level $\alpha$ if
$F = \frac{s_B^2}{s_W^2} > F_{\alpha}(t - 1, n_T - t)$
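As an illustration, the rejection rule can be applied to the hostility ANOVA table from earlier in these notes. That table has 2 and 9 degrees of freedom, which corresponds to $t = 3$ groups and $n_T = 12$ observations. The level $\alpha = 0.05$ is chosen here only for the example, and the critical value is the standard F-table entry.

```latex
\begin{align*}
F &= \frac{s_B^2}{s_W^2} = \frac{383.58}{22.86} \approx 16.78 \\
F_{0.05}(t - 1,\; n_T - t) &= F_{0.05}(2,\; 9) \approx 4.26 \\
16.78 &> 4.26 \quad\Rightarrow\quad \text{reject } H_0 \text{ at } \alpha = 0.05
\end{align*}
```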
Introduction to SAS Programming Language
Recall CAR DATA
For this analysis, 5 gasoline types (A - E) were to be tested. Twenty cars were selected for testing and were assigned randomly to the groups (i.e. the gasoline types). Thus, in the analysis, each gasoline type was tested on 4 cars. A performance-based octane reading was obtained for each car, and the question is whether the gasolines differ with respect to this octane reading.
Gasoline   Octane readings
A          91.7  91.2  90.9  90.6
B          91.7  91.9  90.9  90.9
C          92.4  91.2  91.6  91.0
D          91.8  92.2  92.0  91.4
E          93.1  92.9  92.4  92.4
The CAR data set as SAS needs to see it:
A 91.7
A 91.2
A 90.9
A 90.6
B 91.7
B 91.9
B 90.9
B 90.9
C 92.4
C 91.2
C 91.6
C 91.0
D 91.8
D 92.2
D 92.0
D 91.4
E 93.1
E 92.9
E 92.4
E 92.4
SAS file for CAR data
Case 1: Data within SAS FILE:
DATA one;
INPUT gas $ octane;
DATALINES;
A 91.7
A 91.2
. . .
E 92.4
E 92.4
;
PROC GLM;    * or PROC ANOVA;
  CLASS gas;
  MODEL octane=gas;
  TITLE 'Gasoline Example - Completely Randomized Design';
  MEANS gas / DUNCAN;
RUN;
PROC MEANS mean var;
RUN;
PROC MEANS mean var;
  CLASS gas;
RUN;
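For reference, here is a sketch of the Case 1 program with all 20 CAR records written out, one per line. It departs from the slide in a few small ways that are choices made here, not the instructor's: each procedure names DATA=one explicitly, the PROC MEANS steps add a VAR statement, the TITLE is placed before the procedures, and a QUIT ends PROC GLM.

```sas
* CAR data: one record per car (gasoline type, octane reading);
DATA one;
  INPUT gas $ octane;
  DATALINES;
A 91.7
A 91.2
A 90.9
A 90.6
B 91.7
B 91.9
B 90.9
B 90.9
C 92.4
C 91.2
C 91.6
C 91.0
D 91.8
D 92.2
D 92.0
D 91.4
E 93.1
E 92.9
E 92.4
E 92.4
;
RUN;

TITLE 'Gasoline Example - Completely Randomized Design';

* One-factor ANOVA with gasoline type as the factor;
PROC GLM DATA=one;
  CLASS gas;
  MODEL octane = gas;
  MEANS gas / DUNCAN;   * Duncan multiple range test on the group means;
RUN;
QUIT;

* Overall mean and variance of the 20 readings;
PROC MEANS DATA=one MEAN VAR;
  VAR octane;
RUN;

* Mean and variance within each gasoline type;
PROC MEANS DATA=one MEAN VAR;
  CLASS gas;
  VAR octane;
RUN;
```

Run as written, the PROC GLM step should give an F value of 6.80 on 4 and 15 degrees of freedom for these data.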
Brief Discussion of Components of the SAS File:
DATA Step
DATA STATEMENT - the first DATA statement names the data set whose variables are defined in the INPUT statement -- in the above, we create data set 'one'
INPUT STATEMENT - 2 forms
1. Freefield - can be used when data values are separated by 1 or more blanks
INPUT NAME $ AGE SEX $ SCORE; ($ indicates character variable)
2. Formatted - data occur in fixed columns
INPUT NAME $ 1-20 AGE 22-24 SEX $ 26 SCORE 28-30;
DATALINES STATEMENT - indicates that the records that follow in the file contain the actual data; the semicolon on the line after the last record marks the end of the data
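The two INPUT forms can be compared side by side with a small sketch. The records below (names, ages, scores) are made up for illustration, and the data set names free and fixed are arbitrary. In the formatted version the data lines start in column 1, so that NAME occupies columns 1-20, AGE 22-24, SEX column 26, and SCORE 28-30.

```sas
* Freefield input: values are separated by one or more blanks;
* so a character value cannot itself contain a blank;
DATA free;
  INPUT name $ age sex $ score;
  DATALINES;
Ann 23 F 88
Bob 31 M 92
;
RUN;

* Formatted input: each value is read from fixed columns;
* so NAME may contain blanks and be up to 20 characters long;
DATA fixed;
  INPUT name $ 1-20 age 22-24 sex $ 26 score 28-30;
  DATALINES;
Ann Smith             23 F  88
Robert Jones          31 M  92
;
RUN;

PROC PRINT DATA=free;
RUN;

PROC PRINT DATA=fixed;
RUN;
```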
SPECIFYING THE ANALYSIS -- PROC STATEMENTS
GENERAL FORM
PROC xxxxx; implies procedure is to be run on most recently created data set
PROC xxxxx DATA = data set name;
Note: I did not have to specify DATA=one in the above example
Example PROCs:
PROC REG - regression analysis
PROC ANOVA - analysis of variance
PROC GLM - general linear model
PROC MEANS - basic statistics, t-test for $H_0: \mu = 0$
PROC PLOT - plotting
PROC TTEST - t-tests
PROC UNIVARIATE - descriptive stats, box-plots, etc.
PROC BOXPLOT - boxplots
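The difference between the two general forms shows up once more than one data set exists. In the sketch below, the second data set (named two here, a hypothetical name) is created after one, so a PROC without DATA= picks up two. Only a few CAR records are used to keep the example short.

```sas
DATA one;
  INPUT gas $ octane;
  DATALINES;
A 91.7
A 91.2
B 91.7
B 91.9
;
RUN;

DATA two;
  INPUT gas $ octane;
  DATALINES;
E 93.1
E 92.9
;
RUN;

* No DATA= option, so this runs on TWO, the most recently created data set;
PROC MEANS MEAN VAR;
  VAR octane;
RUN;

* DATA= names the data set explicitly, so this runs on ONE;
PROC MEANS DATA=one MEAN VAR;
  VAR octane;
RUN;
```

Naming the data set on every PROC is a defensive habit: the program keeps working if another DATA step is later inserted above it.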
PROC GLM
• Proc GLM data = fn ;
– Class … ; List all the factors.
– Model … / options; e.g., model octane = gas;
– Means … / options;
– Run;
SAS Syntax
• Every command MUST end with a semicolon
– Commands can continue over two or more lines
• Variable names are 1-8 characters in older releases (up to 32 characters in SAS Version 7 and later): letters, numerals, and underscores, beginning with a letter or underscore, with no blanks or special characters
– Note: values for character variables can exceed 8 characters
• Comments – Begin with *, end with ;
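A minimal sketch of these syntax rules, using two CAR records; the data set name demo is arbitrary.

```sas
* A comment statement begins with an asterisk and ends with a semicolon;
DATA demo;
  INPUT gas $
        octane;          * one INPUT statement written over two lines;
  DATALINES;
A 91.7
B 91.9
;
RUN;

PROC PRINT
  DATA=demo;             * the PROC statement ends at its semicolon;
RUN;
```

Because a statement ends only at its semicolon, a missing semicolon makes SAS read the next line as part of the same statement, which is a common source of errors.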
Titles and Labels
• TITLE '…' ;
– Up to 10 title lines: TITLE 'include your title here';
– Can be placed in Data Steps or Procs
• LABEL name = '…' ;
– Can be in a DATA STEP or PROC PRINT
– Include ALL labels, then a single ;
Note: For class assignments, place descriptive titles and labels on the output. Print the data to the output file.
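A sketch of titles and labels applied to a few CAR records follows. The label text and the second title line are illustrative wording only. Note that both labels are listed before the single semicolon that ends the LABEL statement, and that PROC PRINT needs its LABEL option to show labels instead of variable names.

```sas
DATA one;
  INPUT gas $ octane;
  LABEL gas    = 'Gasoline type'
        octane = 'Performance-based octane reading';
  DATALINES;
A 91.7
A 91.2
B 91.7
B 91.9
;
RUN;

TITLE  'Gasoline Example - Completely Randomized Design';
TITLE2 'Listing of the CAR data';

* Print the data to the output, using the labels as column headings;
PROC PRINT DATA=one LABEL;
RUN;
```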
Case 2: Data in an External File
FILENAME f1 'complete directory/file specification';
FILENAME f1 'a:car.data';
DATA one;
INFILE f1;
INPUT gas $ octane;
PROC GLM;    * or PROC ANOVA;
  CLASS gas;
  MODEL octane=gas;
  TITLE 'Gasoline Example - Completely Randomized Design';
RUN;
PROC MEANS mean var;
RUN;
PROC MEANS mean var;
  CLASS gas;
RUN;
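The FILENAME / INFILE mechanism can be tried without a real data file. The sketch below is an assumption-laden stand-in: it points the fileref f1 at a temporary file (the TEMP device type), writes four CAR records to it, and then reads them back the way Case 2 does. In actual use, the FILENAME statement would give the path to the existing data file instead.

```sas
* Hypothetical stand-in for an external file, using a temporary file;
FILENAME f1 TEMP;

* Write a few CAR records to the file so there is something to read;
DATA _NULL_;
  FILE f1;
  PUT 'A 91.7';
  PUT 'A 91.2';
  PUT 'B 91.7';
  PUT 'B 91.9';
RUN;

* Read the records back through the fileref, as in Case 2;
DATA one;
  INFILE f1;
  INPUT gas $ octane;
RUN;

PROC PRINT DATA=one;
RUN;
```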
The SAS Output for CAR data:

Gasoline Example - Completely Randomized Design
General Linear Models Procedure

Dependent Variable: OCTANE
                              Sum of          Mean
Source             DF        Squares        Square   F Value   Pr > F
Model               4     6.10800000    1.52700000      6.80   0.0025
Error              15     3.37000000    0.22466667
Corrected Total    19     9.47800000

R-Square        C.V.    Root MSE    OCTANE Mean
0.644440    0.516836   0.4739902      91.710000

Source    DF      Type I SS    Mean Square   F Value   Pr > F
GAS        4     6.10800000     1.52700000      6.80   0.0025

Source    DF    Type III SS    Mean Square   F Value   Pr > F
GAS        4     6.10800000     1.52700000      6.80   0.0025
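The summary statistics in the output can be reproduced from the ANOVA table entries. The worked arithmetic below uses the standard definitions of these quantities; $\bar{y}$ denotes the overall OCTANE mean, a symbol introduced here.

```latex
\begin{align*}
F &= \frac{MS_{\text{Model}}}{MS_{\text{Error}}}
   = \frac{1.527}{0.22466667} \approx 6.80 \\
R^2 &= \frac{SS_{\text{Model}}}{SS_{\text{Total}}}
   = \frac{6.108}{9.478} \approx 0.6444 \\
\text{Root MSE} &= \sqrt{MS_{\text{Error}}}
   = \sqrt{0.22466667} \approx 0.4740 \\
\text{C.V.} &= 100 \times \frac{\text{Root MSE}}{\bar{y}}
   = 100 \times \frac{0.4740}{91.71} \approx 0.517
\end{align*}
```

With a single factor in the model, the Type I and Type III sums of squares for GAS both equal the Model sum of squares, which is why the three F tests in the output agree.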
Text Format for ANOVA Table Output - car data
Source             SS      df   MS      F      p-value
Between samples    6.108    4   1.527   6.80   0.0025
Within samples     3.370   15   0.225
Totals             9.478   19
PC SAS on Campus
Library
BIC
Student Center
http://support.sas.com/rnd/le/index.html
SAS Learning Edition $125
“Lab” Assignment
Using CAR Data, run the following in this order with one set of code:
1. Calculate the average, standard deviation, minimum, and maximum for the 20 octane readings. CS pp. 25 - 32
2. Graph a histogram of OCTANE. CS p. 37
3. Calculate descriptive statistics in (1) above for OCTANE for each of the 5 gasolines. CS pp. 32-34
4. Run a t-test to test $H_0: \mu_A = \mu_B$ using GAS types A and B. CS pp. 138-141
5. Plot side-by-side box plots for OCTANE for the 5 levels of the variable GAS
6. Compute a 1-factor ANOVA for the CAR data using only the first 3 GAS types. CS pp. 150-155
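One possible layout for a single program covering the six steps is sketched below. It is not the CS text's code or the instructor's solution: the data set name car, the title wording, the use of the trailing @@ to read several records per line, PROC UNIVARIATE's HISTOGRAM statement for step 2, and WHERE statements for subsetting are all choices made here. PROC BOXPLOT expects the data to be sorted by the grouping variable, so a PROC SORT precedes it.

```sas
DATA car;
  INPUT gas $ octane @@;   * trailing @@ reads several records per line;
  DATALINES;
A 91.7 A 91.2 A 90.9 A 90.6
B 91.7 B 91.9 B 90.9 B 90.9
C 92.4 C 91.2 C 91.6 C 91.0
D 91.8 D 92.2 D 92.0 D 91.4
E 93.1 E 92.9 E 92.4 E 92.4
;
RUN;

TITLE '1. Descriptive statistics for all 20 octane readings';
PROC MEANS DATA=car MEAN STD MIN MAX;
  VAR octane;
RUN;

TITLE '2. Histogram of OCTANE';
PROC UNIVARIATE DATA=car NOPRINT;
  HISTOGRAM octane;
RUN;

TITLE '3. Descriptive statistics by gasoline type';
PROC MEANS DATA=car MEAN STD MIN MAX;
  CLASS gas;
  VAR octane;
RUN;

TITLE '4. t-test comparing gasoline types A and B';
PROC TTEST DATA=car;
  WHERE gas IN ('A', 'B');
  CLASS gas;
  VAR octane;
RUN;

TITLE '5. Side-by-side box plots of OCTANE by GAS';
PROC SORT DATA=car;
  BY gas;
RUN;
PROC BOXPLOT DATA=car;
  PLOT octane*gas;
RUN;

TITLE '6. One-factor ANOVA using gasoline types A, B, and C only';
PROC GLM DATA=car;
  WHERE gas IN ('A', 'B', 'C');
  CLASS gas;
  MODEL octane = gas;
RUN;
QUIT;
```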