GENERAL LINEAR MODELS (GLM) - NDSU · 2019. 10. 29. · GENERAL LINEAR MODELS (GLM) • The GLM...


GENERAL LINEAR MODELS (GLM)

• The GLM method performs analysis of variance (ANOVA) on balanced or unbalanced data.

• GLM uses a general linear model method for performing the ANOVA.

• The GLM method calculates Type I and Type III sums of squares.

• The GLM method allows for performing analyses using a mixed model.

• Many journals now require that submitted papers be analyzed using a mixed model method, such as PROC MIXED, instead of GLM.

• Many of the mixed model methods analyze the data using a mathematical method called restricted maximum likelihood (REML).

• If you have no missing data and the data are balanced, the GLM and REML methods provide nearly identical results.

TYPE I vs. TYPE III Sum of Squares

Type I SS

• Referred to as the Sequential or Hierarchical Sum of Squares.

• The Type I SS should be used only if the data are balanced, including no missing data.

• The order in which the terms appear in the model affects the calculated sums of squares.

• The sum of squares calculated for a term is adjusted for the term(s) that appear before it in the model statement.

• For example, if the model is yield=A B A*B, the Type I SS are calculated as:

o SS A
o SS B|A (SS B given A is in the model)
o SS A*B|A, B (SS A*B given A and B are in the model)

• THESE SUMS OF SQUARES SHOULD NOT BE USED FOR HYPOTHESIS TESTING UNLESS YOU HAVE BALANCED DATA, INCLUDING NO MISSING DATA.

Type III SS

• THESE SUMS OF SQUARES ARE THE APPROPRIATE ONES TO USE FOR F-TESTS IF YOU HAVE UNBALANCED DATA, INCLUDING MISSING DATA.

• The order of terms in the model statement does not affect the calculated sums of squares.


• The Type III sums of squares are calculated assuming all other terms are in the model. If you have a two-factor factorial, the sums of squares would be calculated as:

o SS A|B, A*B (SS A given B and A*B are in the model)
o SS B|A, A*B (SS B given A and A*B are in the model)
o SS A*B|A, B (SS A*B given A and B are in the model)
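The difference between the two definitions can be checked numerically. The sketch below is not SAS code and is not part of the original handout: it is a minimal pure-Python illustration that fits nested least-squares models to a small hypothetical unbalanced 2x2 data set (the data values, the function names, and the omission of the Rep term are all assumptions made for illustration). Type III SS are computed with sum-to-zero (+1/-1) coding, which is a common way to obtain them when no cell is empty.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def sse(cols, y):
    """Residual sum of squares after regressing y on the given columns."""
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(bj * cj[i] for bj, cj in zip(beta, cols)) for i in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def design(data):
    """Intercept plus sum-to-zero (+1/-1) columns for a 2x2 factorial."""
    one = [1.0] * len(data)
    a = [1.0 if r[0] else -1.0 for r in data]
    b = [1.0 if r[1] else -1.0 for r in data]
    return one, {"A": a, "B": b, "A*B": [u * v for u, v in zip(a, b)]}


def type1_ss(data, order):
    """Sequential SS: each term is adjusted only for the terms listed before it."""
    y = [r[2] for r in data]
    one, term = design(data)
    cols, out = [one], {}
    prev = sse(cols, y)
    for name in order:
        cols = cols + [term[name]]
        cur = sse(cols, y)
        out[name] = prev - cur
        prev = cur
    return out


def type3_ss(data):
    """Partial SS: each term is adjusted for all other terms in the model."""
    y = [r[2] for r in data]
    one, term = design(data)
    full = sse([one] + list(term.values()), y)
    return {
        name: sse([one] + [c for n, c in term.items() if n != name], y) - full
        for name in term
    }


# Hypothetical unbalanced data: (level of A, level of B, yield).
unbalanced = [(0, 0, 13), (0, 0, 14), (0, 1, 21), (1, 0, 29),
              (1, 0, 30), (1, 0, 31), (1, 1, 35), (1, 1, 36)]

print(type1_ss(unbalanced, ["A", "B", "A*B"]))  # SS A is unadjusted
print(type1_ss(unbalanced, ["B", "A", "A*B"]))  # SS A is adjusted for B
print(type3_ss(unbalanced))                     # no term order involved
```

With these unbalanced data the Type I SS for A changes when A and B swap places, while the Type III function takes no term order at all. The last term fitted in a Type I sequence (A*B here) is adjusted for everything else, so its Type I and Type III values agree.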


Factorial, A and B fixed, and model order is A, B, and A*B

The GLM Procedure

Class Level Information

Class    Levels    Values
Rep      4         1 2 3 4
A        2         0 1
B        2         0 1

Number of Observations Read    16
Number of Observations Used    16


Dependent Variable: Yield

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               6       1149.000000     191.500000      82.07    <.0001
Error               9         21.000000       2.333333
Corrected Total    15       1170.000000

R-Square    Coeff Var    Root MSE    Yield Mean
0.982051     6.110101    1.527525      25.00000

Source    DF      Type I SS    Mean Square    F Value    Pr > F
Rep        3     32.5000000     10.8333333       4.64    0.0317
A          1    930.2500000    930.2500000     398.68    <.0001
B          1    182.2500000    182.2500000      78.11    <.0001
A*B        1      4.0000000      4.0000000       1.71    0.2229

Source    DF    Type III SS    Mean Square    F Value    Pr > F
Rep        3     32.5000000     10.8333333       4.64    0.0317
A          1    930.2500000    930.2500000     398.68    <.0001
B          1    182.2500000    182.2500000      78.11    <.0001
A*B        1      4.0000000      4.0000000       1.71    0.2229
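The derived columns in this output follow from the sums of squares and degrees of freedom by simple arithmetic. As a minimal sketch (Python is used here only as a calculator; the variable names are illustrative and not part of the SAS output), the F values, R-Square, Root MSE, and Coeff Var can be reproduced as follows:

```python
import math

# Values copied from the ANOVA output above.
ss = {"Rep": 32.5, "A": 930.25, "B": 182.25, "A*B": 4.0}
df = {"Rep": 3, "A": 1, "B": 1, "A*B": 1}
ss_error, df_error = 21.0, 9
ss_total = 1170.0
yield_mean = 25.0

mse = ss_error / df_error                                # Error Mean Square
f_values = {k: (ss[k] / df[k]) / mse for k in ss}        # F = MS term / MS error
r_square = (ss_total - ss_error) / ss_total              # Model SS / Total SS
root_mse = math.sqrt(mse)
coeff_var = 100 * root_mse / yield_mean                  # CV as a percentage

for name, f in f_values.items():
    print(f"{name:4s} F = {f:.2f}")
print(f"R-Square = {r_square:.6f}  Root MSE = {root_mse:.6f}  Coeff Var = {coeff_var:.6f}")
```

Because A and B are fixed in this analysis, every F test uses the Error Mean Square as its denominator.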


[Figure: Distribution of Yield. Y-axis: Yield (15 to 35); x-axis: A (levels 0 and 1).]


t Tests (LSD) for Yield

Note: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Alpha                           0.05
Error Degrees of Freedom        9
Error Mean Square               2.333333
Critical Value of t             2.26216
Least Significant Difference    1.7278

Yield t Grouping for Means of A (Alpha = 0.05)
Means covered by the same bar are not significantly different.

A    Estimate
1    32.6250
0    17.3750
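The Least Significant Difference reported above can be reproduced from the other values in the output. The sketch below uses the usual formula LSD = t * sqrt(2 * MSE / n), where n is the number of observations in each mean (8 here, since each level of A covers half of the 16 observations). The critical t is copied from the output rather than recomputed, and the variable names are illustrative.

```python
import math

t_crit = 2.26216     # Critical Value of t from the output (alpha = 0.05, 9 error df)
mse = 2.333333       # Error Mean Square from the output
n_per_mean = 8       # observations averaged in each A mean

lsd = t_crit * math.sqrt(2 * mse / n_per_mean)
diff = 32.6250 - 17.3750   # difference between the two A means

print(f"LSD = {lsd:.4f}")
print(f"Difference between A means = {diff:.4f}")
print(f"Significantly different: {diff > lsd}")
```

The same calculation applies to the B means, because each B mean is also based on 8 observations and uses the same error term.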


[Figure: Distribution of Yield. Y-axis: Yield (15 to 35); x-axis: B (levels 0 and 1).]


t Tests (LSD) for Yield

Note: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Alpha                           0.05
Error Degrees of Freedom        9
Error Mean Square               2.333333
Critical Value of t             2.26216
Least Significant Difference    1.7278

Yield t Grouping for Means of B (Alpha = 0.05)
Means covered by the same bar are not significantly different.

B    Estimate
1    28.3750
0    21.6250


Level of A    Level of B    N    Yield Mean    Yield Std Dev
0             0             4    13.5000000    1.29099445
0             1             4    21.2500000    1.70782513
1             0             4    29.7500000    2.50000000
1             1             4    35.5000000    2.64575131

[Figure: Distribution of Yield. Y-axis: Yield (15 to 35); x-axis: A*B (combinations 0 0, 0 1, 1 0, 1 1).]


Factorial, A and B fixed, and model order is A*B, B, and A

The GLM Procedure

Class Level Information

Class    Levels    Values
Rep      4         1 2 3 4
A        2         0 1
B        2         0 1

Number of Observations Read    16
Number of Observations Used    16


Dependent Variable: Yield

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               6       1149.000000     191.500000      82.07    <.0001
Error               9         21.000000       2.333333
Corrected Total    15       1170.000000

R-Square    Coeff Var    Root MSE    Yield Mean
0.982051     6.110101    1.527525      25.00000

Source    DF      Type I SS    Mean Square    F Value    Pr > F
Rep        3      32.500000      10.833333       4.64    0.0317
A*B        3    1116.500000     372.166667     159.50    <.0001
B          0       0.000000              .          .         .
A          0       0.000000              .          .         .

Source    DF    Type III SS    Mean Square    F Value    Pr > F
Rep        3     32.5000000     10.8333333       4.64    0.0317
A*B        1      4.0000000      4.0000000       1.71    0.2229
B          1    182.2500000    182.2500000      78.11    <.0001
A          1    930.2500000    930.2500000     398.68    <.0001
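The Type I line for A*B in this run (3 DF, SS 1116.5) is the variation among all four A-by-B treatment combinations. Because A*B is fitted first, it absorbs the A and B main effects, which is why B and A are left with 0 DF. As a minimal sketch, this can be verified from the cell means reported elsewhere in the output (4 observations per cell). The names below are illustrative, and the shortcut formulas assume balanced data.

```python
# Cell means from the A-by-B means table; each is the mean of n = 4 observations.
cell_mean = {(0, 0): 13.50, (0, 1): 21.25, (1, 0): 29.75, (1, 1): 35.50}
n = 4
grand = sum(cell_mean.values()) / len(cell_mean)


def marginal(factor, level):
    """Mean over the cells at one level of a factor (0 = A, 1 = B)."""
    vals = [m for key, m in cell_mean.items() if key[factor] == level]
    return sum(vals) / len(vals)


# Variation among the four treatment combinations (3 DF).
ss_cells = n * sum((m - grand) ** 2 for m in cell_mean.values())

# Main-effect SS: each marginal mean is based on 2 * n observations.
ss_a = 2 * n * sum((marginal(0, lv) - grand) ** 2 for lv in (0, 1))
ss_b = 2 * n * sum((marginal(1, lv) - grand) ** 2 for lv in (0, 1))
ss_ab = ss_cells - ss_a - ss_b

print(f"SS among cells = {ss_cells}")   # Type I SS for A*B when it is listed first
print(f"SS A = {ss_a}, SS B = {ss_b}, SS A*B = {ss_ab}")
```

The three components add up to the 3-DF value, which matches the Type III lines (930.25 + 182.25 + 4 = 1116.5). Only the Type I partition changes with model order.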


Factorial, A and B random, and model order is A, B, and A*B
Output as it Would Appear on the Exam

The GLM Procedure

Class Level Information

Class    Levels    Values
Rep      4         1 2 3 4
A        2         0 1
B        2         0 1

Number of Observations Read    16
Number of Observations Used    16


Dependent Variable: Yield

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               6       1149.000000     191.500000      82.07    <.0001
Error               9         21.000000       2.333333
Corrected Total    15       1170.000000

R-Square    Coeff Var    Root MSE    Yield Mean
0.982051     6.110101    1.527525      25.00000

Source    DF      Type I SS    Mean Square    F Value    Pr > F
Rep        3     32.5000000     10.8333333       4.64    0.0317
A          1    930.2500000    930.2500000     398.68    <.0001
B          1    182.2500000    182.2500000      78.11    <.0001
A*B        1      4.0000000      4.0000000       1.71    0.2229

Source    DF    Type III SS    Mean Square    F Value    Pr > F
Rep        3     32.5000000     10.8333333       4.64    0.0317
A          1    930.2500000    930.2500000     398.68    <.0001
B          1    182.2500000    182.2500000      78.11    <.0001
A*B        1      4.0000000      4.0000000       1.71    0.2229


Level of A    N    Yield Mean    Yield Std Dev
0             8    17.3750000    4.37321392
1             8    32.6250000    3.88908730

Level of B    N    Yield Mean    Yield Std Dev
0             8    21.6250000    8.87914893
1             8    28.3750000    7.89099849

Level of A    Level of B    N    Yield Mean    Yield Std Dev
0             0             4    13.5000000    1.29099445
0             1             4    21.2500000    1.70782513
1             0             4    29.7500000    2.50000000
1             1             4    35.5000000    2.64575131


Factorial, A and B random, model order is A, B, and A*B
SAS Code is for an all random model

YOU WILL NOT SEE THIS FORMAT ON PLSC 724 EXAMS

The GLM Procedure

Class Level Information

Class    Levels    Values
Rep      4         1 2 3 4
A        2         0 1
B        2         0 1

Number of Observations Read    16
Number of Observations Used    16


Dependent Variable: Yield

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               6       1149.000000     191.500000      82.07    <.0001
Error               9         21.000000       2.333333
Corrected Total    15       1170.000000

R-Square    Coeff Var    Root MSE    Yield Mean
0.982051     6.110101    1.527525      25.00000

Source    DF    Type III SS    Mean Square    F Value    Pr > F
Rep        3     32.5000000     10.8333333       4.64    0.0317
A          1    930.2500000    930.2500000     398.68    <.0001
B          1    182.2500000    182.2500000      78.11    <.0001
A*B        1      4.0000000      4.0000000       1.71    0.2229

Tests of Hypotheses Using the Type III MS for A*B as an Error Term

Source    DF    Type III SS    Mean Square    F Value    Pr > F
A          1    930.2500000    930.2500000     232.56    0.0417
B          1    182.2500000    182.2500000      45.56    0.0936
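When A and B are random, the main effects are tested against the A*B mean square (1 DF) instead of the Error Mean Square, which is why the F values drop and the p-values rise relative to the fixed-effects table. The sketch below reproduces the two F tests. It relies on a closed form that holds only for 1 numerator and 1 denominator DF: an F(1, 1) variate is the square of a t variate with 1 DF, which follows a Cauchy distribution. The function name is illustrative.

```python
import math

ms = {"A": 930.25, "B": 182.25}   # Type III mean squares for the main effects
ms_ab = 4.0                       # Type III mean square for A*B (error term, 1 DF)


def p_value_f11(f):
    """Upper-tail probability for an F statistic with 1 and 1 DF.

    F(1, 1) is the square of a t variate with 1 DF (a Cauchy variate), so
    P(F > f) = 1 - (2 / pi) * atan(sqrt(f)).
    """
    return 1 - (2 / math.pi) * math.atan(math.sqrt(f))


results = {}
for name, value in ms.items():
    f = value / ms_ab
    results[name] = (f, p_value_f11(f))
    print(f"{name}: F = {f:.2f}, Pr > F = {results[name][1]:.4f}")
```

For other degrees of freedom this shortcut does not apply, and a statistical library or F table would be needed.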


t Tests (LSD) for Yield

Note: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Alpha                           0.05
Error Degrees of Freedom        1
Error Mean Square               4
Critical Value of t             12.70620
Least Significant Difference    12.706

Means with the same letter are not significantly different.

t Grouping    Mean      N    A
A             32.625    8    1
B             17.375    8    0


t Tests (LSD) for Yield

Note: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Alpha                           0.05
Error Degrees of Freedom        1
Error Mean Square               4
Critical Value of t             12.70620
Least Significant Difference    12.706

Means with the same letter are not significantly different.

t Grouping    Mean      N    B
A             28.375    8    1
A
A             21.625    8    0
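The much larger LSD in the random-model output comes from the change of error term: the A*B mean square (4, with 1 DF) replaces the Error Mean Square (2.333333, with 9 DF). The sketch below reproduces the calculation. The closed form used for the critical t is valid only for 1 DF, where the t distribution is Cauchy, and the variable names are illustrative.

```python
import math

alpha = 0.05
# For 1 DF the t distribution is Cauchy, so the two-sided critical value
# has the closed form tan(pi / 2 * (1 - alpha)).
t_crit = math.tan(math.pi / 2 * (1 - alpha))
ms_ab = 4.0        # error term: Type III mean square for A*B
n_per_mean = 8     # observations averaged in each main-effect mean

lsd = t_crit * math.sqrt(2 * ms_ab / n_per_mean)

significant = {}
for name, high, low in [("A", 32.625, 17.375), ("B", 28.375, 21.625)]:
    significant[name] = (high - low) > lsd
    print(f"{name}: difference = {high - low:.3f}, LSD = {lsd:.3f}, "
          f"significant = {significant[name]}")
```

This agrees with the letter groupings: the A means differ by 15.25 and receive different letters, while the B means differ by only 6.75 and share the letter A.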


Level of A    Level of B    N    Yield Mean    Yield Std Dev
0             0             4    13.5000000    1.29099445
0             1             4    21.2500000    1.70782513
1             0             4    29.7500000    2.50000000
1             1             4    35.5000000    2.64575131


Analysis Using PROC MIXED, A and B both random
OPTIONAL: NOT REQUIRED TO KNOW FOR PLSC 724 EXAMS

The Mixed Procedure

Model Information

Data Set                     WORK.FACTORIAL
Dependent Variable           Yield
Covariance Structure         Variance Components
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Containment

Class Level Information

Class    Levels    Values
Rep      4         1 2 3 4
A        2         0 1
B        2         0 1

Dimensions

Covariance Parameters    5
Columns in X             1
Columns in Z             12
Subjects                 1
Max Obs per Subject      16

Number of Observations

Number of Observations Read        16
Number of Observations Used        16
Number of Observations Not Used    0


Iteration History

Iteration    Evaluations    -2 Res Log Like    Criterion
0            1                 110.69137712
1            1                  73.54143596    0.00000000

Convergence criteria met.

Covariance Parameter Estimates

Cov Parm    Estimate
Rep         2.1250
A           115.78
B           22.2813
A*B         0.4167
Residual    2.3333

Fit Statistics

-2 Res Log Likelihood       73.5
AIC (Smaller is Better)     83.5
AICC (Smaller is Better)    90.2
BIC (Smaller is Better)     80.5
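For these balanced data, the REML covariance parameter estimates can be recovered from the GLM mean squares, which illustrates the earlier point that GLM and REML give nearly identical results when the data are balanced and complete. The sketch below equates each mean square to its expected value under an all-random model. The expected mean squares used are the standard textbook ones for a randomized complete block factorial with r reps, a levels of A, and b levels of B. They are an assumption here, since the handout does not list them, and the variable names are illustrative.

```python
# Mean squares from the GLM output (SS / DF).
ms = {"Rep": 32.5 / 3, "A": 930.25, "B": 182.25, "A*B": 4.0, "Error": 21.0 / 9}
r, a, b = 4, 2, 2   # number of reps, levels of A, levels of B

# Assumed expected mean squares (all effects random):
#   E[MS Rep] = s2 + a*b*s2_Rep
#   E[MS A]   = s2 + r*s2_AB + r*b*s2_A
#   E[MS B]   = s2 + r*s2_AB + r*a*s2_B
#   E[MS A*B] = s2 + r*s2_AB
#   E[MS Err] = s2
var = {
    "Rep": (ms["Rep"] - ms["Error"]) / (a * b),
    "A": (ms["A"] - ms["A*B"]) / (r * b),
    "B": (ms["B"] - ms["A*B"]) / (r * a),
    "A*B": (ms["A*B"] - ms["Error"]) / r,
    "Residual": ms["Error"],
}

for name, estimate in var.items():
    print(f"{name:8s} {estimate:.4f}")
```

The agreement relies on every moment estimate being positive. With unbalanced data, or when a moment estimate is negative, REML and the mean-square approach generally give different answers.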

Solution for Random Effects

Effect    Rep    A    B    Estimate    Std Err Pred    DF    t Value    Pr > |t|
Rep       1                 -1.5692          0.9352     9      -1.68      0.1276
Rep       2                 -0.1962          0.9352     9      -0.21      0.8385
Rep       3                  1.5692          0.9352     9       1.68      0.1276
Rep       4                  0.1962          0.9352     9       0.21      0.8385
A                0          -7.5922          7.6249     9      -1.00      0.3454
A                1           7.5922          7.6249     9       1.00      0.3454


Solution for Random Effects

Effect    Rep    A    B    Estimate    Std Err Pred    DF    t Value    Pr > |t|
B                     0     -3.3009          3.3742     9      -0.98      0.3535
B                     1      3.3009          3.3742     9       0.98      0.3535
A*B              0    0     -0.2529          0.6100     9      -0.41      0.6882
A*B              0    1      0.2255          0.6100     9       0.37      0.7201
A*B              1    0      0.1911          0.6100     9       0.31      0.7612
A*B              1    1     -0.1638          0.6100     9      -0.27      0.7943