4 normal probability plots at once
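4 normal probability plots at once
par(mfrow=c(2,2))  # arrange the plots in a 2 x 2 grid
for(i in 1:4) {
  # column 1 holds the data; column 2 holds the group number (1 to 4)
  qqnorm(dataframe[,1][dataframe[,2]==i], ylab="Data quantiles")
  title(paste("yourchoice", i, sep=""))
}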
These plots can be produced by going to “file” and “new” and
“script file”. Paste the commands into the script file window,
press “F10” and the four plots are produced automatically.
4 histograms all at once
Same as above, but instead of qqnorm, use hist, and you only
need one column rather than columns 1 and 2 of the dataframe. Also, don’t forget
to change your label.
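A minimal sketch of the histogram version, assuming the four variables sit in the first four columns of the data frame. The simulated `dataframe`, the label "Data values", and the `bin_counts` check are illustrative and not from the slides.

```r
# Hypothetical stand-in for the lab's `dataframe`: four numeric columns.
set.seed(1)
dataframe <- data.frame(a = rnorm(50), b = rnorm(50),
                        c = rnorm(50), d = rnorm(50))

par(mfrow = c(2, 2))                 # 2 x 2 grid, as for the Q-Q plots
bin_counts <- integer(4)
for (i in 1:4) {
  h <- hist(dataframe[, i],          # one column per plot
            xlab = "Data values",    # label changed from "Data quantiles"
            main = paste("yourchoice", i, sep = ""))
  bin_counts[i] <- sum(h$counts)     # observations that went into the plot
}
```

If the data are instead stacked in one column with a group number in a second column, the subsetting used for qqnorm carries over unchanged.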
Lab: Chi-Squared Test (X²) Lack of Fit
November 10, 2000
History
Invented in 1900
Oldest inference procedure still used in its original form
English statistician Karl Pearson
The X² Test
Used when you have data values for two categorical variables
The data are laid out in what is called a two-way table
For example: men/women and NSOE track; regenerated seaweed (yes/no) and access level (limpet only/limpet and fish/etc.).
Example: Why do Men and Women Participate in Sports?
Desire to win or do better than others: called social comparison
Desire to improve one’s skills or to do one’s best: called mastery
Data
Collected from 67 male and 67 female undergraduate students at a large university
Survey given asking about students’ sports goals.
Students were all categorized as either high or low with regard to both of the questions:
– high or low social comparison
– high or low mastery
Duda, Joan L., Leisure Sciences, 10 (1988), pp. 95-106
Groups
This leads to four groups:
– High social comparison, high mastery (HSC-HM)
– High social comparison, low mastery (HSC-LM)
– Low social comparison, high mastery (LSC-HM)
– Low social comparison, low mastery (LSC-LM)
We want to compare these groups for men and women.
Observed Counts for Sports Goals
Goal     Female  Male
HSC-HM       14    31
HSC-LM        7    18
LSC-HM       21     5
LSC-LM       25    13
Total        67    67
1. Add Totals
Observed Counts for Sports Goals
Goal     Female  Male  Total
HSC-HM       14    31     45
HSC-LM        7    18     25
LSC-HM       21     5     26
LSC-LM       25    13     38
Total        67    67    134
Column: In this case, what population the observation comes from.
Row: Categorical response variable
Grand total: 134, the bottom-right entry
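As a sketch of this step in R, the observed counts can be entered as a matrix and the totals added with addmargins. The object names `counts` and `with_totals` are illustrative; the margins are labelled "Sum" by default rather than "Total".

```r
# Observed counts from the table: rows are goals, columns are sex.
counts <- matrix(c(14, 31,
                    7, 18,
                   21,  5,
                   25, 13),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(Goal = c("HSC-HM", "HSC-LM", "LSC-HM", "LSC-LM"),
                                 Sex  = c("Female", "Male")))

# Append the row totals, the column totals, and the grand total.
with_totals <- addmargins(counts)
print(with_totals)
```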
A Cell
A table with r rows and c columns contains r x c cells
X² is really an analysis of 5 things in this table:
Frequency (actual count)
Percent of overall total
Percent of row
Percent of column
Expected count
Frequency: Just the cell count
Overall Percent: Cell count divided by grand total
14/134 = 0.104. That is, 10.4% of all those studied were HSC-HM and female.
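A short sketch of the overall percents for every cell at once, assuming the counts are stored in a matrix called `counts` (an illustrative name). prop.table with no margin divides each cell by the grand total.

```r
counts <- matrix(c(14, 31, 7, 18, 21, 5, 25, 13),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(Goal = c("HSC-HM", "HSC-LM", "LSC-HM", "LSC-LM"),
                                 Sex  = c("Female", "Male")))

# Each cell divided by the grand total (134).
overall <- prop.table(counts)
print(round(100 * overall, 1))   # shown as percents
```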
Row Percent: Cell count divided by row total
14/45 = 0.311. That is, of all those students reporting HSC-HM, 31% were female.
Column Percent: Cell count divided by column total
14/67 = 0.209. That is, of all female student participants, 21% were HSC-HM.
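Row and column percents can be sketched the same way: the second argument of prop.table picks the margin to divide by (1 for rows, 2 for columns). The names `counts`, `row_pct`, and `col_pct` are illustrative.

```r
counts <- matrix(c(14, 31, 7, 18, 21, 5, 25, 13),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(Goal = c("HSC-HM", "HSC-LM", "LSC-HM", "LSC-LM"),
                                 Sex  = c("Female", "Male")))

row_pct <- prop.table(counts, 1)   # each cell divided by its row total
col_pct <- prop.table(counts, 2)   # each cell divided by its column total

print(round(100 * row_pct, 1))
print(round(100 * col_pct, 1))
```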
Expected Count
Coming later to a slide near you...
These percents are useful in graphical analysis.
Overall, row, and column percent can be calculated for each cell
Then questions of interest can be asked
We are interested in the effect of sex on sports goals.
In this case, we would examine the column percents
Column percents for sports goals (%, rounded)
Goal     Female  Male
HSC-HM       21    46
HSC-LM       10    27
LSC-HM       31     7
LSC-LM       37    19
Total       100   100
[Figure: bar chart of column percents (vertical axis 0 to 50) for Female and Male, with one bar each for HSC-HM, HSC-LM, LSC-HM, and LSC-LM]
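A chart of the column percents could be drawn along these lines. This is only a sketch: the axis label, the legend placement, and the object names are assumptions, not the settings used for the original slide.

```r
counts <- matrix(c(14, 31, 7, 18, 21, 5, 25, 13),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(Goal = c("HSC-HM", "HSC-LM", "LSC-HM", "LSC-LM"),
                                 Sex  = c("Female", "Male")))

col_pct <- 100 * prop.table(counts, 2)   # column percents

# One group of bars per sex, one bar per goal within each group.
bar_mid <- barplot(col_pct, beside = TRUE, ylim = c(0, 50),
                   ylab = "Percent of column",
                   legend.text = rownames(col_pct))
```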
Surprise, surprise - we want to ask whether these apparently obvious differences are significant.
Can these differences be attributed to chance?
Calculate the chi-square and compare to a chi-square distribution
Determine the p-value
A low p-value means we reject our null hypothesis (sound familiar?)
The hypotheses: Null
No association exists between our row and our column variables
– No association exists between sex and sports goals
– The distributions of sports goals in the male and female populations are the same.
The hypotheses: Alternative
An association exists between the row and column variables
– No particular direction (not one- or two-sided)
– The distributions of sports goals in the male and female populations are not all the same.
– Includes many kinds of possible associations
– For example: “Men rate social comparison higher as a goal than do women”
OK: Now back to the Expected Count
If the null hypothesis were true, what would the count in each cell be?
For women in the HSC-HM cell, it would work like this:
– 33.6% of all respondents are HSC-HM
– We have 67 women
– So, if no sex difference exists (our null), we would expect that 33.6% of our 67 women would be HSC-HM --> 22.5 women.
Expected Count
1. 45/134=33.6% of all respondents are HSC-HM.
2. 33.6% of 67 women is 22.5.
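The same two steps, applied to every cell at once, amount to (row total x column total) / grand total. A sketch in R, with `counts` and `expected` as illustrative names:

```r
counts <- matrix(c(14, 31, 7, 18, 21, 5, 25, 13),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(Goal = c("HSC-HM", "HSC-LM", "LSC-HM", "LSC-LM"),
                                 Sex  = c("Female", "Male")))

# Expected count under the null: row total * column total / grand total.
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)
print(expected)
```

Because there are 67 women and 67 men, the expected counts are the same in both columns: half of each row total.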
Finally: The Chi-Squared Statistic Itself
Compare the entire set of observed counts with the set of expected counts.
Take the difference in each cell between observed and expected
Square each difference
Normalize these (divide by the expected count)
Sum over all cells.
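The steps above can be followed one at a time in R. This is a sketch with illustrative variable names, using the observed sports-goals counts:

```r
counts <- matrix(c(14, 31, 7, 18, 21, 5, 25, 13),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(Goal = c("HSC-HM", "HSC-LM", "LSC-HM", "LSC-LM"),
                                 Sex  = c("Female", "Male")))
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

differences <- counts - expected      # observed minus expected, per cell
squared     <- differences^2          # square each difference
normalized  <- squared / expected     # divide by the expected count
x_squared   <- sum(normalized)        # sum over all cells

print(round(x_squared, 3))
```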
The Formula:
$$X^2 = \sum_{\text{all cells}} \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$
Large values of X² provide evidence against the null hypothesis
A chi-square distribution is used to obtain the p-value
Degrees of freedom are (r-1)(c-1)
In this case...
Chi-squared = 24.898 on 3 df.
The p-value is less than 0.0005.
If the null hypothesis were true, the chance of obtaining a chi-squared value greater than or equal to this would be very small
Clear evidence against the null hypothesis
Strong evidence that female and male students have different distributions of sports goals.
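These numbers can be checked with R's built-in chisq.test, shown here as a sketch; the object names are illustrative. The degrees of freedom follow from (r-1)(c-1) = (4-1)(2-1) = 3.

```r
counts <- matrix(c(14, 31, 7, 18, 21, 5, 25, 13),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(Goal = c("HSC-HM", "HSC-LM", "LSC-HM", "LSC-LM"),
                                 Sex  = c("Female", "Male")))

result <- chisq.test(counts)
print(result)

# The p-value is the upper tail area of the chi-square distribution.
df      <- (nrow(counts) - 1) * (ncol(counts) - 1)
p_value <- pchisq(unname(result$statistic), df = df, lower.tail = FALSE)
```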
Is that all you can say?
No, you can and should combine the test with a description that shows the relationship.
– Percents in our earlier table and our graph
– Summary comments: the percent of males in each of the HSC goal classes is more than twice the percent of females.
– The HSC-HM group contains 46% of the males, but only 21% of the females
– The HSC-LM group contains 27% of the males and only 10% of the females
– We conclude that males are more likely to be motivated by social comparison goals and females are more likely to be motivated by mastery goals.
Important to remember:
The chi-square distribution only approximates the sampling distribution of X²; the approximation becomes more accurate as the cell counts increase.
For 2 x 2 tables, the expected count in each of the 4 cells must be five or higher.
For tables larger than 2 x 2, the average of the expected counts must be 5 or higher, and the smallest expected count must be 1 or more.
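For the 4 x 2 sports-goals table, the rule for larger tables can be checked from the expected counts that chisq.test returns. A sketch, with `average_ok` and `smallest_ok` as illustrative names:

```r
counts <- matrix(c(14, 31, 7, 18, 21, 5, 25, 13),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(Goal = c("HSC-HM", "HSC-LM", "LSC-HM", "LSC-LM"),
                                 Sex  = c("Female", "Male")))

expected <- chisq.test(counts)$expected

average_ok  <- mean(expected) >= 5   # average expected count is 5 or higher
smallest_ok <- min(expected)  >= 1   # smallest expected count is 1 or more
print(c(average = average_ok, smallest = smallest_ok))
```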
Important to remember:
This is sometimes called the chi-squared test for homogeneity or the chi-squared test of independence.
Although this is one of the most widely used of statistical tools, it is also one of the least informative.
– The only thing you produce is a p-value, and there is no associated parameter to describe the degree of dependence
– The alternative hypothesis is very general (that the row and column variables are not independent)