Handout Two: Describing/Explaining Quantitative Data and Introduction to SPSS
EPSE 592: Experimental Designs and Analysis in Educational Research
Instructor: Dr. Amery Wu
About Analysis of Variance Designs
• Measurement of the data: quantitative
• Type of statistical inference: descriptive and inferential
• Type of modeling: summative/descriptive and explanatory/predictive

Analysis of Variance Design

                      Measurement of Data
Type of Inference     Quantitative              Categorical
Descriptive           Summative/Descriptive     Summative/Descriptive
                      Explanatory/Predictive    Explanatory/Predictive
Inferential           Summative/Descriptive     Summative/Descriptive
                      Explanatory/Predictive    Explanatory/Predictive
Goals of Today’s Class
Computing the Standard Deviation of a Sample - D2CAR
Lab Activity: See Excel file “Mean, SD, Z Scores, & Pearson’s Correlation”
D: Deviation
2: Square
C: Collection
A: Average
R: Square Root
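The D2CAR steps can be sketched in a few lines of Python. This is a minimal illustration, not the Excel lab file: the function name `sd_d2car` and the scores are made up for the example. The slide does not say whether the "Average" step divides by n or by n - 1; the sketch assumes n - 1 by default, which is what Python's `statistics.stdev` uses, and offers n as an option.

```python
import math
import statistics


def sd_d2car(scores, sample=True):
    """Standard deviation computed step by step following D2CAR."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]      # D: deviation from the mean
    squares = [d ** 2 for d in deviations]       # 2: square each deviation
    collected = sum(squares)                     # C: collect (sum) the squares
    denominator = n - 1 if sample else n         # assumption: n - 1 by default
    average = collected / denominator            # A: average the collected squares
    return math.sqrt(average)                    # R: take the square root


# Hypothetical scores, not taken from the course data.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

print(sd_d2car(scores, sample=False))            # dividing by n
print(sd_d2car(scores))                          # dividing by n - 1
print(statistics.stdev(scores))                  # library cross-check (n - 1)
```

The two denominators give noticeably different answers in a sample this small, so it is worth checking which one a given tool reports.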
Interpretation of the Standard Deviation of a Sample
Individuals in a sample differ in their values of the dependent variable (DV). These differences are what we are interested in studying.
Standard deviation (SD) is a summative measure of the extent to which individuals in a sample differ.
Within a sample, some individuals have values close to the mean, while others are far away from it. The standard deviation is, roughly, the average distance from the mean across the n individuals in a given sample.
Transforming the Raw Scores to the Z Scores
Lab Activity-See Excel File “Mean, SD, Z Scores, & Pearson’s Correlation”
Interpretations of the Z Scores
The Z score transformation re-scales the data to have a center of 0 and a unit of 1, which is called standardization. That is, the mean of the Z scores of a sample is 0, and the SD is 1.
A person’s Z score indicates how many standard deviation units he/she is away from the mean, on either side.
A Z score shows a person’s relative standing, on a scale from -∞ to ∞, compared with others in the sample.
For example, if Mary’s Z score is -1.75, she is 1.75 standard deviation units below the mean (on the left-hand side).
Note that the Z score transformation does not normalize a skewed raw score distribution.
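The points above can be checked numerically. The following is a rough sketch with made-up, positively skewed scores; the helper names `z_scores` and `skewness` are illustrative, the SD uses an n - 1 denominator (an assumption), and the skewness formula is the simple moment-based one rather than the adjusted statistic SPSS reports.

```python
import statistics


def z_scores(scores):
    """Standardize scores: subtract the mean, divide by the SD (n - 1)."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    return [(x - mean) / sd for x in scores]


def skewness(scores):
    """Simple moment-based skewness (not SPSS's adjusted formula)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    return m3 / m2 ** 1.5


# Hypothetical, positively skewed raw scores.
raw = [1, 2, 2, 3, 3, 3, 4, 20]
z = z_scores(raw)

print("mean of Z:", round(statistics.mean(z), 10))
print("SD of Z:  ", round(statistics.stdev(z), 10))
print("skewness of raw scores:", round(skewness(raw), 4))
print("skewness of Z scores:  ", round(skewness(z), 4))
```

The Z scores have a mean of 0 and an SD of 1, but their skewness equals that of the raw scores: standardization shifts and re-scales the distribution without changing its shape.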
Computing the Pearson’s Correlation r
Lab Activity-See Excel File “Mean, SD, Z Scores & Pearson’s Correlation”
Use & Interpretation of Pearson’s r
• At least one of X and Y should be quantitative data.
• X and Y are assumed to be linearly related.
• It is a standardized measure (-1 to 1) for quantifying the covariation between the two variables.
• A positive r indicates that if people’s X scores are high (low), their Y scores tend to be high (low). A negative r indicates that if people’s X scores are high, their Y scores tend to be low. If there is no linear trend between the scores of X and Y, then r is zero.
• The square of r, the coefficient of determination, provides an estimate of the proportion of overlapping variance between X and Y (i.e., the degree to which the two sets of numbers vary together).
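One common way to compute Pearson’s r, sketched below, is to average the products of paired Z scores. This is an illustration with made-up data, not a reproduction of the Excel lab file; the function name `pearson_r` is hypothetical, and the n - 1 denominator is an assumption that must match the one used for the SDs.

```python
import statistics


def pearson_r(x, y):
    """Pearson's r as the (n - 1)-averaged product of paired Z scores."""
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sd_x, sd_y = statistics.stdev(x), statistics.stdev(y)
    zx = [(v - mean_x) / sd_x for v in x]
    zy = [(v - mean_y) / sd_y for v in y]
    return sum(a * b for a, b in zip(zx, zy)) / (n - 1)


# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print("r   =", round(r, 4))
print("r^2 =", round(r ** 2, 4))    # coefficient of determination

# A perfectly decreasing linear pattern gives r = -1.
print(round(pearson_r(x, [10, 8, 6, 4, 2]), 4))
```

Because both variables are standardized first, r does not depend on the original units of X or Y.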
Use of Data in the Course
Data Source: Special thanks to Professor Susan J. Henly from the School of Nursing, University of Minnesota, for the SPSS data file presented in today’s class.
Ethics for Data Use: Under the guidelines of the Behavioral Research Ethics Board (BREB) at UBC, data circulated in this course cannot be used for purposes other than the learning activities required by this course, unless they are open to public use.
Description of Professor Susan J. Henly’s Data
This data set includes 40 participants (20 boys) who were randomly assigned to the treatment group (a new method to reduce injection pain) or the control group (just do it quickly!). Immediately after the injection, the children were asked to rate their pain on a 0-100 scale, while a nurse observer who could not hear their response also rated their pain based on their behavioural cues. The dependent variable (i.e., data) we are modeling (describing/summarizing or explaining/predicting) today is the level of pain reported by the child, “kidrate”.
Q: Judging by the above description, what was the research question? What type of design was used? What type of data was collected? And what kind of inference could be made?
Quantitative Methodology Network
Research Question
Design: Experimental, Observational
Data: Continuous, Categorical
Model: Descriptive/Summative, Explanatory/Predictive
Inference: Descriptive vs. Inferential, Relational vs. Causal
This Is Where We Will Be Today (A, Blue Cell)
• Remember, what we are doing is modeling the data (Data = Model + Residual) by:
  1. Describing/Summarizing
  2. Explaining/Predicting
• Note that the inferences remain at the sample level, with no intention to generalize to the population. Namely, neither C nor D is covered today.

                      Measurement of Data
Type of Inference     Continuous    Categorical
Descriptive           A             B
Inferential           C             D
Describing/Summarizing Central Tendency by Using Numbers

kidrate

Value     Frequency   Percent   Valid Percent   Cumulative Percent
35.00     1           2.5       2.5             2.5
36.00     1           2.5       2.5             5.0
38.00     1           2.5       2.5             7.5
41.00     1           2.5       2.5             10.0
46.00     2           5.0       5.0             15.0
47.00     1           2.5       2.5             17.5
50.00     2           5.0       5.0             22.5
51.00     1           2.5       2.5             25.0
52.00     1           2.5       2.5             27.5
54.00     1           2.5       2.5             30.0
57.00     1           2.5       2.5             32.5
59.00     1           2.5       2.5             35.0
60.00     2           5.0       5.0             40.0
61.00     2           5.0       5.0             45.0
63.00     1           2.5       2.5             47.5
64.00     1           2.5       2.5             50.0
65.00     2           5.0       5.0             55.0
69.00     2           5.0       5.0             60.0
70.00     1           2.5       2.5             62.5
71.00     1           2.5       2.5             65.0
72.00     1           2.5       2.5             67.5
73.00     1           2.5       2.5             70.0
74.00     1           2.5       2.5             72.5
75.00     1           2.5       2.5             75.0
77.00     3           7.5       7.5             82.5
83.00     1           2.5       2.5             85.0
84.00     1           2.5       2.5             87.5
86.00     1           2.5       2.5             90.0
95.00     1           2.5       2.5             92.5
100.00    3           7.5       7.5             100.0
Total     40          100.0     100.0

Statistics: kidrate
N Valid      40
N Missing    0
Mean         65.3250
Median       64.5000
Mode         77.00

Q1: In your opinion, which statistics best characterize the central tendency of kidrate, and why?
Q2: Can you tell approximately whether the distribution of kidrate is normal, positively skewed, or negatively skewed, and how?
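The mean, median, and mode reported by SPSS can be recomputed from the frequency table. The sketch below is one way to do this in Python; the values and frequencies are transcribed from the kidrate table above, while the variable names are illustrative.

```python
import statistics

# kidrate value -> frequency, transcribed from the frequency table above.
KIDRATE_FREQ = {
    35: 1, 36: 1, 38: 1, 41: 1, 46: 2, 47: 1, 50: 2, 51: 1, 52: 1, 54: 1,
    57: 1, 59: 1, 60: 2, 61: 2, 63: 1, 64: 1, 65: 2, 69: 2, 70: 1, 71: 1,
    72: 1, 73: 1, 74: 1, 75: 1, 77: 3, 83: 1, 84: 1, 86: 1, 95: 1, 100: 3,
}

# Expand the table back into one score per child.
kidrate = [value for value, freq in KIDRATE_FREQ.items() for _ in range(freq)]

n = len(kidrate)
mean = statistics.mean(kidrate)
median = statistics.median(kidrate)
modes = statistics.multimode(kidrate)   # all values tied for the highest frequency

print("N      =", n)
print("Mean   =", mean)
print("Median =", median)
print("Modes  =", modes)
```

Note that 77 and 100 each occur three times, so `multimode` returns both; the Statistics table lists 77, the smaller of the two.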
Describing/Summarizing Dispersion by Using Numbers

Descriptives: kidrate
                                                  Statistic   Std. Error
Mean                                              65.3250     2.74221
95% Confidence Interval for Mean   Lower Bound    59.7784
                                   Upper Bound    70.8716
5% Trimmed Mean                                   65.0556
Median                                            64.5000
Variance                                          300.789
Std. Deviation                                    17.34327
Minimum                                           35.00
Maximum                                           100.00
Range                                             65.00
Interquartile Range                               25.25
Skewness                                          .289        .374
Kurtosis                                          -.374       .733

Q1: How can the minimum and maximum help detect aberrant data points?
Q2: By looking at the mean and SD, can you tell whether the data are normally distributed, positively skewed, or negatively skewed?
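Most of the dispersion statistics in the Descriptives output can be recomputed from the raw scores. The sketch below uses the kidrate values from the frequency table earlier in this handout; the variable names are illustrative. It assumes n - 1 denominators for the variance and SD, and relies on the default "exclusive" method of Python's `statistics.quantiles` for the quartiles, which happens to reproduce the interquartile range shown here but may not match SPSS for every data set.

```python
import math
import statistics

# kidrate value -> frequency, transcribed from the frequency table in this handout.
KIDRATE_FREQ = {
    35: 1, 36: 1, 38: 1, 41: 1, 46: 2, 47: 1, 50: 2, 51: 1, 52: 1, 54: 1,
    57: 1, 59: 1, 60: 2, 61: 2, 63: 1, 64: 1, 65: 2, 69: 2, 70: 1, 71: 1,
    72: 1, 73: 1, 74: 1, 75: 1, 77: 3, 83: 1, 84: 1, 86: 1, 95: 1, 100: 3,
}
kidrate = [value for value, freq in KIDRATE_FREQ.items() for _ in range(freq)]

variance = statistics.variance(kidrate)        # n - 1 denominator
sd = statistics.stdev(kidrate)                 # square root of the variance
se_mean = sd / math.sqrt(len(kidrate))         # standard error of the mean
minimum, maximum = min(kidrate), max(kidrate)
value_range = maximum - minimum
q1, q2, q3 = statistics.quantiles(kidrate, n=4)
iqr = q3 - q1

print("Variance       =", round(variance, 3))
print("Std. Deviation =", round(sd, 5))
print("Std. Error     =", round(se_mean, 5))
print("Min, Max       =", minimum, maximum)
print("Range          =", value_range)
print("IQR            =", iqr)
```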
Describing & Summarizing Distribution by Using Pictures
[Figure: Histogram of kidrate. X-axis: kidrate (30.00 to 100.00); Y-axis: Frequency (0 to 10). Mean = 65.325, Std. Dev. = 17.34327, N = 40]
Q: What are the advantages and disadvantages of displaying data using a histogram?
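One way to explore this question is to bin the same scores with different bin widths and compare the pictures. The sketch below draws a text histogram of the kidrate values from the frequency table earlier in this handout. The bin start (30) and the widths (10 and 20) are assumptions chosen for illustration and are not necessarily the bins SPSS used; the function name `bin_counts` is hypothetical.

```python
from collections import Counter

# kidrate value -> frequency, transcribed from the frequency table in this handout.
KIDRATE_FREQ = {
    35: 1, 36: 1, 38: 1, 41: 1, 46: 2, 47: 1, 50: 2, 51: 1, 52: 1, 54: 1,
    57: 1, 59: 1, 60: 2, 61: 2, 63: 1, 64: 1, 65: 2, 69: 2, 70: 1, 71: 1,
    72: 1, 73: 1, 74: 1, 75: 1, 77: 3, 83: 1, 84: 1, 86: 1, 95: 1, 100: 3,
}
kidrate = [value for value, freq in KIDRATE_FREQ.items() for _ in range(freq)]


def bin_counts(scores, start, width):
    """Count scores per bin; each bin is keyed by its lower bound."""
    counts = Counter(start + width * ((x - start) // width) for x in scores)
    return dict(sorted(counts.items()))


for width in (10, 20):
    print(f"Bin width = {width}")
    for lower, count in bin_counts(kidrate, start=30, width=width).items():
        print(f"  {lower:>3} to {lower + width - 1:<3} {'*' * count}")
```

The wider bins hide the separate cluster of scores at 100 that the narrower bins reveal, which illustrates how the choice of bins shapes what a histogram shows.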
Describing & Summarizing Distribution by Using Pictures
Q: What are the advantages and disadvantages of displaying data using a stem-and-leaf plot?
Describing & Summarizing Distribution by Using Pictures
[Figure: Boxplot of kidrate. Axis: 30.00 to 100.00]
Explaining/Predicting the data (kidrate) by Gender - Using Numbers

Statistics: kidrate
                          Boy         Girl
Mean                      62.5000     68.1500
Median                    61.5000     71.5000
Mode                      60.00a      77.00a
Std. Deviation            14.89790    19.45920
Variance                  221.947     378.661
Skewness                  .249        .131
Std. Error of Skewness    .512        .512
Range                     59.00       65.00
Minimum                   36.00       35.00
Maximum                   95.00       100.00
a. Multiple modes exist. The smallest value is shown.
Explaining/Predicting the data (kidrate) by Gender - Using Pictures
Revisiting the Concept of Statistical Modeling Using Mean
Data = Model + Res.
Kidrate = Mean + Res.
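The decomposition can be made concrete by using the sample mean as the model for every child and treating what is left over as the residual. The following is a minimal sketch using the kidrate values from the frequency table earlier in this handout; the variable names are illustrative.

```python
# kidrate value -> frequency, transcribed from the frequency table in this handout.
KIDRATE_FREQ = {
    35: 1, 36: 1, 38: 1, 41: 1, 46: 2, 47: 1, 50: 2, 51: 1, 52: 1, 54: 1,
    57: 1, 59: 1, 60: 2, 61: 2, 63: 1, 64: 1, 65: 2, 69: 2, 70: 1, 71: 1,
    72: 1, 73: 1, 74: 1, 75: 1, 77: 3, 83: 1, 84: 1, 86: 1, 95: 1, 100: 3,
}
kidrate = [value for value, freq in KIDRATE_FREQ.items() for _ in range(freq)]

# Model: every child is "predicted" to report the sample mean.
model = sum(kidrate) / len(kidrate)

# Residual: what the model leaves unexplained for each child.
residuals = [score - model for score in kidrate]

# Data = Model + Residual holds for every child by construction.
reconstructed = [model + res for res in residuals]

# The residuals sum to (numerically) zero, and their squared size
# averaged over n - 1 is the sample variance.
residual_sum = sum(residuals)
residual_variance = sum(res ** 2 for res in residuals) / (len(kidrate) - 1)

print("Model (mean)      =", model)
print("Sum of residuals  =", round(residual_sum, 10))
print("Residual variance =", round(residual_variance, 3))
```

Seen this way, the variance and SD summarize how much of the data the mean-only model leaves in the residuals.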
Variable View of SPSS Data Editor: to specify the format of the spreadsheet
Data View of SPSS Data Editor: to enter and view the raw data
Lab Activity: Hands on SPSS (Statistics)
Please report the following statistics for the variable “Nurse-rated Pain”:
Central Tendency: Mean, Median, Mode
Dispersion: Minimum/Maximum, Range, Quartiles/Interquartile Range, SD/Variance
Instruction: Analyze/Descriptive Statistics/Frequencies/Variables (Enter Nurse-rated Pain)/Statistics…
Alternatively, you can use
Instruction: Analyze/Descriptive Statistics/Descriptives/Variables (Enter Nurse-rated Pain)/Options
Lab Activity: Hands on SPSS (Graphs)
Please report the histogram for the variable “Nurse-rated Pain”.
Instruction: Analyze/Descriptive Statistics/Frequencies/Variables (Enter Nurse-rated Pain)/Charts/Histograms
Alternatively, you can use the Graphs menu
Instruction: Graphs/Histogram/Variable (Enter Nurse-rated Pain)
Lab Activity: Hands on SPSS (Explore)
My personal preference for describing a continuous variable is to use the following command, which gives output of crucial and comprehensive information in both numbers and pictures.
Instruction: Analyze/Descriptive Statistics/Explore/Dependent list (enter “Nurse-rated Pain”)
Becoming a Competent User of SPSS
How do I remember all these commands & paths?
1. There is no need to memorize them! Explore the drop-down menus.
2. Your navigation of SPSS should be guided by the conceptual frameworks and the statistical methods you learned in this or previous stats courses.
3. SPSS is just a tool, not a brain! Be a clever user!
Supplemental Learning Resources for SPSS
You can find very useful YouTube tutorials on various SPSS tools and analyses. They are less time-consuming to learn from than reading texts.
As necessary, read the following chapters from the website of the Social Science Research and Instructional Council (SSRIC): http://www.csubak.edu/ssric-trd/spss/spsfirst.htm
Chapter One: Getting Started With SPSS for Windows
Chapter Two: Creating a Data File
Chapter Three: Transforming Data
Chapter Four: Univariate Statistics
This Is Where We Have Been Today

                      Measurement of Data
Type of Inference     Continuous                Categorical
Descriptive           Summative/Descriptive     Summative/Descriptive
                      Explanatory/Predictive    Explanatory/Predictive
Inferential           Summative/Descriptive     Summative/Descriptive
                      Explanatory/Predictive    Explanatory/Predictive