
Handout Two: Describing/Explaining Quantitative Data and Introduction to SPSS

EPSE 592 Experimental Designs and Analysis in Educational Research

Instructor: Dr. Amery Wu

About Analysis of Variance Designs

• Measurement of the data: quantitative
• Type of statistical inference: descriptive and inferential
• Type of modeling: summative/descriptive and explanatory/predictive

Analysis of Variance Design

                      Measurement of Data
Type of Inference     Quantitative               Categorical
Descriptive           Summative/Descriptive      Summative/Descriptive
                      Explanatory/Predictive     Explanatory/Predictive
Inferential           Summative/Descriptive      Summative/Descriptive
                      Explanatory/Predictive     Explanatory/Predictive

Goals of Today’s Class

Computing the Standard Deviation of a Sample - D2CAR

D: Deviation
2: Square
C: Collection
A: Average
R: Square Root

Lab Activity - See Excel File “Mean, SD, Z Scores, & Pearson’s Correlation”
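The D2CAR steps can be sketched in Python as follows. This is a minimal sketch, not part of the Excel lab file: the function name sd_d2car, the ddof parameter, and the example scores are assumptions made for illustration. By default the "Average" step divides by n - 1, which is consistent with the variance SPSS reports for kidrate later in this handout (300.789 for n = 40); set ddof=0 to divide by n instead.

```python
import math


def sd_d2car(scores, ddof=1):
    """Standard deviation computed step by step following D2CAR.

    ddof=1 divides by n - 1 (sample SD); ddof=0 divides by n.
    """
    n = len(scores)
    mean = sum(scores) / n
    deviations = [x - mean for x in scores]   # D: deviation from the mean
    squared = [d ** 2 for d in deviations]    # 2: square each deviation
    collected = sum(squared)                  # C: collect (sum) the squares
    average = collected / (n - ddof)          # A: average the collected squares
    return math.sqrt(average)                 # R: take the square root


# Hypothetical scores, not taken from the course data file.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(sd_d2car(scores, ddof=0))  # dividing by n
print(sd_d2car(scores, ddof=1))  # dividing by n - 1
```

The two results differ only in the "Average" step, so the choice of denominator matters most in small samples.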

Interpretation of the Standard Deviation of a Sample

Individuals in a sample differ in their values of the dependent variable (DV). These differences are what we are interested in studying.

The standard deviation (SD) is a summative measure of the extent to which individuals in a sample differ.

Within a sample, some individuals have values close to the mean, and others have values far from it. The SD is, roughly, the average distance from the mean across the n individuals in a given sample.

Transforming the Raw Scores to the Z Scores

Lab Activity - See Excel File “Mean, SD, Z Scores, & Pearson’s Correlation”

Interpretations of the Z Scores

The Z score transformation re-scales the data to have a center of 0 and a unit of 1, which is called standardization. That is, the mean of the Z scores of a sample is 0, and the SD is 1.

A person’s Z score indicates how many standard units he/she is away from the mean, on either side.

A Z score shows a person’s standing on the scale (-∞ to ∞) relative to others in the sample.

For example, if Mary’s Z score is -1.75, she is 1.75 standard units below the mean (on the left-hand side).

Note that the Z score transformation does not normalize a skewed raw score distribution.
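The points above can be checked with a short Python sketch. The raw scores, the function names, and the simple moment-based skewness index are assumptions made for illustration (the index is not the exact skewness formula SPSS uses). The sketch shows that the Z scores have a mean of 0 and an SD of 1, and that the skewness of the distribution is the same before and after standardizing.

```python
import statistics


def z_scores(scores):
    """Re-scale scores to mean 0 and SD 1 (sample SD, n - 1 denominator)."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    return [(x - mean) / sd for x in scores]


def skewness(scores):
    """A simple skewness index: the average of the cubed Z scores."""
    z = z_scores(scores)
    return sum(v ** 3 for v in z) / len(z)


# Hypothetical, positively skewed raw scores (not from the course data).
raw = [1, 2, 2, 3, 3, 3, 4, 5, 9, 18]
z = z_scores(raw)

print(round(statistics.mean(z), 10))    # mean of the Z scores
print(round(statistics.stdev(z), 10))   # SD of the Z scores
print(round(skewness(raw), 4), round(skewness(z), 4))  # same skewness twice
```

Standardizing only shifts and stretches the scale, so the shape of the distribution, including its long right tail, stays as it was.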

Computing Pearson’s Correlation r

Lab Activity - See Excel File “Mean, SD, Z Scores, & Pearson’s Correlation”

Use & Interpretation of Pearson’s r

• Both X and Y should be quantitative data.
• X and Y are assumed to be linearly related.
• It is a standardized measure (-1 to 1) for quantifying the covariation between the two variables.
• A positive r indicates that if people’s X scores are high (low), their Y scores tend to be high (low). A negative r indicates that if people’s X scores are high, their Y scores tend to be low. If there is no linear trend between the scores of X and Y, then r is zero.
• The square of r, the coefficient of determination, provides an estimate of the proportion of overlapping variance between X and Y (i.e., the degree to which the two sets of numbers vary together).
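One way to see why r is a standardized measure is to compute it from Z scores. The sketch below is an illustration under stated assumptions, not the procedure in the Excel lab file: the function name pearson_r and the child and nurse ratings are hypothetical. It computes r as the sum of the products of paired Z scores divided by n - 1, then squares it to get the coefficient of determination.

```python
import statistics


def pearson_r(x, y):
    """Pearson's r as the sum of paired Z-score products divided by n - 1."""
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sd_x, sd_y = statistics.stdev(x), statistics.stdev(y)
    zx = [(v - mean_x) / sd_x for v in x]
    zy = [(v - mean_y) / sd_y for v in y]
    return sum(a * b for a, b in zip(zx, zy)) / (n - 1)


# Hypothetical child-rated and nurse-rated pain scores (not the course data).
child = [35, 50, 60, 65, 77, 95]
nurse = [30, 45, 50, 70, 72, 90]

r = pearson_r(child, nurse)
print(round(r, 3))       # Pearson's r
print(round(r ** 2, 3))  # coefficient of determination
```

Because both variables are converted to Z scores first, the result does not depend on the original units of X or Y.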


Use of Data in the Course

Data Source
Special thanks to Professor Susan J. Henly from the School of Nursing, University of Minnesota, for the SPSS data file presented in today’s class.

Ethics for Data Use
Under the guidelines of the Behavioural Research Ethics Board (BREB) at UBC, data circulated in this course cannot be used for purposes other than the learning activities required by this course, unless they are open to public use.

Description of Professor Susan J. Henly’s Data

This data set includes 40 participants (20 boys) who were randomly assigned to the treatment group (a new method to reduce injection pain) or the control group (just do it quickly!). Immediately after the injection, the children were asked to rate their pain on a 0-100 scale, while a nurse observer who could not hear their response also rated their pain based on their behavioural cues. The dependent variable (i.e., data) we are modeling (describing/summarizing or explaining/predicting) today is the level of pain reported by the child, “kidrate”.

Q: Judging by the above description, what was the research question? What type of design was used? What type of data was collected? And what kind of inference could be made?

Quantitative Methodology Network

Research Question
• Design: Experimental, Observational
• Data: Continuous, Categorical
• Model: Descriptive/Summative, Explanatory/Predictive
• Inference: Descriptive vs. Inferential, Relational vs. Causal

This Is Where We Will Be Today (A, Blue Cell)

• Remember, what we are doing is to model the data by
  1. Describe/Summarize
  2. Explain/Predict
  Data = Model + Residual

• Note that the inferences remain at the sample level, with no intention to generalize to the population. Namely, neither C nor D is covered today.

                      Measurement of Data
Type of Inference     Continuous     Categorical
Descriptive           A              B
Inferential           C              D

Describing/Summarizing Central Tendency by Using Numbers

kidrate

Valid      Frequency   Percent   Valid Percent   Cumulative Percent
 35.00         1         2.5          2.5               2.5
 36.00         1         2.5          2.5               5.0
 38.00         1         2.5          2.5               7.5
 41.00         1         2.5          2.5              10.0
 46.00         2         5.0          5.0              15.0
 47.00         1         2.5          2.5              17.5
 50.00         2         5.0          5.0              22.5
 51.00         1         2.5          2.5              25.0
 52.00         1         2.5          2.5              27.5
 54.00         1         2.5          2.5              30.0
 57.00         1         2.5          2.5              32.5
 59.00         1         2.5          2.5              35.0
 60.00         2         5.0          5.0              40.0
 61.00         2         5.0          5.0              45.0
 63.00         1         2.5          2.5              47.5
 64.00         1         2.5          2.5              50.0
 65.00         2         5.0          5.0              55.0
 69.00         2         5.0          5.0              60.0
 70.00         1         2.5          2.5              62.5
 71.00         1         2.5          2.5              65.0
 72.00         1         2.5          2.5              67.5
 73.00         1         2.5          2.5              70.0
 74.00         1         2.5          2.5              72.5
 75.00         1         2.5          2.5              75.0
 77.00         3         7.5          7.5              82.5
 83.00         1         2.5          2.5              85.0
 84.00         1         2.5          2.5              87.5
 86.00         1         2.5          2.5              90.0
 95.00         1         2.5          2.5              92.5
100.00         3         7.5          7.5             100.0
Total         40       100.0        100.0

Statistics: kidrate

N Valid      40
N Missing     0
Mean         65.3250
Median       64.5000
Mode         77.00

Q1: In your opinion, which statistic best characterizes the central tendency of kidrate, and why?
Q2: Can you tell approximately whether the distribution of kidrate is normal, positively skewed, or negatively skewed, and how?
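The central tendency statistics can be reproduced outside SPSS from the frequency table. The sketch below is an illustration, not the course's SPSS procedure: it copies the kidrate values and frequencies from the table into a Python dictionary (the variable names are assumptions), expands them back into the 40 individual ratings, and computes the mean, median, and mode with the standard library.

```python
import statistics

# kidrate values and their frequencies, copied from the frequency table.
freq = {
    35: 1, 36: 1, 38: 1, 41: 1, 46: 2, 47: 1, 50: 2, 51: 1, 52: 1, 54: 1,
    57: 1, 59: 1, 60: 2, 61: 2, 63: 1, 64: 1, 65: 2, 69: 2, 70: 1, 71: 1,
    72: 1, 73: 1, 74: 1, 75: 1, 77: 3, 83: 1, 84: 1, 86: 1, 95: 1, 100: 3,
}

# Expand the table back into the 40 individual ratings (already in order).
kidrate = [value for value, count in freq.items() for _ in range(count)]

print(len(kidrate))                    # 40
print(statistics.mean(kidrate))        # 65.325
print(statistics.median(kidrate))      # 64.5
print(statistics.multimode(kidrate))   # [77, 100]
```

Note that 77 and 100 each occur three times, so there are two modes; the Statistics table lists 77, the smaller of the two.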

Describing/Summarizing Dispersion by Using Numbers

Descriptives: kidrate

                                         Statistic    Std. Error
Mean                                      65.3250      2.74221
95% Confidence Interval for Mean
  Lower Bound                             59.7784
  Upper Bound                             70.8716
5% Trimmed Mean                           65.0556
Median                                    64.5000
Variance                                 300.789
Std. Deviation                            17.34327
Minimum                                   35.00
Maximum                                  100.00
Range                                     65.00
Interquartile Range                       25.25
Skewness                                    .289        .374
Kurtosis                                   -.374        .733

Q1: How can the minimum and maximum help detect aberrant data points?
Q2: By looking at the mean and SD, can you tell whether the data are normally distributed, positively skewed, or negatively skewed?
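Several of the dispersion statistics can be recomputed from the kidrate frequency table shown earlier in this handout. This is a minimal sketch with assumed variable names, using only the Python standard library; it covers the variance, SD, standard error of the mean, minimum, maximum, and range. The other rows of the Descriptives table (trimmed mean, interquartile range, skewness, kurtosis) depend on SPSS-specific formulas and are not attempted here.

```python
import math
import statistics

# kidrate values and their frequencies, copied from the frequency table.
freq = {
    35: 1, 36: 1, 38: 1, 41: 1, 46: 2, 47: 1, 50: 2, 51: 1, 52: 1, 54: 1,
    57: 1, 59: 1, 60: 2, 61: 2, 63: 1, 64: 1, 65: 2, 69: 2, 70: 1, 71: 1,
    72: 1, 73: 1, 74: 1, 75: 1, 77: 3, 83: 1, 84: 1, 86: 1, 95: 1, 100: 3,
}
kidrate = [value for value, count in freq.items() for _ in range(count)]

n = len(kidrate)
variance = statistics.variance(kidrate)     # sum of squares / (n - 1)
sd = statistics.stdev(kidrate)              # square root of the variance
se_mean = sd / math.sqrt(n)                 # standard error of the mean
value_range = max(kidrate) - min(kidrate)   # maximum minus minimum

print(round(variance, 3))                        # 300.789
print(round(sd, 5))                              # 17.34327
print(round(se_mean, 5))                         # 2.74221
print(min(kidrate), max(kidrate), value_range)   # 35 100 65
```

The match with the table confirms that the reported variance and SD use the n - 1 denominator.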

Describing & Summarizing Distribution by Using Pictures

[Histogram of kidrate. X-axis: kidrate (30.00 to 100.00); Y-axis: Frequency (0 to 10). Mean = 65.325, Std. Dev. = 17.34327, N = 40]

Q: What are the advantages and disadvantages of displaying data using a histogram?

Describing & Summarizing Distribution by Using Pictures

Q: What are the advantages and disadvantages of displaying data using a stem-and-leaf plot?
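A stem-and-leaf display of kidrate can be produced with a few lines of Python. The sketch below is an illustration under stated assumptions: the function name stem_and_leaf is hypothetical, the data are expanded from the kidrate frequency table earlier in this handout, and the layout (tens as stems, ones as leaves) will not necessarily match what SPSS prints.

```python
from collections import defaultdict

# kidrate values and their frequencies, copied from the frequency table.
freq = {
    35: 1, 36: 1, 38: 1, 41: 1, 46: 2, 47: 1, 50: 2, 51: 1, 52: 1, 54: 1,
    57: 1, 59: 1, 60: 2, 61: 2, 63: 1, 64: 1, 65: 2, 69: 2, 70: 1, 71: 1,
    72: 1, 73: 1, 74: 1, 75: 1, 77: 3, 83: 1, 84: 1, 86: 1, 95: 1, 100: 3,
}
kidrate = [value for value, count in freq.items() for _ in range(count)]


def stem_and_leaf(scores, stem_unit=10):
    """Return one text line per stem: tens as the stem, ones as the leaves."""
    groups = defaultdict(list)
    for x in sorted(scores):
        groups[x // stem_unit].append(x % stem_unit)
    lines = []
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines


for line in stem_and_leaf(kidrate):
    print(line)
```

Unlike a histogram, each line keeps the individual values, so the raw scores can be read back from the display.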

Describing & Summarizing Distribution by Using Pictures

[Boxplot of kidrate. Axis: kidrate (30.00 to 100.00)]

Explaining/Predicting the data (kidrate) by Gender - Using Numbers

Statistics: kidrate

                           Boy          Girl
Mean                       62.5000      68.1500
Median                     61.5000      71.5000
Mode                       60.00a       77.00a
Std. Deviation             14.89790     19.45920
Variance                  221.947      378.661
Skewness                     .249         .131
Std. Error of Skewness       .512         .512
Range                      59.00        65.00
Minimum                    36.00        35.00
Maximum                    95.00       100.00

a. Multiple modes exist. The smallest value is shown.

Explaining/Predicting the data (kidrate) by Gender - Using Pictures


Revisiting the Concept of Statistical Modeling Using Mean

Data = Model + Residual
Kidrate = Mean + Residual
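The idea that Data = Model + Residual can be made concrete with a small Python sketch. The ratings and gender labels below are hypothetical (the individual boy and girl scores are not listed in this handout), and the variable names are assumptions. The sketch first uses the overall mean as the model (describe/summarize), then uses each group's mean as the model (explain/predict), and compares the residuals left over in each case.

```python
import statistics

# Hypothetical pain ratings with a gender label (not the course data).
ratings = [
    ("boy", 40), ("boy", 55), ("boy", 70),
    ("girl", 50), ("girl", 75), ("girl", 100),
]
scores = [score for _, score in ratings]

# Describe/summarize: the model for every child is the overall mean.
grand_mean = statistics.mean(scores)
residuals_overall = [score - grand_mean for score in scores]

# Explain/predict: the model for each child is the mean of that child's group.
group_means = {
    group: statistics.mean([s for label, s in ratings if label == group])
    for group in ("boy", "girl")
}
residuals_by_group = [score - group_means[label] for label, score in ratings]

# Sum of squared residuals: how much of the data each model leaves unexplained.
ss_overall = sum(res ** 2 for res in residuals_overall)
ss_by_group = sum(res ** 2 for res in residuals_by_group)

print(grand_mean, group_means)
print(ss_overall, ss_by_group)
```

In both models the residuals sum to zero; the group-mean model leaves a smaller sum of squared residuals whenever the group means differ.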

Variable View of SPSS Data Editor - To specify the format of the spreadsheet

Data View of SPSS Data Editor - To enter and view the raw data

Lab Activity - Hands on SPSS (Statistics)

Please report the following statistics for the variable “Nurse-rated Pain”:

Central Tendency
• Mean
• Median
• Mode

Dispersion
• Minimum/Maximum
• Range
• Quartiles/Interquartile Range
• SD/Variance

Instruction: Analyze/Descriptive Statistics/Frequencies/Variables (Enter Nurse-rated Pain)/Statistics…

Alternatively, you can use
Instruction: Analyze/Descriptive Statistics/Descriptives/Variables (Enter Nurse-rated Pain)/Options

Lab Activity - Hands on SPSS (Graphs)

Please report the histogram for the variable “Nurse-rated Pain”.
Instruction: Analyze/Descriptive Statistics/Frequencies/Variables (Enter Nurse-rated Pain)/Charts/Histograms

Alternatively, you can use the Graphs menu.
Instruction: Graphs/Histogram/Variable (Enter Nurse-rated Pain)

Lab Activity - Hands on SPSS (Explore)

My personal preference for describing a continuous variable is to use the following command, which gives output of crucial and comprehensive information in both numbers and pictures.
Instruction: Analyze/Descriptive Statistics/Explore/Dependent List (enter “Nurse-rated Pain”)

Becoming a Competent User of SPSS

How do I remember all these commands & paths?

1. There is no need to memorize them! Explore the drop-down menus.

2. Your navigation of SPSS should be guided by the conceptual frameworks and the statistical methods you learned in this or previous stats courses.

3. SPSS is just a tool, not a brain! Be a clever user!

Supplemental Learning Resources for SPSS

You can find very useful YouTube tutorials on various SPSS tools and analyses. They take less time to learn from than reading texts.

As necessary, read the following chapters from the website of the Social Science Research and Instructional Council (SSRIC): http://www.csubak.edu/ssric-trd/spss/spsfirst.htm

Chapter One: Getting Started With SPSS for Windows
Chapter Two: Creating a Data File
Chapter Three: Transforming Data
Chapter Four: Univariate Statistics

This Is Where We Have Been Today

                      Measurement of Data
Type of Inference     Continuous                 Categorical
Descriptive           Summative/Descriptive      Summative/Descriptive
Inferential           Summative/Descriptive      Summative/Descriptive