The Data Analysis Plan


Page 1: The Data Analysis Plan

The Data Analysis Plan

Page 2: The Data Analysis Plan

The Overall Data Analysis Plan

Purpose: To tell a story.

To construct a coherent narrative that explains findings, argues against other interpretations, and supports conclusions.

Page 3: The Data Analysis Plan

Three Steps of the Data Analysis Plan

1) Getting to know the data: A first step is to examine the data set, the “raw numbers”. “Play” with the raw numbers.

2) Summarize the data: Use descriptive statistics to “summarize” the data.

3) Confirm what the data reveal: Most commonly, using null hypothesis significance testing (NHST).

Page 4: The Data Analysis Plan

Step 1: Getting to know the data

1) Look at raw numbers and check for errors and outliers.

• Errors are impossible numbers (outside the possible range).

• Outliers are in the possible range, but exceptional. An outlier could be an error or a true score from an unusual participant.
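The check described above can be sketched in Python. This is only an illustration: the function name `screen_scores`, the reaction-time data, the 0 to 5000 ms range, and the 1.5 × IQR fence used to flag outliers are all assumptions made here, not rules given in the slides.

```python
from statistics import quantiles

def screen_scores(scores, low, high):
    """Return (errors, outliers) for a list of raw scores.

    errors:   values outside the possible range [low, high].
    outliers: in-range values beyond the 1.5 * IQR fences (an assumed rule).
    """
    errors = [x for x in scores if not low <= x <= high]
    valid = sorted(x for x in scores if low <= x <= high)
    q1, _, q3 = quantiles(valid, n=4)  # quartiles of the in-range scores
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in valid if x < lower_fence or x > upper_fence]
    return errors, outliers

# Hypothetical reaction times in ms; the possible range is assumed to be 0..5000.
times = [512, 498, 530, 475, 505, 520, 490, 515, 2400, -20, 508, 495]
errors, outliers = screen_scores(times, low=0, high=5000)
print("errors:", errors)
print("outliers:", outliers)
print(f"flagged {len(errors) + len(outliers)} of {len(times)} scores")
```

Flagging a score is not the same as removing it. Whether a flagged outlier is dropped still depends on the rule you have committed to.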

Page 5: The Data Analysis Plan

Decision Rule for Errors and Outliers

• You “fix” errors if you can.

• You eliminate outliers (if appropriate).

Follow the rules of the journal or organization where the results will be presented/reported.

• Either way, you must specify the amount of data eliminated and your reason or “rule” for elimination.

Page 6: The Data Analysis Plan

2) Look at a “picture” of raw numbers

• Stem & Leaf Plots

• Histogram (frequency distribution)

• Examine underlying distribution of raw scores looking for “unusual” distribution (other than “normal”)
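A text stem-and-leaf plot is easy to build by hand or in code. The sketch below assumes non-negative integer scores and uses the tens digit as the stem. The exam scores and the function name `stem_and_leaf` are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Return the lines of a stem-and-leaf plot (stem = tens, leaf = ones)."""
    stems = defaultdict(list)
    for x in sorted(scores):
        stems[x // 10].append(x % 10)
    lines = []
    # Walk every stem in the range so empty stems still show up as gaps.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

# Hypothetical exam scores.
exam_scores = [52, 67, 61, 75, 78, 73, 70, 84, 88, 81, 85, 93, 79, 66, 74]
for line in stem_and_leaf(exam_scores):
    print(line)
```

Read sideways, the rows of leaves form a rough histogram, so the shape of the distribution is visible while every raw score is still recoverable.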

Page 7: The Data Analysis Plan

“Normal” Distribution

Page 8: The Data Analysis Plan

Skewed Distributions

• Skewed distribution: If extremely skewed, you may have to transform the scores (for example, using logarithms or changing the scale you use).
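As a rough illustration of a logarithmic transformation, the sketch below computes a simple moment-based skewness index before and after taking natural logs. The data are hypothetical, and the skewness formula (third central moment divided by the 1.5 power of the second) is one common choice assumed here, not something specified in the slides.

```python
import math

def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (positive = tail to the right)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical, strongly positively skewed scores (all > 0, so log is defined).
raw = [1, 2, 2, 3, 3, 4, 5, 8, 20, 100]
logged = [math.log(x) for x in raw]

print("skewness of raw scores:   ", round(skewness(raw), 2))
print("skewness of logged scores:", round(skewness(logged), 2))
```

The log transform pulls in the long right tail, so the logged scores are noticeably less skewed than the raw ones, although here they are still not symmetric.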

Page 9: The Data Analysis Plan

Positive Skew- tail trails off to the “positive” side

Page 10: The Data Analysis Plan

Negative Skew: Tail trails off to the “negative” side

Page 11: The Data Analysis Plan

“Bi-Modal Distribution” (or multi-modal)

Very problematic for further analysis- refer to “experts” for appropriate data analysis.

Page 12: The Data Analysis Plan

Step 2: Summarize the Data (Descriptive Statistics)

• Purpose – to describe the data

To indicate what is a typical score (central tendency)

To assess the degree to which the scores in the data set differ from one another (variability or dispersion)

Page 13: The Data Analysis Plan

1) Measures of Central Tendency (tendency toward the middle): the typical score

• Mode- Most frequently occurring score

Example: 2, 4, 5, 5, 6, 8, 9, 10, 10, 10, 12

Mode = 10

• Median – the “middle” score (50% of scores below and 50% of scores above)

Median (from above) = 8
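The mode and median of the example scores can be checked with Python's standard `statistics` module. This is a minimal sketch; `multimode` is used so that a tie between modes would be reported rather than hidden.

```python
from statistics import median, multimode

scores = [2, 4, 5, 5, 6, 8, 9, 10, 10, 10, 12]

modes = multimode(scores)  # list of the most frequent value(s)
middle = median(scores)    # middle score of the sorted list (6th of 11)

print("mode(s):", modes)
print("median:", middle)
```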

Page 14: The Data Analysis Plan

• Mean - Arithmetic average or mean (sum of scores divided by number of scores)

Example: 2, 4, 5, 5, 6, 8, 9, 10, 10, 10, 12

Mean = 81/11 ≈ 7.364

In a “Normal Distribution”:

Mean=Median=Mode

Page 15: The Data Analysis Plan

Mean=Median=Mode

Page 16: The Data Analysis Plan

Skewed distribution

Page 17: The Data Analysis Plan

If a distribution is skewed, the mean may not be the best descriptor of the typical score.

In this case, the median is a better estimate of “typical score”.

Usually report BOTH mean and median if the distribution is skewed.
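A small made-up data set shows why. One extreme score pulls the mean well above most of the scores, while the median stays near the bulk of the data. The numbers below are hypothetical.

```python
from statistics import mean, median

# Hypothetical positively skewed scores: one very large value.
skewed = [2, 3, 3, 4, 4, 5, 5, 6, 40]

skewed_mean = mean(skewed)      # pulled toward the extreme score
skewed_median = median(skewed)  # stays with the bulk of the scores

print("mean:", skewed_mean)
print("median:", skewed_median)
```

Here eight of the nine scores fall below the mean, which is why the median is the better description of a typical score for these data.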

Page 18: The Data Analysis Plan

2) Measures of Variability (dispersion, how different the numbers are from each other)

• Range – Officially= (highest score – lowest score)

Example: 2, 4, 5, 5, 6, 8, 9, 10, 10, 10, 12

Range = 12 – 2 = 10

Usually reported by citing the lowest and highest score in the data set.
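Both ways of reporting the range can be sketched in a few lines of Python, using the example scores from this slide.

```python
scores = [2, 4, 5, 5, 6, 8, 9, 10, 10, 10, 12]

lowest, highest = min(scores), max(scores)
score_range = highest - lowest  # the "official" definition

print("range (official):", score_range)
print(f"range (as usually reported): {lowest} to {highest}")
```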

Page 19: The Data Analysis Plan

• Variance –The sum of squared deviations of the scores around the mean divided by either:

“N” or “n-1” ???

Page 20: The Data Analysis Plan

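Variance of a set of numbers, or the population variance (the sum of squares (SS) divided by N): $\sigma^2 = \frac{SS}{N} = \frac{\sum (X - \mu)^2}{N}$

An estimate of the variance of a population based on a sample (the sum of squares (SS) divided by n-1): $s^2 = \frac{SS}{n - 1} = \frac{\sum (X - \bar{X})^2}{n - 1}$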
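The two divisors can be compared on the example scores used earlier (2, 4, 5, 5, 6, 8, 9, 10, 10, 10, 12). The sketch below computes SS by hand and then checks the results against `pvariance` and `variance` from Python's standard `statistics` module. The variable names are illustrative.

```python
from statistics import mean, pvariance, variance

scores = [2, 4, 5, 5, 6, 8, 9, 10, 10, 10, 12]

m = mean(scores)
ss = sum((x - m) ** 2 for x in scores)  # sum of squared deviations (SS)

population_variance = ss / len(scores)    # SS / N
sample_variance = ss / (len(scores) - 1)  # SS / (n - 1)

print("SS:", round(ss, 4))
print("SS / N:      ", round(population_variance, 4), "| pvariance:", round(pvariance(scores), 4))
print("SS / (n - 1):", round(sample_variance, 4), "| variance: ", round(variance(scores), 4))
```

Dividing by n-1 always gives the larger value, and the gap between the two shrinks as the sample gets larger.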

Page 21: The Data Analysis Plan

• Standard Deviation – The square root of the variance.
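As a quick check of that relationship, the sketch below takes the square root of each variance for the example scores and compares it with `pstdev` and `stdev` from Python's standard `statistics` module.

```python
import math
from statistics import pstdev, pvariance, stdev, variance

scores = [2, 4, 5, 5, 6, 8, 9, 10, 10, 10, 12]

population_sd = math.sqrt(pvariance(scores))  # based on SS / N
sample_sd = math.sqrt(variance(scores))       # based on SS / (n - 1)

print("population SD:", round(population_sd, 4), "| pstdev:", round(pstdev(scores), 4))
print("sample SD:    ", round(sample_sd, 4), "| stdev: ", round(stdev(scores), 4))
```

Unlike the variance, the standard deviation is in the same units as the original scores, which makes it easier to interpret.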

Page 22: The Data Analysis Plan

Effect Size or Effect Magnitude

• An index of the strength of the relationship between the IV and the DV that is independent of sample size.

• How large an effect does the IV have on the DV?

Page 23: The Data Analysis Plan

• Cohen’s d is one measure of “size of effect” or effect magnitude.

• For d, a value of .20 indicates a small magnitude effect, .50 a medium magnitude effect, and .80 a large magnitude of effect.

Page 24: The Data Analysis Plan

• d is the difference between the means at two levels of an IV divided by the standard deviation of the population (that is, the difference between means divided by a measure of variability or dispersion).

• As variability increases (standard deviation increases), d decreases (lower effect size).
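In symbols, the verbal definition above can be written as follows. The notation is assumed here for illustration and does not come from the slides: $M_1$ and $M_2$ are the means at the two levels of the IV, and $\sigma$ is the population standard deviation.

```latex
d = \frac{M_1 - M_2}{\sigma}
```

With the difference $M_1 - M_2$ held fixed, a larger $\sigma$ in the denominator gives a smaller $d$.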

Page 25: The Data Analysis Plan

Example

• Suppose you have two levels of an IV and the means for these two levels are 8 and 5.

• The difference between the two means is 8 - 5 = 3.

• If there is moderate variability in the DV (say population standard deviation = 6), then:

d = 3/6 = .50

a medium effect size or magnitude

Page 26: The Data Analysis Plan

• If the variability of DV is larger (say population standard deviation=15), then:

d=3/15=1/5= .20

a small effect size or magnitude

Page 27: The Data Analysis Plan

• If the variability in DV is really small (standard deviation=3.75) then:

d=3/3.75=.80

a large effect size or magnitude
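The three worked examples (means of 8 and 5 with standard deviations of 6, 15, and 3.75) can be reproduced with a short Python sketch. The function names `cohens_d` and `magnitude` are hypothetical, and the labels simply apply the .20 / .50 / .80 benchmarks quoted earlier as cut-offs, which is an assumption made for illustration.

```python
def cohens_d(mean_1, mean_2, sd):
    """Difference between two means divided by the standard deviation."""
    return (mean_1 - mean_2) / sd

def magnitude(d):
    """Label |d| using .20 / .50 / .80 as cut-offs (illustrative only)."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"

# Same mean difference (8 - 5 = 3), three different amounts of variability.
results = {sd: cohens_d(8, 5, sd) for sd in (6, 15, 3.75)}
for sd, d in results.items():
    print(f"SD = {sd}: d = {d:.2f} ({magnitude(d)})")
```

The mean difference is identical in all three cases. Only the variability changes, and that alone moves the effect from small to large.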

Page 28: The Data Analysis Plan

• Effect size is one factor that affects the “power” of a statistical analysis, and it is used in deciding how large a sample is needed to achieve a reasonable level of “power”.
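To give a feel for how effect size drives sample size, the sketch below uses a common normal-approximation formula for comparing two independent groups: n per group ≈ 2 × ((z for α/2 + z for power) / d)². The formula, the two-tailed α of .05, the power of .80, and the function name `n_per_group` are all assumptions made here, not material from the slides. Exact t-test calculations give slightly larger numbers.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants needed per group (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical z
    z_power = NormalDist().inv_cdf(power)          # z for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

for d in (0.20, 0.50, 0.80):
    print(f"d = {d:.2f}: about {n_per_group(d)} participants per group")
```

The smaller the expected effect, the larger the sample needed to have a reasonable chance of detecting it.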

Page 29: The Data Analysis Plan

• Because the standard deviation is used as the “denominator” for this measure, it is independent of (not affected by) sample size. Thus, you can compare effect sizes across research studies that used different sample sizes. This type of comparison is called a “meta-analysis”.