COMMONLY USED STATISTICS IN SURVEY RESEARCH
Patrick Barlow and Tiffany Smith
THREE MAJOR CATEGORIES OF STATISTICAL TESTS:
Descriptive Statistics
Parametric Statistics
Non-Parametric Statistics
SOME KEY TERMS
Null Hypothesis
Alternative Hypothesis
Mean
Standard Deviation
Correlation
Confidence Interval
INITIAL QUESTIONS
Fit the statistics to the research question, not the other way around!
First, ask yourself, “Am I interested in...
  Describing a sample or outcome?
  Looking at how groups differ?
  Looking at how outcomes are related?
  Looking at changes over time?
  Creating a new scale or instrument?
  Assessing the reliability and/or validity of an instrument?”
Second, “How am I measuring my outcomes?”
IN THIS PRESENTATION
Descriptive Statistics
Parametric Statistics
  Common tests of relationships
    Pearson r
    Linear/multiple regression
  Common tests of group differences
    Independent t-test
    Between subjects analysis of variance (ANOVA)
  Common tests of repeated measures
    Dependent t-test
    Within subjects ANOVA
Tests of categorical data
  Odds Ratio / Chi Square
  Logistic Regression
Common Psychometric tests
  Cronbach’s Alpha
  Principal Components and Factor Analysis
DESCRIPTIVE STATISTICS
Numbers used to describe the sample
They do not actually test any hypotheses (or yield any p-values)
Types:
  Measures of Center: Mean, Median, Mode
  Measures of Spread: Quartiles, Standard Deviation, Range, Variance
  Frequencies
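The measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch; the `scores` list is hypothetical data made up for illustration, not data from any study.

```python
import statistics

# Hypothetical responses to a 1-5 survey item (illustrative only).
scores = [2, 3, 3, 4, 4, 4, 5, 5]

# Measures of center
center = {
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
}

# Measures of spread (variance and SD use the sample formula, n - 1)
spread = {
    "range": max(scores) - min(scores),
    "variance": statistics.variance(scores),
    "std_dev": statistics.stdev(scores),
    "quartiles": statistics.quantiles(scores, n=4),  # three cut points
}

# Frequencies: how many respondents chose each value
frequencies = {value: scores.count(value) for value in sorted(set(scores))}

print(center, spread, frequencies, sep="\n")
```

None of these numbers tests a hypothesis; they only summarize the sample.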
PARAMETRIC STATISTICS
Generally the most powerful type of statistics we use, provided their assumptions hold
Researchers must make sure their data meet a number of assumptions (about the population distribution and its parameters) before these tests can be used properly.
Some key assumptions:
  Normality
  Independence of observations
In research, you generally want to use parametric statistics whenever these assumptions are met.
COMMON PARAMETRIC TESTS OF RELATIONSHIPS
Pearson r correlation
Linear/Multiple Regression
TESTS OF RELATIONSHIPS: PEARSON r CORRELATION
What is it?
  A statistical analysis that tests the relationship between two continuous variables.
Commonly Associated Terms:
  Bivariate correlation, relationship, r-value, scatterplot, association, direction, magnitude.
WHAT TO INTERPRET?
No Relationship: |r| ≈ .00
Weak Relationship: |r| ≈ .10
Moderate Relationship: |r| ≈ .30
Strong Relationship: |r| > .50
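As a rough illustration of how r is calculated and labeled, the sketch below computes Pearson r by hand and applies the benchmarks from this slide. The data are hypothetical, and the exact cut-offs between labels in `describe_strength` are an assumption, since the slide only gives approximate anchor values.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def describe_strength(r):
    """Label |r| with the slide's benchmarks (cut-offs are assumed)."""
    size = abs(r)
    if size >= 0.50:
        return "strong"
    if size >= 0.30:
        return "moderate"
    if size >= 0.10:
        return "weak"
    return "none"

# Hypothetical paired measurements (illustrative only).
hours_studied = [1, 2, 3, 4, 5]
quiz_score = [2, 4, 5, 4, 5]

r = pearson_r(hours_studied, quiz_score)
r_squared = r ** 2  # share of variance the two variables have in common
print(round(r, 2), describe_strength(r), round(r_squared, 2))
```

The sign of r gives the direction of the relationship and its absolute value gives the magnitude.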
IMPORTANCE OF GRAPHS
Anscombe, F. J. (1973). Graphs in Statistical Analysis. The American Statistician, 27(1), 17-21.
Each of the four datasets has a Pearson correlation of r = .82 and is statistically significant.
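The point can be checked directly. The sketch below recomputes r for the four datasets published in Anscombe's paper (values typed in from the commonly reproduced table). The `pearson_r` helper is written here for illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Datasets I-III share the same x values.
x_shared = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x_shared, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x_shared, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x_shared, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

correlations = {name: round(pearson_r(x, y), 2) for name, (x, y) in quartet.items()}
print(correlations)
```

The correlations match, yet a scatterplot of each dataset looks completely different, which is why the graph should always be inspected alongside r.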
EXAMPLE
What you read:
  Study found a relationship between GPA and sense of belonging, r = .35, p = .03.
What to interpret:
  Results show r = .35, p = .03, R² = .12
How to interpret:
  There is a moderate, significant positive relationship between college GPA and students’ sense of belonging to the university. Students with a higher sense of belonging tend to have higher GPAs.
LINEAR/MULTIPLE REGRESSION
What is it?
  A statistical analysis that tests the relationship between one or more predictor variables and one continuous outcome variable.
  Predictors: Any number of continuous or dichotomous variables, e.g. age, anxiety, SES
  Outcome: 1 continuous variable, e.g. ER visits per month
Commonly Associated Terms:
  Multivariate, beta weight, R²-value, model, forward/backward regression, sequential/hierarchical regression, standard/simultaneous regression, statistical/stepwise regression.
REMEMBER: CORRELATION DOES NOT IMPLY CAUSATION
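To make the regression idea concrete, here is a minimal sketch that fits a multiple regression by solving the normal equations in plain Python. The function names `fit_ols` and `r_squared` are hypothetical helpers written for this example, and the data are invented so that the outcome follows the predictors exactly; real survey data would include noise and would normally be analyzed with a statistics package.

```python
def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0] + list(row) for row in rows]  # prepend the intercept column
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

def r_squared(rows, y, coefs):
    """Proportion of outcome variance explained by the model."""
    predicted = [coefs[0] + sum(b * x for b, x in zip(coefs[1:], row)) for row in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((obs - pred) ** 2 for obs, pred in zip(y, predicted))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return 1 - ss_res / ss_tot

# Hypothetical predictors: (age, anxiety score); outcome: ER visits per month.
# Outcome constructed as 0.5 + 0.1 * age + 0.5 * anxiety (no noise).
predictors = [(20, 1), (25, 3), (30, 2), (35, 5), (40, 4), (45, 6)]
er_visits = [3.0, 4.5, 4.5, 6.5, 6.5, 8.0]

coefficients = fit_ols(predictors, er_visits)
model_r2 = r_squared(predictors, er_visits, coefficients)
print([round(c, 3) for c in coefficients], round(model_r2, 3))
```

Each slope is read as the expected change in the outcome for a one-unit increase in that predictor, holding the other predictors constant.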
COMMON TESTS OF GROUP DIFFERENCES
Independent t-test
Between Subjects Analysis of Variance (ANOVA)
TESTS OF GROUP DIFFERENCES: INDEPENDENT t-TEST
What is it?
  Tests the difference between two groups on a single, continuous dependent variable.
Commonly associated terms:
  Two sample t-test, Student’s t-test, means, group means, standard deviations, mean differences, group difference, confidence interval, group comparison.
INTERPRETATION
What to interpret?
  p-values (<.05)
  Mean differences and standard deviations
  Confidence intervals
How to interpret?
  There is a significant difference between the two groups where one group has a significantly higher/lower score on the dependent variable than the other.
EXAMPLE
What you read:
  Students who were put on academic probation (M = 1.50, SD = .40) had lower sense of belonging than students who were not put on academic probation (M = 3.50, SD = .75), p = .02.
What to interpret:
  p-value: .02
  Mean sense of belonging for both groups: on academic probation = 1.50 & not on academic probation = 3.50.
  Standard deviations for both groups: on academic probation = .40 & not on academic probation = .75.
How to interpret:
  Students on academic probation had significantly lower sense of belonging than students who were not put on academic probation.
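The test statistic behind a comparison like this can be sketched as follows. The version below is Welch's t-test, which does not assume equal variances; it gives the same t as Student's t-test when the groups have equal sizes and variances. The raw scores are hypothetical (the example above reports only means and SDs), and `welch_t` is a helper name invented for this sketch. A statistics package, for example SciPy's `scipy.stats.ttest_ind`, would also report the p-value.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Return (t, degrees of freedom, mean difference) for two independent groups."""
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # Squared standard error of each group mean.
    se2_a = statistics.variance(group_a) / len(group_a)
    se2_b = statistics.variance(group_b) / len(group_b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t, df, mean_a - mean_b

# Hypothetical sense-of-belonging scores (illustrative only).
on_probation = [1, 2, 3, 4, 5]
not_on_probation = [3, 4, 5, 6, 7]

t_value, df, mean_difference = welch_t(on_probation, not_on_probation)
print(round(t_value, 2), round(df, 1), mean_difference)
```

A negative t here simply means the first group's mean is lower than the second's.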
BETWEEN SUBJECTS ANALYSIS OF VARIANCE (ANOVA)
What is it?
  Tests the difference among more than two groups on a single, continuous variable.
  Post-hoc tests are required to examine where the differences are.
Commonly associated terms:
  F-test, interactions, post-hoc tests (Tukey HSD, Bonferroni, Scheffé, Dunnett).
INTERPRETATION
What to interpret?
  p-values (<.05)
    Main effect: Shows overall significance
    Post-hoc tests: Show specific group differences
  Mean differences, standard deviations
How to interpret?
  Main Effect: There was an overall significant difference among the groups of the independent variable on the dependent variable.
  Post-Hoc: Same interpretation as an independent t-test
EXAMPLE
What you read:
  A researcher looks at differences in average satisfaction on three different reading interventions (A, B, and C).
  Main effect: Overall F = 20.10, p = .01
  Post-hoc: Comparison of Intervention “A” to Intervention “B” shows average satisfaction to be 4.32 (SD = .50) and 3.56 (SD = 1.2), respectively, p = .04.
What to interpret:
  Main effect: p-value = .01
  Post-hoc: p-value = .04, group means show Intervention “A” has higher satisfaction ratings than Intervention “B”.
How to interpret:
  Main effect: There is a significant overall difference among the three interventions on satisfaction.
  Post-hoc: Students who received Intervention “A” have significantly higher satisfaction than those who received Intervention “B”.
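The overall F in a between-subjects ANOVA compares variability between group means with variability within groups. The sketch below computes it from raw scores; the three groups are hypothetical data, and `one_way_anova_f` is a helper written only for this illustration.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way between-subjects ANOVA."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum(
        (score - mean) ** 2
        for group, mean in zip(groups, group_means)
        for score in group
    )
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical satisfaction ratings for three interventions (illustrative only).
intervention_a = [3, 4, 5]
intervention_b = [2, 3, 4]
intervention_c = [1, 2, 3]

f_value, df_between, df_within = one_way_anova_f(
    [intervention_a, intervention_b, intervention_c]
)
print(round(f_value, 2), df_between, df_within)
```

The F value only says that the groups differ somewhere; post-hoc comparisons identify which pairs differ.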
COMMON PARAMETRIC TESTS OF REPEATED MEASURES
Dependent t-test
Within Subjects Analysis of Variance (ANOVA)
DEPENDENT t-TEST
What is it?
  Tests the differences for one group between two time-points or matched pairs
Commonly Associated Terms:
  Pre and posttest, matched pairs, paired samples, time.
What to interpret?
  p-values (<.05)
  Mean change between measurements (i.e. over time or between pairs)
How to interpret?
  There is a significant difference between the pretest and posttest where the score on the posttest was significantly higher/lower on the dependent variable than the pretest.
EXAMPLE
What you read:
  An article shows a difference in average test score before (M = 79.50, SD = 8.00) and after (M = 85.25, SD = 7.90) an educational intervention, p = .08.
What to interpret:
  p-value = .08
  Mean change = 5.75 more points after the educational intervention.
How to interpret:
  Average test score did not significantly change from before the intervention to after the intervention; however, there may be a practically relevant difference.
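A dependent t-test works on the difference score of each person (or matched pair) rather than on the two sets of scores separately. The sketch below shows that calculation; the before/after scores are hypothetical, and `paired_t` is a helper name made up for this example.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, degrees of freedom, mean change) for paired measurements."""
    differences = [post - pre for pre, post in zip(before, after)]
    mean_change = statistics.mean(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(len(differences))
    return mean_change / standard_error, len(differences) - 1, mean_change

# Hypothetical test scores for the same three students (illustrative only).
pretest = [70, 80, 90]
posttest = [71, 82, 93]

t_value, df, mean_change = paired_t(pretest, posttest)
print(round(t_value, 2), df, mean_change)
```

Because each person serves as their own comparison, consistent small changes can produce a large t even when the two sets of scores overlap heavily.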
REPEATED MEASURES ANOVA
What is it?
  A statistical analysis that tests differences of one group between two or more time-points or matched pairs (e.g. pretest, posttest, & follow-up or treatment “A” patient, treatment “B” matched patient, & placebo matched patient).
Commonly Associated Terms:
  Multiple time-points/matched pairs, repeated measures, post-hoc.
What to interpret?
  Main effect: p-values
  Post-hoc: p-values, mean change, direction of change.
How to interpret:
  Main Effect: There was an overall significant difference among the time points/matched pairs on the dependent variable.
  Post-Hoc: Same as a dependent t-test.
EXAMPLE
What you read:
  An article shows a difference in average classroom comfort before (M = 1.5, SD = 2.0), after (M = 3.30, SD = .90), and six months following a cohort-building intervention (M = 4.20, SD = 3.0). Main effect: Overall F = 3.59, p = .02.
What to interpret:
  p-value = .02, statistically significant
  Mean change = 1.8 higher classroom comfort at post-intervention
How to interpret:
  Classroom comfort differed significantly across the three time points (baseline, post-intervention, and six months after the cohort-building intervention); however, post-hoc tests are needed to show where the differences lie.
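The main-effect F in a repeated measures ANOVA removes the variability that is due to stable differences between participants before comparing time points. A minimal sketch, assuming complete data (every participant measured at every time point) and using hypothetical scores; `repeated_measures_f` is a helper written for illustration.

```python
def repeated_measures_f(scores):
    """scores[i][j] = score of participant i at time point j.

    Returns (F, df_time, df_error) for the main effect of time.
    """
    n, k = len(scores), len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    time_means = [sum(row[j] for row in scores) / n for j in range(k)]
    person_means = [sum(row) / k for row in scores]

    ss_time = n * sum((m - grand_mean) ** 2 for m in time_means)
    ss_person = k * sum((m - grand_mean) ** 2 for m in person_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    # Whatever is left after removing time and person effects is error.
    ss_error = ss_total - ss_time - ss_person

    df_time, df_error = k - 1, (k - 1) * (n - 1)
    f_value = (ss_time / df_time) / (ss_error / df_error)
    return f_value, df_time, df_error

# Hypothetical comfort ratings: rows = participants,
# columns = before, after, six-month follow-up (illustrative only).
comfort = [
    [1, 2, 3],
    [2, 4, 3],
    [3, 3, 6],
]

f_value, df_time, df_error = repeated_measures_f(comfort)
print(round(f_value, 2), df_time, df_error)
```

As with the between-subjects ANOVA, a significant F is followed by post-hoc comparisons between specific time points.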
OTHER ANOVAS
Mixed ANOVA: Used when comparing more than one group over more than one time-point on a measure
  Example: Male vs. female students, before and after a foreign language course. Outcome: average score on an assessment
Factorial ANOVA: Comparing two or more separate independent variables on one dependent variable.
  Example: Who taught the course (Ms. Lang, Mr. Beard, or Ms. Brinkley), AND which teaching method was used (online, face to face). Outcome: average post-test assessment score
Analysis of covariance (ANCOVA): Examining the differences among groups while controlling for an additional variable
  Example: Online or face to face course, controlling for baseline knowledge. Outcome: average post-test assessment score
These methods can all be used to test interaction effects
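An interaction effect means that the effect of one variable depends on the level of another. In a 2 x 2 design this can be described as a "difference of differences" between cell means. The sketch below shows only that descriptive step with hypothetical cell means; the ANOVA itself is what tests whether the interaction is statistically significant.

```python
# Hypothetical average assessment scores by group and time point (illustrative only).
cell_means = {
    ("male", "before"): 60.0,
    ("male", "after"): 70.0,
    ("female", "before"): 62.0,
    ("female", "after"): 80.0,
}

def change_over_time(group):
    """Mean gain from before to after the course for one group."""
    return cell_means[(group, "after")] - cell_means[(group, "before")]

# If both groups gained the same amount, this contrast would be zero
# (no interaction in the sample means).
interaction_contrast = change_over_time("female") - change_over_time("male")
print(change_over_time("male"), change_over_time("female"), interaction_contrast)
```

In a line graph of the cell means, a nonzero contrast shows up as non-parallel lines.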
TESTS USING CATEGORICAL DATA
Odds Ratio / Relative Risk (Chi-square test of independence)
Logistic Regression
ODDS RATIO / RELATIVE RISK (CHI-SQUARE TEST OF INDEPENDENCE)
What is it?
  A statistical analysis that tests the odds or risk of an event occurring or not occurring based on one or more predictor variables (independent).
Commonly Associated Terms:
  Unadjusted odds ratio (OR), relative risk (RR), 2x2, chi-square, absolute risk reduction, absolute risk, relative risk reduction, odds, confidence intervals, protective effect, likelihood, forest plot.
What to interpret?
  If a 2x2 table: interpret the OR or RR and confidence intervals rather than the p-value.
  If more than a 2x2 table, then the p-value and frequencies may be more useful.
How to interpret:
  Odds Ratio < 1: For every unit increase in the independent variable, the odds of having the outcome decrease (they are multiplied by the OR).
  Odds Ratio > 1: For every unit increase in the independent variable, the odds of having the outcome increase (they are multiplied by the OR).
  Odds Ratio = 1, CI crosses 1.0, or p > .05: You are no more or less likely to have the outcome as a result of the predictor variable (this would be non-significant).
EXAMPLE
What are the odds of leaving college if a student has been placed on academic probation?
  IV: Academic probation (Y/N)
  DV: Completion (Y/N)
What you read:
  The odds ratio (95% CI) for college completion and academic probation showed OR = 2.00 (95% CI = 1.44 - 2.88), p < .05.
What you interpret:
  OR > 1 (2.00)
  95% CI is narrow, and does not cross 1.0 (1.44 to 2.88)
  p-value is below .05
How you interpret:
  Students who were placed on academic probation have two times the odds of leaving college.
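For a 2x2 table, the odds ratio, relative risk, confidence interval, and chi-square can all be computed from the four cell counts. The sketch below uses the standard large-sample (log-based) confidence interval and the chi-square formula without continuity correction. The counts are hypothetical and were chosen only so that the OR comes out at 2.0; they are not the data behind the example above.

```python
import math

def two_by_two(a, b, c, d, z=1.96):
    """a, b = exposed with/without the outcome; c, d = unexposed with/without.

    Returns (odds ratio, relative risk, 95% CI for the OR, chi-square).
    """
    odds_ratio = (a * d) / (b * c)
    relative_risk = (a / (a + b)) / (c / (c + d))
    # Standard error of log(OR), used for the confidence interval.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (
        math.exp(math.log(odds_ratio) - z * se_log_or),
        math.exp(math.log(odds_ratio) + z * se_log_or),
    )
    n = a + b + c + d
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, relative_risk, ci, chi_square

# Hypothetical counts: of 100 students on probation, 40 left college;
# of 100 students not on probation, 25 left college.
odds_ratio, relative_risk, ci, chi_square = two_by_two(40, 60, 25, 75)
ci_crosses_one = ci[0] <= 1.0 <= ci[1]
print(odds_ratio, relative_risk, [round(x, 2) for x in ci], round(chi_square, 2), ci_crosses_one)
```

Note that the odds ratio (2.0) and the relative risk (1.6) differ for the same table, which is why "odds" and "risk" or "likelihood" should not be used interchangeably when reporting results.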
LOGISTIC REGRESSION
What is it?
  A statistical analysis that tests the odds or risk of an event occurring or not occurring based on one or more predictor variables (independent) after controlling for a number of other confounding variables.
Commonly Associated Terms:
  Adjusted odds ratio (AOR), multivariate adjusted odds ratio, likelihood, protective effect, risk, odds, 95% confidence interval, classification table, dichotomous DV.
What to interpret?
  OR (these are your measures for risk of the outcome occurring given the predictor variable), p-value for OR, confidence intervals for OR (should not cross over 1.0, should not be overly wide, e.g. 1.2 to 45.5), classification table (if it is provided).
How to interpret:
  Odds Ratio < 1: For every unit increase in the independent variable, the odds of having the outcome decrease (they are multiplied by the OR) after controlling for the other predictor variables.
  Odds Ratio > 1: For every unit increase in the independent variable, the odds of having the outcome increase (they are multiplied by the OR) after controlling for the other predictor variables.
  Odds Ratio = 1, CI crosses 1.0, or p > .05: You are no more or less likely to have the outcome as a result of the predictor variable after controlling for the other predictor variables (this would be non-significant).
EXAMPLE
Do age, male sex, and time spent playing video games increase the odds of being on academic probation?
  Predictor Variables: age (scale), sex (M/F), gaming (ordinal: 0 hrs, 1-3 hrs, 4-6 hrs, etc.)
  DV: Probation (Y/N)
What you read:
  The ORs (95% CI) for each predictor variable are:
    Age: OR = 1.40 (95% CI = 0.88 to 6.90), ns
    Sex (male): OR = 3.00 (95% CI = 2.22 to 5.20), p < .001
    Gaming: OR = 6.75 (95% CI = 4.69 to 8.80), p < .001
What you interpret:
  The OR, CI, and p-value for each predictor.
How you interpret:
  Both sex and time spent gaming increase the odds of being on academic probation after controlling for the other predictors. Specifically, men have 3.00 times the odds of being on academic probation compared with women, and for every unit increase in gaming time, the odds of being on probation are multiplied by 6.75. Age is not a significant predictor because its CI crosses 1.0.
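Two reading habits from this example can be sketched in code: checking whether each confidence interval excludes 1.0, and seeing how an odds ratio compounds over more than one unit. The ORs and CIs are the ones reported in the example; the dictionary and function names are hypothetical, and the two-category calculation assumes the model treats gaming as a single ordered predictor.

```python
import math

# (odds ratio, CI lower bound, CI upper bound) as reported in the example.
reported = {
    "age": (1.40, 0.88, 6.90),
    "sex_male": (3.00, 2.22, 5.20),
    "gaming": (6.75, 4.69, 8.80),
}

def ci_excludes_one(lower, upper):
    """An OR is statistically significant when its CI does not contain 1.0."""
    return not (lower <= 1.0 <= upper)

significant = {
    name: ci_excludes_one(lower, upper)
    for name, (_, lower, upper) in reported.items()
}

# Logistic regression coefficients are on the log-odds scale: OR = exp(b).
gaming_coefficient = math.log(reported["gaming"][0])

# Moving up two gaming categories multiplies the odds by the OR twice.
two_category_multiplier = math.exp(2 * gaming_coefficient)

print(significant, round(gaming_coefficient, 3), round(two_category_multiplier, 2))
```

Because odds ratios multiply rather than add, large ORs on ordinal predictors grow quickly across categories and deserve a careful look at how the predictor was coded.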
Questions?
ACTIVITY
Work together (in groups of 3-4) to create survey research scenarios / questions that could be addressed using the analyses you have learned about in class.
Use your “Commonly Used Statistics” handout as a resource!
Be prepared to share your answers
COMMON PSYCHOMETRIC TESTS
Cronbach’s Alpha
Principal Component Analysis / Factor Analysis
Other psychometric tests
WHAT ARE THEY?
Psychometric tests are used to examine the characteristics and performance of a survey or assessment instrument.
Reasons to use these tests:
  Reliability
  Validity
  Dimension reduction (constructing new instruments)
  Item analysis (objective tests)
CRONBACH’S ALPHA
Cronbach’s alpha is one of the most common psychometric tests used by survey researchers.
Looks at the internal consistency of the items in a certain scale or instrument.
  In other words, how responses to items in the scale relate to one another.
What to interpret
  The overall alpha value for the scale
  The “alpha if item removed” table
How to interpret
  For the alpha values: > .90 is excellent, .80-.90 is good, .70-.80 is acceptable, .60-.70 is questionable, .50-.60 is poor, and < .50 is unacceptable. If you see a negative value, then recheck your data for coding errors.
  For the “alpha if item removed” table: Look at the values that the scale would have if the item was removed. If dropping an item makes a meaningful improvement (e.g. from .75 to .80), then consider dropping the item and rerunning the analysis.
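Both quantities described above can be computed from item-level responses. The sketch below uses the usual formula for alpha, k/(k-1) x (1 - sum of item variances / variance of total scores), and recomputes it with each item left out. The responses are hypothetical, and the function names are made up for this illustration.

```python
import statistics

def cronbach_alpha(items):
    """items[j] holds every respondent's score on item j (same respondent order)."""
    k = len(items)
    total_scores = [sum(responses) for responses in zip(*items)]
    sum_item_variances = sum(statistics.variance(scores) for scores in items)
    return k / (k - 1) * (1 - sum_item_variances / statistics.variance(total_scores))

def alpha_if_item_removed(items):
    """Alpha of the remaining items after dropping each item in turn."""
    return [cronbach_alpha(items[:j] + items[j + 1:]) for j in range(len(items))]

# Hypothetical 4-item scale answered by 5 respondents (illustrative only).
# The fourth item does not track the other three.
scale_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 4, 5],
    [1, 3, 3, 4, 4],
    [3, 1, 4, 2, 3],
]

overall_alpha = cronbach_alpha(scale_items)
alpha_without = alpha_if_item_removed(scale_items)
print(round(overall_alpha, 3), [round(a, 3) for a in alpha_without])
```

In this made-up data, alpha rises when the fourth item is dropped, which is the pattern the "alpha if item removed" table is meant to reveal.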
TIPS FOR INCREASING RELIABILITY
More items
More participants
Increase the “good” type of redundancy
Drop poor items (those that affect alpha)
Clarify item stems
Double check coding
PCA / FA
PCA and FA are both dimension reduction techniques that are used when either pretesting a new instrument (exploratory) or gathering validity evidence for an existing instrument (confirmatory).
Both methods look at how items cluster together as latent (not directly measured) “factors” or “components”.
  Examples: “depression”, “anxiety”, or “sense of belonging”
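To give a feel for how items "cluster", the sketch below extracts only the first principal component from an item correlation matrix using power iteration. This is a simplified illustration of PCA, not a full factor analysis (no rotation, no decision about how many components to keep). The item responses and function names are hypothetical, and the starting vector assumes the items are mostly positively correlated.

```python
import math

def correlation_matrix(items):
    """items[j] holds every respondent's score on item j."""
    def r(x, y):
        n = len(x)
        mean_x, mean_y = sum(x) / n, sum(y) / n
        sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        sxx = sum((a - mean_x) ** 2 for a in x)
        syy = sum((b - mean_y) ** 2 for b in y)
        return sxy / math.sqrt(sxx * syy)
    return [[r(a, b) for b in items] for a in items]

def first_component(matrix, iterations=200):
    """Largest eigenvalue and its unit eigenvector via power iteration."""
    k = len(matrix)
    vector = [1.0 / math.sqrt(k)] * k  # assumes mostly positive correlations
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(k)) for i in range(k)]
        eigenvalue = math.sqrt(sum(p * p for p in product))
        vector = [p / eigenvalue for p in product]
    return eigenvalue, vector

# Hypothetical responses to three items meant to measure one construct.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 4, 5],
    [1, 3, 3, 4, 4],
]

matrix = correlation_matrix(items)
eigenvalue, weights = first_component(matrix)
proportion_explained = eigenvalue / len(items)      # share of total item variance
loadings = [w * math.sqrt(eigenvalue) for w in weights]
print(round(eigenvalue, 2), round(proportion_explained, 2), [round(l, 2) for l in loadings])
```

Items with high loadings on the same component are the ones that "cluster together" and may reflect a shared underlying construct.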
FACTORS THAT AFFECT PCA / FA
Number of items
Number of subjects
Technique used
  PCA
  FA
Extraction methods
Rotation methods
  Orthogonal
  Oblique
SO many others!
Remember:
Just because a finding is not significant does not mean that it is not meaningful. You should always consider the effect size and context of the research when making a decision about whether or not any finding is relevant in practice.
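One way to act on this reminder is to compute a standardized effect size. The sketch below applies Cohen's d to the means and SDs from the dependent t-test example (the one with p = .08). As a simplifying assumption it standardizes by the average of the two variances and ignores the pairing of the scores; other variants exist for paired designs. The function name is hypothetical.

```python
import math

def cohens_d(mean_1, sd_1, mean_2, sd_2):
    """Standardized mean difference using the average of the two variances."""
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_2 - mean_1) / pooled_sd

# Before: M = 79.50, SD = 8.00; after: M = 85.25, SD = 7.90 (from the example).
d = cohens_d(79.50, 8.00, 85.25, 7.90)
print(round(d, 2))
```

By Cohen's conventional benchmarks (about .2 small, .5 medium, .8 large), a d of this size would usually be considered medium to large, even though the p-value in that example did not reach .05.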