© Judith D. Singer and Lindsay C. Page, Harvard Graduate School of Education Unit 11/Slide 1
Unit 11: Regression modeling in practice
The S-030 roadmap: Where’s this unit in the big picture?
Unit 1: Introduction to simple linear regression
Unit 2: Correlation and causality
Unit 3: Inference for the regression model
Unit 4: Regression assumptions: Evaluating their tenability
Unit 5: Transformations to achieve linearity
Unit 6: The basics of multiple regression
Unit 7: Statistical control in depth: Correlation and collinearity
Unit 8: Categorical predictors I: Dichotomies
Unit 9: Categorical predictors II: Polychotomies
Unit 10: Interaction and quadratic effects
Unit 11: Regression modeling in practice
Course stages: Building a solid foundation; Mastering the subtleties; Adding additional predictors; Generalizing to other types of predictors and effects; Pulling it all together
In this unit, we’re going to learn about…
• Distinguishing between question predictors, covariates, and rival hypothesis predictors
• Mapping your research questions onto an analytic strategy
• What kinds of paths and feedback loops do you need?
• Alternative analytic approaches: which are sound, which are unwise?
• Which kinds of rival explanations can you examine and rule out?
• What caveats and limitations still remain?
• Constructing informative tables and figures
• Writing up your results
Automated model building strategies (& why you don’t want to use them)
Automated model building strategies
1. All possible subsets: fit all 2^k − 1 regression models (for k candidate predictors)
2. Forward selection: start with no predictors and sequentially add them so that each maximally increases the R² statistic at that step
3. Backwards elimination: start with all predictors and sequentially drop them so that each minimally decreases the R² statistic at that step
4. Stepwise regression (forward selection with backwards glances)
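To get a feel for how quickly "all possible subsets" grows, here is a minimal Python sketch that enumerates every non-empty subset of a predictor list. The predictor names are hypothetical labels chosen for illustration, and the sketch counts main-effects models only (no interactions or transformations).

```python
from itertools import combinations

# Hypothetical predictor names, used only to illustrate the count.
predictors = ["ComedyNews", "Education", "Age", "Male",
              "TotalNews", "Engagement", "Turnout"]

def all_subsets(names):
    """Yield every non-empty subset of the candidate predictors."""
    for size in range(1, len(names) + 1):
        yield from combinations(names, size)

models = list(all_subsets(predictors))
print(len(models))                              # number of main-effects models
print(len(models) == 2 ** len(predictors) - 1)  # agrees with 2^k - 1
```

Every added predictor roughly doubles the number of candidate models, which is one reason a purely automated search is hard to defend substantively.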
“All models are wrong, but some are useful” --George E.P. Box (1979)
“Far better an approximate answer to the right question…than an exact answer to the wrong question” --John W. Tukey (1962)
“The hallmark of good science is that it uses models and ‘theory’ but never believes them” --attributed to Martin Wilk in Tukey (1962)
Occam’s razor: “entia non sunt multiplicanda praeter necessitatem.” If two competing theories lead to the same predictions, the simpler one is better. --William of Occam (14th century)
Introducing the case study: Exploring public knowledge of current affairs
“Since the late 1980s, the emergence of 24-hour cable news as a dominant news source and the explosive growth of the internet have led to major changes in the American public’s news habits. But a new nationwide survey finds that the coaxial and digital revolutions and attendant changes in news audience behaviors have had little impact on how much Americans know about national and international affairs.
On average, today’s citizens are about as able to name their leaders, and are about as aware of major news events, as was the public nearly 20 years ago. The new survey includes questions that are either identical or roughly comparable to questions asked in the late 1980s and early 1990s. In 2007, somewhat fewer were able to name their governor, the vice president, and the president of Russia, but more respondents than in the earlier era gave correct answers to questions pertaining to national politics.”
What job does Hillary Clinton currently hold?
A. Senator from New York
B. Secretary of State
C. Secretary of Health and Human Services
D. Ambassador to the United Nations

Thinking about the military effort in Afghanistan - do you happen to know if Barack Obama has decided to increase, decrease, or not substantially change the number of U.S. troops stationed in Afghanistan?
A. Increase
B. Decrease
C. Not substantially change

As far as you know, which foreign country holds the most U.S. government debt?
A. Japan
B. China
C. Canada
D. Saudi Arabia

Do you happen to know which political party has a majority in the U.S. House of Representatives?
A. Republicans
B. Democrats

What job does Timothy Geithner currently hold?
A. President Obama's Press Secretary
B. The CEO of CitiGroup
C. The Treasury Secretary
D. The House Majority Leader
What Americans Know: 1989-2007 – Public Knowledge of Current Affairs Little Changed by News and Information Revolutions. Pew Research Center for the People and the Press, April 15, 2007.
How the study was conducted: Telephone interviews of national sample
Nationwide sample of 1,502 adults, 18 years of age or older, surveyed between February 1 and February 13, 2007.
Types of information collected:
(1) News knowledge items
(2) Demographics (Age, Gender)
(3) Level of education
(4) Political engagement
(5) Political ideology and affiliation
(6) Level of news exposure
(7) Sources of news
The danger and the opportunity of secondary data analysis:
This dataset contains far more information than we will be able to make sense of in a single analysis. It’s important to approach existing datasets with a specific question or set of questions, rather than going fishing for significant relationships!
Research question: Would we be better off as a “Colbert Nation?”
“The results of the new Pew Survey on News Consumption suggest that viewers of the "fake news" programs The Daily Show and The Colbert Report are more knowledgeable about current events than watchers of "real" cable news shows … as well as average consumers of NBC, ABC, Fox News, CNN, C-SPAN and daily newspapers.”
--Greg Mitchell, The Huffington Post
RQ: While there are significant differences in news knowledge between those who do and do not watch The Daily Show or The Colbert Report, these differences may be explained away by differences in background characteristics, levels of political engagement or other news consumption patterns between those who do and do not watch comedy news.
Hypothesis 1: Viewers of comedy news tend to be more highly educated, and this higher level of education serves to explain the difference in news knowledge.
Hypothesis 2: Viewers of comedy news consume more news from a variety of sources and are more politically engaged, in general. Thus, comedy news, per se, is not the source of their greater news knowledge.
A first look at the data
ID  Knowledge  ComedyNews   Age  Male  Educ  TotalNews  Engagement  Turnout
 1         50           0  44.4     0    14         57        79.5     55.0
 2         32           1  67.9     0     9         54        69.2     55.0
 3         45           0  45.7     0    12         43        61.0     46.4
 4         26           0  39.3     0    15         16        66.8     52.2
 5         79           0  59.5     1    15         48        80.8     52.2
 6         31           0  43.8     0    12         59        99.6     56.4
 7         22           0  29.6     1    10         57        37.4     56.0
 8         93           0  29.6     1    16         44        88.6     56.0
 9         23           0  45.1     0    12          6        82.4     57.0
10         41           0  18.7     1    11         10        87.3     42.2
11         23           0  47.5     0     9         37        67.0     52.2
12         35           0  80.7     0    12         23        68.5     52.2
13         44           0  18.3     1    10         28        69.4     49.0
14         97           0  79.1     1    17         50        78.2     58.6
15          5           0  91.3     0    11          5        84.5     58.6
16         54           0  32.5     0    13         75        97.8     58.6
17         79           0  37.7     0    15         60        83.8     58.6
18         58           0  30.8     0    16         60        78.3     58.6
19         96           0  49.0     0    15         51        92.4     57.8
20         93           0  68.0     1    14         85        66.1     48.8
21         80           1  58.9     1    15         59        54.9     48.8
22         22           0  50.0     0    12         16        75.2     53.0
23         79           1  51.9     1    16         58        83.4     53.0

• Knowledge: News knowledge scores cover entire possible range (0 to 100)
• ComedyNews: 16 percent of those sampled watch comedy news
• Age: ranges from 18 (minimum age to participate in survey) to 95 years of age
• Educ: Highest grade completed. Average person in sample has almost 14 years of Education
• TotalNews: Measure of exposure to news from all sources. Ranges from low of 0 (no news consumption) to a high of 95 (out of possible 100)
• Engagement: Index of political engagement based on several questions related to involvement with political issues and activities
• Turnout: Voter turnout in respondent’s county in 2004 presidential race (ranges from 26 to 80 percent)

Among the other variables that we are not including: Ideology, Political Party, Blue State/Red State
How should you proceed when you have so many predictors?
What have you done for HW assignments and in class?
1. Describe the distributions of the outcome and predictors
2. Examine scatterplots of the outcome vs. each predictor, transforming as necessary (with supplemental residual plots to guide transformation)
3. Examine estimated correlation matrix to see what it foreshadows for model building
4. If there is a clear predictor for which to control statistically, examine the estimated partial correlation matrix to further foreshadow model building
5. Thoughtfully fit a series of MR models
6. Examine the series to select a “final” model that you believe best summarizes your findings
But with more than 3 or 4 predictors, model building (step 5) becomes unwieldy…
Advice: Before doing any analysis, place your predictors into up to four conceptual groups based on a combination of substance/theory and their role in your statistical analyses
Question predictor(s)
Key control predictor(s)
Additional control predictor(s)
Rival hypothesis predictor(s)
Challenge: 7 predictors = 2^7 − 1 = 127 possible models (+ interactions!)
Predictors to classify: Comedy News, Education, Age, Gender, Total news consumption, Political engagement
Developing a taxonomy of fitted models (using the predictor classifications)
Question predictors: Comedy news
Key control predictors: Education
Additional control predictors: Age, Gender
Rival hypothesis predictors: Total news consumption, Political engagement
Strategy 1: Question predictors first
(Often the approach of choice because it focuses attention on the question predictors)
1. Start with your question predictors: after all, those are the variables in which you’re most interested
2. Add key control predictors, assessing whether the effects change; probably keep the key control predictors in the model regardless
3. Add additional control predictors, keeping them in the model only as necessary
4. Check rival hypothesis predictors to see whether the effects of the question predictors remain
Strategy 2: Build a control model first
(Preferable when the effects of the control predictors are so well established that, beyond a first “peek,” it’s difficult to think about examining the question predictors uncontrolled)
1. Start with the key control predictors: after all, you’re pretty confident they have a major effect that you need to remove
2. Add additional control predictors, keeping them in the model only as necessary
3. Add the question predictors, seeing whether they have an effect over and above the control predictors
4. Check rival hypothesis predictors to see whether the effects of the question predictors remain
Or some combination
Don’t forget there’s a difference between how you do the analysis and how you report the results
Let’s begin by examining the outcome: knowledge of the news
“To compare knowledge levels between demographic groups, the sample was divided into roughly equal thirds on the basis of how many of the questions they answered correctly. About 35% were classified as the “High” knowledge group. About 31% were classified as having “Medium” levels of knowledge. [The remainder] were assigned to the “Low” knowledge group.”
--Pew Research Center

Variable: Knowledge   Mean 57.09   Std Deviation 22.93
Histogram (bin midpoint and count) with boxplot markers
Midpoint  Count  Boxplot
   102.5      8
    97.5     41
    92.5     66
    87.5     80
    82.5    104
    77.5    117  upper quartile
    72.5    104
    67.5     93
    62.5     86
    57.5    108  median and mean
    52.5     94
    47.5    124
    42.5    115  lower quartile
    37.5    104
    32.5     61
    27.5     54
    22.5     55
    17.5     36
    12.5     32
     7.5     15
     2.5      5

Plots vs. question and key control predictors
[Scatterplots of Knowledge vs. Age (r=0.22***) and Knowledge vs. Education (r=0.45***)]
Observations
• Distribution of Knowledge is symmetric and mound-shaped
• Variation in Knowledge is similar for those who do (sd=21.3) and do not (sd=22.7) watch comedy news
• Relationship between Knowledge and Education appears linear and moderate in strength (r=0.45, p<0.0001)
• Relationship between Knowledge and Age is potentially non-linear
Decision: explore functional form of age
[Plot of Knowledge vs. ComedyNews: r=0.18***]
What functional form should we use for Age?
Histogram of Age (bin midpoint and count) with boxplot markers
Midpoint  Count  Boxplot
    97.5      1
    92.5      9
    87.5     26
    82.5     54
    77.5     71
    72.5    102
    67.5    103
    62.5    120  upper quartile
    57.5    130
    52.5    154  median and mean
    47.5    160
    42.5    138
    37.5    126  lower quartile
    32.5     94
    27.5     76
    22.5     71
    17.5     67

Linear Model: R² = 0.0469
                   Parameter   Standard
Variable     DF     Estimate      Error   t Value   Pr > |t|
Intercept     1     42.96843    1.74098     24.68     <.0001
Age           1      0.27710    0.03224      8.60     <.0001

Quadratic Model: R² = 0.0874
                   Parameter   Standard
Variable     DF     Estimate      Error   t Value   Pr > |t|
Intercept     1     12.66833    4.08709      3.10     0.0020
Age           1      1.61078    0.16653      9.67     <.0001
Age2          1     -0.01291    0.00158     -8.16     <.0001
“Dramatic differences emerge when the results are broken down by age. Young people know the least: Only 15% of 18-29 year-olds are among the most informed third of the public, compared with 43% of those ages 65 and older. But it is not these oldest respondents who know the most. Instead, it is people aged 50-64 who are slightly more likely to finish among the third of the sample who know the most…This difference likely is caused by the very different life circumstances of the two oldest age groups. Many of those 65 and older are retired from work, and health problems as well as lifestyle changes can disproportionately work to diminish the interest or ability of some in this generation to keep up with the news.”
--Pew Research Center
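One way to read the quadratic fit is to locate the age at which fitted knowledge peaks, which for a quadratic is −b/(2c). The Python sketch below uses the coefficients from the quadratic model output; the function and variable names are ours, and the ages printed are arbitrary illustrative values.

```python
# Coefficients copied from the quadratic model output (Knowledge on Age, Age2).
B0, B_AGE, B_AGE2 = 12.66833, 1.61078, -0.01291

def fitted_knowledge(age):
    """Fitted news knowledge from the uncontrolled quadratic-in-age model."""
    return B0 + B_AGE * age + B_AGE2 * age ** 2

# Vertex of the parabola: the age at which the fitted curve turns over.
peak_age = -B_AGE / (2 * B_AGE2)
print(round(peak_age, 1))

for age in (20, 40, 60, 80):
    print(age, round(fitted_knowledge(age), 1))
```

The fitted peak falls in the early 60s, which is broadly in line with Pew's observation that the 50-64 group, not the oldest respondents, tends to know the most.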
Where to from here?
Next Steps: Which model building strategy makes the most sense given that…
• The effect of the question predictor (ComedyNews) is statistically significant but relatively modest (r=0.18, p<0.0001).
• The effect of one key control predictor (Education) is moderate (r=0.45, p<0.0001).
• The effect of a second control predictor (Age) is non-linear. Together, Age and Age² explain 9 percent of the variation in news knowledge.
Decision:
Considering effects of control variables
Simple Correlations
Pearson Correlation Coefficients, N = 1502 (Prob > |r| under H0: Rho=0 in parentheses)
            Knowledge          Age                 Education          Male
Knowledge   1.00000
Age         0.21666 (<.0001)   1.00000
Education   0.44955 (<.0001)   0.00646 (0.8023)    1.00000
Male        0.23878 (<.0001)   -0.02234 (0.3870)   0.03473 (0.1785)   1.00000

• Individuals with more Education have higher levels of News Knowledge, on average.
• On average, men have a higher level of News Knowledge than women.
• No linear relationship between Education and Age. Actually, a modest non-linear (quadratic in age) relationship.
• Age and gender are not correlated, as we would expect. Neither are gender and Education.

Partial Correlations (partialling out Age and Age²)
Pearson Correlation Coefficients, N = 1502 (Prob > |r| under H0: Rho=0 in parentheses)
            Knowledge          Education          Male
Knowledge   1.00000
Education   0.44221 (<.0001)   1.00000
Male        0.24853 (<.0001)   0.03042 (0.2390)   1.00000

• Effect of Education persists even after partialling out Age and Age².
• Effect of Male also remains in the partial correlations, as does the lack of relationship between gender and Education.
Education and gender remain important predictors of news knowledge, even after controlling for (or partialling out) the non-linear effect of age. Because gender and education are not correlated with each other but are correlated with news knowledge, they will both be important variables to include in our regression analysis. We will also want to consider interactions among these control variables.
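To make "partialling out" concrete, here is a small Python sketch of the first-order partial correlation formula applied to the simple correlations reported on this slide. It is a simplification: it removes only the linear effect of Age, whereas the slide's partial correlations remove both Age and Age², so the results will not reproduce the slide's values exactly. The function name is ours.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Simple correlations taken from the slide: Knowledge, Education, Male, Age.
r_know_educ_given_age = partial_r(r_xy=0.44955, r_xz=0.21666, r_yz=0.00646)
r_know_male_given_age = partial_r(r_xy=0.23878, r_xz=0.21666, r_yz=-0.02234)

print(round(r_know_educ_given_age, 3))
print(round(r_know_male_given_age, 3))
```

Because Education and Male are nearly uncorrelated with Age, removing Age barely moves either correlation, which matches the slide's conclusion that both effects persist.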
Building the “control” model for the effects of news knowledge
“Final” control model: R² = 0.3231
                   Parameter     Standard
Variable     DF     Estimate        Error   t Value   Pr > |t|
Intercept     1      8.45107     20.00031      0.42     0.6727
Age           1     -1.27406      0.79271     -1.61     0.1082
Age2          1      0.01565      0.00740      2.12     0.0345
Education     1      0.52522      1.49291      0.35     0.7250
Male          1     21.76688      5.81564      3.74     0.0002
Malexeduc     1     -0.80728      0.41017     -1.97     0.0492
Educxage      1      0.17842      0.05841      3.05     0.0023
Educxage2     1     -0.00176   0.00054006     -3.26     0.0011
Examining the effects of Age, Education and Gender and their interactions.
The results of the analysis showed that, as expected, age, education and gender accounted for a sizeable amount of the variance (32%) in news knowledge. Interestingly, we observe interactions between education and gender such that gender differences in news knowledge are larger among those with lower levels of education (after controlling for the linear and quadratic effects of age). In addition, we observe an interaction between education and the linear and quadratic terms for age. Our graphical representation of this result serves to illuminate the relationship uncovered by our model. Here, we observe that news knowledge differences by education are less pronounced among young adults and the elderly, whereas differences by level of education are comparatively greater among middle-aged adults.
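The gender-by-education interaction can be unpacked with a little arithmetic: in the control model, the fitted male-female difference at a given level of education is the Male coefficient plus the Male×Education coefficient times years of education. A minimal Python sketch, using the two estimates from the control model output (the function name and the education values shown are our own choices):

```python
# Estimates copied from the "final" control model output.
B_MALE = 21.76688
B_MALE_EDUC = -0.80728

def gender_gap(education_years):
    """Fitted male-minus-female difference in news knowledge, holding age fixed."""
    return B_MALE + B_MALE_EDUC * education_years

for years in (8, 12, 16, 18):
    print(years, round(gender_gap(years), 2))
```

The gap shrinks by about 0.8 points for each additional year of education, which is what "larger among those with lower levels of education" means numerically.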
[Figure: Fitted relationship between news knowledge and education and gender, controlling for age. x-axis: Education (Years), 8 to 18; y-axis: News knowledge score, 20 to 80; separate lines for male and female.]
[Figure: Fitted relationship between news knowledge and education and age, controlling for gender. x-axis: Age (Years), 15 to 95; y-axis: News knowledge score, 20 to 80; separate lines for < HS, HS grad, College grad, and Masters.]
Examining the effects of the question predictor: Uncontrolled & controlled
Results of fitting a taxonomy of multiple regression models predicting news knowledge among a random sample of 1502 adults in the US

Predictor         Model A              Model B              Model C
Intercept         55.303*** (0.633)    8.451 (20.000)       -1.707 (19.653)
Comedy News       11.441*** (1.605)                         10.376*** (1.328)
Age                                    -1.274 (0.793)       -0.902 (0.779)
Age2                                   0.016* (0.007)       0.013~ (0.007)
Education                              0.525 (1.493)        1.125 (1.466)
Male                                   21.767*** (5.816)    21.793*** (5.702)
Male*Education                         -0.807* (0.410)      -0.823* (0.402)
Education*Age                          0.178** (0.058)      0.152** (0.057)
Education*Age2                         -0.002** (0.001)     -0.002** (0.001)
R²                0.0328               0.3231               0.3497
F                 50.83                101.89               100.37
df                (1, 1500)            (7, 1494)            (8, 1493)
p-value           <.0001               <.0001               <.0001

Cell entries are estimated regression coefficients (standard errors).
~ p<.10, *p<.05, **p<.01, ***p<.001
Hypothesis 1: Viewers of comedy news tend to be more highly educated, and this higher level of education serves to explain the difference in news knowledge.
Before considering other controls, we estimate that comedy news viewers earn news knowledge scores that are an average of 11.4 points higher than those who do not watch comedy news (Model A). We hypothesized that this effect is, in part, due to differences in background characteristics by viewership; in particular, programs like The Daily Show attract a more highly educated audience, on average, and differences in educational background serve to explain much of the test score differential. While the relationship between education and viewership was borne out in the data, it was not as strong as we might have anticipated (r=0.056, p=0.0295). Rather, we observe that even after controlling for the effect of education and other key control variables, the main effect of comedy news viewership remains relatively stable (Model C).
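A quick consistency check on the taxonomy is to test the increase in R² from Model B to Model C, which differ only by the Comedy News term. The Python sketch below applies the standard ΔR² F-test to the reported R² values; with one added predictor, the result should be close to the square of that predictor's t-statistic (estimate divided by standard error). The function name is ours, and small discrepancies are expected because the table entries are rounded.

```python
def f_change(r2_reduced, r2_full, n_added, df_resid_full):
    """F statistic for the increase in R-squared when predictors are added."""
    numerator = (r2_full - r2_reduced) / n_added
    denominator = (1 - r2_full) / df_resid_full
    return numerator / denominator

# Model B (controls only) vs. Model C (controls + Comedy News), from the table.
f_stat = f_change(r2_reduced=0.3231, r2_full=0.3497, n_added=1, df_resid_full=1493)

# t statistic for Comedy News in Model C: estimate / standard error.
t_stat = 10.376 / 1.328

print(round(f_stat, 1), round(t_stat ** 2, 1))
```

Both routes give a value near 61, far beyond conventional critical values, so Comedy News adds predictive power over and above the control model.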
A prescription for a better informed citizenry?
Maybe… but first we need to consider our rival hypotheses!
What about the rival hypothesis predictors?
Hypothesis 2: Viewers of comedy news consume more news from a variety of sources and are more politically engaged, in general. Thus, comedy news, per se, is not the source of their greater news knowledge.
[Scatterplots of news knowledge vs. each rival hypothesis predictor: political engagement (r=0.33***), total news consumption (r=0.32***), county voter turnout (r=0.19***)]
• Political engagement is a moderately good predictor of average news knowledge (r=0.33, p<0.0001).
• Voter turnout in a respondent’s county has a weak linear relationship with news knowledge (r=0.19, p<0.0001).
• While the correlation coefficient between news knowledge and total news consumption suggests a moderate linear relationship (r=0.32), the scatterplot (and theory) indicate that a non-linear specification may be preferable.
Decision:
Considering a quadratic versus a log transformation for Total News Consumption
[Figure: fitted News Knowledge Score (y-axis, 35 to 70) vs. Total News Consumption (x-axis, 0 to 100) under the quadratic and logarithm specifications]
Reasons to prefer logarithm
• Theoretical: Log transformation corresponds to notion of diminishing marginal returns
• Practical: Parameter estimate from log transformation more readily interpretable
Similarities
• As shown above, the fitted lines are remarkably similar
• R² for both similar: 0.1173 for log & 0.1217 for quadratic
Reason to prefer quadratic
• Practical: Including quadratic term in model allows us to formally test for a non-linear relationship.
Functional form of Total News Consumption: “Starting” a log transformation
Total news consumption on its raw scale:
Variable        N   N Miss    Mean   Std Dev   Minimum   Maximum
totalnews    1502        0   45.24     25.37      0.00     95.00

                   Parameter   Standard
Variable     DF     Estimate      Error   t Value   Pr > |t|
Intercept     1     43.91565    1.14560     38.33     <.0001
totalnews     1      0.29112    0.02209     13.18     <.0001

Taking logs directly: log(0) is undefined, so 51 observations are missing on the transformed scale.
L2totnews = log2(totalnews);
Variable        N   N Miss    Mean   Std Dev   Minimum   Maximum
L2totnews    1451       51    5.26      1.06      0.00      6.57

If some observations take on the value of zero on the raw scale, we must add a small adjustment to the variable prior to log transformation. Otherwise, we will have missing observations on the transformed scale. We call this adjustment “starting” a log transformation, which we do by adding an incremental value to all observations for that variable.

“Starting” the variable:
L2totalnews = log2(totalnews+1);
Variable        N   N Miss    Mean   Std Dev   Minimum   Maximum
L2totalnews  1502        0    5.13      1.37      0.00      6.58

                   Parameter   Standard
Variable     DF     Estimate      Error   t Value   Pr > |t|
Intercept     1     28.06170    2.15311     13.03     <.0001
L2totalnews   1      5.65306    0.40511     13.95     <.0001
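The same "started" log can be sketched in Python to show why the start is needed and how the base-2 slope is read. The helper names and the default start of 1 mirror the log2(totalnews+1) step; the fitted coefficients come from the regression on L2totalnews. Note that Python raises an error for log2(0), whereas the statistical output above records those cases as missing.

```python
import math

def started_log2(x, start=1):
    """Base-2 log after adding a small 'start' so that zeros stay defined."""
    return math.log2(x + start)

def fitted_knowledge(totalnews):
    """Fitted value from the uncontrolled regression on the started log."""
    return 28.06170 + 5.65306 * started_log2(totalnews)

try:
    math.log2(0)
except ValueError:
    print("log2(0) is undefined")

print(started_log2(0), round(started_log2(95), 2))   # minimum and maximum of the started log
print(round(fitted_knowledge(3) - fitted_knowledge(1), 3))  # one doubling of (totalnews + 1)
```

Because the log is base 2, each doubling of (totalnews + 1) is associated with the same difference in fitted knowledge, about 5.65 points, which is the diminishing-returns pattern the transformation is meant to capture.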
What about the rival hypothesis predictors? News consumption and political engagement
R² = 0.4325
                    Parameter     Standard
Variable      DF     Estimate        Error   t Value   Pr > |t|
Intercept      1    -27.57830     18.55229     -1.49     0.1374
ComedyNews     1      7.24408      1.27068      5.70     <.0001
Age            1     -1.19026      0.72903     -1.63     0.1028
Age2           1      0.01409      0.00680      2.07     0.0384
Education      1      0.16643      1.37534      0.12     0.9037
Male           1     22.35841      5.34508      4.18     <.0001
MalexEduc      1     -0.84624      0.37706     -2.24     0.0250
EducxAge       1      0.16075      0.05367      2.99     0.0028
EducxAge2      1     -0.00158   0.00049604     -3.18     0.0015
Turnout        1      0.17671      0.05167      3.42     0.0006
Engagement     1      0.23259      0.02846      8.17     <.0001
L2totalnews    1      3.57360      0.34307     10.42     <.0001

• While attenuated somewhat, the effect of our question predictor remains upon the addition of our rival hypotheses.
• Both measures of political engagement are statistically significant. Controlling for all other variables in the model, individuals with high levels of political engagement and individuals living in areas with higher voter turnout have higher levels of news knowledge, on average.
• As we would expect, individuals who consume more news (from a greater variety of sources) have higher average levels of news knowledge.

What about other interactions with rival hypotheses?
It’s plausible that the effect of gender would interact with political engagement or with news consumption such that gender differences are smaller among men and women who are more politically engaged or who consume more news. These additional interactions were tested with the model above but were found not to be significant.
Final check and our “final” model
R² = 0.4353
                    Parameter     Standard
Variable      DF     Estimate        Error   t Value   Pr > |t|
Intercept      1    -30.32633     18.66522     -1.62     0.1044
ComedyNews     1     20.80208      8.36105      2.49     0.0130
Age            1     -1.08433      0.73263     -1.48     0.1391
Age2           1      0.01318      0.00682      1.93     0.0535
Education      1      0.22650      1.37494      0.16     0.8692
Male           1     21.89698      5.34360      4.10     <.0001
MalexEduc      1     -0.81002      0.37701     -2.15     0.0318
EducxAge       1      0.15969      0.05364      2.98     0.0030
EducxAge2      1     -0.00158   0.00049573     -3.19     0.0014
Turnout        1      0.17411      0.05164      3.37     0.0008
Engagement     1      0.23437      0.02845      8.24     <.0001
L2totalnews    1      3.58320      0.34275     10.45     <.0001
ComxAge        1     -0.70171      0.35886     -1.96     0.0507
ComxAge2       1      0.00772      0.00360      2.14     0.0323

Final check:
We tested for statistical interactions between our question predictor, ComedyNews, and all other predictors in the model. We found that after including in the model our rival hypothesis variables (in addition to all other control predictors), the Comedy News indicator interacted significantly with Age.
Next Steps:
1. Residual Plots – does it appear as though the assumptions underlying our model are reasonable?
2. Prototypical plots – which effects do we want to highlight through graphical display?
What predictors should we include in our “final” model???
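Because ComedyNews interacts with both Age and Age2, its fitted effect is no longer a single number but a quadratic function of age. A minimal Python sketch of that function, using the three ComedyNews-related estimates from the output above (the function name and the ages evaluated are our own choices):

```python
# Estimates copied from the "final" model output.
B_COMEDY = 20.80208
B_COMEDY_AGE = -0.70171
B_COMEDY_AGE2 = 0.00772

def comedy_effect(age):
    """Fitted viewer-minus-non-viewer difference in news knowledge at a given age."""
    return B_COMEDY + B_COMEDY_AGE * age + B_COMEDY_AGE2 * age ** 2

# Age at which the fitted difference is smallest (vertex of the parabola).
age_of_smallest_gap = -B_COMEDY_AGE / (2 * B_COMEDY_AGE2)
print(round(age_of_smallest_gap, 1))

for age in (20, 45, 70, 90):
    print(age, round(comedy_effect(age), 1))
```

Under these estimates the fitted difference stays positive across the observed age range and is smallest in the mid-40s, which is why the main-effect coefficient of 20.8 on its own (the fitted difference at Age = 0) should not be read as "the" comedy news effect.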
Examining residuals from “final” model
Histogram of residuals (bin midpoint and count) with boxplot markers
Midpoint  Count  Boxplot
    3.75      1  outlier
    3.25      0
    2.75      6  outlier
    2.25     14
    1.75     76
    1.25    150
    0.75    246  upper quartile
    0.25    263  median
   -0.25    276  mean
   -0.75    233  lower quartile
   -1.25    129
   -1.75     72
   -2.25     24
   -2.75     10  outlier
   -3.25      2  outlier

n = 57 (3.8%)
Reasonably symmetric
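The "n = 57 (3.8%)" annotation can be checked against the histogram counts. The Python sketch below assumes (this is our reading, not stated on the slide) that the residuals are on a standardized scale, that bins are 0.5 wide and labeled by midpoint, and that the 57 refers to residuals more than 2 units from zero. It compares the observed share with what a normal distribution would put beyond ±2.

```python
import math

# Bin midpoints and counts read off the residual histogram.
counts = {
    3.75: 1, 3.25: 0, 2.75: 6, 2.25: 14, 1.75: 76, 1.25: 150, 0.75: 246,
    0.25: 263, -0.25: 276, -0.75: 233, -1.25: 129, -1.75: 72,
    -2.25: 24, -2.75: 10, -3.25: 2,
}

total = sum(counts.values())
beyond_2 = sum(n for midpoint, n in counts.items() if abs(midpoint) > 2)
observed_share = beyond_2 / total

# Two-sided tail probability P(|Z| > 2) for a standard normal variable.
expected_share = math.erfc(2 / math.sqrt(2))

print(beyond_2, total)
print(round(observed_share, 3), round(expected_share, 3))
```

Under these assumptions the observed share is a little below the roughly 4.6% a normal distribution would imply, so the tails give no obvious cause for concern.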
Contemplating a graph that displays the findings
$$
\begin{aligned}
\widehat{\text{Knowledge}} ={} & -30.326 + 20.802\,\text{ComedyNews} - 1.084\,\text{Age} + 0.013\,\text{Age}^2 + 0.227\,\text{Education} \\
& + 21.897\,\text{Male} + 0.174\,\text{Turnout} + 0.234\,\text{Engage} + 3.583\,\text{L2TotalNews} \\
& - 0.702\,\text{ComedyxAge} + 0.008\,\text{ComedyxAge}^2 + 0.160\,\text{EducxAge} - 0.002\,\text{EducxAge}^2 - 0.81\,\text{MalexEduc}
\end{aligned}
$$

• Want to show the effect of ComedyNews: it’s my question predictor but is dichotomous
• ComedyNews interacts with Age and Age²: put Age on x-axis
• Age and Age² also interact with Education: construct fitted lines for different levels of Education
• Less interested in effects of gender, political engagement and total news consumption: set these at their means.

[Plot 1: Scatterplot of news knowledge by age for those who watch comedy news]
[Plot 2: Scatterplot of news knowledge by age for those who do not watch comedy news]
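Computing fitted values to create prototypical plots

$$
\begin{aligned}
\widehat{\text{Knowledge}} ={} & -30.326 + 20.802\,\text{ComedyNews} - 1.084\,\text{Age} + 0.013\,\text{Age}^2 + 0.227\,\text{Education} \\
& + 21.897\,\text{Male} + 0.174\,\text{Turnout} + 0.234\,\text{Engage} + 3.583\,\text{L2TotalNews} \\
& - 0.702\,\text{ComedyxAge} + 0.008\,\text{ComedyxAge}^2 + 0.160\,\text{EducxAge} - 0.002\,\text{EducxAge}^2 - 0.81\,\text{MalexEduc}
\end{aligned}
$$

Step 1: Set Male, Turnout, Engage, and L2TotalNews to their means to control for their effects in plots

Variable        Mean
Male            0.49
Turnout        56.40
Engagement     73.41
L2totalnews     5.09

$$
\begin{aligned}
\widehat{\text{Knowledge}} ={} & -30.326 + 20.802\,\text{ComedyNews} - 1.084\,\text{Age} + 0.013\,\text{Age}^2 + 0.227\,\text{Education} \\
& + 21.897(0.49) + 0.174(56.40) + 0.234(73.41) + 3.583(5.09) \\
& - 0.702\,\text{ComedyxAge} + 0.008\,\text{ComedyxAge}^2 + 0.160\,\text{EducxAge} - 0.002\,\text{EducxAge}^2 - 0.81(0.49)\,\text{Educ}
\end{aligned}
$$

$$
\begin{aligned}
\widehat{\text{Knowledge}} ={} & 25.633 + 20.802\,\text{ComedyNews} - 1.084\,\text{Age} + 0.013\,\text{Age}^2 - 0.170\,\text{Education} \\
& - 0.702\,\text{ComedyxAge} + 0.008\,\text{ComedyxAge}^2 + 0.160\,\text{EducxAge} - 0.002\,\text{EducxAge}^2
\end{aligned}
$$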
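Step 1 is just arithmetic: every term that involves a predictor held at its mean collapses into either the intercept or, for the Male×Educ interaction, the Education slope. A minimal Python sketch using the rounded coefficients and means shown on this slide (the dictionary keys are our own labels):

```python
# Rounded coefficients from the final model, as used in the fitted equation.
coef = {
    "Intercept": -30.326, "Education": 0.227, "Male": 21.897,
    "Turnout": 0.174, "Engage": 0.234, "L2TotalNews": 3.583,
    "MalexEduc": -0.81,
}

# Sample means of the predictors we are controlling for.
means = {"Male": 0.49, "Turnout": 56.40, "Engage": 73.41, "L2TotalNews": 5.09}

# Terms in the held-constant predictors fold into the intercept.
collapsed_intercept = coef["Intercept"] + sum(coef[name] * means[name] for name in means)

# MalexEduc at the mean of Male becomes an extra piece of the Education slope.
collapsed_educ_slope = coef["Education"] + coef["MalexEduc"] * means["Male"]

print(round(collapsed_intercept, 3), round(collapsed_educ_slope, 3))
```

Setting Male to 0.49 does not describe any actual person; it simply averages over men and women in sample proportions so the plot can focus on ComedyNews, Age, and Education.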
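Computing fitted values to create prototypical plots, II

$$
\begin{aligned}
\widehat{\text{Knowledge}} ={} & 25.633 + 20.802\,\text{ComedyNews} - 1.084\,\text{Age} + 0.013\,\text{Age}^2 - 0.170\,\text{Education} \\
& - 0.702\,\text{ComedyxAge} + 0.008\,\text{ComedyxAge}^2 + 0.160\,\text{EducxAge} - 0.002\,\text{EducxAge}^2
\end{aligned}
$$

Step 2: Create separate equations according to the ComedyNews indicator

For those who watch comedy news

$$
\begin{aligned}
\widehat{\text{Knowledge}} ={} & 25.633 + 20.802(1) - 1.084\,\text{Age} + 0.013\,\text{Age}^2 - 0.170\,\text{Education} \\
& - 0.702(1)\,\text{Age} + 0.008(1)\,\text{Age}^2 + 0.160\,\text{EducxAge} - 0.002\,\text{EducxAge}^2 \\
={} & 46.435 - 1.786\,\text{Age} + 0.021\,\text{Age}^2 - 0.170\,\text{Education} + 0.160\,\text{EducxAge} - 0.002\,\text{EducxAge}^2
\end{aligned}
$$

For those who do not watch comedy news

$$
\begin{aligned}
\widehat{\text{Knowledge}} ={} & 25.633 + 20.802(0) - 1.084\,\text{Age} + 0.013\,\text{Age}^2 - 0.170\,\text{Education} \\
& - 0.702(0)\,\text{Age} + 0.008(0)\,\text{Age}^2 + 0.160\,\text{EducxAge} - 0.002\,\text{EducxAge}^2 \\
={} & 25.633 - 1.084\,\text{Age} + 0.013\,\text{Age}^2 - 0.170\,\text{Education} + 0.160\,\text{EducxAge} - 0.002\,\text{EducxAge}^2
\end{aligned}
$$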
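Computing fitted values to create prototypical plots, III

For those who watch comedy news

$$\widehat{\text{Knowledge}} = 46.435 - 1.786\,\text{Age} + 0.021\,\text{Age}^2 - 0.170\,\text{Education} + 0.160\,\text{EducxAge} - 0.002\,\text{EducxAge}^2$$

For those who do not watch comedy news

$$\widehat{\text{Knowledge}} = 25.633 - 1.084\,\text{Age} + 0.013\,\text{Age}^2 - 0.170\,\text{Education} + 0.160\,\text{EducxAge} - 0.002\,\text{EducxAge}^2$$

Step 3: Select prototypical values of education
12 Years = High School Graduate
16 Years = College Graduate
18 Years = Master’s degree

Comedy news viewers

HS Grad:
$$
\begin{aligned}
\widehat{\text{Knowledge}} &= 46.435 - 1.786\,\text{Age} + 0.021\,\text{Age}^2 - 0.170(12) + 0.160(12)\,\text{Age} - 0.002(12)\,\text{Age}^2 \\
&= 44.395 + 0.134\,\text{Age} - 0.003\,\text{Age}^2
\end{aligned}
$$

College:
$$
\begin{aligned}
\widehat{\text{Knowledge}} &= 46.435 - 1.786\,\text{Age} + 0.021\,\text{Age}^2 - 0.170(16) + 0.160(16)\,\text{Age} - 0.002(16)\,\text{Age}^2 \\
&= 43.715 + 0.774\,\text{Age} - 0.011\,\text{Age}^2
\end{aligned}
$$

Masters:
$$
\begin{aligned}
\widehat{\text{Knowledge}} &= 46.435 - 1.786\,\text{Age} + 0.021\,\text{Age}^2 - 0.170(18) + 0.160(18)\,\text{Age} - 0.002(18)\,\text{Age}^2 \\
&= 43.375 + 1.094\,\text{Age} - 0.015\,\text{Age}^2
\end{aligned}
$$

Non-viewers

HS Grad:
$$
\begin{aligned}
\widehat{\text{Knowledge}} &= 25.633 - 1.084\,\text{Age} + 0.013\,\text{Age}^2 - 0.170(12) + 0.160(12)\,\text{Age} - 0.002(12)\,\text{Age}^2 \\
&= 23.593 + 0.836\,\text{Age} - 0.011\,\text{Age}^2
\end{aligned}
$$

College:
$$
\begin{aligned}
\widehat{\text{Knowledge}} &= 25.633 - 1.084\,\text{Age} + 0.013\,\text{Age}^2 - 0.170(16) + 0.160(16)\,\text{Age} - 0.002(16)\,\text{Age}^2 \\
&= 22.913 + 1.476\,\text{Age} - 0.019\,\text{Age}^2
\end{aligned}
$$

Masters:
$$
\begin{aligned}
\widehat{\text{Knowledge}} &= 25.633 - 1.084\,\text{Age} + 0.013\,\text{Age}^2 - 0.170(18) + 0.160(18)\,\text{Age} - 0.002(18)\,\text{Age}^2 \\
&= 22.573 + 1.796\,\text{Age} - 0.023\,\text{Age}^2
\end{aligned}
$$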
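Steps 2 and 3 can be scripted so that each prototypical line is generated rather than worked by hand. The Python sketch below starts from the rounded, mean-collapsed equation from Step 1 and returns the intercept, Age slope, and Age² slope for any combination of the ComedyNews indicator and years of education. The function names are ours, and because the coefficients are rounded to three decimals the curves are approximations to what the unrounded estimates would give.

```python
def prototypical_line(comedy, educ):
    """Return (intercept, age slope, age-squared slope) for one prototypical group.

    comedy: 1 for comedy news viewers, 0 for non-viewers
    educ:   years of education (e.g. 12, 16, 18)
    """
    intercept = 25.633 + 20.802 * comedy - 0.170 * educ
    age_slope = -1.084 - 0.702 * comedy + 0.160 * educ
    age2_slope = 0.013 + 0.008 * comedy - 0.002 * educ
    return intercept, age_slope, age2_slope

def fitted_knowledge(comedy, educ, age):
    """Fitted news knowledge for a prototypical person of a given age."""
    intercept, age_slope, age2_slope = prototypical_line(comedy, educ)
    return intercept + age_slope * age + age2_slope * age ** 2

for comedy in (1, 0):
    for educ in (12, 16, 18):
        line = tuple(round(value, 3) for value in prototypical_line(comedy, educ))
        print(comedy, educ, line)
```

Calling fitted_knowledge over a grid of ages for each group gives the points needed to draw the prototypical plots.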
Creating prototypical plots
[Two panels, one for those who watch comedy news and one for those who do not watch comedy news. Each plots News Knowledge Score (0 to 100) against Age (Years, 0 to 100), with separate fitted lines for Masters, College grad, and HS grad.]
Creating prototypical plots, II
Figure 1. Panel of plots illustrating the fitted relationship between news knowledge and age by comedy news viewership status and level of education, controlling for gender, measures of political engagement and total news consumption (at their means) for Adult Americans (n=1502)
[Three panels: HS Degree (Education = 12), College Degree (Education = 16), Master’s Degree (Education = 18). Each plots News Knowledge Score (0 to 100) against Age (Years, 0 to 100), with separate fitted lines for comedy news viewers and non-viewers.]
Summarizing our results
[Figure 1 panel repeated: College Degree (Education = 16), fitted News Knowledge Score (0 to 100) against Age (Years, 0 to 100) for comedy news viewers and non-viewers.]
After controlling for education, gender, political engagement and total news consumption, there remains a significant, positive relationship between comedy news viewership and news knowledge, such that those who watch comedy news exhibit higher levels of news knowledge, on average. As Figure 1 illustrates, the effect of comedy news viewership on news knowledge depends on age. Across levels of education, we observe a general trend whereby the average difference between comedy news viewers and non-viewers is larger among older Americans. Whereas older Americans who do not watch comedy news have lower levels of news knowledge than their middle-aged counterparts, older Americans who watch comedy news perform as well as (if not better than) their middle-aged counterparts.
Does this mean that comedy news is a prescription for a better-informed citizenry? Maybe. While primarily intended as comedy entertainment, these shows do provide viewers with legitimate news and information, often very memorably! A causal interpretation is therefore plausible, especially given that the effect largely remains after considering our control and rival hypothesis variables. Because this analysis is observational, however, we are not able to make the causal claim that watching comedy news leads to better news knowledge. Rather, comedy news viewership may relate to other unobserved characteristics that also relate to news knowledge, so that our observed effect of comedy news viewership may in fact be the result of these unobserved characteristics.
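The age dependence of the viewer versus non-viewer difference can be made concrete with a small sketch. It assumes the rounded Model E estimates reported in Table 3 for Comedy News (20.802), ComedyxAge (-0.702) and ComedyxAge2 (0.008); the function name `viewer_gap` is hypothetical. Because Model E has no Comedy-by-Education term, the implied gap is the same at every education level.

```python
# Rounded Model E estimates from Table 3 (an assumption of this sketch:
# rounding to three decimals makes the Age^2 term in particular only approximate).
B_COMEDY = 20.802
B_COMEDY_AGE = -0.702
B_COMEDY_AGE2 = 0.008


def viewer_gap(age):
    """Fitted difference in news knowledge, viewers minus non-viewers, at a given age."""
    return B_COMEDY + B_COMEDY_AGE * age + B_COMEDY_AGE2 * age ** 2


# Age at which the quadratic gap is smallest (vertex of the parabola).
age_of_smallest_gap = -B_COMEDY_AGE / (2 * B_COMEDY_AGE2)

if __name__ == "__main__":
    for age in (20, 40, 60, 80):
        print(age, round(viewer_gap(age), 1))
    print("smallest gap near age", round(age_of_smallest_gap, 1))
```

With these rounded coefficients the implied gap stays positive across the adult age range, is smallest around the mid-40s, and widens at older ages, which matches the pattern described for Figure 1.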
How might I organize and present my results?
Four sets of evidence in a typical research presentation
1. Descriptive statistics: a table summarizing distributions (often by interesting subgroups)
2. Correlation matrix summarizing relationships among variables (sometimes with partials as well)
3. Selected regression results documenting key findings from the analysis (not every model you fit)
4. Prototypical plots summarizing the major findings (probably the plots we just constructed)
Don’t forget to distinguish between how you do the analysis and how you report
the results
Helpful hints about presenting results
1. Decide on your key points: Your text, tables and displays (appropriately titled and organized) should support that argument
2. Think about your reader, not yourself: take the reader’s perspective and supply evidence that helps him/her evaluate your argument
3. Try out alternative displays and text: your first attempt is rarely your best
4. Writing up your results usually helps solidify—and often modify—your major argument, tables and graphs. Learn from writing; re-writing is essential
• Be very careful about causal language
• Make clear that the effects that you report estimate what happens “on average”
• Always specify what’s controlled (and implicitly what’s not) when making particular statements
• Be sure to illustrate the magnitude of effects by appropriately interpreting parameter estimates
A well-organized paper will likely include at least three sections:
(1) an introduction that presents an overview of your argument and the research question(s) to be examined
(2) an analysis section that describes what was done and why, as well as the results of these analyses
(3) a summary and conclusion section that discusses the interpretation, implications and limitations of your analyses
Table 1. Estimated means and sd’s by comedy news viewership
(with t-statistics testing for differences in means by viewership)
Table 1. Estimated means and standard deviations of news knowledge and predictors, by comedy news viewership status (with t-test for difference in means)
Variable               | Comedy News Non-Viewers (n=1268) | Comedy News Viewers (n=234) | t
News Knowledge         | 55.30 (22.77)                    | 66.74 (21.33)               | -7.13***
Age (Years)            | 51.51 (17.98)                    | 47.87 (17.33)               | 2.86**
Education (Years)      | 13.89 (2.41)                     | 14.26 (2.37)                | -2.18*
Male                   | 0.48 (0.50)                      | 0.53 (0.50)                 | -1.16
County voter turnout   | 56.3 (8.83)                      | 56.94 (8.01)                | -1.03
Engagement             | 73.29 (16.71)                    | 74.04 (17.38)               | -0.63
Total news consumption | 42.31 (24.80)                    | 61.10 (22.44)               | -10.81***
Cell entries are sample means and standard deviations (in parentheses).
*p<0.05; **p<0.01; ***p<0.001
Estimated mean news knowledge is 55.30 for non-viewers and 66.74 for viewers, a raw difference of about 11 points (half a standard deviation). The difference is statistically significant (t=-7.13, p<0.0001).
Comedy news viewers are almost 4 years younger than non-viewers, on average (t=2.86, p=0.0043). Nevertheless, viewers have significantly more education than non-viewers (t=-2.18, p=0.0295).
In our sample, 53 percent of comedy news viewers and 48 percent of non-viewers are male. The proportion of males did not differ significantly by viewership status.
Measures of political engagement did not differ significantly by viewership status. On average, individuals in both groups lived in counties where approximately 56 percent of people voted in the 2004 presidential election and had similar levels of individual political engagement.
Comedy news viewers consume more news from a greater variety of sources than non-viewers (t=-10.81, p<0.0001).
Approximately 16 percent of the individuals in our sample watch comedy news.
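The t-statistics in Table 1 can be approximated from the tabled summary statistics alone. The sketch below assumes a pooled-variance (equal variances) two-sample t-test; the slides do not say which variant was used, and the function name `pooled_t` is hypothetical. Group 1 is non-viewers and group 2 is viewers, which gives the signs shown in the table.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic from summary statistics, assuming equal variances."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error


# Means, SDs and group sizes copied from Table 1 (non-viewers first).
t_knowledge = pooled_t(55.30, 22.77, 1268, 66.74, 21.33, 234)
t_age = pooled_t(51.51, 17.98, 1268, 47.87, 17.33, 234)

if __name__ == "__main__":
    print(round(t_knowledge, 2), round(t_age, 2))
```

Because the inputs are themselves rounded to two decimals, small discrepancies from the tabled t-values are to be expected for some rows.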
Table 2. Correlation matrix and Partial Correlation Matrix controlling for Age (in linear and quadratic form), n=1502
            | Knowledge         | ComedyNews        | Age     | Education         | Male            | Turnout       | Engagement
ComedyNews  | 0.18*** / 0.20*** |                   |         |                   |                 |               |
Age         | 0.22***           | -0.07**           |         |                   |                 |               |
Education   | 0.45*** / 0.44*** | 0.06* / 0.06*     | 0.01    |                   |                 |               |
Male        | 0.24*** / 0.25*** | 0.03 / 0.03       | -0.02   | 0.03 / 0.03       |                 |               |
Turnout     | 0.12*** / 0.11*** | 0.03 / 0.03       | 0.02    | 0.06* / 0.05*     | 0.02 / 0.02     |               |
Engagement  | 0.33*** / 0.29*** | 0.02 / 0.03       | 0.16*** | 0.25*** / 0.23*** | -0.05* / -0.06* | 0.06* / 0.05* |
L2TotalNews | 0.34*** / 0.33*** | 0.21*** / 0.23*** | 0.13*** | 0.13*** / 0.13*** | 0.02 / 0.02     | 0.02 / 0.02   | 0.17*** / 0.16***
Cell entries are correlation / partial correlation (partialing out Age and Age²); cells in the Age row and column show the correlation only.
*p<0.05; **p<0.01; ***p<0.001
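To see how a partial correlation relates to the simple correlations in Table 2, here is a minimal sketch of the standard first-order formula. It controls for linear Age only, whereas the table partials out both Age and Age², so it should approximate rather than exactly reproduce the tabled partials. The function name `partial_corr` is hypothetical.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation between x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Table 2 inputs: Knowledge-ComedyNews = 0.18, Knowledge-Age = 0.22,
# ComedyNews-Age = -0.07.
r_knowledge_comedy_given_age = partial_corr(0.18, 0.22, -0.07)

if __name__ == "__main__":
    print(round(r_knowledge_comedy_given_age, 2))
```

The partial is slightly larger than the simple correlation here because Age relates positively to Knowledge but negatively to ComedyNews, so removing Age strengthens the apparent association.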
Table 3. Results of fitting a taxonomy of multiple regression models
Table 3. Results of fitting a taxonomy of multiple regression models predicting news knowledge among a random sample of 1502 American adults
Predictor        | Model A           | Model B          | Model C           | Model D           | Model E
Intercept        | 55.303*** (0.63)  | 8.451 (20.00)    | -1.707 (19.65)    | -27.578 (18.54)   | -30.326 (18.67)
Comedy News      | 11.441*** (1.60)  |                  | 10.376*** (1.33)  | 7.244*** (1.27)   | 20.802* (8.36)
Age              |                   | -1.274 (0.79)    | -0.902 (0.78)     | -1.190 (0.73)     | -1.084 (0.73)
Age²             |                   | 0.016* (0.01)    | 0.013~ (0.01)     | 0.014* (0.01)     | 0.013~ (0.01)
Education        |                   | 0.525 (1.49)     | 1.125 (1.47)      | 0.166 (1.38)      | 0.227 (1.37)
Male             |                   | 21.767** (5.82)  | 21.793** (5.70)   | 22.358*** (5.35)  | 21.897*** (5.34)
Male x Education |                   | -0.807* (0.41)   | -0.823* (0.40)    | -0.846* (0.38)    | -0.810* (0.38)
Education x Age  |                   | 0.178** (0.06)   | 0.152** (0.06)    | 0.161** (0.05)    | 0.160** (0.05)
Education x Age² |                   | -0.002** (0.0005) | -0.002** (0.0005) | -0.002** (0.0005) | -0.002** (0.0005)
Comedy x Age     |                   |                  |                   |                   | -0.702~ (0.36)
Comedy x Age²    |                   |                  |                   |                   | 0.008* (0.004)
Turnout          |                   |                  |                   | 0.177** (0.05)    | 0.174** (0.05)
Engagement       |                   |                  |                   | 0.233*** (0.03)   | 0.234*** (0.03)
Log2(TotalNews)  |                   |                  |                   | 3.574*** (0.34)   | 3.583*** (0.34)
R²               | 0.0328            | 0.3231           | 0.3497            | 0.4325            | 0.4344
Cell entries are parameter estimates with standard errors in parentheses; blank cells indicate predictors not included in that model.
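One common way to compare adjacent nested models in a taxonomy like Table 3 is the increment-to-R² F-test. The sketch below is an illustration under stated assumptions, not a calculation reported on the slides: the function name `delta_r2_f` is hypothetical, the predictor counts (11 for Model D, 13 for Model E) are read off the rows of Table 3, and the R² values are the four-decimal figures from the table, so the result for small increments is sensitive to rounding.

```python
def delta_r2_f(r2_reduced, r2_full, n, k_full, q):
    """F statistic for adding q predictors to reach a full model with k_full predictors.

    Numerator df = q; denominator df = n - k_full - 1.
    """
    numerator = (r2_full - r2_reduced) / q
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator


N = 1502  # sample size reported for Table 3

# Model C -> Model D adds Turnout, Engagement and Log2(TotalNews).
f_c_to_d = delta_r2_f(0.3497, 0.4325, n=N, k_full=11, q=3)

# Model D -> Model E adds ComedyxAge and ComedyxAge2.
f_d_to_e = delta_r2_f(0.4325, 0.4344, n=N, k_full=13, q=2)

if __name__ == "__main__":
    print(round(f_c_to_d, 1), round(f_d_to_e, 1))
```

The resulting F statistics would be referred to F distributions on (3, 1490) and (2, 1488) degrees of freedom respectively.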
Another example of model building: The Father Presence study
“A hierarchical linear regression analysis was conducted to determine the effects of fathers’ antisocial behavior and fathers’ presence on child antisocial behavior. Fathers’ antisocial behavior (r=.30, p<.001) and fathers’ presence (r=-.16, p<.001) were significantly correlated with child behavior problems.
At the first step, we asked whether fathers’ antisocial behavior and father presence independently predicted child behavior problems. The model was estimated as:
$CHILDASB = \beta_1 + \beta_2\,DADASB + \beta_3\,DADHOME$
Fathers’ antisocial behavior significantly predicted elevated levels of child antisocial behavior (slope = 0.32, p<0.001), but father presence did not when fathers’ antisocial behavior was controlled (slope = 1.80, p=.33).
At the second step, we asked whether the effect of father presence was moderated by fathers’ antisocial behavior. Thus, the interaction between fathers’ antisocial behavior and father presence was entered and the model was estimated as:
$CHILDASB = \beta_1 + \beta_2\,DADASB + \beta_3\,DADHOME + \beta_4\,DADASB*DADHOME$
The interaction was statistically significant (slope = .28, p<.001).
We conducted four additional analyses to test the robustness of the interaction between fathers’ antisocial behavior and father presence. First, we tested whether fathers’ antisocial behavior moderated the effect of father presence controlling for the presence of nonbiological father figures in the home. Second, we tested whether fathers’ antisocial behavior moderated the effect of father presence, controlling for maternal antisocial behavior. Third, we tested whether the interaction between fathers’ antisocial behavior and father presence predicted child behavior problems in the clinical range. Fourth, we tested whether fathers’ antisocial behavior moderated a more fine-grained measure of his involvement, such as his caretaking behavior.”
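One way to read "moderated by" in the second-step model of the Father Presence excerpt is to collect the terms that multiply DADHOME. The short rearrangement below is a worked sketch; the symbols β1 through β4 follow the usual regression notation and are an assumption here, since the excerpt reports only estimated slopes.

```latex
\begin{aligned}
CHILDASB &= \beta_1 + \beta_2\,DADASB + \beta_3\,DADHOME + \beta_4\,(DADASB \times DADHOME) \\
         &= (\beta_1 + \beta_2\,DADASB) + (\beta_3 + \beta_4\,DADASB)\,DADHOME
\end{aligned}
```

Read this way, the fitted effect of father presence is β3 + β4·DADASB, so it changes by β4 (the reported interaction slope of .28) for each one-unit difference in fathers' antisocial behavior.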
What’s the big takeaway from this unit?
• Be guided by the research questions
– Don’t go on fishing expeditions fitting all possible subsets and don’t rely on computers to select models for fitting
– No automated model selection routine can replace thoughtful model building strategies
– It’s wise to divide your predictors into substantive groupings and use those groupings to guide the analysis
• There is no single “right answer” or “right model”
– Different researchers may make different analytic decisions; hopefully, substantive findings about question predictors won’t change (but they can)
– Different researchers will choose to make different decisions about what information to present in a paper; hopefully, regardless of approach, there will be sufficient information to judge the soundness of the conclusions
• You can do data analysis!
– Think back to the beginning of the semester; you’ve all come a long way
– You can judge the soundness of a research presentation; don’t believe everything you read and be sure to read the methods section
– No matter how much you learn about data analysis, there’s always more to learn!
The S-052 Roadmap (Courtesy of John B. Willett)
Multiple Regression Analysis: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$
• Do your residuals meet the required assumptions?
– Test for residual normality
– Use influence statistics to detect atypical datapoints
• Are the data longitudinal?
– Use individual growth modeling
• If your residuals are not independent, replace OLS by GLS regression analysis
– Specify a multilevel model
• If time is a predictor, you need discrete-time survival analysis…
• If your outcome is categorical, you need to use…
– Discriminant analysis
– Multinomial logistic regression analysis (polychotomous outcome)
– Binomial logistic regression analysis (dichotomous outcome)
• If you have more predictors than you can deal with,
– Create taxonomies of fitted models and compare them
– Conduct a principal components analysis
– Form composites of the indicators of any common construct
– Use cluster analysis
• If your outcome vs. predictor relationship is non-linear,
– Transform the outcome or predictor
– Use non-linear regression analysis
Epilogue: Reflections on the semester
Course Goal: For you to learn how to use statistical methods to address RQs
• Solid foundation in regression modeling
• Solid understanding of assumptions
• Appreciation for the model’s flexibility
• Learn how to link methods and substance: how to think like an empirical researcher
• Learn how to communicate quantitative findings
Looking forward: You’ll be able to use statistical methods and evaluate the work of others
• You’ll never skip a methods section again
• You’ll never just believe someone’s findings without evaluating their methodology
• You’ll start thinking about how statistical methods might be used within your substantive arena to address important RQs
Thanks and Congratulations!
You, in September
You, now!
Hal Varian [Google’s chief economist] likes to say that the sexy job in the next ten years will be statisticians. After all, who would have guessed that computer engineers would be the cool job of the 90s? When every business has free and ubiquitous data, the ability to understand it and extract value from it becomes the complimentary scarce factor. It leads to intelligence, and the intelligent business is the successful business, regardless of its size. Data is the sword of the 21st century, those who wield it well, the Samurai.
--Jonathan Rosenberg (Googleblog)
S030