Sample analysis using the ICCS data: An application of HLM
ICCS 2009 IDB Seminar – Nov 24-26, 2010 – IEA DPC, Hamburg, Germany
Daniel Caro
November 25
2
Purpose
Illustrate the use of hierarchical linear models (HLM) with ICCS 2009 data through the evaluation of specific hypotheses
3
Table of contents
HLM theory
Applied research example
HLM data importing/estimation settings
Hypothesis testing
4
Data structure
Often participants of studies are nested within specific contexts
  Patients are treated in hospitals
  Firms operate within countries
  Families live in neighborhoods
  Students learn in classes within schools
Data stemming from such research designs have a multilevel or hierarchical structure
5
Implications of research design
Observations are not independent within classes/schools
Students within schools tend to share similar characteristics (e.g., socioeconomic background and instructional setting)
Traditional linear regression (OLS) assumes:
  Correlation(ei, ej) = 0, i.e., the differences between observed and predicted Y (the residuals) are uncorrelated
Ignoring dependence of observations may lead to wrong conclusions
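To make the dependence concrete, here is a short sketch that assumes a two-level random-intercept model of the kind introduced later in the slides. The symbols \tau_{00} (between-school variance) and \sigma^2 (within-school variance) are notation assumed for this illustration and do not appear on the slide.

```latex
% Assumed random-intercept model: student i in school j
Y_{ij} = \gamma_{00} + u_{0j} + r_{ij}, \qquad
\operatorname{Var}(u_{0j}) = \tau_{00}, \quad \operatorname{Var}(r_{ij}) = \sigma^2

% Two different students i and i' in the same school j share u_{0j}, so
\operatorname{Corr}(Y_{ij}, Y_{i'j}) = \frac{\tau_{00}}{\tau_{00} + \sigma^2} \neq 0
\quad \text{whenever } \tau_{00} > 0
```

Under this model the OLS assumption of uncorrelated errors fails as soon as schools differ in their average outcome.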
6
Intra-class correlation coefficient
The intra-class correlation coefficient (ICC) measures the degree of data dependence
It is equal to the proportion of the total variance that lies between schools, i.e., ICC = b / (b + w)
where b is the variance between schools and w is the variance within schools (i.e., between students)
If ICC = 0, responses of students within schools are uncorrelated
If ICC = 1, responses within schools are identical
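The formula can be sketched in a few lines of Python. The function name and the variance components below are made up for illustration; only the formula ICC = b / (b + w) comes from the slide.

```python
def intraclass_correlation(between_var: float, within_var: float) -> float:
    """Return ICC = b / (b + w) for between (b) and within (w) variance components."""
    if between_var < 0 or within_var < 0:
        raise ValueError("variance components must be non-negative")
    total = between_var + within_var
    if total == 0:
        raise ValueError("total variance is zero; the ICC is undefined")
    return between_var / total


# Illustrative (made-up) variance components
print(intraclass_correlation(0.0, 50.0))   # no variance between schools
print(intraclass_correlation(25.0, 75.0))  # a quarter of the variance lies between schools
print(intraclass_correlation(10.0, 0.0))   # all variance lies between schools
```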
7
Effective sample size
A higher ICC value indicates greater dependence among observations within schools
Effective sample size is smaller than observed sample size
Effective n = mk / (1 + ICC*(m-1))
where n = sample size, m = number of students per school, and k = number of schools
If ICC = 1, effective n is equal to the number of schools (k)
If ICC = 0, effective n is equal to the observed n (i.e., mk)
In general, effective n lies between k and mk
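A minimal sketch of the effective sample size formula, assuming every school contributes the same number of students m. The function name and the design (25 students in each of 100 schools) are hypothetical.

```python
def effective_sample_size(m: float, k: int, icc: float) -> float:
    """Effective n = m*k / (1 + icc*(m - 1)) for k schools with m students each."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    if m < 1 or k < 1:
        raise ValueError("m and k must be at least 1")
    design_effect = 1.0 + icc * (m - 1)  # how much clustering inflates the variance
    return m * k / design_effect


# Hypothetical design: 100 schools with 25 students each (observed n = 2500)
for icc in (0.0, 0.1, 1.0):
    print(icc, round(effective_sample_size(m=25, k=100, icc=icc), 1))
```

The two boundary cases reproduce the statements above: with ICC = 0 the function returns mk, and with ICC = 1 it returns k.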
8
Limitations of OLS
OLS ignores the ICC and computes standard errors based on the observed n
But effective n is smaller than observed n when observations are correlated
The standard error is inversely proportional to the square root of n
Thus, OLS tends to underestimate the standard error
Underestimated standard errors can lead to incorrect significance tests and inferences
The JRR (jackknife repeated replication) method produces correct standard errors under a multilevel research design
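As a rough sketch of how a replicate-based standard error is assembled, the snippet below assumes the JRR formula commonly used in IEA studies, in which the sampling variance is the sum of squared deviations of the replicate estimates from the full-sample estimate. The function name and all numbers are hypothetical; in practice each replicate estimate comes from re-running the analysis with one set of jackknife replicate weights.

```python
import math


def jrr_standard_error(full_estimate: float, replicate_estimates: list[float]) -> float:
    """Square root of the summed squared deviations of replicate estimates
    from the full-sample estimate (assumed JRR variance formula)."""
    variance = sum((rep - full_estimate) ** 2 for rep in replicate_estimates)
    return math.sqrt(variance)


# Made-up full-sample estimate and four replicate estimates (real designs use many more)
full = 50.0
replicates = [50.3, 49.6, 50.0, 50.4]
print(jrr_standard_error(full, replicates))
```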
9
Hierarchical linear models
Additionally, hierarchical linear models distinguish effects between and within clusters/schools
For example, they enable evaluating:
  The effect of SES on student achievement within schools and between schools
  The effect of school location (urban/rural) on the average achievement between schools
10
Hierarchical linear models
Account explicitly for the multilevel nature of the data with the introduction of random effects
Take the ICC into account when calculating standard errors, tests, and p-values
Decompose variance within and between schools
  Student level variables explain variance within schools (i.e., between students)
  School level variables explain variance between schools
A single R-squared cannot be reported
  Instead, there is one for each level
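One common convention for the level-specific R-squared is the proportional reduction in each variance component relative to the null model (as described by Raudenbush & Bryk, 2002). The slide does not say which definition it uses, so treat the sketch below as an assumption; the function name and the variance components are made up.

```python
def variance_explained(null_var: float, model_var: float) -> float:
    """Proportional reduction in a variance component relative to the null model."""
    if null_var <= 0:
        raise ValueError("the null-model variance must be positive")
    return (null_var - model_var) / null_var


# Made-up variance components from a null model and from a model with predictors
null_model = {"between": 20.0, "within": 80.0}
fitted_model = {"between": 12.0, "within": 72.0}

r_squared = {
    level: variance_explained(null_model[level], fitted_model[level])
    for level in null_model
}
print(r_squared)  # one value per level, not a single overall R-squared
```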
11
Hierarchical linear models
Estimate regressions within schools
  Provide estimates of the intercept and coefficients (e.g., gender gap, SES effect) for each school
Level 1 (student) coefficients can be treated as dependent variables that depend on level 2 (school) characteristics
For example, the gender gap at the student level (i.e., the gender coefficient) may vary between classes depending on the gender of the class teacher, a level 2 characteristic
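The teacher-gender example can be written as a two-level model with a cross-level interaction. This is a sketch in standard HLM notation; GIRL is the student-level indicator used later in the slides, while TFEMALE (a class-level indicator for a female teacher) is a hypothetical variable name introduced only for illustration.

```latex
% Level 1 (student i in class j)
Y_{ij} = \beta_{0j} + \beta_{1j}\,\mathrm{GIRL}_{ij} + r_{ij}

% Level 2 (class j): the intercept and the gender gap are outcomes
\beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{TFEMALE}_{j} + u_{0j}
\beta_{1j} = \gamma_{10} + \gamma_{11}\,\mathrm{TFEMALE}_{j} + u_{1j}
```

Here \gamma_{11} captures how much the gender gap differs between classes taught by female and male teachers.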
12
Table of contents
HLM theory
Applied research example
HLM data importing/estimation settings
Hypothesis testing
13
Research goal
Evaluate 10 hypotheses related to the attitudes of students towards equal rights for immigrants
The literature underscores the importance of:
  Family SES, participation in diverse networks, intergroup discussion about civic issues, gender, social dominance orientation, civic knowledge, religious beliefs, the school location (urban/rural), the school climate
  References in ‘C:\ICCS2009\HLM training\References.pdf’
For each hypothesis:
  Theory and independent variables
14
Related data and variables
Selected country
  England
The analysis is restricted to international scales/variables
A description of the dependent and independent variables, their type, coding scheme, and source is in
  C:\ICCS2009\HLM training\List of variables.pdf
The student (england1.sav) and class level (england2.sav) datasets are in
  C:\ICCS2009\HLM training\Data
15
Data structure
Students (level 1 units) are nested in classes (level 2 units)
The ICCS sample design yields an optimal sample of students within classes, not an optimal sample of students within schools
Usually one class was selected within each school, rather than students across different grades
16
NOTE
This is a didactic example only. You will not be able to readily repeat this analysis during the presentation
17
Table of contents
HLM theory
Applied research example
HLM data importing/estimation settings
Hypothesis testing
18
HLM software
HLM estimates different types of hierarchical linear models
The applied example is for two-level models (students nested in classes)
Several steps are required to estimate a model:
  Creating the data specifications file (.mdmt)
  Importing data to HLM (.mdm)
  Deciding on settings (e.g., weights, plausible values)
  Specifying the model (.hlm)
  Estimating the model
19
Beginning with HLM
20
Data specifications (.mdmt)
21
Selecting student level data
22
Missing data
HLM accepts multiply imputed datasets
  The multiple imputation (MI) procedure is performed in other software
  Consult, for example, NORM, PAN, or MICE in Stata and R
Since missing data are normally not completely at random, it is recommended to conduct MI before model estimation
  But for this example we will use available data only
HLM offers two options at level 1
  Listwise deletion (when making the mdm): the sample is the same for all models
  Pairwise deletion (when running the analysis): the sample depends on the included variables
Missing values at level 2 substantially reduce the sample size
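The difference between the two level 1 options can be illustrated with a toy example. This is not how HLM implements deletion; it is a plain-Python sketch with made-up student records and a hypothetical helper, showing why the sample is fixed in one case and model-dependent in the other.

```python
# Made-up student records; None marks a missing value
students = [
    {"id": 1, "IMMRGHT": 52.0, "HISCED": 3, "HISEI": 45},
    {"id": 2, "IMMRGHT": 47.5, "HISCED": None, "HISEI": 60},
    {"id": 3, "IMMRGHT": None, "HISCED": 2, "HISEI": 38},
    {"id": 4, "IMMRGHT": 55.1, "HISCED": 4, "HISEI": None},
]


def complete_cases(rows, variables):
    """Keep only the rows with no missing value on the given variables."""
    return [row for row in rows if all(row[v] is not None for v in variables)]


# Deletion when making the data file: every variable in the file must be observed
at_import = complete_cases(students, ["IMMRGHT", "HISCED", "HISEI"])

# Deletion when running the analysis: only the variables in this model matter
at_analysis = complete_cases(students, ["IMMRGHT", "HISCED"])

print([row["id"] for row in at_import])
print([row["id"] for row in at_analysis])
```

A model that uses only IMMRGHT and HISCED keeps student 4 under the second option, whereas the first option drops that student from every model because HISEI is missing.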
23
Selecting class level data
24
Save data specifications (.mdmt)
25
Create data file (.mdm)
26
Check stats
27
Add dependent variable
28
Declare weights
29
Save null model
30
Run null model
31
View output
32
Interpret and save
Folder: ‘C:\ICCS2009\HLM training\Models\model0.txt’
Class variance = 12.14; Student variance = 103.99
ICC = 12.14 / (12.14 + 103.99) ≈ 0.10
About 10% of the differences occur between classes
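For reference, the null model behind these variance components can be written as follows. This is a sketch in standard HLM notation; the symbols \tau_{00} and \sigma^2 are assumed here and are not shown on the slide.

```latex
% Null (unconditional) two-level model: student i in class j
Y_{ij} = \beta_{0j} + r_{ij}, \qquad r_{ij} \sim N(0, \sigma^2)
\beta_{0j} = \gamma_{00} + u_{0j}, \qquad u_{0j} \sim N(0, \tau_{00})

% ICC from the reported variance components
\mathrm{ICC} = \frac{\tau_{00}}{\tau_{00} + \sigma^2}
             = \frac{12.14}{12.14 + 103.99} \approx 0.105
```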
33
Table of contents
HLM theory
Applied research example
HLM data importing/estimation settings
Hypothesis testing
34
Hypotheses
1. The SES Hypothesis
2. The Contact Hypothesis
3. The Intergroup Discussion Hypothesis
4. The Gender Hypothesis
5. The Social Dominance Orientation Hypothesis
6. The Learning Hypothesis
7. The Religion Belief Hypothesis
8. The National Identity Hypothesis
9. The Urban/Rural Differences Hypothesis
10. The School Climate Hypothesis
35
The SES Hypothesis
The SES hypothesis predicts more positive views of minorities among students of higher SES families than among students of lower SES families
  Competition among low-SES groups
  High-SES families travel and encounter culturally diverse realities
Independent variables
  Parental education (HISCED)
  Parental occupational status (HISEI)
36
The SES Hypothesis
37
Centering of Xs
The intercept is the expected value of Y when Xs are zero
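$E(Y_{ij} \mid \text{Xs} = 0) = E(\beta_{0j}) + \beta_{1j} \cdot 0 + \beta_{2j} \cdot 0 + \dots + \beta_{kj} \cdot 0 + E(r_{ij})$
With $\beta_{0j} = \gamma_{00} + u_{0j}$, and since $E(r_{ij})$ and $E(u_{0j})$ are zero, $\gamma_{00} = E(Y_{ij} \mid \text{Xs} = 0)$
But sometimes zero is not in the range of the Xs, e.g., if X is age, an achievement score, etc.
  In that case, the intercept is not interpretable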
By centering the Xs, the intercept can be interpreted as the expected value of Y at the centering value(s) of Xs
38
Centering of Xs
Two options at level 1
  Grand mean and group (class) mean centering
The type of centering depends on the research interest (Enders & Tofighi, 2007; Raudenbush & Bryk, 2002)
Group mean centering is appropriate for unadjusted or pure within and between school effects
Grand mean centering yields school effects adjusted for student characteristics and is preferable for contextual effects
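The two centering options can be sketched with a toy dataset. The class labels and SES values below are made up, and the code only illustrates the arithmetic; in HLM the centering is selected when the model is specified.

```python
from statistics import mean

# Made-up SES values for students in two classes
ses_by_class = {
    "class_A": [2.0, 3.0, 4.0],
    "class_B": [4.0, 5.0, 6.0],
}

# Grand mean centering: subtract the overall mean from every student
grand_mean = mean(x for values in ses_by_class.values() for x in values)
grand_centered = {
    name: [x - grand_mean for x in values] for name, values in ses_by_class.items()
}

# Group mean centering: subtract each class's own mean from its students
group_centered = {
    name: [x - mean(values) for x in values] for name, values in ses_by_class.items()
}

print(grand_centered)
print(group_centered)
```

After group mean centering every class has a mean of zero on the predictor, so differences between classes are removed from the level 1 variable; grand mean centering keeps them.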
39
The SES Hypothesis
40
The SES Hypothesis
The hypothesis is supported by the parental education data
Effect size? (see stats and model estimates)
For a 1 SD increment in HISCED, IMMRGHT increases by 0.67 points (1.04*0.64), that is, about 6 percent (0.67/10.75) of a SD in IMMRGHT
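The effect size arithmetic can be reproduced with a small helper. The slide does not state which of 1.04 and 0.64 is the HISCED coefficient and which is its standard deviation, so the assignment below is an assumption (the product is the same either way); the function name is hypothetical.

```python
def effect_in_sd_of_y(coefficient: float, sd_x: float, sd_y: float) -> tuple[float, float]:
    """Change in Y for a 1 SD increase in X, in raw points and as a fraction of SD(Y)."""
    change_in_y = coefficient * sd_x
    return change_in_y, change_in_y / sd_y


# Assumption: 0.64 is the coefficient and 1.04 is the SD of HISCED; 10.75 is the SD of IMMRGHT
change, fraction = effect_in_sd_of_y(coefficient=0.64, sd_x=1.04, sd_y=10.75)
print(round(change, 2), round(fraction, 2))
```

The same helper applies to the other hypotheses, where effects are also reported as a percentage of a SD in IMMRGHT.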
41
The Contact Hypothesis
The contact hypothesis anticipates greater tolerance among students participating in diversified and extended social networks (Allport, 1954; Côté & Erickson, 2009)
Independent variables
  Students' civic participation in the wider community (PARTCOM)
  Students' civic participation at school (PARTSCHL)
Control for SES
  Higher-SES individuals have more diversified social networks (Erickson, 2004) and are more active in voluntary associations (Curtis & Grabb, 1992)
42
The Contact Hypothesis
43
The Contact Hypothesis
The hypothesis holds in England
Both students' civic participation in the wider community (PARTCOM) and students' civic participation at school (PARTSCHL) are positively related to the attitudes toward immigrants
For a 1 SD increment in the independent variables, the associated positive change in IMMRGHT amounts to
  7 percent of a SD in IMMRGHT for PARTCOM
  11 percent of a SD in IMMRGHT for PARTSCHL
44
The Intergroup Discussion Hypothesis
The intergroup discussion hypothesis posits that more positive attitudes toward minorities develop from dialogue on social and civic issues inside and outside the school (Dessel, 2010a)
Independent variables
  Students' discussion of political and social issues outside of school (POLDISC)
  Student perceptions of openness in classroom discussions (OPDISC)
Control variables
  Parental education (HISCED)
45
The Intergroup Discussion Hypothesis
46
The Intergroup Discussion Hypothesis
The hypothesis is supported by the data
Both students' discussion of political and social issues outside of school (POLDISC) and student perceptions of openness in classroom discussions (OPDISC) are positively related to IMMRGHT
For a 1 SD increment in the independent variables, the associated positive change in IMMRGHT amounts to
  9 percent of a SD in IMMRGHT for POLDISC
  18 percent of a SD in IMMRGHT for OPDISC
47
The Gender Hypothesis
The gender hypothesis predicts greater tolerance among girls than boys. Women tend to be more liberal, nurturing, and social than men and are also expected to be more tolerant (Côté & Erickson, 2009; Gidengil, Blais, Nadeau, & Nevitte, 2003)
Independent variable
  The student’s sex (GIRL)
48
The Gender Hypothesis
49
The Gender Hypothesis
The gender hypothesis holds in England
Differences between girls and boys amount to 2.24 score points in the IMMRGHT scale, that is, 21 percent of a SD in IMMRGHT
50
The Social Dominance Orientation Hypothesis
The social dominance orientation (SDO) hypothesis states that gender differences are partly explained by differences in support for social inequality (Mata, Ghavami, & Wittig, 2010)
Independent variables
  Female (GIRL)
  Students' support for democratic values (DEMVAL)
  Students' attitudes towards gender equality (GENEQL)
  Students' attitudes towards equal rights for all ethnic/racial groups (ETHRGHT)
51
The Social Dominance Orientation Hypothesis
52
The Social Dominance Orientation Hypothesis
The hypothesis is supported by the data
When proxies for social dominance orientation are included, gender differences are no longer significant
53
The Learning Hypothesis
The learning hypothesis predicts greater tolerance when individuals know more about minorities and civic issues in general (Côté & Erickson, 2009)
Independent variables
  Civic knowledge (PV1CIV)
Control for participation (Curtis & Grabb, 1992; Erickson, 2004)
  Students' civic participation in the wider community (PARTCOM)
  Students' civic participation at school (PARTSCHL)
54
The Learning Hypothesis
55
The Learning Hypothesis
The learning hypothesis holds in England
Students showing higher knowledge of civic issues also have more positive attitudes toward immigrants, even when civic participation is controlled
A 1 SD increment in PV1CIV is associated with an increase in IMMRGHT of about 22 percent of a SD
56
The Religion Belief Hypothesis
The religion belief hypothesis anticipates an association between holding religious beliefs and tolerance toward minorities (Hall, Matz, & Wood, 2010; Schwartz & Huismans, 1995)
The direction of the association is not clear
  Negative for values of social conformity, tradition, conventionalism, and an authoritarian belief system
  Positive for humanitarianism, values of benevolence toward others, and a search for spiritual meaning
Independent variables
  Students' belonging to a religion (RELIG)
  Students' attitudes towards the influence of religion on society (RELINF)
Control variables
  Parental education (HISCED)
57
The Religion Belief Hypothesis
58
The Religion Belief Hypothesis
The hypothesis is not supported by the data
The RELIG coefficient is non-significant
The RELINF coefficient is positive and significant, suggesting that students who attach greater value to the influence of religion in society also hold more positive attitudes toward immigrants
But the association with RELINF alone does not evaluate the hypothesis
59
The National Identity Hypothesis
The national identity hypothesis maintains that individuals are less tolerant of immigrants when they have a greater sense of national identity
Independent variables
  Students' attitudes towards their country (ATTCNT)
Control variables
  Parental education (HISCED)
60
The National Identity Hypothesis
61
The National Identity Hypothesis
The hypothesis is not supported by the data
62
The Urban/Rural Differences Hypothesis
The urban/rural hypothesis anticipates more positive views of minorities in urban areas than in rural areas (Côté & Erickson, 2009) due to greater opportunities to meet socially and culturally diverse people in cities (Erickson, 2004)
Independent variable
  School location (RURAL)
Control variables
  School level SES
    School mean parental education (MHISCED)
    School mean parental occupational status (MHISEI)
  Availability of resources in local community (RESCOM)
63
The Urban/Rural Differences Hypothesis
64
The Urban/Rural Differences Hypothesis
The hypothesis is not supported by the data
The RURAL coefficient is non-significant
65
The School Climate Hypothesis
The school climate hypothesis states that a safe and positive school climate favors more positive attitudes toward minorities. Such a climate helps reduce the anxiety and threat underlying anti-minority attitudes (Comerford, 2003; Dessel, 2010b; Moradi et al., 2006)
Independent variables
  Teachers' perceptions of classroom climate (TCLCLIM)
  Teachers' perceptions of social problems at school (TSCPROB)
Controls
  School average parental education (HISCED)
  Availability of resources in local community (RESCOM)
66
The School Climate Hypothesis
67
The School Climate Hypothesis
The hypothesis cannot be supported by the data