Who are the Nonrespondents? An Analysis Based on a New Subsample of the German Socio-Economic Panel...

Who are the Nonrespondents?

An Analysis Based on a New Subsample of the German Socio-Economic Panel (SOEP)

including Microgeographic Characteristics and Survey-Based Interviewer Characteristics

Jörg-Peter Schräpler (1, 2), Jürgen Schupp (2, 3) and Gert G. Wagner (2, 4)

(1) Ruhr-University Bochum, LDS NRW; (2) DIW Berlin; (3) FU Berlin; (4) Berlin University of Technology

Q 2008 Conference, Rome, 8-11 July 2008

2

Outline

• Introduction
  - Reasons for Unit Nonresponse
  - Nonresponse in Sample H
• Descriptive Analysis
  - Microgeographic data
  - Interviewer data
• Multilevel Analysis
• Consequences
• Summary and Conclusion

3

Introduction

Unit nonresponse is one of the most important issues in empirical social science.

Danger of selectivity: nonresponse leads to biased samples, so the realized samples are no longer random.

It is therefore important to investigate how the realized sample differs from the intended sample and to look at the consequences.

Main reasons for nonresponse:
• Problem of non-accessibility
• Problem of non-ability
• Problem of refusals

4

Reasons for Nonresponse, 1st level: Accessibility

Nonresponse results from the impossibility of contacting household members. It can be seen as a function of (Groves/Couper 1998):
• the physical reachability of the household
• the circadian rhythm of the household members
• the contact strategies of the interviewers

Problem: the causes often can't be measured directly.

Some empirical findings:
• Socio-economic status, household size, vocational status and age are important for mobility (cf. Goyder 1987, Schneekloth/Leven 2003, Koch 1997, Schräpler 2000).
• Interviewers with a higher workload have less nonresponse due to non-reachability (cf. Schräpler 2000).

5

Reasons for Nonresponse, 2nd level: Ability

Unit nonresponse depends on the ability of the household member to participate.
• Individuals are ill and can't participate. Assumption: health problems increase with the age of the respondent (cf. Schneekloth/Leven 2003).
• Assumption: inability is sometimes an alibi and a "soft refusal".

6

Reasons for Nonresponse, 3rd level: Motivation/Cooperation

Cooperation depends on the respondents' assessment of the interview situation and their evaluation of the consequences of possible actions (rational choice (RC) theory).

• Opportunity costs: an interview takes time, so the survey has to serve a meaningful purpose.
• Privacy and confidentiality concerns: invasion of privacy (cf. Singer et al. 1993); critical distance and possible mistrust of surveys in more intellectual environments in Germany (Schneekloth/Leven 2003).
• Fear of crime: high population density areas, anonymous residential zones (cf. Schnell 1997, Koch 1997, Goyder 1987, DeMaio 1980).
• Interviewer: the interviewer's age, gender, motivation, attitudes and experience (cf. Esser 1986, Loosveldt et al. 1998, Schräpler 2006, 2004, 2000).

7

SOEP - Sample H - Fieldwork

Subsample H of the SOEP started in 2006.

From 6,000 household addresses (4 per sample point), a total of 3,931 household addresses were recorded by random walk.

The process of address recording is separated from the interviewing process: the interviewer receives fixed addresses from the fieldwork organization.

The first wave was launched with 234 interviewers. Of these, 143 were already members of the SOEP staff.

All interviews were carried out by CAPI (computer-assisted personal interviewing).

8

Nonresponse in Sample H

Reasons for Nonresponse in Sample H                 N       %
Gross Sample                                     3931     100
./. Non-systematic Drop-Outs
  Household not detectable                        169    4.30
  At the moment not feasible                       12    0.31
Adjusted Gross Sample                            3750     100
./. Systematic Drop-Outs
  Not accessible                                  485   12.93
  Refusal                                        1487   39.65
  Not able to participate (e.g. nursing case)     172    4.59
  Whole Sample Point lost                          15    0.40
  Individual household without treatment           82    2.19
Analyzable Interviews                            1509   40.24

Source: SOEP 2006, Sample H, household level
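The arithmetic behind the nonresponse table can be retraced in a few lines. The sketch below uses only the counts given on the slide; the variable and function names are illustrative. It shows that the percentages of the non-systematic drop-outs refer to the gross sample, while the remaining percentages (including the response rate) refer to the adjusted gross sample.

```python
# Counts from the "Reasons for Nonresponse in Sample H" table (household level).
gross_sample = 3931
non_systematic = {"household not detectable": 169, "at the moment not feasible": 12}
systematic = {
    "not accessible": 485,
    "refusal": 1487,
    "not able to participate": 172,
    "whole sample point lost": 15,
    "individual household without treatment": 82,
}

# Non-systematic drop-outs are removed first to get the adjusted gross sample.
adjusted_gross = gross_sample - sum(non_systematic.values())
# What remains after the systematic drop-outs are the analyzable interviews.
interviews = adjusted_gross - sum(systematic.values())


def percent(count, base):
    """Share of `count` in `base`, in percent, rounded to two decimals."""
    return round(100 * count / base, 2)


response_rate = percent(interviews, adjusted_gross)
refusal_rate = percent(systematic["refusal"], adjusted_gross)
print(adjusted_gross, interviews, response_rate, refusal_rate)
```

Refusals alone account for almost as many households as the completed interviews, which is why the later models treat refusal as a separate outcome.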

9

Nonresponse Analyses – Information gap

Serious problem for nonresponse analysis: information gap on respondents and nonrespondents.

To fill the gap we use:
• commercial microgeographic data on the households' immediate neighbourhood
• demographic variables of the interviewers
• results of an interviewer questionnaire

10

Microgeographic Information

Use of additional commercial microgeographic data on the households' immediate neighbourhoods from the MOSAIC data system:
• contains more than 75 individual characteristics used to analyse and describe customer databases or markets, for instance Sinus Milieus®, status, removal volume etc.
• information is available at the address level and covers 17.8 million buildings in Germany
• the building level contains seven or eight households on average (at least five households)

Important: the linked information is not necessarily in line with the reality of the particular household (it is only an approximation for the neighbourhood).

11

Interviewer data

Use of interviewer data from the SOEP interviewer data set:
• mainly demographic variables like gender, age, education, family status etc.

Use of a dataset based on an interviewer questionnaire:
• mainly personality variables and self-assessments
• filled out by 165 of the 234 SOEP interviewers in Sample H

12

Respondents by Sinus Milieus (N=1,449)

[Figure: relative bias in % by Sinus Milieu. Reference: milieu distribution for addresses. Legend classes: < -50; > -50 to -30; > -30 to -10; > -10 to +10; > +10 to +30; > +30 to +50; > +50]
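The slides do not spell out how the relative bias is computed. A minimal sketch, assuming the common definition (percentage deviation of a milieu's share among respondents from its share among all sampled addresses), could look as follows. The function names and the example shares are hypothetical, and the treatment of a value of exactly -50 is an assumption because the legend leaves it open.

```python
from bisect import bisect_left

# Class boundaries and labels following the legend of the milieu charts.
BOUNDS = [-50, -30, -10, 10, 30, 50]
LABELS = [
    "< -50", "> -50 to -30", "> -30 to -10", "> -10 to +10",
    "> +10 to +30", "> +30 to +50", "> +50",
]


def relative_bias(share_group, share_reference):
    """Assumed definition: percentage deviation from the reference share."""
    return 100.0 * (share_group - share_reference) / share_reference


def legend_class(bias):
    """Map a relative bias to its legend class (upper bounds are inclusive)."""
    return LABELS[bisect_left(BOUNDS, bias)]


# Hypothetical shares: a milieu makes up 10 % of all addresses
# but 12 % of the respondents.
bias = relative_bias(0.12, 0.10)
print(round(bias, 1), legend_class(bias))
```

A positive value means the milieu is over-represented in the group compared with the address sample, a negative value that it is under-represented.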

13

Refusals by Sinus Milieus (N=1,435)

[Figure: relative bias in % by Sinus Milieu. Reference: milieu distribution for addresses. Legend classes: < -50; > -50 to -30; > -30 to -10; > -10 to +10; > +10 to +30; > +30 to +50; > +50]

14

Noncontact by Sinus Milieus (N=470)

[Figure: relative bias in % by Sinus Milieu. Reference: milieu distribution for addresses. Legend classes: < -50; > -50 to -30; > -30 to -10; > -10 to +10; > +10 to +30; > +30 to +50; > +50]

15

„Not Able to Participate“ by Sinus Milieus (N=167)

[Figure: relative bias in % by Sinus Milieu. Reference: milieu distribution for addresses. Legend classes: < -50; > -50 to -30; > -30 to -10; > -10 to +10; > +10 to +30; > +30 to +50; > +50]

16

Four Multilevel Logit Models

• Model 1: probability for response variable "interview" (participation) vs. non-response
• Model 2: probability for response variable "refuse to participate" vs. "participate"
• Model 3: probability for response variable "household not reachable" vs. "participate"
• Model 4: probability for response variable "household not able to participate" vs. "participate"

Two sets of predictors:
1. Model version A with demographic and household variables for the potential respondents, microgeographic variables and demographic variables for the interviewer
2. Model version B with additional interviewer variables from the interviewer questionnaire
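The four response variables can be illustrated with a small sketch. It assumes each household has one final disposition. The labels "interview", "refusal", "noncontact" and "not_able" are hypothetical stand-ins, since the slides do not give the actual SOEP codes. Model 1 uses all cases, while Models 2 to 4 compare one type of nonresponse with the participants only.

```python
# Hypothetical disposition labels; the slides do not give the actual SOEP codes.
NONRESPONSE_TYPE = {2: "refusal", 3: "noncontact", 4: "not_able"}


def outcome(disposition, model):
    """Return the 0/1 response variable for a model, or None if the case is not used."""
    if model == 1:
        # Model 1: participation (1) vs. any kind of unit nonresponse (0).
        return 1 if disposition == "interview" else 0
    target = NONRESPONSE_TYPE[model]
    if disposition == target:
        return 1  # the specific nonresponse type being modelled
    if disposition == "interview":
        return 0  # participants are the comparison group
    return None  # other nonresponse types are excluded from this model


cases = ["interview", "refusal", "noncontact", "not_able", "interview"]
for model in (1, 2, 3, 4):
    print(model, [outcome(d, model) for d in cases])
```

Because Models 2 to 4 drop the other nonresponse types, each of them is estimated on a different and smaller set of households than Model 1.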

17

Two-level Logit Models

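Response variable for Model 1 (participation):

\[ y_{ij} = \begin{cases} 1, & \text{if } y^{*}_{ij} > 0 \quad \text{(participation)} \\ 0, & \text{otherwise} \end{cases} \]

Response variable for Models 2 to 4 (unit nonresponse: refuse, nocontact, not able):

\[ y_{ij} = \begin{cases} 1, & \text{if } y^{*}_{ij} > 0 \quad \text{(unit nonresponse)} \\ 0, & \text{otherwise} \end{cases} \]

\[ y_{ij} = \pi_{ij} + u_{ij} \]

\[ \pi_{ij} = \left[ 1 + \exp\left( -\left( \beta_{0} + \sum_{h=1}^{H} \beta_{h} x_{h,ij} + v_{0j} \right) \right) \right]^{-1} \]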

Random-Intercept Model:

Level 1: respondents, Level 2: interviewers
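As an illustration of the random-intercept idea, the sketch below computes the probability for one household from a linear predictor plus an interviewer-specific intercept shift. The function name and all numeric values are hypothetical and are not estimates from the slides. The only point is that the same household characteristics lead to different probabilities under different interviewers.

```python
import math


def response_probability(beta0, betas, x, v0j):
    """Inverse logit of the linear predictor plus the interviewer effect v0j."""
    eta = beta0 + sum(b * xi for b, xi in zip(betas, x)) + v0j
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical values, for illustration only (not estimates from the slides).
beta0, betas, x = -0.5, [0.8], [1.0]
for v0j in (-0.6, 0.0, 0.6):
    print(v0j, round(response_probability(beta0, betas, x, v0j), 3))
```

An interviewer with a positive intercept shift raises the probability of the modelled outcome for every household in their workload, and one with a negative shift lowers it.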

18

Version A: Multilevel logit estimates – age of the potential respondents

Variable                Participation vs.   Refused vs.         Nocontact vs.       Not Able vs.
                        Nonparticipation    Participation       Participation       Participation
                        Coeff. z-value      Coeff. z-value      Coeff. z-value      Coeff. z-value
Fixed effect
(Intercept)             -1.482   -2.13 *     0.360    0.47       0.279    0.22      -1.106   -0.58
Age <= 35 years (Ref.)
Age > 35 - 40 y.        -0.160   -0.78       0.221    0.99       0.297    0.77      -0.238   -0.38
Age > 40 - 45 y.        -0.292   -1.52       0.239    1.14       0.309    0.84       0.550    0.97
Age > 45 - 50 y.        -0.454   -2.39 *     0.519    2.52 *     0.437    1.18       0.181    0.32
Age > 50 - 55 y.        -0.019   -0.10       0.050    0.24       0.167    0.43       0.079    0.14
Age > 55 - 60 y.        -0.128   -0.64       0.102    0.47       0.561    1.46      -0.275   -0.45
Age > 60 - 65 y.        -0.294   -1.45       0.313    1.43       0.407    1.02       0.232    0.39
Age > 65 y.             -0.212   -1.00       0.251    1.10       0.306    0.71       0.315    0.52
...
(to be continued)
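To read the logit tables, it can help to translate a coefficient and its z-value into an odds ratio, a standard error and a p-value. The sketch below does this for one row of the table (age > 45 - 50 years, participation model). The two-sided normal p-value is an assumption; the slides do not state which test or which thresholds stand behind the significance markers.

```python
import math

# Age > 45 - 50 y., participation vs. nonparticipation (values from the table).
coef, z = -0.454, -2.39

odds_ratio = math.exp(coef)                 # multiplicative change in the odds
std_error = coef / z                        # because z = coefficient / standard error
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under a standard normal

print(round(odds_ratio, 3), round(std_error, 3), round(p_value, 4))
```

An odds ratio below 1 means that households in this age group have lower odds of participating than the reference group (age <= 35 years), all other predictors held constant.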

19

Version A: Multilevel logit estimates – Sinus Milieus for the potential respondents

Variable                  Participation vs.   Refused vs.         Nocontact vs.       Not Able vs.
                          Nonparticipation    Participation       Participation       Participation
                          Coeff. z-value      Coeff. z-value      Coeff. z-value      Coeff. z-value
Sinus Milieu
Well-Established (Ref.)
Post-Materialists         -0.194   -1.12       0.282    1.51       0.023    0.07       0.190    0.30
Modern Performers         -0.419   -2.06 *     0.499    2.25 *     0.247    0.68       0.764    1.20
Upper-Conservatives       -0.249   -1.20       0.200    0.89      -0.547   -1.30       1.986    3.27 ***
Traditionalists           -0.108   -0.62       0.146    0.78      -0.634   -1.83 +     1.483    2.60 **
Nostalgics of former GDR  -0.052   -0.24       0.064    0.26      -0.255   -0.63       0.359    0.49
New Middle Class          -0.288   -1.64 +     0.351    1.87 +     0.060    0.17       0.980    1.66 +
Materialists              -0.267   -1.36       0.336    1.58       0.099    0.27       0.219    0.32
Hedonists/Escapists       -0.222   -1.02       0.276    1.15       0.044    0.11       0.769    1.13
Experimentalists          -0.565   -2.24 *     0.508    1.84 +     0.635    1.47       0.695    0.80
...
(to be continued)

20

Version A: Multilevel logit estimates – Interviewer variables

Variable                          Participation vs.   Refused vs.         Nocontact vs.       Not Able vs.
                                  Nonparticipation    Participation       Participation       Participation
                                  Coeff. z-value      Coeff. z-value      Coeff. z-value      Coeff. z-value
...
purch. power per HH > 530 €       -0.329   -0.96       0.452    1.22      -0.097   -0.15       0.331    0.29
Status                             0.018    0.71      -0.017   -0.63      -0.029   -0.61       0.082    1.14
East-Germany                       0.075    0.37      -0.117   -0.54      -0.199   -0.50       0.601    1.04
Interviewer
sex (1 = men)                      0.074    0.54      -0.127   -0.87      -0.271   -1.02       0.599    1.69 +
age of interviewer                 0.002    0.27       0.000   -0.05      -0.002   -0.15      -0.023   -1.16
interviewer age < 40 & male       -0.508   -1.32       0.785    1.95 +     0.519    0.72      -0.981   -0.86
secondary modern school (Ref.)
secondary school                  -0.068   -0.40       0.037    0.20      -0.004   -0.01      -0.081   -0.18
high school diploma               -0.455   -1.67 +     0.401    1.39       0.618    1.22       0.597    0.89
university with and without deg.   0.008    0.04       0.049    0.23      -0.158   -0.41      -0.456   -0.87
Workload                           0.018    3.26 **   -0.019   -3.17 **   -0.018   -1.63 +    -0.024   -1.78 +
SOEP experience                    0.066    0.45      -0.210   -1.36       0.323    1.16      -0.059   -0.16
...
(to be continued)

21

Version A: Multilevel logit estimates – area description

Variable                                         Participation vs.   Refused vs.         Nocontact vs.       Not Able vs.
                                                 Nonparticipation    Participation       Participation       Participation
                                                 Coeff. z-value      Coeff. z-value      Coeff. z-value      Coeff. z-value
...
City                                             -0.442   -2.58 ***   0.407    2.22 *     0.834    2.79 **   -0.047   -0.10
simple urban row estate (Ref.)
good earning families, new privately owned home   1.030    2.16 *    -0.770   -1.48      -1.251   -1.29      -3.404   -2.31 *
old families in outskirts                         1.028    3.00 **   -1.002   -2.61 **   -0.724   -1.30      -1.841   -1.89 +
self-employed in new houses                       0.994    2.92 **   -0.842   -2.24 *    -1.382   -2.19 *    -2.015   -2.01 +
good new detached houses                          0.957    1.94 +    -0.847   -1.55      -1.487   -1.52      -0.488   -0.39
villages in outskirts                             0.842    2.44 *    -0.732   -1.92 +    -1.087   -1.71 +    -1.366   -1.47
old city centre                                   0.834    2.33 *    -0.605   -1.55      -1.398   -2.12 *    -2.139   -1.84 +
social climber, upscale professions, outskirts    0.771    2.00 *    -0.697   -1.63 +    -1.064   -1.67 +    -1.424   -1.36
dignified detached houses                         0.760    2.10 *    -0.638   -1.61      -0.812   -1.28      -2.289   -1.92 +
simple vocations in rural areas                   0.752    2.15 *    -0.494   -1.29      -1.118   -1.83 +    -2.364   -2.07 *
social housing, simple apartment buildings        0.654    2.05 *    -0.588   -1.64 +    -0.859   -1.69 +    -0.592   -0.69
low qualified worker                              0.643    1.71 +    -0.428   -1.05      -1.095   -1.60      -2.109   -1.88 +
middle class in older accommodations              0.629    2.08 *    -0.478   -1.42      -0.897   -1.84 +    -1.426   -1.61
younger villager                                  0.605    1.73 +    -0.234   -0.61      -1.856   -2.61 **   -2.185   -2.19 *
social hotspot                                    0.507    1.52      -0.416   -1.11      -1.117   -2.09 *     0.238    0.29
humble-borns in apartments                        0.297    0.91      -0.195   -0.54      -0.879   -1.68 +    -0.598   -0.68
attractive address in city                        0.022    0.06       0.021    0.05      -0.382   -0.70       0.919    1.05
old social housing                               -0.301   -0.72       0.462    1.03      -0.180   -0.30       1.103    1.05
...
(to be continued)

22

Version A: Multilevel logit estimates – size of houses and frequency of moves

Variable                                      Participation vs.   Refused vs.         Nocontact vs.       Not Able vs.
                                              Nonparticipation    Participation       Participation       Participation
                                              Coeff. z-value      Coeff. z-value      Coeff. z-value      Coeff. z-value
...
1-2 family houses in homog. street section (Ref.)
1-2 family houses in inhomog. street section  -0.167   -1.12       0.158    1.00       0.308    0.87      -0.033   -0.07
3-5 family houses                             -0.011   -0.06      -0.086   -0.45       0.519    1.35      -0.494   -0.88
6-9 family houses                              0.133    0.62      -0.078   -0.34       0.032    0.07      -0.976   -1.52
apartment buildings with 10 - 19 HH            0.290    1.11      -0.234   -0.82       0.157    0.32      -2.253   -2.85 **
high-riser with 10 and more HH                 0.636    1.49      -0.502   -1.03      -0.127   -0.19      -2.591   -2.23 *
mainly commercially used                      -0.392   -0.87       0.196    0.40       1.529    2.06 *    -1.725   -1.03
MOVE                                          -0.003   -0.13      -0.033   -1.16       0.082    1.74 +     0.140    1.91 +
...
(to be continued)

23

Version A: Multilevel logit estimates – family structure in the neighbourhood

Variable                                                Participation vs.   Refused vs.         Nocontact vs.       Not Able vs.
                                                        Nonparticipation    Participation       Participation       Participation
                                                        Coeff. z-value      Coeff. z-value      Coeff. z-value      Coeff. z-value
...
mainly single household (Ref.)
far above average share of single HH                    -0.005   -0.02       0.574    2.31 *    -0.597   -1.90 +    -0.319   -0.58
above average share of single HH                        -0.130   -0.60       0.686    2.70 **   -0.765   -2.28 *     0.255    0.47
slightly above average share of single HH                0.104    0.47       0.398    1.54      -0.778   -2.25 *     0.078    0.14
mixed family structure                                  -0.087   -0.39       0.704    2.71 **   -0.674   -1.92 +    -0.052   -0.09
slightly above average share of families with children   0.169    0.73       0.434    1.63      -1.062   -2.78 **    0.082    0.14
above average share of families with children            0.164    0.69       0.460    1.69 +    -1.211   -3.08 **   -0.226   -0.37
far above average share of families with children        0.135    0.55       0.581    2.09 *    -1.605   -3.70 ***  -1.052   -1.56
almost only families with children                       0.278    1.08       0.451    1.55      -1.536   -3.13 **   -1.793   -2.06 *
...
(to be continued)

24

Version A: Multilevel logit estimates – Random effects

Variable                  Participation vs.   Refused vs.         Nocontact vs.       Not Able vs.
                          Nonparticipation    Participation       Participation       Participation
...
Random effects
u_ij (level 1)            π²/3                π²/3                π²/3                π²/3
v_0j (intercept)          0.400               0.395               1.300               1.370
  95% interval            (0.38 - 0.83)       (0.39 - 0.94)       (1.25 - 3.0)        (0.52 - 1.6)
ICC                       0.108               0.107               0.283               0.294
Interviewers              227                 224                 219                 215
Households                3408                2774                1825                1523
LogLikelihood             -2142               -1805               -829                -405
Pseudo-R²                 0.07                0.07                0.14                0.14

Note: on the slide the interviewer counts (227, 219, 224, 215) and the Pseudo-R² values (0.14, 0.14, 0.07, 0.07) are not clearly attached to columns; the assignment shown here is the most plausible reading.
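The ICC values in the random-effects table agree with the usual latent-variable formula for a random-intercept logit model: the interviewer-level variance divided by the sum of that variance and the level-1 variance π²/3. The short sketch below recomputes them from the reported intercept variances; the dictionary keys are just descriptive labels.

```python
import math

# Level-1 variance of a logit model: variance of the standard logistic distribution.
LEVEL1_VARIANCE = math.pi ** 2 / 3


def icc(intercept_variance):
    """Share of the total latent variance that lies between interviewers."""
    return intercept_variance / (intercept_variance + LEVEL1_VARIANCE)


# Random-intercept variances reported for Version A.
version_a = {
    "participation vs. nonparticipation": 0.400,
    "refused vs. participation": 0.395,
    "nocontact vs. participation": 1.300,
    "not able vs. participation": 1.370,
}

for model, variance in version_a.items():
    print(model, round(icc(variance), 3))
```

The interviewer share of the variance is roughly 11 % for participation and refusal but close to 30 % for noncontact and "not able", so the latter two outcomes depend much more strongly on who the interviewer is.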

25

Version B: Variables from the interviewer data set

Variable                          Participation vs.   Refused vs.         Nocontact vs.       Not Able vs.
                                  Nonparticipation    Participation       Participation       Participation
                                  Coeff. z-value      Coeff. z-value      Coeff. z-value      Coeff. z-value
...
Interviewer
sex (1 = men)                      0.079    0.53      -0.241   -1.45       0.235    0.76       0.959    2.70 *
age of interviewer                -0.006   -0.74       0.010    1.10      -0.011   -0.68       0.000    0.00
interviewer age < 40 & male       -0.266   -0.60       0.809    1.64 +    -0.717   -0.78      -0.770   -0.73
secondary modern school (Ref.)
secondary school                  -0.211   -1.16       0.219    1.07       0.269    0.71      -0.394   -0.92
high school diploma               -0.634   -2.27 *     0.625    1.99 *     0.975    1.80 +     0.313    0.51
university with and without deg.  -0.296   -1.34       0.437    1.75 +     0.236    0.53      -0.583   -1.13
Workload                           0.021    3.15 **   -0.024   -3.24 **   -0.009   -0.68      -0.043   -2.56 *
SOEP experience                    0.295    1.97 +    -0.461   -2.71 **    0.014    0.05      -0.022   -0.06
...
(to be continued)

26

Version B: Variables from the interviewer questionnaire

Variable                             Participation vs.   Refused vs.         Nocontact vs.       Not Able vs.
                                     Nonparticipation    Participation       Participation       Participation
                                     Coeff. z-value      Coeff. z-value      Coeff. z-value      Coeff. z-value
...
years for SOEP in future              0.215    1.54      -0.239   -1.53       0.042    0.15      -0.297   -0.92
amicable (1 - 7)                      0.193    2.66 **   -0.231   -2.87 **   -0.070   -0.47      -0.073   -0.40
reserved (1 - 7)                      0.147    2.86 **   -0.104   -1.80 +    -0.336   -3.16 **   -0.235   -1.91 +
life satisfaction (0 - 10)            0.082    2.02 +    -0.081   -1.78 +    -0.072   -0.92      -0.277   -2.79 **
communicative (1 - 7)                 0.079    0.83      -0.030   -0.28      -0.109   -0.55      -0.699   -3.01 **
inquisitive (1 - 7)                   0.074    0.92      -0.039   -0.43       0.030    0.19      -0.341   -1.91 +
often worry about things (1 - 7)      0.073    1.50      -0.096   -1.75 +    -0.049   -0.50      -0.020   -0.17
creative (1 - 7)                      0.052    0.75      -0.013   -0.16      -0.397   -2.84 **    0.259    1.37
sometimes too brusque (1 - 7)         0.031    0.48      -0.057   -0.79       0.178    1.44      -0.340   -1.86 +
own risk propensity (1 - 10)         -0.027   -0.76       0.022    0.55      -0.052   -0.74       0.169    1.83 +
forgive others (1 - 7)               -0.054   -0.92       0.052    0.79      -0.033   -0.27       0.318    2.01 *
sluggish (1 - 7)                     -0.073   -1.14       0.079    1.09       0.024    0.19       0.276    1.82 +
patient (1 - 10)                     -0.097   -2.75 **    0.082    2.06 *     0.094    1.32       0.164    1.94 +
easily flustered (1 - 7)             -0.103   -1.76 +     0.095    1.44       0.071    0.60       0.285    1.99 *
social desirability indicator (1-0)  -0.163   -1.12       0.127    0.78       0.252    0.86       0.773    2.28 *
...
(to be continued)

27

Version B: Multilevel logit estimates – Random effects

Variable                  Participation vs.   Refused vs.         Nocontact vs.       Not Able vs.
                          Nonparticipation    Participation       Participation       Participation
...
Random effects
u_ij (level 1)            π²/3                π²/3                π²/3                π²/3
v_0j (intercept)          0.180               0.243               0.740               0.242
ICC                       0.052               0.069               0.184               0.069
Interviewers              165                 165                 163                 162
Households                2592                2111                1409                1184
LogLikelihood             -1641               -1368               -636                -310
Pseudo-R²                 0.07                0.07                0.14                0.22

28

Summary (1)

Refusals, noncontact and "unable to participate" relate to different respondent, area and interviewer characteristics.

Respondent is easy to persuade:
• Sinus Milieu: well-established
• age <= 35 years
• high-income families, new privately owned houses, old families in outskirts
• interviewer with high workload and with experience
• interviewer self-assessment: amicable, satisfied with own life, not easily flustered

29

Summary (2)

Respondent is more likely to refuse:
• Sinus Milieu: new middle class, experimentalists, modern performers
• age > 45 – 50 years
• families with children
• cities, simple urban estate
• interviewer with low workload, with less experience, high level of education, age < 40 & male
• interviewer self-assessment: not amicable, unsatisfied with own life, patient, not reserved

30

Summary (3)

Respondent is difficult to contact:
• Sinus Milieu: experimentalists, modern performers
• age > 45 – 50 & age > 55 – 60 years
• single household
• cities, simple urban estate, areas with high frequency of moves
• interviewer with high level of education
• interviewer self-assessment: not creative, not reserved

31

Summary (4)

Respondent uses "not able to participate":
• Sinus Milieu: upper conservatives, traditionalists, new middle class, modern performers
• smaller than cities
• areas with higher frequency of moves
• interviewer: male, low workload
• interviewer self-assessment: not communicative, unsatisfied with own life, sluggish, not inquisitive, easily flustered, patient, not reserved, with higher need for social approval

The result does not indicate illness of respondents, as was expected, but that "not able to participate" may be an alibi used by respondents to avoid participation.

32

Conclusion

Microgeographic data, interviewer data as well as interviewer questionnaires are an important source to fill the information gap on respondents and nonrespondents.

Next steps of the analysis:
• interaction terms between respondent, interviewer and area
• multilevel Poisson regressions for the number of contacts used in this sample