Grid sampling for a mixed-mode human survey and adjustment for non-response

Seppo Laaksonen1

1University of Helsinki, e-mail: [email protected]

Acknowledgements: The study is a methodological part of an ongoing project initiated by Professors Mari Vaattovaara and Matti Kortteinen of the University of Helsinki. I also thank Henrik Lönnqvist and Teemu Kempainen, who are working on the project as well.

NTTS 2013 _ Seppo Laaksonen 1

Type of area                                           Number of grids   Population aged 25-74
Stratum of ‘poor’ grids                                           1058                  232416
Stratum of ‘rich’ grids                                           1187                   70382
Municipality strata without confidentiality exclusion             5020                  390142
Excluded due to confidentiality from the grid-based
  sample but not from the municipality sampling                   1616                    6785
All                                                               8881                  699725

Table 1. Statistics of grids where one or more adults are living

Figure 1. Grids for ‘rich’ people (red, median income > 73206) vs. ‘poor’ people (blue, median income < 32092) in the municipalities of the survey. The remaining grids either fall between these two groups or are unpopulated.

Stratum                                 Poor grids  Rich grids  Municipality   Total  Population
                                         ‘poor, h’   ‘rich, h’      ‘all, h’          aged 25-74
Helsinki, most urbanised southern area         110          46          1000    1156       27465
Helsinki, most urbanised northern area        1142           8          1000    2150       40206
Helsinki, suburb                              2501        1324          2500    6325      147098
Espoo and Kauniainen                           546        3127          2000    5673      131840
Hyvinkää                                       248          64           600     912       24944
Järvenpää                                      115          38           600     753       21717
Kerava                                         124          48           600     772       18874
Kirkkonummi                                     89         173           600     862       20065
Lahti                                            0           0          1000    1000       57059
Lohja                                            0           0           600     600       22613
Mäntsälä and Pornainen                          49          22           600     671       13850
Nurmijärvi                                      85         120           600     805       21924
Sipoo                                           48         134           600     782       10269
Tuusula                                        118         201           600     919       20948
Vantaa                                         746         574          1500    2820      104930
Vihti                                           81         121           600     802       15923
All                                           6000        6000         15000   27000      699725

Table 2. Distribution of the gross sample to strata. The group ‘Others’ in the above scheme is equal to municipality gross sample size.

Inclusion probabilities

Single municipality strata:

$\pi_k = \frac{n_h}{N_h}$

Strata with grid sampling and thus with post-strata, shown for ‘Rich’ grids (and similarly for ‘Poor’ grids and ‘All’ grids):

$\pi_k = \frac{n_{rich,h}}{N_{rich,h}}$

in which $N_{all,h} = N_h - (N_{poor,h} + N_{rich,h})$
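As a small worked illustration, the sketch below computes an inclusion probability and the corresponding design weight for two strata from Table 2 that have no ‘poor’ or ‘rich’ grid samples (Lahti and Lohja). It assumes equal-probability sampling within a stratum, so that the inclusion probability is the gross sample size divided by the population size. The function names are illustrative and not from the original study.

```python
def inclusion_probability(n: int, N: int) -> float:
    """Inclusion probability under equal-probability sampling of n units out of N."""
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    return n / N


def design_weight(n: int, N: int) -> float:
    """Design weight = inverse of the inclusion probability."""
    return 1.0 / inclusion_probability(n, N)


# (gross sample size, population aged 25-74) taken from Table 2
strata = {"Lahti": (1000, 57059), "Lohja": (600, 22613)}

weights = {name: design_weight(n, N) for name, (n, N) in strata.items()}
for name, w in weights.items():
    print(f"{name}: gross design weight = {w:.1f}")
```

Under this assumption the Lahti weight is about 57.1, which appears to agree with the maximum gross weight reported for the municipality part in Table 3.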

Statistics      Grid part, gross   Grid part, net   Municipality part, gross   Municipality part, net
Observations    12000              4387             15000                      5231
Population      302357             302357           397368                     397368
Mean            25.8               70.6             27.1                       77.8
Minimum         8.3                18.2             13.1                       39.0
Maximum         45.6               164.2            57.1                       167.8
CV (%)          54.6               61.4             36.4                       39.9

Table 3. Some statistics of the gross/net sample design weights
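The kind of summary shown in Table 3 can be produced with a few lines of code. The sketch below is only an illustration: the weights are made-up values, and the coefficient of variation is computed with the population standard deviation, which is an assumption because the source does not say which variant was used.

```python
import statistics


def weight_summary(weights):
    """Return the basic statistics reported for a set of survey weights."""
    mean = statistics.fmean(weights)
    sd = statistics.pstdev(weights)  # assumption: population SD, not sample SD
    return {
        "observations": len(weights),
        "population": sum(weights),  # the weights sum to the estimated population
        "mean": mean,
        "minimum": min(weights),
        "maximum": max(weights),
        "cv_pct": 100.0 * sd / mean,
    }


# Hypothetical weights, for illustration only
example_weights = [10.0, 20.0, 30.0, 40.0]
summary = weight_summary(example_weights)
for key, value in summary.items():
    print(f"{key}: {value:.1f}")
```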

Our strategy for the weight adjustments is as follows:

(i) We take the initial weights $w_k$ and divide them by the estimated response probabilities (also called response propensities) of each respondent, obtained from the probit model and denoted by $p_k$.

(ii) Before going further, it is good to check that the probabilities $p_k$ are realistic, for instance that they are not too small. Naturally, all probabilities are below one.

(iii) Since the sums of the weights from (i) do not match the known population statistics by strata h or by post-strata ‘rich, h’, ‘poor, h’ or ‘all, h’, they should be calibrated so that the sums are equal to the sums of the initial weights in each stratum. This is done by multiplying the weights from (i) by the ratio

$q_h = \frac{\sum_h w_k}{\sum_h w_k / p_k}$

in which h may refer to post-strata as well.

(iv) It is also good to check these weights against basic statistics, for example as presented in Table 3. If the weights are not plausible, the model should be revised.
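Steps (i) to (iii) of the strategy can be sketched in code as follows. This is a minimal illustration, not the author's implementation: the propensities are taken as given instead of being estimated by a probit model, the lower bound `min_propensity` is a hypothetical threshold for the plausibility check, and the three respondents are made-up data.

```python
from collections import defaultdict


def adjust_weights(respondents, min_propensity=0.05):
    """Propensity-adjust the initial weights and rescale them within each stratum.

    respondents: list of dicts with keys 'stratum', 'w' (initial weight)
    and 'p' (estimated response propensity).
    """
    # Step (ii): check that the propensities are plausible (threshold is hypothetical)
    for r in respondents:
        if not min_propensity <= r["p"] < 1.0:
            raise ValueError(f"implausible propensity: {r['p']}")

    # Step (i): divide the initial weights by the response propensities
    raw = [r["w"] / r["p"] for r in respondents]

    # Step (iii): ratio q_h = sum of initial weights / sum of raw adjusted weights
    sum_w = defaultdict(float)
    sum_raw = defaultdict(float)
    for r, a in zip(respondents, raw):
        sum_w[r["stratum"]] += r["w"]
        sum_raw[r["stratum"]] += a
    q = {h: sum_w[h] / sum_raw[h] for h in sum_w}

    return [a * q[r["stratum"]] for r, a in zip(respondents, raw)]


# Made-up respondents in two strata
respondents = [
    {"stratum": "A", "w": 50.0, "p": 0.50},
    {"stratum": "A", "w": 50.0, "p": 0.25},
    {"stratum": "B", "w": 80.0, "p": 0.40},
]
adjusted = adjust_weights(respondents)
print(adjusted)
```

Within each stratum the adjusted weights sum to the same total as the initial weights, while respondents with a lower propensity receive a larger share of that total.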

Auxiliary variable (reference category)
  Category                        Probit estimate   Standard error   p-value

Type of grid (ref = Rich)
  Intermediate                             -0.064            0.006    <.0001
  Poor                                     -0.148            0.006    <.0001

Gender (ref = Female)
  Male                                     -0.292            0.003    <.0001
  Female                                    0.000                0         .

Age group (ref = 65-74 years)
  25-34                                    -0.618            0.006    <.0001
  35-44                                    -0.575            0.006    <.0001
  45-54                                    -0.439            0.006    <.0001
  55-64                                    -0.161            0.005    <.0001

Mother tongue (ref = Swedish)
  Finnish                                  -0.009            0.007     0.208 (not significant)
  Swedish                                   0.000                0         .

Number of people (ref = 6+)
  1                                         0.179            0.013    <.0001
  2                                         0.359            0.013    <.0001
  3                                         0.272            0.013    <.0001
  4                                         0.289            0.013    <.0001
  5                                         0.216            0.014    <.0001

Moved to the current house (ref = After 2006)
  Before 1995                               0.013            0.004    0.0008
  Between 1995-2006                        -0.049            0.005    <.0001
  After 2006                                0.000                0         .

Current and previous living area (ref = Moved within the same zip code area)
  Moved to southern Finland                 0.019            0.007    0.0113
  Moved within southern Finland             0.032            0.004    <.0001

Labour market status (ref = Not unemployed)
  Unemployed                               -0.051            0.008    <.0001

Table 4. Outcomes from the response propensity modelling by probit regression
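To see how probit estimates such as those in Table 4 translate into response propensities, recall that a probit model maps the linear predictor to a probability through the standard normal distribution function. The sketch below is illustrative only: the intercept is not reported in the table, so `BASELINE` is a hypothetical value for a reference person, and the effects are simply added, ignoring any interaction terms.

```python
import math


def probit_to_probability(eta: float) -> float:
    """Standard normal CDF, which turns a probit linear predictor into a probability."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


BASELINE = 0.0      # hypothetical linear predictor of a reference person (not in Table 4)
MALE = -0.292       # Table 4: Gender, Male vs. Female
AGE_25_34 = -0.618  # Table 4: Age group 25-34 vs. 65-74

p_ref = probit_to_probability(BASELINE)
p_male = probit_to_probability(BASELINE + MALE)
p_young_male = probit_to_probability(BASELINE + MALE + AGE_25_34)

print(f"reference: {p_ref:.3f}, male: {p_male:.3f}, male aged 25-34: {p_young_male:.3f}")
```

Because the normal distribution function is nonlinear, the same estimate changes the propensity by a different amount at a different baseline, so these numbers only indicate the direction and rough size of the effects.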

Interaction of age group by gender in the probit model

Females participate better in all age groups. The relationship with age is fairly linear from the age of 35 onwards.

[Figure: probit estimates by age group (25-34, 35-44, 45-54, 55-64, 65-74), shown separately for Male and Female; vertical axis from -1.2 to 0.]

Probit estimates by income (earnings plus capital income)

[Figure: probit estimates by income group (Highest, 2nd highest, 3rd highest, 3rd lowest, 2nd lowest, Lowest); vertical axis from -0.4 to 0.4.]

The relationship is fairly linear. Note that we could not obtain education data; income is used in its place.

Probit estimates by current and previous house size

[Figure: probit estimates for three groups (current house smaller, current house about the same size as earlier, current house larger); vertical axis from 0 to 0.05.]

If the respondent has moved to a larger house, the response propensity is higher. Those who have moved to a smaller house are less motivated to participate.

[Figure: cumulative distributions of the response propensity for Web and Paper respondents; horizontal axis: propensity, % (0 to 80); vertical axis from 0 to 100.]

Figure 2. Example of the cumulative response propensities for the respondents via ‘web’ and via ‘paper’, respectively. Web respondents have lower propensities. A web option is nevertheless good for survey participation, and more effort is needed to motivate respondents to use it.

Statistics      Grid part, gross   Grid part, net   Municipality part, gross   Municipality part, net   Adjusted weights for all, net
Observations    12000              4387             15000                      5231                     9618
Population      302357             302357           397368                     397368                   699725
Mean            25.8               70.6             27.1                       77.8                     72.8
Minimum         8.3                18.2             13.1                       39.0                     12.4
Maximum         45.6               164.2            57.1                       167.8                    754.3
CV (%)          54.6               61.4             36.4                       39.9                     67.8

Adjustment leads to a new weight with a slightly higher variation, as expected.

Results on people’s opinions of their living area by type of grid: means, with standard errors in parentheses. Indicators are scaled so that 0 = lowest and 100 = highest.

Indicator                       Intermediate grids   Poor grids    Rich grids
General assessment              74.6 (0.48)          62.3 (0.55)   83.3 (0.44)
Quality of environment          74.5 (0.31)          65.4 (0.37)   79.6 (0.32)
Quality of housing conditions   72.7 (0.34)          65.4 (0.36)   77.4 (0.33)
Quality of services             68.9 (0.37)          73.2 (0.37)   68.8 (0.42)
Assessment of living area       74.8 (0.35)          67.6 (0.40)   80.1 (0.31)
Amount of problems              44.2 (0.60)          66.7 (0.58)   34.9 (0.58)
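One way to read these results is to compare two grid types using the reported means and standard errors. The sketch below does this for the general assessment in rich vs. poor grids. It assumes that the two estimates are independent, which is an assumption made here for illustration and not a statement from the source; the function name is likewise illustrative.

```python
import math


def difference_with_se(mean_a, se_a, mean_b, se_b):
    """Difference of two estimates and its standard error, assuming independence."""
    diff = mean_a - mean_b
    se = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, se


# General assessment: rich grids 83.3 (0.44) vs. poor grids 62.3 (0.55)
diff, se = difference_with_se(83.3, 0.44, 62.3, 0.55)
z = diff / se  # ratio of the difference to its standard error

print(f"difference = {diff:.1f}, standard error = {se:.2f}, ratio = {z:.1f}")
```

Under the independence assumption the difference is many times larger than its standard error, so the gap between rich and poor grids is unlikely to be due to sampling variation alone.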

Conclusion: Neither administrative areas nor postal zip code areas are ideal for designing and analysing surveys. Grids offer a flexible tool since they can in principle be of any size, but confidentiality must be taken carefully into account. Results based on small grids are also more interesting than those for ordinary areas. Basically, people living in a small grid know each other, which is not true of administrative and similar areas.