![Page 1: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/1.jpg)
Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU)
Task Force on Victimization, Eurostat, 17-18 February 2010
Guillaume Osier
Service Central de la Statistique et des Etudes Economiques (STATEC)
Social Statistics [email protected]
![Page 2: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/2.jpg)
Outline
I. Some theory
1. Definitions and concepts
2. How to over-sample?
3. Why over-sample?
4. Impact on national accuracy
II. Over-sampling the capital cities in the EU-SASU
1. Is this proposal (statistically) relevant?
2. How to determine the over-sampling rates?
3. Impact on the national accuracy
III. Specific issues in relation to over-sampling
![Page 3: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/3.jpg)
Definitions and concepts
(i) A sub-group (d) in the population is said to be over-sampled (or over-represented) when the proportion of units from the sub-group is, on average, higher in the sample than in the reference population:
$$E\left(\frac{n_d}{n}\right) > \frac{N_d}{N}$$
where $E(n_d/n)$ is the average proportion of units from (d) in the sample and $N_d/N$ is the proportion of units from (d) in the population.
(ii) Conversely, a sub-group is said to be under-sampled (or under-represented) when the proportion of units from the sub-group is, on average, lower in the sample than in the reference population:
$$E\left(\frac{n_d}{n}\right) < \frac{N_d}{N}$$
(iii) When a sub-group is neither over-sampled nor under-sampled, it is said to be well-sampled (or well-represented)
![Page 4: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/4.jpg)
How to over-sample?
To be implemented, over-sampling requires the units in the sub-group to be identified before sampling (this is an issue with telephone surveys)
Two main techniques to over-sample:
• Stratification using unequal sampling fractions in the strata
• More general « proportional-to-size » sampling (πps, pps…)
Over-sampling rate for (d):
$$OR_d = \frac{E(n_d)}{E(n)\,\dfrac{N_d}{N}}$$
The numerator is the expected sample size in (d); the denominator is the expected sample size in (d) under no over-sampling (i.e. under Simple Random Sampling).
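As a minimal sketch of the over-sampling rate, reading it as the expected sample size in (d) divided by the size expected under simple random sampling, E(n) × N_d / N, the ratio can be computed as follows. The function name and all figures are hypothetical and chosen only for illustration.

```python
def oversampling_rate(expected_n_d, expected_n, N_d, N):
    """Over-sampling rate: E(n_d) / (E(n) * N_d / N)."""
    # Sample size expected in (d) if every unit had the same selection chance.
    expected_under_srs = expected_n * N_d / N
    return expected_n_d / expected_under_srs


# Hypothetical figures: a sub-group holding 5% of the population
# receives 1,000 of the 8,000 interviews.
rate = oversampling_rate(expected_n_d=1000, expected_n=8000,
                         N_d=500_000, N=10_000_000)
print(rate)  # a value above 1 means the sub-group is over-sampled
```

A rate of 1 corresponds to a well-sampled sub-group, and a rate below 1 to an under-sampled one.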
![Page 5: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/5.jpg)
Why over-sample? 1/2
By selecting more people from certain groups than would typically be done if everyone in the sample had an equal chance of being selected, over-sampling leads to more accurate estimates for those groups.
The technique has proven particularly suitable for:
• Small sub-populations;
• Sub-populations having severe non-response problems;
• Sub-populations with large internal variability on the key variables (e.g., household wealth)
![Page 6: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/6.jpg)
Why over-sample? 2/2
More generally, one can resort to over-sampling whenever the sample size doesn’t allow us to reach specified precision targets over certain sub-populations.
Besides, in cross-national surveys (like the EU-SASU), over-sampling is essential for precision and hypothesis testing in cross-country comparisons.
The choice of the sub-groups to over-sample is policy-driven (political matter)
![Page 7: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/7.jpg)
Impact on national accuracy 1/3
Optimal (Neyman) allocation: in order to maximize the precision of the national sample under stratified simple random sampling, the sample size in stratum h depends both on the stratum population Nh and the standard deviation Sh of the study variable
[Diagram: the total population aged 16+ is split into strata 1, 2, …, H; stratum h has size N_h and standard deviation S_h.]
$$n_h^{opt} = n\,\frac{N_h S_h}{\sum_{k=1}^{H} N_k S_k}$$
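The Neyman allocation can be sketched in a few lines, assuming the usual form in which the sample size of stratum h is proportional to N_h × S_h. The stratum sizes and standard deviations below are hypothetical.

```python
def neyman_allocation(n, sizes, std_devs):
    """Split a total sample n across strata proportionally to N_h * S_h."""
    weights = [N_h * S_h for N_h, S_h in zip(sizes, std_devs)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical strata: sizes N_h and standard deviations S_h.
sizes = [1_000_000, 4_000_000, 5_000_000]
std_devs = [0.5, 0.4, 0.3]

allocation = neyman_allocation(8000, sizes, std_devs)
for h, n_h in enumerate(allocation, start=1):
    print(f"stratum {h}: {n_h:.0f} units")
```

In practice the allocated sizes would be rounded to integers, which this sketch leaves out.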
![Page 8: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/8.jpg)
Impact on national accuracy 2/3
According to the previous formula, a larger sample should be taken if:
• the stratum is larger
• the stratum is more variable internally
These national considerations may conflict with more “local” considerations: as said, from a local point of view, over-sampling often focuses on small sub-populations, while national considerations lead to taking larger samples from the largest strata. Nevertheless, the loss in national accuracy is often limited:
$$1 \le \frac{\sigma}{\sigma^{opt}} \le \sqrt{1+g^2}, \qquad g = \max_{1 \le h \le H} \frac{\left|n_h - n_h^{opt}\right|}{n_h}$$
![Page 9: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/9.jpg)
Impact on national accuracy 3/3
Thus, if g = 20%, we have $\sigma/\sigma^{opt} \approx 1.02$ at most, i.e. the standard error exceeds its optimal value by about 2%. Similarly, if g = 30%, we have $\sigma/\sigma^{opt} \approx 1.04$ at most, an increase of about 4%. In this sense the optimum can be described as flat.
As a result, the impact of over-sampling on national accuracy should be limited, provided the sample sizes are not “extremely” different from the optimal ones. The impact is all the more limited given that the national sample sizes are generally large (thousands of units). Besides, by using powerful auxiliary information at national level, one may hope to increase sample precision a posteriori.
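The flatness of the optimum can be checked numerically. The sketch below assumes stratified simple random sampling without finite population correction, and takes g as the largest relative deviation |n_h − n_h(opt)| / n_h between the actual and the optimal stratum sample sizes; under these assumptions the ratio of standard errors stays below √(1 + g²). All strata figures are hypothetical.

```python
import math


def neyman_allocation(n, sizes, std_devs):
    """Sample sizes proportional to N_h * S_h."""
    weights = [N_h * S_h for N_h, S_h in zip(sizes, std_devs)]
    total = sum(weights)
    return [n * w / total for w in weights]


def standard_error(sizes, std_devs, alloc):
    """Standard error of the stratified mean (no finite population correction)."""
    N = sum(sizes)
    variance = sum((N_h / N) ** 2 * S_h ** 2 / n_h
                   for N_h, S_h, n_h in zip(sizes, std_devs, alloc))
    return math.sqrt(variance)


# Hypothetical strata; the first one plays the role of an over-sampled group.
sizes = [1_000_000, 4_000_000, 5_000_000]
std_devs = [0.5, 0.4, 0.3]

optimal = neyman_allocation(8000, sizes, std_devs)
# Move 400 units into stratum 1 while keeping the total sample size fixed.
actual = [n_h + shift for n_h, shift in zip(optimal, [400, -200, -200])]

g = max(abs(a - o) / a for a, o in zip(actual, optimal))
ratio = standard_error(sizes, std_devs, actual) / standard_error(sizes, std_devs, optimal)
bound = math.sqrt(1 + g ** 2)
print(f"g = {g:.3f}, ratio = {ratio:.4f}, bound = {bound:.4f}")
```

Even though stratum 1 receives roughly a third more units than its optimal share here, the national standard error moves very little, which is what "flat" means in this context.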
![Page 10: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/10.jpg)
Over-sampling the capital cities in the EU-SASU: is this proposal relevant?
Capital city = most populated city of the country
Always the same as the political capital (except for Switzerland)
Is the proposal (statistically) relevant?
• Sample size of individuals over the capital cities: is it enough to draw reliable conclusions?
• Victimization rates in the capital cities: are they generally higher than those for the rest of the country?
• Higher non-response in the capital cities? (often correct)
![Page 11: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/11.jpg)
Minimum sample sizes for the capital cities
[Figure: horizontal bar chart, one bar per country, horizontal axis from 0 to 3000.]
Bar values, in extraction order: 276, 329, 341, 351, 355, 364, 402, 474, 558, 572, 574, 594, 641, 684, 712, 725, 769, 804, 916, 921, 966, 992, 1025, 1345, 1375, 1453, 1462, 1902, 2600
Countries, in extraction order: France, Germany, Switzerland, Italy, Poland, Netherlands, Portugal, Slovakia, Denmark, Greece, Spain, Sweden, Finland, Norway, Romania, Ireland, Belgium, Slovenia, Czech Republic, Luxembourg, Lithuania, United Kingdom, Bulgaria, Hungary, Austria, Estonia, Cyprus, Latvia, Malta
![Page 12: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/12.jpg)
![Page 13: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/13.jpg)
![Page 14: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/14.jpg)
NONCALIBCALIB EVarYVar
Victimization rates in capital cities
[Figure: national victimization rate (%) and capital-city victimization rate (%) by country, vertical axis from 0 to 70. Source: International Crime and Victimization Survey (ICVS), 2005]
Victimization rates are higher in the capital cities than in the rest of the countries
![Page 15: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/15.jpg)
How to determine the over-sampling rates? 1/4
Step 1: set up a precision target for each capital city
Step 2: determine the minimum sample size needed to achieve the level of precision specified at Step 1
Precision target (1): under simple random sampling, a relative margin of error of α% in each capital city for any victimization rate higher than P%
$$n_{min} = \left(\frac{1.96}{\alpha}\right)^2\left(\frac{1}{P} - 1\right)$$
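A minimal sketch of precision target (1), assuming the minimum sample size takes the form (1.96/α)² × (1/P − 1), with α and P written as proportions (10% becomes 0.10). The function name and the chosen values are illustrative only.

```python
import math


def min_sample_size_relative(alpha, P):
    """Smallest SRS sample giving a relative margin of error alpha at rate P (95% level)."""
    return (1.96 / alpha) ** 2 * (1 / P - 1)


# Illustrative target: 10% relative margin of error for rates of 20% or more.
n_min = min_sample_size_relative(alpha=0.10, P=0.20)
print(math.ceil(n_min))  # rounded up to a whole number of respondents
```

Because the relative margin of error shrinks as the rate grows, a sample sized for the rate P also meets the target for any higher rate.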
![Page 16: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/16.jpg)
How to determine the over-sampling rates? 2/4
[Figure: minimum sample size nmin (vertical axis, 0 to 40000) as a function of P (horizontal axis, 0 to 60), for α = 10%.]
![Page 17: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/17.jpg)
How to determine the over-sampling rates? 3/4
[Figure: minimum sample size nmin (vertical axis, 0 to 18000) as a function of alpha (horizontal axis, 0 to 60), for P = 20%.]
![Page 18: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/18.jpg)
How to determine the over-sampling rates? 4/4
Precision target (2): under simple random sampling, an absolute margin of error of α% points in each capital city for any victimization rate higher than P%
$$n_{min} = \left(\frac{1.96}{\alpha}\right)^2 P\,(1 - P)$$
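Precision target (2) can be sketched the same way, assuming the minimum sample size takes the form (1.96/α)² × P(1 − P), where α is now an absolute margin expressed as a proportion (3 percentage points becomes 0.03). Names and values are illustrative only.

```python
import math


def min_sample_size_absolute(alpha, P):
    """Smallest SRS sample giving an absolute margin of error alpha at rate P (95% level)."""
    return (1.96 / alpha) ** 2 * P * (1 - P)


# Illustrative target: a margin of 3 percentage points at a 20% rate.
n_min = min_sample_size_absolute(alpha=0.03, P=0.20)
print(math.ceil(n_min))  # rounded up to a whole number of respondents
```

Unlike the relative target, this requirement grows with P up to P = 50%, since P(1 − P) is largest there.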
![Page 19: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/19.jpg)
Impact on the national accuracy 1/8
Consider the national victimization rate for the 10 main crimes as used in the International Crime and Victimization Survey (ICVS):
$$\tilde{P} = \frac{N_C}{N}\,\tilde{P}_C + \frac{N_{NC}}{N}\,\tilde{P}_{NC}$$
where $\tilde{P}_C$ is the victimization rate in the capital city and $\tilde{P}_{NC}$ is the victimization rate in the rest of the country.
![Page 20: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/20.jpg)
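Impact on the national accuracy 2/8
Variance:
$$V = \left(\frac{N_C}{N}\right)^2 \frac{\tilde{P}_C\,(1-\tilde{P}_C)}{n_C} + \left(\frac{N_{NC}}{N}\right)^2 \frac{\tilde{P}_{NC}\,(1-\tilde{P}_{NC})}{n_{NC}}$$
Relative margin of error:
$$RME = \frac{1.96}{\tilde{P}}\sqrt{\left(\frac{N_C}{N}\right)^2 \frac{\tilde{P}_C\,(1-\tilde{P}_C)}{n_C} + \left(\frac{N_{NC}}{N}\right)^2 \frac{\tilde{P}_{NC}\,(1-\tilde{P}_{NC})}{n_{NC}}}$$
Absolute margin of error:
$$AME = 1.96\sqrt{\left(\frac{N_C}{N}\right)^2 \frac{\tilde{P}_C\,(1-\tilde{P}_C)}{n_C} + \left(\frac{N_{NC}}{N}\right)^2 \frac{\tilde{P}_{NC}\,(1-\tilde{P}_{NC})}{n_{NC}}}$$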
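The three quantities can be computed together. The sketch below assumes the country is split into two strata (capital city C and rest of the country NC, with N = N_C + N_NC), simple random sampling within each, and no finite population correction. The population figures, rates and sample sizes are hypothetical.

```python
import math


def national_rate(N_C, N_NC, P_C, P_NC):
    """Population-weighted national victimization rate."""
    N = N_C + N_NC
    return N_C / N * P_C + N_NC / N * P_NC


def national_variance(N_C, N_NC, P_C, P_NC, n_C, n_NC):
    """Variance of the national rate under two-stratum sampling."""
    N = N_C + N_NC
    return ((N_C / N) ** 2 * P_C * (1 - P_C) / n_C
            + (N_NC / N) ** 2 * P_NC * (1 - P_NC) / n_NC)


def margins_of_error(N_C, N_NC, P_C, P_NC, n_C, n_NC):
    """Return (relative, absolute) margins of error at the 95% level."""
    V = national_variance(N_C, N_NC, P_C, P_NC, n_C, n_NC)
    ame = 1.96 * math.sqrt(V)
    rme = ame / national_rate(N_C, N_NC, P_C, P_NC)
    return rme, ame


# Hypothetical country: the capital holds 10% of the population and has a
# higher victimization rate; the sample is allocated proportionally.
rme, ame = margins_of_error(N_C=1_000_000, N_NC=9_000_000,
                            P_C=0.25, P_NC=0.15, n_C=800, n_NC=7200)
print(f"RME = {100 * rme:.1f}%, AME = {100 * ame:.2f} points")
```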
![Page 21: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/21.jpg)
Impact on the national accuracy 3/8
Case 1: fixed national sample size
$$n_C = \min\left\{n,\ \left(\frac{1.96}{\alpha}\right)^2\left(\frac{1}{P} - 1\right)\right\}, \qquad n_{NC} = n - n_C$$
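A minimal sketch of Case 1, reading the rule as: the capital city receives the minimum sample size of precision target (1), capped at the national sample size n, and the rest of the country receives what is left. The values of n, α and P are hypothetical; in particular, the α used here is an assumption chosen for illustration.

```python
def case1_allocation(n, alpha, P):
    """Fixed national sample n: capital gets min(n, n_min), the rest gets n - n_C."""
    n_min = (1.96 / alpha) ** 2 * (1 / P - 1)
    n_C = min(n, n_min)
    n_NC = n - n_C
    return n_C, n_NC


# Hypothetical national sample of 8,000 with alpha = 10% and P = 20%.
n_C, n_NC = case1_allocation(n=8000, alpha=0.10, P=0.20)
print(f"capital: {n_C:.0f}, rest of country: {n_NC:.0f}")
```

Since the total is fixed, every unit added to the capital city is taken from the rest of the country, which is why national accuracy can deteriorate in this case.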
![Page 22: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/22.jpg)
Impact on the national accuracy 4/8
Table 3: Relative margin of error (%) for the national victimization rate – fixed sample size at national level (Case 1)

| Country | Over-sampling, P=0.1 | Over-sampling, P=0.2 | Over-sampling, P=0.3 | Over-sampling, P=0.4 | Over-sampling, P=0.5 | No over-sampling |
|---|---|---|---|---|---|---|
| France | 7.5 | 6.3 | 6.0 | 5.9 | 5.9 | 5.9 |
| Germany | 7.0 | 5.9 | 5.7 | 5.6 | 5.6 | 5.6 |
| Switzerland | 6.6 | 5.3 | 5.1 | 5.0 | 5.0 | 5.0 |
| Italy | 7.2 | 6.1 | 5.8 | 5.8 | 5.7 | 5.8 |
| Poland | 6.5 | 5.5 | 5.3 | 5.2 | 5.2 | 5.2 |
| Netherlands | 5.5 | 4.6 | 4.4 | 4.4 | 4.4 | 4.4 |
| Portugal | 8.0 | 6.8 | 6.5 | 6.4 | 6.4 | 6.4 |
| Denmark | 7.1 | 5.4 | 5.2 | 5.2 | 5.3 | 5.2 |
| Greece | 7.1 | 6.0 | 5.8 | 5.8 | 5.9 | 5.8 |
| Spain | 8.1 | 7.0 | 6.8 | 6.9 | 7.1 | 6.9 |
| Sweden | 6.6 | 5.4 | 5.2 | 5.3 | 5.4 | 5.3 |
| Finland | 8.4 | 6.6 | 6.4 | 6.5 | 6.8 | 6.5 |
| Norway | 7.4 | 5.8 | 5.7 | 5.8 | 6.0 | 5.7 |
| Ireland | 6.2 | 4.8 | 4.7 | 4.7 | 4.9 | 4.7 |
| Belgium | 5.5 | 4.7 | 4.7 | 4.7 | 4.9 | 4.7 |
| United Kingdom | 4.5 | 3.9 | 4.0 | 4.1 | 4.4 | 3.9 |
| Hungary | 6.9 | 6.4 | 6.9 | 7.6 | 8.5 | 6.5 |
| Austria | 6.6 | 6.2 | 6.8 | 7.6 | 8.7 | 6.3 |
| Estonia | 5.5 | 4.9 | 5.6 | 6.5 | 7.7 | 4.9 |
![Page 23: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/23.jpg)
Impact on the national accuracy 5/8
Table 4: Absolute margin of error (% points) for the national victimization rate – fixed sample size at national level (Case 1)

| Country | Over-sampling, P=0.1 | Over-sampling, P=0.2 | Over-sampling, P=0.3 | Over-sampling, P=0.4 | Over-sampling, P=0.5 | No over-sampling |
|---|---|---|---|---|---|---|
| France | 0.9 | 0.8 | 0.7 | 0.7 | 0.7 | 0.7 |
| Germany | 0.9 | 0.8 | 0.7 | 0.7 | 0.7 | 0.7 |
| Switzerland | 1.2 | 1.0 | 0.9 | 0.9 | 0.9 | 0.9 |
| Italy | 0.9 | 0.8 | 0.7 | 0.7 | 0.7 | 0.7 |
| Poland | 1.0 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 |
| Netherlands | 1.1 | 0.9 | 0.9 | 0.9 | 0.9 | 0.9 |
| Portugal | 0.8 | 0.7 | 0.7 | 0.7 | 0.7 | 0.7 |
| Denmark | 1.3 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| Greece | 0.9 | 0.7 | 0.7 | 0.7 | 0.7 | 0.7 |
| Spain | 0.7 | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 |
| Sweden | 1.1 | 0.9 | 0.8 | 0.8 | 0.9 | 0.8 |
| Finland | 1.1 | 0.8 | 0.8 | 0.8 | 0.9 | 0.8 |
| Norway | 1.2 | 0.9 | 0.9 | 0.9 | 1.0 | 0.9 |
| Ireland | 1.4 | 1.1 | 1.0 | 1.0 | 1.1 | 1.0 |
| Belgium | 1.0 | 0.8 | 0.8 | 0.8 | 0.9 | 0.8 |
| United Kingdom | 0.9 | 0.8 | 0.8 | 0.9 | 0.9 | 0.8 |
| Hungary | 0.7 | 0.6 | 0.7 | 0.8 | 0.8 | 0.6 |
| Austria | 0.8 | 0.7 | 0.8 | 0.9 | 1.0 | 0.7 |
| Estonia | 1.1 | 1.0 | 1.1 | 1.3 | 1.6 | 1.0 |
![Page 24: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/24.jpg)
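Impact on the national accuracy 6/8
Case 2: national sample size not fixed
$$n_C = \max\left\{n\,\frac{N_C}{N},\ \left(\frac{1.96}{\alpha}\right)^2\left(\frac{1}{P} - 1\right)\right\}, \qquad n_{NC} = n - n\,\frac{N_C}{N} = n\,\frac{N_{NC}}{N}$$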
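A minimal sketch of Case 2, reading the rule as: the rest of the country keeps its proportional share of the planned sample n, while the capital city receives whichever is larger, its proportional share or the minimum sample size of precision target (1). The total sample can therefore exceed n. All values, including α, are hypothetical.

```python
def case2_allocation(n, alpha, P, N_C, N):
    """National sample not fixed: the capital is topped up, the rest is untouched."""
    n_min = (1.96 / alpha) ** 2 * (1 / P - 1)
    proportional_C = n * N_C / N
    n_C = max(proportional_C, n_min)
    n_NC = n - proportional_C
    return n_C, n_NC


# Hypothetical country where the capital holds 10% of the population.
n_C, n_NC = case2_allocation(n=8000, alpha=0.10, P=0.20,
                             N_C=1_000_000, N=10_000_000)
print(f"capital: {n_C:.0f}, rest: {n_NC:.0f}, total: {n_C + n_NC:.0f}")
```

Because no units are taken away from the rest of the country, national accuracy can only stay the same or improve relative to the proportional design, at the price of a larger total sample.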
![Page 25: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/25.jpg)
Impact on the national accuracy 7/8
Table 5: Relative margin of error (%) for the national victimization rate – national sample size not fixed (Case 2)

| Country | Over-sampling, P=0.1 | Over-sampling, P=0.2 | Over-sampling, P=0.3 | Over-sampling, P=0.4 | Over-sampling, P=0.5 | No over-sampling |
|---|---|---|---|---|---|---|
| France | 5.7 | 5.8 | 5.8 | 5.8 | 5.9 | 5.9 |
| Germany | 5.4 | 5.4 | 5.5 | 5.5 | 5.6 | 5.6 |
| Switzerland | 4.8 | 4.8 | 4.9 | 4.9 | 5.0 | 5.0 |
| Italy | 5.6 | 5.6 | 5.6 | 5.7 | 5.7 | 5.8 |
| Poland | 5.0 | 5.0 | 5.1 | 5.1 | 5.2 | 5.2 |
| Netherlands | 4.2 | 4.2 | 4.3 | 4.3 | 4.4 | 4.4 |
| Portugal | 6.2 | 6.3 | 6.3 | 6.4 | 6.4 | 6.4 |
| Denmark | 4.9 | 4.9 | 5.0 | 5.2 | 5.2 | 5.2 |
| Greece | 5.5 | 5.6 | 5.7 | 5.8 | 5.8 | 5.8 |
| Spain | 6.4 | 6.5 | 6.7 | 6.9 | 6.9 | 6.9 |
| Sweden | 4.9 | 5.0 | 5.1 | 5.3 | 5.3 | 5.3 |
| Finland | 5.9 | 6.0 | 6.3 | 6.5 | 6.5 | 6.5 |
| Norway | 5.2 | 5.4 | 5.6 | 5.7 | 5.7 | 5.7 |
| Ireland | 4.3 | 4.5 | 4.6 | 4.7 | 4.7 | 4.7 |
| Belgium | 4.4 | 4.5 | 4.6 | 4.7 | 4.7 | 4.7 |
| United Kingdom | 3.6 | 3.8 | 3.9 | 3.9 | 3.9 | 3.9 |
| Hungary | 5.8 | 6.3 | 6.5 | 6.5 | 6.5 | 6.5 |
| Austria | 5.5 | 6.2 | 6.3 | 6.3 | 6.3 | 6.3 |
| Estonia | 4.0 | 4.8 | 4.9 | 4.9 | 4.9 | 4.9 |
![Page 26: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/26.jpg)
Impact on the national accuracy 8/8
Table 6: Absolute margin of error (% points) for the national victimization rate – national sample size not fixed (Case 2)

| Country | Over-sampling, P=0.1 | Over-sampling, P=0.2 | Over-sampling, P=0.3 | Over-sampling, P=0.4 | Over-sampling, P=0.5 | No over-sampling |
|---|---|---|---|---|---|---|
| France | 0.7 | 0.7 | 0.7 | 0.7 | 0.7 | 0.7 |
| Germany | 0.7 | 0.7 | 0.7 | 0.7 | 0.7 | 0.7 |
| Switzerland | 0.9 | 0.9 | 0.9 | 0.9 | 0.9 | 0.9 |
| Italy | 0.7 | 0.7 | 0.7 | 0.7 | 0.7 | 0.7 |
| Poland | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 |
| Netherlands | 0.8 | 0.8 | 0.8 | 0.8 | 0.9 | 0.9 |
| Portugal | 0.6 | 0.7 | 0.7 | 0.7 | 0.7 | 0.7 |
| Denmark | 0.9 | 0.9 | 0.9 | 1.0 | 1.0 | 1.0 |
| Greece | 0.7 | 0.7 | 0.7 | 0.7 | 0.7 | 0.7 |
| Spain | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 |
| Sweden | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 |
| Finland | 0.7 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 |
| Norway | 0.8 | 0.8 | 0.9 | 0.9 | 0.9 | 0.9 |
| Ireland | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| Belgium | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 |
| United Kingdom | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 |
| Hungary | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 | 0.6 |
| Austria | 0.6 | 0.7 | 0.7 | 0.7 | 0.7 | 0.7 |
| Estonia | 0.8 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
![Page 27: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/27.jpg)
Specific issues
• The initial difficulty is in obtaining a sampling frame appropriate for over-sampling the inhabitants of the capital cities. For the countries conducting a face-to-face survey, this should not be a serious issue. On the other hand, the countries which plan to conduct the survey by telephone might be unable to do so, unless specific phone numbers are allocated to the households in the capital city (e.g., when the first digits of a phone number represent the city code).
• Since individuals in capital cities are in general more difficult to contact, over-sampling them will necessitate more attempted contacts, which will likely imply higher costs and more time to reach the minimum sample size required for the survey.
• Finally, over-sampling might make the problem of anonymisation of the data more acute.
![Page 28: Oversampling the capital cities in the EU SAfety SUrvey (EU-SASU) Task Force on Victimization Eurostat, 17-18 February 2010 Guillaume Osier Service Central.](https://reader036.fdocuments.us/reader036/viewer/2022062518/5697bf791a28abf838c825b1/html5/thumbnails/28.jpg)
Questions for the TF
1. Is over-sampling the inhabitants of the capital cities policy-relevant? Which geographical areas might be over-sampled instead?
• NUTS2 or NUTS3 regions
• Groups of cities (like in Eurostat’s Urban Audit)
• Densely populated areas (based on degree of urbanisation)
• City areas…
2. What level of accuracy is needed for the capital cities/other geographical areas?
3. What about higher non-response?
4. What about telephone surveys?