Annika Lindblom Alex Teterukovsky Statistics Sweden


Page 1

Annika Lindblom

Alex Teterukovsky

Statistics Sweden

On coordination of stratified Pareto πps and simple random samples

Page 2

The paper focuses on:

• Presentation of the sampling designs in the SAMU: stratified SRS and Pareto πps

• Sample co-ordination, in particular the implementation of the Pareto πps design

• Overlap between πps and SRS samples:

- theoretical findings for SRS

- empirical findings based on surveys in practice

Page 3

The SAMU

• A system for co-ordination of frame populations and samples from the Business Register at Statistics Sweden since 1972

• Three main objectives:

- obtain comparable statistics

- ensure high precision in estimates of change over time

- spread the response burden

Page 4

Inclusion probabilities

Frame population divided into H disjoint strata Uh, h = 1, ..., H, where Uh contains Nh units. A sample of fixed size nh is to be drawn from each Uh. Inclusion probabilities are:

Stratified SRS:

$$\pi_k = \frac{n_h}{N_h}$$

same for all units within a stratum

(Pareto) πps:

$$\lambda_k = \frac{n_h x_k}{\sum_{j=1}^{N_h} x_j}$$

unique for each unit, xk size measure for unit k
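As a rough illustration of the two formulas above, the following Python sketch computes both kinds of inclusion probability for a single stratum. The function names and the size measures are hypothetical and not taken from the SAMU.

```python
def srs_inclusion_probability(n_h, N_h):
    """Stratified SRS: every unit in the stratum gets n_h / N_h."""
    return n_h / N_h


def pps_inclusion_probabilities(x, n_h):
    """Size-proportional probabilities: n_h * x_k / sum of all x_j in the stratum."""
    total = sum(x)
    return [n_h * x_k / total for x_k in x]


# Hypothetical stratum with four units and a sample of size 2.
sizes = [10, 20, 30, 40]
lam = pps_inclusion_probabilities(sizes, n_h=2)
pi = srs_inclusion_probability(n_h=2, N_h=len(sizes))
```

In both designs the probabilities within a stratum add up to the sample size nh; the designs differ only in how that total is shared among the units.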

Page 5

Permanent random numbers

• To each unit k in the Business Register, a permanent random number uk, uniformly distributed over the interval (0,1), is attached

• For SRS we choose the starting point and the direction, and sample the necessary number of units:

Different blocks
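A minimal sketch of how such a selection could work, assuming the interval (0,1) is treated as a circle so that sampling continues past the end point. The function name and the toy random numbers are illustrative only and do not describe the actual SAMU implementation.

```python
def prn_srs_sample(prn, n_h, start=0.0, direction="right"):
    """Pick the n_h units met first when moving from `start` in `direction`.

    prn maps a unit id to its permanent random number in (0, 1).
    """
    def distance(u):
        # Distance travelled from the starting point to u, wrapping at 1.
        if direction == "right":
            return (u - start) % 1.0
        return (start - u) % 1.0

    ranked = sorted(prn, key=lambda k: distance(prn[k]))
    return ranked[:n_h]


# Hypothetical stratum of five units.
prn = {"A": 0.05, "B": 0.31, "C": 0.48, "D": 0.72, "E": 0.93}
block_1 = prn_srs_sample(prn, 2, start=0.0, direction="right")
block_2 = prn_srs_sample(prn, 2, start=1.0, direction="left")
```

With these numbers the two blocks select different units, which is how placing surveys in different blocks spreads the response burden.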

Page 6

Pareto πps sampling procedure

1. Compute the desired inclusion probabilities within each stratum:

$$\lambda_k = \frac{n_h x_k}{\sum_{j=1}^{N_h} x_j}$$

2. If $\lambda_k > 1$ then unit k is sampled with probability 1

3. For other units calculate the ranking variable:

$$q_k = \frac{u_k (1 - \lambda_k)}{\lambda_k (1 - u_k)}$$

4. The sample consists of the units with the nh smallest q-values within stratum h
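The four steps can be sketched in Python as follows. Two points are assumptions that go beyond the slide: the probabilities of the remaining units are recomputed after certainty units are removed, and units with a probability of exactly 1 are also taken with certainty to avoid dividing by zero. All size measures are assumed to be positive, and the names and data are hypothetical.

```python
def pareto_pps_sample(x, u, n_h):
    """Pareto sample of n_h units from one stratum.

    x maps unit id -> size measure, u maps unit id -> permanent random number.
    """
    remaining = dict(x)
    certain = []
    while True:
        # Steps 1-2: desired probabilities; move units with lam >= 1 to the sample.
        n_left = n_h - len(certain)
        total = sum(remaining.values())
        lam = {k: n_left * x_k / total for k, x_k in remaining.items()}
        take_all = [k for k, l in lam.items() if l >= 1]
        if not take_all:
            break
        certain.extend(take_all)
        for k in take_all:
            del remaining[k]
    # Step 3: ranking variable for the units not taken with certainty.
    q = {k: u[k] * (1 - lam[k]) / (lam[k] * (1 - u[k])) for k in remaining}
    # Step 4: the smallest q-values fill the rest of the sample.
    ranked = sorted(remaining, key=q.get)
    return certain + ranked[:n_h - len(certain)]


# Hypothetical stratum: unit A is large enough to be taken with certainty.
x = {"A": 500, "B": 40, "C": 30, "D": 20, "E": 10}
u = {"A": 0.50, "B": 0.90, "C": 0.30, "D": 0.20, "E": 0.05}
sample = pareto_pps_sample(x, u, n_h=3)
```

Here unit A enters with certainty, and the two remaining places go to the units with the smallest q-values among B to E.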

Page 7

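Pareto and starting points

Random number transformation is necessary.

The objective of the transformation is to select the nh units with the smallest q-values within a stratum h, independently of which starting point S is chosen.

Transform uk into zk as follows.

Sampling direction right:

$$z_k = \mathrm{Mod}(u_k - S, 1)$$

Sampling direction left:

$$z_k = \mathrm{Mod}(S - u_k, 1)$$

The ranking variable is then computed from zk, with $\lambda_k$ the desired inclusion probability of unit k:

$$q_k = \frac{z_k (1 - \lambda_k)}{\lambda_k (1 - z_k)}$$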
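A small sketch of the transformation, assuming Mod returns the non-negative remainder in [0, 1), which is how Python's `%` operator behaves for a positive divisor. The function names are hypothetical.

```python
def transform_prn(u_k, start, direction):
    """Map the permanent random number u_k to z_k for a given block."""
    if direction == "right":
        return (u_k - start) % 1.0
    return (start - u_k) % 1.0


def pareto_rank(u_k, lam_k, start=0.0, direction="right"):
    """Ranking variable q_k computed from the transformed number z_k."""
    z_k = transform_prn(u_k, start, direction)
    return z_k * (1 - lam_k) / (lam_k * (1 - z_k))


z_right = transform_prn(0.25, 0.5, "right")  # wraps around past 1
z_left = transform_prn(0.25, 1.0, "left")
```

For the block starting at 0.0 with direction right, zk equals uk, so the ranking variable reduces to the untransformed form.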

Page 8

Coordination and overlap

• Theoretical. SRS/SRS.

• Empirical. SRS/SRS.

• Empirical. Pareto/Pareto.

• Empirical. SRS/Pareto. Same surveys.

• Empirical. SRS/Pareto. Different surveys.

• Empirical. SRS/SRS and Pareto/Pareto over time.

Page 9

Theoretical overlap. SRS/SRS.

Coordinate 2 equal SRS samples (h is stratum):

sample sizes nh

frame population Nh

completely enumerated units mh

What is the expected overlap for all starting points a and b?

GEOMETRY!

Page 10

Theoretical overlap. SRS/SRS.

Type    Condition                      Expected overlap
Ia      b-a < Rh < 1-(b-a)             nh - (b-a)(Nh-mh)
IIa     1-(b-a) < Rh < b-a             nh + (b-a-1)(Nh-mh)
IIIa    max(b-a, 1-(b-a)) < Rh         2nh - Nh
IVa     Rh < min(b-a, 1-(b-a))         mh

$$R_h = \frac{n_h - m_h}{N_h - m_h}$$

Page 11

Theoretical overlap. SRS/SRS.

Type    Condition                         Expected overlap
Ib      0.5(b-a) < Rh < b-a               2nh - (b-a)Nh - (1-b+a)mh
IIb     b-a < Rh < 0.5(1+(b-a))           mh + (b-a)(Nh-mh)
IIIb    max(b-a, 0.5(1+(b-a))) < Rh       2nh - Nh
IVb     Rh < 0.5(b-a)                     mh
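The two tables of expected overlap can be collected in one function for a single stratum. The sketch below reads the type a rows as two blocks sampled in the same direction and the type b rows as a block sampled to the right from a and a block sampled to the left from b, with a ≤ b. This reading follows from the geometry on the unit circle and is an assumption, since the slides do not state it. Rh is taken as (nh - mh)/(Nh - mh), and the function name is hypothetical.

```python
def expected_overlap(n_h, N_h, m_h, a, b, same_direction=True):
    """Expected number of common units in one stratum for two equal SRS samples."""
    d = b - a                         # distance between the starting points
    R = (n_h - m_h) / (N_h - m_h)     # sampling fraction outside the take-all part
    if same_direction:
        if R >= max(d, 1 - d):        # type IIIa
            return 2 * n_h - N_h
        if R <= min(d, 1 - d):        # type IVa
            return m_h
        if d <= R:                    # type Ia
            return n_h - d * (N_h - m_h)
        return n_h + (d - 1) * (N_h - m_h)              # type IIa
    if R <= 0.5 * d:                  # type IVb
        return m_h
    if R <= d:                        # type Ib
        return 2 * n_h - d * N_h - (1 - d) * m_h
    if R <= 0.5 * (1 + d):            # type IIb
        return m_h + d * (N_h - m_h)
    return 2 * n_h - N_h              # type IIIb


# Hypothetical stratum: 100 units, 10 completely enumerated, sample of 30.
same = expected_overlap(30, 100, 10, a=0.0, b=0.1)
opposite = expected_overlap(30, 100, 10, a=0.0, b=0.3, same_direction=False)
```

At the boundaries between types the neighbouring formulas give the same value, so it does not matter to which type a boundary case is assigned. For a whole survey the stratum values are summed.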

Page 12

Empirical overlap. SRS/SRS.

Wages and salaries, private sector (stratified by number of employees)

N = 66 083, n = 7 497. SRS: 1 732 completely enumerated (23% of the sample)

SAMU blocks (Start Direction)    Actual        Expected
                                 Units    %    Units    %
0.0 Right / 1.0 Left             2 258   30    2 258   30
0.0 Right / 0.2 Right            3 111   41    3 103   41
0.0 Right / 0.7 Right            2 613   35    2 616   35
0.0 Right / 0.7 Left             2 956   39    2 927   39
0.7 Right / 0.7 Left             2 258   30    2 258   30

Turnover in retail trade (stratified by turnover)

N = 31 732, n = 2 405. SRS: 246 completely enumerated (10% of the sample)

SAMU blocks (Start Direction)    Actual        Expected
                                 Units    %    Units    %
0.0 Right / 1.0 Left               444   18      444   18
0.0 Right / 0.2 Right              657   27      639   27
0.0 Right / 0.7 Right              549   23      521   22
0.0 Right / 0.7 Left               611   25      606   25
0.7 Right / 0.7 Left               444   18      444   18
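The percentages on this page appear to be overlap counts divided by the sample size n, rounded to whole percent. The short check below recomputes a few of them from the figures above. The helper name is hypothetical.

```python
def percent_of_sample(units, n):
    """Share of the sample in whole percent."""
    return round(100 * units / n)


n_wages, n_retail = 7497, 2405

wages_take_all = percent_of_sample(1732, n_wages)    # completely enumerated units
wages_overlap = percent_of_sample(3111, n_wages)     # blocks 0.0 Right / 0.2 Right
retail_take_all = percent_of_sample(246, n_retail)
retail_expected = percent_of_sample(521, n_retail)   # blocks 0.0 Right / 0.7 Right
```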

Page 13

Empirical overlap. Pareto/Pareto.

Wages and salaries, private sector (stratified by number of employees)

N = 66 083, n = 7 497. Pareto: 1 659 completely enumerated. SRS: 1 732 completely enumerated.

SAMU blocks (Start Direction)    Pareto        SRS
                                 Units    %    Units    %
0.0 Right / 1.0 Left             2 265   30    2 258   30
0.0 Right / 0.2 Right            3 005   40    3 111   41
0.0 Right / 0.7 Right            2 594   35    2 613   35
0.0 Right / 0.7 Left             2 814   38    2 956   39
0.7 Right / 0.7 Left             2 299   31    2 258   30

Turnover in retail trade (stratified by turnover)

N = 31 732, n = 2 405. Pareto: 287 completely enumerated. SRS: 246 completely enumerated.

SAMU blocks (Start Direction)    Pareto        SRS
                                 Units    %    Units    %
0.0 Right / 1.0 Left               466   19      444   18
0.0 Right / 0.2 Right              705   29      657   27
0.0 Right / 0.7 Right              566   24      549   23
0.0 Right / 0.7 Left               613   25      611   25
0.7 Right / 0.7 Left               461   19      444   18

Page 14

Empirical overlap. SRS/Pareto. Same surveys.

Use of Information and Communication in Enterprises (stratified by number of employees)

N = 29 124, n = 4 355. SRS: 1 214 completely enumerated. Pareto: 1 116 completely enumerated.

SAMU block (Start Direction)     Overlap SRS/Pareto
                                 Units    %
0.0 Right                        3 593   83
0.2 Right                        3 595   83
0.7 Left                         3 605   83
0.7 Right                        3 601   83
1.0 Left                         3 559   82

Stock of goods in the Wholesale and Retail Trade (optimally stratified by turnover)

N = 9 322, n = 1 753. SRS: 382 completely enumerated. Pareto: 402 completely enumerated.

SAMU block (Start Direction)     Overlap SRS/Pareto
                                 Units    %
0.0 Right                        1 418   81
0.2 Right                        1 406   80
0.7 Left                         1 397   80
0.7 Right                        1 383   79
1.0 Left                         1 405   80

Page 15

Empirical overlap. SRS/Pareto. Different surveys.

Overlap between two different surveys (Inf and Sto) placed in different blocks

N (Inf) = 29 124, N (Sto) = 9 322

n (Inf) = 4 355, n (Sto) = 1 753

Completely enumerated: SRS (Inf): 1 214, Pareto (Inf): 1 116, SRS (Sto): 382, Pareto (Sto): 402

SAMU blocks (Survey Start Direction)    Pareto        SRS
                                        Units    %    Units    %
Inf 0.0 Right / Sto 1.0 Left              241   14      200   11
Inf 0.0 Right / Sto 0.2 Right             359   21      348   20
Inf 0.0 Right / Sto 0.7 Right             263   15      215   12
Inf 0.0 Right / Sto 0.7 Left              275   16      244   14
Inf 0.7 Right / Sto 0.7 Left              235   13      188   11

SAMU blocks (Survey Start Direction)    Pareto        SRS
                                        Units    %    Units    %
Sto 0.0 Right / Inf 1.0 Left              222   13      177   10
Sto 0.0 Right / Inf 0.2 Right             250   14      202   12
Sto 0.0 Right / Inf 0.7 Right             290   17      284   16
Sto 0.0 Right / Inf 0.7 Left              262   15      214   12
Sto 0.7 Right / Inf 0.7 Left              240   14      196   11

Page 16

Empirical. Overlap over time.

Use of Information and Communication in Enterprises (stratified by number of employees)

SAMU block (Start Direction)     Pareto        SRS
                                 Units    %    Units    %
0.0 Right                        3 560   82    3 428   79
0.2 Right                        3 564   82    3 432   79
0.7 Left                         3 584   82    3 483   80
0.7 Right                        3 590   82    3 455   79
1.0 Left                         3 553   82    3 411   78

Stock of goods in the Wholesale and Retail Trade (optimally stratified by turnover)

SAMU block (Start Direction)     Pareto        SRS
                                 Units    %    Units    %
0.0 Right                        1 352   77    1 305   74
0.2 Right                        1 322   75    1 288   74
0.7 Left                         1 337   76    1 313   75
0.7 Right                        1 342   77    1 327   76
1.0 Left                         1 316   75    1 310   75