Session 10

Transcript of Session 10

Page 1: Session 10

Session 10

Sampling Weights: an appreciation

Page 2: Session 10

To provide you with an overview of the role of sampling weights in estimating population parameters

To demonstrate computation of sampling weights for a simple scenario

To highlight difficulties in calculating sampling weights for complex survey designs and the need to seek professional expertise for this purpose

To learn about file merging and continue with the on-going project work

Session Objectives

Page 3: Session 10

Real surveys are generally multi-stage

At each stage, probabilities of selecting units at that stage are not generally equal

When a population parameter such as a mean or proportion is to be estimated, results from lower levels need to be scaled up from the sample to the population

This scaling-up factor, applied to each unit in the sample, is called its sampling weight.

What are sampling weights?

Page 4: Session 10

Suppose, for example, that a simple random sample of 500 households (HHs) in a rural district (having 7349 HHs in total) showed that 140 were living below the poverty line

Hence the estimated total in the population living below the poverty line = (140/500)*7349 ≈ 2058

The data for each HH were a 0/1 variable, 1 being allocated if the HH was below the poverty line.

Multiplying this variable by 7349/500 ≈ 14.7 and summing would lead to the same answer.

i.e. the sampling weight for each HH ≈ 14.7

A simple example
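The arithmetic on this slide can be checked with a short Python sketch. It uses only the figures quoted above; the variable names are illustrative and not part of the original material.

```python
# Figures from the slide: 500 sampled households out of 7349, 140 below the poverty line.
N_HOUSEHOLDS = 7349
SAMPLE_SIZE = 500
N_POOR_IN_SAMPLE = 140

# One 0/1 indicator per sampled household (the order is irrelevant here).
indicators = [1] * N_POOR_IN_SAMPLE + [0] * (SAMPLE_SIZE - N_POOR_IN_SAMPLE)

# Under simple random sampling every household has the same weight N/n.
weight = N_HOUSEHOLDS / SAMPLE_SIZE

# Weighted sum of the indicator = estimated number of households below the poverty line.
estimated_total = sum(weight * y for y in indicators)

print(round(weight, 3))        # 14.698
print(round(estimated_total))  # 2058
```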

Page 5: Session 10

The above was a trivial example with equal probabilities of selection

In general, units in the sample have very different probabilities of selection

To allow for unequal probabilities of selection, each unit is weighted by the reciprocal of its probability of selection

Thus sampling weight = 1/(probability of selection)

Why are weights needed?
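A minimal Python sketch of weighting by the reciprocal of the selection probability follows. The three observations and their probabilities are hypothetical values chosen purely for illustration, and `sampling_weight` is an illustrative helper, not a function from any package.

```python
def sampling_weight(prob_selection: float) -> float:
    """Return the sampling weight 1/p for a unit selected with probability p."""
    if not 0 < prob_selection <= 1:
        raise ValueError("probability of selection must be in (0, 1]")
    return 1.0 / prob_selection


# Hypothetical sample: (observed value, probability of selection) for each unit.
sample = [(12.0, 0.5), (30.0, 0.1), (7.0, 0.25)]

# Each observation is scaled up by its own weight before summing.
estimated_total = sum(y * sampling_weight(p) for y, p in sample)

print(estimated_total)  # 12*2 + 30*10 + 7*4 = 352.0
```

Units with a small probability of selection stand in for many population units, so they receive a large weight.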

Page 6: Session 10

Consider a conveniently rectangular forest with a river running down the middle, dividing the forest into Region 1 and Region 2.

Region 1 is divided into 96 strips, each 50m x 50m, while Region 2 is divided into 72 strips.

Data are the number of small trees and the number of large trees in each strip.

Aim: To find the total number of large trees, the total number of small trees, and hence the total number of trees in the forest.

An example

Page 7: Session 10

Each region can be regarded as a stratum: 8 strips were chosen from region 1 and 6 from region 2.

The mean numbers of large trees per strip were:

97.875 in region 1, based on n1 = 8

83.500 in region 2, based on n2 = 6

Hence the total number of large trees in the forest can be estimated as (96*97.875) + (72*83.5) = 15408

So what are the sampling weights used for each unit (strip)?

Weights in stratified sampling
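The stratified estimate of the total can be reproduced with the sketch below. It uses only the numbers quoted on the slides; the dictionary layout and key names are illustrative choices.

```python
# Figures from the slides: strips per region (N), strips sampled (n),
# and mean number of large trees per sampled strip.
strata = {
    "region 1": {"N": 96, "n": 8, "mean": 97.875},
    "region 2": {"N": 72, "n": 6, "mean": 83.5},
}

# Estimated total per stratum = (strips in the stratum) x (sample mean per strip).
stratum_totals = {name: s["N"] * s["mean"] for name, s in strata.items()}
forest_total = sum(stratum_totals.values())

print(stratum_totals)  # {'region 1': 9396.0, 'region 2': 6012.0}
print(forest_total)    # 15408.0
```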

Page 8: Session 10

The sampling weights are the same for all strips, whether in region 1 or region 2. Why is this?

What are the probabilities of selection here?

In region 1, each unit is selected with prob = 8/96 = 1/12

In region 2, each unit is selected with prob = 6/72 = 1/12

A design where probabilities of selection are equal for all selected units is called a self-weighting design.

Regarding the sample as a simple random sample then gives us the correct mean.

Self-weighting
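As a check on the forest example, the sketch below computes the probability of selection and the sampling weight in each region with exact fractions, then applies the weights to the sample sums. The sample sums are reconstructed from the quoted means (8 x 97.875 and 6 x 83.5), since the strip-level counts are not given in the slides.

```python
from fractions import Fraction

# (strips in region, strips sampled), as given on the slides.
regions = {"region 1": (96, 8), "region 2": (72, 6)}

# Probability of selection n/N and sampling weight N/n, kept exact with Fraction.
probs = {name: Fraction(n, N) for name, (N, n) in regions.items()}
weights = {name: 1 / p for name, p in probs.items()}

# The design is self-weighting when all units share one probability of selection.
is_self_weighting = len(set(probs.values())) == 1

# Sample sums of large trees, reconstructed from the stratum means on the slides.
sample_sums = {"region 1": 8 * 97.875, "region 2": 6 * 83.5}
total = sum(float(weights[r]) * sample_sums[r] for r in regions)

print(probs["region 1"], probs["region 2"])      # 1/12 1/12
print(weights["region 1"], weights["region 2"])  # 12 12
print(is_self_weighting)                         # True
print(total)                                     # 15408.0
```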

Page 9: Session 10

It is easy to see that the estimated mean number of large trees per strip in the forest is [(96/168)*97.875] + [(72/168)*83.5] = 91.71

Regarding the 14 observations as though they were drawn as a simple random sample gives 91.71, i.e. the same answer.

The results for variances, however, differ:

Variance of stratified sample mean = 1.28

Variance of mean ignoring stratification = 2.18

Results for means
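The agreement between the two means can be verified from the stratum means alone, as in the sketch below. The two variances quoted on the slide cannot be reproduced here, because they depend on the strip-level counts, which the slides do not give.

```python
# Figures from the slides.
N1, n1, mean1 = 96, 8, 97.875
N2, n2, mean2 = 72, 6, 83.5

# Stratified mean: stratum means weighted by each stratum's share of the population.
stratified_mean = (N1 * mean1 + N2 * mean2) / (N1 + N2)

# Mean of the 14 observations treated as one simple random sample.
# The sample sums are recovered as n * mean in each region.
pooled_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)

print(round(stratified_mean, 2))  # 91.71
print(round(pooled_mean, 2))      # 91.71
```

The two agree because n1/(n1 + n2) = 8/14 equals N1/(N1 + N2) = 96/168, which is what a self-weighting design guarantees.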

Page 10: Session 10

It is important to note that the weights used in computing a mean, i.e. (96/168)*(1/8) = 1/14 for strips in region 1, and (72/168)*(1/6) = 1/14 for strips in region 2, are not sampling weights

Sampling weights refer to the multiplying factor when estimating a total.

Essentially they represent the number of elements in the population that an individual sampling unit represents.

More on weights
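One way to see the distinction, sketched below for the forest example, is that the sampling weights add up to the number of strips in the population, whereas the weights used for the mean are the same weights rescaled to add up to 1. The list layout is an illustrative choice.

```python
from fractions import Fraction

# Sampling weight N/n for each of the 8 + 6 sampled strips (figures from the slides).
sampling_weights = [Fraction(96, 8)] * 8 + [Fraction(72, 6)] * 6

# Sampling weights sum to the number of strips in the forest.
total_weight = sum(sampling_weights)

# Weights for a mean are the sampling weights rescaled to sum to 1.
mean_weights = [w / total_weight for w in sampling_weights]

print(total_weight)                       # 168
print(mean_weights[0], mean_weights[-1])  # 1/14 1/14
print(sum(mean_weights))                  # 1
```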

Page 11: Session 10

Weights are also used to deal with non-response and missing values

If measurements are not available on all units for some reason, the sampling weights may be re-computed to allow for this.

e.g. In conducting the Household Budget Survey 2000/2001 in Tanzania, not all rural areas planned in the sampling scheme were visited. As a result, sampling weights had to be re-calculated and used in the analysis.

Other uses of weights
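A minimal sketch of one simple re-weighting approach is given below, using the forest example with the hypothetical assumption that only 7 of the 8 strips selected in region 1 could be measured. It assumes the unmeasured strip is similar to the measured ones in its stratum. This is not necessarily the method used in the Tanzanian survey, and `adjusted_weight` is an illustrative helper.

```python
def adjusted_weight(n_population: int, n_responding: int) -> float:
    """Weight when only n_responding of the sampled units in a stratum yield data."""
    if n_responding <= 0:
        raise ValueError("at least one responding unit is needed")
    return n_population / n_responding


# Planned design weight in region 1: 96 strips, 8 selected.
design_weight = 96 / 8

# Hypothetical scenario: one selected strip could not be measured.
reweighted = adjusted_weight(96, 7)

# Factor by which each remaining strip's weight is inflated (8/7).
adjustment_factor = reweighted / design_weight

print(design_weight)                # 12.0
print(round(reweighted, 3))         # 13.714
print(round(adjustment_factor, 3))  # 1.143
```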

Page 12: Session 10

The general approach is to find the probability of selecting a unit at every stage of the sample selection process

e.g. in a 3-stage design, three sets of probabilities will result

The probability of selecting each final-stage unit is then the product of these three probabilities

The reciprocal of the above probability is then the sampling weight

Computation of weights
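The steps above can be sketched in Python as follows. The three-stage design and its selection fractions are hypothetical, chosen only to illustrate the calculation, and `multistage_weight` is an illustrative helper.

```python
from fractions import Fraction
from math import prod


def multistage_weight(stage_probs):
    """Sampling weight = reciprocal of the product of the stage-wise probabilities."""
    overall_prob = prod(stage_probs)
    if not 0 < overall_prob <= 1:
        raise ValueError("overall probability of selection must be in (0, 1]")
    return 1 / overall_prob


# Hypothetical three-stage design:
#   stage 1: 5 of 20 districts, stage 2: 4 of 40 villages, stage 3: 10 of 200 households.
stage_probs = [Fraction(5, 20), Fraction(4, 40), Fraction(10, 200)]

overall_prob = prod(stage_probs)
weight = multistage_weight(stage_probs)

print(overall_prob)  # 1/800
print(weight)        # 800
```

In a real survey the stage-wise probabilities usually differ from unit to unit, so the weight has to be computed separately for each final-stage unit.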

Page 13: Session 10

Standard methods, as illustrated in textbooks on sampling, often do not apply in real surveys

Complex sampling designs are common

Computing correct probabilities of selection can then be very challenging

Usually professional assistance is needed to determine the correct sampling weights and to use them correctly in the analysis

Difficulties in computations

Page 14: Session 10

When analysing data from complex survey designs, it is important to check that the software can deal with sampling weights

Packages such as Stata, SAS and Epi Info have facilities for dealing with sampling weights

However, you need to be careful that the approaches used are appropriate for your own survey design

Note: The above discussion was aimed at providing you with an overview of sampling weights. See the next slide for the work for the remainder of this session.

Software for dealing with weights

Page 15: Session 10

To understand how files may be merged, work through sections 10.5 and 10.6 of the Stata Guide.

Now move to your project work and practise file merging to address objectives 4 and 5 of your task.

A description of the work you should undertake is provided in the handout titled Practical 10.

Practical work