ICCS 2009 IDB Seminar – Nov 24-26, 2010 – IEA DPC, Hamburg, Germany Training Workshop on the...

Post on 17-Jan-2018


ICCS 2009 IDB Seminar – Nov 24-26, 2010 – IEA DPC, Hamburg, Germany

Training Workshop on the ICCS 2009 database

Weights and Variance Estimation


Content of this presentation

Weights: What are they? Why do we need them? How do we use them?

Standard errors: What are they? Why do we need them? How do we estimate them?

What are weights?

Values assigned to all sampling units: students, teachers, schools.

The weight of a sampled unit indicates the number of units in the population that is represented by this sampled unit.

Weights and selection probabilities

Weights are based on the sample selection probabilities. ICCS sampling units had different selection probabilities.

A high selection probability gives a small weight; a low selection probability gives a large weight.

Weights get applied at each sampling stage: school sampling weights and within-school sampling weights.
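The relation between selection probabilities and weights can be sketched as follows. This is a minimal illustration, assuming the usual convention that a stage weight is the inverse of the selection probability at that stage; the function name and the probabilities are hypothetical and not taken from the ICCS data.

```python
def base_weight(p_school: float, p_within_school: float) -> float:
    """Return the overall base weight as the product of the stage weights."""
    if not (0 < p_school <= 1 and 0 < p_within_school <= 1):
        raise ValueError("selection probabilities must be in (0, 1]")
    school_weight = 1 / p_school                # school sampling weight
    within_school_weight = 1 / p_within_school  # within-school sampling weight
    return school_weight * within_school_weight


# Hypothetical student: the school was sampled with probability 0.05,
# the student was sampled within the school with probability 0.25.
example_weight = base_weight(0.05, 0.25)
print(round(example_weight, 6))
```

A student with a lower selection probability at either stage gets a larger weight, because that student stands for more students in the population.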

Weights and non-participation adjustments

Weights are adjusted for non-participation. Non-participation can happen at each level: the school level and the within-school level.

Non-participation adjustments compensate for losses in specific groups of sampled units.
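A non-participation adjustment can be sketched like this. It is a generic illustration, assuming the adjustment factor for a group is the number of sampled units divided by the number of participating units; the exact ICCS adjustment cells and rules are not given in these slides, and all names and numbers below are hypothetical.

```python
def adjusted_weight(base_weight: float, n_sampled: int, n_participating: int) -> float:
    """Inflate a base weight so participants also represent the non-participants."""
    if n_participating <= 0 or n_participating > n_sampled:
        raise ValueError("need 0 < n_participating <= n_sampled")
    adjustment = n_sampled / n_participating  # assumed adjustment factor
    return base_weight * adjustment


# Hypothetical group: 10 schools sampled, 8 took part, base weight 20.
example_adjusted = adjusted_weight(20.0, 10, 8)
print(example_adjusted)
```

With full participation the factor is 1 and the weight stays unchanged.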

Why do we need weights?

Weights allow conclusions to be drawn about the population based on information from the sample. Weights allow estimates of population parameters. Unweighted data only allow conclusions about the sampled units.

Population sample

Example: in the population, 20% of the students are in private schools (red) and 80% are in public schools (blue). In the sample, 50% of the students are from private schools and 50% are from public schools.

Sample estimate

In order to estimate the correct proportion of students in the population, different weights must be assigned to the students in the sample.
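The private/public example can be worked through in a few lines. This is a sketch, assuming a sample of 10 students and relative weights computed as population share divided by sample share; the sample size and variable names are illustrative only.

```python
# Shares from the example: 20% private and 80% public in the population.
population_share = {"private": 0.20, "public": 0.80}

# Assumed sample of 10 students, half from each school type.
sample = ["private"] * 5 + ["public"] * 5
n = len(sample)

# Relative weight per group: population share divided by sample share.
sample_share = {g: sample.count(g) / n for g in population_share}
weight = {g: population_share[g] / sample_share[g] for g in population_share}

unweighted_private = sample.count("private") / n
total_weight = sum(weight[g] for g in sample)
weighted_private = sum(weight[g] for g in sample if g == "private") / total_weight

print(round(unweighted_private, 3), round(weighted_private, 3))
```

The unweighted share reproduces the sample composition, while the weighted share recovers the population composition.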

Analyzing weighted data – a simple example

1:101:1

Unweighted mean

$$\bar{x} = \frac{\sum x}{n}$$

Weighted mean

$$\bar{x}_{w} = \frac{\sum (wgt \cdot x)}{\sum wgt}$$
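The two means can be compared directly in code. This is a minimal sketch; the scores and weights are made up for illustration and the function names are not part of any ICCS tool.

```python
def unweighted_mean(x):
    """Sum of the values divided by the number of values."""
    return sum(x) / len(x)


def weighted_mean(x, wgt):
    """Sum of weight times value, divided by the sum of the weights."""
    if len(x) != len(wgt):
        raise ValueError("x and wgt must have the same length")
    return sum(w * v for v, w in zip(x, wgt)) / sum(wgt)


# Hypothetical scores; the last two students carry ten times the weight.
scores = [500, 520, 480, 460]
weights = [1, 1, 10, 10]

mean_unweighted = unweighted_mean(scores)
mean_weighted = weighted_mean(scores, weights)
print(mean_unweighted, round(mean_weighted, 2))
```

The weighted mean moves toward the scores of the students with large weights, because they represent more of the population.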

Weights in ICCS

The ICCS data contain several weight variables: total student weight (TOTWGTS), total teacher weight (TOTWGTT), and total school weight (TOTWGTC).

The IDB Analyzer automatically selects the correct weight.

ICCS example: civic knowledge in Chile

Unweighted: average of 493.83

ICCS example (cont.)

Difference: 10.8 score points. Reason for the difference: over-sampling of students in private schools.

Students in private schools make up 13.7% of the tested students, but only 5.9% of the sum of weights.

What are standard errors?

The standard error of a statistic is the standard deviation of the sampling distribution of that statistic. The sampling distribution is the distribution of the statistic for all possible samples of the same size and method. Since no one can select all possible samples, the standard error can only be estimated.


Why do we need standard errors?

The ICCS results are based on samples. All ICCS results are therefore estimates of unknown population values. Standard errors can be used to measure how close these estimates are to the real values.

Standard errors and confidence intervals

Let ε stand for any statistic of interest (mean, percentage, ...). A 95% confidence interval is defined as

$$\varepsilon \pm 1.96 \cdot SE$$

This is the black bar in Table 3.10 of the ICCS International Report.

With a confidence of 95%, the true mean is between 554.3 and 563.7... Take rounding into account!
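The confidence interval formula can be sketched in code. The estimate of 559.0 and the standard error of 2.4 are assumed inputs, chosen only because they reproduce the interval quoted above after rounding; they are not copied from the report.

```python
def confidence_interval_95(estimate: float, se: float) -> tuple[float, float]:
    """Return the lower and upper bound of estimate +/- 1.96 * SE."""
    margin = 1.96 * se
    return estimate - margin, estimate + margin


# Assumed inputs, not taken from the ICCS International Report.
lower, upper = confidence_interval_95(559.0, 2.4)
print(round(lower, 1), round(upper, 1))
```

Published tables show rounded means and standard errors, so an interval recomputed from them can differ slightly in the last digit.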

Standard errors and significance tests

Whenever you see a funny little triangle in the ICCS international report, standard errors are hidden in the background.

Estimating standard errors

In a simple random sample, estimating the standard error of a statistic ε is easy.

For a mean, just divide the standard deviation of the sample (s) by the square root of the sample size (n):

$$\widehat{SE} = \frac{s}{\sqrt{n}}$$

In a complex sample design like in ICCS, it is not as easy to estimate the standard error as in a simple random sample.
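The simple-random-sample formula, the sample standard deviation divided by the square root of the sample size, can be sketched as follows. The scores are invented, and the result is valid only under the simple random sampling assumption, which does not hold for ICCS data.

```python
import math
import statistics


def srs_standard_error(values):
    """Standard error of the mean under simple random sampling: s / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    s = statistics.stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return s / math.sqrt(n)


# Hypothetical scores from a simple random sample of five students.
scores = [480, 500, 520, 510, 490]
se_srs = srs_standard_error(scores)
print(round(se_srs, 3))
```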

Complex sample design - clustering

Clustered sample: Students within a school are likely to be more similar to each other than students from different schools.

The same applies to teachers. Clustering usually decreases sampling precision.

Complex sample design – strata and weights

Stratification: Students within a stratum are more similar to each other than students from different strata. The same applies to schools and teachers. Stratification usually increases sampling precision.

Weights: Using weights usually decreases sampling precision. They complicate the calculations.

Why not just use SPSS?

Standard software packages like SPSS will not provide correct estimates for standard errors. The software assumes that the data are from a simple random sample and uses a formula that is incorrect for a complex design. Generally, the estimate will be too small.

Jackknife Repeated Replication

Solution: Jackknife Repeated Replication (JRR). Used for estimating standard errors in complex designs. Basic idea: systematically re-compute a statistic on a set of replicated samples, by setting the weights to zero for one school at a time while doubling the weights of another school.

Estimate the standard error of a statistic from the variability of that statistic between the full sample and the replicates.

The basics of the JRR method

Jackknife variance estimation in ICCS: Participating schools are paired according to the order in which they were sampled. These school pairs are called jackknife zones (variables JKZONES, JKZONET, JKZONEC).

One school in each zone is randomly assigned an indicator of 1 (0 for the other school) (variables JKREPS, JKREPT, JKREPC). This indicator decides whether a school gets its replicate weight doubled or zeroed.
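The construction of one set of replicate weights from the zone and indicator variables can be sketched as follows. This assumes that an indicator of 1 means "doubled" and 0 means "zeroed", which the slides do not state explicitly; the school records and the function name are hypothetical.

```python
def replicate_weight(weight: float, zone: int, rep: int, replicate_zone: int) -> float:
    """Weight of one unit in the replicate that belongs to `replicate_zone`."""
    if zone != replicate_zone:
        return weight  # units outside the zone keep their weight
    # Inside the zone: assumed convention, indicator 1 is doubled, 0 is zeroed.
    return 2 * weight if rep == 1 else 0.0


# Hypothetical schools in two jackknife zones.
schools = [
    {"id": "A", "weight": 10.0, "zone": 1, "rep": 1},
    {"id": "B", "weight": 10.0, "zone": 1, "rep": 0},
    {"id": "C", "weight": 12.0, "zone": 2, "rep": 1},
    {"id": "D", "weight": 12.0, "zone": 2, "rep": 0},
]

# Replicate weights for the replicate built from zone 1.
replicate_1 = [replicate_weight(s["weight"], s["zone"], s["rep"], 1) for s in schools]
print(replicate_1)
```

Repeating this for every zone gives one set of replicate weights per zone, and the statistic of interest is re-computed once with each set.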

A look inside the IDB Analyzer

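Standard error:

$$\widehat{SE} = \sqrt{\sum_{i=1}^{75} (\varepsilon_i - \varepsilon)^2}$$

where ε is the estimate from the full sample and ε_i is the estimate from replicate i.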
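The JRR standard error, computed from the variability between the full-sample estimate and the replicate estimates, can be sketched in code. This assumes the square root of the sum of squared differences over the replicates; ICCS uses 75 replicates, while the toy example below uses only two, and the function name and numbers are hypothetical.

```python
import math


def jrr_standard_error(full_estimate: float, replicate_estimates) -> float:
    """Square root of the summed squared differences between each
    replicate estimate and the full-sample estimate."""
    return math.sqrt(sum((e - full_estimate) ** 2 for e in replicate_estimates))


# Toy example: a full-sample mean of 25.0 and two replicate means.
se_jrr = jrr_standard_error(25.0, [22.5, 22.5])
print(round(se_jrr, 3))
```

If every replicate estimate equals the full-sample estimate, the estimated standard error is zero.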

ICCS example: teacher age in Chile

Standard error of the teacher age in Chile. SPSS just can’t do that:

Standard errors and plausible values

For ICCS achievement data, the standard error consists of two components.

Sampling error: this is what we just discussed.

Additionally, measurement error: resulting from the use of plausible values. This is the topic of the next presentation.

Conclusion

Use sampling weights!

Compute standard errors using the JRR!

Thank you for your attention!