Rudi Seljak Statistical Office of the Republic of Slovenia
![Page 1: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/1.jpg)
STANDARD ERRORS PRESENTATION AND DISSEMINATION AT THE STATISTICAL OFFICE OF THE REPUBLIC OF SLOVENIA
Rudi Seljak Statistical Office of the Republic of Slovenia
![Page 2: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/2.jpg)
The summary of the presentation
• Introduction
• The “old” system for standard error presentation
• Reasons for the revision
• The revised system
• Conclusions
![Page 3: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/3.jpg)
Introduction
• Sample surveys are still the most commonly used means of collecting data for official statistics.
• The precision of statistical results remains a very important quality criterion.
• National statistical institutes have an important obligation to estimate sampling errors and to disseminate and present them to users in a transparent, clearly readable form.
![Page 4: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/4.jpg)
The “old” system at SORS
• Standard errors of the estimated statistical results were very rarely published explicitly.
• Estimated coefficients of variation (CV) were used to determine the degree of precision.
• Four degrees of precision were defined:
– Estimates with a sufficient degree of precision (CV < 10%). Value disseminated with no denotation.
– Less precise estimates (10% ≤ CV < 15%). Value disseminated in single brackets.
– Imprecise estimates (15% ≤ CV ≤ 30%). Value disseminated in double brackets.
– Extremely imprecise estimates (CV > 30%). Value not published but replaced with the sign “.”.
![Page 5: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/5.jpg)
The “old” system – example (LFS)
![Page 6: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/6.jpg)
Reasons for the revision
• Brackets are not the best choice of denotation sign for dissemination through electronic means.
• Denoting only the degree of precision is not sufficient for the needs of some more demanding users.
• Defining the degrees of precision by the coefficient of variation alone is not appropriate for some (non-dimensional) statistics (e.g. proportions).
![Page 7: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/7.jpg)
“CV criteria” problem
• Applying the CV criteria directly to non-dimensional statistics can cause serious inconsistencies.
• Suppose we want to estimate the proportion of units with a certain characteristic in a large population.
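$\hat{p} = 0.01$, $cv(\hat{p}) = 10.5\%$, disseminated value: (0.01) [sample size garbled in source: “95n”]
• Disseminating q = 1 − p instead:
$\hat{q} = 0.99$, $cv(\hat{q}) = 0.1\%$, disseminated value: 0.99 [sample size garbled in source: “95n”]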
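The asymmetry can be checked numerically. The sketch below assumes simple random sampling without a finite population correction, and uses a hypothetical sample size of 9000, chosen only because it roughly reproduces the CVs quoted on the slide; the function name `cv_of_proportion` is illustrative and not part of the SORS system.

```python
import math


def cv_of_proportion(p: float, n: int) -> float:
    """CV of an estimated proportion, assuming simple random sampling (no fpc)."""
    se = math.sqrt(p * (1.0 - p) / n)  # the standard error is the same for p and 1 - p
    return se / p                      # the CV divides by the estimate itself


n = 9000   # hypothetical sample size (assumption, not taken from the slides)
p = 0.01
cv_p = cv_of_proportion(p, n)
cv_q = cv_of_proportion(1.0 - p, n)
print(f"cv(p) = {cv_p:.1%}, cv(q) = {cv_q:.1%}")
```

The two estimates carry exactly the same standard error, yet under the old CV limits one would be bracketed and the other published without any denotation.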
![Page 8: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/8.jpg)
CV- p:q differences
![Page 9: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/9.jpg)
The revised system
• Standard errors can be disseminated explicitly or by using denotations for the different degrees of precision.
– Denotations for releases with a small number of results (e.g. First Release) and for general releases (e.g. Statistical Yearbook).
– Explicit dissemination for more exhaustive and domain-oriented releases.
• Three degrees of precision:
– Estimates of acceptable precision → published without limitations
– Less precise estimates → flagged for caution with the letter M
– Estimates too imprecise to be published → suppressed and replaced with the letter N
![Page 10: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/10.jpg)
The revised system cont’d
• For totals or means of positive, continuous variables, the limits are determined directly from the coefficient of variation.
– CV < 10% → estimate of acceptable precision → published without denotation
– 10% ≤ CV ≤ 30% → less precise estimate → flagged for caution with the letter M
– CV > 30% → estimate too imprecise to be published → suppressed and replaced with the letter N
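The CV rule for totals and means can be sketched as a small function. The thresholds come from the slide; the function names and the way the letter M is attached to the published value are assumptions made for illustration only.

```python
def flag_by_cv(cv: float) -> str:
    """Return '' (acceptable), 'M' (less precise) or 'N' (suppressed) for a CV given as a fraction."""
    if cv < 0.10:
        return ""
    if cv <= 0.30:
        return "M"
    return "N"


def disseminated_value(estimate: float, cv: float) -> str:
    """Format an estimate for release; the output layout is a hypothetical choice."""
    flag = flag_by_cv(cv)
    if flag == "N":
        return "N"                     # value suppressed, only the letter is shown
    return f"{estimate:g} {flag}".strip()


print(disseminated_value(1250.0, 0.04))
print(disseminated_value(1250.0, 0.18))
print(disseminated_value(1250.0, 0.45))
```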
![Page 11: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/11.jpg)
Limits for the proportions
• The “limit standard errors” are calculated for the case p = 0.5 by using the CV criteria. These limits are then used for every value of the estimated proportion.
[Figure: axis of the proportion p from 0 to 1, with p = 0.5 marked]
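As a worked step, and assuming the CV limits of 10% and 30% used for totals and means are the ones applied here, the limit standard errors follow from $se = cv \cdot p$ evaluated at $p = 0.5$:

$$
se_{\text{lim},1} = 0.10 \times 0.5 = 0.05, \qquad se_{\text{lim},2} = 0.30 \times 0.5 = 0.15 .
$$

Because $se(\hat{p}) = se(1 - \hat{p})$, limits expressed on the standard error treat a proportion and its complement identically, which the CV does not.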
![Page 12: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/12.jpg)
Limits for the proportions cont’d
• se < 0.05 → estimate of acceptable precision → published without denotation
• 0.05 ≤ se ≤ 0.15 → less precise estimate → flagged for caution with the letter M
• se > 0.15 → estimate too imprecise to be published → suppressed and replaced with the letter N
• For the number of units with a certain characteristic, the limits are derived from the limits above by using the formula se(N·p) = N·se(p).
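A minimal sketch of these rules follows. The limits 0.05 and 0.15 and the relation se(N·p) = N·se(p) are from the slide; the function names, the constant names and the example population size are hypothetical.

```python
SE_ACCEPTABLE = 0.05    # below this limit: published without denotation
SE_PUBLISHABLE = 0.15   # above this limit: suppressed


def flag_proportion(se: float) -> str:
    """Flag an estimated proportion by its standard error."""
    if se < SE_ACCEPTABLE:
        return ""
    if se <= SE_PUBLISHABLE:
        return "M"
    return "N"


def flag_count(se_count: float, population_size: float) -> str:
    """Flag an estimated number of units; se(N*p) = N*se(p), so rescale by N."""
    return flag_proportion(se_count / population_size)


N = 100_000  # hypothetical population size
print(repr(flag_proportion(0.001)))   # same result for p = 0.01 and q = 0.99
print(repr(flag_count(12_000, N)))    # se of the proportion is 0.12
print(repr(flag_count(20_000, N)))    # se of the proportion is 0.20
```

Dividing the standard error of the count by N is equivalent to multiplying the proportion limits by N, so only one pair of thresholds has to be maintained.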
![Page 13: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/13.jpg)
The revised system – example (Tourism Travels of Domestic Population)
![Page 14: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/14.jpg)
Application for standard error calculation
• To enable standardized and transparent calculation of standard errors, a special SAS application was built.
• The application performs aggregation and standard error calculation and, if needed, also denotes the results with the special signs.
• The application is designed as a metadata-driven system.
• So far, the application can calculate standard errors for five types of statistics.
![Page 15: Rudi Seljak Statistical Office of the Republic of Slovenia](https://reader035.fdocuments.us/reader035/viewer/2022062422/56812d53550346895d9258d5/html5/thumbnails/15.jpg)
Conclusions
• A system for denoting the different degrees of less precise estimates, based on the coefficient of variation, was used at SORS for many years.
• A critical review pointed out the need for a theoretical and technical revision of the system.
• The main new features of the revised system:
– Denotation with signs as well as explicit dissemination of standard errors can be used.
– The “bracket signs” are replaced with “letter signs”.
– Different criteria are used for different types of statistics.
– A metadata-driven application was built, which enables a standardized process of standard error calculation.