Estimation of Pareto Distribution Functions from Samples Contaminated by Measurement Errors
Presenter: Lwando Kondlo
Supervisor: Prof. C. Koen
SKA Postgrad Bursary Conference
December 5, 2009
The model for a variable X measured with error is $Y = X + \varepsilon$, where Y is the observed value and $\varepsilon$ is the measurement error.
Estimation of the density/distribution function of X is often important.
This is a classical deconvolution problem. The specific case where X has a Pareto form is discussed.
Pareto distribution: a model for positive data. Examples include:
◦ The distribution of income and wealth among individuals
◦ The masses of molecular clouds
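The Finite-Support Pareto distribution (FSPD) is
$$g(x) = \frac{a\,x^{-(a+1)}}{L^{-a} - U^{-a}}, \qquad L \le x \le U,$$
with $g(x) = 0$ elsewhere, where L and U are the lower and upper limits of the support and a is the power-law exponent.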
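A minimal Python sketch of the FSPD may help fix ideas. It assumes the standard truncated-Pareto form, with density proportional to x^(-(a+1)) on [L, U]. The function names and the example values L = 3, U = 6, a = 1.5 are illustrative choices, not taken from the slides.

```python
import math

def fspd_pdf(x, L, U, a):
    """Assumed FSPD density: Pareto with exponent a, truncated to [L, U]."""
    if x < L or x > U:
        return 0.0
    return a * x ** (-(a + 1.0)) / (L ** (-a) - U ** (-a))

def fspd_cdf(x, L, U, a):
    """Distribution function that matches fspd_pdf."""
    if x <= L:
        return 0.0
    if x >= U:
        return 1.0
    return (L ** (-a) - x ** (-a)) / (L ** (-a) - U ** (-a))

def fspd_quantile(p, L, U, a):
    """Inverse of fspd_cdf for 0 <= p <= 1 (handy for simulation)."""
    return (L ** (-a) - p * (L ** (-a) - U ** (-a))) ** (-1.0 / a)

# Illustrative parameter values only.
median = fspd_quantile(0.5, 3.0, 6.0, 1.5)
print("median of the FSPD:", median)
```

The quantile function is the closed-form inverse of the CDF, so values can be drawn from the FSPD by applying it to uniform random numbers.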
Distributional parameters are estimated by fitting the FSPD to a set of data.
This is not appropriate if the data are contaminated by measurement errors.
To develop a methodology for deconvolution when X is known to be of Pareto form.
To apply the methodology to real (radio astronomical) data.
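If X has the PDF g(·) and the measurement error ε, independent of X, has the PDF h(·), then Y = X + ε has the PDF
$$f(y) = \int_{-\infty}^{\infty} g(x)\,h(y - x)\,dx.$$
When g is the FSPD on [L, U], this gives the convolved PDF (CPDF)
$$f(y) = \int_{L}^{U} g(x)\,h(y - x)\,dx.$$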
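The convolution integral can be evaluated numerically. The sketch below is one possible way to do it, not the presenter's implementation. It assumes that X follows a truncated Pareto with density proportional to x^(-(a+1)) on [L, U] and that the errors are Gaussian with mean zero and standard deviation σ. The Gaussian error law is an assumption: the slides name σ but do not state the error distribution. The function name `cpdf` and the use of Simpson's rule are likewise illustrative.

```python
import math

def cpdf(y, L, U, a, sigma, n_intervals=200):
    """Convolved PDF at y, by composite Simpson's rule over [L, U].

    Assumes truncated-Pareto X and Gaussian errors (both assumptions).
    n_intervals must be even.
    """
    norm = L ** (-a) - U ** (-a)
    step = (U - L) / n_intervals
    total = 0.0
    for k in range(n_intervals + 1):
        x = L + k * step
        # Simpson weights: 1 at the ends, then alternating 4 and 2.
        weight = 1.0 if k in (0, n_intervals) else (4.0 if k % 2 else 2.0)
        total += weight * x ** (-(a + 1.0)) * math.exp(-0.5 * ((y - x) / sigma) ** 2)
    const = a / (norm * sigma * math.sqrt(2.0 * math.pi))
    return const * total * step / 3.0

# The CPDF is positive below L, where the error-free density is zero.
print(cpdf(2.8, 3.0, 6.0, 1.5, 0.4))
```

Integrating over [L, U] only, rather than the whole real line, avoids the discontinuities of the truncated density at the endpoints.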
The CPDF can differ substantially from the FSPD.
Probability-probability (P-P) plots, which compare the observed and theoretical distribution functions, can be used to assess the fit.
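The points of a P-P plot can be computed as follows. This is a sketch under stated assumptions: the plotting position (i - 0.5)/n is one common convention rather than something given in the slides, and the sample is simulated from an assumed truncated-Pareto form with illustrative parameters. A good fit puts the points close to the diagonal.

```python
import random

def pp_points(sample, cdf):
    """Return (empirical, theoretical) probability pairs for a P-P plot."""
    ordered = sorted(sample)
    n = len(ordered)
    # Plotting position (i - 0.5) / n is an assumed convention.
    return [((i - 0.5) / n, cdf(v)) for i, v in enumerate(ordered, start=1)]

# Illustrative parameters and an assumed truncated-Pareto CDF on [L, U].
L, U, a = 3.0, 6.0, 1.5
norm = L ** (-a) - U ** (-a)

def fspd_cdf(x):
    x = min(max(x, L), U)  # clamp so values outside [L, U] map to 0 or 1
    return (L ** (-a) - x ** (-a)) / norm

rng = random.Random(1)  # fixed seed for repeatability
sample = [(L ** (-a) - rng.random() * norm) ** (-1.0 / a) for _ in range(500)]

points = pp_points(sample, fspd_cdf)
max_dev = max(abs(emp - theo) for emp, theo in points)
print("largest departure from the diagonal:", max_dev)
```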
Simulated data with L = 3, U = 6, a = 1.5 and σ = 0.4 are used.
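A simulation of this kind could be sketched as below. The parameter values are those listed above. The truncated-Pareto form (density proportional to x^(-(a+1)) on [L, U]), the Gaussian error law, the sample size and the seed are assumptions made for illustration.

```python
import random

L, U, a, sigma = 3.0, 6.0, 1.5, 0.4
rng = random.Random(2009)  # fixed seed so the run is repeatable

def draw_fspd(rng, L, U, a):
    """Inverse-CDF draw from the assumed truncated Pareto on [L, U]."""
    p = rng.random()
    return (L ** (-a) - p * (L ** (-a) - U ** (-a))) ** (-1.0 / a)

n = 1000  # illustrative sample size
x = [draw_fspd(rng, L, U, a) for _ in range(n)]  # error-free values
y = [xi + rng.gauss(0.0, sigma) for xi in x]     # contaminated values

outside = sum(1 for yi in y if yi < L or yi > U)
print("error-free range:", min(x), max(x))
print("contaminated range:", min(y), max(y))
print("fraction of contaminated values outside [L, U]:", outside / n)
```

The error-free values stay inside [L, U] by construction, while some of the contaminated values fall outside it.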
1. The contaminated data extend beyond the interval [L, U] over which the error-free data occur.
2. The shape of the distribution is changed.
◦ This will lead to biased estimates of L, U and the power-law exponent a.
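Estimation is based on maximising the likelihood (or log-likelihood) of the observed data given the model.
Log-likelihood of the CPDF for observations $y_1, \dots, y_n$:
$$\ell(L, U, a, \sigma) = \sum_{i=1}^{n} \log f(y_i; L, U, a, \sigma),$$
where f is the CPDF.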
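The log-likelihood can be sketched as a sum of log CPDF values over the observations. The code below is a simplified illustration, not the fitting procedure used in the presentation. It assumes truncated-Pareto X and Gaussian errors, and it maximises over a alone on a coarse grid with L, U and σ held fixed. A full fit would maximise over all four parameters with a numerical optimiser. The data are simulated, and the seed and sample size are arbitrary.

```python
import math
import random

def cpdf(y, L, U, a, sigma, n_intervals=80):
    """CPDF by Simpson's rule (assumes truncated-Pareto X, Gaussian errors)."""
    norm = L ** (-a) - U ** (-a)
    step = (U - L) / n_intervals
    total = 0.0
    for k in range(n_intervals + 1):
        x = L + k * step
        w = 1.0 if k in (0, n_intervals) else (4.0 if k % 2 else 2.0)
        total += w * x ** (-(a + 1.0)) * math.exp(-0.5 * ((y - x) / sigma) ** 2)
    return a * total * step / (3.0 * norm * sigma * math.sqrt(2.0 * math.pi))

def log_likelihood(data, L, U, a, sigma):
    """Sum of log CPDF values over the observations."""
    return sum(math.log(cpdf(y, L, U, a, sigma)) for y in data)

# Simulated contaminated sample (illustrative seed and size).
rng = random.Random(5)
L, U, a_true, sigma = 3.0, 6.0, 1.5, 0.4
norm = L ** (-a_true) - U ** (-a_true)
data = [(L ** (-a_true) - rng.random() * norm) ** (-1.0 / a_true)
        + rng.gauss(0.0, sigma) for _ in range(200)]

# Coarse grid search over a only; L, U and sigma are held at the true values.
grid = [0.2 + 0.1 * k for k in range(39)]
a_hat = max(grid, key=lambda a: log_likelihood(data, L, U, a, sigma))
print("grid maximiser for a:", a_hat)
```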
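Application to the data in the histogram leads to the following maximum likelihood estimates (MLEs):

| Fitted model | L | U | a |
|---|---|---|---|
| FSPD | 2.267 | 7.124 | 1.186 |
| CPDF | 3.047 | 6.028 | 1.445 |

N.B.: The CPDF fitted to the data with errors gives MLEs close to the true parameter values L = 3, U = 6 and a = 1.5.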
The methodology is illustrated by fitting the CPDF to a sample of giant molecular cloud masses in the galaxy M33 (Engargiola et al., 2003).
| | L | U | a | σ |
|---|---|---|---|---|
| MLE | 6.9 | 77.7 | 1.33 | 3.47 |
| s.e. | 0.65 | 4.97 | 0.26 | 0.55 |
The unit of mass is the solar mass.
The estimates agree well with those of Engargiola et al. (2003), in particular their a = 1.6 ± 0.3.
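One way to make the comparison concrete is to form approximate 95% intervals from the reported MLEs and standard errors. The sketch below assumes a normal (Wald) approximation, MLE ± 1.96 × s.e. The slides do not say they used it; it appears here only as an illustration.

```python
# Reported MLEs and standard errors (values from the table above).
estimates = {
    "L": (6.9, 0.65),
    "U": (77.7, 4.97),
    "a": (1.33, 0.26),
    "sigma": (3.47, 0.55),
}
z = 1.96  # normal quantile for an approximate 95% interval (assumption)

intervals = {name: (mle - z * se, mle + z * se)
             for name, (mle, se) in estimates.items()}
for name, (lower, upper) in intervals.items():
    print(f"{name}: {lower:.2f} to {upper:.2f}")

a_lower, a_upper = intervals["a"]
print("1.6 inside the interval for a:", a_lower <= 1.6 <= a_upper)
```

Under this approximation the interval for a is roughly 0.82 to 1.84, which contains the Engargiola et al. (2003) value of 1.6.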
The linear form of the P-P plot indicates that the estimated distribution fits the sample of giant molecular clouds very well.
Deconvolution is a useful statistical method for recovering an unknown distribution of X in the presence of errors.
A methodology for deconvolution when X is known to be of Pareto form was developed.
The MLE method gave satisfactory results.
The price paid is that the analysis is more complicated.
Everyone contributed to the work presented.
1. Prof. C. Koen (Supervisor)
2. Funding: SKA SA (Kim, Anna and Daphne)
3. University of the Western Cape (Leslie and Rennet)