An Empirical Likelihood Ratio Based Goodness-of-Fit Test for
Two-parameter Weibull Distributions
Presented by: Ms. Ratchadaporn Meksena
Student ID: 555020227-5
Advisor: Assoc. Prof. Dr. Supunnee Ungpansattawong
Date: 29th November 2013
Department of Statistics, Faculty of Science,
Khon Kaen University
OUTLINE
1. Introduction
• Rationale and Background
• Objective of Study
• Scope and Limitation of Study
• Anticipated Outcomes
2. Literature Review
3. Research Methodology
• Empirical Likelihood Method
• Goodness-of-Fit Test Based on Empirical Likelihood Ratio
• Calculation of Critical Values and Evaluation of Type I Error Control
• Evaluation of the Power of the Proposed Test
1. Introduction
Rationale and Background
The Weibull distribution is commonly used in many fields, such as
• Survival Analysis
• Reliability Engineering & Failure Analysis
• Extreme Value Theory
• Weather Forecasting
• General Insurance
The two-parameter Weibull distribution is the most widely used distribution for life data analysis.
An important part of data analysis is checking that the data come from a particular family of distributions. Goodness-of-fit tests for the Weibull distribution are generally based on the empirical distribution function (EDF), such as the Kolmogorov-Smirnov (KS) test, the Cramér-von Mises (CvM) test, and the Anderson-Darling (AD) test. Recent literature on goodness-of-fit tests based on the empirical likelihood ratio reports that such tests are competitive with other available tests. Therefore, in this study, we propose an empirical likelihood ratio based goodness-of-fit test for two-parameter Weibull distributions.
Objective of Study
The objective of this study is to propose a new
goodness-of-fit statistic based on empirical likelihood
ratio for two-parameter Weibull distributions.
Scope and Limitation of Study
In this study, we will derive an empirical likelihood ratio based goodness-of-fit test for two-parameter Weibull distributions and its asymptotic properties, calculate the critical values for fixed sample sizes using Monte Carlo simulations, and evaluate the performance of the proposed test in controlling the Type I error. Finally, we will compare the power of the proposed test with that of the Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling tests.
Anticipated Outcomes
We expect to obtain a new goodness-of-fit test based on the empirical likelihood ratio for two-parameter Weibull distributions.
2. Literature Review
Examples of Goodness-of-Fit Tests for Two-Parameter Weibull Distributions:
• Shapiro and Brain (1987) proposed a test statistic based on principles similar to those used in the derivation of the well-known W-test for normality.
• Coles (1989) proposed a test via the stabilized probability plot, which involves estimating the scale and shape parameters.
• Khamis (1997) proposed the δ-corrected Kolmogorov-Smirnov test, in which the maximum likelihood estimates (MLEs) of the scale and shape parameters are employed.
• Cabana and Quiroz (2005) proposed employing the empirical moment generating function together with affine invariant estimators of the scale and shape parameters, such as moment estimators.
Examples of Goodness-of-Fit Tests Based on Empirical Likelihood Ratio:
• Vexler and Gurevich (2010) constructed an empirical likelihood ratio based goodness-of-fit test to approximate the optimal Neyman–Pearson ratio test with an unknown alternative density function.
• Vexler et al. (2011) proposed a similar goodness-of-fit test based on the empirical likelihood method to test the null hypothesis of an inverse Gaussian distribution.
• Ning and Ngunkeng (2013) proposed a similar goodness-of-fit test based on the empirical likelihood method to test the null hypothesis of skew normality.
3. Research Methodology
Consider the two-parameter Weibull distribution, whose cumulative distribution function and probability density function are defined as
$$F(x; \beta, \alpha) = 1 - \exp\left\{-\left(\frac{x}{\beta}\right)^{\alpha}\right\} \qquad (1)$$
and
$$f(x; \beta, \alpha) = \frac{\alpha}{\beta}\left(\frac{x}{\beta}\right)^{\alpha - 1} \exp\left\{-\left(\frac{x}{\beta}\right)^{\alpha}\right\}, \qquad (2)$$
respectively, where x > 0, β > 0 is the scale parameter and α > 0 is the shape parameter.
Empirical Likelihood Method
Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed observations from an unknown population distribution F. The empirical likelihood function of F is defined as
$$L_p(F) = \prod_{i=1}^{n} p_i ,$$
where the components $p_i$, i = 1, 2, …, n, maximize the likelihood $L_p(F)$ and satisfy empirical constraints corresponding to the hypotheses of interest. For example, suppose a population parameter θ identified by E(X) = θ is of interest, and the true value of θ is $\theta_0$. The null hypothesis is $H_0: E(X) = \theta_0$. To maximize $L_p(F)$, the values of $p_i$ should be chosen subject to the constraints
$$p_i \ge 0, \quad \sum_{i=1}^{n} p_i = 1 \quad \text{and} \quad \sum_{i=1}^{n} p_i X_i = \theta_0 ,$$
where the constraint $\sum_{i=1}^{n} p_i X_i = \theta_0$ is an empirical version of $E(X) = \theta_0$.
The empirical log-likelihood ratio statistic to test $\theta = \theta_0$ is given by
$$R(\theta_0) = \max\left\{ \sum_{i=1}^{n} \log(n p_i) \; ; \; p_i \ge 0, \; \sum_{i=1}^{n} p_i = 1, \; \sum_{i=1}^{n} p_i X_i = \theta_0 \right\},$$
where $R(\theta)$ is the empirical log-likelihood ratio function, defined through the empirical likelihood ratio function of Owen (1988).
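For the mean example, the maximization has a well-known solution through a single Lagrange multiplier: the weights take the form p_i = 1 / (n (1 + λ (X_i − θ0))), where λ solves Σ (X_i − θ0) / (1 + λ (X_i − θ0)) = 0. The sketch below finds λ by bisection. The function name `el_log_ratio_mean` and the use of bisection are illustrative choices, not part of the presentation.

```python
import math

def el_log_ratio_mean(xs, theta0, iters=200):
    """Return (R(theta0), weights) for the hypothesis E(X) = theta0."""
    n = len(xs)
    d = [x - theta0 for x in xs]
    if min(d) >= 0 or max(d) <= 0:
        raise ValueError("theta0 must lie strictly inside the range of the data")

    # Every weight must be positive, so 1 + lam * d_i > 0 for all i.
    lo = -1.0 / max(d)
    hi = -1.0 / min(d)

    def g(lam):
        # Strictly decreasing in lam on (lo, hi); its root gives the multiplier.
        return sum(di / (1.0 + lam * di) for di in d)

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)

    weights = [1.0 / (n * (1.0 + lam * di)) for di in d]
    return sum(math.log(n * p) for p in weights), weights
```

When θ0 equals the sample mean, the multiplier is zero, every weight is 1/n, and the statistic is zero; it becomes more negative as θ0 moves away from the sample mean.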
Goodness-of-Fit Test
A goodness-of-fit test is a statistical test of whether the observations are consistent with a particular statistical model. It describes how well the model fits a set of observations. Measures of goodness of fit typically summarize the discrepancy between the observed values and the values expected under the model.
Goodness-of-Fit Test Based on Empirical Likelihood Ratio
The hypotheses to be tested are
$$H_0: f = f_{H_0} \sim WB(\beta, \alpha) \quad \text{versus} \quad H_1: f = f_{H_1} \nsim WB(\beta, \alpha),$$
where $f_{H_0}$ and $f_{H_1}$ are both unknown.
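When the density functions $f_{H_0}$ and $f_{H_1}$ are completely known, the most powerful test statistic is the likelihood ratio
$$LR = \frac{\prod_{i=1}^{n} f_{H_1}(X_i)}{\prod_{i=1}^{n} f_{H_0}(X_i)} = \frac{\prod_{i=1}^{n} f_{H_1}(X_i)}{\prod_{i=1}^{n} \frac{\alpha}{\beta}\left(\frac{X_i}{\beta}\right)^{\alpha - 1} \exp\left\{-\left(\frac{X_i}{\beta}\right)^{\alpha}\right\}}, \qquad (3)$$
where, under the null hypothesis, $X_1, X_2, \ldots, X_n$ follow a Weibull distribution with parameters β and α.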
In this study, the forms of $f_{H_0}$ and $f_{H_1}$ are both unknown but estimable. We follow the idea of Vexler and Gurevich (2010) and Ning and Ngunkeng (2013) and construct a goodness-of-fit test statistic for the two-parameter Weibull distribution in the form of an estimated likelihood ratio. We apply the maximum empirical likelihood method to estimate the numerator of the ratio (3). Rewrite the likelihood function in the form
$$L_f = \prod_{i=1}^{n} f_{H_1}(X_i) = \prod_{i=1}^{n} f_{H_1}(X_{(i)}) = \prod_{i=1}^{n} f_i ,$$
where $f_i = f_{H_1}(X_{(i)})$ and $X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}$ are the order statistics based on the observations $X_1, X_2, \ldots, X_n$.
Following the maximum empirical likelihood method, we derive the values of $f_j$, j = 1, 2, …, n, that maximize $L_f$ and satisfy the empirical constraints under the alternative hypothesis $H_1$. The values of $f_j$ must be restricted by the equation $\int f(s)\,ds = 1$, so we need an empirical form of this constraint. The following lemma of Vexler and Gurevich (2010) provides it.
Lemma 1 Let f(x) be a density function. Then
$$\sum_{j=1}^{n} \int_{X_{(j-m)}}^{X_{(j+m)}} f(x)\,dx = 2m \int_{X_{(1)}}^{X_{(n)}} f(x)\,dx - \sum_{l=1}^{m-1} (m-l) \int_{X_{(n-l)}}^{X_{(n-l+1)}} f(x)\,dx - \sum_{l=1}^{m-1} (m-l) \int_{X_{(l)}}^{X_{(l+1)}} f(x)\,dx$$
$$\cong 2m \int_{X_{(1)}}^{X_{(n)}} f(x)\,dx - \frac{m(m-1)}{n},$$
where $X_{(j-m)} = X_{(1)}$ if j − m ≤ 1 and $X_{(j+m)} = X_{(n)}$ if j + m ≥ n.
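The first equality in Lemma 1 is an exact counting identity, so it can be checked numerically for any density whose CDF is known. The sketch below does this for a Weibull density, assuming 1 ≤ m ≤ n/2 so that the two boundary corrections do not overlap. The helper name `lemma1_sides` is hypothetical, and the choice of a Weibull(1, 2) sample is arbitrary.

```python
import math
import random

def weibull_cdf(x, beta, alpha):
    # Equation (1): beta is the scale, alpha is the shape.
    return 1.0 - math.exp(-((x / beta) ** alpha))

def lemma1_sides(sample, m, cdf):
    """Return (lhs, rhs_exact, rhs_approx) of Lemma 1 for a density with CDF `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= m <= n // 2:
        raise ValueError("this sketch assumes 1 <= m <= n/2")
    F = [None] + [cdf(x) for x in xs]  # F[k] = cdf(X_(k)), 1-based as in the lemma

    # Left-hand side, with indices clamped to 1 and n as stated in the lemma.
    lhs = sum(F[min(j + m, n)] - F[max(j - m, 1)] for j in range(1, n + 1))

    total = F[n] - F[1]
    upper_tail = sum((m - l) * (F[n - l + 1] - F[n - l]) for l in range(1, m))
    lower_tail = sum((m - l) * (F[l + 1] - F[l]) for l in range(1, m))
    rhs_exact = 2 * m * total - upper_tail - lower_tail
    # Empirical approximation: each spacing carries probability of about 1/n.
    rhs_approx = 2 * m * total - m * (m - 1) / n
    return lhs, rhs_exact, rhs_approx

rng = random.Random(0)
# Python's weibullvariate takes (scale, shape), i.e. (beta, alpha) in this notation.
data = [rng.weibullvariate(1.0, 2.0) for _ in range(50)]
lhs, rhs_exact, rhs_approx = lemma1_sides(data, 3, lambda x: weibull_cdf(x, 1.0, 2.0))
```

Here `lhs` and `rhs_exact` agree up to floating-point rounding, while `rhs_approx` differs by a sampling error that depends on the spacings near the two ends of the sample.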
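Denote
$$\delta_m = \frac{1}{2m} \sum_{j=1}^{n} \int_{X_{(j-m)}}^{X_{(j+m)}} f(x)\,dx .$$
It is obvious that $\delta_m \le 1$, since
$$\int_{X_{(1)}}^{X_{(n)}} f(x)\,dx \le \int_{-\infty}^{\infty} f(x)\,dx = 1 .$$
Using the empirical approximation to the remainder term in Lemma 1, we have
$$\delta_m \cong \int_{X_{(1)}}^{X_{(n)}} f(x)\,dx - \frac{m-1}{2n} \le 1 - \frac{m-1}{2n} .$$
From Lemma 1, we can empirically estimate $\delta_m$ via
$$\hat{\delta}_m = \int_{X_{(1)}}^{X_{(n)}} dF_n(x) - \frac{m-1}{2n} = 1 - \frac{1}{n} - \frac{m-1}{2n},$$
where $F_n$ is the empirical distribution function. Notice that $\hat{\delta}_m \to 1$ when $m/n \to 0$ as $m, n \to \infty$.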
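By applying the mean value theorem to the term $\sum_{j=1}^{n} \int_{X_{(j-m)}}^{X_{(j+m)}} f(x)\,dx$, we have
$$\sum_{j=1}^{n} \int_{X_{(j-m)}}^{X_{(j+m)}} f(x)\,dx \cong \sum_{j=1}^{n} \left(X_{(j+m)} - X_{(j-m)}\right) f\left(X_{(j)}\right) = \sum_{j=1}^{n} \left(X_{(j+m)} - X_{(j-m)}\right) f_j .$$
Thus, the empirical constraint under the alternative hypothesis $H_1$ is given by
$$\delta_m = \frac{1}{2m} \sum_{j=1}^{n} \int_{X_{(j-m)}}^{X_{(j+m)}} f(x)\,dx \cong \frac{1}{2m} \sum_{j=1}^{n} \left(X_{(j+m)} - X_{(j-m)}\right) f_j \cong \hat{\delta}_m \le 1 .$$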
Apply the Lagrange multiplier method to maximize $\sum_{j=1}^{n} \log f_j$ subject to the empirical constraint $\frac{1}{2m} \sum_{j=1}^{n} (X_{(j+m)} - X_{(j-m)}) f_j \le 1$ implied by $\hat{\delta}_m \le 1$. The Lagrange function is defined by
$$\Lambda(f_1, f_2, \ldots, f_n, \lambda) = \sum_{j=1}^{n} \log f_j + \lambda \left( \frac{1}{2m} \sum_{j=1}^{n} \left(X_{(j+m)} - X_{(j-m)}\right) f_j - 1 \right),$$
where λ is a Lagrange multiplier. Taking the derivative of the above equation with respect to each $f_j$, j = 1, 2, …, n, and with respect to λ, we obtain
$$\frac{1}{f_j} + \frac{\lambda}{2m} \left(X_{(j+m)} - X_{(j-m)}\right) = 0 \qquad (4)$$
and
$$\frac{1}{2m} \sum_{j=1}^{n} \left(X_{(j+m)} - X_{(j-m)}\right) f_j - 1 = 0 , \qquad (5)$$
respectively. From equation (4), we have
$$f_j = - \frac{2m}{\lambda \left(X_{(j+m)} - X_{(j-m)}\right)} .$$
Multiplying equation (4) by $f_j$ and summing over j, we have
$$n + \lambda \, \frac{1}{2m} \sum_{j=1}^{n} \left(X_{(j+m)} - X_{(j-m)}\right) f_j = 0 .$$
Since the constraint $\frac{1}{2m} \sum_{j=1}^{n} (X_{(j+m)} - X_{(j-m)}) f_j \le 1$ holds with equality by equation (5), we have λ = −n. Finally, we obtain the values of $f_j$ that maximize $\sum_{j=1}^{n} \log f_j$, and hence also $\prod_{j=1}^{n} f_j$, as
$$f_j = \frac{2m}{n \left(X_{(j+m)} - X_{(j-m)}\right)} , \qquad (6)$$
where $X_{(j-m)} = X_{(1)}$ if j − m ≤ 1 and $X_{(j+m)} = X_{(n)}$ if j + m ≥ n.
Thus, using the maximum empirical likelihood method, the empirical likelihood ratio based goodness-of-fit test statistic for the two-parameter Weibull distribution can be constructed as
$$WB_{nm} = \frac{\prod_{j=1}^{n} \frac{2m}{n \left(X_{(j+m)} - X_{(j-m)}\right)}}{\max_{\theta} \prod_{i=1}^{n} f_{H_0}(X_i; \theta)} , \qquad (7)$$
where θ = (β, α)' is the parameter vector of a two-parameter Weibull distribution. Since the parameters β and α are unknown, the denominator is maximized by using the maximum likelihood estimates of β and α based on the observations.
The maximum likelihood estimators $\hat{\beta}$ and $\hat{\alpha}$ of β and α, respectively, are solutions of the equations
$$\frac{1}{\hat{\alpha}} + \frac{1}{n} \sum_{i=1}^{n} \ln X_i - \frac{\sum_{i=1}^{n} X_i^{\hat{\alpha}} \ln X_i}{\sum_{i=1}^{n} X_i^{\hat{\alpha}}} = 0$$
and
$$\hat{\beta} = \left( \frac{1}{n} \sum_{i=1}^{n} X_i^{\hat{\alpha}} \right)^{1/\hat{\alpha}} .$$
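The two likelihood equations can be solved numerically, since the first one involves only the shape parameter. The sketch below is one possible way to do it in Python and is not the procedure used in the study, which relies on the R package MASS. The function name `weibull_mle`, the bisection search and the rescaling by the largest observation (to avoid overflow in the powers) are all illustrative assumptions.

```python
import math
import random

def weibull_mle(xs, tol=1e-10):
    """Return (beta_hat, alpha_hat) for positive data with at least two distinct values."""
    n = len(xs)
    logs = [math.log(x) for x in xs]  # raises ValueError if any x <= 0
    if max(logs) == min(logs):
        raise ValueError("the data must contain at least two distinct values")
    mean_log = sum(logs) / n
    max_log = max(logs)

    def score(a):
        # Left-hand side of the shape equation; powers are rescaled by max(x)^a.
        w = [math.exp(a * (l - max_log)) for l in logs]
        return 1.0 / a + mean_log - sum(wi * l for wi, l in zip(w, logs)) / sum(w)

    # The score is decreasing in a: bracket the root, then bisect.
    lo, hi = 1e-6, 1.0
    while score(hi) > 0:
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    alpha_hat = 0.5 * (lo + hi)

    mean_pow = sum(math.exp(alpha_hat * (l - max_log)) for l in logs) / n
    beta_hat = math.exp(max_log) * mean_pow ** (1.0 / alpha_hat)
    return beta_hat, alpha_hat

rng = random.Random(1)
# Python's weibullvariate takes (scale, shape), i.e. (beta, alpha) in this notation.
sample = [rng.weibullvariate(2.0, 1.5) for _ in range(5000)]
beta_hat, alpha_hat = weibull_mle(sample)
```

With a sample this large, the estimates should land close to the simulated values β = 2 and α = 1.5.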
We notice that the distribution of the test statistic $WB_{nm}$ strongly depends on the integer m. Thus, the optimal value of m should be determined to make the test more efficient. Following the argument of Vexler and Gurevich (2010), which is based on the properties of the empirical likelihood method, we reconstruct the test statistic in (7) as
$$WB_n = \min_{1 \le m < n^{\delta}} \frac{\prod_{j=1}^{n} \frac{2m}{n \left(X_{(j+m)} - X_{(j-m)}\right)}}{\max_{\theta} \prod_{i=1}^{n} f_{H_0}(X_i; \theta)} , \qquad (8)$$
where δ ∈ (0, 1).
Similar to the argument of Vexler et al. (2011) and Ning and Ngunkeng (2013), we take δ = 0.5 in equation (8). Thus, the final form of the test statistic is
$$WB_n = \min_{1 \le m < \sqrt{n}} \frac{\prod_{j=1}^{n} \frac{2m}{n \left(X_{(j+m)} - X_{(j-m)}\right)}}{\max_{\theta} \prod_{i=1}^{n} f_{H_0}(X_i; \theta)} . \qquad (9)$$
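The statistic in equation (9) is easier to compute on the log scale, where the products become sums and the minimum over m is unchanged. The sketch below assumes the data are positive with no ties and that the maximum likelihood estimates are supplied by the caller. The function name `log_wb_n` and its signature are hypothetical.

```python
import math

def log_wb_n(xs, beta_hat, alpha_hat):
    """Return log(WB_n) from equation (9), given the Weibull MLEs of the sample."""
    x = sorted(xs)
    n = len(x)

    # Denominator: Weibull log-likelihood evaluated at the supplied estimates.
    log_f0 = sum(
        math.log(alpha_hat / beta_hat)
        + (alpha_hat - 1.0) * math.log(xi / beta_hat)
        - (xi / beta_hat) ** alpha_hat
        for xi in x
    )

    # Numerator: minimize over the integers m with 1 <= m < sqrt(n).
    best = math.inf
    m = 1
    while m * m < n:
        log_num = 0.0
        for j in range(1, n + 1):
            upper = x[min(j + m, n) - 1]  # X_(j+m), clamped to X_(n)
            lower = x[max(j - m, 1) - 1]  # X_(j-m), clamped to X_(1)
            log_num += math.log(2.0 * m / (n * (upper - lower)))
        best = min(best, log_num)
        m += 1
    return best - log_f0
```

For example, with the four observations 1, 2, 3, 4 and the (arbitrary) values β = α = 1, only m = 1 is admissible, the spacings are 1, 2, 2, 1, and the result is log(0.5 · 0.25 · 0.25 · 0.5) + 10.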
Asymptotic Properties of the Proposed Test Statistic
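Denote
$$h_j(x, \theta) = \frac{\partial \log f_{H_0}(x; \theta)}{\partial \theta_j}, \quad j = 1, 2, \quad \text{and} \quad \theta = (\theta_1, \theta_2) = (\beta, \alpha) .$$
We assume the following conditions hold:
(C1) $E\left(\log f(X_1)\right)^2 < \infty$.
(C2) Under the null hypothesis, $\|\hat{\theta} - \theta\| = \max_{1 \le i \le 2} |\hat{\theta}_i - \theta_i| \to 0$ in probability.
(C3) Under the alternative hypothesis, $\hat{\theta} \to \theta_0$ in probability, where $\theta_0$ is a constant vector with finite components.
(C4) There are open intervals $\Theta_0 \subseteq R^2$ and $\Theta_1 \subseteq R^2$ containing θ and $\theta_0$, respectively. There also exists a function s(x) such that $|h_j(x, \xi)| \le s(x)$ for all $x \in R$ and $\xi \in \Theta_0 \cup \Theta_1$.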
Proposition 1 Assume that conditions (C1)–(C4) hold. Then, under $H_0$,
$$\frac{1}{n} \log(WB_n) \to 0$$
in probability as n → ∞, while, under $H_1$,
$$\frac{1}{n} \log(WB_n) \to E \log\left( \frac{f_{H_1}(X_1)}{f_{H_0}(X_1; \theta_0)} \right)$$
in probability as n → ∞.
Given conditions (C1)–(C4), Proposition 1 shows that the power of the test goes to 1 as n → ∞ under the alternative hypothesis. Thus, the proposed test is consistent.
Calculation of Critical Values and Evaluation of Type I Error Control
To calculate the critical values for fixed sample sizes n = 10, 20, 30, 40, 50, 100, 200, 500, we simulate 5,000 samples from WB(β, α) with (β, α) = (1, 0.5), (1, 2), (1, 4), (1, 8). For each simulated sample, we use the R package MASS to estimate the parameters β and α, and then calculate the test statistic based on equation (9). After obtaining all 5,000 test statistics, we order them and take the 90th, 95th and 99th percentiles as the critical values corresponding to the significance levels 0.1, 0.05 and 0.01, respectively.
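The simulation loop for the critical values can be sketched generically as below. This is an illustration in Python rather than the R workflow described above: `critical_values`, `weibull_sample` and `percentile` are hypothetical helper names, the nearest-rank percentile rule is an assumption, and `statistic` stands for any function that maps a sample to the value of the test statistic in equation (9).

```python
import math
import random

def weibull_sample(n, beta, alpha, rng):
    # Python's weibullvariate takes (scale, shape), i.e. (beta, alpha) in this notation.
    return [rng.weibullvariate(beta, alpha) for _ in range(n)]

def percentile(sorted_vals, q):
    # Nearest-rank rule: the smallest value with at least a fraction q at or below it.
    k = max(1, math.ceil(round(q * len(sorted_vals), 9)))
    return sorted_vals[k - 1]

def critical_values(statistic, n, beta, alpha, reps=5000,
                    levels=(0.10, 0.05, 0.01), seed=0):
    """Simulate the null distribution of `statistic` and return upper-tail critical values."""
    rng = random.Random(seed)
    stats = sorted(statistic(weibull_sample(n, beta, alpha, rng)) for _ in range(reps))
    return {level: percentile(stats, 1.0 - level) for level in levels}

# Stand-in statistic (NOT the proposed test): a single Weibull(1, 1) draw is Exp(1),
# so the simulated percentiles can be compared with -log(level).
demo = critical_values(lambda xs: xs[0], n=1, beta=1.0, alpha=1.0)
```

In actual use, `statistic` would estimate β and α from each simulated sample and then evaluate equation (9), once for every combination of n and (β, α) listed above.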
Next, to investigate the performance of the proposed test in controlling the Type I error at the significance levels 0.1, 0.05 and 0.01, we conduct 5,000 simulations under WB(β, α) with (β, α) = (1, 0.5), (1, 2), (1, 4), (1, 8) and sample sizes n = 20, 50, 100, 200, 500, 1000. For each sample, we calculate the test statistic based on equation (9) and compare it with the critical value. The proportion of samples for which the null hypothesis is rejected is the empirical size of the proposed test.
Evaluation of the Power of the Proposed Test
In order to study the power of the proposed test, we simulate 10,000 samples with sample sizes n = 20, 50, 100, 200, 500, 1000 from Beta(0.25, 0.25), Beta(2, 2), N(0, 1) and TruncN(-1, 1). Then we compute the powers of the Kolmogorov-Smirnov test, the Cramér-von Mises test, the Anderson-Darling test and the proposed test $WB_n$ at the nominal level 0.05.
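A power estimate of this kind is simply the rejection rate over the simulated samples. The sketch below shows one way to draw from the four alternatives with the Python standard library and to count rejections. All names (`ALTERNATIVES`, `trunc_normal`, `empirical_power`) are hypothetical, the truncated normal is assumed to be a standard normal restricted to [-1, 1], and the test is assumed to reject for large values of the statistic.

```python
import random

def trunc_normal(rng, low=-1.0, high=1.0):
    # Rejection sampling: keep standard normal draws that fall inside [low, high].
    while True:
        z = rng.gauss(0.0, 1.0)
        if low <= z <= high:
            return z

# One draw from each alternative distribution named in the text.
ALTERNATIVES = {
    "Beta(0.25, 0.25)": lambda rng: rng.betavariate(0.25, 0.25),
    "Beta(2, 2)": lambda rng: rng.betavariate(2.0, 2.0),
    "N(0, 1)": lambda rng: rng.gauss(0.0, 1.0),
    "TruncN(-1, 1)": trunc_normal,
}

def empirical_power(statistic, critical_value, draw, n, reps=10000, seed=0):
    """Fraction of simulated samples whose statistic exceeds the critical value."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        if statistic(sample) > critical_value:
            rejections += 1
    return rejections / reps
```

Passing a sampler for the null Weibull distribution as `draw` gives the empirical size instead of the power, so the same loop covers both evaluations.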