Confidence intervals for the ratio of medians of two ...uregina.ca/~volodin/Confidence.pdfnormal...

11
Full Terms & Conditions of access and use can be found at https://www.tandfonline.com/action/journalInformation?journalCode=lssp20 Communications in Statistics - Simulation and Computation ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/lssp20 Confidence intervals for the ratio of medians of two independent log-normal distributions Lapasrada Singhasomboon , Wararit Panichkitkosolkul & Andrei Volodin To cite this article: Lapasrada Singhasomboon , Wararit Panichkitkosolkul & Andrei Volodin (2020): Confidence intervals for the ratio of medians of two independent log- normal distributions, Communications in Statistics - Simulation and Computation, DOI: 10.1080/03610918.2020.1812649 To link to this article: https://doi.org/10.1080/03610918.2020.1812649 Published online: 28 Aug 2020. Submit your article to this journal View related articles View Crossmark data

Transcript of Confidence intervals for the ratio of medians of two ...uregina.ca/~volodin/Confidence.pdfnormal...

Page 1: Confidence intervals for the ratio of medians of two ...uregina.ca/~volodin/Confidence.pdfnormal distributions; for example, the interval for a single log-normal mean has been addressed

Full Terms & Conditions of access and use can be found athttps://www.tandfonline.com/action/journalInformation?journalCode=lssp20

Communications in Statistics - Simulation andComputation

ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/lssp20

Confidence intervals for the ratio of medians oftwo independent log-normal distributions

Lapasrada Singhasomboon , Wararit Panichkitkosolkul & Andrei Volodin

To cite this article: Lapasrada Singhasomboon , Wararit Panichkitkosolkul & AndreiVolodin (2020): Confidence intervals for the ratio of medians of two independent log-normal distributions, Communications in Statistics - Simulation and Computation, DOI:10.1080/03610918.2020.1812649

To link to this article: https://doi.org/10.1080/03610918.2020.1812649

Published online: 28 Aug 2020.

Submit your article to this journal

View related articles

View Crossmark data

Page 2: Confidence intervals for the ratio of medians of two ...uregina.ca/~volodin/Confidence.pdfnormal distributions; for example, the interval for a single log-normal mean has been addressed

Confidence intervals for the ratio of medians of twoindependent log-normal distributions

Lapasrada Singhasomboona, Wararit Panichkitkosolkula, and Andrei Volodinb

aDepartment of Mathematics and Statistics, Thammasat University, Phathum Thani, Thailand; bDepartment ofMathematics and Statistics, University of Regina, Saskatchewan, Canada

ABSTRACTWe focus on the construction of confidence intervals for the ratios ofmedians of two independent, log-normal distributions based on the nor-mal approximation (NA) approach, the method of variance estimate recov-ery (MOVER), and the generalized confidence interval (GCI) approach. Wealso compare the performance of the three confidence intervals in termsof the coverage probabilities, and average lengths, using Monte Carlo sim-ulations. The results show that the GCI confidence interval is generally pre-ferred in terms of coverage probabilities; however, the average length forthe GCI is always wider than for other approaches. The NA and MOVERapproaches could be recommended on the basis of the specific values ofli,r

2i and/or sample sizes. The confidence intervals are illustrated using

real data examples.

ARTICLE HISTORYReceived 9 April 2020Accepted 16 August 2020

KEYWORDSCentral tendency;Generalized confidenceinterval; Interval estimation;Normal approximation;simulation; Skewdistribution; Varianceestimates recovery

1. Introduction

In many applications, such as medicine, biology, exposure, pollution, economics, finance, reliabil-ity, survival and meteorology data analysis, measurements are often right-skewed. In these data,analyses are usually assumed by log-normal models. The log-normal distribution is common inmany application areas. Estimating the parameters of the log-normal distribution is an interestingproblem. There are many studies about approaches for constructing confidence intervals for log-normal distributions; for example, the interval for a single log-normal mean has been addressedmultiple times in the literature (e.g., see Land 1972; Angus 1988, 1994; Zhou and Gao 1997, etc.)The interval for the ratio, or difference of two log-normal means is addressed in Zhou and Tu(2000); Wu et al. (2002); Krishnamoorthy and Mathew (2003), etc. The interval estimation forthe mean of several log-normal distributions is discussed in Baklizi and Ebrahem (2005);Behboodian and Jafari (2006); Tian and Wu (2007); Lin and Wang (2013); Malekzadeh andKharrati-Kopaei (2018). However, log-normal distributions that follow right-skewed data typicallyhave extremely low measurements, which affect the median less than the mean. Thus, in this situ-ation, the median is a more meaningful central tendency measure than the mean.

Some authors have considered the median of the log-normal distribution. Zellner (1971) pro-posed a Bayesian and non-Bayesian estimator for the parameters of the mean and median of thelog-normal distribution. Rao and D’Cunha (2016) proposed the Bayes credible interval for themedian of the log-normal distribution, and compared the interval based on the MLE. The conclu-sion was that the Bayes credible interval has a shorter average length compared to the one inter-val. To our knowledge, there is no research paper on the confidence interval for medians of two

CONTACT Lapasrada Singhasomboon [email protected] Department of Mathematics and Statistics,Thammasat University, Phathum Thani, Thailand.� 2020 Taylor & Francis Group, LLC

COMMUNICATIONS IN STATISTICS - SIMULATION AND COMPUTATIONVR

https://doi.org/10.1080/03610918.2020.1812649

Page 3: Confidence intervals for the ratio of medians of two ...uregina.ca/~volodin/Confidence.pdfnormal distributions; for example, the interval for a single log-normal mean has been addressed

log-normal distributions. In this article, we take this new challenge by investigating the newaspect of confidence intervals construction. We propose the Normal Approximation (NA), themethod of variance estimates recovery (MOVER) and the generalized confidence interval (GCI)approaches to construct confidence intervals for the ratio of medians for two independent, log-normal distributions. We assess these three confidence intervals using the coverage probabilitiesand their expected lengths. Typically, we prefer a confidence interval with a coverage probabilityof at least the nominal level (1� a); its expected length is short.

The description of notation and the log-normal model, followed by three confidence intervalapproaches for constructing confidence intervals for the ratio medians of two independent, log-normal distributions are discussed in Sec. 2. A simulation study comparing the proposed intervalis presented in Sec. 3, with Sec. 3.1 containing the discussion and the results. In Sec. 4, the pro-posed CI construction approaches are illustrated with PM2.5 datasets. Finally, the concludingremarks are given in Sec. 5.

2. Methods

LetXij; i ¼ 1, 2, j ¼ 1, :::, ni: be random variables from two independent, log-normal distributionswith the following parameters: the meansli and variancesr2i , respectively, are denotedas Xij � LNðli, r2i Þ:

The probability density function is

f ðXÞ ¼ 1

xijffiffiffiffiffiffiffiffiffiffiffi2pri2

p e�1

2

ln xij�liri

� �2

where xij > 0, �1 < li < 1,ri2 > 0 and i ¼ 1, 2, j ¼ 1, :::, ni:We know that the logarithm of Xij: i.e., Yij ¼ ln ðXijÞ � Nðli, r2i Þ is normally distributed.

Therefore, the unbiased estimators (and MLE) for li, r2i , i ¼ 1, 2 are

�Y i ¼ 1ni

Xnij¼1

Yij

S2i ¼1

ni � 1

Xnij¼1

Yij � �Yi� �2, i ¼ 1, 2

It is well-known that the medians of two independent, log-normal distribution can be calculatedas follows:

m1 ¼ el1

and

m2 ¼ el2

which may be applied to obtain the ratio medians of two independent log-normalðwÞ Inferencesare made on

w ¼ m1

m2¼ e l1�l2ð Þ (1)

By using the plug-in estimator, the unbiased (and MLE) point estimator of w is

w ¼ e �y1��y2ð Þ (2)

where �Y i ¼ 1ni

Pnij¼1 Yij, i ¼ 1, 2:

In the rest of this section, we address constructing the confidence interval for w by theNormal approximation, Generalized Confidence Interval and MOVER approaches.

2 L. SINGHASOMBOON ET AL.

Page 4: Confidence intervals for the ratio of medians of two ...uregina.ca/~volodin/Confidence.pdfnormal distributions; for example, the interval for a single log-normal mean has been addressed

2.1. The normal approximation confidence interval approach

Consider the following point estimator of w as in Eq. (2):

w ¼ e �y1��y2ð Þ:

In the Normal Approximation approach, the main statistical tool we use to obtain an asymptotic-ally normal and limiting distribution of an estimator w is the famous Delta method. It can beexplained briefly in the following way.

Let gðv1, v2Þ be a differentiable scalar function of two variables. Consider an estimator T ¼gðV1,V2Þ, which is a function of two other basic statistics V1 and V2: Usually, statistics V1 andV2 have a simple form, and it is known that they are jointly asymptotically normal. The asymp-totic distribution of an estimator, T, can be found by the Delta method, which is a procedure ofstochastic representation of T:

Now, we apply the Delta method to prove its asymptotically normality as ni ! 1 and to findthe asymptotic mean and variance for the estimator w:

In the Delta method, function g is used to expand into the Taylor series at the point l1 ¼EðV1Þ and l2 ¼ EðV2Þ :

g V1,V2ð Þ ¼ g l1, l2ð Þ þ@g l1,l2ð Þ

@v1V1 � l1ð Þ þ @g l1, l2ð Þ

@v2V2 � l2ð Þ þ Remainder:

Note that it is possible to prove thatffiffiffin

pRemainder ! 0 in probability as the sample

size n1, n2 ! 1:For our case, gðv1, v2Þ ¼ ev1�v2 and V1 ¼ �Y 1,V2 ¼ �Y 2: The statistic V1 ¼ �Y 1 is normally dis-

tributed as N �Y 1, S21�n1

� �and the statistic V2 ¼ �Y 2 is normally distributed as N �Y 2, S

22�n2

� �:

To calculate partical derivatives,

@g v1, v2ð Þ@v1

¼ ev1�v2 and@g v1, v2ð Þ

@v2¼ �ev1�v2 :

Hence,

@g l1, l2ð Þ@v1

¼ el1�l2 and@g l1,l2ð Þ

@v2¼ �el1�l2 :

By the Taylor series,

g V1,V2ð Þ � el1�l2 þ el1�l2 l1 � l1ð Þ � el1�l2 l2 � l2ð Þ� el1�l2 l1 � l2 � l1 þ l2 þ 1ð Þ :

From this, we take the expectation and variance on the both sides, we obtained the asymptoticmean and variance for the estimator wNA are given by

lwNA¼ el1�l2 and r2

wNA¼ e2 l1�l2ð Þ r21

n1þ r22

n2

� ,

respectively.To sum up, as sample sizes n1, n2 ! 1, we have the wNA is approximately normal with a

mean lwNAand variance of the form r2

wNA: Obviously, values l1,l2,r

21 and r22 are unknown when

we estimate the parameter function wNA having only samples in our hands. In this case, we usethe plug-in estimators of lwNA

and r2wNA

as follows:

lwNA¼ e�y1��y2 and r2

wNA¼ e2 �y1��y2ð Þ s21

n1þ s22n2

where �yi and s2i are the observed values of �Yi and S2i , respectively.

COMMUNICATIONS IN STATISTICS - SIMULATION AND COMPUTATIONVR 3

Page 5: Confidence intervals for the ratio of medians of two ...uregina.ca/~volodin/Confidence.pdfnormal distributions; for example, the interval for a single log-normal mean has been addressed

Therefore, the ð1� aÞ100% two-sided approximate confidence interval for the w based on NAappraoch is given by

CINA ¼ wNA � za=2

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffie2 �y1��y2ð Þ s21

n1þ s22n2

� s, wNA þ za=2

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffie2 �y1��y2ð Þ s21

n1þ s22n2

� s0@

1A (3)

where za=2 is thea 2= Þð -th quantile value from the standard normal distribution.

2.2. The confidence interval based on the MOVER approach

Zou (2008) and Zou and Donner (2008) introduced the method of variance estimates recovery(MOVER) for constructing a confidence interval for a linear combination of the parameters esti-mated by confidence limits for each component of the parameters.

Recall in Eq. (1) that the ratio median of two independent log-normal is

w ¼ m1

m2¼ e l1�l2ð Þ:

The logarithm of the ratio, which we will denote as h, is

h ¼ ln ðwÞ ¼ l1 � l2: (4)

We start by constructing the confidence interval for h ¼ ln ðwÞ ¼ l1 � l2: Then, we take theexponent to obtain a confidence interval for the ratio of medians: w ¼ eðl1�l2Þ:

Let li, i ¼ 1, 2 as each component parameters of the h, and let Li and Ui be the lower andupper limits of the interval for li, i ¼ 1, 2, respectively.

Then, we have

Li ¼ �yi � za=2ffiffiffiffiffiffiffiffiffiffiffiffiffiffivar �yið Þ

p,Ui ¼ �yi þ za=2

ffiffiffiffiffiffiffiffiffiffiffiffiffiffivar �yið Þ

p(5)

where �Y i � N �Yi, S2i�ni

� �and za=2 is the a=2ð Þ-th quantile value from the standard normal

distribution.Following Zou (2008) and Zou and Donner (2008), the ð1� aÞ100% two-sided confidence

interval for the parameters h, based on the MOVER approach, is given by

CI ¼ �y1 � �y2 �ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi�y1 � L1ð Þ2 þ U2 � �y2ð Þ2

q,�y1 � �y2 þ

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiU1 � �y1ð Þ2 þ �y2 � L2ð Þ2

q �(6)

Therefore, we will take the exponent in Eq. (6) to obtain a confidence interval for the w asfollows:

CIMOVER ¼ exp �y1 � �y2 �ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi�y1 � L1ð Þ2 þ U2 � �y2ð Þ2

q,�y1 � �y2 þ

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiU1 � �y1ð Þ2 þ �y2 � L2ð Þ2

q �(7)

where Li,Ui are defined by Eq. (5).

2.3. The generalized confidence interval approach

Weerahandi (1993) introduced the generalized inference approach for testing a confidence inter-val in the situation where the parameter of interest consists of several component parameters,such as nuisance parameter(s). The basic concept of the GCI is as follows. Let X ¼ ðX1,X2, :::,XnÞbe a random sample from the probability density function f ðx; h, kÞ, where h is the parameter ofinterest and k is a nuisance parameter(s). Let x ¼ ðx1, x2, :::, xnÞ denote the observed value of X ¼ðX1,X2, :::,XnÞ: To obtain the confidence limits for the parameter of interest, h, we first need to

4 L. SINGHASOMBOON ET AL.

Page 6: Confidence intervals for the ratio of medians of two ...uregina.ca/~volodin/Confidence.pdfnormal distributions; for example, the interval for a single log-normal mean has been addressed

construct a generalized pivotal quantity RðX; x, h, kÞ, which is a function of the random sample X:The observed data, x, and the unknown parameters, h, k, must satisfy the following conditions:

(i) The distribution of RðX; x, h, kÞ is free of unknown parameters;(ii) The observed value of RðX; x, h, kÞ is equal to the parameter of interestðhÞ

Similar to the MOVER approach, we start by constructing the confidence intervalforh ¼ ln ðwÞ ¼ l1 � l2: Finally, we take the exponent to obtain a confidence interval for theratio of mediansðwÞ:

The generalized pivotal quantity of h ¼ l1 � l2 is defined as follows:

Rh ¼ Rl1 � Rl2 , (8)

where Rli according to Krishnamoorthy and Mathew (2003), is given by:

Rli ¼ �yi ��Y i � liSi=

ffiffiffiffini

p siffiffiffiffini

p

¼ �yi �Ziffiffiffiffini

p siffiffiffiffiffiffiffiffiffiffiffiffini � 1

pUi

i ¼ 1, 2

where �yi and s2i are the observed values of �Yi and S2i , respectively. Zi and Ui are independent,

where Zi ¼ ð�Y i�liÞri=

ffiffiffini

p � Nð0, 1Þ and U2i ¼ ðni�1ÞS2i

r2i� v2ni�1, i ¼ 1, 2:

It is easy to verify that Rh satisfies the above two conditions. The last expression suggests thatthe distribution of Rh is free of unknown parameters. In the first expression, substituting �yi ands2i for �Yi and S2i is equal to h: So, the ð1� aÞ100% two-sided generalized confidence interval forh is simply the Rhða=2Þ,Rhð1� a=2Þ of percentile of Rh: Next, we will take the exponent to obtaina generalized confidence interval for the w:

Therefore, the ð1� aÞ100% two-sided generalized confidence interval for w, based on the GCIapproach, is given by

CIGCI ¼ exp Rh a=2ð Þ,Rh 1� a=2ð Þ½ �: (9)

Constructing the GCI for the w can be summarized by the following algorithm:

Algorithm:for a given �y1,�y2, s

21, s

22

for i¼ 1 to m:Generate the value for Z1,Z2,U2

1 ,U22 from the standard normal distribution and the chi-

squared distribution with n-1 degree of freedom, respectively.Calculate Rh as Rl1 � Rl2End loop for iThe ð1� aÞ100% two-sided generalized confidence interval for w is then obtained by taking

the exponent of the 100ða=2Þ percentile of Rh defined by Rhða=2Þ and 100ð1� a=2Þ percentile ofRh defined by Rhð1� a=2Þ .

3. Simulation study

In the simulation studies, we evaluate the performance of the proposed CI constructionapproaches. We estimated the coverage probabilities and average length through Monte Carlosimulation with the R statistical software. For the parameter configurations, we have generated10,000 random samples from two independent, log-normal populations, with the parameters liand r2i , i ¼ 1, 2: Numerical results on the coverage probabilities and average length of the 95%

COMMUNICATIONS IN STATISTICS - SIMULATION AND COMPUTATIONVR 5

Page 7: Confidence intervals for the ratio of medians of two ...uregina.ca/~volodin/Confidence.pdfnormal distributions; for example, the interval for a single log-normal mean has been addressed

two-sided confidence interval for w of two independent log-normal distributions when equalðr21, r22Þ ¼ ð0:20, 0:20Þ, ðr21,r22Þ ¼ ð0:30, 0:30Þ and unequal ðr21, r22Þ ¼ ð0:20, 0:30Þ are reported inTables 1, 2, and 3, respectively, for various values li ¼ 1, 3; i ¼ 1, 2, and sample sizes ðn1, n2Þvarying from small to large under equal and unequal sample sizes. In practical applications, thevalues of ðn1, n2Þ are usually unequal, so we consider unequal sample sizes in our simulationstudies and also include the PM2.5 datasets, which can be seen in Sec. 4.

3.1. Discussion

Based on Table 1, we can conclude that:

(i) The coverage probability of the GCI approach is always greater than the nominal level,regardless of ni,li, r

2i values. However, the average length of this approach is wider than

other approaches.(ii) For r21 ¼ r22, and for small sample sizes, the GCI approach is recommended. For moderate

sample sizes, the average length of all proposed NA, MOVER, GCI approaches performwell, except ðl1 ¼ 1,l2 ¼ 1,r21 ¼ 0:20, r22 ¼ 0:20Þ; the NA performs better than the GCIand MOVER approaches. For large sample sizes, the NA approach is recommended.

(iii) For r21 6¼ r22, for moderate to large sample sizes, the NA approach is quite satisfactory.However, the GCI approach is recommended for all sample sizes.

Table 1. Coverage probabilities and Average length of 95% CIs for w when ðr21,r22Þ ¼ ð0:20, 0:20Þ:

Parameters (n1,n2)

Coverage probabilities Average length

NA MOVER GCI NA MOVER GCI

ðl1 ¼ 1,l2 ¼ 1Þ (25) 0.9439 0.9433 0.9552 0.4308 0.4342 0.4551(25,40) 0.9435 0.9447 0.9532 0.3866 0.3890 0.4048(25,50) 0.9457 0.9477 0.956 0.3729 0.3751 0.3899(25,100) 0.9411 0.9449 0.9538 0.3373 0.3389 0.3530(25,40) 0.9442 0.9472 0.9552 0.3861 0.3885 0.4043(40) 0.9468 0.9487 0.9539 0.3391 0.3408 0.3504(40, 50) 0.9483 0.944 0.9495 0.3231 0.3246 0.3326(40,100) 0.9468 0.9479 0.9538 0.2836 0.2846 0.2911(25, 50) 0.9429 0.9442 0.9516 0.3727 0.3749 0.3897(40, 50) 0.9465 0.9441 0.9505 0.3229 0.3243 0.3326(50) 0.9452 0.9475 0.9524 0.3038 0.3050 0.3116(50,100) 0.9437 0.9453 0.9484 0.2627 0.2634 0.2681(100,25) 0.9465 0.9467 0.9544 0.3382 0.3399 0.3539(100,40) 0.9454 0.947 0.953 0.2830 0.2840 0.2905(100,50) 0.9475 0.9466 0.9507 0.2628 0.2636 0.2683(100,100) 0.948 0.9498 0.9509 0.2148 0.2152 0.2173

ðl1 ¼ 3,l2 ¼ 3Þ (25) 0.9414 0.9433 0.9532 0.4294 0.4328 0.4537(25,40) 0.9429 0.9433 0.9506 0.3879 0.3904 0.4062(25,50) 0.9433 0.9446 0.9519 0.3717 0.3739 0.3888(25,100) 0.9428 0.9414 0.9511 0.3396 0.3412 0.3554(25,40) 0.9452 0.9451 0.9549 0.3869 0.3893 0.4050(40) 0.944 0.9456 0.9513 0.3405 0.3422 0.3517(40,50) 0.9453 0.9461 0.9525 0.3223 0.3237 0.3318(40,100) 0.9484 0.95 0.9546 0.2838 0.2847 0.2914(25,50) 0.9419 0.9436 0.9529 0.3721 0.3742 0.3891(40,50) 0.9455 0.947 0.9528 0.3215 0.3229 0.3310(50) 0.9479 0.9488 0.9526 0.3038 0.3050 0.3115(50,100) 0.9456 0.9468 0.9499 0.2632 0.2640 0.2687(100,25) 0.9391 0.9415 0.9501 0.3379 0.3395 0.3535(100,40) 0.9474 0.9479 0.9513 0.2835 0.2845 0.2910(100,50) 0.9487 0.9494 0.9517 0.2632 0.2639 0.2686(100,100) 0.9473 0.9473 0.9488 0.2149 0.2154 0.2174

6 L. SINGHASOMBOON ET AL.

Page 8: Confidence intervals for the ratio of medians of two ...uregina.ca/~volodin/Confidence.pdfnormal distributions; for example, the interval for a single log-normal mean has been addressed

4. An applications to real data

Air pollution, especially particulate matter (PM) from vehicles, has a major impact on the healthof people who live in urban centers and near traffic. The World Bank studies the health effects ofparticulate matter air pollution in Bangkok. In one of their studies, they found that there were4,000–5,500 premature deaths each year in urban centers, with hospital admissions for respiratorydiseases related to the levels of particulate matter (Pollution Control Department, Ministry ofScience Technology and Environment, Bangkok, Thailand 1999). The top causes of air pollutionin Thailand are (1) vehicular emissions in cities, (2) biomass burning and transboundary haze inrural and border areas, and (3) industrial discharges in concentrated industrialized zones.Therefore, in this example, we study the PM2.5 mass concentrations in Bangkapi and Dindaengareas, which are located in urban centers with busy roads representing a high traffic site. Wecompare the ratio of medians of the measurements in both areas with our proposed CI. ThePM2.5 mass concentration measurements (ug/m3) were recorded simultaneously in the areas bythe Pollution Control Department (PCD) every fourth day at 9:00 AM local time from March2019 to February 2020. The data can be found in http://aqmthai.com/aqi.php.

The data sets from this study are as followsThe PM2.5 mass concentration (ug/m3) in Bangkapi: 25,22,16,32,31,20,28,17,18,14,13,19,20,16,

10,17,34,23,24,21,18,7,11,19,9,6,5,14,7,18,14,12,9,8,16,14,11,7,9,7,11,12,11,8,8,15,15,7,18,15,36,48,47,27,14,23,18,24,24,21,13,36,39,24,30,29,17,21,32,38,32,28,25,31,18,34,49,24,32,56,16,16,41,21,39,12,20,50,56, 31

Table 2. Coverage probabilities and Average length of 95% CIs for w when ðr21,r22Þ ¼ ð0:30, 0:30Þ:

Parameters (n1,n2)

Coverage probabilities Average length

NA MOVER GCI NA MOVER GCI

ðl1 ¼ 1,l2 ¼ 1Þ (25) 0.9373 0.9433 0.9532 0.6104 0.6200 0.6505(25,40) 0.9415 0.9433 0.9506 0.5514 0.5585 0.5814(25,50) 0.9415 0.9446 0.9519 0.5280 0.5342 0.5558(25,100) 0.942 0.9415 0.9511 0.4825 0.4872 0.5077(25,40) 0.945 0.9451 0.9549 0.5497 0.5567 0.5795(40) 0.9435 0.9457 0.9513 0.4838 0.4885 0.5022(40,50) 0.9444 0.9461 0.9525 0.4572 0.4613 0.4729(40,100) 0.9478 0.95 0.9546 0.4024 0.4051 0.4146(25,50) 0.941 0.9436 0.9529 0.5285 0.5348 0.5562(40,50) 0.9452 0.947 0.9528 0.4559 0.4599 0.4716(50) 0.9466 0.9489 0.9526 0.4309 0.4343 0.4436(50,100) 0.944 0.9469 0.9499 0.3731 0.3753 0.3820(100,25) 0.938 0.9416 0.9501 0.4793 0.4841 0.5042(100,40) 0.948 0.9479 0.9513 0.4019 0.4047 0.4139(100,50) 0.9469 0.9494 0.9517 0.3731 0.3752 0.3819(100,100) 0.9465 0.9473 0.9488 0.3045 0.3057 0.3086

ðl1 ¼ 3,l2 ¼ 3Þ (25) 0.9381 0.9422 0.9512 0.6111 0.6208 0.6510(25,40) 0.9386 0.9422 0.9505 0.5493 0.5563 0.5790(25,50) 0.9445 0.9462 0.954 0.5287 0.5350 0.5566(25,100) 0.9424 0.9433 0.9504 0.4813 0.4861 0.5064(25,40) 0.945 0.9488 0.9575 0.5491 0.5561 0.5788(40) 0.9393 0.9435 0.9516 0.4821 0.4868 0.5006(40,50) 0.9438 0.9441 0.9499 0.4577 0.4617 0.4734(40,100) 0.949 0.9494 0.9541 0.4028 0.4056 0.4153(25,50) 0.9448 0.946 0.9539 0.5277 0.5339 0.5555(40,50) 0.9449 0.9477 0.9543 0.4571 0.4611 0.4725(50) 0.9457 0.9485 0.9534 0.4312 0.4345 0.4440(50,100) 0.9465 0.9507 0.9538 0.3725 0.3746 0.3813(100,25) 0.9413 0.9436 0.952 0.4812 0.4860 0.5064(100,40) 0.9471 0.948 0.9526 0.4027 0.4055 0.4149(100,50) 0.9464 0.9472 0.9523 0.3733 0.3755 0.3822(100,100) 0.9516 0.9515 0.9514 0.3042 0.3054 0.3082

COMMUNICATIONS IN STATISTICS - SIMULATION AND COMPUTATIONVR 7

Page 9: Confidence intervals for the ratio of medians of two ...uregina.ca/~volodin/Confidence.pdfnormal distributions; for example, the interval for a single log-normal mean has been addressed

The PM2.5 mass concentration (ug/m3) in Dindaeng: 33,24,18,38,32,24,28,20,21,21,14,16,20,20,21,32,23,23,19,19,9,10,20,11,10,9,15,10,17,13,15,14,12,17,15,14,11,11,10,15,14,11,10,14,15,14,11,18,19,35,48,54,29,17,25,19,23,24,20,14,37,38,24,30,27,22,23,27,39,54,44,35,26,41,27,39,51,31,40,60,24,23,53,26,44,15,20,50,53,29

Figure 1 shows the QQ-plots for the original data and log-transformed data. These plots showthat the distribution of PM2.5 concentrations are positively skewed, and the logarithmically trans-formed data are approximately symmetric. The Shapiro-Wilk tests for the normality on the log-transformed data give a p-value of 0.3718 for the Bangkapi area and 0.0610 for the Dindaengarea, while the same tests on the original data give a p-value of 4.55e-05 for the Bangkapi areaand 4.118e-06 for the Dindaeng area. So, the log- transformation normalizes the data, and thesummary statistics of the log-transformed of the PM2.5 concentration data are given in Table 4.

Table 5 gives the 95% two-sided confidence intervals for the w based on the proposed CI con-struction approaches. This result shows that the NA confidence interval has a shorter length sizethan the MOVER and GCI, respectively. The results are consistent with the results of our simula-tion study (see Table 3). As previously mentioned, the NA approach is consistent for thisexample; we obtain a 95% CI for the ratio of medians that is equal to (0.7267, 0.9915). Theresults can be interpreted as the median (average) of the PM2.5 mass concentration in theBangkapi area is less than the Dindaeng area for the period from March 2019 to February 2020.It also means that the chances of death of people living in the Bangkapi area affected by PM2.5might be less than in the Dindaeng area.

Table 3. Coverage probabilities and Average length of 95% CIs for w when ðr21,r22Þ ¼ ð0:30, 0:20Þ:

Parameters (n1,n2)

Coverage probabilities Average length

NA MOVER GCI NA MOVER GCI

ðl1 ¼ 1,l2 ¼ 1Þ (25) 0.9408 0.9451 0.9544 0.5278 0.5340 0.5602(25,40) 0.9418 0.9437 0.9537 0.4928 0.4979 0.5201(25,50) 0.9379 0.9386 0.9486 0.4798 0.4845 0.5060(25,100) 0.9357 0.9384 0.9478 0.4553 0.4594 0.4805(25,40) 0.9433 0.9443 0.9529 0.4572 0.4612 0.4784(40) 0.9472 0.9473 0.9527 0.4165 0.4195 0.4314(40,50) 0.9487 0.9485 0.9546 0.4026 0.4054 0.4162(40,100) 0.9386 0.943 0.9479 0.3717 0.3739 0.3837(25,50) 0.9422 0.9445 0.9515 0.4300 0.4333 0.4485(40,50) 0.9479 0.9475 0.9534 0.3889 0.3913 0.4009(50) 0.9446 0.9462 0.9495 0.3721 0.3743 0.3825(50,100) 0.9421 0.9443 0.9482 0.3389 0.3406 0.3474(100,25) 0.9452 0.9462 0.9547 0.3716 0.3738 0.3870(100,40) 0.9517 0.9511 0.9549 0.3221 0.3235 0.3300(100,50) 0.944 0.9438 0.9477 0.3048 0.3060 0.3108(100,100) 0.9475 0.9461 0.9492 0.2633 0.2640 0.2666

ðl1 ¼ 3,l2 ¼ 3Þ (25) 0.9418 0.9454 0.9548 0.5279 0.5341 0.5602(25,40) 0.9417 0.9411 0.9495 0.4932 0.4983 0.5205(25,50) 0.9419 0.9433 0.9517 0.4809 0.4857 0.5073(25,100) 0.9373 0.938 0.9492 0.4545 0.4586 0.4798(25,40) 0.9476 0.9474 0.9546 0.4571 0.4611 0.4781(40) 0.9458 0.9457 0.9515 0.4171 0.4201 0.4321(40,50) 0.9455 0.9456 0.951 0.4020 0.4047 0.4156(40,100) 0.9443 0.9453 0.9507 0.3721 0.3743 0.3841(25,50) 0.9392 0.942 0.95 0.4303 0.4337 0.4487(40,50) 0.9496 0.9505 0.9548 0.3887 0.3912 0.4008(50) 0.9471 0.9463 0.9506 0.3726 0.3748 0.3829(50,100) 0.9456 0.9455 0.9505 0.3406 0.3423 0.3491(100,25) 0.9419 0.9413 0.9495 0.3715 0.3736 0.3870(100,40) 0.9456 0.9453 0.9483 0.3225 0.3239 0.3302(100,50) 0.9463 0.9462 0.9503 0.3039 0.3051 0.3099(100,100) 0.9495 0.952 0.9534 0.2633 0.2641 0.2666

8 L. SINGHASOMBOON ET AL.

Page 10: Confidence intervals for the ratio of medians of two ...uregina.ca/~volodin/Confidence.pdfnormal distributions; for example, the interval for a single log-normal mean has been addressed

5. Concluding remarks

In this paper, we focus on the construction of confidence intervals for the ratio of medians oftwo independent, log-normal distributions based on the NA, MOVER and GCI approaches. Theperformances of the proposed CIs are compared in terms of the coverage probabilities (CP) andaverage length (AL) in our simulation study sections. The results show that the GCI confidenceinterval can be preferred generally in terms of CP; however, the AL is widest for all situations.The NA approach is recommended for moderate to large sample sizes when li and r2i are smallvalues. In addition, the MOVER approach may not be reliable when r2i has unequal values for allsample sizes.

Acknowledgments

We would also like to thank the editor and referees for their valuable suggestions on the revision of this paper.

Figure 1. Quantile plots of PM2.5 mass concentration data and log of PM2.5 mass concentration data of both areas.

Table 4. The summary statistics of the log-transformed of the PM2.5 mass concentration data ofboth areas.

Areas ni li r2iBangkapi 90 2.9286 0.3145Dindaeng 90 3.0805 0.2417

Table 5. The 95% confidence intervals for the w based on the proposed CI construc-tion approaches.

Methods Confidence interval Length

NA (0.7267, 0.9915) 0.2647MOVER (0.7364, 1.0022) 0.2658GCI (0.7355, 1.0034) 0.2679

COMMUNICATIONS IN STATISTICS - SIMULATION AND COMPUTATIONVR 9

Page 11: Confidence intervals for the ratio of medians of two ...uregina.ca/~volodin/Confidence.pdfnormal distributions; for example, the interval for a single log-normal mean has been addressed

Funding

The research of the last author is partially supported by Center of the Volga Region Federal District (project num-ber 075-02-2020-1478). The first author gratefully acknowledges the financial support provided by ScienceAchievement Scholarship of Thailand.

References

Angus, J. E. 1988. Inferences on the lognormal mean for complete samples. Communications in Statistics -Simulation and Computation 17 (4):1307–31. doi:10.1080/03610918808812726.

Angus, J. E. 1994. Bootstrap one-sided confidence intervals for the lognormal mean. The Statistician 43 (3):395–401. doi:10.2307/2348575.

Baklizi, A., and M. Ebrahem. 2005. Interval estimation of common lognormal mean of several populations. Journalof Probability and Statistical Science 3 (1):1–16.

Behboodian, J., and A. Jafari. 2006. Generalized inference for the common mean of several lognormal populations.Journal of Statistical Theory and Applications 5 (3):240–59.

Krishnamoorthy, K., and T. Mathew. 2003. Inferences on the means of lognormal distributions using generalizedp-values and generalized confidence intervals. Journal of Statistical Planning and Inference 115 (1):103–20. doi:10.1016/S0378-3758(02)00153-2.

Land, C. E. 1972. An evaluation of approximate confidence interval estimation methods for lognormal means.Technometrics 14 (1):145–58. doi:10.1080/00401706.1972.10488891.

Lin, S. H., and R. S. Wang. 2013. Modified method on the means for several log-normal distributions. Journal ofApplied Statistics 40 (1):194–208. doi:10.1080/02664763.2012.740622.

Malekzadeh, A., and M. Kharrati-Kopaei. 2018. Inferences on the common mean of several heterogeneous log-nor-mal distributions. Journal of Applied Statistics 33:1–18.

Pollution Control Department, Ministry of Science Technology and Environment, Bangkok, Thailand. 1999. Reporton situation and management of air pollution and Noise in Thailand 1998. Bangkok, Thailand: Ministry ofScience Technology and Environment.

Rao, K. A., and J. G. D’Cunha. 2016. Bayesian inference for median of the lognormal distribution. Journal ofModern Applied Statistical Methods 15 (2):526–35. doi:10.22237/jmasm/1478003400.

Tian, L., and J. Wu. 2007. Inferences on the common mean of several log-normal populations: The generalizedvariable approach. Biometrical Journal. Biometrische Zeitschrift 49 (6):944–51. doi:10.1002/bimj.200710391.

Weerahandi, S. 1993. Generalized confidence intervals. Journal of the American Statistical Association 88 (423):899–905. doi:10.1080/01621459.1993.10476355.

Wu, J., G. Jiang, A. C. M. Wong, and X. Sun. 2002. Likelihood analysis for the ratio of means of two independentlog-normal distributions. Biometrics 58 (2):463–69. doi:10.1111/j.0006-341x.2002.00463.x.

Zellner, A. 1971. Bayesian and non-Bayesian analysis of the log-normal distribution and log-normal regression.Journal of the American Statistical Association 66 (334):327–30. doi:10.1080/01621459.1971.10482263.

Zhou, X. H., and S. Gao. 1997. Confidence intervals for the lognormal mean. Statistics in Medicine 16 (7):783–90.doi:10.1002/(SICI)1097-0258(19970415)16:7<783::AID-SIM488>3.0.CO;2-2.

Zhou, X.-H., and W. Tu. 2000. Interval estimation for the ratio in means of log-normally distributed medical costswith zero values. Computational Statistics & Data Analysis 35 (2):201–10. doi:10.1016/S0167-9473(00)00009-8.

Zou, G. Y. 2008. On the estimation of additive interaction by use of the four-by-two table and beyond. AmericanJournal of Epidemiology 168 (2):212–24. doi:10.1093/aje/kwn104.

Zou, G. Y., and A. Donner. 2008. Construction of confidence limits about effect measures: A general approach.Statistics in Medicine 27 (10):1693–702. doi:10.1002/sim.3095.

10 L. SINGHASOMBOON ET AL.