
Statistical Methodology 6 (2009) 70–81


Kumaraswamy’s distribution: A beta-type distribution with some tractability advantages
M.C. Jones ∗
Department of Mathematics and Statistics, The Open University, Walton Hall, Milton Keynes MK7 6AA, UK

Article info

Article history:
Received 10 December 2007
Received in revised form 1 April 2008
Accepted 2 April 2008

Keywords:
Beta distribution
Distribution theory
Minimax distribution
Minimum of maxima
Order statistics

Abstract

A two-parameter family of distributions on (0, 1) is explored which has many similarities to the beta distribution and a number of advantages in terms of tractability (it also, of course, has some disadvantages). Kumaraswamy’s distribution has its genesis in terms of uniform order statistics, and has particularly straightforward distribution and quantile functions which do not depend on special functions (and hence afford very easy random variate generation). The distribution might, therefore, have a particular role when a quantile-based approach to statistical modelling is taken, and its tractability has appeal for pedagogical uses. To date, the distribution has seen only limited use and development in the hydrological literature.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Despite the many alternatives and generalisations [20,27], it remains fair to say that the beta distribution provides the premier family of continuous distributions on bounded support (which is taken to be (0, 1)). The beta distribution, Beta(a, b), has density

g(x) = {1/B(a, b)} x^(a−1) (1 − x)^(b−1), 0 < x < 1, (1.1)

where its two shape parameters a and b are positive and B(·, ·) is the beta function. Beta densities are unimodal, uniantimodal, increasing, decreasing or constant depending on the values of a and b relative to 1, and have a host of other attractive properties ([15], Chapter 25). The beta distribution is fairly tractable, but in some ways not fabulously so; in particular, its distribution function is an incomplete beta function ratio and its quantile function the inverse thereof.

∗ Tel.: +44 1908 652209; fax: +44 1908 655515. E-mail address: [email protected].

doi:10.1016/j.stamet.2008.04.001


In this paper, I take a look at an alternative two-parameter distribution on (0, 1) which I will call Kumaraswamy’s distribution, Kumaraswamy(α, β), where I have denoted its two positive shape parameters α and β. It has many of the same properties as the beta distribution but has some advantages in terms of tractability. Its density is

f(x) = f(x; α, β) = αβ x^(α−1) (1 − x^α)^(β−1), 0 < x < 1. (1.2)

Alert readers might recognise it in some way, especially if they are familiar with the hydrological literature where it dates back to [22]. However, the distribution does not seem to be very familiar to statisticians, has not been investigated systematically in much detail before, nor has its relative interchangeability with the beta distribution been widely appreciated. For example, Kumaraswamy’s densities are also unimodal, uniantimodal, increasing, decreasing or constant depending in the same way as the beta distribution on the values of its parameters. (Boundary behaviour and the main special cases are also common to both beta and Kumaraswamy’s distribution.) And yet the normalising constant in (1.2) is very simple and the corresponding distribution and quantile functions also need no special functions. The latter gives Kumaraswamy’s distribution an advantage if viewed from the quantile modelling perspective popular in some quarters [28,7]. Some other properties are also more readily available, mathematically, than their counterparts for the beta distribution. Yet the beta distribution also has its particular advantages and I hesitate to claim whether, in the end, the tractability advantages of Kumaraswamy’s distribution will prove to be of immense practical significance in statistics; at the very least, Kumaraswamy’s distribution might find a pedagogical role.

The background and genesis of Kumaraswamy’s distribution are given in Section 2 along with its principal special cases. The basic properties of Kumaraswamy’s distribution are given in Section 3, whose easy reading reflects the tractability of the distribution. A deeper investigation of the skewness and kurtosis properties of the distribution is given in Section 4. Inference by maximum likelihood is investigated in Section 5 while, in Section 6, a number of further related distributions are briefly considered. It should be the case that similarities and differences between the beta and Kumaraswamy’s distributions are made clear as the paper progresses but, in any case, they are summarised and discussed a little more in the closing Section 7.

For references to the hydrological literature on Kumaraswamy’s distribution, see [26]. The current article gives a much more complete account of the properties of the Kumaraswamy distribution than any previous publication, however.

Note that the linear transformation ℓ + (u − ℓ)X moves a random variable X on (0, 1) to any other bounded support (ℓ, u). So, provided ℓ and u do not depend on α or β and are known, there is no need to mention such an extension further.

2. Genesis, forebears and special cases

Temporarily, set a = m, b = n + 1 − m in the beta distribution, where m and n are positive integers. Then, as is well known, the beta distribution is the distribution of the mth order statistic from a random sample of size n from the uniform distribution (on (0, 1)). Now consider another simple construction involving uniform order statistics. Take a set of n independent random samples each of size m from the uniform distribution and collect their maxima; take X, say, to be the minimum of the set of maxima; then X has Kumaraswamy’s distribution with parameters m and n. (Likewise, by symmetry, one minus the maximum of a set of n minima of independent uniform random samples of size m also has Kumaraswamy’s distribution.) A more descriptive name might be the minimax distribution, which I used in an earlier version of this paper and under which name the distribution is listed in the recent work of [23].
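This genesis is easy to check by simulation. The following minimal Python sketch (the values m = 2, n = 3, the seed and the sample size are arbitrary illustrative choices, not taken from the paper) compares the empirical distribution function of such minima of maxima with the Kumaraswamy(m, n) distribution function 1 − (1 − x^m)^n given at (3.1) below:

import numpy as np

rng = np.random.default_rng(0)           # seed chosen only for reproducibility
m, n, reps = 2, 3, 100_000               # illustrative choices

# n independent uniform samples of size m; keep their maxima, then take the minimum
u = rng.uniform(size=(reps, n, m))
x = u.max(axis=2).min(axis=1)

# compare the empirical CDF with the Kumaraswamy(m, n) CDF
grid = np.linspace(0.05, 0.95, 10)
ecdf = (x[:, None] <= grid).mean(axis=0)
F = 1.0 - (1.0 - grid**m)**n
print(np.max(np.abs(ecdf - F)))          # small: Monte Carlo error only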

Now, the maxima themselves constitute a random sample of size n from the power function distribution, which is the Beta(m, 1) distribution (it has density mx^(m−1), 0 < x < 1). Hence X is also the minimum of a random sample from the power function distribution.

As with the beta distribution, for greater generality the integer-valued parameters m and n may be replaced in Kumaraswamy’s distribution by real-valued, positive, parameters α and β.

Kumaraswamy [22] was interested in distributions for hydrological random variables and actually proposed a mixture of a probability mass, F0, at zero and density (1.2) over (0, 1), although I am using the terminology "Kumaraswamy’s distribution" to refer solely to the latter. Kumaraswamy gave a number of the basic properties of the distribution but made no mention of its comparison with the beta distribution. The "double power" distribution described by [21] is that of one minus a random variable with Kumaraswamy’s distribution, but they were interested only in α, β > 1.

It is also clear that both beta and Kumaraswamy distributions are special cases of the three-parameter distribution with density

g(x) = {p/B(γ, δ)} x^(γp−1) (1 − x^p)^(δ−1), 0 < x < 1, (2.1)

and p > 0. This is the generalised beta distribution of [25]. It is the distribution of the 1/pth power of a Beta(γ, δ) random variable or of the γth order statistic of a sample of size γ + δ − 1 from the power function distribution Beta(p, 1) (for γ, δ integer). (The order statistics version of the generalised beta density can be found, for example, in Example 2.2.2 of [1].) Beta and Kumaraswamy distributions are therefore the (p, γ, δ) = (1, a, b) and (α, 1, β) special cases, respectively, of (2.1). This point is also made by [26]. The similarities between beta and Kumaraswamy distributions that will become clear through the rest of this article lead me, however, to the conclusion that this generalised beta distribution is not very useful in practice, since it produces very similar distributions for quite different parameter values.

The beta and Kumaraswamy distributions share their main special cases. Beta(a, 1) and Kumaraswamy(α, 1) distributions are both the power function distribution mentioned above, and Beta(1, a) and Kumaraswamy(1, α) distributions are both the distribution of one minus that power function random variable. Beta(1, 1) and Kumaraswamy(1, 1) distributions are both the uniform distribution.

A further special case of the Kumaraswamy distribution has also appeared elsewhere. The Kumaraswamy(2, β) distribution is that of the "generating variate" R = √(x1^2 + x2^2) when {x1, x2} follow a bivariate Pearson Type II distribution ([5], Section 3.4.1). This has been used in an algorithm to generate univariate symmetric beta random variates [31,4, p. 436].

3. Basic properties

The distribution function of the Kumaraswamy distribution is

F(x) = 1 − (1 − x^α)^β, 0 < x < 1. (3.1)

This compares extremely favourably in terms of simplicity with the beta distribution’s incomplete beta function ratio.

The distribution function is readily invertible to yield the quantile function

Q(y) = F⁻¹(y) = {1 − (1 − y)^(1/β)}^(1/α), 0 < y < 1. (3.2)

As already mentioned, this facilitates ready quantile-based statistical modelling [28,7]. Moreover, I know of no other two-parameter quantile family on (0, 1) so simply defined and yet with such good behaviour (as follows, with an explicit simple density function to boot!). The popular generalised lambda family ([29,7, Section 7.3]) with quantile function y^γ − (1 − y)^δ has support (0, 1) for γ, δ > 0 but encompasses bimodality, repeated incarnations of the uniform distribution and complicated patterns of skewness and kurtosis (e.g. [19]). A better competitor, almost as tractable as the Kumaraswamy distribution and with some similar properties, is the LB distribution [30].

Formula (3.2) also facilitates trivial random variate generation (as noted by [22]). If U ∼ U(0, 1), then X ∼ f if

X = (1 − U^(1/β))^(1/α). (3.3)

This compares extremely favourably with the sophisticated algorithms preferred to generate random variates from the beta distribution ([4], Section IX.4, [15], Section 25.2).
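For instance, a minimal Python sampler based on (3.3) (the function name, parameter values and sample size are illustrative assumptions):

import numpy as np

def rkumaraswamy(size, alpha, beta, rng=None):
    # Kumaraswamy(alpha, beta) variates by inversion of (3.1), i.e. formula (3.3)
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(size=size)
    return (1.0 - u**(1.0 / beta))**(1.0 / alpha)

x = rkumaraswamy(100_000, alpha=2.0, beta=2.5)
print(x.min(), x.max())   # all draws lie in (0, 1)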


Fig. 1. Some Kumaraswamy densities. α = 5, β = 2: left-skewed unimodal density; α = 2, β = 2.5: almost symmetric unimodal density; α = 1/2, β = 1/2: uniantimodal density; α = 1, β = 3: decreasing density.

It can be shown that the Kumaraswamy distribution has the same basic shape properties as the beta distribution, namely:

α > 1, β > 1 ⇒ unimodal;  α < 1, β < 1 ⇒ uniantimodal;
α > 1, β ≤ 1 ⇒ increasing;  α ≤ 1, β > 1 ⇒ decreasing;
α = β = 1 ⇒ constant

([22] again). In the first two cases, the mode/antimode is at

x0 = {(α − 1)/(αβ − 1)}^(1/α). (3.4)

Both beta and Kumaraswamy densities are log-concave if and only if both their parameters are greater than or equal to 1.

The behaviour of the Kumaraswamy density also matches that of the beta density at the boundaries of their support:

f(x) ∼ x^(α−1) as x → 0;  f(x) ∼ (1 − x)^(β−1) as x → 1.

Some illustrative examples of Kumaraswamy densities are plotted in Fig. 1. Note the similarity to analogous depictions of the beta family (e.g. [15], Figure 25.1b).

In [16], I suggested swapping the roles of distribution and quantile functions for distributions on (0, 1) to produce "complementary distributions". There, I applied the idea to the beta distribution to produce the complementary beta distribution. Applying the same idea here, a pleasing symmetry is observed that means that nothing is really gained. The density of the complementary distribution is the quantile density function, q(y) = Q′(y), of the Kumaraswamy distribution, and

q(y) = {1/(αβ)} (1 − y)^((1/β)−1) {1 − (1 − y)^(1/β)}^((1/α)−1) = f(1 − y; 1/β, 1/α).

The moments of the Kumaraswamy distribution are both immediate and obtainable from [22,25,21,6] or [1, p. 15]:

E(X^r) = β B(1 + r/α, β). (3.5)

Like those of the beta distribution they exist for all r > −α. I note the first appearance of a special function, although it is still only the (complete) beta function. (It might be interesting to note the moment formula in terms of the binomial coefficient with non-integer arguments: E(X^r) = 1/C(β + (r/α), β).) In particular,

E(X) = β B(1 + 1/α, β);

Var(X) = β B(1 + 2/α, β) − {β B(1 + 1/α, β)}^2.
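As a quick numerical check of these formulae (a minimal sketch; the parameter values and sample size are arbitrary illustrations), (3.5) can be compared with Monte Carlo estimates obtained via the sampler (3.3):

import numpy as np
from scipy.special import beta as B

alpha, beta_ = 2.0, 2.5
rng = np.random.default_rng(1)
x = (1.0 - rng.uniform(size=200_000)**(1.0 / beta_))**(1.0 / alpha)   # formula (3.3)

for r in (1, 2, 3):
    exact = beta_ * B(1.0 + r / alpha, beta_)          # E(X^r) from (3.5)
    print(r, exact, (x**r).mean())                     # the pairs agree closely

var_exact = beta_ * B(1.0 + 2.0 / alpha, beta_) - (beta_ * B(1.0 + 1.0 / alpha, beta_))**2
print(var_exact, x.var())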

The first L-moment is the mean. Expressions for higher L-moments [12,14] are explicitly available (an improvement on the situation for the beta distribution). However, the general formula for the rth L-moment is a sum of r terms involving several gamma functions; see Appendix A for details. Here, just the scale measure which is the second L-moment (half the Gini mean difference) is given:

λ2 = β{B(1 + 1/α, β) − 2B(1 + 1/α, 2β)}.

Distributions of order statistics and their moments are also relatively tractable but not especially edifying (but this too is an improvement on the situation for the beta distribution). They are dealt with briefly in Appendix B.

4. Skewness and kurtosis

4.1. Skewness

A strong skewness ordering between distributions is the classical one due to [32]. It is immediate that skewness to the right increases, in this sense, with decreasing α for fixed β. This is because the transformation from X1 ∼ Kumaraswamy(α1, β) to a Kumaraswamy(α2, β) random variate is of the form X1^(α1/α2), which is convex for α1 > α2. There seems to be no such simple property for changing β and fixed α.

Various scalar skewness measures can be plotted as functions of α and β. For example, the L-skewness ([12]; formula from Appendix A),

τ3 = λ3/λ2 = {B(1 + 1/α, β) − 6B(1 + 1/α, 2β) + 6B(1 + 1/α, 3β)} / {B(1 + 1/α, β) − 2B(1 + 1/α, 2β)},

is shown in Fig. 2. Note the increasing nature of the skewness as Fig. 2 is traversed diagonally from bottom right to top left. Of course, this measure respects the van Zwet ordering, being a decreasing function of α for fixed β. Similar patterns arise for the classical third-moment measure and the quantile-based skewness measure, {Q(3/4) − 2Q(1/2) + Q(1/4)}/{Q(3/4) − Q(1/4)} [3]; these are not shown. For α, β > 1, one can also plot the skewness measure 1 − 2F(x0) [2], with similar results again (not shown).

The pattern in Fig. 2 is broadly in line with similar pictures for the beta distribution, although the latter are symmetric about the line a = b. They both contrast with the more complicated patterns of skewness associated with the generalised lambda distribution ([19], Figure 7(a), (c) for τ3).

4.2. Symmetry and near-symmetry

Perhaps the least attractive feature of the Kumaraswamy distribution, at least at first thought, is that, unlike the beta distribution, it has no symmetric special cases other than the uniform distribution (α = β = 1). It does, however, have a range of almost-symmetric special cases. Some of these are shown in Fig. 3. They correspond to zero skewness in the sense of [2]. I chose this measure (and used it for uniantimodal densities as well as unimodal ones) because it has a simple formula:

γ = 1 − 2F(x0) = 2{α(β − 1)/(αβ − 1)}^β − 1;


Fig. 2. L-skewness for the Kumaraswamy distribution, plotted as a function of log(α) and log(β). The straight lines superimposed on the plot correspond to α = 1 and β = 1.

Fig. 3. Some almost-symmetric Kumaraswamy densities. In order of decreasing height of mode/antimode: (i) α = 3.211, β = 100; (ii) α = 2.470, β = 5; (iii) α = 1.707, β = 2; (iv) α = β = 1 (the symmetric uniform density); (v) α = 2/5, β = 1/2; (vi) α = 0.082, β = 1/4.

it is easy to see that γ = 0 whenever

α = 1/{β − (β − 1)2^(1/β)}.
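For example, β = 2, 5 and 100 give α ≈ 1.707, 2.470 and 3.211 respectively, the values used in Fig. 3, and the resulting γ is essentially zero in each case. A minimal numerical check (the function name is mine; the skewness measure is that of [2] as given above):

import numpy as np

def gamma_skew(alpha, beta):
    # Arnold-Groeneveld skewness: gamma = 1 - 2 F(x0) = 2 {alpha(beta - 1)/(alpha*beta - 1)}^beta - 1
    return 2.0 * (alpha * (beta - 1.0) / (alpha * beta - 1.0))**beta - 1.0

for beta in (2.0, 5.0, 100.0):
    alpha = 1.0 / (beta - (beta - 1.0) * 2.0**(1.0 / beta))   # the zero-skewness curve above
    print(beta, round(alpha, 3), gamma_skew(alpha, beta))      # gamma is essentially zero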

The zero curves for other skewness measures follow very similar trajectories.

The densities in Fig. 3 are indeed reasonably symmetric and, provided α and β are not too big, resemble symmetric beta distributions. It is only when β becomes large that, while almost-symmetry is retained (as is a small variance), something happens which is different from – but not necessarily less desirable than – the beta distribution: the modal location, (3.4), shifts away from the centre. (Almost-symmetry about a point towards the right of the unit interval would necessitate f(1 − x).)


Fig. 4. L-kurtosis for the Kumaraswamy distribution, plotted as a function of log(α) and log(β). The straight lines superimposed on the plot correspond to α = 1 and β = 1.

4.3. Kurtosis

The L-kurtosis ([12]; formula from Appendix A), defined by τ4 = λ4/λ2, is

τ4 = {B(1 + 1/α, β) − 12B(1 + 1/α, 2β) + 30B(1 + 1/α, 3β) − 20B(1 + 1/α, 4β)} / {B(1 + 1/α, β) − 2B(1 + 1/α, 2β)};

it is plotted as a function of α and β in Fig. 4. Notice how the L-kurtosis increases along the line of ‘near-symmetry’ as the parameters get bigger. It also increases as one goes away from near-symmetry when either parameter decreases. This behaviour is also observed for the classical fourth-moment and the third-difference-quantile-based kurtosis measures (not shown) as well as being, once more, in line with similar pictures for the beta distribution, except for the symmetry about the diagonal line there. Again, the generalised lambda distribution on (0, 1) displays a more complicated pattern of kurtosis ([19], Figure 7(b), (d) for τ4).
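Both ratios are immediate to evaluate from the (complete) beta function. A minimal sketch (the function name and parameter values are illustrative assumptions):

from scipy.special import beta as B

def l_skew_kurt(alpha, beta):
    # tau3 and tau4 for Kumaraswamy(alpha, beta); the common factor beta in each lambda_r cancels
    b = [B(1.0 + 1.0 / alpha, k * beta) for k in (1, 2, 3, 4)]
    denom = b[0] - 2.0 * b[1]
    tau3 = (b[0] - 6.0 * b[1] + 6.0 * b[2]) / denom
    tau4 = (b[0] - 12.0 * b[1] + 30.0 * b[2] - 20.0 * b[3]) / denom
    return tau3, tau4

print(l_skew_kurt(2.0, 2.5))    # the almost symmetric case of Fig. 1: tau3 near zero
print(l_skew_kurt(0.5, 0.5))    # a uniantimodal case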

5. Likelihood inference

In this section, I consider maximum likelihood estimation for the Kumaraswamy distribution; maximum likelihood has not been considered for this distribution before. Let X1, . . . , Xn be a random sample from the Kumaraswamy distribution and let circumflexes denote maximum likelihood estimates of parameters. Differentiating the log likelihood with respect to β leads immediately to the relation

β̂ = −n / ∑_{i=1}^{n} log(1 − Xi^α̂). (5.1)

I can now concentrate on the equation to be satisfied by α arising from substituting for β in the other score equation ([24], p. 762). This is

S(α) ≡ (n/α){1 + T1(α) + T2(α)/T3(α)} = 0, (5.2)


where

T1(α) = n⁻¹ ∑_{i=1}^{n} log Yi/(1 − Yi),    T2(α) = n⁻¹ ∑_{i=1}^{n} Yi log Yi/(1 − Yi),    T3(α) = n⁻¹ ∑_{i=1}^{n} log(1 − Yi)

and Yi = Xi^α, i = 1, . . . , n, remembering that each Yi is a function of α. It is shown in Appendix C that lim_{α→0} S(α) > 0 and lim_{α→∞} S(α) < 0 and so, given the continuity of S(α), there is at least one zero of (5.2) in (0, ∞). However, although it is possible to obtain various further properties of Tj(α) and its derivatives, j = 1, 2, 3, I have been able to derive no further general properties of the shape of S(α). The safest approach, therefore, is to evaluate S(α) over a fine grid of values (on the log α scale), plot it if desired, identify all zeroes of the function and then evaluate the log likelihood at those zeroes to identify its global maximum.
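A minimal Python sketch of that recipe (the grid range, the bracketing root-finder, the function names and the simulated sample are illustrative assumptions, not prescriptions from the paper):

import numpy as np
from scipy.optimize import brentq

def S(alpha, x):
    # profile score (5.2), with beta profiled out via (5.1)
    y = x**alpha
    t1 = np.mean(np.log(y) / (1.0 - y))
    t2 = np.mean(y * np.log(y) / (1.0 - y))
    t3 = np.mean(np.log(1.0 - y))
    return (len(x) / alpha) * (1.0 + t1 + t2 / t3)

def loglik(alpha, beta, x):
    return np.sum(np.log(alpha) + np.log(beta) + (alpha - 1.0) * np.log(x)
                  + (beta - 1.0) * np.log(1.0 - x**alpha))

def fit_kumaraswamy(x, grid=None):
    grid = np.logspace(-2, 2, 201) if grid is None else grid      # fine grid on the log alpha scale
    s = np.array([S(a, x) for a in grid])
    roots = [brentq(S, grid[i], grid[i + 1], args=(x,))
             for i in range(len(grid) - 1) if s[i] * s[i + 1] < 0]   # all sign changes of S
    fits = [(a, -len(x) / np.sum(np.log(1.0 - x**a))) for a in roots]  # beta-hat from (5.1)
    return max(fits, key=lambda ab: loglik(ab[0], ab[1], x))           # global maximum of the likelihood

rng = np.random.default_rng(2)
x = (1.0 - rng.uniform(size=500)**(1.0 / 2.5))**(1.0 / 2.0)   # Kumaraswamy(2, 2.5) sample via (3.3)
print(fit_kumaraswamy(x))                                     # estimates close to (2.0, 2.5)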

Properties of the maximum likelihood estimates thus obtained follow from the usual asymptotic likelihood theory. It is readily shown that the elements of the observed information matrix are:

Iαα = n/α^2 + {(β − 1)/α^2} ∑_{i=1}^{n} Yi(log Yi)^2/(1 − Yi)^2;    Iαβ = (n/α)T2(α);    Iββ = n/β^2.

And a little more manipulation taking advantage of the fact that Y ∼ Beta(1, β) gives the elements of the expected information matrix as:

n⁻¹Iαα = A/α^2;    n⁻¹Iαβ = B/α;    n⁻¹Iββ = 1/β^2, (5.3)

where

A = A(β) = 1 + {β/(β − 2)}[{ψ(β) − ψ(2)}^2 − {ψ′(β) − ψ′(2)}]  and  B = B(β) = −{ψ(β + 1) − ψ(2)}/(β − 1) < 0.

Here, ψ(z) = d log Γ(z)/dz is the digamma function. It follows from (5.3) that, asymptotically,

Var(α̂) ≈ α^2/{n(A − β^2 B^2)};    Var(β̂) ≈ β^2 A/{n(A − β^2 B^2)};    Corr(α̂, β̂) ≈ −βB/√A > 0.

Notice that the standard deviation of α̂ is proportional to α and (it seems, numerically) the constant of proportionality decreases with β; the standard deviation of β̂ is independent of α and increases with β. The correlation between the parameter estimates does not depend on α and increases with β from somewhere around 1/4 for small β to 1 for large β; this behaviour is very similar to that of the correlation between maximum likelihood estimators of the parameters of the beta distribution traced along the line a = b.
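A small sketch of these asymptotic quantities, using the expressions for A(β) and B(β) given above (scipy supplies the digamma and trigamma functions; the parameter values, sample size and function names are illustrative):

import numpy as np
from scipy.special import digamma, polygamma

def info_constants(beta):
    # A(beta) and B(beta) of (5.3); polygamma(1, .) is the trigamma function psi'
    A = 1.0 + (beta / (beta - 2.0)) * ((digamma(beta) - digamma(2.0))**2
                                       - (polygamma(1, beta) - polygamma(1, 2.0)))
    B_ = -(digamma(beta + 1.0) - digamma(2.0)) / (beta - 1.0)
    return A, B_

def asymptotic_summary(alpha, beta, n):
    A, B_ = info_constants(beta)
    det = A - beta**2 * B_**2
    se_alpha = np.sqrt(alpha**2 / (n * det))      # standard error of alpha-hat
    se_beta = np.sqrt(beta**2 * A / (n * det))    # standard error of beta-hat
    corr = -beta * B_ / np.sqrt(A)                # correlation between the estimates
    return se_alpha, se_beta, corr

print(asymptotic_summary(2.0, 2.5, n=500))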

6. Related distributions

6.1. Limiting distributions

It is straightforward to see that if the Kumaraswamy distribution is normalised by looking at the distribution of Y = β^(1/α)X (on (0, β^(1/α))), then its density, αy^(α−1){1 − (y^α/β)}^(β−1), tends to αy^(α−1) exp(−y^α) (on (0, ∞)), the density of the Weibull distribution, as β → ∞. This is the interesting analogue of the gamma limit that arises in similar circumstances in the case of the beta distribution.

Similarly, the distribution of Z = α(1− X) has limiting density

βe^(−z)(1 − e^(−z))^(β−1) (6.1)

(on (0, ∞)) as α → ∞. This is the distribution of minus the logarithm of the Beta(1, β) power function distribution (since for finite α, Z is minus the Box–Cox transformation with power 1/α of a Beta(1, β) random variable). This distribution has recently become quite popular under the name "generalised exponential distribution" [9,10].

As both α and β become large (in any relationship to one another), the limit associated with the overall normalisation α(1 − β^(1/α)X) is the extreme value density e^(−x) exp(−e^(−x)).

The exact distribution of the overall normalised random variable used above is, in fact, the kappa distribution of [13]. Notice, however, that this observation does not invalidate the novelty of the remainder of the paper since the support of the kappa distribution depends on α and β while that of the Kumaraswamy distribution does not. All of the limiting distributions correspond, of course, with those mentioned in [13] or obtained from extreme value theory when α = m, β = n.

6.2. Distributions related by transformation

Let us ignore the glib implication of the subsection title, which is "all of them", and consider transformations from a limited class. First, let us briefly consider some of the most obvious transformations to positive half-line and whole real line supports. These might include the odds ratio Y = X/(1 − X) from (0, 1) to (0, ∞) and what I have argued [18] is a natural extension Z = (Y − (1/Y))/2 from (0, ∞) to (−∞, ∞), which maintain the power tails of the Kumaraswamy distribution. Alternatively, take logs (Y = −log(1 − X), Z = log Y) to decrease tailweights with each transformation. Or combine the two. In particular, the Kumaraswamy odds distribution is quite an interesting heavy-tailed alternative to the F and generalized Pareto distributions. It has density

αβ y^(α−1){(1 + y)^α − y^α}^(β−1)/(1 + y)^(αβ+1), y > 0.

On the other hand, Y = −log(1 − X) yields a scaled version of the exponentially-tailed distribution with density (6.1).

A natural way of generating families of distributions on some other support from a simple starting distribution with density h and distribution function H, say, is to apply the quantile function H⁻¹ to a family of distributions on (0, 1). See [17] for this idea applied to the beta distribution. For the Kumaraswamy distributions, the resulting family has density

αβ h(x) H^(α−1)(x){1 − H^α(x)}^(β−1).

The transformations of the previous paragraph can, of course, be interpreted in this light, and are amongst the simplest transformations of this type. I particularly like to generate families of distributions on the whole real line from a symmetric H, α and β then becoming the shape parameters associated with the family.
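A minimal sketch with H taken to be the standard normal distribution function (an illustrative choice of symmetric H, not one singled out in the paper):

import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def kumaraswamy_normal_pdf(x, alpha, beta):
    # alpha*beta*h(x)*H(x)^(alpha-1)*{1 - H(x)^alpha}^(beta-1) with h, H standard normal
    H = norm.cdf(x)
    return alpha * beta * norm.pdf(x) * H**(alpha - 1.0) * (1.0 - H**alpha)**(beta - 1.0)

# the resulting density integrates to one over the real line
print(quad(kumaraswamy_normal_pdf, -np.inf, np.inf, args=(2.0, 3.0))[0])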

7. Conclusions

To assist the reader in deciding whether the Kumaraswamy distribution might be of use to him or her in terms of either research or teaching, I summarise the pros, cons and equivalences between the two distributions below. (Some of the pros of the beta distribution have not been mentioned previously in this paper.)

The Kumaraswamy and beta distributions have the following attributes in common:

• their general shapes (unimodal, uniantimodal, monotone or constant) and the dependence of those shapes on the values of their parameters;
• power function and uniform distributions as special cases;
• straightforward interpretations in terms of order statistics from the uniform distribution;
• explicit expressions for the mode/antimode (where appropriate);
• behaviour of densities as x → 0, 1;
• good behaviour of skewness and kurtosis measures as functions of the parameters of the distribution;
• broadly similar maximum likelihood estimation;
• simple standard (if different) limiting distributions.


The Kumaraswamy distribution has the following advantages over the beta distribution:

• a simple explicit formula for its distribution function not involving any special functions;
• ditto for the quantile function;
• as a consequence of the simplicity of the quantile function, a simple formula for random variate generation;
• explicit formulae for L-moments;
• simpler formulae for moments of order statistics.

The beta distribution has the following advantages over the Kumaraswamy distribution:

• simpler formulae for moments and the moment generating function;
• a one-parameter subfamily of symmetric distributions;
• simpler moment estimation;
• more ways of generating the distribution via physical processes;
• in Bayesian analysis, conjugacy with a simple distribution, the binomial.

This paper has offered the Kumaraswamy distribution as a viable alternative to the beta distribution that shares many of the latter’s properties while being easier to handle in several ways. I cautiously commend it to the reader. However, the Kumaraswamy distribution is certainly not superior to the beta distribution in every way! But it might be worth consideration from time to time by researchers who wish to utilise one or more of its simpler properties (e.g. quantile-based work and random variate generation) and it provides a useful example for the teaching of distributions.

Acknowledgements

I am very grateful to an anonymous referee of the first version of this paper for pointing out the history of this distribution in the hydrological literature, to Dr. Daniel Henderson for later drawing my attention to the existence of an anonymous Wikipedia article on Kumaraswamy’s distribution at http://en.wikipedia.org/wiki/Kumaraswamy_distribution, and to the review team at Statistical Methodology for prompting final polishes to the presentation.

Appendix A. L-moments

A general form for the L-moments for r ≥ 2 is

λr = (1/r) ∑_{j=0}^{r−2} (−1)^j C(r − 2, j) C(r, j + 1) J(r − 1 − j, j + 1),

where

J(i1, i2) = ∫_0^1 F(x)^(i1) {1 − F(x)}^(i2) dx, with C(n, k) denoting the binomial coefficient

[11]. In the case of the Kumaraswamy distribution

J(i1, i2) = ∫_0^1 {1 − (1 − x^α)^β}^(i1) (1 − x^α)^(βi2) dx
         = (1/α) ∫_0^1 {1 − w^β}^(i1) w^(βi2) (1 − w)^((1/α)−1) dw
         = (1/α) ∑_{k=0}^{i1} C(i1, k)(−1)^k B(β(k + i2) + 1, 1/α)
         = β ∑_{k=0}^{i1} C(i1, k)(−1)^k (k + i2) B(β(k + i2), 1 + 1/α).


Therefore,

λr = (β/r) ∑_{j=0}^{r−2} ∑_{k=0}^{r−j−1} (−1)^(j+k) C(r − 2, j) C(r, j + 1) C(r − 1 − j, k)(k + j + 1) B(β(k + j + 1), 1 + 1/α)
   = (β/r) ∑_{ℓ=1}^{r} (−1)^(ℓ−1) ℓ {∑_{j=0}^{min(ℓ−1, r−2)} C(r − 2, j) C(r, j + 1) C(r − 1 − j, r − ℓ)} B(βℓ, 1 + 1/α)
   = (β/r) ∑_{ℓ=1}^{r} (−1)^(ℓ−1) ℓ C(r, ℓ) {∑_{j=0}^{min(ℓ−1, r−2)} C(r − 2, j) C(ℓ, j + 1)} B(βℓ, 1 + 1/α)
   = (β/r) ∑_{ℓ=1}^{r} (−1)^(ℓ−1) ℓ C(r, ℓ) C(r + ℓ − 2, r − 1) B(βℓ, 1 + 1/α).

The final equality follows from (3.20) and (3.29) of [8].
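The final expression is straightforward to evaluate. The following sketch (parameter values are illustrative) checks that it reproduces λ2 of Section 3:

from scipy.special import beta as B, comb

def l_moment(r, alpha, beta):
    # lambda_r for Kumaraswamy(alpha, beta) from the last display above (r >= 2)
    total = sum((-1.0)**(l - 1) * l * comb(r, l) * comb(r + l - 2, r - 1)
                * B(beta * l, 1.0 + 1.0 / alpha) for l in range(1, r + 1))
    return beta * total / r

alpha, beta_ = 2.0, 2.5
lam2_direct = beta_ * (B(1.0 + 1.0 / alpha, beta_) - 2.0 * B(1.0 + 1.0 / alpha, 2.0 * beta_))
print(l_moment(2, alpha, beta_), lam2_direct)   # the two values coincide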

Appendix B. Moments of order statistics

The ith order statistic, Xi:n, of a random sample of size n from the Kumaraswamy distribution has density

{αβ/B(i, n + 1 − i)} x^(α−1) (1 − x^α)^(β(n+1−i)−1) {1 − (1 − x^α)^β}^(i−1), 0 < x < 1.

It follows that

E(Xi:n^r) = {αβ/B(i, n + 1 − i)} ∫_0^1 x^(r+α−1) {1 − (1 − x^α)^β}^(i−1) (1 − x^α)^(β(n+1−i)−1) dx
          = {β/B(i, n + 1 − i)} ∫_0^1 {1 − w^β}^(i−1) w^(β(n+1−i)−1) (1 − w)^(r/α) dw
          = {β/B(i, n + 1 − i)} ∑_{k=0}^{i−1} C(i − 1, k)(−1)^k B(β(k + n + 1 − i), 1 + r/α)
          = {β/B(i, n + 1 − i)} ∑_{ℓ=n+1−i}^{n} C(i − 1, n − ℓ)(−1)^(ℓ−(n+1−i)) B(βℓ, 1 + r/α).
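As a quick check of the final sum against simulation (the choices of r, i, n, the parameter values and the function name are arbitrary illustrations):

import numpy as np
from scipy.special import beta as B, comb

def os_moment(r, i, n, alpha, beta):
    # E(Xi:n^r) for the Kumaraswamy distribution, from the last display above
    total = sum(comb(i - 1, n - l) * (-1.0)**(l - (n + 1 - i)) * B(beta * l, 1.0 + r / alpha)
                for l in range(n + 1 - i, n + 1))
    return beta * total / B(i, n + 1 - i)

rng = np.random.default_rng(3)
alpha, beta_, i, n, r = 2.0, 2.5, 2, 5, 1
x = (1.0 - rng.uniform(size=(200_000, n))**(1.0 / beta_))**(1.0 / alpha)   # samples via (3.3)
emp = np.sort(x, axis=1)[:, i - 1].mean()          # Monte Carlo estimate of E(X2:5)
print(os_moment(r, i, n, alpha, beta_), emp)       # the two agree closely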

Appendix C. Limiting behaviour of S(α)

Recall the notation Yi = Xi^α, i = 1, . . . , n.

(I) α → 0. First, note that 1 − Yi ≈ −log Yi > 0 and hence that log(1 − Yi) ≈ log α < 0. It follows that

T1(α) ≈ T2(α) ≈ −1 and T3(α) ≈ log α

so that

n⁻¹S(α) ≈ −1/(α log α) → ∞.

(II) α→∞. In this case,

T1(α) ≈ n⁻¹ ∑_{i=1}^{n} log Yi,    T2(α) ≈ n⁻¹ ∑_{i=1}^{n} Yi log Yi,    and T3(α) ≈ −n⁻¹ ∑_{i=1}^{n} Yi.


It follows that

n⁻¹S(α) ≈ −n⁻¹ ∑_{i=1}^{n} (−log Xi) + ∑_{i=1}^{n} Yi(−log Xi) / ∑_{i=1}^{n} Yi
        ≈ −n⁻¹ ∑_{i=1}^{n} (−log Xi) + (−log Xn:n) < 0

because the mean of positive quantities is larger than their minimum.

References

[1] B.C. Arnold, N. Balakrishnan, H.N. Nagaraja, A First Course in Order Statistics, Wiley, New York, 1992.
[2] B.C. Arnold, R.A. Groeneveld, Measuring skewness with respect to the mode, The American Statistician 49 (1995) 34–38.
[3] A.L. Bowley, Elements of Statistics, sixth ed., Scribner, New York, 1937.
[4] L. Devroye, Non-Uniform Random Variate Generation, Springer, New York, 1986. Available at: http://cg.scs.carleton.ca/~luc/rnbookindex.html.
[5] K.T. Fang, S. Kotz, K.W. Ng, Symmetric Multivariate and Related Distributions, Chapman and Hall, London, 1990.
[6] S.C. Fletcher, K. Ponnambalam, Estimation of reservoir yield and storage distribution using moments analysis, Journal of Hydrology 182 (1996) 259–275.
[7] W.G. Gilchrist, Statistical Modelling With Quantile Functions, Chapman & Hall/CRC, Boca Raton, FL, 2001.
[8] H.W. Gould, Combinatorial Identities, Morgantown Printing and Binding Co., Morgantown, WV, 1972.
[9] R.D. Gupta, D. Kundu, Generalized exponential distributions, Australian and New Zealand Journal of Statistics 41 (1999) 173–188.
[10] R.D. Gupta, D. Kundu, Generalized exponential distribution: Existing results and some recent developments, Journal of Statistical Planning and Inference 137 (2007) 3537–3547.
[11] J.R.M. Hosking, The theory of probability weighted moments, Research Report RC 12210 (revised version), IBM Research Division, Yorktown Heights, NY, 1989.
[12] J.R.M. Hosking, L-moments: Analysis and estimation of distributions using linear combinations of order statistics, Journal of the Royal Statistical Society Series B 52 (1990) 105–124.
[13] J.R.M. Hosking, The four-parameter kappa distribution, IBM Journal of Research and Development 38 (1994) 251–258.
[14] J.R.M. Hosking, J.R. Wallis, Regional Frequency Analysis: An Approach Based on L-Moments, Cambridge University Press, Cambridge, 1997.
[15] N.L. Johnson, S. Kotz, N. Balakrishnan, Continuous Univariate Distributions, second ed., vol. 2, Wiley, New York, 1994.
[16] M.C. Jones, The complementary beta distribution, Journal of Statistical Planning and Inference 104 (2002) 329–337.
[17] M.C. Jones, Families of distributions arising from distributions of order statistics (with discussion), Test 13 (2004) 1–43.
[18] M.C. Jones, Connecting distributions with power tails on the real line, the half line and the interval, International Statistical Review 75 (2007) 58–69.
[19] J. Karvanen, A. Nuutinen, Characterizing the generalized lambda distribution by L-moments, Computational Statistics and Data Analysis 52 (2008) 1971–1983.
[20] S. Kotz, J.R. van Dorp, Beyond Beta: Other Continuous Families of Distributions With Bounded Support and Applications, World Scientific, New Jersey, 2004.
[21] D. Koutsoyiannis, T. Xanthopoulos, On the parametric approach to unit hydrograph identification, Water Resources Management 3 (1989) 107–128.
[22] P. Kumaraswamy, A generalized probability density function for double-bounded random processes, Journal of Hydrology 46 (1980) 79–88.
[23] L.M. Leemis, J.T. McQueston, Univariate distribution relationships, The American Statistician 62 (2008) 45–53.
[24] T. Mäkeläinen, K. Schmidt, G.P.H. Styan, On the existence and uniqueness of the maximum likelihood estimate of a vector-valued parameter in fixed-size samples, Annals of Statistics 9 (1981) 758–767.
[25] J.B. McDonald, Some generalized functions for the size distribution of income, Econometrica 52 (1984) 647–664.
[26] S. Nadarajah, On the distribution of Kumaraswamy, Journal of Hydrology 348 (2008) 568–569.
[27] S. Nadarajah, A.K. Gupta, Generalizations and related univariate distributions, in: A.K. Gupta, S. Nadarajah (Eds.), Handbook of Beta Distribution and Its Applications, Dekker, New York, 2004, pp. 97–163.
[28] E. Parzen, Nonparametric statistical data modelling (with comments), Journal of the American Statistical Association 74 (1979) 105–131.
[29] J.S. Ramberg, B.W. Schmeiser, An approximate method for generating asymmetric random variables, Communications of the Association for Computing Machinery 17 (1974) 78–82.
[30] P.R. Tadikamalla, N.L. Johnson, Systems of frequency curves generated by transformations of logistic variables, Biometrika 69 (1982) 461–465.
[31] G. Ulrich, Computer generation of distributions on the M-sphere, Applied Statistics 33 (1984) 158–163.
[32] W.R. van Zwet, Convex Transformations of Random Variables, Mathematisch Centrum, Amsterdam, 1964.