Commonly Used Distributionsweb.eng.fiu.edu/leet/Eng_Data/chapt04_2014.pdfΒ Β· (4.14) (4.15) ~π ,...
Transcript of Commonly Used Distributionsweb.eng.fiu.edu/leet/Eng_Data/chapt04_2014.pdfΒ Β· (4.14) (4.15) ~π ,...
2/11/2014
1
Chapter 4:
Commonly Used Distributions
1
Introduction
Statistical inference involves drawing a sample
from a population and analyzing the sample data
to learn about the population.
We often have some knowledge about the
probability mass function or probability density
function of the population.
In this chapter, we describe some of the standard
families of curves.
2
2/11/2014
2
Section 4.1:
The Binomial Distribution
We use the Bernoulli distribution when we have an experiment which can result in one of two outcomes. One outcome is labeled βsuccess,β and the other outcome is labeled βfailure.β
The probability of a success is denoted by π. The probability of a failure is then 1 β π.
Such a trial is called a Bernoulli trial with success probability π.
3
Examples of Bernoulli Trials
1. The simplest Bernoulli trial is the toss of a coin.
The two outcomes are heads and tails. If we
define heads to be the success outcome, then π is
the probability that the coin comes up heads. For a
fair coin, π = 1/2.
2. Another Bernoulli trial is a selection of a component
from a population of components, some of which
are defective. If we define βsuccessβ to be a
defective component, then π is the proportion of
defective components in the population.
4
2/11/2014
3
Binomial Distribution
If a total of n Bernoulli trials are conducted,
and
The trials are independent.
Each trial has the same success probability π.
π is the number of successes in the π trials.
then π has the binomial distribution with
parameters π and π, denoted π ~ π΅ππ(π, π).
5
Example 4.1
A fair coin is tossed 10 times. Let π be the
number of heads that appear. What is the
distribution of π?
6
2/11/2014
4
Probability Mass Function of
a Binomial Random Variable
If π ~ π΅ππ(π, π), the Probability Mass Function
of π is
π π₯ = π π = π₯ =
π!
π₯! π β π₯ !ππ₯(1 β π)πβπ₯ , π₯ = 0,1, β¦ , π
0, ππ‘βπππ€ππ π
7
(4.2)
Binomial Probability Histogram
8
Figure 4.1 (a) Bin(10, 0.4) (b) Bin(20, 0.1)
2/11/2014
5
Example 4.2
The probability that a newborn baby is a girl
is approximately 0.49. Find the probability
that of the next five single births in a
certain hospital, no more than two are
girls.
9
Another Use of the Binomial
Assume that a finite population contains items of
two types, successes and failures, and that a
simple random sample is drawn from the
population. Then if the sample size is no more
than 5% of the population, the binomial
distribution may be used to model the number of
successes.
10
2/11/2014
6
Example 4.3
A lot contains several thousand components, 10%
of which are defective. Nine components are
sampled from the lot. Let π represent the
number of defective components in the sample.
Find the probability that exactly two are
defective.
11
Tables for Binomial Probabilities
Table A.1 (in Appendix A) presents the binomial
probabilities of the form π(π β€ π₯) for π β€ 20and selected values of π.
Excel:
BINOM.DIST(number_s, trials, probability_s, cumulative)
Minitab:
Calc Probability Distributions Binomial
12
2/11/2014
7
Example 4.4
Of all the new vehicles of a certain model that are
sold, 20% require repairs to be done under
warranty during the first year of service. A
particular dealership sells 14 such vehicles.
What is the probability that fewer than five of
them require warranty repairs?
13
Mean and Variance of
a Binomial Random Variable
Mean: ππ = ππ
Variance: ππ2 = ππ(1 β π)
14
(4.3)
(4.4)
2/11/2014
8
Section 4.2:
The Poisson Distribution
One way to think of the Poisson distribution is
as an approximation to the binomial distribution
when π is large and π is small.
It is the case when π is large and π is small that
the mass function depends almost entirely on the
mean ππ, and very little on the specific values of πand π.
We can therefore approximate the binomial mass
function with a quantity π = ππ; this π is the
parameter in the Poisson distribution.
15
Probability Mass Function, Mean,
and Variance of Poisson Dist.
If π ~ ππππ π ππ(π), the probability mass function of π is
π π₯ = π π = π₯ = πβπππ₯
π₯!, π₯ = 0,1, 2β¦
0, ππ‘βπππ€ππ π
Mean and Variance: ππ = π, ππ2 = π
Note: π must be a discrete random variable and πmust be a positive constant.
16
(4.6)
(4.7)
(4.8)
2/11/2014
9
Poisson Probability Histogram
17
Figure 4.2 (a) Poisson(1) (b) Poisson(10)
Poisson Probabilities
Excel:
POISSON.DIST(π₯, mean, cumulative)
Minitab:
Calc Probability Distributions Poisson
18
2/11/2014
10
Example 4.9
Particles are suspended in a liquid medium at a
concentration of 6 particles per mL. A large
volume of the suspension is thoroughly agitated,
and then 3 mL are withdrawn. What is the
probability that exactly 15 particles are
withdrawn?
19
Section 4.3:
The Normal Distribution
The normal distribution (also called the Gaussian distribution) is by far the most commonly used distribution in statistics. This distribution provides a good model for many, although not all, continuous populations.
The normal distribution is continuous rather than discrete. The mean of a normal population may have any value, and the variance may have any positive value.
20
2/11/2014
11
Probability Density Function, Mean,
and Variance of Normal Dist.
The probability density function of a normal
population with mean and variance 2 is
given by
π π₯ =1
π 2πππ₯π β
(π₯βπ)2
2π2, ββ < π₯ < β
If π ~ π(,2), then the mean and variance
of π are given by ππ = π, ππ2 = π2
21
(4.9)
68-95-99.7% Rule
This figure represents a plot of the normal probability density
function with mean and standard deviation . Note that the
curve is symmetric about , so that is the median as well as
the mean. It is also the case for the normal population.
About 68% of the population is in the interval .
About 95% of the population is in the interval 2.
About 99.7% of the population is in the interval 3.
22
2/11/2014
12
Standard Units
The proportion of a normal population that is
within a given number of standard deviations of
the mean is the same for any normal population.
For this reason, when dealing with normal
populations, we often convert from the units in
which the population items were originally
measured to standard units.
Standard units tell how many standard deviations
an observation is from the population mean.
23
Standard Normal Distribution
In general, we convert to standard units by subtracting the mean and dividing by the standard deviation. Thus, if π₯ is an item sampled from a normal population with mean and variance 2, the standard unit equivalent of π₯ is the number π§, where
π§ = (π₯ β )/
The number π§ is sometimes called the βz-scoreβ of π₯. The z-score is an item sampled from a normal population with mean 0 and standard deviation of 1. This normal distribution is called the standard normal distribution.
24
(4.10)
2/11/2014
13
Example 4.12
Resistances in a population of wires are normally distributed with mean 20 mΞ© and standard deviation 3 mΞ©. The resistance of two randomly chosen wires are 23 mΞ© and 16 mΞ©. Convert these amounts to standard units.
25
Example 4.12 (Cont.)
The resistance of a wire has a z-score of β1.7.
Find resistance of the wire in the original units of
mΞ©.
26
2/11/2014
14
Finding Areas Under the Normal
Curve
The proportion of a normal population that lies within a given interval is equal to the area under the normal probability density above that interval. This would suggest integrating the normal pdf, but this integral does not have a closed form solution.
So, the areas under the curve are approximated numerically and are available in Table A.2. This table provides area under the curve for the standard normal density. We can convert any normal into a standard normal so that we can compute areas under the curve.
The table gives the area in the left-hand tail of the curve. Other areas can be calculated by subtraction or by using the fact that the total area under the curve is 1.
27
Normal Probabilities
Excel:
NORMDIST(x, mean, standard_dev, cumulative)
NORMINV(probability, mean, standard_dev)
NORMSDIST(z)
NORMSINV(probability)
Minitab:
Calc Probability Distributions Normal
28
2/11/2014
15
Example 4.15
Find the area under normal curve to the left of z =
0.47.
29
Example 4.16
Find the area under the curve to the right of z = 1.38.
30
2/11/2014
16
Example 4.17
Find the area under the normal curve between z = 0.71 and z = 1.28.
31
Example 4.18
What z-score corresponds to the 75th percentile of a normal curve?
32
2/11/2014
17
Linear Functions of Normal
Random Variables
Let π ~ π(π, π2) and let π β 0 and π be constants.
Then ππ + π ~π(π+ π, π22).
Let π1, π2, β¦ , ππbe independent and normally distributed with means 1,2, β¦ ,π and variances 1
2,22, β¦ ,π
2. Let π1, π2, β¦ , ππ be constants, and π1π1 + π2π2 +β―+ ππππ be a linear combination. Then
π1π1 + π2π2 +β―+ ππππ
~ π(π11 + π2 2 +β―+ πππ, π121
2 + π222
2 +β― + ππ2π2)
33
(4.11)
(4.12)
Example 4.22
A chemist measures the temperature of a solution
in oC. The measurement is denoted C, and is
normally distributed with mean 40 oC and
standard deviation 1oC. The measurement is
converted to oF by the equation F = 1.8C + 32.
What is the distribution of F?
34
2/11/2014
18
Distributions of Functions of Normal
Random Variables
Let π1, π2, β¦ , ππbe independent and normally distributed
with mean and variance 2. Then
Let π and π be independent, with π ~ π(π,π2) and
π ~ π(π,π2). Then
π + π~π ππ + ππ , ππ2 + ππ
2
π β π~π ππ β ππ , ππ2 + ππ
2
35
(4.13)
(4.14)
(4.15)
π~π π,π2
π
Section 4.4:
The Lognormal Distribution
For data that contain outliers, the normal distribution is generally not appropriate. The lognormal distribution, which is related to the normal distribution, is often a good choice for these data sets.
If π~π(π, π2), then the random variable π = ππ has the lognormal distribution with parameters and 2.
If π has the lognormal distribution with parameters and 2, then the random variable π = πππ has the π(π, π2) distribution.
36
2/11/2014
19
Lognormal pdf, mean, and
variance
The pdf of a lognormal random variable π with parameters
and 2 is
π(π₯) = 1
ππ₯ 2πππ₯π β
[ln(π₯)βπ]2
2π2, π₯ > 0
0, ππ‘βπππ€ππ π
Mean: πΈ π = ππ₯π π +π2
2
Variance: π π = exp 2π + 2π2 β exp(2π + π2)
37
(4.16)
(4.17)
Lognormal Probability Density
Function
38
= 0 = 1
2/11/2014
20
Lognormal Probabilities
Excel:
LOGNORMDIST(x, mean, standard_dev)
Minitab:
Calc Probability Distributions Lognormal
39
Example 4.24
When a pesticide comes into contact with the skin,
a certain percentage of it is absorbed. The
percentage that is absorbed during a given time
period is often modeled with a lognormal
distribution. Assume that for a given pesticide,
the amount that is absorbed (in percent) within
two hours is lognormally distributed with of 1.5
and of 0.5. Find the probability that more than
5% of the pesticide is absorbed within two hours.
40
2/11/2014
21
Section 4.5:
The Exponential Distribution
The exponential distribution is a continuous
distribution that is sometimes used to model the
time that elapses before an event occurs. Such a
time is often called a waiting time.
The probability density of the exponential
distribution involves a parameter, which is a
positive constant whose value determines the
density functionβs location and shape.
We write π ~ πΈπ₯π().
41
Exponential R.V.:
pdf, cdf, mean and variance
The pdf of an exponential random variable π is
π(π₯) = π πβππ₯ , π₯ > 0
0, π₯ β€ 0
The cdf of an exponential random variable is
πΉ(π₯) = 1 β πβππ₯ , π₯ > 00, π₯ β€ 0
The mean of an exponential random variable is
ππ₯ = 1/The variance of an exponential random variable is
ππ₯2 = 1/2
42
(4.18)
(4.19)
(4.20)
(4.21)
2/11/2014
22
Exponential Probability Density
Function
43
Exponential Probabilities
Excel:
EXPONDIST(x, lambda, cumulative)
Minitab:
Calc Probability Distributions Exponential
44
2/11/2014
23
Example 4.26
A radioactive mass emits particles according to a
Poisson process at a mean rate of 15 particles
per minute. At some point, a clock is started.
1. What is the probability that more than 5 seconds
will elapse before the next emission?
2. What is the mean waiting time until the next
particle is emitted?
45
Lack of Memory Property
The exponential distribution has a property known
as the lack of memory property:
If π ~ πΈπ₯π(), and π‘ and π are positive
numbers, then
π(π > π‘ + π |π > π ) = π(π > π‘)
46
2/11/2014
24
Example 4.27 / 4.28
The lifetime of a transistor in a particular circuit has an exponential distribution with mean 1.25 years.
1. Find the probability that the circuit lasts longer than 2 years.
2. Assume the transistor is now three years old and is still functioning. Find the probability that it functions for more than two additional years.
3. Compare the probability computed in 1. and 2.
47
Section 4.6: Some Other
Continuous Distributions
The uniform distribution has two parameters, πand π, with π < π. If π is a random variable with
the continuous uniform distribution then it is
uniformly distributed on the interval (π, π). We
write π ~ π(π, π).
The pdf is
π(π₯) =
1
π β π, π < π₯ < π
0, ππ‘βπππ€ππ π
48
(4.22)
2/11/2014
25
Uniform Distribution:
Mean and Variance
If π ~ π π, π ,
Then the mean is ππ =π+π
2
and the variance is ππ2 =
(πβπ)2
12
49
(4.23)
(4.24)
The Gamma Distribution
First, letβs consider the gamma function:
For π > 0, the gamma function is defined by
Ξ π = 0βπ‘πβ1πβπ‘ππ‘
The gamma function has the following properties:
1. If π is any integer, then (π) = (π β 1)!
2. For any π, (π + 1) = π (π)
3. Ξ π = π
50
(4.25)
2/11/2014
26
Gamma R.V.
The pdf of the gamma distribution with parameters
π > 0 and π > 0 is
π(π₯) = π π₯πβ1πβππ₯
Ξ(π), π₯ > 0
0, π₯ β€ 0
The mean and variance are given by ππ = π π and ππ
2 = π π2 , respectively.
If π = 1, the gamma distribution is the same as the exponential.
If π = π/2, where k is a positive integer, the distribution is called a chi-square distribution with π degrees of freedom.
51
(4.26)
(4.27)
(4.28)
Gamma Probability Density
Function
52
2/11/2014
27
Gamma Probabilities
Excel:
GAMMADIST(x, alpha, beta, cumulative)
GAMMAINV(probability, alpha, beta)
GAMMALN(x)
Minitab:
Calc Probability Distributions Gamma
53
Special Cases of
Gamma Distribution
If π is a positive integer, the gamma distribution is called an Erlang Distribution.
If π = π/2 where π is a positive integer, the (π, 1/2) distribution is called chi-square distribution with π degrees of freedom.
54
2/11/2014
28
The Weibull Distribution
The Weibull distribution is a continuous random variable that is used in a variety of situations. A common application of the Weibull distribution is to model the lifetimes of components. The Weibull probability density function has two parameters, both positive constants, that determine the location and shape. We denote these parameters and .
If = 1, the Weibull distribution is the same as the exponential distribution with parameter = .
55
Weibull R.V.
The pdf of the Weibull distribution is
π(π₯) = πΌπ½πΌ π₯πΌβ1πβ(π½π₯)
πΌ, π₯ > 0
0, π₯ β€ 0
The mean of the Weibull is
ππ =1
π½Ξ 1 +
1
πΌ
The variance of the Weibull is
ππ2 =
1
π½2Ξ 1 +
2
πΌβ Ξ 1 +
1
πΌ
2
56
(4.29)
(4.31)
(4.32)
2/11/2014
29
Weibull Probability Density
Function
57
Weibull Probabilities
Excel:
WEIBULL(x, alpha, beta, cumulative)
Minitab:
Calc Probability Distributions Weibull
58
2/11/2014
30
Section 4.7: Probability Plots
Scientists and engineers often work with data that can be thought of as a random sample from some population. In many cases, it is important to determine the probability distribution that approximately describes the population.
More often than not, the only way to determine an appropriate distribution is to examine the sample to find a sample distribution that fits.
59
Finding a Distribution
Probability plots are a good way to determine an appropriate distribution.
Here is the idea: Suppose we have a random sample π1, β¦ , ππ. We first arrange the data in ascending order. Then assign evenly spaced values between 0 and 1 to each Xi. There are several acceptable ways to this; the simplest is to assign the value (π β 0.5)/π to ππ.
The distribution that we are comparing the Xβs to should have a mean and variance that match the sample mean and variance. We want to plot (ππ, πΉ(ππ)), if this plot resembles the cdf of the distribution that we are interested in, then we conclude that that is the distribution the data came from.
60
2/11/2014
31
Probability Plot: Example
61
i Xi (i-.5)/n Qi
1 3.01 0.1 2.4369
2 3.35 0.3 3.9512
3 4.79 0.5 5.0000
4 5.96 0.7 6.0488
5 7.89 0.9 7.5631
0.0000
1.0000
2.0000
3.0000
4.0000
5.0000
6.0000
7.0000
8.0000
0 2 4 6 8 10
Qi
Probability Plot: Example
62
2/11/2014
32
Software
Many software packages take the (πβ 0.5)/πassigned to each ππ, and calculate the quantile (ππ) corresponding to that number from the distribution of interest. Then it plots each (ππ, ππ). If this plot is a reasonably straight line then you may conclude that the sample came from the distribution that we used to find quantiles.
63
Normal Probability Plots
The sample plotted on the left comes from a population that is not close to normal.
2/11/2014
33
Normal Probability Plots
The sample plotted on the left comes from a population that is not close to normal. The sample plotted on the right comes from a population that is close to normal.
Section 4.8: The Central Limit
Theorem
The Central Limit Theorem
Let π1, β¦ , ππ be a random sample from a population with mean and variance 2.
Let π =π1+β―+ππ
πbe the sample mean.
Let ππ = π1+β―+ ππ be the sum of the sample observations.
Then if π is sufficiently large,
π~π π,π2
πand ππ~π ππ, ππ2 approximately.
66
(4.33) (4.34)
2/11/2014
34
Rule of Thumb
For most populations, if the sample size is greater than 30,
the Central Limit Theorem approximation is good.
Normal approximation to the Binomial:
If π~π΅ππ(π, π) and if ππ > 5, and π(1βπ) > 5, then
π ~ π(ππ, ππ(1 β π)) approximately.
Normal Approximation to the Poisson:
If π ~ ππππ π ππ(π), where π > 10, then π ~ π(π, π).
67
(4.35)
(4.36)
Normal approximation to
Binomial
68
Bin(100, 0.2) and N (20, 16)
2/11/2014
35
Continuity Correction
The binomial distribution is discrete, while the normal distribution is continuous.
The continuity correction is an adjustment, made when approximating a discrete distribution with a continuous one, that can improve the accuracy of the approximation.
If you want to include the endpoints in your probability calculation, then extend each endpoint by 0.5. Then proceed with the calculation.
If you want exclude the endpoints in your probability calculation, then include 0.5 less from each endpoint in the calculation.
69
Continuity Correction
70
P(45X55) P(45<X<55)
2/11/2014
36
Example 4.32
If a fair coin is tossed 100 times, use the normal
curve to approximate the probability that the
number of heads is between 45 and 55 inclusive.
71
Example 4.34
The number of hits on a website follow a Poisson
distribution, with a mean of 27 hits per hour. Find
the probability that there will be 90 or more hits in
three hours.
72
2/11/2014
37
Summary
Discrete Distributions
β Bernoulli
β Binomial
β Poisson.
Continuous distributions
β Normal
β Exponential
β Uniform
β Gamma
β Weibull.
Central Limit Theorem.
Normal approximations to the Binomial and Poisson dist.
73