BSc/HND IETM Week 9/10 - Some Probability Distributions
When we looked at the histogram a few weeks ago, we were looking
at frequency distributions.
It is possible to convert such frequency distributions into
probability distributions, such that the probability of
encountering some particular value (or range of values) of x is plotted on the vertical axis, rather than the number of occurrences of
that value of x.
There are a few standard forms of such distributions, which make analysis rather easy - so long as the data really do fit the chosen
form.
We shall look at two of these standard forms, the normal and
the negative exponential distributions.

Probability distributions from frequency distributions

Suppose that our previously-mentioned (and, sadly,
hypothetical) optional unit for your course, ‘Flower Arranging
for Engineers’, becomes extremely popular.
In fact, it becomes so popular that it is studied by 208 students, from all the various BSc courses
in the School.
In an effort to analyse the performance of the students, so as to determine if any improvements to the unit are required, we might decide to plot a histogram of the
final marks obtained.
As we know, this is a frequency distribution, and might be
obtained from the following summary of the students’ scores,
as shown:

| Mark scored (%) | Frequency (no. of students) |
|---|---|
| 0-9.9 | 1 |
| 10-19.9 | 4 |
| 20-29.9 | 8 |
| 30-39.9 | 17 |
| 40-49.9 | 47 |
| 50-59.9 | 53 |
| 60-69.9 | 39 |
| 70-79.9 | 25 |
| 80-89.9 | 11 |
| 90-100 | 3 |

[Figure: histogram of the marks. Horizontal axis: Mark (per cent), 0 to 100 in steps of 10; vertical axis: Frequency (No. of students), 0 to 50. Bar heights, left to right: 1, 4, 8, 17, 47, 53, 39, 25, 11, 3.]

Frequency polygons

The first step in the conversion is to change from the histogram to
what is called a frequency polygon. This is simply a line
graph, joining the centres of each of the chosen data intervals.
At the ends, our frequency polygon reaches the zero axis as
shown, since no student can obtain less than zero or more
than 100 per cent. In situations when this doesn’t apply, it is conventional to terminate the polygon on the zero axis, half way through the next interval.
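
The construction just described can be sketched in a few lines of Python. This is only an illustration: the names `width`, `centres` and `vertices` are introduced here, and each interval is assumed to be nominally 10 percentage points wide, so that its centre falls at 5, 15, ..., 95.

```python
# Frequencies from the marks table, lowest interval first.
frequencies = [1, 4, 8, 17, 47, 53, 39, 25, 11, 3]
width = 10  # assumed nominal class width, in percentage points

# Centre of each interval: 5, 15, ..., 95.
centres = [width * i + width / 2 for i in range(len(frequencies))]

# Marks cannot lie outside 0..100, so the polygon meets the zero axis there.
vertices = [(0.0, 0)] + list(zip(centres, frequencies)) + [(100.0, 0)]

for mark, count in vertices:
    print(f"{mark:5.1f}  {count}")
```

Joining these points in order with straight lines gives the frequency polygon.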
[Figure: frequency polygon of the marks. Horizontal axis: Mark (per cent), 0 to 100 in steps of 10; vertical axis: Frequency (No. of students), 0 to 60. Plotted values, left to right: 1, 4, 8, 17, 47, 53, 39, 25, 11, 3.]
It is very easy to obtain probability distributions from
diagrams such as those above. All that is necessary is to divide each frequency by the total number of (in this case) students, to obtain the probability of any individual
student, selected at random, obtaining a mark in a particular
range.
For example, to convert the histogram or the frequency polygon above into probability distributions,
simply divide every number on the vertical axis (and therefore also the numbers written on the
plots) by 208.
Thus, the vertical axes would now be calibrated in
probabilities from zero to 53/208 = 0.255.
The probability of any given student obtaining a mark in the
range 40 to 49.9 per cent will be 47/208 = 0.226. The probability of a student scoring 90 per cent or more will be 3/208 = 0.0144, etc.
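
As a rough illustration of the conversion, the division by 208 can be carried out for every interval at once. The variable names below are introduced here for the sketch; the frequencies are those in the marks table.

```python
# Class intervals and frequencies from the marks table.
intervals = ["0-9.9", "10-19.9", "20-29.9", "30-39.9", "40-49.9",
             "50-59.9", "60-69.9", "70-79.9", "80-89.9", "90-100"]
frequencies = [1, 4, 8, 17, 47, 53, 39, 25, 11, 3]

total = sum(frequencies)  # total number of students
probabilities = [f / total for f in frequencies]

# One row per interval: label, frequency, probability.
for label, f, p in zip(intervals, frequencies, probabilities):
    print(f"{label:>8}  {f:3d}  {p:.4f}")
```

The probabilities add up to 1, as they must, since every student falls into exactly one interval.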

The normal distribution

It is not very surprising that the marks distribution (frequency or
probability) looks like the diagrams above.
In a fair examination, taken by a large number of students, we would expect that only a few students would obtain either
abysmally low marks or astronomically high marks.
We would expect the majority of marks to be ‘somewhere in the middle’, with a ‘tail’ at both the
low and the high ends of the range.
This is what we see above.
Several real-life situations fit this general form of distribution,
where it is most likely that results will be clustered around the centre of some range, with outlying values tailing off
towards the ends of the range.
Wisniewski, in his ‘Foundation’ text, uses an example based on the distributions of the weights of breakfast cereal packed by
machines into boxes.
Ideally, there should always be the stated amount in a box but, inevitably, some boxes will be
lighter and some heavier. There will be the odd ‘rogue’ box a
long way from the mean.
To make it easier to cope with such situations, they are often assumed to fit a standardised
probability distribution, called the normal distribution.
By doing this, it is possible to use standard printed tables to make
predictions such as, for example, how many students would be
expected to score less than 40 per cent.
To allow standard tables to be used, we need to assume a certain
fixed shape of probability distribution, and we also need to
define it in terms of mean and standard deviation.
We cannot define it in terms of actual data values (e.g.
examination marks, or weight of cereal in a box), otherwise we would need a different set of tables for every new problem.
The normal distribution curve is actually defined by a rather
unpleasant formula (but we don’t need to use it, as we are going to
use tables which have been derived from it by someone else).
If the variable in which we are interested is x (e.g. a mark in per cent, or the weight of cereal in a box in kg), the mean value of x is $\bar{x}$ and the standard deviation of
the data set is $\sigma_x$,
then the normal distribution curve is defined by the probability that x will take a particular value (P(x))
obeying the following relationship (I believe there is an error in
Wisniewski’s version):

$$P(x) = \frac{1}{\sigma_x\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\bar{x}}{\sigma_x}\right)^{2}}$$

The resulting plot of P(x) as x varies is a ‘bell-shaped’ curve, as
shown in the next slide.
[Figure: bell-shaped normal curve. Vertical axis: P(x), 0 to 0.4; horizontal axis: z from -4 to 4, where z = no. of standard deviations of x from its mean value (z = 0 for the mean of x).]
Notes
1. The “x axis” is in STANDARD DEVIATIONS.
2. The total area under the graph is 1 unit.
3. The area under the graph between two values of x gives the probability that the quantity will be between those values.
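
Notes 2 and 3 can be checked numerically. The sketch below is only an illustration, not part of the tables method: it evaluates the standard form of the curve (mean 0, standard deviation 1) and estimates areas with the trapezium rule. The function names and the step count are choices made here.

```python
import math

def normal_density(z):
    """Height of the standard normal curve at z (mean 0, SD 1)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def area(z_low, z_high, steps=10_000):
    """Trapezium-rule estimate of the area under the curve between two z values."""
    h = (z_high - z_low) / steps
    total = 0.5 * (normal_density(z_low) + normal_density(z_high))
    for i in range(1, steps):
        total += normal_density(z_low + i * h)
    return total * h

peak = normal_density(0.0)        # height of the bell at the mean, about 0.4
whole = area(-8.0, 8.0)           # effectively the whole curve
within_one_sd = area(-1.0, 1.0)   # probability of lying within 1 SD of the mean

print(f"{peak:.4f}  {whole:.6f}  {within_one_sd:.4f}")
```

The range -8 to 8 stands in for the whole curve because the tails beyond it contribute a negligible area.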

Example

Say that a large set of
examination results has a mean of 55 per cent, and a standard
deviation of 15 per cent.
How many students would we expect to fail the examination (if we define a failure as obtaining less than 40 per cent), and how many students would we expect to get a first-class result (defined
as obtaining 70 per cent or more)?
[Figure: the normal curve again, P(x) against z from -4 to 4, with the z values for x = 40, 55 and 70 per cent marked. The area to the left of x = 40 per cent is the probability that a student fails; the area to the right of x = 70 per cent is the probability that a student gets a ‘first’.]

| SD from mean | Area | SD from mean | Area |
|---|---|---|---|
| 2.00 | 0.0227 | 0.95 | 0.1710 |
| 1.95 | 0.0256 | 0.90 | 0.1840 |
| 1.90 | 0.0287 | 0.85 | 0.1976 |
| 1.85 | 0.0321 | 0.80 | 0.2118 |
| 1.80 | 0.0359 | 0.75 | 0.2266 |
| 1.75 | 0.0400 | 0.70 | 0.2419 |
| 1.70 | 0.0445 | 0.65 | 0.2578 |
| 1.65 | 0.0495 | 0.60 | 0.2742 |
| 1.60 | 0.0548 | 0.55 | 0.2911 |
| 1.55 | 0.0606 | 0.50 | 0.3085 |
| 1.50 | 0.0668 | 0.45 | 0.3263 |
| 1.45 | 0.0735 | 0.40 | 0.3446 |
| 1.40 | 0.0807 | 0.35 | 0.3632 |
| 1.35 | 0.0885 | 0.30 | 0.3821 |
| 1.30 | 0.0968 | 0.25 | 0.4013 |
| 1.25 | 0.1056 | 0.20 | 0.4207 |
| 1.20 | 0.1150 | 0.15 | 0.4404 |
| 1.15 | 0.1250 | 0.10 | 0.4602 |
| 1.10 | 0.1356 | 0.05 | 0.4801 |
| 1.05 | 0.1468 | 0.00 | 0.5000 |
| 1.00 | 0.1586 | | |

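Both boundaries lie 1 SD from the mean: z = (70 - 55)/15 = 1.0 and z = (40 - 55)/15 = -1.0.
First: probability 0.1587 (the table lists 0.1586 for 1.00 SD)
Fail: also 0.1587!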

The negative exponential distribution

To cover a wider range of real-world situations, more
‘standardised’ probability distributions are required.
The other one we shall briefly look at is the negative-
exponential distribution. This is also sometimes called a ‘failure-rate’ curve, because it
tends to describe how components fail with time.
If a certain number of components is manufactured and put into
service, it is reasonable to assume that they will all eventually fail. The probability of any one of the
components failing during a given time period might well depend on how many components are left in
service.
Choose to measure time t in the best units for the problem
(seconds, months, years, etc.). Technically, the unit chosen
should be short compared with the expected lifetime of a component,
so that any given component is expected to last for many time
units.
Let λ be the failure rate, that is, the proportion of components
expected to fail in one time unit. This means that λ must have
‘dimensions’ of (1/time). In the example above, we said that 1
per cent of components might fail in three years so, in that case, the
failure rate λ = 0.01/3 (proportion per year).
This can also be viewed as a probability - there is a probability
of 0.01/3 that any given component will fail in a given
period of one year.
Therefore, to find the proportion of components expected to fail over a time t (measured in our
chosen units), we need the quantity λt. This is now
dimensionless - it is actually the probability that any given
component will fail over the stated time period.
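
As a quick check on the arithmetic, using the figure of 1 per cent in three years quoted earlier (so time is measured in years):

$$\lambda = \frac{0.01}{3} \approx 0.0033 \text{ per year}, \qquad \lambda t = \frac{0.01}{3} \times 3 = 0.01$$

The product recovers the 1 per cent proportion over three years, and the units of time cancel as expected.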
We can now state the rate of change of the number of
components as follows (it is negative, because the number
decreases as time passes):

$$\frac{\text{change in the number of components}}{\text{time period}} = -\frac{\text{number of failures}}{\text{time period}} = -\frac{\lambda t N}{t} = -\lambda N$$

This is called a differential equation and would normally be
written

$$\frac{dN}{dt} = -\lambda N$$

in which the quantity dN / dt is to be interpreted as the rate of
change of N as the time progresses.
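
An equation of this kind can be solved by separating the variables. A brief sketch follows, in which $N_0$ is notation introduced here for the number of components at t = 0:

$$\frac{dN}{dt} = -\lambda N \;\Rightarrow\; \int_{N_0}^{N(t)} \frac{dN}{N} = -\lambda \int_0^t dt \;\Rightarrow\; \ln\frac{N(t)}{N_0} = -\lambda t \;\Rightarrow\; N(t) = N_0\, e^{-\lambda t}$$

Writing n for the number remaining at time t and N for the initial number, this reads $n = N e^{-\lambda t}$.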
It turns out that:

$$n = N e^{-\lambda t}$$

where N now stands for the initial number of components and n for the number remaining after time t.
We can plot this negative exponential function as the following curve relating the
remaining number of components n to time:
[Figure: plot of n = N e^(-λt). Horizontal axis: time units, 0 to 6; vertical axis: multiple of initial number of components (N), 0 to 1.]
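
The curve can also be tabulated directly. The sketch below uses the failure rate of 0.01/3 per year from the earlier example; the batch size of 1000 and the function name `remaining` are hypothetical, chosen only for illustration. It also compares the simple estimate λt with the proportion failed according to the exponential curve, 1 - e^(-λt).

```python
import math

failure_rate = 0.01 / 3   # lambda, proportion per year (figure quoted in the text)
initial = 1000            # N, a hypothetical batch size for illustration

def remaining(t_years):
    """n = N * exp(-lambda * t): components still in service after t years."""
    return initial * math.exp(-failure_rate * t_years)

for t in (1, 3, 10, 50):
    approx_failed = failure_rate * t            # the lambda*t estimate
    exact_failed = 1 - remaining(t) / initial   # proportion failed on the curve
    print(f"t={t:>3} y  remaining={remaining(t):7.1f}  "
          f"lambda*t={approx_failed:.4f}  1-exp(-lambda*t)={exact_failed:.4f}")
```

The two proportions agree closely while λt is small and drift apart as t grows, which is why the time unit should be short compared with the expected lifetime of a component.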
The End