W2 Frequency Distribution 0
8/2/2019 W2 Frequency Distribution 0
Centre for Computer Technology
ICT114 Mathematics for Computing
Week 2
Statistics and Frequency Distribution
-
March 20, 2012 Copyright Box Hill Institute
Set: Introduction
A set is a well-defined list, collection or class of objects.
The objects could be anything: numbers, names, people, cities. These objects are called the elements or members of the set.
Example 1: The numbers 1, 3, 5, 7, 9, 11, 13, ...
Example 2: The solutions of the equation $x^2 - 4x + 3 = 0$
Example 3: The rivers in Australia
-
Set Notation
Sets are usually denoted by capital letters A, B, P, X, ...
The elements are usually represented by lowercase letters a, b, p, x, ...
There are two forms for presenting a set:
Tabular form, A = {1, 3, 5, 7, 9, 11, ...}
Set builder form, A = {x | x is odd}
-
Subsets
If every element in a set A is also a member of a set B, then A is called a subset of B.
In other words, if $x \in A \Rightarrow x \in B$ for all x, then A is a subset of B.
It is written as $A \subseteq B$ or $B \supseteq A$.
A is called a proper subset of B if $A \subseteq B$ and A is not equal to B.
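The subset definitions can be checked with Python's built-in set type. This is only an illustrative sketch: the sets A and B are made-up examples, not taken from the slides.

```python
# Hypothetical example sets, chosen only for illustration.
A = {1, 3, 5}
B = {1, 3, 5, 7, 9}

# Definition: A is a subset of B if every element of A is also in B.
is_subset = all(x in B for x in A)

# Python's set comparison operators express the same tests directly.
same_as_operator = (A <= B) == is_subset   # "<=" tests subset
is_proper_subset = A < B                   # "<" tests subset and A != B

print(is_subset, same_as_operator, is_proper_subset)
```

Note that every set is a subset of itself but never a proper subset of itself, so `A <= A` is true while `A < A` is false.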
-
Venn Diagram to represent sets
U is the universal set.
A and B are disjoint sets
R is a subset of S
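[Figure: Venn diagrams inside the universal set U, one showing the disjoint sets A and B, the other showing R inside S.]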
-
Set Operations
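Let A and B represent two sets. We have the following definitions in compact form:
1. $A \cup B = \{ x \mid x \in A \text{ or } x \in B \text{ or } x \in \text{both} \}$
2. $A \cap B = \{ x \mid x \in A \text{ and } x \in B \}$
3. $A - B = \{ x \mid x \in A \text{ and } x \notin B \}$
4. $A' = \{ x \mid x \notin A \}$
5. $A \,\triangle\, B = \{ x \mid x \in A \text{ or } x \in B \text{ but } x \notin \text{both} \}$
6. #A = number of elements in set A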
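The six definitions map onto Python's set operators, as the sketch below suggests. The universal set U and the sets A and B are hypothetical values chosen for illustration. The complement is computed relative to U, which is an assumption the slide leaves implicit.

```python
# Hypothetical universal set and two subsets, for illustration only.
U = set(range(1, 11))        # {1, 2, ..., 10}
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

union = A | B                   # 1. in A or in B (or both)
intersection = A & B            # 2. in A and in B
difference = A - B              # 3. in A but not in B
complement_A = U - A            # 4. not in A (taken relative to U)
symmetric_difference = A ^ B    # 5. in A or in B, but not both
cardinality_A = len(A)          # 6. number of elements in A

print(union, intersection, difference)
print(complement_A, symmetric_difference, cardinality_A)
```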
-
Statistics and Frequency Distribution
-
Introduction
Statistics is the means of describing the centre, spread and shape of a data set.
It has two components:
Gathering of information or scientific data
Inferential statistics (statistical methods)
Statistical methods are employed to make judgements in the face of uncertainty and variation.
-
Measures of Central Tendency
Measures of central tendency are single values that act as representatives of the data.
Three main measures
Mean
Mode
Median
-
Mean
For a given set of n numbers $x_1, x_2, x_3, \ldots, x_n$, the mean, denoted by $\mu$, is
\[ \mu = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} \]
-
Example: Consider the following set of numbers
S = {1, 2, 3, 4, 5, 6, 7, 8, 9}
The mean of the set S is
\[ \mu = \frac{1+2+3+4+5+6+7+8+9}{9} = \frac{45}{9} = 5 \]
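The same calculation can be sketched in Python, once by applying the formula directly and once with the standard library's `statistics.mean`. The variable names are illustrative.

```python
from statistics import mean

# The set S from the example above, stored as a list.
S = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Sum of the values divided by the number of values.
mean_by_formula = sum(S) / len(S)

# The standard library gives the same result.
mean_by_library = mean(S)

print(mean_by_formula, mean_by_library)
```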
-
Median
For a given set of n numbers $x_1, x_2, x_3, \ldots, x_n$, the median is a value such that half of the values are larger than the median and the other half are smaller than the median.
In other words, the median is the middlemost number.
-
Median
Example: Consider the following set of numbers
S = {1, 6, 3, 8, 2, 4, 9}
To find the median, we need to order the list
S = {1, 2, 3, 4, 6, 8, 9}
The middlemost number is 4, which is the median of the set.
-
What happens when we have to find the median of a set with an even number of elements?
For example:
Find the median of
S = {1, 6, 3, 8, 2, 12, 4, 9}
-
Some More Concepts
For a set of n ordered data points:
If n is odd, the median is found in location (n+1)/2 of the set.
If n is even, the median is the average of the two middle terms.
The two terms are found in locations n/2 and n/2 + 1.
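The location rule can be sketched in Python as follows. The function name `median_by_position` is an illustrative choice, and the input is assumed to be non-empty. The two data sets are the odd-length and even-length examples from the slides.

```python
from statistics import median

def median_by_position(values):
    """Median via the location rule; assumes a non-empty list."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Location (n+1)/2, converted to a 0-based index.
        return ordered[(n + 1) // 2 - 1]
    # Average of the terms at locations n/2 and n/2 + 1.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

odd_result = median_by_position([1, 6, 3, 8, 2, 4, 9])        # 7 values
even_result = median_by_position([1, 6, 3, 8, 2, 12, 4, 9])   # 8 values

print(odd_result, even_result)
```

For the even-length set, the ordered list is 1, 2, 3, 4, 6, 8, 9, 12. The two middle terms are 4 and 6, so the median is 5.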
-
Mode
The mode of a data set is the value that occurs most often.
If two, three or more values share the highest frequency, the data is bimodal, trimodal or multimodal.
Example:
R = {2, 8, 1, 9, 5, 2, 7, 2, 7, 9, 4, 7, 1, 5, 2}
The number that appears most often is 2 (four times), which is the mode of R.
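One way to find the mode, including the multimodal case, is to count occurrences with `collections.Counter`. In this sketch the helper name `modes_of` and the bimodal test list are made up for illustration; R is the data from the slide.

```python
from collections import Counter

def modes_of(values):
    """Return (sorted list of modes, highest frequency)."""
    counts = Counter(values)
    highest = max(counts.values())
    # Keep every value that reaches the highest frequency.
    return sorted(v for v, c in counts.items() if c == highest), highest

R = [2, 8, 1, 9, 5, 2, 7, 2, 7, 9, 4, 7, 1, 5, 2]
modes_R, freq_R = modes_of(R)

# Hypothetical bimodal data: 1 and 2 both occur twice.
modes_bimodal, _ = modes_of([1, 1, 2, 2, 3])

print(modes_R, freq_R, modes_bimodal)
```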
-
Measures of Dispersion
Consider two sets
S = {5, 5, 5, 5, 5, 5}
R = {0, 0, 0, 10, 10, 10}
For both of the above sets, mean = 5.
But the two sets are clearly different data sets.
Is it good practice to use only the mean, median or mode to describe them?
-
Measures of Dispersion
We use another descriptive statistic to evaluate the data, called a measure of dispersion.
It is a measure of scatter about the mean.
-
Measures of Dispersion
What happens to the value of a measure of dispersion
if the data values are concentrated near the mean?
if they are distributed far from the mean?
-
Measures of Dispersion
If the values are concentrated near the mean of the data set, the measure is small.
If they are distributed far from the mean of the data set, the measure will be large.
There are two main measures of dispersion
Variance
Standard Deviation
-
Variance and Standard Deviation
For a given set of n numbers $x_1, x_2, x_3, \ldots, x_n$, the variance, denoted by $\sigma^2$, is given by
\[ \sigma^2 = \frac{(x_1 - \mu)^2 + (x_2 - \mu)^2 + \cdots + (x_n - \mu)^2}{n} \]
-
Variance (method 2)
Variance = mean of squares minus square of mean
\[ \sigma^2 = \frac{\sum x^2}{n} - \left( \frac{\sum x}{n} \right)^2 \]
where $\sum x = x_1 + x_2 + x_3 + \cdots + x_n$
-
Variance and Standard Deviation
The variance is a non-negative number.
The positive square root of the variance is the standard deviation.
The simplest measure of variability is the sample range, $x_{\max} - x_{\min}$.
-
Variance and Standard Deviation
Example: Find the variance and standard deviation for the following set of test scores:
T = {75, 80, 82, 87, 96}
The mean of the set T is
\[ \mu = \frac{75 + 80 + 82 + 87 + 96}{5} = \frac{420}{5} = 84 \]
-
Using the mean we get the variance as
\[ \sigma^2 = \frac{(75-84)^2 + (80-84)^2 + (82-84)^2 + (87-84)^2 + (96-84)^2}{5} = \frac{254}{5} = 50.8 \]
Standard deviation: $\sigma = \sqrt{\sigma^2} = \sqrt{50.8} \approx 7.1274$
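As a cross-check, the sketch below computes the variance of T two ways: from the definition (mean squared deviation) and from the "mean of squares minus square of mean" form. Both divide by n. The variable names are illustrative.

```python
from math import sqrt
from statistics import pvariance

# Test scores from the example above.
T = [75, 80, 82, 87, 96]
n = len(T)
mu = sum(T) / n

# Definition: average squared deviation from the mean.
var_definition = sum((x - mu) ** 2 for x in T) / n

# Method 2: mean of squares minus square of mean.
var_shortcut = sum(x * x for x in T) / n - mu ** 2

# Standard deviation: positive square root of the variance.
sigma = sqrt(var_definition)

# Library cross-check: pvariance also divides by n.
var_library = pvariance(T)

print(mu, var_definition, var_shortcut, round(sigma, 4))
```

The two formulas agree up to floating-point rounding. The shortcut form subtracts two large, nearly equal numbers, so it can lose precision on data with a large mean and a small spread.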
-
Sample Space
The set of all possible outcomes of a statistical experiment is called the sample space.
Each outcome is called an element, a member or a sample point.
The entire group from which samples are drawn is called the population.
-
Sample Statistics
Any quantity obtained from a sample for the purpose of estimating a population parameter is called a sample statistic.
A sample, together with inferential statistics, allows us to draw conclusions about the population, with inferential statistics making clear use of elements of probability.
-
Sample Mean
For a given sample of n numbers $x_1, x_2, x_3, \ldots, x_n$, the sample mean, denoted by $\bar{X}$, is
\[ \bar{X} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} \]
-
Weighted Mean
For a given set of data, X = {$x_1, x_2, \ldots, x_n$}, and corresponding non-negative weights, W = {$w_1, w_2, \ldots, w_n$}, the weighted mean (weighted average) is given by
\[ \bar{X} = \frac{w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_n x_n}{w_1 + w_2 + w_3 + \cdots + w_n} \]
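A minimal sketch of the weighted mean in Python follows. The function name `weighted_mean`, the scores and the weights are all hypothetical (for instance, three assessments where the last two count double).

```python
def weighted_mean(values, weights):
    """Sum of w*x divided by sum of w; weights must not all be zero."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

# Hypothetical data, for illustration only.
scores = [70, 80, 90]
weights = [1, 2, 2]

result = weighted_mean(scores, weights)        # (70 + 160 + 180) / 5

# With equal weights, the weighted mean reduces to the ordinary mean.
equal_weights_result = weighted_mean(scores, [1, 1, 1])

print(result, equal_weights_result)
```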
-
Sample Variance
For a given sample of n numbers $x_1, x_2, x_3, \ldots, x_n$, the sample variance, denoted by $S^2$, is given by
\[ S^2 = \frac{(x_1 - \bar{X})^2 + (x_2 - \bar{X})^2 + \cdots + (x_n - \bar{X})^2}{n - 1} \]
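To see the effect of dividing by n - 1 instead of n, the sketch below reuses the test scores T = {75, 80, 82, 87, 96} from the earlier variance example. Treating them as a sample here is an assumption made purely for illustration.

```python
from statistics import variance, pvariance

# Scores from the earlier example, treated here as a sample.
T = [75, 80, 82, 87, 96]
n = len(T)
x_bar = sum(T) / n

sum_sq_dev = sum((x - x_bar) ** 2 for x in T)   # 254

sample_var = sum_sq_dev / (n - 1)       # divides by n - 1
population_var = sum_sq_dev / n         # divides by n

# statistics.variance uses n - 1; statistics.pvariance uses n.
print(sample_var, variance(T), population_var, pvariance(T))
```

For the same data, the sample variance is always at least as large as the variance computed with n in the denominator.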
-
Frequency Distributions
For large samples (or populations) it is difficult to observe various characteristics or to compute statistics.
Therefore it is useful to organize or group the raw data.
The data is arranged in intervals of equal width.
-
Frequency Distributions
The intervals are called classes or categories.
The number of individuals or elements in each class is determined; this is called the class frequency.
The resulting arrangement is called a frequency distribution or frequency table.
-
Frequency Distribution
Example: Height of students in XYZ university (frequency table)

Height (cm)    Number of Students
155-159          5
160-164         18
165-169         42
170-174         27
175-179          8
Total          100
-
Frequency Distribution
In the previous example
The first category, 155-159, is called a class interval.
The corresponding class frequency is 5.
The midpoint of the class interval is called the class mark.
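The class marks for the height table can be computed as the midpoint of each interval's lower and upper limits. The sketch below is illustrative and the variable names are assumptions.

```python
# Class intervals (lower limit, upper limit) and class frequencies
# from the height frequency table.
classes = [(155, 159), (160, 164), (165, 169), (170, 174), (175, 179)]
frequencies = [5, 18, 42, 27, 8]

# Class mark = midpoint of the class interval.
class_marks = [(low + high) / 2 for low, high in classes]

total = sum(frequencies)

for (low, high), mark, f in zip(classes, class_marks, frequencies):
    print(f"{low}-{high}  class mark {mark}  frequency {f}")
print("Total", total)
```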
-
Frequency Histogram
[Figure: histogram of the height frequency table. Horizontal axis: Height (cm), with classes 155-159, 160-164, 165-169, 170-174 and 175-179. Vertical axis: Number of Students, from 0 to 45.]
-
Frequency Polygon
[Figure: frequency polygon of the height frequency table. Horizontal axis: Height (cm), marked at the class marks 157, 162, 167, 172 and 177. Vertical axis: Number of Students, from 0 to 45.]
-
Frequency Graphs
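In a frequency histogram with equal class widths, the sum of the rectangle heights is the total frequency (100 in the height example), and the total area is proportional to it.
A frequency polygon is a graph connecting the midpoints of the tops of the histogram rectangles.
In a relative frequency bar graph, the sum of the ordinates is 1.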
-
Relative Frequency
Height (cm)    Percentage of Students
155-159          5%
160-164         18%
165-169         42%
170-174         27%
175-179          8%
Total          100%

In relative frequency, the class frequency is given as a percentage of the total rather than as a count.
In the histogram, the vertical axis shows relative frequency instead of frequency.
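Relative frequencies follow from dividing each class frequency by the total. The sketch below uses the counts from the height table. Because the total happens to be 100, the percentages equal the counts. The dictionary layout is an illustrative choice.

```python
# Class frequencies from the height frequency table.
frequencies = {
    "155-159": 5,
    "160-164": 18,
    "165-169": 42,
    "170-174": 27,
    "175-179": 8,
}

total = sum(frequencies.values())

# Relative frequency as a percentage of the total.
relative_percent = {c: 100 * f / total for c, f in frequencies.items()}

for c, p in relative_percent.items():
    print(f"{c}  {p:.0f}%")
```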
-
In the previous example, what happens if we have a student with a height of 159.7 cm?
The classes 155-159, 160-164, 165-169, 170-174 and 175-179 leave a gap between 159 and 160, so this height does not fall in any of them.
-
Continuous Frequency Distribution
The class intervals are chosen such that they are continuous, as shown:

Height (cm)      Number of Students
154.5-159.4
159.5-164.4
164.5-169.4
169.5-174.4
174.5-179.4
Total
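The following sketch shows one way to assign a measured height to one of these classes. It assumes each class is the half-open interval from its lower boundary up to (but not including) the next lower boundary, with 179.5 as the upper boundary of the last class. The slide does not state this convention. The function name `find_class` is hypothetical.

```python
from bisect import bisect_right

# Lower boundaries of the classes shown above.
lower_boundaries = [154.5, 159.5, 164.5, 169.5, 174.5]
upper_boundary = 179.5   # assumed upper boundary of the last class
labels = ["154.5-159.4", "159.5-164.4", "164.5-169.4",
          "169.5-174.4", "174.5-179.4"]

def find_class(height):
    """Return the class label for a height, or None if out of range."""
    if not (lower_boundaries[0] <= height < upper_boundary):
        return None
    # bisect_right counts how many lower boundaries are <= height.
    return labels[bisect_right(lower_boundaries, height) - 1]

print(find_class(159.7))   # the student from the earlier question
```

With this convention, every height from 154.5 up to 179.5 falls in exactly one class, so the 159.7 cm student is counted in the second class.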
-
Mean and Variance from Frequency Table

Interval          Midpoint (x)   Frequency (f)   f.x          f.x^2
a0 - a1           x1             f1              f1.x1        f1.x1^2
a1 - a2           x2             f2              f2.x2        f2.x2^2
...
a(n-1) - an       xn             fn              fn.xn        fn.xn^2
All                              Total f         Total f.x    Total f.x^2

\[ \text{Mean} = \frac{\sum f x}{\sum f} \qquad \text{Variance} = \frac{\sum f x^2}{\sum f} - (\text{Mean})^2 \]
-
Example: Mean and Variance from Frequency Table

Class interval    Frequency, f
1.5 - 1.9          2
2.0 - 2.4          1
2.5 - 2.9          4
3.0 - 3.4         15
3.5 - 3.9         10
4.0 - 4.4          5
4.5 - 4.9          3
-
Class interval    Class midpoint, x    Frequency, f    f.x      f.x^2
1.5 - 1.9         1.7                   2               3.4       5.78
2.0 - 2.4         2.2                   1               2.2       4.84
2.5 - 2.9         2.7                   4              10.8      29.16
3.0 - 3.4         3.2                  15              48       153.6
3.5 - 3.9         3.7                  10              37       136.9
4.0 - 4.4         4.2                   5              21        88.2
4.5 - 4.9         4.7                   3              14.1      66.27
Total                                  40             136.5     484.75
-
\[ \text{Mean} = \frac{\sum f x}{\sum f} = \frac{136.5}{40} = 3.4125 \]
\[ \text{Variance} = \frac{\sum f x^2}{\sum f} - (\text{Mean})^2 = \frac{484.75}{40} - (3.4125)^2 = 12.1188 - 11.6452 = 0.4736 \]
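The table totals and the resulting mean and variance can be reproduced with a short Python sketch. The midpoints and frequencies are taken from the table above. The variable names are illustrative.

```python
# Class midpoints and frequencies from the worked example.
midpoints = [1.7, 2.2, 2.7, 3.2, 3.7, 4.2, 4.7]
frequencies = [2, 1, 4, 15, 10, 5, 3]

total_f = sum(frequencies)
total_fx = sum(f * x for f, x in zip(frequencies, midpoints))
total_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))

# Mean = total(f.x) / total f
grouped_mean = total_fx / total_f

# Variance = total(f.x^2) / total f - mean^2
grouped_variance = total_fx2 / total_f - grouped_mean ** 2

print(total_f, round(total_fx, 2), round(total_fx2, 2))
print(round(grouped_mean, 4), round(grouped_variance, 4))
```

These are approximations to the mean and variance of the raw data, since every observation in a class is treated as if it sat at the class midpoint.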
-
Summary
There are three main measures of central tendency: mean, mode and median.
There are two main measures of dispersion: variance and standard deviation.
The organization or grouping of raw data in a table is called a frequency distribution.