Central Tendency 2015
adriana-chieng -
7/25/2019 Central Tendency 2015
Measures of Central Tendency
Dr Ismarulyusda bt Ishak
MEASURES OF CENTRAL TENDENCY FOR
UNGROUPED DATA
Mean
Median
Mode
Relationships among the Mean, Median, and Mode
Prem Mann, Introductory Statistics, 7/E. Copyright 2010 John Wiley & Sons. All rights reserved.
RM 3,500.00
An indication of the location or centrality of the data.
The most common measures:
Mean (the numerical average)
Median (the midpoint of an ordered data set, such that half of the data points are above it and half are below it)
Mode (the value that occurs most frequently)
POPULATION
THE ENTIRE COLLECTION OF ITEMS THAT IS THE FOCUS OF CONCERN.
POPULATION AND SAMPLE
Population: studied to identify its characteristics.
Sample: studied to make inferences about the characteristics of the population.
Figure 1.1 Population and Sample
Mean
Arithmetic mean (most familiar)
Another name for average.
If describing a population, denoted as μ, the Greek letter mu.
If describing a sample, denoted as x̄, called x-bar.
Appropriate for describing measurement data.
Seriously affected by unusual values called outliers.
Calculating Sample Mean
Formula: \bar{x} = \frac{\sum x}{n}
That is, add up all of the data points and divide by the number of data points.
n = sample size
x = variable used to represent individual data values
Do not round! The mean need not be a whole number.
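As a minimal sketch of the sample mean formula, the following Python snippet adds up the data points and divides by their count. The function name `sample_mean` and the five sample values are illustrative assumptions, not part of the slides.

```python
def sample_mean(values):
    """Return the arithmetic mean: the sum of the values divided by n."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / n


# Hypothetical sample of five measurements (illustrative values only).
data = [4, 8, 15, 16, 23]
x_bar = sample_mean(data)  # left unrounded; the mean need not be a whole number
print(x_bar)
```

The same function works for a population; only the symbol changes (μ and N instead of x̄ and n).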
Population Mean
Mean of all values in population
\mu = \frac{\sum x}{N}
N = number of values in the population
x = variable used to represent individual data values
Table 3.1 2008 Sales of Six U.S.
Companies
Find the 2008 mean sales for these six companies.
Example 3-1: Solution
\sum x = x_1 + x_2 + x_3 + x_4 + x_5 + x_6 = 149 + 406 + 183 + 107 + 426 + 97 = 1368
\bar{x} = \frac{\sum x}{n} = \frac{1368}{6} = 228
Thus, the mean 2008 sales of these six companies was 228, or $228 billion.
Example 3-3
Table 3.2 lists the total philanthropic giving (in millions of dollars) by six companies during 2007.
Example 3-3
Notice that the charitable contributions made
by Wal-Mart are very large compared to those
of other companies. Hence, it is an outlier.
Show how the inclusion of this outlier affects the value of the mean.
Example 3-3: Solution
If we do not include the charitable giving of Wal-Mart (the outlier), the mean of the charitable contributions of the five companies is
Mean = \frac{22.4 + 31.8 + 19.8 + 9.0 + 27.5}{5} = \frac{110.5}{5} = 22.1, or $22.1 million
Example 3-3: Solution
Now, to see the impact of the outlier on the value of the mean, we include the contributions of Wal-Mart and find the mean contributions of the six companies. This mean is
Mean = \frac{22.4 + 31.8 + 19.8 + 9.0 + 27.5 + 337.9}{6} = \frac{448.4}{6} \approx 74.73, or $74.73 million
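The effect of the outlier can also be checked numerically. The sketch below uses Python's standard `statistics.mean` on the contribution figures quoted in this example; the variable names are assumptions made for illustration.

```python
from statistics import mean

# Contributions in millions of dollars, as listed in the solution above.
without_outlier = [22.4, 31.8, 19.8, 9.0, 27.5]
with_outlier = without_outlier + [337.9]  # Wal-Mart, the outlier

mean_without = mean(without_outlier)
mean_with = mean(with_outlier)

print(f"mean without outlier: {mean_without:.2f}")
print(f"mean with outlier:    {mean_with:.2f}")
print(f"shift caused by one value: {mean_with - mean_without:.2f}")
```

A single extreme value more than triples the mean here, which is why the mean is described as seriously affected by outliers.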
Median
Another name for the 50th percentile.
The middle value when the original data values are arranged in increasing order.
Appropriate for describing measurement data.
Robust to outliers, that is, not affected much by unusual values.
Example 3-4
The following data give the prices (in thousands of dollars) of seven houses selected from all houses sold last month in a city.
312 257 421 289 526 374 497
Find the median.
Example 3-4: Solution
First, we rank the given data in increasing order as follows:
257 289 312 374 421 497 526
There are seven homes in this data set, so the middle term is the fourth term, which is 374.
Thus, the median price of a house is 374, that is, $374,000.
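The two steps of this solution (rank the data, then pick the middle term) can be sketched in Python as follows. The variable names are illustrative assumptions, and the shortcut of taking a single middle element is valid only because the number of values is odd.

```python
# House prices in thousands of dollars, from Example 3-4.
prices = [312, 257, 421, 289, 526, 374, 497]

ranked = sorted(prices)              # step 1: rank in increasing order
middle_index = len(ranked) // 2      # 7 values -> index 3, i.e. the fourth term
median_price = ranked[middle_index]  # step 2: read off the middle term

print(ranked)
print(median_price)
```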
Example 3-5
Table 3.3 gives the 2008 profits (rounded to billions of dollars) of 12 companies selected from all over the world.
Table 3.3 Profits of 12 Companies for
2008
Find the median of these data.
Example 3-5: Solution
First we rank the given profits as follows:
7 8 9 10 11 12 13 13 14 17 17 45
There are 12 values in this data set. Because
there is an even number of values in the data set,
the median is given by the average of the two
middle values.
Example 3-5: Solution
The two middle values are the sixth and seventh
in the foregoing list of data, and these two values
are 12 and 13.
Median = \frac{12 + 13}{2} = \frac{25}{2} = 12.5
Thus, the median profit of these 12 companies is $12.5 billion.
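For an even number of values, the rule of averaging the two middle values can be written as a small general-purpose function. This is a sketch under the assumption that the data are numeric; the function name `median` is chosen here for illustration, and the result is cross-checked against Python's standard `statistics.median`.

```python
from statistics import median as library_median


def median(values):
    """Middle value of the ranked data; average of the two middle values if n is even."""
    ranked = sorted(values)
    n = len(ranked)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]
    return (ranked[mid - 1] + ranked[mid]) / 2


# Profits in billions of dollars, as ranked in the solution above.
profits = [7, 8, 9, 10, 11, 12, 13, 13, 14, 17, 17, 45]
median_profit = median(profits)

# Cross-check against the standard library.
assert median_profit == library_median(profits)
print(median_profit)
```

Note that the largest value, 45, could be replaced by any larger number without changing the result, which illustrates the robustness of the median.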
Median
The median gives the center of a histogram, with half the data values to the left of the median and half to the right of the median. The advantage of using the median as a measure of central tendency is that it is not influenced by outliers. Consequently, the median is preferred over the mean as a measure of central tendency for data sets that contain outliers or are not normally distributed.
Case Study 3-2 The Gender Pay Gap
Mode
The value that occurs with the highest frequency in a data set.
From the French word for fashion (most popular/common).
One data set can have many modes.
When two values have the same greatest frequency, each one is a mode and the data set is bimodal.
When more than two values share the greatest frequency, the data set is multimodal.
When no value is repeated, there is no mode.
Appropriate for all types of data, but most useful for categorical data or discrete data with only a small number of possible values.
Speeds (in km/hour) of eight cars that were stopped for speeding violations:
120 140 167 135 140 182 159 133
Solution:
_______ is the mode
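One way to check an answer to this exercise is to count frequencies programmatically. The sketch below is an assumption-laden illustration: the helper name `modes` is hypothetical, and it follows the convention stated on the Mode slide that a data set with no repeated value has no mode (returned here as an empty list).

```python
from collections import Counter


def modes(values):
    """Return every value that is repeated and shares the highest frequency.

    An empty list means no value is repeated, i.e. the data set has no mode.
    """
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)


speeds = [120, 140, 167, 135, 140, 182, 159, 133]  # km/hour, from the slide
print(modes(speeds))
```

A result with two entries would indicate a bimodal data set, and more than two a multimodal one.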
Relationships among the Mean, Median,
and Mode
1. For a symmetric histogram and
frequency curve with one peak (Figure
3.2), the values of the mean, median, and
mode are identical, and they lie at the
center of the distribution.
Relationships among the mean, median and
mode
Mean, median, and mode for a symmetric histogram and frequency distribution curve.
Relationships among the Mean, Median,
and Mode
2. For a histogram and a frequency curve skewed to the right (Figure 3.3), the value of the mean is the largest, that of the mode is the smallest, and the value of the median lies between these two. (Notice that the mode always occurs at the peak point.) The value of the mean is the largest in this case because it is sensitive to outliers that occur in the right tail. These outliers pull the mean to the right.
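The ordering described for right-skewed data can be illustrated with a small made-up data set. The values below are hypothetical and were chosen only so that most observations are small with a few large ones in the right tail; the sketch uses Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: most values are small, a few are large.
data = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

m_mode = mode(data)
m_median = median(data)
m_mean = mean(data)

print(m_mode, m_median, m_mean)
# For this data set the ordering mode < median < mean holds.
print(m_mode < m_median < m_mean)
```

Small or irregular samples do not always follow this ordering exactly, so it is best read as a description of the typical pattern for a smooth, single-peaked skewed distribution.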
Mean, median, and mode for a histogram and frequency distribution curve skewed to the right.
Relationships among the Mean, Median,
and Mode
3. If a histogram and a distribution curve are skewed to the left (Figure 3.4), the value of the mean is the smallest and that of the mode is the largest, with the value of the median lying between these two. In this case, the outliers in the left tail pull the mean to the left.
Mean, median, and mode for a histogram and frequency
distribution curve skewed to the left.
Mean for Grouped Data
Mean for population data: \mu = \frac{\sum mf}{N}
Mean for sample data: \bar{x} = \frac{\sum mf}{n}
where m is the midpoint and f is the frequency of a class.
Example 3-14
Table 3.8 gives the frequency distribution of the daily commuting times (in minutes) from home to work for all 25 employees of a company. Calculate the mean of the daily commuting times.
Prem Mann, Introductory Statistics, 6/E. Copyright 2007 John Wiley & Sons. All rights reserved.
\mu = \frac{\sum mf}{N} = \frac{535}{25} = 21.4 minutes
Thus, the employees of this company spend an average of 21.40 minutes a day commuting from home to work.
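The grouped-data calculation can be sketched in Python as below. Table 3.8 itself did not survive in this transcript, so the class midpoints and frequencies in the snippet are assumptions: five 10-minute classes chosen so that the totals agree with the N = 25 and Σmf = 535 quoted in the solution. The function name `grouped_mean` is likewise illustrative.

```python
def grouped_mean(midpoints, frequencies):
    """Mean for grouped data: sum of m*f divided by the total frequency."""
    if len(midpoints) != len(frequencies):
        raise ValueError("need one frequency per class midpoint")
    total = sum(frequencies)
    if total == 0:
        raise ValueError("total frequency must be positive")
    return sum(m * f for m, f in zip(midpoints, frequencies)) / total


# ASSUMED classes (the original table is not available here):
# five 10-minute classes from 0 to 50 minutes, with frequencies chosen so
# that N = 25 and the sum of m*f is 535, matching the totals in the solution.
midpoints = [5, 15, 25, 35, 45]
frequencies = [4, 9, 6, 4, 2]

mu = grouped_mean(midpoints, frequencies)
print(sum(frequencies), mu)
```

Because each observation is replaced by its class midpoint, the grouped mean is an approximation of the mean that would be obtained from the raw data.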
Choosing Appropriate
If data are symmetric, the mean,
median, and mode will be
approximately the same.
If data are multimodal, report the mean, median and/or mode for each subgroup.
If data are skewed, report the median.