Statistics Notes
Statistics Revision Notes
© ACE-Learning
Data Handling
Methods of collecting data:
Survey form, questionnaire, observation, interviews and measurement
Data Displays
Collected data can then be presented in the following ways: pictogram, bar graph, pie
chart, line graph, frequency table, histogram, dot diagram and stem-and-leaf diagram.
Measures of Central Tendency
Mean
The mean, \( \bar{x} \), is also known as the average.
Ungrouped data:
For a set of n numbers x1, x2, …, xn,
\[ \bar{x} = \frac{\sum x}{n} \]
Grouped data:
\[ \bar{x} = \frac{\sum fx}{\sum f} \]
where f is the frequency;
x is the number or score (grouped without class interval), or
x is the mid-value (grouped according to class interval).
E.g. for Grouped Mean

Earnings ($)      Mid-Value (x)   Frequency (f)   fx
250 < x ≤ 300     275             78              21 450
300 < x ≤ 350     325             106             34 450
350 < x ≤ 400     375             249             93 375
400 < x ≤ 450     425             67              28 475
Total                             Σf = 500        Σfx = 177 750

\[ \bar{x} = \frac{\sum fx}{\sum f} = \frac{177\,750}{500} = \$355.50 \]
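The two mean formulas can be checked numerically. The following is a minimal sketch rather than part of the original notes: the function names `mean_ungrouped` and `mean_grouped` are illustrative, the grouped data are the mid-values and frequencies from the earnings table, and the ungrouped data are the six ages used in the standard deviation example of these notes.

```python
def mean_ungrouped(values):
    """Mean of n numbers: sum of the values divided by n."""
    return sum(values) / len(values)


def mean_grouped(mid_values, frequencies):
    """Mean of grouped data: sum of f*x divided by sum of f."""
    total_f = sum(frequencies)
    total_fx = sum(f * x for x, f in zip(mid_values, frequencies))
    return total_fx / total_f


# Mid-values (x) and frequencies (f) from the earnings table.
mid_values = [275, 325, 375, 425]
frequencies = [78, 106, 249, 67]

print(sum(frequencies))                       # 500
print(mean_grouped(mid_values, frequencies))  # 355.5

# Ages 4, 5, 6, 6, 7, 8 from the ungrouped standard deviation example.
print(mean_ungrouped([4, 5, 6, 6, 7, 8]))     # 6.0
```

For data grouped by class interval the result is an estimate, because every value in a class is treated as if it sat at the mid-value.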
Median
Middle value of a set of data arranged in ascending or descending order.
Ungrouped data:
If n is odd, median = value in the \( \left(\frac{n+1}{2}\right) \)th position.
If n is even, median = average of the values in the \( \left(\frac{n}{2}\right) \)th position and the \( \left(\frac{n}{2}+1\right) \)th position.
Grouped data:
Median is the value in the position of 50% of the total frequency.
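The position rule for ungrouped data can be sketched in a few lines. This is an illustration under stated assumptions, not the notes' own method: the function name `median` is chosen here, the eight-value list is the data set from the quartile example in these notes, and the three-value list is hypothetical, included only to exercise the odd case.

```python
def median(values):
    """Median of an ungrouped data set, following the odd/even position rule."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)  # arrange in ascending order first
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: value in the ((n + 1) / 2)th position (positions count from 1).
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the values in the (n / 2)th and (n / 2 + 1)th positions.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


# Data from the quartile example in these notes (n = 8, even).
print(median([4, 6, 10, 13, 15, 24, 27, 32]))  # 14.0

# Hypothetical three-value set (n = 3, odd), not taken from the notes.
print(median([32, 4, 15]))                     # 15
```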
Mode
Mode is the value or score that appears most frequently in a set of data.
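A short sketch of finding the mode by counting, assuming Python's standard `collections.Counter`. The function name `modes` is illustrative. It returns a list because a data set can have more than one most frequent value. The ages are the six values from the standard deviation example in these notes.

```python
from collections import Counter


def modes(values):
    """Return every value that appears with the highest frequency, sorted."""
    if not values:
        raise ValueError("mode of an empty data set is undefined")
    counts = Counter(values)         # value -> number of appearances
    highest = max(counts.values())   # the largest frequency
    return sorted(v for v, c in counts.items() if c == highest)


ages = [4, 5, 6, 6, 7, 8]
print(modes(ages))  # [6]
```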
Quartiles & Percentiles
Quartile
3 values which divide a set of data (pre-arranged in ascending order) into 4 equal parts:
1. \( Q_1 \) - lower quartile
2. \( Q_2 \) - median
3. \( Q_3 \) - upper quartile.
Percentile
Value that tells us the percentage of the data scored at or below that value. Percentiles are denoted as \( P_1, P_2, \ldots, P_{99} \).
Note: \( Q_1 = P_{25} \), \( Q_2 = P_{50} = \) Median and \( Q_3 = P_{75} \)
e.g. for Ungrouped Data
Data: 4, 6, 10, 13, 15, 24, 27, 32
\( Q_1 = P_{25} = 8 \)
\( Q_2 = P_{50} = \) Median \( = 14 \)
\( Q_3 = P_{75} = 25.5 \)
Grouped data (N = total frequency)
\( Q_1 = P_{25} \) = value that lies on the \( (25\% \times N) \)th position;
\( Q_2 = P_{50} = \) Median is the value that lies on the \( (50\% \times N) \)th position;
\( Q_3 = P_{75} \) is the value that lies on the \( (75\% \times N) \)th position.
Note: See illustration at cumulative frequency
Measurement of Spread of Data
Range
Range = largest value – smallest value
Interquartile range
Takes into account the middle 50% of the data, eliminating the influence of the extreme
values.
Interquartile Range = \( Q_3 - Q_1 \)
Note:
Small interquartile range: indicates that the data cluster closely around the median.
Large interquartile range: indicates that the data are spread across a wide range of values.
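The range and interquartile range can be worked out for the eight-value data set from the quartile example in these notes. The sketch below assumes the "median of each half" convention for quartiles, which reproduces the values 8, 14 and 25.5 given for that example. Other conventions exist and can give slightly different quartiles. The helper names `median_sorted` and `quartiles` are illustrative.

```python
def median_sorted(ordered):
    """Median of a list that is already in ascending order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3), taking Q1 and Q3 as the medians of the lower
    and upper halves (assumed convention; the middle value is left out
    of both halves when n is odd)."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = ordered[:half]       # values below the median position
    upper = ordered[n - half:]   # values above the median position
    return median_sorted(lower), median_sorted(ordered), median_sorted(upper)


data = [4, 6, 10, 13, 15, 24, 27, 32]
q1, q2, q3 = quartiles(data)
data_range = max(data) - min(data)   # largest value - smallest value
iqr = q3 - q1                        # Q3 - Q1

print(q1, q2, q3)       # 8.0 14.0 25.5
print(data_range, iqr)  # 28 17.5
```

Here the range (28) depends only on the two extreme values, while the interquartile range (17.5) depends only on the middle 50% of the data.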
Standard Deviation
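Shows how dispersed the data are from the mean.
It is the positive square root of the variance.
Ungrouped data:
\[ \text{Variance} = \frac{\sum (x - \bar{x})^2}{N} \]
\[ \text{SD} = \sqrt{\frac{\sum (x - \bar{x})^2}{N}} \quad \text{or} \quad \text{SD} = \sqrt{\frac{\sum x^2}{N} - \bar{x}^2} \]
where N is the number of values.
Grouped data:
\[ \text{Variance} = \frac{\sum f(x - \bar{x})^2}{\sum f} \]
\[ \text{SD} = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}} \quad \text{or} \quad \text{SD} = \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2} \]
where f is the frequency; \( \bar{x} \) is the mean; and
x is the number or score (grouped without class interval), or
x is the mid-value (grouped according to class interval).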
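Both forms of the standard deviation formula should give the same answer, and a short numerical check makes that concrete. This is a sketch only: the function names are illustrative, and the data are the six ages (4, 5, 6, 6, 7, 8) and the mass table (mid-values 56 to 76, frequencies 3, 4, 9, 13, 6, 5) from the worked examples in these notes.

```python
import math


def sd_ungrouped(values):
    """SD = sqrt( sum (x - mean)^2 / N )."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_ungrouped_alt(values):
    """SD = sqrt( sum x^2 / N - mean^2 ), the alternative form."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum(x * x for x in values) / n - mean ** 2)


def sd_grouped(mid_values, frequencies):
    """SD = sqrt( sum f*x^2 / sum f - mean^2 ) for grouped data."""
    total_f = sum(frequencies)
    mean = sum(f * x for x, f in zip(mid_values, frequencies)) / total_f
    mean_of_squares = sum(f * x * x for x, f in zip(mid_values, frequencies)) / total_f
    return math.sqrt(mean_of_squares - mean ** 2)


ages = [4, 5, 6, 6, 7, 8]
mid_values = [56, 60, 64, 68, 72, 76]
frequencies = [3, 4, 9, 13, 6, 5]

print(f"{sd_ungrouped(ages):.2f}")                    # 1.29
print(f"{sd_ungrouped_alt(ages):.2f}")                # 1.29
print(f"{sd_grouped(mid_values, frequencies):.2f}")   # 5.50
```

The alternative form avoids computing each deviation from the mean, which is why it is convenient when only the column totals of a table are available.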
e.g. for Ungrouped Data

Age, x (year)   x - x̄    (x - x̄)²          x²
4               -2       4                 16
5               -1       1                 25
6               0        0                 36
6               0        0                 36
7               1        1                 49
8               2        4                 64
Σx = 36                  Σ(x - x̄)² = 10    Σx² = 226

From the data, N = 6 and \( \bar{x} = 6 \).
\[ \text{SD} = \sqrt{\frac{\sum (x - \bar{x})^2}{N}} = \sqrt{\frac{10}{6}} = 1.29 \text{ years (correct to 3 sig. fig.)} \]

e.g. for Grouped Data (Class Interval)

Mass (in kg)    Mid-value (x)   Frequency (f)   fx           x²      fx²
54 < x ≤ 58     56              3               168          3136    9408
58 < x ≤ 62     60              4               240          3600    14 400
62 < x ≤ 66     64              9               576          4096    36 864
66 < x ≤ 70     68              13              884          4624    60 112
70 < x ≤ 74     72              6               432          5184    31 104
74 < x ≤ 78     76              5               380          5776    28 880
Total                           Σf = 40         Σfx = 2680           Σfx² = 180 768

\[ \bar{x} = \frac{\sum fx}{\sum f} = \frac{2680}{40} = 67 \text{ kg} \]
\[ \text{SD} = \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2} = \sqrt{\frac{180\,768}{40} - (67)^2} = 5.50 \text{ kg (correct to 3 sig. fig.)} \]

Cumulative Frequency Diagrams
Cumulative frequency is the frequency of values equal to or less than the particular data value.
Frequency distribution vs. cumulative frequency with their corresponding histograms
Cumulative Frequency Curve
[Figure: histogram (frequency, 0 to 100) and cumulative frequency curve (cumulative frequency, 0 to 200), each plotted against waiting time (min), 0 to 30]
Analysis of the cumulative frequency:
Modal Class: 15 < x ≤ 20
Lower Quartile: 25% × 200 = 50. Hence, \( Q_1 = P_{25} = 13 \) min
Median: 50% × 200 = 100. Hence, \( Q_2 = P_{50} = \) Median = 17 min
Upper Quartile: 75% × 200 = 150. Hence, \( Q_3 = P_{75} = 20 \) min
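The same reading-off procedure can be sketched in code. The class frequencies behind the waiting-time curve are not reproduced in these notes, so this illustration uses the mass table from the grouped standard deviation example instead (N = 40). Straight-line interpolation between class boundaries is an assumption standing in for reading values off a hand-drawn curve, so the results are estimates. The name `value_at_position` is illustrative.

```python
from itertools import accumulate

# Mass table from the grouped-data example: classes 54 < x <= 58, ..., 74 < x <= 78.
lowest_bound = 54
upper_bounds = [58, 62, 66, 70, 74, 78]
frequencies = [3, 4, 9, 13, 6, 5]

# Cumulative frequency: number of values less than or equal to each upper bound.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]


def value_at_position(position):
    """Estimate the data value at a cumulative-frequency position,
    joining the plotted points with straight lines (assumption)."""
    if not 0 <= position <= total:
        raise ValueError("position must lie between 0 and the total frequency")
    prev_bound, prev_cum = lowest_bound, 0
    for bound, cum in zip(upper_bounds, cumulative):
        if position <= cum and cum > prev_cum:
            fraction = (position - prev_cum) / (cum - prev_cum)
            return prev_bound + fraction * (bound - prev_bound)
        prev_bound, prev_cum = bound, cum
    raise ValueError("no data")


q1 = value_at_position(0.25 * total)      # 25% of N = 10th position
median = value_at_position(0.50 * total)  # 50% of N = 20th position
q3 = value_at_position(0.75 * total)      # 75% of N = 30th position

print(cumulative)                              # [3, 7, 16, 29, 35, 40]
print(f"{q1:.1f} {median:.1f} {q3:.1f}")       # 63.3 67.2 70.7
```

For the waiting-time example the positions would be 25%, 50% and 75% of 200, that is 50, 100 and 150, as in the analysis.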
Box-and-Whisker Plots
Useful in presenting the five-number summary (Min, Q1, Q2, Q3, Max).
[Figure: cumulative frequency curve of the waiting time of 200 people (cumulative frequency, 0 to 200, against waiting time (min), 0 to 30), with a box-and-whisker plot drawn on the same waiting-time axis marking Min, Q1 (13), Q2 (17), Q3 (20) and Max]