Statistics Notes


Statistics Revision Notes

© ACE-Learning

Data Handling

Methods of collecting data:

Survey form, questionnaire, observation, interviews and measurement

Data Displays

Collected data can then be presented in the following ways: pictogram, bar graph, pie chart, line graph, frequency table, histogram, dot diagram and stem-and-leaf diagram.

Measures of Central Tendency

Mean

The mean, $\bar{x}$, is also known as the average.

Ungrouped data: for a set of n numbers $x_1, x_2, \dots, x_n$,

$$\bar{x} = \frac{\sum x}{n}$$

Grouped data:

$$\bar{x} = \frac{\sum fx}{\sum f}$$

where f is the frequency;
x is the number or score (grouped without class interval);
x is the mid-value (grouped according to class interval).

E.g. for Grouped Mean

Earnings ($)       Mid-value (x)   Frequency (f)   fx
250 < x ≤ 300      275             78              21 450
300 < x ≤ 350      325             106             34 450
350 < x ≤ 400      375             249             93 375
400 < x ≤ 450      425             67              28 475
                                   Σf = 500        Σfx = 177 750

$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{177\,750}{500} = \$355.50$$

Median

Middle value of a set of data arranged in ascending or descending order.

Ungrouped data:

If n is odd, Median = value in the $\left(\frac{n+1}{2}\right)$th position.

If n is even, Median = average of the values in the $\left(\frac{n}{2}\right)$th position and the $\left(\frac{n}{2}+1\right)$th position.

Grouped data:

Median is the value in the position of 50% of the total frequency.

Mode

Mode is the value or score that appears most frequently in a set of data.
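The three measures above can be checked with a short Python sketch. It recomputes the grouped mean from the earnings table and applies the position rule for the median. The function names `grouped_mean` and `median_by_position` are illustrative assumptions, not part of the notes, and the ages are the six values used in the standard deviation example later in these notes.

```python
from statistics import multimode

# (mid-value, frequency) pairs taken from the earnings table.
earnings = [(275, 78), (325, 106), (375, 249), (425, 67)]

def grouped_mean(pairs):
    """Return sum(f * x) / sum(f) for (x, f) pairs."""
    total_f = sum(f for _, f in pairs)
    total_fx = sum(x * f for x, f in pairs)
    return total_fx / total_f

def median_by_position(values):
    """Median of ungrouped data using the odd/even position rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th value, counting from 1.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the (n / 2)th and (n / 2 + 1)th values.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

ages = [4, 5, 6, 6, 7, 8]

print(grouped_mean(earnings))    # 355.5
print(median_by_position(ages))  # 6.0
print(multimode(ages))           # [6]
```

`statistics.multimode` returns a list because a data set may have more than one mode.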

Measurement of Spread of Data

Quartiles & Percentiles

Quartile

3 values which divide a set of data (pre-arranged in ascending order) into 4 equal parts:

1. $Q_1$ - lower quartile
2. $Q_2$ - median
3. $Q_3$ - upper quartile.

Percentile

Value that tells us the percentage of the data scored at or below that value. Percentiles are denoted as $P_1, P_2, \dots, P_{99}$.

Note: $Q_1 = P_{25}$, $Q_2 = P_{50}$ = Median and $Q_3 = P_{75}$

e.g. for Ungrouped Data

Data (in ascending order): 4  6  10  13  15  24  27  32

$Q_1 = P_{25} = 8$
$Q_2 = P_{50}$ = Median = 14
$Q_3 = P_{75} = 25.5$

Grouped data (N = total frequency)

$Q_1 = P_{25}$ is the value that lies on the $(25\% \times N)$th position;
$Q_2 = P_{50}$ = Median is the value that lies on the $(50\% \times N)$th position;
$Q_3 = P_{75}$ is the value that lies on the $(75\% \times N)$th position.

Note: See illustration at cumulative frequency
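As a rough check of the ungrouped example, the sketch below finds the quartiles by splitting the ordered data into a lower and an upper half and taking the median of each. This "median of halves" convention is an assumption: it reproduces the values 8, 14 and 25.5 shown in the notes, but other conventions (including the defaults of `statistics.quantiles`) can give different answers. The function name `quartiles` is illustrative.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention.

    Assumption: when n is odd, the middle value is left out of both halves.
    Needs at least two values.
    """
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]
    upper = ordered[n // 2 + n % 2 :]
    return median(lower), median(ordered), median(upper)

data = [32, 27, 24, 15, 13, 10, 6, 4]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)  # 8.0 14.0 25.5
```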


Range

Range = largest value – smallest value

Interquartile range

Takes into account the middle 50% of the data, eliminating the influence of the extreme values.

Interquartile Range = $Q_3 - Q_1$

Note:

Small interquartile range value: indicates that the data cluster closely around the median.

Large interquartile range value: indicates that the data spread across a wide range of values.
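The claim that the interquartile range ignores extreme values can be illustrated with a small sketch. It uses the eight values from the quartile example, plus a second list in which the largest value is replaced by a hypothetical extreme value (320) that does not appear in the notes. Quartiles are taken as the medians of the lower and upper halves, which is an assumed convention.

```python
from statistics import median

def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)

def interquartile_range(values):
    """Q3 - Q1, with quartiles taken as medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]
    upper = ordered[n // 2 + n % 2 :]
    return median(upper) - median(lower)

data = [4, 6, 10, 13, 15, 24, 27, 32]
with_outlier = [4, 6, 10, 13, 15, 24, 27, 320]  # hypothetical extreme value

print(value_range(data), interquartile_range(data))                  # 28 17.5
print(value_range(with_outlier), interquartile_range(with_outlier))  # 316 17.5
```

The range jumps from 28 to 316, while the interquartile range stays at 17.5.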

Standard Deviation

Shows how dispersed the data are from the mean.

It is the positive square root of the variance.

Ungrouped data

$$\text{Variance} = \frac{\sum (x - \bar{x})^2}{N}$$

$$\text{SD} = \sqrt{\frac{\sum (x - \bar{x})^2}{N}} \quad \text{or} \quad \text{SD} = \sqrt{\frac{\sum x^2}{N} - \bar{x}^2}$$

where N is the number of values.

Grouped data

$$\text{Variance} = \frac{\sum f(x - \bar{x})^2}{\sum f}$$

$$\text{SD} = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}} \quad \text{or} \quad \text{SD} = \sqrt{\frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2}$$

where f is frequency; $\bar{x}$ is the mean; and
x is the number or score (grouped without class interval);
x is the mid-value (grouped according to class interval).

e.g. for Ungrouped Data

Age, x (year)   x − x̄    (x − x̄)²    x²
4               −2        4            16
5               −1        1            25
6                0        0            36
6                0        0            36
7                1        1            49
8                2        4            64
Σx = 36                   Σ(x − x̄)² = 10    Σx² = 226

From the data, N = 6 and $\bar{x} = 6$.

e.g. for Grouped Data (Class Interval)

Mass (in kg)    Mid-value (x)   Frequency (f)   fx          x²      fx²
54 < x ≤ 58     56              3               168         3136    9408
58 < x ≤ 62     60              4               240         3600    14 400
62 < x ≤ 66     64              9               576         4096    36 864
66 < x ≤ 70     68              13              884         4624    60 112
70 < x ≤ 74     72              6               432         5184    31 104
74 < x ≤ 78     76              5               380         5776    28 880
                                Σf = 40         Σfx = 2680          Σfx² = 180 768
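The two tables above can be checked numerically. The sketch below applies the first ungrouped formula to the ages and the second grouped formula to the (mid-value, frequency) pairs of the mass table. The function names `sd_ungrouped` and `sd_grouped` are illustrative assumptions. Both divide by N (or by Σf), so they correspond to what Python's `statistics.pstdev` calls the population standard deviation.

```python
from math import sqrt

def sd_ungrouped(values):
    """SD = sqrt(sum((x - mean)^2) / N)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)

def sd_grouped(pairs):
    """SD = sqrt(sum(f x^2) / sum(f) - (sum(f x) / sum(f))^2) for (x, f) pairs."""
    total_f = sum(f for _, f in pairs)
    mean = sum(f * x for x, f in pairs) / total_f
    mean_of_squares = sum(f * x * x for x, f in pairs) / total_f
    return sqrt(mean_of_squares - mean ** 2)

ages = [4, 5, 6, 6, 7, 8]
masses = [(56, 3), (60, 4), (64, 9), (68, 13), (72, 6), (76, 5)]

print(round(sd_ungrouped(ages), 2))  # 1.29
print(round(sd_grouped(masses), 2))  # 5.5
```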

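For the ungrouped example (age):

$$\text{SD} = \sqrt{\frac{\sum (x - \bar{x})^2}{N}} = \sqrt{\frac{10}{6}} = 1.29 \text{ years (correct to 3 sig. fig.)}$$

For the grouped example (mass):

$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{2680}{40} = 67 \text{ kg}$$

$$\text{SD} = \sqrt{\frac{\sum fx^2}{\sum f} - \bar{x}^2} = \sqrt{\frac{180\,768}{40} - (67)^2} = 5.50 \text{ kg (correct to 3 sig. fig.)}$$

Cumulative Frequency Diagrams

Cumulative frequency is the frequency of values equal to or less than the particular data value.

[Figure: frequency distribution vs. cumulative frequency with their corresponding histograms]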
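To make the definition of cumulative frequency concrete, the sketch below builds a cumulative frequency column as a running total. It uses the class upper boundaries and frequencies of the mass example in these notes; the variable names are illustrative.

```python
from itertools import accumulate

# Upper class boundaries (kg) and frequencies from the mass example.
upper_bounds = [58, 62, 66, 70, 74, 78]
frequencies = [3, 4, 9, 13, 6, 5]

# Running total: number of values less than or equal to each boundary.
cumulative = list(accumulate(frequencies))

for bound, cf in zip(upper_bounds, cumulative):
    print(f"mass <= {bound} kg: {cf}")
```

The last cumulative frequency equals the total frequency, here 40.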

Cumulative Frequency Curve

[Figure: histogram (frequency against waiting time, 0 to 30 min) and cumulative frequency curve (cumulative frequency, 0 to 200, against waiting time, 0 to 30 min)]

Analysis of the cumulative frequency:

Modal Class: $15 < x \le 20$

Lower Quartile: $25\% \times 200 = 50$. Hence, $Q_1 = P_{25} = 13$ min

Median: $50\% \times 200 = 100$. Hence, $Q_2 = P_{50}$ = Median = 17 min

Upper Quartile: $75\% \times 200 = 150$. Hence, $Q_3 = P_{75} = 20$ min
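The read-off procedure above has two steps: find the position (a percentage of the total frequency), then read the matching value from the curve. The sketch below computes the positions for the 200 waiting times. The plotted points of that curve are not given in the notes, so the read-off step is illustrated with the mass example from the standard deviation section instead. Joining the points with straight lines is an assumption (the notes read from a smooth curve), and the names `quartile_positions` and `read_off` are illustrative.

```python
def quartile_positions(total):
    """Positions of Q1, Q2 and Q3 as 25%, 50% and 75% of the total frequency."""
    return {name: pct * total / 100 for name, pct in (("Q1", 25), ("Q2", 50), ("Q3", 75))}

def read_off(points, position):
    """Estimate the value at a cumulative-frequency position.

    points: (value, cumulative frequency) pairs in ascending order.
    Assumption: neighbouring points are joined by straight lines.
    """
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c1 > c0 and c0 <= position <= c1:
            return x0 + (position - c0) * (x1 - x0) / (c1 - c0)
    raise ValueError("position lies outside the curve")

print(quartile_positions(200))  # {'Q1': 50.0, 'Q2': 100.0, 'Q3': 150.0}

# Class boundaries (kg) and cumulative frequencies of the mass example.
mass_curve = [(54, 0), (58, 3), (62, 7), (66, 16), (70, 29), (74, 35), (78, 40)]
median_position = quartile_positions(40)["Q2"]
print(round(read_off(mass_curve, median_position), 2))  # 67.23
```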

Box-and-Whisker Plots

Useful in presenting the five-number summary: Min, $Q_1$, $Q_2$, $Q_3$ and Max.

[Figure: cumulative frequency curve of the waiting time of 200 people (cumulative frequency, 0 to 200, against waiting time, 0 to 30 min) with $Q_1 = 13$ and $Q_2 = 17$ marked, above a box-and-whisker plot on the same waiting-time axis showing Min, $Q_1$, $Q_2$, $Q_3$ and Max]
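A box-and-whisker plot is drawn from five numbers, which can be collected as in the sketch below. The minimum and maximum waiting times are not stated in the notes, so the example uses the eight ungrouped values from the quartile section. The `FiveNumberSummary` type and the median-of-halves quartile convention are assumptions made for illustration.

```python
from statistics import median
from typing import NamedTuple

class FiveNumberSummary(NamedTuple):
    minimum: float
    q1: float
    q2: float
    q3: float
    maximum: float

def five_number_summary(values):
    """Return Min, Q1, Q2, Q3 and Max for ungrouped data (at least two values)."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]
    upper = ordered[n // 2 + n % 2 :]
    return FiveNumberSummary(
        ordered[0], median(lower), median(ordered), median(upper), ordered[-1]
    )

summary = five_number_summary([4, 6, 10, 13, 15, 24, 27, 32])
print(summary)
```

The box spans `q1` to `q3` with a line at `q2`, and the whiskers extend to `minimum` and `maximum`.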