!NNC Yr12 maths ch 04 Century Year 12/04... · spread. For set B, the interquartile range is the...
Chapter 4: Statistical distributions
DATA ANALYSIS
Sydneysiders often like to claim that it’s always raining in Melbourne, but is Melbourne really such a wet city? The following data shows the average number of rainy days per month for the two capital cities, and is supplied by the Bureau of Meteorology.
Is it possible to use this data to compare the rainfalls of the two cities and decide which city is wetter? Does Melbourne generally have more rainy days than Sydney, or is the number of rainy days per month more consistent?
This chapter is about comparing the statistics of data sets and noting any similarities and differences. Do men spend more money than women? Do more people shop on weekends than weekdays? Are teachers generally younger than doctors? Data sets can be compared by examining the shapes of their graphs or by analysing their calculated measures of location and spread.
In this chapter you will learn how to:
• name and use the different types of data and random samples
• calculate measures of location (mean, median, mode)
• calculate measures of spread (range, interquartile range, standard deviation)
• analyse and interpret dot plots, stem-and-leaf plots, box plots and radar charts
• investigate outliers in small data sets and their effects on the mean, median and mode
• describe the shape of a distribution and make conclusions about the data in the distribution
• display and compare two sets of data in double stem-and-leaf plots, double box-and-whisker plots, radar charts and area charts
• interpret data presented in a two-way table form
• use summary statistics and multiple displays to interpret and compare the relationships between two data sets.
Average number of rainy days per month
Month       J   F   M   A   M   J   J   A   S   O   N   D
Sydney      12  12  13  12  12  12  10  10  10  12  11  12
Melbourne    8   7   9  12  14  14  15  16  15  14  12  11
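The comparison the chapter opening asks for can be sketched in a few lines of Python. This is only an illustration: the variable names are ours, and the choice of the population standard deviation (treating the twelve months as the whole data set) is an assumption, not something the text prescribes.

```python
from statistics import mean, pstdev

# Average number of rainy days per month (J to D), from the table above.
sydney = [12, 12, 13, 12, 12, 12, 10, 10, 10, 12, 11, 12]
melbourne = [8, 7, 9, 12, 14, 14, 15, 16, 15, 14, 12, 11]

for city, days in (("Sydney", sydney), ("Melbourne", melbourne)):
    # Location: mean. Spread: range and population standard deviation.
    print(f"{city}: mean = {mean(days):.2f}, "
          f"range = {max(days) - min(days)}, "
          f"standard deviation = {pstdev(days):.2f}")
```

Melbourne has the slightly higher mean (12.25 against 11.5 rainy days per month), but Sydney's figures vary far less from month to month, which is the kind of similarity and difference this chapter sets out to describe.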
!NNC Yr12 maths ch 04 Page 109 Wednesday, October 4, 2000 1:43 AM
NEW CENTURY MATHS GENERAL: HSC
COLLECTING AND DISPLAYING DATA
Collecting data
Data or information can be collected by a variety of means:
• through observation, such as a naturalist observing animal behaviour
• by experiment, for example a medical researcher testing the effects of a new drug
• from a survey, usually via a telephone poll or a written questionnaire
• by taking a census, that is, surveying the whole population.
Do you still have your statistics file containing graphs and tables collected during the Preliminary Course? You should now add to your file by collecting articles from recent newspapers that contain graphs and tables, especially those that contain more than one statistical display or display data in an interesting way.
Use your library or explore the Internet to find real data. Here are three useful websites:
• Australian Bureau of Statistics (ABS): www.abs.gov.au or www.statistics.gov.au
• Bureau of Meteorology: www.bom.gov.au
• Morgan Surveys: www.roymorgan.com.au
Types of data
Data falls into one of the following types:
• quantitative (numerical) data that is discrete, such as the number of computers in schools
• quantitative (numerical) data that is continuous, such as the weights of gym members
• categorical (qualitative) data, such as the birthplaces of people living in Sydney.
Example 1
What type of data is each of these?
(a) the numbers of people attending Olympic Games
(b) the types of breakfast cereal in Cottonworths supermarket
(c) the body temperature of a hospital patient taken over a 24-hour period
Solution
(a) Quantitative and discrete since the data are distinct whole numbers.
(b) Categorical since the data are brand names of cereals.
(c) Quantitative and continuous since the data can be measured along a continuous scale.
• Quantitative or numerical data is best displayed in a column graph or line graph.
• Categorical or qualitative data is best displayed in a sector graph or divided bar graph.
Why do you think this is so?
Idea: Collecting statistical graphs and tables
Idea: Use the Internet to find real data
Think: Which graph is best?
Random sampling
It is not always convenient to collect data from all members of a population, that is, by using a census. If a population is too large or too difficult to survey, a sample of items can be taken from the population and analysed, and the results used to reflect population characteristics.
• A simple random sample is one where each member of the population is equally likely to be chosen, for example choosing the winning balls in Lotto.
• A systematic sample is one where the first member is chosen at random and the others are chosen at regular intervals, for example every 8th toy on a production line.
• A stratified sample is one where a representative sample is taken from each stratum or layer of a population. For example, a stratified sample from a population containing 70% adults and 30% children would contain 70% adults and 30% children.
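The three sampling methods above can be sketched as follows. The function names, the population of 80 toys and the 70 adults and 30 children are all hypothetical, chosen only to mirror the examples in the text; the fixed seed simply makes the run repeatable.

```python
import random

def simple_random_sample(population, k, rng):
    """Every member is equally likely to be chosen."""
    return rng.sample(population, k)

def systematic_sample(population, step, rng):
    """Random starting point, then every `step`-th member after it."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, k, rng):
    """Sample each stratum in proportion to its share of the population.

    Shares are rounded, so for awkward proportions the total may differ
    slightly from k.
    """
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        share = round(k * len(members) / total)
        sample.extend(rng.sample(members, share))
    return sample

rng = random.Random(2000)   # fixed seed (arbitrary value)
toys = list(range(1, 81))   # hypothetical production line of 80 toys
people = {
    "adults": [f"adult{i}" for i in range(70)],
    "children": [f"child{i}" for i in range(30)],
}

print(simple_random_sample(toys, 6, rng))
print(systematic_sample(toys, 8, rng))
chosen = stratified_sample(people, 10, rng)
print(sum(p.startswith("adult") for p in chosen), "adults out of", len(chosen))
```

With a population that is 70% adults, the stratified sample of 10 always contains 7 adults and 3 children, whereas a simple random sample of 10 people would only do so on average.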
Exercise 4-01: Collecting and displaying data
1. State whether the data is (i) categorical, (ii) quantitative and discrete, or (iii) quantitative and continuous in each case.
(a) temperature of water in a swimming pool
(b) number of people who voted Liberal in the last four elections
(c) response time when patient’s reflexes are tested
(d) religious denomination
(e) breeds of dogs
(f) speed of a car
(g) number of goals scored in a football match
(h) heights of girls in the school athletics team
2. Give two reasons for choosing a sample rather than a census.
3. What are biased and unbiased samples?
4. Which type of random sample (simple, systematic or stratified) would best suit each of the following situations?
(a) random breath testing
(b) opinion poll on whether Australia should change the flag
(c) taste testing a new brand of soft drink
5. Answer the questions that follow these three displays.
(a)
[Display (a): graph titled "Drugs used by Australians"; axis: Percentage, 0 to 100; categories: Alcohol, Tobacco, Cannabis, Painkillers, Sleeping pills, Heroin, Amphetamines, Ecstasy, Cocaine, Hallucinogens]
(b)
(c)
(i) State the type of display and information contained in the display.
(ii) Describe the type of data displayed.
(iii) Comment briefly on the strengths and weaknesses of the display.
SUMMARY STATISTICS
Measures of location (or averages)
Measures of location or averages are used to indicate the middle or centre of a data set. There are three measures of location: the mean, median and mode. You should use the measure that is best suited to the type and distribution of the data.
The mode is the most popular value or category.
The centre of a categorical data set is always described by the mode. For example, the modal dress size is the size worn by more women than any other.
The mean is the arithmetic average of all scores: $\bar{x} = \frac{\sum x}{n}$ or $\bar{x} = \frac{\sum fx}{\sum f}$.
The median is the middle score (or average of the two middle scores) when the scores are arranged in ascending order.
The centre of a quantitative data set is usually described by the mean or the median.
The mean takes into account all scores in a data set and can be considered as the ‘balance point’ of the data set. It is, however, affected by very large or very small scores. In distributions where there are outliers, it is better to use the median as the measure of location.
[Display (b) for question 5 of Exercise 4-01: graph titled "Newstart Allowance"; categories: Less than 20, 21–34, 35–54, 55–59, More than 60; scale: 0 to 250 000; series: 1995, 1996, 1997]
[Display (c) for question 5 of Exercise 4-01: graph titled "Road fatalities in Australia"; time axis: Mar 95 to Mar 99; other axis: Fatalities per 10 000 population, 0.0 to 3.5]
Outliers
An outlier is a very high or very low score that is clearly apart from the other scores.
An outlier can occur for a variety of reasons and should always be investigated. If an outlier is found to be a value obtained through incorrect measurement or observation and is not a typical score, it can be excluded. If the outlier is a possible value from the population, it should be included in the distribution.
Here the outlier temperature is 42°C.
Example 2
Find the mean and median of the two data sets and state which is the more appropriate measure of location for each set:
A: 1 2 3 3 4 7 8
B: 1 2 3 3 4 7 29
Solution
A: Mean $\bar{x} = \frac{1 + 2 + 3 + 3 + 4 + 7 + 8}{7} = 4$, Median = 3
B: Mean $\bar{x} = \frac{1 + 2 + 3 + 3 + 4 + 7 + 29}{7} = 7$, Median = 3
For set A, either the mean or median could be used as the measure of location.
For set B, the median is the better measure of location as the outlier 29 affects the mean.
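As a quick check of Example 2, the same values can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the variable names are ours.

```python
from statistics import mean, median

set_a = [1, 2, 3, 3, 4, 7, 8]
set_b = [1, 2, 3, 3, 4, 7, 29]   # same scores, but the last one is an outlier

for name, scores in (("A", set_a), ("B", set_b)):
    print(f"Set {name}: mean = {mean(scores)}, median = {median(scores)}")
```

Replacing 8 with 29 moves the mean from 4 to 7 but leaves the median at 3, which is why the median is preferred when an outlier is present.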
[Dot plot for the outlier definition: temperatures on a scale from 36°C to 42°C]
The mean can also be found using the statistics mode of a scientific or graphics calculator.
Just for the record
THE CHALLENGER DISASTER
In 1986, the space shuttle Challenger exploded just after takeoff and seven astronauts were killed. It was found that two rubber O-rings had failed because of the low air temperature. Here is some data from previous flights:
Air temperature (°C)   12  14  19  23  26
Damage index           11   4   0   0   0
An engineer had noticed the outlier (i.e. a damage index of 11 at a temperature of 12°C) and the fact that the expected air temperature at the takeoff time was below 0°C and had recommended that the flight be delayed. Unfortunately, the outlier was not considered important and the flight ended in tragedy.
Can you give another example where an outlier should not be ignored?
Measures of spread
Measures of dispersion or spread are used to indicate how spread out a data set is. As with measures of location, you should use the measure that is best suited to the type and distribution of data.
Range and interquartile range
Range = highest score – lowest score
Interquartile range = upper quartile – lower quartile = Q3 – Q1
The interquartile range is the range of the middle 50% of scores. The upper quartile (Q3) is the median of the upper half of the scores and the lower quartile (Q1) is the median of the lower half. The interquartile range is often a better indicator of spread than the range as it does not take extreme scores into account.
Example 3
Find the range and interquartile range of the two data sets and state which is the more appropriate measure of spread for each set.
A: 1 2 3 3 4 7 8
B: 1 2 3 3 4 7 29
Solution
A: 1 2 3 3 4 7 8
Q1 = 2, Q2 = 3, Q3 = 7
Range = 8 – 1 = 7
Interquartile range = 7 – 2 = 5
B: 1 2 3 3 4 7 29
Q1 = 2, Q2 = 3, Q3 = 7
Range = 29 – 1 = 28
Interquartile range = 7 – 2 = 5
For set A, either the range or interquartile range could be used as the scores are fairly evenly spread.
For set B, the interquartile range is the better measure of spread as it does not take the outlier score 29 into account.
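The quartile method described above (median of the lower half and median of the upper half) can be sketched in Python. The function names are ours, and leaving the middle score out of both halves when the number of scores is odd is an assumption chosen to match the working in Example 3; other quartile conventions exist and can give slightly different values.

```python
from statistics import median

def quartiles(scores):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    For an odd number of scores the middle score is left out of both halves.
    """
    ordered = sorted(scores)
    if len(ordered) < 2:
        raise ValueError("need at least two scores")
    half = len(ordered) // 2
    lower = ordered[:half]    # scores below the median position
    upper = ordered[-half:]   # scores above the median position
    return median(lower), median(upper)

def spread(scores):
    """Return (range, interquartile range)."""
    q1, q3 = quartiles(scores)
    return max(scores) - min(scores), q3 - q1

print(spread([1, 2, 3, 3, 4, 7, 8]))    # set A
print(spread([1, 2, 3, 3, 4, 7, 29]))   # set B
```

The outlier 29 quadruples the range (7 to 28) while the interquartile range stays at 5.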
Standard deviation
Standard deviation is the most common measure of the spread of a distribution. It is the square root of the average of the squared deviations from the mean.
σn is the standard deviation of a population and σn − 1 is the standard deviation of a sample.
σn − 1 is used to approximate the population standard deviation σn and gets closer to the population value as the sample size increases.
Use σn − 1 if the data is from a sample (or if you are unsure) and σn if all the possible data is given. In either case, always state which standard deviation you are using.
σn − 1 for sample data σn for population data
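Written as formulas, the verbal definition above becomes the following. These are the standard definitions rather than formulas printed in this section; $\sigma_n$ and $\sigma_{n-1}$ correspond to the calculator labels σn and σn − 1 used in the text. The worked line uses set A from the earlier examples (1 2 3 3 4 7 8, mean 4).

```latex
\sigma_n = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}
\qquad
\sigma_{n-1} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}

% Set A: squared deviations from the mean 4 are 9, 4, 1, 1, 0, 9, 16
\sum (x - \bar{x})^2 = 40, \qquad
\sigma_{n-1} = \sqrt{\frac{40}{6}} \approx 2.58, \qquad
\sigma_n = \sqrt{\frac{40}{7}} \approx 2.39
```

Dividing by n − 1 rather than n makes the sample value slightly larger, and the difference shrinks as n grows.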
Mean and standard deviation from a calculator
The mean and standard deviation of a data set can be calculated using the statistics mode (STAT or SD) of your calculator.
Example 4
Here are the net weekly earnings of 8 labourers:
$730 $490 $600 $440 $490 $370 $700 $580
(a) What is the mean weekly earning?
(b) What is the standard deviation of the earnings?
Solution
Clear any previous data and check that n = 0.
Enter the separate values: 730 DATA 490 DATA 600 DATA … 580 DATA
Check that you have entered the correct number of scores by checking that n = 8.
(a) Mean = $550
(b) Standard deviation σn − 1 ≈ $125.40
Example 5
Find the standard deviation of the two data sets and state which set is the more widely spread.
A: 1 2 3 3 4 7 8
B: 1 2 3 3 4 7 29
Solution
A: Sample standard deviation σn − 1 ≈ 2.58
B: Sample standard deviation σn − 1 ≈ 9.88
Set B is more widely spread as it has a much larger standard deviation.
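The values in Example 5 can be checked with Python's `statistics` module, where `stdev` is the sample standard deviation (σn − 1) and `pstdev` is the population standard deviation (σn). A minimal sketch with our own variable names:

```python
from statistics import pstdev, stdev

set_a = [1, 2, 3, 3, 4, 7, 8]
set_b = [1, 2, 3, 3, 4, 7, 29]

for name, scores in (("A", set_a), ("B", set_b)):
    print(f"Set {name}: sample SD = {stdev(scores):.2f}, "
          f"population SD = {pstdev(scores):.2f}")
```

Whichever version is used, set B's standard deviation is several times that of set A, so the conclusion about spread does not depend on the choice.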
Example 6
Twenty possums were captured, tagged and released in the Booderee National Park. Rangers recaptured several samples of 10 possums over a 2-month period and recorded the number of tagged possums in each sample.
No. tagged per sample (score)   0   1   2   3   4   5
No. of samples (frequency)      8  11   5   4   2   1
(a) What is the mean number of tagged possums per sample (correct to 2 decimal places)?
(b) What is the standard deviation of tagged possums (correct to 2 decimal places)?
Solution
Clear any previous data and check that n = 0.
Enter the data: 0 × 8 DATA 1 × 11 DATA 2 × 5 DATA etc.
Check that you have entered the correct number of scores by checking that n = 31.
(a) Mean = 1.48 possums
(b) Standard deviation σn − 1 = 1.36 possums
σn − 1 is the sample standard deviation.
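What the calculator does with a frequency table in Example 6 can be sketched directly from the formula $\bar{x} = \frac{\sum fx}{\sum f}$. The function name and return layout below are ours, not part of the text.

```python
from math import sqrt

def frequency_stats(scores, frequencies):
    """Return (n, mean, population SD, sample SD) for a frequency table."""
    n = sum(frequencies)
    mean = sum(x * f for x, f in zip(scores, frequencies)) / n
    # Sum of squared deviations, each weighted by its frequency.
    squared = sum(f * (x - mean) ** 2 for x, f in zip(scores, frequencies))
    return n, mean, sqrt(squared / n), sqrt(squared / (n - 1))

# Tagged possums per sample (score) and number of samples (frequency).
n, mean, sd_population, sd_sample = frequency_stats(
    [0, 1, 2, 3, 4, 5], [8, 11, 5, 4, 2, 1]
)
print(f"n = {n}, mean = {mean:.2f}, sample SD = {sd_sample:.2f}")
```

Checking that n equals 31 here plays the same role as checking n on the calculator: it confirms every frequency was entered.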
Example 7
The annual salaries of employees at the Nelson manufacturing company are tabulated.
(a) How many people are employed at the company?
(b) Using class centres 25, 35, 45, …, find the estimated mean salary of the employees.
(c) What is the standard deviation of the salaries?
Solution
(a) Number of employees = 46.
(b) Clear any previous data and check that n = 0.
Enter the data: 25 × 16 DATA 35 × 12 DATA 45 × 5 DATA etc.
Check that n = 46.
Mean = 40.4.
Hence the estimated mean salary is $40 400 per annum.
(c) Standard deviation σn ≈ 15.7.
Hence the standard deviation of the salaries is about $15 700 per annum.
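The grouped-data estimate in Example 7 can be sketched by repeating each class centre as many times as its frequency. The variable names are ours; the class centres (25 to 75) and frequencies (16, 12, 5, 5, 6, 2) are those of the example's salary table. With these figures the population formula gives about 15.7 and the sample formula about 15.9, so it is worth stating which one is being quoted.

```python
from statistics import mean, pstdev, stdev

class_centres = [25, 35, 45, 55, 65, 75]   # in thousands of dollars
employees = [16, 12, 5, 5, 6, 2]           # frequency of each class

# Expand the table: every employee is assigned the centre of their class.
salaries = [centre
            for centre, count in zip(class_centres, employees)
            for _ in range(count)]

print(f"n = {len(salaries)}")
print(f"estimated mean = {mean(salaries):.1f}")
print(f"population SD = {pstdev(salaries):.1f}")
print(f"sample SD = {stdev(salaries):.1f}")
```

Because every salary is replaced by its class centre, both the mean and the standard deviation are estimates rather than exact values.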
Data for Example 7:
Annual salary (× $1000)   Number of employees
20–<30                    16
30–<40                    12
40–<50                     5
50–<60                     5
60–<70                     6
70–<80                     2
Working table for the solution to Example 7:
Annual salary (× $1000)   Class centre   No. of employees
20–<30                    25             16
30–<40                    35             12
40–<50                    45              5
50–<60                    55              5
60–<70                    65              6
70–<80                    75              2
Total                                    46
20–<30 means from 20 up to but not including 30.
σn is the population standard deviation.
Exercise 4-02: Summary statistics
1. Without using the statistical functions on your calculator, find the mean and median for each data set (correct to 1 decimal place where appropriate). State which is the better measure of location and why.
(a) 2 3 3 5 6 8 9
(b) 26 22 24 29 21 23 24 22
(c) 8 40 38 42 45 29 31 41 30
(d) 6 8 11 9 10 8 11 12 6 7
2. Find the range and interquartile range for each data set in question 1 and state which is the better measure of spread and why.
3. Using your calculator, find the mean and standard deviation (correct to 1 decimal place) for each of these sets of data.
(a) 42, 35, 63, 70, 81, 80, 85
(b) $300, $400, $600, $440, $300, $700, $250, $580, $260
(c) 37.4°F, 38.2°F, 39.0°F, 36.8°F, 38.5°F, 38.0°F, 36.8°F, 40.5°F
(d) 165 kg, 146 kg, 178 kg, 190 kg, 158 kg, 147 kg
(e) 23, 18, 24, 16, 17, 20, 15, 22, 19
4. The hair colours of 75 people were noted.
Hair colour   No. of people
Brown         16
Blonde        25
Black         14
Red            7
Other         13
(a) What is the modal hair colour?
(b) Why is the mode the best measure of central tendency here?
(c) Why are the mean and median not appropriate measures here?
5. Joshua swam a kilometre each morning for 10 days in preparation for a swimming carnival. His times (in minutes) were:
28 24 22 24 25 24 26 26 24 27
(a) What is his median swim time?
(b) What is his mean swim time?
(c) What is his range of swim times?
(d) What is the interquartile range of his times?
(e) What is the standard deviation of his times (correct to 1 decimal place)?
(f) If Joshua asked you to tell him the most appropriate measures of location and spread for these times, which two would you choose? Justify your answer.
6. Ted and Julie were paid by piecework for making T-shirts. The numbers made each day over an 8-day period were:
Ted: 18 25 19 19 26 24 15 22
Julie: 16 20 21 28 12 26 18 19
(a) For each person find:
(i) the number of T-shirts made in the 8-day period
(ii) the interquartile range of T-shirts made
(iii) the mean number of T-shirts made
(iv) the standard deviation of T-shirts made (correct to 1 decimal place)
(b) Comment on the statement that Ted is a more consistent worker than Julie by comparing their means and standard deviations. Give reasons for your answer.
7. Numbers of motor accidents per week over a 9-week period at a busy intersection were:
4 3 6 0 4 9 2 3 5
(a) What is the median number of accidents?
(b) What is the mean number of accidents per week?
(c) Does the mean or median best describe the centre of this data set? Give reasons.
(d) What is the range of the data?
(e) What is the interquartile range?
(f) Find the standard deviation for the data (correct to 1 decimal place).
(g) Comment on the statement: ‘The number of accidents per week is fairly consistent’. Justify your answer.
8. This stem plot shows the waiting times in a medical centre (in minutes).
Stem | Leaf
1 | 2 4 5
2 | 3 6 9
3 | 0 1 4 5 7 7 8
4 | 2 4 5
5 | 1 3 7
6 | 2
(a) Find the mean waiting time (correct to 1 decimal place).
(b) Find the standard deviation of waiting times (correct to 1 decimal place).
9. The weekly wages of a group of teachers are shown in the table.
Weekly wage ($)   No. of teachers
500–<600           8
600–<700           5
700–<800          30
800–<900          25
900–<1000         12
(a) What is the mean weekly wage?
(b) What is the standard deviation of the weekly wages (correct to 1 decimal place)?
10. The percentage marks for 250 students in a Business Studies examination are listed.
Score     No. of students
11–20     14
21–30     18
31–40     15
41–50     26
51–60     20
61–70     31
71–80     44
81–90     39
91–100    43
(a) What is the mean mark (to the nearest whole number)?
(b) What is the standard deviation of the marks (correct to 1 decimal place)?
(c) Write a comment to the school principal describing the results of these students.
FEATURES OF A STATISTICAL DISPLAY
Shape
The shape of a statistical display shows how the data is distributed. When using a dot plot or histogram, a curve can be used to approximate the general shape.
[Figure: display on a scale from 0 to 10]
Clustering
Clustering occurs when scores are close together or ‘bunched up’. In this stem-and-leaf plot, the scores are clustered in the 50s and 80s.
Stem | Leaf
3 | 1 2
4 | 3 5 9
5 | 0 2 6 7 7 8 9
6 | 4 5
7 | 3 5 6
8 | 0 2 4 6 6 8 8 9
9 | 4 2 9
Symmetry
A distribution has symmetry if the scores are balanced or evenly spread about the centre of the distribution.
[Dot plot: scores 3 to 7]
In a symmetrical distribution, the mean, median and mode are usually the same. For this distribution, the mean, median and mode are all 5.
Skew
A distribution that is skewed is not symmetrical. The tail indicates the direction of the skew.
• If the scores are mostly low (or to the left), the distribution is positively skewed.
• If the scores are mostly high (or to the right), the distribution is negatively skewed.
This distribution is positively skewed. The tail points to the right, the positive direction.
[Dot plot: scores 21 to 28]
The data in this dot plot is negatively skewed. The tail points to the left, the negative direction.
Stem | Leaf
2 | 2 6
3 | 0 1
4 | 4 8
5 | 2 6 9
6 | 1 3 4 5
7 | 0 2 3 4 4 5 5 7 7 7 8 9
8 | 3 5 7 7 8 8 8 8 9
9 | 2 8
The data in this stem-and-leaf display is negatively skewed as the scores are mostly high with a tail towards the low scores. Clustering also occurs in the 70s and 80s.
Peaks and modes
Peaks are the high points or ‘humps’ in a display. The highest peak is called the mode.
No peaks: display is uniform or flat and there is no mode. [Dot plot: scores 3 to 7]
One peak: display is unimodal. The mode here is 6. [Dot plot: scores 3 to 7]
Two peaks: display is bimodal. The mode here is 6. The mode is the higher of the two peaks. [Dot plot: scores 3 to 9]
Many peaks: display is multimodal. There are three peaks here and two modes: 5 and 7. There are two highest peaks here so there are two modes. [Dot plot: scores 2 to 10]
Think: Shape and measures of location
The three distributions show the relative positions of the mean, median and mode.
[Three diagrams, each a curve of Frequency against Score with the mean, median and mode marked: (a) Symmetrical, (b) Positively skewed, (c) Negatively skewed]
• For a symmetrical distribution, the mean, median and mode are usually equal.
• For a skewed distribution, the median is usually between the mean and mode and is the better measure of location.
Diagram (a) could represent results in an HSC General Mathematics examination.
Diagram (b) could represent traffic flow from 6 am to noon.
Diagram (c) could represent the heights of basketball players in a club.
Can you think of other situations that these diagrams could represent?
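The usual ordering of mode, median and mean in a skewed distribution can be seen with a small made-up data set. The scores below are hypothetical, invented only to show the pattern; they do not come from the diagrams.

```python
from statistics import mean, median, mode

# Hypothetical scores: mostly low with a tail to the right (positive skew).
positive = [1, 2, 2, 2, 3, 3, 4, 5, 9]
# Mirror image of the same scores: mostly high with a tail to the left.
negative = [10 - x for x in positive]

for label, scores in (("positively skewed", positive),
                      ("negatively skewed", negative)):
    print(f"{label}: mode = {mode(scores)}, median = {median(scores)}, "
          f"mean = {mean(scores):.2f}")
```

In both cases the median sits between the mode and the mean, and the mean is pulled furthest towards the tail.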
TEN HOT TIPS FOR TACKLING EXAMS
1. Find out about the format of the exam: the topics to be tested, the time allowed, the number and format of questions, the marks awarded, whether formulas are supplied.
2. Be prepared!
3. Spend the first 5 minutes browsing through the exam to see the work that is ahead of you. Note the harder questions—you may need to spend more time on them.
4. Spend the first minute of each question planning and thinking.
5. Keep an eye on the time. Don’t spend too much time on one question.
6. Write clearly. Draw big diagrams. Spread out your working and set it out neatly. Write down the page, not across.
7. Make sure you have answered the question. Did you remember to round off and/or include units? Did you use all of the relevant information given?
8. Attempt every question.
9. If the working-out to a hard question is taking too long, then it’s probably wrong. Don’t get bogged down. If you’re getting nowhere, retrace your steps, start again, or skip the question and return later with a fresh mind.
10. Once you have completed the exam, go over it again. Double-check your answers, especially the harder ones or those of which you’re unsure.
Study tips
Exercise 4-03: Features of a statistical display
1. Draw a curve representing a statistical display that:
(a) is symmetrical
(b) is positively skewed
(c) shows clustering
(d) is negatively skewed with clustering
(e) is symmetrical and bimodal
2. For each of the following displays state:
(i) if the data is symmetrical or skewed
(ii) if there are any clusters
(iii) if there are any outliers
(iv) how many peaks there are
(a) [Graph: Frequency (0 to 12) against Score (1 to 11)]
(b) [Plot on a number line from 4 to 9]
(c)
Stem | Leaf
1 | 3 4 6 6 6 7 8 9 9
2 | 0 7
3 | 1 2 2 5 7 8 8 9
4 | 0 2 3
5 | 2 9
(d) [Plot on a number line from 10 to 20]
(e) [Graph: Frequency (0 to 9) against Score (5 to 50)]
(f)
Stem | Leaf
4 | 1 3
5 | 5 5 6
6 | 0 3 5
6 | 5 6 8
7 | 2 6
8 | 5 5 8
3. The numbers of visits (or hits) to a popular Internet website were tabulated over a 10-hour period.
Time         Hits (× 1000)
1201–1300    1.3
1301–1400    0.8
1401–1500    0.4
1501–1600    2.1
1601–1700    2.6
1701–1800    4.5
1801–1900    3.9
1901–2000    5.3
2001–2100    2.3
2101–2200    1.2
Draw a histogram to represent this data and comment on the features of the display, such as shape, skew, clustering and peaks.
4. For the given information:
5 14 8 7 12 3 2 8 4 10 6 2 7 3 9 9 6 4 8 9
(a) draw a dot plot to display the data
(b) comment on the features of the display
5. Here is a set of data:
22 16 36 15 16 24 15 15 19 55 58 59 18 17 20 20 24 15 54 19
15 40 21 17 50 22 23 21 24 23 15 35 15 24 22 19 15 17 43 49
(a) Draw a stem-and-leaf display for this data set using stems 1, 2, 3, 4 and 5.
(b) Comment on the features of the display.
(c) Give the name of a possible population that this data could represent.
6. This dot plot represents the industrial accidents per month at a factory:
(a) What is the mean number of accidents in this period (correct to 1 decimal place)?
(b) What is the standard deviation (correct to 1 decimal place)?
(c) What could be a possible reason for the outlier 9?
(d) What are the mean and standard deviation if the outlier 9 is not included (correct to 1 decimal place)?
(e) Compare the means and standard deviations of the two groups of data.
INVESTIGATING OUTLIERS
Outliers often have the effect of raising or lowering a mean value but they can also affect the mode and median.
Example 8
A: 20 25 30 35 40 45
B: 20 25 30 35 40 60
C: 20 25 30 35 40 120
(a) Find the mean and median of each set of scores.
(b) The three data sets are the same except for the value of the last score. Investigate the effect of increasing the last score on the mean and median of set A.
(c) What are the values of the mean and median of set C if the outlier 120 is not included?
Solution
(a) In each set the two middle scores are 30 and 35, so the median = 32.5.
Set   Mean   Median
A     32.5   32.5
B     35     32.5
C     45     32.5
[Dot plot for question 6 of Exercise 4-03: Accidents/month on a scale from 0 to 9]
(b) Increasing the last score has no effect on the median. As the last score increases, the mean also increases. The outlier of 120 has the greatest effect on the value of the mean.
(c) Set C without the score 120 has a mean and median of 30.
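The investigation in Example 8 can be repeated in Python. This is a minimal sketch with our own variable names; only the last score differs between the three sets.

```python
from statistics import mean, median

sets = {
    "A": [20, 25, 30, 35, 40, 45],
    "B": [20, 25, 30, 35, 40, 60],
    "C": [20, 25, 30, 35, 40, 120],
}

for name, scores in sets.items():
    print(f"Set {name}: mean = {mean(scores)}, median = {median(scores)}")

# Part (c): set C with the outlier 120 left out.
without_outlier = sets["C"][:-1]
print(f"Set C without 120: mean = {mean(without_outlier)}, "
      f"median = {median(without_outlier)}")
```

The median depends only on the middle scores, so it stays at 32.5 however large the last score becomes, while the mean rises with it.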
Exercise 4-04: Investigating outliers
1. For each pair of data sets below find:
(i) the mean and median (correct to 1 decimal place)
(ii) the value of any outlier score
(iii) the effect on the mean and median of any outlier
(a) A: 10 12 14 16 18 20
B: 10 12 14 16 18 40
(b) A: 5 37 41 53 56
B: 36 37 41 53 56
(c) A: 3 4 8 9 12 14
B: 3 6 7 10 13 25
(d) A: 110 120 130 135 135 140 140
B: 55 115 135 140 145 145 150
2. For each data set below:
(i) find the mean, median and mode (correct to 1 decimal place where needed)
(ii) state the value of any outlier
(iii) say which measure of location is the most appropriate
(iv) sketch the shape
(a) 2 8 3 16 9 26 8
(b) 8 16 4 21 4 23 16 12
(c) 120 g 85 g 72 g 60 g 80 g 80 g
(d) 37°C 38°C 41°C 39°C 38°C 37°C 37°C
3. The 7 employees at the Bug and Beef Cafe earned the following wages in a week:
$350 $420 $510 $130 $635 $320 $460
(a) What is the mean wage?
(b) What is the median wage?
(c) Which is the more appropriate measure of location? Justify your answer.
(d) If each employee received a 10% pay rise, what would be the new mean and median wages?
(e) By what percentage would the mean increase?
(f) If the manager who earned $635 was not included in the data set, what would be the mean and median wages?
4. In a netball tournament of 5 matches, the numbers of points scored by three teams are:
The Wombats: 24 18 14 6 22
The Possums: 16 16 15 18 15
The Koalas: 36 8 14 16 12
(a) What are the mean and median for each team?
(b) Which team is more consistent? Why?
(c) An error was found in the recording for the Wombats. The score of 6 should have been 16. What are the new mean and median?
(d) Which team is more consistent now? Why?
5. Pam and Percy sell photocopiers. The numbers of copiers sold over a 10-week period are shown.
Pam: 1 2 3 3 5 6 7 8 12 25
Percy: 3 3 3 14 16 18 18 24 32 35
(a) What is the modal number of copiers sold by each person?
(b) What could you say about each person if you only knew the mode?
(c) What is the median number of copiers sold by each?
(d) What is the mean number of copiers sold by each?
(e) Which measure of location is the best measure to compare the sales performances of Pam and Percy?
(f) Who is the better salesperson? Why?
6. Choose 5 scores that have the same mean and median. What effect will adding a score of 100 have on the mean and median?
7. Rupert’s bookstore employs the following people with annual wages as shown:
2 store managers            $64 300
4 cashiers                  $34 200
3 part-time clerical staff  $28 500
10 salespeople              $46 500
2 part-time cleaners        $13 500
(a) What is the modal wage? Why?
(b) What is the median wage?
(c) What is the mean wage (to the nearest dollar)?
(d) Which measure would Rupert use to make the salaries appear higher?
(e) Which measure of location (average) best represents the average wage for an employee at Rupert’s bookstore?
DISPLAYING AND COMPARING TWO DATA SETS
Double stem-and-leaf plots
By representing two related data sets in a double (back-to-back) stem-and-leaf display, similarities and differences, such as clustering and averages (measures of location), can be easily seen.

Example 9
This double stem-and-leaf plot shows the numbers of dollars spent by a group of students visiting the Easter show.

       Boys | Stem | Girls
8 6 6 5 5 4 |  1   | 2 5 5 8
    6 4 3 2 |  2   | 0 2 4 5 5 5 6 7 8 9
      9 8 2 |  3   | 1 2 4
5 3 2 1 1 0 |  4   | 0 2
          2 |  5   |

(a) How many students went to the show?
(b) Give two observations on the shape and features of the data.
(c) Calculate the mean and standard deviation (to the nearest 5 cents) of amounts spent by boys and by girls.
(d) Considering all the information you have, do you think that boys are the bigger spenders? Why?
Solution
(a) 39 students, consisting of 20 boys and 19 girls.
(b) The amounts spent by the girls show clustering at $20–$29, whereas the amounts spent by the boys are more evenly spread out. The data for the girls is positively skewed.
(c) Girls: Mean = $25.80, standard deviation σn−1 = $8.00
    Boys: Mean = $30.10, standard deviation σn−1 = $12.40
(d) Yes. The average amount spent by a boy was $30.10. This was about $4 more than the average amount spent by a girl.
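The calculation in part (c) can be checked with a short script. This is only a sketch: the scores are read off the double stem-and-leaf plot in Example 9, the function name `summarise` is made up for illustration, and Python's `statistics.stdev` is assumed to correspond to the calculator's σn−1 key (it divides by n − 1).

```python
import statistics

# Amounts (in dollars) read from the double stem-and-leaf plot in Example 9.
boys = [14, 15, 15, 16, 16, 18, 22, 23, 24, 26,
        32, 38, 39, 40, 41, 41, 42, 43, 45, 52]
girls = [12, 15, 15, 18, 20, 22, 24, 25, 25, 25,
         26, 27, 28, 29, 31, 32, 34, 40, 42]


def summarise(amounts):
    """Return (mean, sample standard deviation), each rounded to 1 decimal place."""
    mean = statistics.mean(amounts)
    sd = statistics.stdev(amounts)  # sample standard deviation (divides by n - 1)
    return round(mean, 1), round(sd, 1)


boys_summary = summarise(boys)
girls_summary = summarise(girls)
print("Boys: mean, sd =", boys_summary)
print("Girls: mean, sd =", girls_summary)
```

Rounded to one decimal place, the results agree with the values quoted in the solution, which are given to the nearest 5 cents.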
Note: Use the statistical function on a scientific or graphics calculator.

Box plots
Whereas a stem-and-leaf plot gives a good visual comparison of the location of scores in a data set, a box plot (or box-and-whisker plot) shows the spread of the data. To compare two data sets, find a five-number summary for each and draw the box plots on the same scale.
The five-number summary consists of the lower extreme, the lower quartile (Q1), the median (Q2), the upper quartile (Q3) and the upper extreme. The box contains the middle 50% of scores, with each whisker representing a further 25% of the scores.

Example 10
The box plots below show the ranges of unleaded petrol prices in six cities in Australia.
[Figure: box plots of unleaded petrol prices for Canberra, Sydney, Melbourne, Adelaide, Brisbane and Darwin]
(a) (i) Which city’s petrol prices had the smallest range?
    (ii) Which city’s had the largest range?
(b) In which city was petrol generally cheapest? Give a possible reason for this.
(c) Canberra, Sydney and Melbourne had the same range of prices.
    (i) Which of these three cities had the lowest median price?
    (ii) In which of these cities would you be more likely to pay a higher price for petrol?
(d) Write down one observation about petrol prices in Canberra.

Solution
(a) (i) Adelaide (ii) Darwin
(b) Brisbane. The government tax on petrol is lower than in the other cities and so the price paid by the consumer is lower.
(c) (i) Sydney (ii) Melbourne
(d) They were evenly spread across the city. The distribution of petrol prices is symmetrical.
Technology: Box plots on a graphics calculator
Using a graphics calculator is an easy and excellent way to compare box plots.
1. Enter the individual scores of the first data set in List 1.
2. Enter the individual scores of the second data set in List 2.
3. Set the GRAPH to a median box plot (some calculators have a mean box plot as well).
4. Make sure that both Graph 1 and Graph 2 are ON.
5. Draw the graphs. Both graphs will appear on the screen at the same time, giving you an excellent comparison of the two data sets.
The calculator will also give you the five-number summary.

Example 11
Liz and George deliver pamphlets to letterboxes in the same neighbourhood. The numbers of pamphlets delivered per hour over 12 hours are shown:
Liz:    24 25 26 27 28 28 31 32 32 32 35 35
George: 15 18 21 24 25 29 31 31 32 38 38 45
(a) Represent the data in a double stem-and-leaf plot.
(b) Find a five-number summary for each data set and hence draw two box plots.
(c) Write down one observation that is best seen in the stem-and-leaf plot.
(d) Write down one observation that is best seen in the box plots.
(e) Which worker showed the greater interquartile range of pamphlets delivered? Which display shows this the best?
(f) Can we conclude that Liz is a better worker than George?

Solution
(a)
        Liz | Stem | George
            |  1   | 5 8
8 8 7 6 5 4 |  2   | 1 4 5 9
5 5 2 2 2 1 |  3   | 1 1 2 8 8
            |  4   | 5

(b) Liz: 24 25 26 27 28 28 31 32 32 32 35 35
Lower extreme = 24
Lower quartile = (26 + 27)/2 = 26.5
Median = (28 + 31)/2 = 29.5
Upper quartile = (32 + 32)/2 = 32
Upper extreme = 35
George: 15 18 21 24 25 29 31 31 32 38 38 45
Lower extreme = 15
Lower quartile = 22.5
Median = 30
Upper quartile = 35
Upper extreme = 45
[Figure: box plots for Liz and George drawn on the same scale, pamphlets/hour from 15 to 45]
(c) The stem-and-leaf plot shows that the number of pamphlets delivered per hour by Liz was always in the 20s and 30s.
(d) The box plots show the median number of pamphlets delivered per hour by both was about the same (around 30) but George’s range was greater.
(e) George. This is obvious from the box plots. The interquartile range is the length of ‘the box’.
(f) If an employer was looking for consistency, Liz is the more consistent worker as she had less variation in the number of pamphlets delivered per hour. However, for the total number of pamphlets delivered, both employees delivered approximately the same number of pamphlets. We cannot conclude that Liz is a better worker than George.
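The five-number summaries in part (b) can be reproduced with a short script. This is a sketch only: the helper name `five_number_summary` is hypothetical, and it assumes the method used in the solution, where each quartile is the median of the lower or upper half of the ordered scores. Library quartile functions often use other conventions and may give slightly different quartiles.

```python
import statistics


def five_number_summary(scores):
    """Return (lower extreme, Q1, median, Q3, upper extreme).

    Quartiles are the medians of the lower and upper halves of the ordered
    scores; the middle score is left out of both halves when n is odd.
    """
    ordered = sorted(scores)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0],
            statistics.median(lower_half),
            statistics.median(ordered),
            statistics.median(upper_half),
            ordered[-1])


liz = [24, 25, 26, 27, 28, 28, 31, 32, 32, 32, 35, 35]
george = [15, 18, 21, 24, 25, 29, 31, 31, 32, 38, 38, 45]

liz_summary = five_number_summary(liz)
george_summary = five_number_summary(george)
liz_iqr = liz_summary[3] - liz_summary[1]
george_iqr = george_summary[3] - george_summary[1]

print("Liz:", liz_summary, "IQR =", liz_iqr)
print("George:", george_summary, "IQR =", george_iqr)
```

The interquartile ranges (5.5 for Liz and 12.5 for George) are the lengths of the two boxes, which supports the answer to part (e).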
Think: Is the outlier in or out?
What to do with outliers?
• If an outlier is considered to be feasible, you can include it in the whiskers.
• If an outlier is considered to be an error, you need not include it in the whiskers but can represent it as a separate point.
[Figure: two box plots on a scale from 1 to 11, one labelled ‘Outlier excluded’ and the other ‘Outlier included’]
Can you describe a situation that these box plots could represent?

Exercise 4-05: Displaying and comparing two data sets
1. The numbers of dollars spent by a class of students visiting the Easter show were discussed in Example 9.

       Boys | Stem | Girls
8 6 6 5 5 4 |  1   | 2 5 5 8
    6 4 3 2 |  2   | 0 2 4 5 5 5 6 7 8 9
      9 8 2 |  3   | 1 2 4
5 3 2 1 1 0 |  4   | 0 2
          2 |  5   |

(a) Find a five-number summary for each data set.
(b) What is the interquartile range of each?
(c) Draw two box plots representing the data sets.
(d) What information is seen more easily in the box plots?
2. A teacher proposes that ‘People always underestimate the length of a piece of string’. A group of students decide to investigate this theory. They each estimate the lengths of several pieces of string and then measure the actual lengths.
[Figure: box plots of the actual and estimated lengths of string (cm), on a scale from 5 to 40]
(a) Write down the median of the estimated lengths.
(b) Write down the median of the actual lengths.
(c) What are the range and interquartile range for each data set?
(d) Would you agree with the teacher’s theory? Justify your answer.
3. Here are two sets of scores represented in a stem-and-leaf display.
(a) Find the range and interquartile range of each set.
(b) Find the median for each set.
(c) Draw box plots representing the data sets.
(d) Write down one observation from the stem-and-leaf plot and one from the box plots.
4. The pulse rates (in beats/minute) of two groups of people were recorded:
Group X: 77 72 80 77 91 62 72 82 79 58 75 67 69 66 98 81
Group Y: 81 86 64 74 92 75 73 81 64 52 82 79 80 53 62 78
(a) Draw a back-to-back stem-and-leaf plot.
(b) What is the mean of each group (correct to 1 decimal place)?
(c) What is the median of each group?
(d) Which is the better measure of location? Why?
(e) Comment on the shape of each group in the stem-and-leaf plot.
5. A group of 20 people had their pulse rates taken before and after an exercise class.
(a) By how much did the median pulse rate increase?
(b) The lower extreme ‘before’ and ‘after’ the class did not change. Give a possible reason for this.
(c) Give a possible reason for the outlier pulse rates in the ‘after exercise’ box plot.
(d) How many people had a pulse rate between 64 and 72 before the exercise class?
(e) What was the interquartile range of pulse rates after the class?
6. Eighteen people took part in the QUIT smoking program. The numbers of cigarettes smoked per day were recorded before the start of the program and 6 weeks later:
Before:        21 10 36 42 16 23 32 42 9 14 21 18 34 45 12 18 16 28
6 weeks later: 6 24 31 38 21 25 16 19 16 18 28 32 8 13 40 38 16 28
(a) What is the interquartile range for each data set?
(b) Draw two box plots on the same scale showing ‘before’ and ‘6 weeks later’.
(c) Is the QUIT program working for these people? Justify your answer.
7. The following data shows the average number of rainy days per month for two capital cities, and is supplied by the Bureau of Meteorology.

Month       J  F  M  A  M  J  J  A  S  O  N  D
Sydney     12 12 13 12 12 12 10 10 10 12 11 12
Melbourne   8  7  9 12 14 14 15 16 15 14 12 11
Set A Set B
255
8
5702
0123456789
3
852 4 72
4
[Figure for question 5: box plots of pulse rate (beats/min) before exercise and after exercise, on a scale from 40 to 140]
(a) Use a double stem-and-leaf plot to display the data.
(b) Draw box plots representing the data.
(c) Write down one observation from each display.
(d) ‘Melbourne is much wetter than Sydney.’ Do you agree with this statement? Justify your answer.
8. This display represents the lifetime in hours of two brands of light globes.

             Oso Bright | Stem | Brighta Longa
              6 5 5 4 2 |  10  | 3 4 5
    8 7 7 7 7 7 7 4 4 3 |  11  | 2 3 3 4 4 5 6
9 9 8 8 7 6 6 6 5 4 4 0 |  12  | 1 2 2 3 3 4 5 5 7 9 9 9
  8 8 8 7 7 7 6 5 4 3 1 |  13  | 0 2 3 3 4 4 4 5 6 6 8 8 9
        9 8 8 8 5 5 2 2 |  14  | 1 2 2 3 5 5 6 7 8 8
                7 7 5 1 |  15  | 0 3 3 4 6

(a) How many of each brand of light globe were tested?
(b) What is the mean lifetime of ‘Oso Bright’ globes (correct to 1 decimal place)?
(c) What is the mean lifetime of ‘Brighta Longa’ globes (correct to 1 decimal place)?
(d) Find the standard deviation of the lifetime of each brand (correct to 1 decimal place).
(e) Draw box plots representing the data sets.
(f) Which brand of globe would you say is better? Explain your answer.

COMPARING DATA SETS USING CHARTS
Radar chart
A radar chart is used to plot changes over a certain period or cycle, such as temperature during a 24-hour period, but it is also useful for comparing two sets of data.
A radar plotting chart (or polar graph paper) can be used to manually plot data, but the best option is to generate the radar chart from a spreadsheet package on a computer.

Example 12
This radar chart shows air pollution levels at two different workplaces over a 10-day period.
[Figure: radar chart ‘Air pollution levels’ with spokes for Day 1 to Day 10, a radial scale from 0 to 100, and one line each for the meatworks and the oil refinery]
(a) What was the air pollution level at the meatworks on day 10?
(b) What was the air pollution level at the oil refinery on day 1?
(c) On what days was the pollution level above 50 at the oil refinery?
(d) What were the maximum and minimum pollution levels? When and where did they occur?
(e) By comparing the areas contained within each graph, decide which workplace had the higher overall pollution level.
Solution
(a) About 60.
(b) About 45.
(c) Days 4, 6, 8 and 9.
(d) The maximum level was about 85 on day 4 at the oil refinery and the minimum level was about 25 on day 1 at the meatworks.
(e) The oil refinery graph seems to cover a slightly larger area and so had a higher level of pollution over the 10-day period.
Area chart
An area chart consists of different ‘areas’ or ‘bands’, each representing a data set over a given period of time. It shows the sum of the data over the given time as well as the relationship of the parts to a whole. Its main feature is to emphasise changes during this time. An area chart can be plotted on graph paper or drawn using the Chart option in a spreadsheet package. There are several chart subtypes that you can investigate.

Example 13
The table shows the numbers of males and females in full-time employment in January from 1990 to 2000.

Year               1990 1992 1994 1996 1998 2000
Males (×10 000)     350  160  200  320  360  450
Females (×10 000)    80   50   60  120  140  200

Construct an area chart showing the contribution of male and female employees to Australia’s full-time workforce.

Solution
Step 1 Draw a line graph for males using the values in the table and shade below it. This area represents the male employees.
Step 2 Draw a line graph for total employees by adding the values for females to those of males. Shade the area between the two lines. This area represents the female employees.
For example, in January 2000 the full-time workforce was 6 500 000, and this was made up of 4 500 000 males and 2 000 000 females.
[Figure: area chart ‘Australia’s full-time workforce’, number of employees (×10 000) from 0 to 700 against year from 1990 to 2000, with a band for males and a band for females]
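The two steps of the solution amount to computing two boundary lines: the male values, and the male values plus the female values. The sketch below works these out from the table in Example 13. The variable names are chosen for illustration, and the script only computes the boundaries rather than drawing the chart.

```python
years = [1990, 1992, 1994, 1996, 1998, 2000]
males = [350, 160, 200, 320, 360, 450]    # units of 10 000 employees
females = [80, 50, 60, 120, 140, 200]     # units of 10 000 employees

# Step 1: the lower line is the male series; the band below it is the males.
lower_line = males

# Step 2: the upper line is males + females (the total workforce);
# the band between the two lines is the females.
upper_line = [m + f for m, f in zip(males, females)]

for year, low, high in zip(years, lower_line, upper_line):
    print(f"{year}: males band 0 to {low}, females band {low} to {high}")

# Total full-time workforce in January 2000, converted back to people.
total_2000 = upper_line[-1] * 10_000
print("Total in 2000:", total_2000)
```

The height of the upper line in 2000 is 650, which matches the total of 6 500 000 quoted in the solution.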
Example 14
This area chart compares the unemployment rates for males and females from 1981 to 1997.
[Figure: area chart ‘Unemployment rates’, percentage from 0 to 25 against year from 1981 to 1997, with a band for males and a band for females]
(a) For the year 1985 find:
    (i) the unemployment rate for males
    (ii) the combined unemployment rate
    (iii) the unemployment rate for females
(b) What was the unemployment rate in 1993?
(c) What trends in the unemployment rate can be seen over the period from 1981 to 1997?
Note: Use your ruler to help you measure the vertical distances.

Solution
(a) (i) About 8%.
    (ii) About 17%.
    (iii) About 9% (subtract the 8% rate for males from the 17% total rate).
(b) About 22%.
(c) The unemployment rate rose from about 12% in 1981 to 17% in 1997. A fall in the unemployment rate occurred from 1985 to 1989 followed by a rise before another fall from 1993 to 1997. The unemployment rate was at its highest in 1993.

Technology: Using a spreadsheet to draw an area chart or radar chart
Radar charts and area charts are drawn in a similar way using a spreadsheet package. Use a spreadsheet to draw the area chart for Australia’s full-time workforce (Example 13).

Exercise 4-06: Comparing data sets using charts
1. The numbers of clear days for the ski resorts of Thredbo and Perisher in the Snowy Mountains area of NSW are shown in the radar chart.
[Figure: radar chart ‘Clear days in the ski fields of NSW’ with spokes for Jan to Dec, a radial scale from 0 to 10, and one line each for Thredbo and Perisher]
(a) How many clear days did Thredbo have in March?
(b) What was the greatest number of clear days at either resort? When was this?
(c) How many days were not clear in Perisher in July?
(d) Which data set contains the largest area? What does this area refer to?
(e) ‘The weather is better for skiing at Perisher.’ Do you agree with this statement? Justify your answer.
2. The area chart shows the number of wage earners employed in the public and private sectors in Australia over different years.
[Figure: area chart ‘Wage earners in Australia’, number of wage earners (×1000) from 0 to 8000 against year from 1991 to 1997, with a band for the public sector and a band for the private sector]
(a) How many wage earners were there in the public sector in 1997?
(b) What was the total number of wage earners in 1993?
(c) How many wage earners were employed in the private sector in 1991?
(d) What trends can be seen over the period from 1991 to 1997?
(e) What similarities or differences can be seen between the public and private sectors?
3. The area chart shows the seasonal rainfall for an island group in the Pacific Ocean.
[Figure: area chart ‘Seasonal rainfall for island group’, rainfall from 0 to 400 against season (Summer, Autumn, Winter, Spring), with bands for the southwestern, southeastern and northern regions]
(a) What was the rainfall for the southeastern region in summer?
(b) What was the rainfall for the northern region in spring?
(c) What was the total rainfall in autumn?
(d) The southeast is the wettest region. How is this shown in the graph? What could be a possible reason for one area getting more rain than the others?
(e) What trends in the rainfall can be seen over the year?
(f) What similarities or differences in rainfall can be seen between the regions?
4. Mr Pappadopoulos was admitted to hospital with a suspected stomach ulcer. His fluid intake (e.g. water and medicine) and output (e.g. urine) over a 24-hour period are summarised in the following table.

Time         6 am  8 am  10 am  12 noon      2 pm  4 pm
Intake (mL)   170   240    150      110       250    90
Output (mL)   140   150     80      180       130    90

Time         6 pm  8 pm  10 pm  12 midnight  2 am  4 am
Intake (mL)   150    60    180          170   160   210
Output (mL)    60   220    110          160   100   140

(a) Represent the data in a radar chart.
(b) By considering the areas enclosed by each data set, what observation can you make about Mr Pappadopoulos’s intake and output over the 24-hour period?
(c) Write down two other observations from your radar chart.
5. Clark and Lois earn extra money for writing articles for newspapers and magazines. They save these amounts in a joint holiday fund. Their monthly earnings last year are shown in the table.

Month                 Jan  Feb  Mar  Apr  May   Jun  Jul  Aug  Sep  Oct  Nov  Dec
Clark’s earnings ($)  370  240  530  570  780  1030  770  620  790  520  430  490
Lois’s earnings ($)   150  420  480  530  850  1280  920  650  810  480  390  350

(a) Represent the data in a radar chart.
(b) Represent the data in an area chart.
(c) What information is best seen in the radar chart?
(d) What trends are clearly seen in the area chart?
6. [Figure: area chart ‘Australia’s population by age groups’, percentage from 0 to 100 against year from 1921 to 2041, with bands for ages 0–14, 15–59 and 60+]
(a) What information is contained in the graph?
(b) How do you think data for the years 2021–2041 was obtained?
(c) Describe the features of the part of the graph for the 15–59 age group.
(d) In 1961, approximately what percentage of the population was between (i) 0 and 14 (ii) 15 and 59?
(e) Approximately what percentage of the population is expected to be over 60 in 2021?
(f) Give two facts about Australia’s population that can be seen in the graph.
(g) What does this area chart show about age groups in the future?

TWO-WAY TABLES
Two-way tables are used to compare two characteristics, for example gender and health.

Example 15
A National Health Survey in 1995 compared the number of adults in a population who exercised regularly to those who didn’t. The data is displayed in a two-way table.

         Exercise  No exercise
Male         3028         1532
Female       1804          946

(a) How many people were surveyed?
(b) What percentage of the people surveyed were female? Give your answer correct to 1 decimal place.
(c) What percentage of females exercised regularly?
(d) What percentage of the population did not exercise regularly?
(e) Comment on the statement ‘Men and women are similar in their exercise habits’.
Solution
(a) Number of people surveyed = 3028 + 1532 + 1804 + 946 = 7310
(b) Number of females = 1804 + 946 = 2750
    Percentage of people who were female = 2750/7310 × 100 = 37.6%
(c) Percentage of females who exercised = 1804/2750 × 100 = 65.6%
(d) Percentage of people who did not exercise = (1532 + 946)/7310 × 100 = 33.9%
(e) Number of males = 3028 + 1532 = 4560
    Percentage of males who exercised = 3028/4560 × 100 = 66.4%
    Since the percentage of females who exercised was 65.6% and the percentage for males was 66.4%, there is no significant difference, so the statement is supported by this data.
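The percentages in this solution all come from row, column and grand totals of the two-way table. The sketch below recomputes them from the survey counts in Example 15. The dictionary layout and the helper name `percent` are choices made for illustration only.

```python
# Two-way table from Example 15: rows are gender, columns are exercise habit.
table = {
    "Male":   {"Exercise": 3028, "No exercise": 1532},
    "Female": {"Exercise": 1804, "No exercise": 946},
}


def percent(part, whole):
    """Return part as a percentage of whole, correct to 1 decimal place."""
    return round(100 * part / whole, 1)


total = sum(sum(row.values()) for row in table.values())        # grand total
females = sum(table["Female"].values())                         # row total
males = sum(table["Male"].values())                             # row total
no_exercise = sum(row["No exercise"] for row in table.values()) # column total

pct_female = percent(females, total)
pct_females_exercising = percent(table["Female"]["Exercise"], females)
pct_no_exercise = percent(no_exercise, total)
pct_males_exercising = percent(table["Male"]["Exercise"], males)

print("People surveyed:", total)
print("Female:", pct_female, "%")
print("Females who exercised:", pct_females_exercising, "%")
print("Did not exercise:", pct_no_exercise, "%")
print("Males who exercised:", pct_males_exercising, "%")
```

Note that each percentage uses a different denominator: the grand total for parts (b) and (d), and the relevant row total for parts (c) and (e).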
Exercise 4-07: Two-way tables
1. The population of a town was surveyed in 1990 and 1997 to find out who had private health insurance.

             1990  1997
Private      4563  4048
No private   5577  8602

(a) What was the population of the town in 1990?
(b) What was the population of the town in 1997?
(c) What percentage of the town had private health insurance in 1990?
(d) What percentage of the town had private health insurance in 1997?
(e) Suggest a reason for the decrease in the percentage of people with private health insurance.
2. The percentages of Australians living in rural areas in 1911 and 1996 were compared.

                     1911  1996
Rural areas           43%
Urban (city) areas          87%

(a) Copy and complete the table.
(b) What percentage of Australians lived in urban areas in 1911?
(c) Comment on the differences between 1911 and 1996.
3. In one area there are three phone companies providing a service for mobile phones. The number of people using each company as a provider was recorded over a 3-year period.

       Telstra    Optus    Vodafone
1995   204 695   194 198    125 967
1996   315 144   216 276     86 510
1997   402 628   304 025    115 037
(a) How many people in this area owned mobile phones in (i) 1995 and (ii) 1997?
(b) What percentage of people used Telstra as their provider in 1996?
(c) What percentage of people used Optus as their provider in 1997?
(d) What share of the market did Vodafone have in (i) 1995 and (ii) 1997?
(e) What happened to Telstra’s share of the market from 1995 to 1997?
(f) What happened to Optus’s share of the market from 1995 to 1997?
(g) Comment on the statement ‘Telstra users doubled from 1995 to 1997’.
4. A survey was taken on whether to change the Australian flag or not. The results are shown in the table, grouped by age in years.

                  18–24  25–39  40–54  55–69
Change the flag     790    640    450    140
Keep the flag      1240    860    930    620

(a) How many people surveyed voted to (i) change the flag and (ii) keep the flag?
(b) What percentage of those surveyed wanted to keep the flag?
(c) What percentage of 18–24-year-olds wanted to change the flag?
(d) Which group was most definite in its response? What was this response? Why do you think this is so?
Study tips
TEN MORE HOT TIPS FOR TACKLING EXAMS
1. Bring all of your equipment: pens, paper, geometrical instruments, calculator (check calculator works).
2. Don’t worry if you feel nervous before an exam. This is normal and helps you perform better. However, being too casual or too anxious can be harmful to your performance.
3. Write in black or blue, not red. Don’t use liquid paper. Use pencil only for diagrams and constructions.
4. Read each question and identify what needs to be found.
5. You don’t need to be writing all of the time. What you are writing may be wrong and a waste of time. Spend some time thinking and considering the best approach.
6. Make sure your answer sounds reasonable and realistic, especially if it involves money or measurement.
7. If you make a mistake, cross it out with a neat line. Don’t scribble over it completely. You may still get marks for it if it is right. Don’t use liquid paper. It is both time-consuming and messy.
8. Don’t cross out or change an answer rashly. You may have been right the first time.
9. Don’t round off in the middle of a calculation. Round off at the end only.
10. Don’t be afraid to write words and sentences in your working, but don’t use abbreviations that you’ve just made up.
USING MULTIPLE DISPLAYS TO COMPARE DATA SETS
Relationships between data sets can often be interpreted and described more effectively by using more than one display. Looking at a variety of different displays allows a better comparison of data sets as some features are more obvious in one display than in another.
Every day in the media you will find examples of multiple displays describing data sets.
A company director compares this year’s figures with those of previous years. Medical researchers compare the effects of a new drug on men and women for similarities and differences. Local councils investigate the population mix in a new suburban area in order to provide the most appropriate facilities.
Let us start with two simple data sets and look at three different ways of comparing them.
Example 16
The data sets A and B are displayed as lists, dot plots, a frequency table and a clustered column graph.

Lists
A: 5 6 7 8 9
B: 5 5 7 9 9

Dot plots
[Figure: dot plots of Set A and Set B, each on a score axis from 5 to 9]

Frequency table
Score  Frequency
       Set A  Set B
5        1      2
6        1      0
7        1      1
8        1      0
9        1      2

Column graph
[Figure: clustered column graph of frequency (0 to 3) against score (5 to 9) for Set A and Set B]

(a) Comment on the shape and features of each data set.
(b) Find the mean, median and mode for each set.
(c) Find the range, interquartile range and standard deviation of each set.
(d) Comment on the benefits of using multiple displays to describe the data sets and to find measures of location and spread.

Solution
(a) Set A is symmetrical and flat.
    Set B is symmetrical and has two peaks; that is, it is bimodal.
(b) Set A: Mean = 7, Median = 7, No mode
    Set B: Mean = 7, Median = 7, Mode = 5, 9
(c) Set A: Range = 4, Interquartile range = 3, Standard deviation σn−1 = 1.58
    Set B: Range = 4, Interquartile range = 4, Standard deviation σn−1 = 2
(d) Multiple displays cater for differences in people’s preferences as well as allowing for different statistical needs. The dot plots and column graph give good visual representations of the data sets and are best used to describe the shape and features of the data sets. The measures of location and spread are best found from the lists or frequency table, although the other displays can also be used.
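The measures in parts (b) and (c) can be checked from the lists with a short script. This is a sketch under stated assumptions: the helper name `describe` is hypothetical, quartiles are taken as the medians of the lower and upper halves (leaving out the middle score when the number of scores is odd), a set in which every score occurs once is reported as having no mode, and `statistics.stdev` is assumed to match σn−1. `statistics.multimode` requires Python 3.8 or later.

```python
import statistics


def describe(scores):
    """Return measures of location and spread for a small data set."""
    ordered = sorted(scores)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    # If every score occurs exactly once, report no mode.
    modes = [] if len(set(ordered)) == n else statistics.multimode(ordered)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "modes": modes,
        "range": ordered[-1] - ordered[0],
        "iqr": statistics.median(upper_half) - statistics.median(lower_half),
        "sd": round(statistics.stdev(ordered), 2),  # sample standard deviation
    }


set_a = describe([5, 6, 7, 8, 9])
set_b = describe([5, 5, 7, 9, 9])
print("Set A:", set_a)
print("Set B:", set_b)
```

Both sets share the same mean, median and range, so only the mode, interquartile range and standard deviation separate them, which is the point the dot plots make visually.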
Exercise 4-08: Using multiple displays to compare data sets
1. Two groups, each containing 15 people, were given a small timer and asked to stop the timer when they thought 60 seconds had elapsed. The results, in seconds, for the ‘estimated minute’ are listed:
Group A: 34 43 45 50 62 64 65 65 66 68 69 70 71 75 81
Group B: 42 46 48 48 49 50 55 58 60 61 62 64 65 68 70
(a) Construct a double stem-and-leaf plot.
(b) Draw a clustered column graph with classes 30–39, 40–49, …
(c) Draw box plots to represent the data sets.
(d) Write down one piece of information that is clearly shown in each of the three displays you have drawn.
(e) Find the mean and standard deviation of each data set (correct to 1 decimal place).
(f) Comment on the ability of each group to estimate a minute.
2. A coach, deciding which team should win the ‘most consistent players’ award, compared the season’s scores for two netball teams:
The Birds: 55 23 35 51 56 48 70 52 64 72
The Bees: 18 41 23 46 48 24 56 27 36 48
(a) Display the data in a stem-and-leaf plot, box plots and a column graph.
(b) Use your displays to describe the shape and features of each data set.
(c) By finding suitable measures of location and spread, decide which team is more consistent. Justify your answer.
3. The populations of two regions were surveyed to find out who belongs to a workers’ union. The results are tabulated and shown in a back-to-back histogram.

Table
Age             15–24  25–34  35–44  45–54  55–64  65+
Eastern region   35%    49%    54%    51%    62%   11%
Western region   34%    36%    38%    42%    45%    4%

Back-to-back histogram
[Figure: back-to-back histogram ‘Union membership by age and region’, % belonging to a workers’ union from 0 to 70 on each side, age groups 15–24 to 65+, Eastern region on one side and Western region on the other]

(a) Write down two comparisons you can make between the two data sets.
(b) Use the information to comment on the statement ‘People in the eastern region are more likely to join a union’. Justify your answer.
4. The heights of a group of men and women were measured to the nearest centimetre. The data was then represented in a double stem-and-leaf display and also as box plots.
Stem-and-leaf
                    Men | Stem | Women
                      8 |  15  | 2 4 4 5 6 8 8 9
              9 7 7 5 2 |  16  | 0 2 3 3 4 5 5 5 5 6 7 8 8
9 9 8 8 6 5 5 4 4 4 2 1 |  17  | 2 3 4 4
                8 6 3 2 |  18  | 3
                      4 |  19  |
Box plots
[Box plots: heights of Men and Women; axis: Height (cm), 145 to 195]
(a) What information is better shown in the stem-and-leaf display?
(b) What information is better shown in the box plots?
(c) What are the medians and interquartile ranges of the heights of men and women?
(d) Calculate the means and standard deviations of the heights of men and women (correct to 1 decimal place).
(e) Write down two similarities between the heights of men and women.
(f) Write down two differences between the heights of men and women.
5. The table below gives the average number of rainy days per month for the Australian capital cities.
City         J    F    M    A    M    J    J    A    S    O    N    D
Adelaide     5    4    6    9   14   13   17   16   13   11    8    7
Brisbane    13   13   15   11   10    8    7    7    7    9   10   12
Canberra     7    7    7    8    9    9   10   12   10   11   10    8
Darwin      21   20   19    9    2    1    0    1    2    6   12   16
Hobart      11   10   11   12   14   14   15   15   15   16   14   13
Melbourne    8    7    9   12   14   14   15   16   15   14   12   11
Perth        3    3    4    8   13   17   18   16   13   10    7    4
Sydney      12   12   13   12   12   12   10   10   10   11   11   12
(a) Draw at least two suitable displays illustrating the data.
(b) Calculate the mean and median number of rainy days for each city.
(c) Find the range and standard deviation of the number of rainy days for each city.
(d) Use these statistical measures and displays to determine:
(i) which city is driest
(ii) which city is wettest
(iii) which city has the most consistent pattern of rainy days
(iv) which city has most variation in the number of rainy days per month
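The measures asked for in question 5 can be checked with a short script. The following is a minimal sketch for two of the eight cities, assuming the Melbourne and Sydney values as read from the rainy-days table and assuming the population standard deviation (dividing by n); the name `summarise` is illustrative only.

```python
from statistics import mean, median, pstdev

# Average number of rainy days per month (J to D), as read from the table
# in question 5. Only two of the eight cities are shown.
rainy_days = {
    "Melbourne": [8, 7, 9, 12, 14, 14, 15, 16, 15, 14, 12, 11],
    "Sydney": [12, 12, 13, 12, 12, 12, 10, 10, 10, 11, 11, 12],
}


def summarise(days):
    """Return measures of location and spread for one city."""
    return {
        "mean": mean(days),
        "median": median(days),
        "range": max(days) - min(days),
        # Assumption: population standard deviation (divide by n).
        "sd": pstdev(days),
    }


summaries = {city: summarise(days) for city, days in rainy_days.items()}
for city, s in summaries.items():
    print(f"{city}: mean = {s['mean']:.1f}, median = {s['median']}, "
          f"range = {s['range']}, sd = {s['sd']:.1f}")
```

Adding the other six cities to `rainy_days` extends the comparison without changing the function.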
6. Use the table in question 5 to consider the rainfall per season in Australia. The seasons are summer (D, J, F), autumn (M, A, M), winter (J, J, A) and spring (S, O, N).
(a) Draw at least two suitable displays to illustrate the data.
(b) Calculate the mean, median, range and standard deviation for each season.
(c) Use these statistical measures and displays to determine:
(i) which is the wettest season
(ii) which is the driest season
(d) Comment on the statement ‘Rainfall in Australia does not vary much between seasons’.
Just for the record
BABY BOOMERS
After World War II finished in 1945, there was a ‘baby boom’ in Australia, New Zealand, Britain and North America. This rapid growth in the number of babies born lasted until the mid-1960s. People born during this time are referred to as ‘baby boomers’. The result of the large increase in births during this period will affect Australia’s population statistics as this group of people age. The two graphs show the baby boomer population moving from 2001 to 2031.
In 2031, the baby boomers will be over 65 years. Approximately how many more persons aged over 65 will there be in 2031 compared with 2001?
[Figure: Age distribution of Australian population, 2001 and 2031; two column graphs of population (× 1000, 0 to 1600) by age group (0–5, 6–10, …, 81–85, 86+), with the baby boomers marked on each]
Modelling activity: Analysing data sets
One of the main roles of a statistician is to critically analyse related data sets and report on the findings. Businesses often use the results of an analysis for promotional purposes and companies report to their shareholders.
To critically analyse data sets:
- Draw suitable displays.
- Find measures of location and spread.
- Write a report on the relationship between the data sets, commenting on any similarities and differences between the data sets, unusual features, outliers or patterns.
- Draw conclusions and make recommendations.
1. Twenty overweight people enrolled in a weight loss program at Rhonda’s Weight Loss Centre. Their weights (in kilograms) before and after the program were:
Before: 128 159 85 76 93 125 102 74 88 82 97 84 106 125 76 80 92 77 115 102
After: 75 72 64 95 58 62 120 93 85 72 102 65 73 62 56 60 105 82 52 64
Critically analyse the data and report back to Rhonda on how she can best advertise the success of her centre.
2. The times taken (in seconds) to check a basket of 20 grocery items at 15 automated and 15 manual checkouts were:
Automated: 45 58 63 43 75 69 84 65 96 73 90 61 84 72 96
Manual: 95 105 82 110 125 148 136 137 86 99 145 119 101 97 124
Critically analyse the data and report back to the manager of a store on the benefits of installing automated checkouts based on this data.
Investigation: Collecting and analysing data sets
Obtain published data from the media or Internet, collect data through experiment or simulation, or use data already collected for your statistics file. Critically analyse the data sets by drawing appropriate graphs and tables, determining measures of location and spread, and writing a report on your findings.
Some suggested data sets are:
- the performances of two sporting teams (e.g. football or netball) in a season
- the performance of a sporting team in home and away matches
- pulse rates of males and females before and after exercise
- spending patterns of men and women
- heights and weights of males and females
- scores in two subject tests
- waiting times at a checkout on different days
- pollution levels at different times in the same city or in two different cities
- rainfall in two different towns or regions
- part-time incomes of male and female students.
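The step 'Find measures of location and spread' applies to any pair of data sets, including those collected for the investigation. The following is a minimal sketch using the checkout times from question 2 of the modelling activity. It assumes quartiles are found as the medians of the lower and upper halves of the sorted data (excluding the middle score when the number of scores is odd) and uses the population standard deviation; other quartile conventions give slightly different values. The name `summarise` is illustrative only.

```python
from statistics import mean, median, pstdev

# Checkout times (seconds) from question 2 of the modelling activity.
automated = [45, 58, 63, 43, 75, 69, 84, 65, 96, 73, 90, 61, 84, 72, 96]
manual = [95, 105, 82, 110, 125, 148, 136, 137, 86, 99, 145, 119, 101, 97, 124]


def summarise(data):
    """Return a five-number summary plus mean, IQR and standard deviation."""
    s = sorted(data)
    half = len(s) // 2
    lower = s[:half]
    # Assumption: the middle score is left out of both halves when n is odd.
    upper = s[half + 1:] if len(s) % 2 else s[half:]
    q1, q3 = median(lower), median(upper)
    return {
        "min": s[0],
        "q1": q1,
        "median": median(s),
        "q3": q3,
        "max": s[-1],
        "iqr": q3 - q1,
        "mean": mean(s),
        # Assumption: population standard deviation (divide by n).
        "sd": pstdev(s),
    }


auto_summary = summarise(automated)
manual_summary = summarise(manual)
for name, s in (("Automated", auto_summary), ("Manual", manual_summary)):
    print(f"{name}: min = {s['min']}, Q1 = {s['q1']}, median = {s['median']}, "
          f"Q3 = {s['q3']}, max = {s['max']}, IQR = {s['iqr']}, "
          f"mean = {s['mean']:.1f}, sd = {s['sd']:.1f}")
```

The five-number summaries feed directly into box plots; the written report and recommendations still have to be argued from the figures.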
Investigation: Population pyramids
A population pyramid displays information about the ages of a population. The oldest age group is at the top and hence the display resembles a pyramid. A simple population pyramid (or back-to-back histogram) is shown in question 3 of Exercise 4-08 (page 137).
1. This population pyramid shows a profile of the Australian population from 1911 to 2051. It is actually three pyramids together, showing the years 1911, 1996 and the population projection for 2051.
[Population pyramid: Profile of Australia’s population, 1911–2051; Males and Females; horizontal axis: Thousand (0 to 200 on each side); vertical axis: Age (0 to 100+); pyramids for 1911, 1996 and 2051]
(a) Compare the numbers of males and females over 60 in 1911 and in 2051.
(b) How many females were 35 in 1996?
(c) How many males were 20 in 1911?
(d) Find one age group where there are more males.
(e) Find one age group where there are more females.
(f) Write down three differences between the population in 1911 and in 1996.
2. Investigate the age of the Aboriginal and Torres Strait Islander population and compare with the general Australian population using a population pyramid. You can find the necessary information at the following website: www.abs.gov.au.
Chapter review
Statistical distributions
1. Collecting and displaying data
2. Summary statistics
3. Features of a statistical display
4. Investigating outliers
5. Displaying and comparing two data sets
6. Comparing data sets using charts
7. Two-way tables
8. Using multiple displays to compare data sets
Topic summary
This chapter, Statistical distributions, revises and extends the statistics covered in the Preliminary Course. It compares two data sets in a variety of displays, including double stem-and-leaf plots, box plots, radar charts and area charts. You also used measures of location and spread to compare data sets and learned how to interpret information from different displays. Be sure to include area charts and the effect of outliers in your summary. You could also include a glossary of statistical terms.
Make a summary of this topic. Use the chapter outline above as a guide. An incomplete mind map has also been started below. Use your own words, symbols, diagrams, boxes and reminders. Use the questions in Your say below to think about your understanding of the topic. Gain a ‘whole picture’ view of the topic and identify any weak areas.
[Mind map: Statistical distributions, with branches Area charts, Two-way tables, Stem-and-leaf plots, Radar charts, Box plots, Outliers, Comparing data sets, Summary statistics, Measures of spread, Measures of location]
Your say: Reflecting about the topic
- Have you satisfied the outcomes listed at the front of this chapter?
- What was the most important thing that you learned?
- How did you feel about the topic? Did you enjoy it?
- What was new?
- What are your weaknesses? What will you need to study more?
- How will you revise and summarise this topic?
Chapter assignment
1. Classify the data as (i) quantitative and discrete, (ii) quantitative and continuous, or (iii) categorical.
(a) numbers of cows on farms in NSW
(b) numbers of letters delivered each day to households in Campbelltown
(c) annual water consumption in Sydney
(d) numbers of workers who travel to work by public transport
(e) ages of first-year university students
(f) favourite movie
2. Find the mean, median and mode for each data set and suggest a possible population from which each set of data was taken.
(a) 10 11 11 12 12 12 13 13
(b) 3 3 3 4 4 4 5 5 5
(c) 72 72 73 75 76 83 84 85 87 94
3. Consider the set of scores: 3 4 5 5 8 9 12 15 18 20
(a) What is the mean?
(b) What is the median?
(c) Without doing any calculations, say what the effect on the mean and median would be of adding:
(i) one score of 30
(ii) one score of 50
(iii) a score of zero
(iv) a score of 10
(d) What would be the effect on the mean and median if each score was:
(i) increased by 2?
(ii) decreased by 3?
4. For each statistical display below:
(i) find the mean and standard deviation of the data set (to 1 decimal place)
(ii) describe the shape and features of the distribution
(a) [Frequency display: Wages from part-time job (× $10), axis 6 to 20; Frequency 0 to 15]
(b) [Display: No. of overseas trips, axis 2 to 10]
(c) [Frequency display: Hours spent doing homework per day, axis 1 to 9; Frequency 0 to 15]
5. Match the box plots to the following data sets.
[Box plots A, B, C and D; axis: Age (years), 0 to 90]
(a) a random sample of 30 spectators at a football match
(b) a group of 30 senior citizens on a bus trip
(c) a group of 30 dancers at a nightclub
(d) two teachers taking a group of 30 primary students to the zoo
6. A factory produces small metal rods, designed to have a mass of 50 g. Samples were taken from two different machines and compared.
      Machine A | Stem | Machine B
                |  4   | 4
      9 9 9 9 9 |  4   | 8 9
3 2 2 1 1 0 0 0 |  5   | 0 0 0 0 0 1 1 1 2 3 3 3 4 4
    8 8 7 6 6 5 |  5   | 5 6 7 7 8 9
              0 |  6   | 1 2 3
                |  6   | 5
(a) Find the mean and standard deviation for each machine (correct to 1 decimal place).
(b) What are the median and interquartile range for machine A?
(c) What are the median and interquartile range for machine B?
(d) Construct box plots for the two data sets.
(e) Comment on the statement ‘Machine B produces rods of a more consistent mass than machine A’.
7. This back-to-back stem-and-leaf plot compares the maximum average monthly temperatures (°C) for two towns in NSW.
   Goulburn | Stem | Grafton
9 6 6 3 2 1 |  1   |
7 6 5 4 2 0 |  2   | 0 0 1 2 3 4 6 6 8 8 9
            |  3   | 0
(a) What was the highest average monthly temperature for Grafton?
(b) What was the range of temperatures for each town?
(c) What was the median temperature for each town?
(d) Name two features of the data sets that differ.
8. The monthly rainfall (in millimetres) for two areas in Australia for July to December 1999 is given in the table.
Month    Jul   Aug   Sep   Oct   Nov   Dec
Area 1     2   165    60    92   160    94
Area 2    23    11    14     2     5     6
(a) Represent the data in an area chart.
(b) Write down two observations about the rainfall in 1999.
(c) Give one difference between area 1 and area 2.
9. [Display: Employed persons by occupation and sex, Australia, 1996; Females and Males; axis: (× 1000), 0 to 900; occupations: Labourers and related; Elementary clerical, sales and service; Intermediate production and transport; Intermediate clerical, sales and service; Advanced clerical and service persons; Tradespersons and related; Associate professionals; Professionals; Managers and administrators]
(a) State the type of display and what information is being displayed.
(b) Is the data quantitative or categorical? Justify your answer.
(c) In which occupations were more females employed than males in 1996?
(d) Which occupation had the biggest gender difference?
(e) Comment briefly on the strengths and weaknesses of the display.
10. This population pyramid (back-to-back histogram) shows Australia’s population in 1995 by age and gender.
[Population pyramid: Australia’s population by age and gender, 1995; Male and Female; horizontal axis: Population (× 1000), 0 to 1500 on each side; age groups 0–9, 10–19, 20–29, 30–39, 40–49, 50–59, 60–69, 70+]
(a) What was the female population in the 30–39 age group?
(b) What was the population of males in the 10–19 age group?
(c) How many people were aged 60 and over?
(d) Does the graph support the statement ‘Women live longer than men’? Justify your answer.
11. The results for two classes in a Geography test are listed below:
Class A: 43, 50, 54, 63, 75, 48, 68, 72, 65, 63, 70, 69, 55, 64, 73, 66, 50, 59, 68, 71, 73, 64
Class B: 35, 89, 42, 79, 45, 90, 64, 53, 66, 82, 71, 63, 32, 79, 44, 92, 46, 63
(a) Represent the data in a back-to-back stem-and-leaf plot with stems 3, 4, …
(b) Use your display to comment on the shape of each class data set by describing any outliers, clusters, peaks, symmetry or skew.
(c) Find the range and interquartile range for each class.
(d) Find the standard deviation for each class (correct to 1 decimal place).
(e) Which measure would best describe the spread of the two data sets? Why?
(f) Which measure of location, the mean or median, would better describe each data set? Why?
12. This area chart shows the percentage of Australia’s national income saved by households and companies.
[Area chart: Savings by households and companies, Australia; vertical axis: % of national income (0 to 25); horizontal axis: Year (1962–63 to 1998–99); areas: Household saving, Corporate saving (or company profits)]
(a) What percentage was saved by households in 1980–81?
(b) What was the combined percentage saved in 1992–93?
(c) In which period was the corporate saving highest?
(d) What were the similarities and differences in saving patterns of households and companies?
(e) What trends in savings can be seen from 1962 to 1999?
13. A random sample of people took part in a survey to see who had been to the dentist in the last 6 months.
Age          15–19   20–24   25–29   30–34   35–39   40–44   45–49   50+
Dentist        21      18      54      43      58      43      38     15
No dentist     24      35      51      47      52      49      38     20
(a) How many people were surveyed?
(b) How many of those surveyed had been to the dentist?
(c) What percentage of the 40–44 age group had been to the dentist?
(d) What percentage of those surveyed were under 25 and had been to the dentist?
(e) What percentage of those surveyed were between 25 and 39?
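Reading a two-way table such as the dentist survey in question 13 comes down to row totals, column totals and the grand total. The following is a minimal sketch of that bookkeeping, using the counts from the survey table; the variable names are illustrative only, and the age-group labels use plain hyphens for convenience.

```python
# Counts from the two-way table in question 13 (dentist survey).
age_groups = ["15-19", "20-24", "25-29", "30-34", "35-39", "40-44", "45-49", "50+"]
dentist = [21, 18, 54, 43, 58, 43, 38, 15]
no_dentist = [24, 35, 51, 47, 52, 49, 38, 20]

# Column totals: number of people surveyed in each age group.
column_totals = [d + n for d, n in zip(dentist, no_dentist)]

# Row totals and grand total.
dentist_total = sum(dentist)
no_dentist_total = sum(no_dentist)
grand_total = dentist_total + no_dentist_total

# Percentage of each age group that had been to the dentist
# (a percentage of the column total, not of the grand total).
pct_by_group = {
    age: 100 * d / total
    for age, d, total in zip(age_groups, dentist, column_totals)
}

print(f"Surveyed: {grand_total}, been to the dentist: {dentist_total}")
for age, pct in pct_by_group.items():
    print(f"{age}: {pct:.1f}% had been to the dentist")
```

The choice of denominator matters: a percentage 'of the 40–44 age group' divides by that column's total, while a percentage 'of those surveyed' divides by the grand total.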