1 General Mathematics (Preliminary Course) | Displaying & Interpreting Single Data Sets (DS2) Summary Statistics (DS3)
Data & Statistics (DS)
Displaying and Interpreting Single Data Sets (DS2)
Summary Statistics (DS3)
Name ..............................................................................
G. Georgiou
• Draw a Radar Chart to Display Data
A radar chart should be drawn with the following key features:
~ A title
~ A key
~ Values represented by dots
~ An appropriate scale
~ Each vertex clearly labelled
~ A straight line connecting each dot to the next
~ A shape like a regular polygon (octagons, decagons and dodecagons are common)
Here is an example of a radar chart. It shows a person’s fluid intake over a 12-hour period.
[Radar chart: scale from 0 to 250 in steps of 50; twelve vertices labelled at two-hour intervals starting from 6am; labels: Fluid Level, Intake]
Example 1
The radar chart shows the mean maximum daily temperature recorded at Observatory Hill in Sydney for each month from May 2006 to April 2007.
(a) What was the coldest month?
(b) What was the hottest 3-month period?
(c) In what month did it warm up considerably from the previous month?
(d) Comment on the climate of Sydney in terms of the mean maximum daily temperature throughout the year. On the whole, is Sydney a comfortable place to live?
Example 2
The radar chart below shows the total rainfall recorded at Observatory Hill in Sydney for each month from May 2006 to April 2007.
(a) What were the 3 individual months with the highest rainfall?
(b) What was the rainfall in August 2006?
(c) What was the driest month?
(d) Was the winter period shown relatively dry or wet?
(e) Estimate the total rainfall for the 12-month period.
(f) From this radar chart, does it appear easy or difficult to predict how much rain there will be in any particular month?
Activity Ex 1.09 Q 1, 2, 4, 5
Example 3
The following table shows the average monthly temperatures in degrees Celsius in the Spanish city of Barcelona.

Month       Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
Temp (°C)     8    9   11   12   16   20   22   23   21   16   12   10

Draw a radar chart displaying this information.
[Blank radar chart for Question 5: scale from 0 to 4000 in steps of 1000; vertices labelled Jan to Dec]
[Blank radar chart for Example 3: scale from 0 to 25 in steps of 5; vertices labelled Jan to Dec]
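As a rough sketch of how the dots of a radar chart are positioned, the snippet below places the Barcelona temperatures from Example 3 on twelve evenly spaced axes. The function name `radar_points` and the layout convention (first axis pointing straight up, remaining axes following clockwise) are assumptions made for illustration; the worksheet does not prescribe them.

```python
import math

# Average monthly temperatures for Barcelona, taken from the table in Example 3.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
temps = [8, 9, 11, 12, 16, 20, 22, 23, 21, 16, 12, 10]


def radar_points(values):
    """Return one (x, y) point per value, spaced evenly around the centre.

    Assumed convention: the first axis points straight up and the
    others follow clockwise. The distance from the centre is the value.
    """
    n = len(values)
    points = []
    for i, value in enumerate(values):
        angle = 2 * math.pi * i / n      # angle measured clockwise from "up"
        x = value * math.sin(angle)
        y = value * math.cos(angle)
        points.append((round(x, 2), round(y, 2)))
    return points


points = radar_points(temps)
for month, temp, (x, y) in zip(months, temps, points):
    print(f"{month}: {temp:>2} degrees C -> dot at ({x:6.2f}, {y:6.2f})")
```

Joining consecutive dots with straight lines, and joining the last dot back to the first, gives the closed shape of the radar chart.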
• Interpret the Various Displays of Single Data Sets
H.S.C. Question (4)
H.S.C. Question (5)
Example 6
Wallaby is an app that combines all your credit cards together virtually. The above infographic shows the number of credit cards a Wallaby user has in relation to the total number of Wallaby users. If there are 25 000 Wallaby users, how many carry 6 or more credit cards?
Example 7
Example 8
• Identify the Misrepresentation of Data
Data can be presented in a way that misrepresents it. The following table shows some common ways in which graphs are misused.
Explain how the following graph is misleading.
H.S.C. Question (9)
Example 10
Activity Ex 1.01 Q 2, 3, 4, 5, 6, 7, 8, 10 Ex 1.02 Q 1, 2, 4, 6
Example 11
The above graph was used by CNN to present the opinion of politicians on an issue.
(a) Explain how the above graph is misleading.
(b) Redraw the graph to eliminate the misleading factor.
(c) How does your redrawn graph change the interpretation of the data represented?
• Select and Use the Appropriate Statistic (Mean, Median or Mode) to Describe Features of a Data Set (e.g. median house price or modal shirt size)
• Assess the Effect of Outlying Values on Summary Statistics for Small Data Sets
A measure of central tendency is essentially an average. It indicates where the middle (or centre) of the data tends to lie. There are 3 measures of central tendency:
Mean ($\bar{x}$): ........
Median: ........
Mode: ........

Formula (provided on HSC formula sheet)
The average of a set of scores can be given by:
$$\bar{x} = \frac{\sum x}{n}$$
where $\sum x$ is the sum of all the scores and $n$ is the number of scores.

Example 12
Calculate the mean, median and mode of the following data sets.
(a) 3, 7, 6, 9, 2, 4, 2
(b) 12, 15, 18, 32, 22, 25

Example 13
In which of the following data sets are the mean, median and mode all equal?
(A) 1, 1, 2, 3, 3
(B) 1, 1, 1, 2
(C) 3, 4, 4, 5
(D) 8, 10, 12

Example 14
Two biology classes sat for the same assessment task and the results are shown in the table.

Class   Number of Students   Average Mark (%)
11.1    25                   52
11.2    20                   70

The average mark in the assessment task for the two classes combined is:
(A) 56%
(B) 60%
(C) 61%
(D) 77%
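One way to check answers to questions like Examples 13 and 14 is with Python's standard `statistics` module. The sketch below is only an illustration: the helper names `all_equal` and `combined_mean` are made up for this example, and a data set with more than one most-frequent score is treated as not having a single mode.

```python
from statistics import mean, median, multimode

# Data sets from Example 13.
data_sets = {
    "A": [1, 1, 2, 3, 3],
    "B": [1, 1, 1, 2],
    "C": [3, 4, 4, 5],
    "D": [8, 10, 12],
}


def all_equal(scores):
    """True when there is exactly one mode and it equals the mean and median."""
    modes = multimode(scores)   # list of every most-frequent score
    return len(modes) == 1 and mean(scores) == median(scores) == modes[0]


def combined_mean(groups):
    """Mean of several groups combined, given (group size, group mean) pairs.

    The combined mean is the total of all marks divided by the total
    number of students, not the average of the group means.
    """
    total = sum(size * group_mean for size, group_mean in groups)
    count = sum(size for size, _ in groups)
    return total / count


for label, scores in data_sets.items():
    print(label, mean(scores), median(scores), multimode(scores), all_equal(scores))

# Classes 11.1 and 11.2 from Example 14: (number of students, average mark).
print(combined_mean([(25, 52), (20, 70)]))
```

The second function shows why the larger class pulls the combined average towards its own mark.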
There are 3 measures of central tendency because each is more appropriate in certain situations. Consider the next examples.

Example 15
(a) Determine the mean and median of the following data set:
6, 4, 3, 2, 5, 4, 3, 2, 2, 2
(b) Calculate the mean and median of the following data set:
6, 4, 3, 2, 5, 4, 3, 2, 2, 2, 10
(c) The only difference between (a) and (b) is the outlier score of 10. What effect does this outlier score have on these 2 measures of central tendency?

Example 16
Determine which measure of central tendency should be used to answer the following question.
“What is the average pants size of an Australian male?”
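The effect of an outlier on the mean and the median can be explored with a short script. This is a minimal sketch using the two data sets from the outlier example above; the variable names are chosen only for illustration.

```python
from statistics import mean, median

# Data set (a), and data set (b) which adds the outlier score of 10.
without_outlier = [6, 4, 3, 2, 5, 4, 3, 2, 2, 2]
with_outlier = without_outlier + [10]

for label, scores in [("without outlier", without_outlier),
                      ("with outlier", with_outlier)]:
    # The mean uses every score, so a single extreme score shifts it.
    # The median depends only on the middle position of the sorted scores.
    print(f"{label}: mean = {mean(scores):.2f}, median = {median(scores)}")
```

Comparing the two printed lines shows which measure moves when the outlier is added and which one stays put.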
The following is a guide as to when to use each measure of central tendency.
Mode: ........
Median: ........
Mean: ........

Example 17
Decide which M (mean, median or mode) is correct for each statement.
(a) This M takes all scores in the data set into account. ........
(b) This M is one of the scores if there is an odd number of scores. ........
(c) Half of the scores are above this M, the other half are below. ........
(d) There can be more than one M in a set of data. ........
(e) This M often needs to be rounded to decimal places. ........
(f) This M can also be used for categorical data. ........
(g) This M can be distorted by many outliers. ........
(h) This M must be one of the scores in the data set. ........
Activity Ex 9:01 Q 1, 3 Ex 9:02 Q 1, 3 Ex 9:03 Q 3 – 6, 8
Example 18
Which measure of location is most appropriate for describing each average?
(a) The average exam mark for the class ........
(b) The average shirt size for teenage girls ........
(c) The average rent paid for a house in Sydney ........
(d) The average screen size of a notebook computer ........
(e) The average mass of football players in a team ........
(f) The average brand of mobile phone ........
• Construct a Dot Plot from a Small Data Set and Interpret the Dot Plot
• Calculate the Median from a Stem-and-Leaf Plot

A dot plot should contain the following features:
~ A title
~ A labelled horizontal axis
~ An appropriate scale
~ Dots clearly placed and lined up in the appropriate rows and columns
Here is an example of a dot plot. It shows the temperature of patients at Liverpool Hospital.
Example 19
Use the above dot plot to answer the following questions.
(a) How many patients had their temperature taken?
(b) Determine the value of the outlier temperature.
(c) Determine the temperatures for which there is a cluster of data.
(d) What percentage of temperatures were less than 38 °C?
Example 20
Draw a dot plot to display the following data, which represents the scores that learner drivers received in their driving test.

Example 21
The current mean of the following dot plot is 5.9 (correct to 1 decimal place). What score needs to be added to this dot plot to increase the mean to 6?
A stem-and-leaf plot should contain the following features:
~ A title
~ A labelled stem (usually gives the ‘tens’ place value of a number)
~ A labelled leaf (usually gives the ‘ones’ place value of a number)
Here is an example of a stem-and-leaf plot. It shows the ages of students’ siblings in a class.
Example 22
Answer the following questions based on the stem-and-leaf plot above.
(a) What age group are most of the siblings in?
(b) How many siblings were older than 20?
(c) How old was the oldest sibling?
(d) How old is the fifth youngest sibling?
(e) One student has twin sisters. How old could they be?
Activity Ex 9:01 Q 2, 5, 7 Ex 9:02 Q 2, 5, 6 Ex 1:08 Q 1, 2, 6, 7, 11
(f) Explain why you cannot tell from this stem-and-leaf plot how many students were surveyed.
(g) Calculate the mean, median and mode for the stem-and-leaf plot above.
(h) The teacher forgot to add one of the siblings’ ages. If the age was supposed to go between the 3 and 5 in the “1” stem, what are all the possibilities for the sibling’s age, given that this stem-and-leaf plot is an ordered one?
• Determine the Mean for Larger Data Sets of either Ungrouped or Grouped Data using the Statistical Functions of a Calculator
A frequency distribution table is a combination of 4 columns.

Score (x)   Tally     Frequency (f)   Frequency × Score (fx)
1           |||       3               3
2           ||||      4               8
3           |||| |    6               18
4           ||||      4               16
5           ||        2               10
6           |||       3               18
Σ

The mean for this data set can be found using this formula:

Formula (provided on HSC formula sheet)
The average of a set of scores can be given by:
$$\bar{x} = \frac{\sum fx}{\sum f}$$
where $\sum fx$ is the total of the fx column in a frequency table and $\sum f$ is the total of the f column in a frequency table.

Alternatively, the mean for large data sets like this can be calculated quite easily using a calculator.

x     f
21    6
22    7
23    3
24    7
25    5

1. Clear the memory: SHIFT 9 3 = AC
2. Turn the frequency on: SHIFT MODE ⇓ 3 1
3. Enter Stat mode: MODE 2 1
4. Enter the scores in the first column and the frequencies in the second column. The default frequency is 1. When you are done, press AC.
5. Find the mean: SHIFT 1 4 2
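The frequency-table method for the mean (total of the fx column divided by the total of the f column) can also be sketched in a few lines of Python, as a cross-check on the calculator result. The function name `frequency_table_mean` is hypothetical, and the two tables are the ones shown above.

```python
def frequency_table_mean(table):
    """Mean of a frequency table given as (score, frequency) pairs."""
    sum_fx = sum(score * freq for score, freq in table)   # total of the fx column
    sum_f = sum(freq for _, freq in table)                # total of the f column
    return sum_fx / sum_f


# First table: scores 1 to 6 with their frequencies.
table_1 = [(1, 3), (2, 4), (3, 6), (4, 4), (5, 2), (6, 3)]
# Second table: the scores entered into the calculator.
table_2 = [(21, 6), (22, 7), (23, 3), (24, 7), (25, 5)]

print(round(frequency_table_mean(table_1), 2))
print(round(frequency_table_mean(table_2), 2))
```

Each pair plays the role of one row of the table, so adding a row only means adding another pair to the list.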
H.S.C. Question (23)
H.S.C. Question (24)
H.S.C. Question (25)
Activity Ex 9.01 Q 4, 6, 8 Ex 9.02 Q 4
• Calculate the Measures of Location – Mean, Mode and Median – for Grouped Data presented in Table or Graphical Form
Imagine collecting the height of every student in Year 11 and then arranging them into a frequency distribution table. This would prove difficult as the heights would vary greatly. In this instance, statisticians tend to group the data. Grouped data frequency distribution tables have: ~ Classes instead of scores (all classes must be the same size) ~ A class centre column which is the average of the upper and lower limit of the class ~ An fcc column instead of an fx column. Since we don’t know the individual scores, we have to use the class centres to provide an estimate.
Class Class Centre (cc) Tally Frequency (f)
Frequency × Class Centre
(fcc) 50-‐54 52 55-‐59 57 60-‐64 62
Σ A grouped data frequency histogram has the following features: ~ A title ~ Class centres labelled on the horizontal axis ~ Frequency labelled on the vertical axis with appropriate scale ~ Columns of equal width with NO gaps in between ~ A half a column gap at the start Here is an example of a grouped frequency histogram.
As can be seen, the first column has a class centre of 22. For this to be the average, the class would need to be 20-24.
A grouped data frequency polygon has the following features:
~ A title
~ Class centres labelled on the horizontal axis
~ Frequency labelled on the vertical axis with appropriate scale
~ Line graph starting from the origin of the graph, going to the top middle of each column, then coming back down to the horizontal axis
Here is an example of a combined grouped frequency histogram and polygon.
Mode: When we deal with grouped data, we do not have a mode but rather a modal class. It is the most frequently occurring class.
Median: Only the median class can be found. This is the class that contains the middle score and is found using this procedure:
~ Add 1 to the total frequency
~ Divide this value by 2
~ If you get a whole number, the median is the score at that position (if your answer is 14, the median is the 14th score)
~ If you get an answer that ends in .5, say 14.5, then the median is the average of the 14th and 15th scores
~ The median class is the class that contains this score (or these scores)
Mean: Only an estimate for the mean can be found. This is because we do not have the actual scores, so the exact mean cannot be found. The estimate is found by:
~ Finding the total frequency (Σf)
~ Finding the total of the frequency × class centre column (Σfcc)
~ Calculating Σfcc ÷ Σf
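The three procedures above (modal class, median class and estimated mean) can be sketched in Python. This is only an illustration: the frequencies are made up, and the function names are hypothetical rather than part of the course.

```python
# Hypothetical grouped data: (lower limit, upper limit, frequency).
# The frequencies are made up for illustration only.
classes = [(50, 54, 3), (55, 59, 8), (60, 64, 5)]


def class_centre(lower, upper):
    """Average of the lower and upper limits of a class."""
    return (lower + upper) / 2


def modal_class(classes):
    """Class with the highest frequency (the first one if there is a tie)."""
    lower, upper, _ = max(classes, key=lambda c: c[2])
    return (lower, upper)


def median_class(classes):
    """Class containing the score at position (total frequency + 1) / 2.

    If the position ends in .5, the class holding the later of the two
    middle scores is returned.
    """
    total = sum(f for _, _, f in classes)
    position = (total + 1) / 2
    running = 0
    for lower, upper, f in classes:
        running += f
        if running >= position:
            return (lower, upper)
    raise ValueError("no classes given")


def estimated_mean(classes):
    """Sum of f x cc divided by the sum of f."""
    sum_f = sum(f for _, _, f in classes)
    sum_fcc = sum(f * class_centre(lo, hi) for lo, hi, f in classes)
    return sum_fcc / sum_f


print(modal_class(classes), median_class(classes), estimated_mean(classes))
```

The mean is built from class centres only, which is why it stays an estimate: every score in a class is treated as if it sat exactly at the centre.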
As can be seen, the first column has a class centre of 57. For this to be the average, the class would need to be 55-59.
The polygon is the line graph part.
Example 26
Class    Class Centre (cc)    Tally    Frequency (f)    f × cc
50-54
55-59
60-64
65-69
70-74
75-79
80-84
85-89
Σ
84 63 51 55 65 55 63 55 82 71 74 85 75 76
61 74 83 79 78 79 77 72 56 56 57 59 89 88
87 86 84 83 82 81 80 66 65 64 63 69 58 57
(a) Complete the grouped frequency distribution table above.
(b) Determine the modal class.
............................................................
(c) Determine the median class.
............................................................
............................................................
............................................................
(d) Estimate the value of the mean.
............................................................
............................................................
(e) Explain why the mean is only an estimate.
............................................................
............................................................
............................................................
.................................................................................................................................................... .................................................................................................................................................... .................................................................................................................................................... .................................................................................................................................................... .................................................................................................................................................... .................................................................................................................................................... .................................................................................................................................................... .................................................................................................................................................... .................................................................................................................................................... .................................................................................................................................................... .................................................................................................................................................... ....................................................................................................................................................
H.S.C. Question (27)
........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................
H.S.C. Question (28)
Activity Ex 9.01 Q 9, 10 Ex 9.02 Q 8, 10
Example 29
Heather made a mistake when she arranged these class groups. What has she done wrong?
Class
1 – 5
5 – 10
10 – 15
15 – 20
20 – 25
............................................................
............................................................
............................................................
Example 30
Heath made a mistake when he arranged these class groups. What has he done wrong?
Class
1 – 4
5 – 9
10 – 15
15 – 20
20 – 25
............................................................
............................................................
............................................................
• Describe Standard Deviation informally as a Measure of the Spread of Data in relation to the Mean
• Calculate Standard Deviation using a Calculator
Standard Deviation: ............................................................
............................................................
............................................................
If the value of the standard deviation is small, this implies that the data set is more clustered around the mean. The smaller the standard deviation, the more consistent the data set is. The symbol for standard deviation is σₙ. The standard deviation is found by using your calculator.
x     f
21    6
22    7
23    3
24    7
25    5
1. Clear the memory: SHIFT 9 3 = AC
2. Turn the frequency on: SHIFT MODE ⇓ 3 1
3. Enter Stat mode: MODE 2 1
4. Enter the scores in the first column and the frequencies in the second column. The default frequency is 1. When you are done, press AC
5. Find the population standard deviation (census): SHIFT 1 4 3
   Find the sample standard deviation (sample): SHIFT 1 4 4
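As a cross-check on the calculator steps, the same two standard deviations can be sketched in Python for the x and f table above. The function names are hypothetical. The only difference between the two results is the divisor: n for a population and n − 1 for a sample.

```python
from math import sqrt

# Frequency table from the calculator example: score -> frequency.
table = {21: 6, 22: 7, 23: 3, 24: 7, 25: 5}


def mean(table):
    """Sum of f x score divided by the total frequency."""
    n = sum(table.values())
    return sum(x * f for x, f in table.items()) / n


def population_sd(table):
    """Divide the squared deviations by n (whole population, a census)."""
    n = sum(table.values())
    m = mean(table)
    return sqrt(sum(f * (x - m) ** 2 for x, f in table.items()) / n)


def sample_sd(table):
    """Divide the squared deviations by n - 1 (the data is a sample)."""
    n = sum(table.values())
    m = mean(table)
    return sqrt(sum(f * (x - m) ** 2 for x, f in table.items()) / (n - 1))


print(round(mean(table), 2),
      round(population_sd(table), 2),
      round(sample_sd(table), 2))
```

The sample value is always slightly larger than the population value for the same data, because it divides by a smaller number.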
Example 31
(a) Find the mean and standard deviation of the following data on the weekly wage of all employees of a hairdressing salon.
$450, $520, $610, $450, $620, $395, $415, $500
............................................................
............................................................
............................................................
............................................................
(b) Explain what the standard deviation means in the context of this question.
............................................................
............................................................
............................................................
Example 32
Two typists, Mary and Fred, were tested as to how many words they could type per minute. This was done over 8 minutes.
Mary    42 43 45 40 39 47 38 44
Fred    30 49 46 48 29 46 47 48
(a) Calculate the mean and standard deviation for Fred and Mary (to 2 d.p.).
............................................................
............................................................
............................................................
(b) Which typist is more reliable? Justify your answer with reference to the standard deviation.
............................................................
............................................................
............................................................
Activity Ex 9.08 Q 1, 3 - 7
........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................
H.S.C. Question (33)
• Calculate and Interpret the Range as a Measure of the Spread of a Data Set
The range is a measure of spread. It tells you how far the scores are spread out. It is not an average. The range does not tell you anything about the data set other than the spread of the scores.
Formula
Range = highest score MINUS lowest score
NOT ON FORMULA SHEET
Example 34
Calculate the range of this data set: 3, 4, 6, 7, 5, 3, 1, 9
............................................................
The range does not determine whether one thing is better than another like measures of central tendency do.
Example 35
The range of house prices in Campbelltown is $540 000, yet the range of house prices in Cecil Hills is $230 000.
(a) Does the fact that the range of house prices in Campbelltown is greater imply that the house prices there are more expensive than those in Cecil Hills?
............................................................
............................................................
............................................................
(b) What implications can be made about the houses in Cecil Hills based on the fact that its range is smaller than that of Campbelltown?
............................................................
............................................................
............................................................
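The range formula can be sketched in a few lines of Python. The two sets of marks below are made up for illustration, and `data_range` is a hypothetical name. They are chosen so that both sets have the same range even though one set is clearly higher, which is why the range alone cannot say which group did better.

```python
def data_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)


# Hypothetical test marks for two classes (made up for illustration).
class_a = [40, 45, 50, 52]
class_b = [80, 85, 90, 92]

print(data_range(class_a))
print(data_range(class_b))
```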
Example 36
The range of the exam results in 8A is 12%, whereas the range of the exam results in 8B is 16%. From this provided data, what information can you determine? Is it possible to determine which class performed better? If not, what extra information would you need?
............................................................
............................................................
............................................................
Example 37
Which would be greater: the range of the students’ ages in this school or the range of the teachers’ ages in this school? Explain your answer.
............................................................
............................................................
............................................................
Example 38
The data set below represents the results of a class test that was out of 45.
Stem | Leaf
1    | 1 3 4 6 6 7 8 9
2    | 2 3 4 5 6 7 7 7
3    | 5 8 9
4    | 0 1
(a) Calculate the range of the above data set. .......................................
Another class sat the same test, however, their range for the test was 25.
(b) What are possible values for the highest score and the lowest score?
............................................................
(c) Can you determine which class performed better only from the range? Explain.
............................................................
............................................................
............................................................
• Calculate and Interpret the Interquartile Range as a Measure of the Spread of a Data Set
• Establish a Five-Number Summary for a Data Set (lower extreme, lower quartile, median, upper quartile and upper extreme)
• Determine the Five-Number Summary from a Stem-and-Leaf Plot
• Develop a Box-and-Whisker Plot from a Five-Number Summary
A five-number summary consists of:
1. Lower Extreme: the lowest score
2. Lower Quartile (Q1): the median of the lower half of the scores
3. Median (Q2): the middle score
4. Upper Quartile (Q3): the median of the upper half of the scores
5. Upper Extreme: the highest score
The process of determining quartiles is similar to that of finding the median, but instead of splitting the data into 2 equal groups, we now want 4 equal groups.
Remember:
~ If there is an odd number of scores, circle the middle score
~ If there is an even number of scores, put a line between the 2 middle scores
Example 39
Determine the five-number summary for the data sets below.
(a) 1 2 3 3 6 6 7 9 10 10 10 11 11 11 11 11 11 11 12 12 13 13 13 13 14 15 17
............................................................
............................................................
............................................................
(b) 63, 56, 62, 58, 59, 60, 61, 62, 63, 64
............................................................
............................................................
............................................................
............................................................
............................................................
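The five-number summary procedure can be sketched in Python as follows. The data set is made up, the function names are hypothetical, and the sketch assumes that when the number of scores is odd, the circled middle score is left out of both halves before the quartiles are found.

```python
def median(sorted_scores):
    """Middle score, or the average of the two middle scores."""
    n = len(sorted_scores)
    mid = n // 2
    if n % 2 == 1:
        return sorted_scores[mid]
    return (sorted_scores[mid - 1] + sorted_scores[mid]) / 2


def five_number_summary(scores):
    """Return (lower extreme, Q1, median, Q3, upper extreme).

    Assumption: with an odd number of scores, the median itself is left
    out of both halves. Needs at least two scores.
    """
    ordered = sorted(scores)
    n = len(ordered)
    lower_half = ordered[: n // 2]
    upper_half = ordered[(n + 1) // 2:]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])


# Hypothetical data set (made up for illustration).
print(five_number_summary([7, 2, 9, 4, 12, 4, 11, 5, 8]))
```

Sorting first matters: the quartiles are positions in the ordered list, so an unsorted list such as the one passed in here has to be put in ascending order before it is split.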
One other measure of spread is the interquartile range. This measures the spread of the middle 50% of scores. It is often more useful than the range because it shows what is happening in the middle of the data, and it is not as heavily affected by outlying scores as the range is.
Formula
Interquartile Range = Upper Quartile MINUS Lower Quartile
IQR = Q3 – Q1
NOT ON FORMULA SHEET
Example 40
Calculate the interquartile range for the data sets from Example 39.
(a) ....................................... (b) .......................................
A box and whisker plot is a diagrammatic version of a five-number summary.
[Box-and-whisker plot: Ages of Citizens]
Note:
~ Between the Lower Extreme and Lower Quartile, 25% of the scores occur
~ Between the Lower Quartile and the Median, 25% of the scores occur
~ Between the Median and the Upper Quartile, 25% of the scores occur
~ Between the Upper Quartile and the Upper Extreme, 25% of the scores occur
The purpose of splitting data up into quartiles is to get a general idea of how many scores lie between two values.
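To illustrate IQR = Q3 – Q1 and the point about outlying scores, here is a small Python sketch. Both data sets are made up, the function names are hypothetical, and the quartiles are taken as the medians of the lower and upper halves (leaving the median out when the number of scores is odd).

```python
from statistics import median


def quartiles(scores):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: with an odd number of scores the median is left out of
    both halves.
    """
    ordered = sorted(scores)
    n = len(ordered)
    return median(ordered[: n // 2]), median(ordered[(n + 1) // 2:])


def interquartile_range(scores):
    """IQR = Q3 - Q1."""
    q1, q3 = quartiles(scores)
    return q3 - q1


# Hypothetical data sets: the second swaps the top score for an outlier.
typical = [3, 5, 6, 7, 8, 9, 11, 12]
with_outlier = [3, 5, 6, 7, 8, 9, 11, 40]

for data in (typical, with_outlier):
    print("range:", max(data) - min(data),
          "IQR:", interquartile_range(data))
```

In this sketch the outlier changes the range a great deal but leaves the interquartile range unchanged, because the middle 50% of the scores has not moved.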
Example 41
Draw a box and whisker plot to present the following information.
1, 1, 2, 3, 4, 5, 6, 6, 7, 8, 8, 9, 10
Example 42
[Box-and-whisker plot: Petrol price ($), horizontal axis from 1.10 to 1.90 in steps of 0.10]
(a) What is the median price of petrol? .......................................
(b) What was the interquartile range of petrol prices? .......................................
(c) What was the range of the petrol prices? .......................................
(d) What percentage of prices is less than $1.60? .......................................
(e) What percentage of prices is above $1.70? .......................................
(f) Between what two amounts is the middle 50% of petrol prices?
............................................................
........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................
Example 43
Example 44
[Box-and-whisker plot: Height of Plants (cm)]
At a plant nursery, 200 seeds of a new variety of plant were potted. After 10 weeks, development of the seeds into plants was recorded. 20 of the seeds did NOT form plants. The heights of the plants that developed from seeds were recorded in a ‘box-and-whisker plot’ as shown above.
(a) What was the median height of the plants?
............................................................
(b) Write a five-number summary for the plant heights.
............................................................
............................................................
............................................................
............................................................
............................................................
(c) Determine the interquartile range of the plant heights.
............................................................
(d) How many of the plants had heights recorded over 22 cm?
............................................................
............................................................
(e) Suggest ONE disadvantage for recording data in this type of graph.
............................................................
............................................................
Activity Ex 9.05 Q 1 – 4 Ex 9.07 ALL
Example 45
Match the following 4 box and whisker plots to the following 4 scenarios.
[Four box-and-whisker plots: Age in Years]
Scenario 1: A teacher taking 20 primary school students on an excursion
Scenario 2: An extended family of 2 children, 4 parents and a grandparent
Scenario 3: A group of elderly citizens
Scenario 4: A random group of 20 citizens
• Construct Frequency Tables for Grouped Data from Cumulative Frequency Graphs (Histograms and Polygons)
• Estimate the Median and Upper and Lower Quartiles of a Data Set from a Cumulative Frequency Polygon for Grouped Data
• Calculate the Median from Cumulative Frequency Polygons
Sometimes we can add an extra column onto a frequency distribution table called the cumulative frequency (cf) column. This column tells us the progressive total of the frequency column. Here is an example of a frequency table with cumulative frequency included.
Class Centre (cc)    Frequency (f)    Cumulative Frequency (cf)
12                   4                4
17                   5                9
22                   7                16
27                   8
32                   3
37                   2                29
~ The first cf value is 4 because we have only had four scores in the first class
~ The second cf value is 9 because there have been 4 scores in the first class, and 5 scores in the next class
~ The third cf value is 16 because 4+5+7=16
A cumulative frequency histogram has the following features:
~ A title
~ Scores labelled on the horizontal axis
~ Cumulative frequency labelled on the vertical axis with appropriate scale
~ Columns of equal width with NO gaps in between
~ A half a column gap at the start
~ Columns consistently increasing since cf is an accumulation of scores
Here is an example of a grouped cumulative frequency histogram.
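The running total in the cf column, and the reverse step of recovering frequencies from cumulative frequencies (as needed when reading a cumulative frequency graph), can be sketched in Python. The class centres and frequencies come from the example table; `frequencies_from_cumulative` is a hypothetical helper name.

```python
from itertools import accumulate

# Class centres and frequencies from the example frequency table.
class_centres = [12, 17, 22, 27, 32, 37]
frequencies = [4, 5, 7, 8, 3, 2]

# Each cf entry is the running total of the frequencies so far.
cumulative = list(accumulate(frequencies))


def frequencies_from_cumulative(cf):
    """Recover each frequency as the jump between consecutive cf values."""
    return [cf[0]] + [b - a for a, b in zip(cf, cf[1:])]


for cc, f, cf in zip(class_centres, frequencies, cumulative):
    print(cc, f, cf)
```

A quick check on any completed table is that the last cf value equals the total frequency.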
A cumulative frequency polygon has the following features:
~ A title
~ Scores labelled on the horizontal axis
~ Cumulative frequency labelled on the vertical axis with appropriate scale
~ Line graph starting from the bottom left corner of the first column, going to the top right corner of each column
~ The polygon ends at the top of the last column. It does not come back down
Here is an example of a combined cumulative frequency histogram and polygon. Note how the polygons for normal frequency and cumulative frequency are different. The only other difference occurs on the horizontal axis. As the data is grouped, we now need to place the class centres in the middle of the columns instead of the scores. In this case, the horizontal axis is treated like a number line and can be used for estimation.
Notice how the class centres are actually in the centre of the columns.
You can tell this is grouped data because the numbers on the horizontal axis are NOT consecutive. The groupings of classes would have been 0-4, 5-9, 10-14 etc.
Example 46
(a) Complete the following grouped cumulative frequency distribution table.
Classes    Class Centres    Frequency    fcc    cf
0-5                         4
6-11                        5
12-17                       5
18-23                       6
24-29                       6
30-35                       2
Σ
(b) Draw a grouped cumulative frequency histogram and polygon.
Example 47
The cumulative frequency histogram and ogive (polygon) below show the heights of students in a Maths class.
(a) What is the modal class?
............................................................
(b) Which score occurred the least?
............................................................
(c) How do you know that the shortest person must be taller than 154 cm?
............................................................
............................................................
............................................................
(d) Draw a frequency distribution table displaying the above information.
Example 48
[Cumulative frequency histogram: vertical axis CF (5, 10, 15); horizontal axis Class centres (2, 5, 8, 11)]
Which of the following frequency histograms best represent the scores in the cumulative frequency histogram above?
(A) (B) (C) (D)
[Four frequency histograms, each with vertical axis Frequency (5, 10, 15) and horizontal axis Class Centres (2, 5, 8, 11)]
Activity Ex 9.06 Q 4
While the syllabus states that only grouped data needs to be covered, there may be some instances where the data is not grouped. This will work the same way, but will have scores on the horizontal axis, not class centres.
A cumulative frequency graph can be used to estimate the value of scores whose position is known. Simply find the position on the cf axis, draw a horizontal line to the polygon (ogive), then draw a vertical line down to find the score.
~ Median – is the score in the middle, so the median can be estimated by finding the score that lies halfway up the cf axis.
~ Lower Quartile – is the score ¼ of the way into the data set when in ascending order. So if there are 40 scores, the lower quartile is at position ¼ × 40 = 10, so it is the 10th score.
~ Upper Quartile – is the score ¾ of the way into the data set when in ascending order. So if there are 40 scores, the upper quartile is at position ¾ × 40 = 30, so it is the 30th score.
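Reading across from the cf axis to the ogive and then down to the horizontal axis amounts to interpolating along the straight segments of the polygon. The sketch below uses made-up frequencies that total 40, to match the positions 10, 20 and 30 used above. The function names are hypothetical, and it assumes equal-width columns with each class centre in the middle of its column.

```python
def ogive_points(class_centres, frequencies):
    """Corner points of the cumulative frequency polygon.

    Assumption: columns have equal width and each class centre sits in the
    middle of its column, so the column edges are centre +/- half a width.
    """
    width = class_centres[1] - class_centres[0]
    points = [(class_centres[0] - width / 2, 0)]
    running = 0
    for centre, f in zip(class_centres, frequencies):
        running += f
        points.append((centre + width / 2, running))
    return points


def score_at_position(points, position):
    """Read across from the cf axis to the polygon, then down to the score."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= position <= c1 and c1 > c0:
            return x0 + (x1 - x0) * (position - c0) / (c1 - c0)
    raise ValueError("position is outside the cumulative frequency range")


# Hypothetical grouped data with 40 scores (classes 0-4, 5-9, ... assumed).
centres = [2, 7, 12, 17, 22]
freqs = [4, 10, 12, 8, 6]
points = ogive_points(centres, freqs)
total = sum(freqs)

lower_quartile = score_at_position(points, total / 4)
median = score_at_position(points, total / 2)
upper_quartile = score_at_position(points, 3 * total / 4)
print(lower_quartile, median, upper_quartile)
```

Like a reading taken by eye from the graph, these values are estimates, since the individual scores inside each class are not known.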
H.S.C. Question (49)
........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................
H.S.C. Question (50)
........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................
........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................ ........................................................................................................................................
H.S.C. Question (51)
Activity Ex 9.06 Q1-3
H.S.C. Question (52)
• Divide Large Sets of Data into Deciles, Quartiles and Percentiles and Interpret Displays
When dividing data into quartiles, we split the data up into 4 equal groups of scores. The numbers we establish for Q1, Q2 and Q3 act as the cut-offs for our quartiles. Similarly we can be asked to split scores into deciles. This is when we split a data set into 10 equal groups. With deciles we use the notation D1, D2, and so on all the way up to D9 to separate the data into tenths.
Label/Name          Cut-off at the bottom    Cut-off at the top
D1 – 1st decile     Shows bottom 10%         Shows top 90%
D4
D9
Consider the following lengths of infant babies:
44 45 47 48 48 48 49 49 49 50 50 50 51 51 52 52 52 53 55 56
(a) Determine the value of the 3rd decile for this data set.
........................................................................................................................................
(b) What is the 5th decile for this data set and identify TWO other names for the 5th decile.
........................................................................................................................................
........................................................................................................................................
(c) What value splits the bottom 70% of scores from the top 30% of scores? What decile is this?
........................................................................................................................................
........................................................................................................................................
(d) Baby James was born and his length was in the top 10% of infant lengths. What value must his length be greater than?
........................................................................................................................................
........................................................................................................................................
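A decile cut-off can also be checked with a short Python sketch. It uses the 20 infant lengths listed above and assumes one particular convention: the kth decile is taken as the value midway between the last score in the bottom k tenths and the next score. The helper name `decile_cutoff` is illustrative, and your class may use a different convention, so treat the results as a check on your reasoning rather than as model answers.

```python
def decile_cutoff(sorted_scores, k):
    """Cut-off for the k-th decile of an ascending list of scores.

    Assumed convention: the cut-off lies midway between the last score in the
    bottom k tenths and the first score above it. Only handles data sets whose
    size is a multiple of 10.
    """
    n = len(sorted_scores)
    if n % 10 != 0:
        raise ValueError("this sketch needs a multiple of 10 scores")
    if not 1 <= k <= 9:
        raise ValueError("k must be between 1 and 9")
    position = k * n // 10  # number of scores in the bottom k tenths
    return (sorted_scores[position - 1] + sorted_scores[position]) / 2


# Infant lengths from the example above, already in ascending order.
lengths = [44, 45, 47, 48, 48, 48, 49, 49, 49, 50,
           50, 50, 51, 51, 52, 52, 52, 53, 55, 56]

d3 = decile_cutoff(lengths, 3)
d5 = decile_cutoff(lengths, 5)
d9 = decile_cutoff(lengths, 9)
print(d3, d5, d9)
```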
Example 53
Percentiles split data into hundredths. We label percentiles as P1, P2 … P99.
Label/Name                Cut-off at the bottom    Cut-off at the top
P24 – 24th Percentile     Shows bottom 24%         Shows top 76%
P60 –
P87 –
The following information is based on population data for the heights of girls aged 16 years.
~ The median is 163cm ~ The 3rd quartile is 167cm ~ The 9th decile is 171 cm ~ The 5th percentile is 152 cm ~ The 97th percentile is 175 cm
(a) Holly’s height is 175cm. Is she tall for her age? Justify your answer with reference to the information provided above.
........................................................................................................................................
........................................................................................................................................
(b) Olga is taller than 90% of girls her age. What is her height?
........................................................................................................................................
........................................................................................................................................
(c) If ¼ of girls her age are taller than Verity, how tall is she?
........................................................................................................................................
........................................................................................................................................
(d) What height separates the bottom 5% of heights from the top 95% of heights?
........................................................................................................................................
........................................................................................................................................
(e) What percentile is a height of 163cm?
........................................................................................................................................
........................................................................................................................................
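Quartiles, deciles and percentiles are all cut-offs on the same scale, so each can be restated as a percentile: Qk is the (25 × k)th percentile and Dk is the (10 × k)th percentile. A minimal Python sketch of that conversion, using the five heights listed above; the helper name `as_percentile` is illustrative only.

```python
def as_percentile(kind, k):
    """Restate the k-th quartile, decile or percentile as a percentile number."""
    percent_per_step = {"quartile": 25, "decile": 10, "percentile": 1}
    return percent_per_step[kind] * k


# Heights (cm) of girls aged 16, keyed by percentile. The median is the 2nd quartile.
heights_cm = {
    as_percentile("quartile", 2): 163,    # median
    as_percentile("quartile", 3): 167,    # 3rd quartile
    as_percentile("decile", 9): 171,      # 9th decile
    as_percentile("percentile", 5): 152,  # 5th percentile
    as_percentile("percentile", 97): 175, # 97th percentile
}

for percentile in sorted(heights_cm):
    print(f"P{percentile}: {heights_cm[percentile]} cm")
```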
Example 54
(a) The length of an infant girl is 70cm. Her head circumference is 45cm. What percentile does this place her in for head circumference?
........................................................................................................................................
(b) What will her approximate head circumference be when she is 100cm tall?
........................................................................................................................................
[Growth chart: Birth to 36 months: Girls. Head circumference-for-age and Weight-for-length percentiles, showing the 5th, 10th, 25th, 50th, 75th, 90th and 95th percentile curves. Axes: age (months), head circumference (cm and in), length (cm and in), weight (kg and lb).]
SOURCE: Developed by the National Center for Health Statistics in collaboration with the National Center for Chronic Disease Prevention and Health Promotion (2000). http://www.cdc.gov/growthcharts Published May 30, 2000 (modified 10/16/00).
Example 55
Activity Ex 9.04 ALL
H.S.C. Question (56)
• Calculate Summary Statistics using Spreadsheet Formulae
• Compare Summary Statistics of Various Samples from the Same Population
Summary statistics can be calculated using spreadsheet programs. In examinations you cannot be asked to perform calculations using a spreadsheet, but you can be asked which formula you would use to complete a certain calculation. This is what a blank spreadsheet looks like:
Cell: a box in which you can enter a single piece of data. Every cell has a coordinate to help us and others find it. For example, the arrow is pointing to cell B3.
Cell range: in this section a group of cells has been selected. This is known as a cell range. We would write this one as D4:D9.
Function Bar: lets you see what you are typing into a cell.
........................................................................................................................................ ........................................................................................................................................
........................................................................................................................................ ........................................................................................................................................
Example 57
Example 58
You need to know a series of formulas when dealing with spreadsheets. Formulas follow these rules:
~ Must start with an equals sign =
~ Next is a word that tells the program what mathematical operation to perform
~ Last will be the cell range that will be used in the calculation, written in brackets, e.g. (D4:D9)
Function                                           Formula
Calculating the average (mean)                     =average(B2:B5)
Calculating the median                             =median(B2:B5)
Calculating the mode                               =mode(B2:B5)
Calculating the standard deviation                 =stdevp(B2:B5)
Calculating the lower extreme (lowest score)       =min(B2:B5)
Calculating the lower quartile                     =quartile(B2:B5, 1)
Calculating the median (or as above)               =quartile(B2:B5, 2)
Calculating the upper quartile                     =quartile(B2:B5, 3)
Calculating the upper extreme (highest score)      =max(B2:B5)
Count Function (counting the number of scores)     =count(B2:B5)
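For anyone who wants to check spreadsheet results another way, the formulas in the table have close counterparts in Python's standard `statistics` module. The sketch below is only an illustration: the four scores standing in for cells B2:B5 are hypothetical, and `method="inclusive"` is chosen on the assumption that it interpolates between scores in the same way as the spreadsheet `=quartile` formula. A quartile found by counting positions by hand may differ slightly from an interpolated value.

```python
import statistics

# Hypothetical scores standing in for the cell range B2:B5.
scores = [3, 5, 5, 7]

mean = statistics.mean(scores)       # like =average(B2:B5)
median = statistics.median(scores)   # like =median(B2:B5)
mode = statistics.mode(scores)       # like =mode(B2:B5)
std_dev = statistics.pstdev(scores)  # like =stdevp(B2:B5), population standard deviation
lowest = min(scores)                 # like =min(B2:B5)
highest = max(scores)                # like =max(B2:B5)
count = len(scores)                  # like =count(B2:B5)

# The three quartile cut-offs, like =quartile(B2:B5, 1), (…, 2) and (…, 3).
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

print(mean, median, mode, round(std_dev, 2), lowest, highest, count, q1, q2, q3)
```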
The spreadsheet below contains data on 20 students. These students were asked how many hours they spend on their homework. Answer the following questions based on this data set.
(a) What number is in cell B4?
........................................................................................................................................
(b) What numbers are in the cell range B2:B5?
........................................................................................................................................
(c) What formula would calculate the mean for this data set?
........................................................................................................................................
(d) What formula would calculate the mode for this set?
........................................................................................................................................
(e) There is no statistical function for range. How would you calculate the range?
........................................................................................................................................
(f) There is no statistical function for interquartile range. How would you calculate the interquartile range?
........................................................................................................................................
........................................................................................................................................
........................................................................................................................................
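Where no single function exists, a measure can be built from two that do: the range is the highest score minus the lowest score, and the interquartile range is the upper quartile minus the lower quartile. A minimal Python sketch of both subtractions follows. The homework hours are hypothetical, since the spreadsheet's values are not reproduced here, and the quartiles use the interpolating `inclusive` method, which is an assumption about how your spreadsheet behaves.

```python
import statistics

# Hypothetical homework hours, in ascending order.
hours = [1, 2, 2, 3, 3, 4, 5, 6]

# Range: highest score minus lowest score.
data_range = max(hours) - min(hours)

# Interquartile range: upper quartile minus lower quartile.
q1, _, q3 = statistics.quantiles(hours, n=4, method="inclusive")
interquartile_range = q3 - q1

print(data_range, q1, q3, interquartile_range)
```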
Example 59
Enter the following data into a blank spreadsheet
(a) Calculate the following for the entire data set and then just for the first column.
                       Entire Data Set    Sample
Mean
Median
Mode
Standard Deviation
Interquartile Range
Range
(b) Are the census results the same as the sample results? What conclusion can we make when calculating the summary statistics on a sample of data as opposed to the entire population?
........................................................................................................................................
........................................................................................................................................
........................................................................................................................................
........................................................................................................................................
........................................................................................................................................
........................................................................................................................................
........................................................................................................................................
........................................................................................................................................
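The comparison between a census and a sample can also be tried in Python. In this sketch the data set is hypothetical (the booklet's own table is not reproduced here), the first column plays the role of the sample, and `summary` is an illustrative helper that reports only three of the statistics from the table.

```python
import statistics

# Hypothetical data set laid out in columns, as it might be typed into a spreadsheet.
columns = [
    [4, 7, 5, 6],
    [8, 5, 9, 5],
    [3, 6, 5, 7],
]
population = [score for column in columns for score in column]  # entire data set
sample = columns[0]                                              # first column only


def summary(scores):
    """Return a few summary statistics for a list of scores."""
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "range": max(scores) - min(scores),
    }


census_summary = summary(population)
sample_summary = summary(sample)
print(census_summary)
print(sample_summary)
```

With these made-up numbers the two medians agree while the means and ranges do not, which illustrates that a sample gives an estimate of the population statistics rather than a guaranteed match.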
Example 60
........................................................................................................................................ ........................................................................................................................................
........................................................................................................................................ ........................................................................................................................................
Example 61
Example 62
........................................................................................................................................
........................................................................................................................................
• Link Type of Data with an Appropriate Display (e.g. Continuous Quantitative Data with a Histogram, or Categorical Data with a Divided Bar Graph or Sector Graph)
Certain graphs suit certain types of data. Consider the following examples and justify why these graph types are best for this data type.
Data Type                       Graph                              Justification
Categorical data                Sector Graph, Divided Bar Graph
Quantitative continuous data    Histogram, Radar Charts
Quantitative discrete data      Dot Plots, Stem-and-Leaf Plots
Example 63
• Create Statistical Displays using a Spreadsheet or other Appropriate Software
If you need help, the following link is a YouTube clip on how to create graphs using a Google spreadsheet.
http://www.youtube.com/watch?v=713apMgym-w
The following data has been taken from Census @ School and reflects the eye colour of 20 randomly selected students from each of two parts of Sydney.
North Sydney West Sydney Blue Brown Blue Blue Blue Blue Brown Brown Hazel Hazel Brown Brown Blue Brown Blue Brown Brown Hazel Hazel Brown Blue Brown Brown Brown Hazel Blue Blue Brown Blue Hazel Hazel Hazel Blue Brown Blue Brown Blue Blue Brown Blue
(a) Using Google Docs, draw 2 separate sector graphs showing the eye colour of the 20 students in different parts of Sydney.
(b) Identify ONE difference between the two graphs.
........................................................................................................................................
........................................................................................................................................
(c) Explain why a sector graph would be the most appropriate type of graph for this data set.
........................................................................................................................................
........................................................................................................................................
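Before drawing a sector graph, whether by hand or in software, it can help to tally each category and convert the tally to an angle: angle = (category count ÷ total) × 360°. The Python sketch below shows that arithmetic. The 20 responses are hypothetical and are not the Census @ School data above, and `sector_angles` is an illustrative helper name.

```python
from collections import Counter


def sector_angles(values):
    """Return the sector angle in degrees for each category in `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count * 360 / total for category, count in counts.items()}


# Hypothetical eye colours for 20 students.
responses = ["Blue"] * 10 + ["Brown"] * 6 + ["Hazel"] * 4

angles = sector_angles(responses)
for colour, angle in angles.items():
    print(f"{colour}: {angle} degrees")
```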
Example 64