Excursions in Modern Mathematics, 7e: 14.1 - 2Copyright © 2010 Pearson Education, Inc. 14...


14 Descriptive Statistics

14.1 Graphical Descriptions of Data

14.2 Variables

14.3 Numerical Summaries

14.4 Measures of Spread

Data Set

• data set: a collection of data values.
• data points: the individual data values in a data set.
• N still represents the size of the data set.
• variable: any characteristic that varies with the members of a population.

Variables

Numerical (or quantitative) variable: a variable that represents a measurable quantity.

– Continuous variable: the difference between the values of a numerical variable can be arbitrarily small.
– Discrete variable: the values of the numerical variable change by minimum increments.

Categorical (or qualitative) variable: a variable that cannot be measured numerically.

Frequency Table

A frequency table serves to:

• organize the data in a meaningful, intelligible way.
• enable the reader to determine the nature or shape of the distribution.
• facilitate computational procedures for measures of average and spread.

© Copyright McGraw-Hill 2000

Chem 103 Test Scores (pg 545 #1)

What type of data is this? Construct a frequency table for the raw data.

Student ID   Score     Student ID   Score
1362         50        4315         70
1486         70        4719         70
1721         80        4951         60
1932         60        5321         60
2489         70        5872         100
2766         10        6433         50
2877         80        6921         50
2964         60        8317         70
3217         70        8854         100
3588         80        8964         80
3780         80        9158         60
3921         60        9347         60
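One way to tally the raw scores into a frequency table is sketched below in Python. This is only an illustration, not the textbook's procedure; the names `scores` and `frequency_table` are chosen here for convenience, and the list is the Score column of the table above read down the left half and then the right half.

```python
from collections import Counter

# Chem 103 scores copied from the table above (24 students).
scores = [50, 70, 80, 60, 70, 10, 80, 60, 70, 80, 80, 60,
          70, 70, 60, 60, 100, 50, 50, 70, 100, 80, 60, 60]

# Count how often each score occurs, then order the rows by score.
frequency_table = dict(sorted(Counter(scores).items()))

print("Score  Frequency")
for score, freq in frequency_table.items():
    print(f"{score:>5}  {freq:>9}")
```

The frequencies should add up to N = 24, which is a quick check that no score was skipped or counted twice.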

Bar Graphs and Pictograms

• A bar graph lists the data values in increasing order on a horizontal axis and shows the frequency of each data value by the height of the column above it.

• Pictograms use icons or pictures instead of bars to show the frequencies (see pg 528).

• The point of a pictogram is that a graph is often used not only to inform but also to impress and persuade, and, in such cases, a well-chosen icon or picture can be a more effective tool than just a bar.

• Draw a bar graph for the Chem 103 data. Note any outliers.
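As a rough illustration, a bar graph of the Chem 103 frequencies can be sketched in plain text with Python. The bars are drawn sideways (one row per score) purely for simplicity, whereas the slide describes vertical columns; `bar_rows` is a hypothetical helper name, and the frequencies are those tallied from the raw data.

```python
# Frequencies of the Chem 103 scores (tallied from the raw data).
frequency_table = {10: 1, 50: 3, 60: 7, 70: 6, 80: 5, 100: 2}

def bar_rows(table):
    """Return one text row per data value, in increasing order of value."""
    return [f"{value:>3} | {'#' * freq} ({freq})"
            for value, freq in sorted(table.items())]

for row in bar_rows(frequency_table):
    print(row)
```

In the printed graph the single score of 10 sits well apart from the rest of the data, which makes it a natural candidate when looking for outliers.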

Pie Charts

• Used when the number of categories is small.
• Uses relative frequencies of the categories.
• The “pie” represents the entire population (100%).
• The “slices” represent the categories (or classes), with the size (angle) of each slice being proportional to the relative frequency of the corresponding category.

Relative Frequency

Relative frequencies: the frequencies given in terms of percentages of the total population.

For the Chem 103 data (rounded to the nearest tenth of a percent):

Score                10      50      60      70      80      100
Relative Frequency   4.2%    12.5%   29.2%   25%     20.8%   8.3%

Construct a pie chart.
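The relative frequencies above, and the slice angles a pie chart would need, can be reproduced with a short Python sketch. The variable names are illustrative only; the angle rule (relative frequency times 360 degrees) follows from each slice being proportional to its category's relative frequency.

```python
# Frequencies of the Chem 103 scores (tallied from the raw data).
frequency_table = {10: 1, 50: 3, 60: 7, 70: 6, 80: 5, 100: 2}
N = sum(frequency_table.values())  # size of the data set

# Relative frequency as a percentage, rounded to the nearest tenth.
relative = {score: round(100 * f / N, 1) for score, f in frequency_table.items()}

# Central angle of each pie slice, proportional to the relative frequency.
angles = {score: 360 * f / N for score, f in frequency_table.items()}

for score in frequency_table:
    print(f"{score:>3}: {relative[score]:>5}%  slice angle {angles[score]:.1f} degrees")
```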

How Many Categories

When it comes to deciding how best to display graphically the frequencies of a population, a critical issue is the number of categories into which the data can fall. When the number of categories is too big (say, in the dozens), a bar graph or pictogram can become muddled and ineffective. This happens more often than not with numerical data, because numerical variables can take on infinitely many values.

• In situations with large data sets, it is customary to present a more compact picture of the data by grouping together sets of scores into categories called class intervals.
• The number of class intervals should be somewhere between 5 and 20.
• Class interval and endpoint conventions. (#20)
• Histograms
• See pg. 533
• DO #20
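Grouping scores into class intervals can be sketched in Python as follows. Everything specific here is an assumption made for illustration: the interval width of 20, the starting point of 0, and the endpoint convention (each interval includes its left endpoint and excludes its right one). The slide does not state which convention the textbook uses, and the Chem 103 data set is small enough that grouping is not really needed; it is reused only because it is at hand.

```python
# Chem 103 scores (24 students), reused here as sample data.
scores = [50, 70, 80, 60, 70, 10, 80, 60, 70, 80, 80, 60,
          70, 70, 60, 60, 100, 50, 50, 70, 100, 80, 60, 60]

width = 20  # hypothetical class-interval width, chosen only for illustration

def class_interval_counts(data, width):
    """Count nonnegative values per interval [start, start + width)."""
    counts = {start: 0 for start in range(0, max(data) + 1, width)}
    for value in data:
        counts[(value // width) * width] += 1  # left endpoint is included
    return counts

histogram = class_interval_counts(scores, width)
for start, freq in histogram.items():
    print(f"{start:>3} to {start + width:>3}: {'#' * freq}")
```

With these choices a score of 60 is counted in the 60 to 80 interval and not in 40 to 60, which is exactly the kind of decision an endpoint convention has to settle.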

Numerical Summaries of a Data Set

Measures of Location

Measures of location, such as the mean (or average), the median, and the quartiles, are numbers that provide information about the values of the data.

Mean

mean = (sum of the data values) / (total number of data values) = $\dfrac{d_1 + d_2 + \cdots + d_N}{N}$

Pg. 548 #24a
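Applied to the Chem 103 scores from earlier in the deck, the mean formula amounts to the following Python sketch (variable names are illustrative):

```python
# Chem 103 scores (24 students).
scores = [50, 70, 80, 60, 70, 10, 80, 60, 70, 80, 80, 60,
          70, 70, 60, 60, 100, 50, 50, 70, 100, 80, 60, 60]

N = len(scores)        # total number of data values
total = sum(scores)    # d_1 + d_2 + ... + d_N
mean = total / N

print(f"N = {N}, sum = {total}, mean = {mean:.1f}")
```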

To Find the Average From a Table

To find the average A of a data set given by a frequency table, do the following:

Step 1. S = d1•f1 + d2•f2 + … + dk•fk

Step 2. N = f1 + f2 + … + fk

Step 3. A = S/N

Pg. 548 #29
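The three steps translate almost line for line into Python. The sketch below uses the Chem 103 frequency table as sample input; `frequency_table` maps each data value d to its frequency f and is a name chosen for this example.

```python
# Chem 103 frequency table: data value d -> frequency f.
frequency_table = {10: 1, 50: 3, 60: 7, 70: 6, 80: 5, 100: 2}

# Step 1: S = d1*f1 + d2*f2 + ... + dk*fk
S = sum(d * f for d, f in frequency_table.items())

# Step 2: N = f1 + f2 + ... + fk
N = sum(frequency_table.values())

# Step 3: A = S / N
A = S / N

print(f"S = {S}, N = {N}, A = {A:.1f}")
```

The result agrees with averaging the 24 raw scores directly, since multiplying each value by its frequency is just a shortcut for adding repeated values.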

Median

• Halfway point in the data set.
• Physical middle.
• Data MUST be in order.

FINDING THE MEDIAN OF A DATA SET

■ Sort the data set from smallest to largest. Let d1, d2, d3, … , dN represent the sorted data.

■ If N is odd, the median is the middle value, $d_{(N+1)/2}$.

■ If N is even, the median is the average of $d_{N/2}$ and $d_{(N/2)+1}$.

Pg. 548 #24b
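The median rule above can be sketched in Python. The only wrinkle is that the slide numbers the sorted data from 1 while Python lists start at 0, so every position is shifted down by one. The function name `median` is chosen for this example.

```python
def median(data):
    """Median by the rule above: middle value, or average of the two middle values."""
    d = sorted(data)                  # data MUST be in order
    N = len(d)
    if N % 2 == 1:
        return d[(N + 1) // 2 - 1]    # d_{(N+1)/2}, shifted for 0-based indexing
    return (d[N // 2 - 1] + d[N // 2]) / 2   # average of d_{N/2} and d_{(N/2)+1}

# Chem 103 scores (24 students), so N is even.
scores = [50, 70, 80, 60, 70, 10, 80, 60, 70, 80, 80, 60,
          70, 70, 60, 60, 100, 50, 50, 70, 100, 80, 60, 60]

print(median(scores))
```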

Quartiles

After the median, the next most commonly used values are the first and third quartiles. The first quartile (denoted by Q1) is the 25th percentile, and the third quartile (denoted by Q3) is the 75th percentile.

Pg. 549 #34
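The slide defines the quartiles as percentiles but does not say how a percentile is computed. The Python sketch below assumes one common locator convention: compute L = (p/100)·N on the sorted data; if L is a whole number, average the values in positions L and L+1, otherwise round L up and take the value in that position. Other textbooks and software use different conventions and can give slightly different quartiles, so treat this rule, and the helper name `percentile`, as assumptions.

```python
def percentile(data, p):
    """p-th percentile (p a whole number, 0 < p < 100) under the assumed locator rule."""
    d = sorted(data)
    N = len(d)
    if (p * N) % 100 == 0:            # locator L = p*N/100 is a whole number
        L = p * N // 100
        return (d[L - 1] + d[L]) / 2  # average of positions L and L+1 (1-based)
    L = p * N // 100 + 1              # otherwise round the locator up
    return d[L - 1]

# Chem 103 scores (24 students).
scores = [50, 70, 80, 60, 70, 10, 80, 60, 70, 80, 80, 60,
          70, 70, 60, 60, 100, 50, 50, 70, 100, 80, 60, 60]

Q1 = percentile(scores, 25)
Q3 = percentile(scores, 75)
print(f"Q1 = {Q1}, Q3 = {Q3}")
```

Under this convention the 50th percentile of an even-sized data set is the average of the two middle values, which matches the median rule given earlier.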

Box Plots

Invented in 1977 by statistician John Tukey, a box plot (also known as a box-and-whisker plot) is a picture of the five-number summary of a data set. The box plot consists of a rectangular box that sits above a scale and extends from the first quartile Q1 to the third quartile Q3 on that scale. A vertical line crosses the box, indicating the position of the median M. On both sides of the box are “whiskers” extending to the smallest value, Min, and largest value, Max, of the data.

Box Plots

[Figure: a generic box plot for a data set.]

Pg. 549 #42

The Range

Range: the difference between the highest and lowest data values, usually denoted by R.

R = Max – Min

The range of a data set is a useful piece of information when there are no outliers in the data. In the presence of outliers, the range tells a distorted story.
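A small Python sketch shows how much a single extreme value can move the range. It reuses the Chem 103 scores and, purely for illustration, treats the lone score of 10 as an outlier; that labeling is an assumption of this example, not something the slide states.

```python
# Chem 103 scores (24 students).
scores = [50, 70, 80, 60, 70, 10, 80, 60, 70, 80, 80, 60,
          70, 70, 60, 60, 100, 50, 50, 70, 100, 80, 60, 60]

R = max(scores) - min(scores)   # R = Max - Min

# Same computation with the score of 10 set aside (assumed outlier).
trimmed = [s for s in scores if s != 10]
R_trimmed = max(trimmed) - min(trimmed)

print(f"R with the 10: {R}; R without it: {R_trimmed}")
```

One data value out of 24 accounts for nearly half of the range, which is the distortion the slide warns about.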

The Interquartile Range

• Eliminates the possible distortion caused by outliers.
• Denoted by the acronym IQR.
• The difference between the third quartile and the first quartile.
• IQR = Q3 – Q1
• Tells us how spread out the middle 50% of the data values are.

• Find R and IQR for #34
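To see the IQR's resistance to outliers, the Python sketch below computes it for the Chem 103 scores with and without the lone score of 10. The quartiles are found with an assumed locator convention (average positions L and L+1 when L = (p/100)·N is whole, otherwise round L up); the slide does not specify a method, so both that rule and the helper names are assumptions of this example.

```python
def percentile(data, p):
    """p-th percentile (p a whole number, 0 < p < 100) under an assumed locator rule."""
    d = sorted(data)
    N = len(d)
    if (p * N) % 100 == 0:
        L = p * N // 100
        return (d[L - 1] + d[L]) / 2
    return d[p * N // 100]            # locator rounded up, then shifted to 0-based

def iqr(data):
    """IQR = Q3 - Q1."""
    return percentile(data, 75) - percentile(data, 25)

scores = [50, 70, 80, 60, 70, 10, 80, 60, 70, 80, 80, 60,
          70, 70, 60, 60, 100, 50, 50, 70, 100, 80, 60, 60]
trimmed = [s for s in scores if s != 10]   # set the low score aside

print(f"IQR with the 10: {iqr(scores)}; IQR without it: {iqr(trimmed)}")
```

Here the IQR is the same in both cases, because removing one extreme value does not change where the middle 50% of the scores lie.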

Standard Deviation

• The most important and most commonly used measure of spread for a data set.
• The key concept for understanding the standard deviation is the concept of deviation from the mean.
• If A is the average of the data set and x is an arbitrary data value, the difference x – A is x’s deviation from the mean.
• The deviations from the mean tell us how “far” the data values are from the average value of the data.

Standard Deviation

The deviations from the mean are themselves a data set, which we would like to summarize. One way would be to average them, but if we do that, the negative deviations and the positive deviations will always cancel each other out so that we end up with an average of 0. This, of course, makes the average useless in this case. The cancellation of positive and negative deviations can be avoided by squaring each of the deviations.
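The cancellation can be checked in one line of algebra. Writing the data values as d_1, …, d_N and the mean as A (the same symbols used on the earlier slides), the deviations always sum to zero:

```latex
\[
\sum_{i=1}^{N} (d_i - A)
  \;=\; \sum_{i=1}^{N} d_i \;-\; N A
  \;=\; N A - N A
  \;=\; 0,
\qquad \text{since } A = \frac{d_1 + d_2 + \cdots + d_N}{N}.
\]
```

Dividing by N then gives an average deviation of 0 for every data set, which is why the deviations are squared before averaging.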

Standard Deviation

The squared deviations are never negative, and if we average them out, we get an important measure of spread called the variance, denoted by V.

Finally, we take the square root of the variance and get the standard deviation, denoted by the Greek letter σ (and sometimes by the acronym SD).

The following is an outline of the definition of the standard deviation of a data set.

THE STANDARD DEVIATION OF A DATA SET

■ Let A denote the mean of the data set. For each number x in the data set, compute its deviation from the mean (x – A) and square each of these numbers. These numbers are called the squared deviations.

■ Find the average of the squared deviations. This number is called the variance V.

■ The standard deviation is the square root of the variance: $\sigma = \sqrt{V}$.
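The three steps can be sketched in Python as follows. Note that, as on the slide, the variance here is the plain average of the squared deviations (dividing by N), not the N – 1 version used for samples in some other courses. The function names are chosen for this example.

```python
import math

def variance(data):
    """Average of the squared deviations from the mean (divides by N)."""
    N = len(data)
    A = sum(data) / N                                   # mean of the data set
    squared_deviations = [(x - A) ** 2 for x in data]   # (x - A)^2 for each x
    return sum(squared_deviations) / N

def standard_deviation(data):
    """Square root of the variance."""
    return math.sqrt(variance(data))

# Chem 103 scores (24 students).
scores = [50, 70, 80, 60, 70, 10, 80, 60, 70, 80, 80, 60,
          70, 70, 60, 60, 100, 50, 50, 70, 100, 80, 60, 60]

print(f"V = {variance(scores):.2f}, SD = {standard_deviation(scores):.2f}")
```

Python's standard library function `statistics.pstdev` uses the same divide-by-N definition and can serve as a cross-check.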

• Page 551 # 56a,c, 62, 63

• Groups Pg. 545 – 551 # 20, 34c, 56b, 64