1 1 Slide © 2003 South-Western/Thomson Learning TM Chapter 3 Descriptive Statistics: Numerical...


1 Slide

Chapter 3
Descriptive Statistics: Numerical Methods

Measures of Variability
Measures of Relative Location and Detecting Outliers
Exploratory Data Analysis
Measures of Association Between Two Variables

2 Slide

Measures of Variability

It is often desirable to consider measures of variability (dispersion), as well as measures of location.

For example, in choosing supplier A or supplier B we might consider not only the average delivery time for each, but also the variability in delivery time for each.

Range
Interquartile Range
Variance
Standard Deviation
Coefficient of Variation

3 Slide

Measures of Variation

Variation
  Range
  Interquartile Range
  Variance
    Population Variance
    Sample Variance
  Standard Deviation
    Population Standard Deviation
    Sample Standard Deviation
  Coefficient of Variation

4 Slide

Variation

Measures of variation give information on the spread or variability of the data values.

[Figure: two distributions with the same center but different variation]

5 Slide

Range

Simplest measure of variation
Difference between the largest and the smallest observations:

$\text{Range} = x_{\text{maximum}} - x_{\text{minimum}}$

Example:

[Figure: data values plotted on a number line marked 0 to 14]

Range = 14 - 1 = 13

6 Slide

Example: Apartment Rents

Range
Range = largest value - smallest value
Range = 615 - 425 = 190

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
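The range calculation for the rent data can be checked with a few lines of Python. This is a minimal sketch; the names `rents` and `data_range` are illustrative and do not come from the slides.

```python
# Apartment rents from the slide (70 observations, listed in ascending order).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]


def data_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)


print(data_range(rents))  # 190
```

Because the range uses only the two extreme observations, a single unusual value changes it completely.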

7 Slide

Interquartile Range

The interquartile range of a data set is the difference between the third quartile and the first quartile.

It is the range for the middle 50% of the data.

8 Slide

Example: Apartment Rents

Interquartile Range
3rd Quartile (Q3) = 525
1st Quartile (Q1) = 445
Interquartile Range = Q3 - Q1 = 525 - 445 = 80

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
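The slide does not say how the quartile positions are located. The sketch below assumes a common textbook rule: compute i = (p/100)n; if i is not an integer, round up and take the value in that position; if i is an integer, average the values in positions i and i + 1. The function name `percentile_textbook` is illustrative. Statistical libraries often interpolate instead and can return slightly different quartiles.

```python
# Apartment rents from the slide (70 observations).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]


def percentile_textbook(values, p):
    """p-th percentile (integer p, 0 < p < 100) using the assumed position rule."""
    data = sorted(values)
    n = len(data)
    # i = p * n / 100, kept as quotient and remainder to avoid float error.
    whole, remainder = divmod(p * n, 100)
    if remainder:
        # i is not an integer: round up to position whole + 1 (1-based).
        return data[whole]
    # i is an integer: average positions i and i + 1 (1-based).
    return (data[whole - 1] + data[whole]) / 2


q1 = percentile_textbook(rents, 25)
q3 = percentile_textbook(rents, 75)
print(q1, q3, q3 - q1)  # 445 525 80
```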

9 Slide

Variance

The variance is a measure of variability that utilizes all the data.

It is based on the difference between the value of each observation ($x_i$) and the mean ($\bar{x}$ for a sample, $\mu$ for a population).

10 Slide

Variance

The variance is the average of the squared differences between each data value and the mean.

If the data set is a sample, the variance is denoted by $s^2$.

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

If the data set is a population, the variance is denoted by $\sigma^2$.

$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
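The two variance formulas differ only in the divisor: n - 1 for a sample and N for a population. The sketch below applies both to the apartment rent data; the function names are illustrative.

```python
# Apartment rents from the slides (70 observations).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


print(round(sample_variance(rents), 2))      # 2996.16
print(round(population_variance(rents), 2))  # about 2953.36
```

Python's standard `statistics` module offers `variance` (n - 1 divisor) and `pvariance` (N divisor) for the same purpose.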

11 Slide

Variance for Grouped Data

Sample Data

$s^2 = \frac{\sum f_i (X_i - \bar{x})^2}{n - 1}$

Population Data

$\sigma^2 = \frac{\sum f_i (X_i - \mu)^2}{N}$
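The grouped formula weights each squared deviation by its frequency $f_i$. The slides give no grouped example, so the sketch below makes an assumption for illustration: each distinct apartment rent is treated as its own class, with $X_i$ equal to that rent and $f_i$ equal to its count. With real class intervals, $X_i$ would typically be the class midpoint and the result only an approximation. The function name is hypothetical.

```python
from collections import Counter

# Apartment rents from the slides (70 observations).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]


def grouped_sample_variance(freq):
    """Sample variance from a mapping {X_i: f_i}."""
    n = sum(freq.values())
    mean = sum(f * x for x, f in freq.items()) / n
    return sum(f * (x - mean) ** 2 for x, f in freq.items()) / (n - 1)


# Assumption: one class per distinct rent value.
freq = Counter(rents)
print(freq[450])                                # 7 apartments rent for 450
print(round(grouped_sample_variance(freq), 2))  # 2996.16
```

Because every class here holds a single value, the grouped result matches the ungrouped sample variance exactly.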

12 Slide

Standard Deviation

Most commonly used measure of variation
Shows variation about the mean
The standard deviation of a data set is the positive square root of the variance.

If the data set is a sample, the standard deviation is denoted $s$.

$s = \sqrt{s^2}$

If the data set is a population, the standard deviation is denoted $\sigma$ (sigma).

$\sigma = \sqrt{\sigma^2}$

13 Slide

Calculation Example: Sample Standard Deviation

Sample Data ($X_i$): 10 12 14 15 17 18 18 24

n = 8, Mean = $\bar{x}$ = 16

$s = \sqrt{\frac{(10 - \bar{x})^2 + (12 - \bar{x})^2 + (14 - \bar{x})^2 + \cdots + (24 - \bar{x})^2}{n - 1}}$

$= \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}}$

$= \sqrt{\frac{130}{7}} = 4.3095$
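The calculation for the eight sample values can be reproduced step by step. This is a minimal sketch with illustrative names. The squared deviations from the mean of 16 are 36, 16, 4, 1, 1, 4, 4 and 64, which sum to 130.

```python
import math

# Sample data from the slide.
sample = [10, 12, 14, 15, 17, 18, 18, 24]


def sample_std(values):
    """Positive square root of the sample variance (n - 1 divisor)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (n - 1))


mean = sum(sample) / len(sample)
sum_sq = sum((x - mean) ** 2 for x in sample)
print(mean, sum_sq)                 # 16.0 130.0
print(round(sample_std(sample), 4))  # 4.3095
```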

14 Slide

Coefficient of Variation

Measures relative variation
Always in percentage (%)
Shows variation relative to mean
Is used to compare two or more sets of data measured in different units

Sample: $CV = \left(\frac{s}{\bar{x}}\right) \cdot 100\%$

Population: $CV = \left(\frac{\sigma}{\mu}\right) \cdot 100\%$

15 Slide

Example: Apartment Rents

Variance

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = 2{,}996.16$

Standard Deviation

$s = \sqrt{s^2} = \sqrt{2996.16} = 54.74$

Coefficient of Variation

$\frac{s}{\bar{x}} \times 100 = \frac{54.74}{490.80} \times 100 = 11.15$
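The three results for the apartment rents follow from one another, so they can be computed in sequence. The sketch below is one way to do it, with illustrative names. It treats the 70 rents as a sample, so the variance uses the n - 1 divisor.

```python
import math

# Apartment rents from the slides (70 observations).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the sample mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance) / mean * 100


mean = sum(rents) / len(rents)
cv = coefficient_of_variation(rents)
print(round(mean, 2))  # 490.8
print(round(cv, 2))    # 11.15
```

Because the coefficient of variation is unit-free, it would be the same whether the rents were recorded in dollars or in cents.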

16 Slide

Measures of Relative Location and Detecting Outliers

z-Scores
Detecting Outliers

17 Slide

z-Scores

The z-score is often called the standardized value.

It denotes the number of standard deviations a data value $x_i$ is from the mean.

$z_i = \frac{x_i - \bar{x}}{s}$

A data value less than the sample mean will have a z-score less than zero.

A data value greater than the sample mean will have a z-score greater than zero.

A data value equal to the sample mean will have a z-score of zero.

18 Slide

Example: Apartment Rents

z-Score of Smallest Value (425)

$z = \frac{x_i - \bar{x}}{s} = \frac{425 - 490.80}{54.74} = -1.20$

Standardized Values for Apartment Rents

-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
-0.20 -0.11 -0.01 -0.01 -0.01  0.17  0.17  0.17  0.17  0.35
 0.35  0.44  0.62  0.62  0.62  0.81  1.06  1.08  1.45  1.45
 1.54  1.54  1.63  1.81  1.99  1.99  1.99  1.99  2.27  2.27
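The standardized values can be generated directly from the rent data. The sketch below is a minimal version with illustrative names. It uses the sample mean and the sample standard deviation (n - 1 divisor), as in the z-score formula.

```python
import math

# Apartment rents from the slides (70 observations, ascending).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]


def z_scores(values):
    """Return (x - mean) / s for every value, with s the sample std. dev."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return [(x - mean) / s for x in values]


z = z_scores(rents)
print(round(z[0], 2), round(z[-1], 2))  # -1.2 2.27
```

Standardized values always have mean 0 and sample standard deviation 1, which is what makes them comparable across data sets.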

19 Slide

Detecting Outliers

An outlier is an unusually small or unusually large value in a data set.

A data value with a z-score less than -3 or greater than +3 might be considered an outlier.

It might be an incorrectly recorded data value.
It might be a data value that was incorrectly included in the data set.

20 Slide

Example: Apartment Rents

Detecting Outliers

The most extreme z-scores are -1.20 and 2.27.
Using |z| > 3 as the criterion for an outlier, there are no outliers in this data set.

Standardized Values for Apartment Rents

-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
-0.20 -0.11 -0.01 -0.01 -0.01  0.17  0.17  0.17  0.17  0.35
 0.35  0.44  0.62  0.62  0.62  0.81  1.06  1.08  1.45  1.45
 1.54  1.54  1.63  1.81  1.99  1.99  1.99  1.99  2.27  2.27
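The |z| > 3 criterion translates into a simple filter. The sketch below is one possible implementation; the function name `flag_outliers` and the `threshold` parameter are illustrative, and the code assumes the data are not all identical (otherwise the standard deviation is zero).

```python
import math

# Apartment rents from the slides (70 observations).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]


def flag_outliers(values, threshold=3.0):
    """Return (value, z) pairs whose |z| exceeds the threshold."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    flagged = []
    for x in values:
        z = (x - mean) / s
        if abs(z) > threshold:
            flagged.append((x, z))
    return flagged


print(flag_outliers(rents))  # [] (no rent is more than 3 std. devs. from the mean)
```

A flagged value is only a candidate for review; whether it is a recording error or a legitimate extreme observation has to be decided by checking the data source.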

21 Slide

Exploratory Data Analysis

Five-Number Summary

22 Slide

Five-Number Summary

Smallest Value
First Quartile
Median
Third Quartile
Largest Value

23 Slide

Example: Apartment Rents

Five-Number Summary

Lowest Value = 425
First Quartile = 445
Median = 475
Third Quartile = 525
Largest Value = 615

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
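A five-number summary can be assembled from the minimum, the maximum and three percentiles. The sketch below assumes the common textbook position rule (i = (p/100)n; round up if i is not an integer, otherwise average positions i and i + 1), which the slides do not state explicitly. Under that rule the first quartile of the 70 rents is the 18th ordered value. The function names are illustrative.

```python
# Apartment rents from the slides (70 observations).
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]


def percentile_textbook(data, p):
    """p-th percentile of sorted data (integer p, 0 < p < 100), assumed rule."""
    whole, remainder = divmod(p * len(data), 100)
    if remainder:
        return data[whole]  # round the position up
    return (data[whole - 1] + data[whole]) / 2  # average two positions


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest)."""
    data = sorted(values)
    return (
        data[0],
        percentile_textbook(data, 25),
        percentile_textbook(data, 50),
        percentile_textbook(data, 75),
        data[-1],
    )


print(five_number_summary(rents))  # (425, 445, 475.0, 525, 615)
```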

24 Slide

Measures of Association between Two Variables

Covariance
Correlation Coefficient

25 Slide

Covariance

The covariance is a measure of the linear association between two variables.

Positive values indicate a positive relationship.
Negative values indicate a negative relationship.

26 Slide

Covariance

If the data sets are samples, the covariance is denoted by $s_{xy}$.

$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

If the data sets are populations, the covariance is denoted by $\sigma_{xy}$.

$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$
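The slides give no paired data for the covariance, so the sketch below uses a small made-up data set purely for illustration: the values in `x` and `y` are hypothetical, as are the function names. It shows that the sample and population versions share the same numerator and differ only in the divisor.

```python
# Hypothetical paired observations (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def cross_deviation_sum(xs, ys):
    """Sum of (x_i - mean_x) * (y_i - mean_y) over all pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))


def sample_covariance(xs, ys):
    """Cross-deviation sum divided by n - 1."""
    return cross_deviation_sum(xs, ys) / (len(xs) - 1)


def population_covariance(xs, ys):
    """Cross-deviation sum divided by N."""
    return cross_deviation_sum(xs, ys) / len(xs)


print(sample_covariance(x, y))      # 1.5
print(population_covariance(x, y))  # 1.2
```

The positive result reflects that larger x values tend to be paired with larger y values in this toy data. The magnitude depends on the units of both variables, so it does not by itself indicate how strong the relationship is.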

27 Slide

Correlation Coefficient

The coefficient can take on values between -1 and +1.

Values near -1 indicate a strong negative linear relationship.

Values near +1 indicate a strong positive linear relationship.

If the data sets are samples, the coefficient is $r_{xy}$.

$r_{xy} = \frac{s_{xy}}{s_x s_y}$

If the data sets are populations, the coefficient is $\rho_{xy}$.

$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$
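The sample correlation coefficient divides the sample covariance by the product of the two sample standard deviations. The sketch below uses a small made-up data set for illustration only: the values in `x` and `y` and the function name are hypothetical and do not come from the slides.

```python
import math

# Hypothetical paired observations (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def sample_correlation(xs, ys):
    """r_xy = s_xy / (s_x * s_y), all with the n - 1 divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((a - mean_x) ** 2 for a in xs) / (n - 1))
    s_y = math.sqrt(sum((b - mean_y) ** 2 for b in ys) / (n - 1))
    return s_xy / (s_x * s_y)


print(round(sample_correlation(x, y), 4))                    # 0.7746
print(round(sample_correlation(x, [2 * v for v in x]), 4))   # 1.0 (exact linear relationship)
```

Unlike the covariance, the coefficient is unit-free, so rescaling either variable by a positive constant leaves it unchanged.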