Measures of Dispersion

• Here are two sets to look at

• A = {1,2,3,4,5,6,7}

• B = {8,9,10,11,12,13,14}

• Do you expect the sets to have the same means?

• Median?

• Mode?

• What if they did?

• Let's look at three other sets:

• C = {11,12,13,14,15}

• D = {5,9,13,17,21}

• E = {1,7,13,19,25}

• Find the mean, median and mode

• C

• D

• E

• Yet the sets are different. How?

Set                      Mean   Median   Mode

C = {11,12,13,14,15}     13     13       none

D = {5,9,13,17,21}       13     13       none

E = {1,7,13,19,25}       13     13       none
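
The values in the table can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the helper `mode_or_none` and the variable names are chosen here for illustration and are not part of the slides. Because every value in each set occurs exactly once, no value stands out as a mode, which the sketch reports as `None`.

```python
from collections import Counter
from statistics import mean, median

# The three sets from the slides.
sets = {
    "C": [11, 12, 13, 14, 15],
    "D": [5, 9, 13, 17, 21],
    "E": [1, 7, 13, 19, 25],
}

def mode_or_none(values):
    """Return the single most frequent value, or None if no unique mode exists."""
    counts = Counter(values).most_common()
    top_count = counts[0][1]
    winners = [value for value, count in counts if count == top_count]
    return winners[0] if len(winners) == 1 else None

# Map each set name to (mean, median, mode).
summary = {
    name: (mean(values), median(values), mode_or_none(values))
    for name, values in sets.items()
}

for name, (m, med, mo) in summary.items():
    print(f"{name}: mean={m}, median={med}, mode={mo}")
```

All three sets report the same mean and median, which is why these measures alone cannot tell the sets apart.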

[Dot plot: sets C, D and E plotted on a number line from 1 to 25]

Set C values are close together

Set D values are spread out more from the center

Set E values are farthest from the center

Conclusion: Mean, median and mode are blind to how the data is spread out.

So let's look at the range: largest value − smallest value

Range_C = 4   Range_D = 16   Range_E = 24

We can see that the larger the range, the greater the spread between the largest and smallest values.

Weakness: The range looks only at the two extreme values and ignores all the others, so it tells us nothing about how the remaining data values are spread.
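
A short sketch of the range calculation, and of its weakness, follows. The function name `value_range` and the set `clustered` are assumptions made for illustration; `clustered` is a hypothetical set that does not appear in the slides.

```python
def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)

# The three sets from the slides.
sets = {
    "C": [11, 12, 13, 14, 15],
    "D": [5, 9, 13, 17, 21],
    "E": [1, 7, 13, 19, 25],
}

ranges = {name: value_range(values) for name, values in sets.items()}
print(ranges)  # {'C': 4, 'D': 16, 'E': 24}

# Hypothetical set (not from the slides): same extremes as E, but the
# middle values all sit at the center.
clustered = [1, 13, 13, 13, 25]
print(value_range(clustered) == ranges["E"])  # True
```

The hypothetical set has the same range as E even though most of its values sit at the center, so the range by itself cannot separate the two.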

Here are three sets, all with the same range but very different dispersion of the data values.

[Dot plot: sets R, S and D plotted on a number line from 1 to 25]

So we should be looking at all the values and their relationship to the center of the data. We will use the mean as the measure of center for this comparison.

We call this looking at the spread of the data. The statistical terms are variance and standard deviation.

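Formula for Standard Deviation

$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{n}}$ (population)

$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$ (sample)

Variance is $\sigma^2$ or $s^2$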
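
The two standard deviation formulas (population, dividing by n, and sample, dividing by n − 1) can be written out directly. This is a minimal sketch; the function names `population_sd` and `sample_sd` are chosen here for illustration, and the results are compared against `pstdev` and `stdev` from Python's standard `statistics` module.

```python
from math import sqrt
import statistics

def population_sd(values):
    """sigma = sqrt( sum((x - mu)^2) / n )"""
    n = len(values)
    mu = sum(values) / n
    return sqrt(sum((x - mu) ** 2 for x in values) / n)

def sample_sd(values):
    """s = sqrt( sum((x - x_bar)^2) / (n - 1) )"""
    n = len(values)
    x_bar = sum(values) / n
    return sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))

data = [11, 12, 13, 14, 15]  # set C from the slides

# Each pair should agree: hand-written formula vs. standard library.
print(population_sd(data), statistics.pstdev(data))
print(sample_sd(data), statistics.stdev(data))
```

The only difference between the two functions is the divisor, so the sample value is always slightly larger than the population value for the same data.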

Finding Standard Deviation

C     Deviation (value − mean)     Deviation²

11

12

13

14

15

Finding Standard Deviation

D     Deviation (value − mean)     Deviation²

5

9

13

17

21

$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{n}}$

Finding Standard Deviation

E     Deviation (value − mean)     Deviation²

1

7

13

19

25

$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{n}}$

• Set C standard deviation is $\sqrt{2} \approx 1.41$

• Set D standard deviation is $\sqrt{32} \approx 5.66$

• Set E standard deviation is $\sqrt{72} \approx 8.49$

• We can see that the larger the standard deviation is, the more spread out the population is
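
The deviation tables for sets C, D and E can be filled in step by step. The sketch below treats each set as a whole population (dividing by n); the variable names and the print layout are illustrative choices, not part of the slides.

```python
from math import sqrt

# The three sets from the slides.
sets = {
    "C": [11, 12, 13, 14, 15],
    "D": [5, 9, 13, 17, 21],
    "E": [1, 7, 13, 19, 25],
}

results = {}
for name, values in sets.items():
    mu = sum(values) / len(values)
    print(f"{name}   deviation   deviation^2")
    total = 0.0
    for x in values:
        dev = x - mu                  # value - mean
        total += dev ** 2             # running sum of squared deviations
        print(f"{x:>3} {dev:>8.0f} {dev ** 2:>10.0f}")
    variance = total / len(values)    # population variance
    results[name] = sqrt(variance)    # population standard deviation
    print(f"sum of squares = {total:.0f}, variance = {variance:.0f}, "
          f"sigma = {results[name]:.2f}\n")
```

The sums of squared deviations come out to 10, 160 and 360, giving variances of 2, 32 and 72.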

Formula for Standard Deviation

Population STD: $\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{n}}$

Sample STD: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$

Variance is $\sigma^2$ or $s^2$

Empirical Rule

• The standard deviation is very useful for estimating probabilities: 68 – 95 – 99.7

In a normal distribution (bell-shaped curve), the 68-95-99.7 rule applies, where μ is the mean of the data and σ the standard deviation:

• 68% of the data falls within 1σ of the mean

• 95% of the data falls within 2σ of the mean

• 99.7% of the data falls within 3σ of the mean

Empirical Rule

[Figure: bell curve with the horizontal axis marked at −3σ, −2σ, −1σ, μ, σ, 2σ, 3σ]
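
The 68-95-99.7 percentages can be checked against the normal curve itself. This is a small sketch using `NormalDist` from Python's standard `statistics` module (available from Python 3.8); the helper name `fraction_within` is an assumption made for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within = {k: fraction_within(k) for k in (1, 2, 3)}
for k, p in within.items():
    print(f"within {k} sigma: {p:.1%}")
```

The exact fractions are close to 68.3%, 95.4% and 99.7%, which the rule rounds to 68, 95 and 99.7.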

Below are the heights of 149 females at a local college. Does this data seem to have a symmetrical mound shape?

Mean ≈ 66.4 & SD ≈ 2

Applying the empirical rule, what % of females at the college are between 62.4 inches and 70.4 inches?

Between what two heights are 68% of the females at the college?

What percent of the females at the college are above 68.4 inches?
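
One way to work through these three questions is to express each height as a number of standard deviations from the mean. The sketch below assumes the heights are roughly normal, so that the empirical rule applies; the names `RULE` and `interval` are illustrative.

```python
mean_height = 66.4  # inches, from the slide
sd_height = 2.0     # inches, from the slide

# Approximate percentages from the 68-95-99.7 rule.
RULE = {1: 68.0, 2: 95.0, 3: 99.7}

def interval(k):
    """Heights within k standard deviations of the mean, rounded to 0.1 inch."""
    low = round(mean_height - k * sd_height, 1)
    high = round(mean_height + k * sd_height, 1)
    return (low, high)

# 62.4 to 70.4 inches is the mean plus or minus 2 standard deviations.
print(interval(2), RULE[2])

# The middle 68% lies within 1 standard deviation of the mean.
print(interval(1), RULE[1])

# 68.4 inches is 1 standard deviation above the mean; by symmetry,
# half of the remaining 32% lies above it.
percent_above = (100 - RULE[1]) / 2
print(percent_above)
```

Under these assumptions, about 95% of heights fall between 62.4 and 70.4 inches, 68% fall between 64.4 and 68.4 inches, and about 16% are above 68.4 inches.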

The following data represents the travel time in minutes to school for nine students enrolled in college.

• Amanda 39

• Amber 21

• Tim 9

• Mike 32

• Nicole 30

• Scot 45

• Erica 11

• Tiffany 12

• Glenn 39

Find the population mean and standard deviation.

Find 3 random samples of size 3 to estimate the mean and standard deviation for this population.


Sample 1: mean STD

Sample 2: mean STD

Sample 3: mean STD

Which samples overestimated or underestimated the population parameters?
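
The travel-time exercise can be sketched as follows. The population values come from the slide; the random seed and the use of `random.Random.sample` to draw the samples are assumptions made so the run is repeatable, and a different seed would draw different samples.

```python
import random
from statistics import mean, pstdev, stdev

# Travel times in minutes, from the slide.
travel_times = {
    "Amanda": 39, "Amber": 21, "Tim": 9, "Mike": 32, "Nicole": 30,
    "Scot": 45, "Erica": 11, "Tiffany": 12, "Glenn": 39,
}
population = list(travel_times.values())

# Population parameters: divide by n.
mu = mean(population)
sigma = pstdev(population)
print(f"population: mean = {mu:.1f}, SD = {sigma:.1f}")

rng = random.Random(1)  # arbitrary seed, assumed here for repeatability
samples = []
for i in range(1, 4):
    sample = rng.sample(population, 3)       # 3 students, without replacement
    x_bar, s = mean(sample), stdev(sample)   # sample statistics: divide by n - 1
    samples.append((sample, x_bar, s))
    mean_note = "over" if x_bar > mu else "under"
    sd_note = "over" if s > sigma else "under"
    print(f"sample {i}: {sample} mean = {x_bar:.1f} ({mean_note}), "
          f"STD = {s:.1f} ({sd_note})")
```

The population mean is about 26.4 minutes and the population standard deviation about 12.8 minutes; each sample's statistics are then labelled as overestimating or underestimating those parameters.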

END

Go to Skewness PowerPoint
