Min Mod Median

Post on 20-Nov-2014

33 views 1 download

Tags:

Transcript of Min Mod Median

1

Basic Business Statistics: Concepts & Applications

Activity 4+ 5 + 6Descriptive Statistics

and Graphical Analysis   

2

Learning Objectives

• Measures of Center and Location: Mean (Arithmetic Average,Weighted Mean, Geometric Mean), Mode, Median

• Other measures of Location: Percentiles, Quartiles

• Measures of Variation: Range, Interquartile range, Variance and Standard deviation, Coefficient of variation

3

Contents

• 1. Measures of Central Tendency

• 1.1. Mean (Arithmetic Mean)

• 1.2. Mode

• 1.3. Median

• 1.4. Shape of a Distribution

• 2. Other Location Measures: Percentiles and Quartiles

• 2.1. Quartiles

• 2.2. Box-and-Whisker Plot

• 2.3. Distribution Shape and Box-and-Whisker Plot

4

Contents (continued)

• 3. Measures of Variation • 3.1. Range• 3.2. Interquartile Range• 3.3. Variance• 3.4. Standard Deviation• 3.5. Coefficient of Variation• 3.6. The Empirical Rule and Tchebysheff’s Theorem• 3.7. Tchebysheff’s Theorem• Using Microsoft Excel• Exercices

5

Summary Measures

Center and Location

Mean

Median

Mode

Other Measures of Location

Weighted Mean

Describing Data Numerically

Variation

Variance

Standard Deviation

Range

Percentiles

Interquartile Range

Quartiles

6

1. Measures of Central Tendency

Center and Location

Mean Median Mode Weighted Mean

N

x

n

xx

N

ii

n

ii

1

1

i

iiW

i

iiW

w

xw

w

xwX

7

1.1. Mean (Arithmetic Mean)

Mean (arithmetic mean) of data values

Sample mean

Population mean

1 1 2

n

ii n

XX X X

Xn n

1 1 2

N

ii N

XX X X

N N

Sample Size

Population Size

8

1.1 Mean (continued) (Weighted Mean)

• Used when values are grouped by frequency or relative importance

28

87

126

45

FrequencyDays to Complete

28

87

126

45

FrequencyDays to Complete

Example: Sample of 26 Repair Projects Weighted Mean Days

to Complete:

days 6.31 26

164

28124

8)(27)(86)(125)(4

w

xwX

i

iiW

9

1.2. Mode• A measure of central tendency• Value that occurs most often• Not affected by extreme values• Used for either numerical or categorical data• There may be no mode or several modes

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

Mode = 9

0 1 2 3 4 5 6

No Mode

10

1.3. Median• Robust measure of central tendency• Not affected by extreme values

• In an ordered array, the median is the “middle” numberIf n or N is odd, the median is the middle numberIf n or N is even, the median is the average of the two

middle numbers

0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 12 14

Median = 5 Median = 5

11

1.4. Shape of a Distribution

• Describes how data is distributed

• Symmetric or skewed

Mean = Median = ModeMean < Median < Mode Mode < Median < Mean

Right-SkewedLeft-Skewed Symmetric

(Longer tail extends to left) (Longer tail extends to right)

12

2. Other Location Measures

Other Measures of Location

Percentiles Quartiles

1st quartile = 25th percentile

2nd quartile = 50th percentile

= median

3rd quartile = 75th percentile

The pth percentile in a data array:

p% are less than or equal to this value

(100 – p)% are greater than or equal to this value

(where 0 ≤ p ≤ 100)

13

2.1. Quartiles

• Split Ordered Data into 4 Quarters

• Position of ith Quartile

25% 25% 25% 25%

1Q 2Q 3Q

1

4i

i nQ

14

2.1. Quartiles (continued)

Data in Ordered Array: 11 12 13 16 16 17 17 18 21

11Position of 2.

1 9 1 12 13

4 25 12.5Q Q

= Median = 162Q Q3 = 17,5

15

2.2. Box-and-Whisker Plot

Graphical Display of Data Using 5-Number Summary:

Median

4 6 8 10 12

Q3Q1 XlargestXsmallest

16

2.3. Distribution Shape and Box-and-Whisker Plot

Right-SkewedLeft-Skewed Symmetric

1Q 1Q 1Q2Q 2Q 2Q3Q 3Q3Q

Refer to summary on Exhibit 3.4 (p.114), Fig. 3.13 (p.115)

17

3. Measures of Variation

Variation

Variance Standard Deviation Coefficient of Variation

PopulationVariance

Sample

Variance

PopulationStandardDeviationSample

Standard

Deviation

Range

Interquartile Range

18

3.1. Range

• Measure of variation

• Difference between the largest and the smallest observations:

• Ignore the way in which data are distributed

Largest SmallestRange X X

7 8 9 10 11 12

Range = 12 - 7 = 5

7 8 9 10 11 12

Range = 12 - 7 = 5

19

3.2. Interquartile Range

• Interquartile range = 3rd quartile – 1st quartile

3 1Interquartile Range 17.5 12.5 5Q Q

Data in Ordered Array: 11 12 13 16 16 17 17 18 21

Median(Q2)

XmaximumX

minimum Q1 Q3

Example:

25% 25% 25% 25%

12 30 45 57 70

Interquartile range = 57 – 30 = 27

20

2

2 1

N

ii

X

N

• Important measure of variation

• Shows variation about the mean

Sample variance:

Population variance:

2

2 1

1

n

ii

X XS

n

3.3. Variance

21

3.4. Standard Deviation

• Most important measure of variation

• Shows variation about the mean

Sample standard deviation:

Population standard deviation:

2

1

1

n

ii

X XS

n

2

1

N

ii

X

N

22

Comparing Standard Deviations

Mean = 15.5s = 3.33811 12 13 14 15 16 17 18 19 20 21

11 12 13 14 15 16 17 18 19 20 21

Data B

Data A

Mean = 15.5

s = .9258

11 12 13 14 15 16 17 18 19 20 21

Mean = 15.5

s = 4.57

Data C

23

3.5. Coefficient of Variation

• Measure of Relative Dispersion

• Always in %

• Shows Variation Relative to Mean

• Used to Compare 2 or More Groups

• Formula:

100%μ

σCV

Population Sample

100%x

sCV

24

3.6. The Empirical Rule and Tchebysheff’s Theorem

• 3.6.1. The Empirical Rule

◘ contains about 68% of the values in the population or the sample

1σμ

μ

68%

1σμ

25

3.6.1. The Empirical Rule (continued)

3σμ

99.7%95%

2σμ

26

3.7. Tchebysheff’s Theorem

• Regardless of how the data are distributed, at least (1 - 1/k2) of the values will fall within k standard deviations of the mean

• Examples:

(1 - 1/12) = 0% ……..... k=1 (μ ± 1σ)

(1 - 1/22) = 75% …........ k=2 (μ ± 2σ)

(1 - 1/32) = 89% ……… k=3 (μ ± 3σ)

27

Using Microsoft Excel

• Use menu choice:

tools / data analysis / descriptive statistics

• Enter details in dialog box

28

Using Microsoft Excel (continued)

Use menu choice:

tools / data analysis /

descriptive statistics

29

Using Microsoft Excel (continued)

Enter dialog box details

Check box for summary statistics

Click OK

30

Using Microsoft Excel (continued)

Microsoft Excel

descriptive statistics output,

using the house price data:

House Prices:

$2,000,000500,000300,000100,000100,000

31

Froblem for Activity 6

• The following data represent the waiting time in minutes (definded as the time the customer enters the line until he or she reaches the teller window) of 15 customers at two bank branches.

32

Froblem for Activity 6 (continued)Bank 1 Bank 2

4.21

5.55

3.02

5.13

4.77

2.34

3.54

3.2

4.5

6.1

0.38

5.12

6.46

6.19

3.79

9.66

5.90

8.02

5.79

8.73

3.82

8.01

8.35

10.49

6.68

5.64

4.08

6.17

9.91

5.47

33

Froblem for Activity 6 (continued)

• For each of the waiting time at the two bank branches :

1. Using Exel: List the five-number summary (Xmin – First quartile – Median – Third quartile – Xmax)

2. Using SPSS for window: Compute the range, interquartile range, variance, standard deviation, coefficient of variation.