Probability and Statistics

Post on 22-Jan-2016

description

Probability and Statistics. 1.3 The Normal Distributions. Density Curve. A density curve is a smooth function meant to approximate a histogram. The area under a density curve is one. - PowerPoint PPT Presentation

Transcript of Probability and Statistics

Probability and Statistics

1.3 The Normal Distributions

Density Curve

A density curve is a smooth function meant to approximate a histogram.

The area under a density curve is one.

Since the density curve represents the entire distribution, the area under the curve on any interval represents the proportion of observations in that interval.
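As a rough illustration of the two area statements above, the sketch below integrates a density curve numerically. The density f(x) = 2x on [0, 1] and the helper names `density` and `area` are assumptions chosen for the example; they do not come from the slides.

```python
def density(x):
    """Hypothetical density curve: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def area(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


# Area under the entire curve: should be very close to one.
total = area(density, 0.0, 1.0)

# Area on the interval [0, 0.5]: the proportion of observations in that interval.
proportion = area(density, 0.0, 0.5)

print(f"total area: {total:.4f}")
print(f"proportion between 0 and 0.5: {proportion:.4f}")
```

For this particular density, a quarter of the observations fall in the lower half of the range, which is what the area on [0, 0.5] reports.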

Density Curves: Properties

The mean of a density curve is the point at which the curve would balance.

The median of a density curve is the equal-areas point. In other words, the areas under the curve on either side of the median are equal.

For symmetric density curves, the balance point (mean) and the equal-areas point (median) are the same.
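The balance point and the equal-areas point can be located numerically. The sketch below is one possible way to do it, using two hypothetical densities on [0, 1] that are not from the slides: a symmetric one (uniform) and a left-skewed one (f(x) = 2x). The helper names are likewise illustrative.

```python
def integrate(f, a, b, steps=2_000):
    """Midpoint-rule approximation of the area under f between a and b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def mean_of(f, a, b):
    """Balance point: the integral of x * f(x) over [a, b]."""
    return integrate(lambda x: x * f(x), a, b)


def median_of(f, a, b, tol=1e-6):
    """Equal-areas point: bisect until the area to the left is one half."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if integrate(f, a, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def uniform(x):
    """Symmetric density on [0, 1] (assumed for illustration)."""
    return 1.0


def ramp(x):
    """Left-skewed density on [0, 1] (assumed for illustration)."""
    return 2.0 * x


for name, f in (("uniform", uniform), ("ramp", ramp)):
    print(f"{name}: mean={mean_of(f, 0, 1):.3f}, median={median_of(f, 0, 1):.3f}")
```

For the symmetric density the two points coincide; for the skewed one the mean is pulled toward the long tail and lands below the median.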

Definitions

Symmetric: Data is symmetric if the left half of its histogram (or density curve) is roughly a mirror image of its right half.

Skewed: Data is skewed if its histogram (or density curve) is not symmetric and extends more to one side than the other.

Skewness

Symmetric: Mode = Mean = Median

Skewed left (negatively): the tail extends to the left, and typically Mean < Median < Mode

Skewed right (positively): the tail extends to the right, and typically Mode < Median < Mean

Characterization

A normal distribution is bell-shaped and symmetric.

The distribution is determined by its mean, mu (μ), and its standard deviation, sigma (σ).

The mean controls the center and the standard deviation controls the spread.

Note: Two normal density curves can have the same mean but different standard deviations; the one with the larger standard deviation is wider and flatter.

68-95-99.7 Rule

For any normal curve with mean μ and standard deviation σ:

About 68 percent of the observations fall within one standard deviation of the mean. (μ – 1σ < x < μ + 1σ)

About 95 percent of the observations fall within two standard deviations of the mean. (μ – 2σ < x < μ + 2σ)

About 99.7 percent of the observations fall within three standard deviations of the mean. (μ – 3σ < x < μ + 3σ)
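The three percentages can be checked against the normal curve itself. This is a small sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `proportion_within` is made up for the example.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
# The proportions below are the same for any mean and standard deviation.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {proportion_within(k):.4f}")
```

The computed proportions are slightly different from the round figures 68, 95, and 99.7, which is why the rule is best read as an approximation.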

Waiting Times of Bank Customers at Different Banks (in minutes)

Jefferson Valley Bank: 6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7

Bank of Providence: 4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0

            Jefferson Valley Bank    Bank of Providence
Mean        7.15                     7.15
Median      7.20                     7.20
Mode        7.7                      7.7
Midrange    7.10                     7.10
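The four measures of center for the two banks can be reproduced with a few lines of Python. The sketch below uses the waiting times from the slide; the variable and function names (`jefferson_valley`, `providence`, `midrange`, `centers`) are chosen for the example.

```python
from statistics import mean, median, mode

# Waiting times in minutes, as listed on the slide.
jefferson_valley = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def midrange(values):
    """Midpoint between the lowest and highest value."""
    return (min(values) + max(values)) / 2


def centers(values):
    """Mean, median, mode, and midrange, rounded to two decimal places."""
    return {
        "mean": round(mean(values), 2),
        "median": round(median(values), 2),
        "mode": round(mode(values), 2),
        "midrange": round(midrange(values), 2),
    }


print("Jefferson Valley:", centers(jefferson_valley))
print("Providence:      ", centers(providence))
```

Both banks produce the same four numbers, so none of these measures of center can distinguish the two data sets; a measure of spread is needed for that.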

What is the standard deviation of the data from Jefferson Valley Bank? From Bank of Providence?

Dotplots of Waiting Times

Visually, which one has the greater spread?

Measures of Variation

Range: highest value – lowest value

Standard Deviation: a measure of variation of the scores about the mean (roughly, the average deviation from the mean)

Sample Standard Deviation Formula

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

Calculators can compute the sample standard deviation of data.
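To make the sample standard deviation formula concrete, the sketch below applies it step by step to the two sets of bank waiting times and compares the result with `statistics.stdev` from the Python standard library. The function name `sample_standard_deviation` is an assumption for this example.

```python
from math import sqrt
from statistics import stdev

# Waiting times in minutes, as listed on the slide.
jefferson_valley = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def sample_standard_deviation(values):
    """s = sqrt( sum((x - x_bar)^2) / (n - 1) )."""
    n = len(values)
    x_bar = sum(values) / n                                  # sample mean
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return sqrt(squared_deviations / (n - 1))                # divide by n - 1, then take the root


s_jv = sample_standard_deviation(jefferson_valley)
s_bp = sample_standard_deviation(providence)

print(f"Jefferson Valley: s = {s_jv:.2f} (library: {stdev(jefferson_valley):.2f})")
print(f"Providence:       s = {s_bp:.2f} (library: {stdev(providence):.2f})")
```

Although the two banks share the same mean, the Bank of Providence standard deviation comes out several times larger, which matches the wider spread of its waiting times.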

Symbols for Standard Deviation

                                  Sample     Population
Textbook                          s          σ
Some graphics calculators         Sx         σx
Some non-graphics calculators     xσn-1      xσn

Articles in professional journals and reports often use SD for standard deviation and VAR for variance.

Understanding Standard Deviation

Spot the Jack Russell weighs 19 pounds. The mean weight for a Jack Russell Terrier is 16 pounds with a standard deviation of 1.5 pounds. Desdi the Maine Coon cat also weighs 19 pounds and frequently kicks Spot’s butt around the house. The mean weight for a Maine Coon is 17 pounds with a standard deviation of 0.75 pounds. Which animal is more in need of a diet?

Understanding Standard Deviation

The only way to compare values measured on different scales is to standardize their deviations from the means. In other words, we first convert each value into common units: standard deviations from its respective mean. Then we can compare the values directly. This is done by computing a z-score:

$$z = \frac{y - \bar{y}}{s}$$

where y is the value of interest, ȳ is the mean of the data, and s is the standard deviation of the data.

The z-score

represents the number of standard deviations a given value in the data lies from the mean

is unitless, because the deviation y – ȳ and the standard deviation s are measured in the same units, which cancel

Understanding Standard Deviation

Recall that Spot weighs 19 pounds against a breed mean of 16 pounds (standard deviation 1.5 pounds), and Desdi weighs 19 pounds against a breed mean of 17 pounds (standard deviation 0.75 pounds). Which animal is more in need of a diet?

z-score for Spot:

$$z = \frac{19 - 16}{1.5} = 2$$

z-score for Desdi:

$$z = \frac{19 - 17}{0.75} \approx 2.67$$

Desdi is farther from the mean weight of her breed than Spot is from the mean weight of his.

What can you say about the spread of weights for the two breeds?

Can you think of any extraneous factor that could explain Desdi’s weight other than being overweight?
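The two z-scores can be computed with a one-line function. This is a minimal sketch; the function name `z_score` and the variable names are illustrative only.

```python
def z_score(value, mean, standard_deviation):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / standard_deviation


# Spot: 19 lb against a Jack Russell mean of 16 lb, standard deviation 1.5 lb.
z_spot = z_score(19, 16, 1.5)

# Desdi: 19 lb against a Maine Coon mean of 17 lb, standard deviation 0.75 lb.
z_desdi = z_score(19, 17, 0.75)

print(f"Spot:  z = {z_spot:.2f}")
print(f"Desdi: z = {z_desdi:.2f}")
```

Both animals weigh the same number of pounds, yet Desdi has the larger z-score because Maine Coon weights vary less around their mean.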

Understanding Standard Deviation

Spot: z = 2

Desdi: z = 2.67

What percent of Jack Russell terriers weigh less than Spot? more?

What percent of Maine Coon cats weigh less than Desdi? more?
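One way to approach these two questions is to assume that weights within each breed follow a normal distribution. The slides do not state this explicitly, so treat it as an assumption. Under that assumption, `statistics.NormalDist` (Python 3.8 or later) gives the proportion below a given weight.

```python
from statistics import NormalDist

# Assumption: breed weights are normally distributed with the stated
# means and standard deviations (in pounds).
jack_russell = NormalDist(mu=16, sigma=1.5)
maine_coon = NormalDist(mu=17, sigma=0.75)

# Proportion of each breed weighing less than 19 pounds.
below_spot = jack_russell.cdf(19)
below_desdi = maine_coon.cdf(19)

print(f"Jack Russells lighter than Spot: {below_spot:.1%}, heavier: {1 - below_spot:.1%}")
print(f"Maine Coons lighter than Desdi:  {below_desdi:.1%}, heavier: {1 - below_desdi:.1%}")
```

Because the 68-95-99.7 rule puts about 95 percent within two standard deviations, roughly 2.5 percent lie above z = 2, which agrees with the figure computed for Spot.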

Using z-score and the normal distribution

Suppose it takes you 20 minutes on average to drive to school, with a standard deviation of 2 minutes.

• How often will you arrive at school in less than 22 minutes?

• How often will it take you more than 24 minutes?

• 75% of the time you will arrive in x minutes or less. Solve for x.

• 43% of the time you will arrive in y minutes or more. Solve for y.
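The four driving-time questions can be worked with a normal model. The sketch below assumes driving times are normally distributed with mean 20 and standard deviation 2, as the section title suggests, and uses `cdf` and `inv_cdf` from `statistics.NormalDist` in the Python standard library; the variable names are made up for the example.

```python
from statistics import NormalDist

# Assumption: driving time in minutes is normal with mean 20 and standard deviation 2.
drive_time = NormalDist(mu=20, sigma=2)

# Less than 22 minutes: area to the left of 22 (z = 1).
p_under_22 = drive_time.cdf(22)

# More than 24 minutes: area to the right of 24 (z = 2).
p_over_24 = 1 - drive_time.cdf(24)

# 75% of trips take x minutes or less: x is the 75th percentile.
x = drive_time.inv_cdf(0.75)

# 43% of trips take y minutes or more, so 57% take less than y.
y = drive_time.inv_cdf(1 - 0.43)

print(f"P(time < 22) = {p_under_22:.4f}")
print(f"P(time > 24) = {p_over_24:.4f}")
print(f"x = {x:.2f} minutes")
print(f"y = {y:.2f} minutes")
```

The last question is the only one that needs a conversion first: an "or more" probability has to be turned into the matching "or less" probability before looking up the percentile.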

Measures of Variation

Variance: standard deviation squared

Notation: s² or σ²

Variance

Sample Variance

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

Population Variance

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$
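The difference between the two variance formulas is only the divisor. As a short sketch, the Python standard library exposes both: `statistics.variance` divides by n - 1 (sample) and `statistics.pvariance` divides by N (population). The data are the Jefferson Valley Bank waiting times from the earlier slide; treating them as a whole population in the second call is purely for illustration.

```python
from statistics import pvariance, stdev, variance

# Jefferson Valley Bank waiting times in minutes.
jefferson_valley = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]

sample_var = variance(jefferson_valley)        # sum of squared deviations / (n - 1)
population_var = pvariance(jefferson_valley)   # sum of squared deviations / N

print(f"sample variance s^2:         {sample_var:.4f}")
print(f"population variance sigma^2: {population_var:.4f}")

# Variance is the standard deviation squared.
print(f"s squared:                   {stdev(jefferson_valley) ** 2:.4f}")
```

The sample variance is always the larger of the two for the same data, since the same sum is divided by a smaller number.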

Round-off Rule for Measures of Variation

Carry at least one more decimal place than is present in the original set of values.

Round only the final answer, never in the middle of a calculation.

Estimation of Standard Deviation: Range Rule of Thumb

Usual values lie between x̄ – 2s (the minimum usual value) and x̄ + 2s (the maximum usual value), so Range ≈ 4s, or

$$s \approx \frac{\text{Range}}{4} = \frac{\text{highest value} - \text{lowest value}}{4}$$

Usual Sample Values

minimum ‘usual’ value ≈ (mean) – 2 (standard deviation)

minimum ≈ x̄ – 2s

maximum ‘usual’ value ≈ (mean) + 2 (standard deviation)

maximum ≈ x̄ + 2s
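Both uses of the range rule of thumb fit in a few lines. The sketch below estimates s from the range of the Bank of Providence waiting times and compares it with the computed sample standard deviation, then finds the usual range for Jack Russell weights (mean 16, standard deviation 1.5) from the earlier example. The function names are assumptions for this example.

```python
from statistics import stdev

# Bank of Providence waiting times in minutes.
providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def estimate_standard_deviation(values):
    """Range rule of thumb: s is roughly (highest value - lowest value) / 4."""
    return (max(values) - min(values)) / 4


def usual_values(mean, standard_deviation):
    """Minimum and maximum 'usual' values: mean - 2s and mean + 2s."""
    return (mean - 2 * standard_deviation, mean + 2 * standard_deviation)


estimate = estimate_standard_deviation(providence)
print(f"rule-of-thumb estimate: {estimate:.2f}")
print(f"computed sample s:      {stdev(providence):.2f}")

low, high = usual_values(16, 1.5)
print(f"usual Jack Russell weights: {low} to {high} pounds")
```

The estimate is only a rough guide: it lands in the right neighborhood of the computed value but does not match it.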

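The Empirical Rule (applies to bell-shaped distributions)

About 68% of data are within 1 standard deviation of the mean (x̄ – s to x̄ + s), about 34% on each side of the mean.

About 95% of data are within 2 standard deviations of the mean (x̄ – 2s to x̄ + 2s); about 13.5% lie between 1 and 2 standard deviations on each side.

About 99.7% of data are within 3 standard deviations of the mean (x̄ – 3s to x̄ + 3s); about 2.4% lie between 2 and 3 standard deviations on each side, and about 0.1% lie beyond 3 standard deviations on each side.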

Chebyshev’s Theorem

applies to distributions of any shape.

the proportion (or fraction) of any set of data lying within K standard deviations of the mean is always at least 1 – 1/K², where K is any positive number greater than 1.

for K = 2, at least 3/4 (75%) of all values lie within 2 standard deviations of the mean.

for K = 3, at least 8/9 (about 89%) of all values lie within 3 standard deviations of the mean.
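Chebyshev's bound is easy to tabulate and to compare with real data. The sketch below computes 1 - 1/K² and, as an illustration, checks it against the Bank of Providence waiting times from the earlier slide. The function names are made up for the example, and the sample standard deviation is used as the measure of spread.

```python
from statistics import mean, stdev

# Bank of Providence waiting times in minutes.
providence = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("K must be greater than 1")
    return 1 - 1 / k ** 2


def proportion_within(values, k):
    """Observed proportion of values within k standard deviations of the mean."""
    x_bar = mean(values)
    s = stdev(values)
    inside = [x for x in values if abs(x - x_bar) <= k * s]
    return len(inside) / len(values)


for k in (2, 3):
    bound = chebyshev_lower_bound(k)
    observed = proportion_within(providence, k)
    print(f"K = {k}: at least {bound:.3f}, observed {observed:.3f}")
```

The theorem only guarantees a minimum, so the observed proportion can be well above the bound, as it is here.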

Measures of Variation Summary

For typical data sets, it is unusual for a score to differ from the mean by more than 2 or 3 standard deviations.

Assignment

Read Section 1.3

p. 64-66 1.62, 1.64-1.69, 1.71