Statistics -- Normal Distribution


  • 5/27/2018 Statistics -- Normal Distribution

    1/63

    +

    Chapter 6: What's Normal?

    Stats in Your World

    BOCK & MARIANO

  • 5/27/2018 Statistics -- Normal Distribution

    2/63

    +Ch. 6

    What's Normal?

    After this chapter, you should be able to

    MEASURE position using percentiles

    INTERPRET cumulative relative frequency graphs

    MEASURE position using z-scores

    TRANSFORM data

    DEFINE and DESCRIBE density curves

    Learning Objectives

  • 5/27/2018 Statistics -- Normal Distribution

    3/63

    +Ch. 6

    What's Normal?

    DESCRIBE and APPLY the 68-95-99.7 Rule

    DESCRIBE the standard Normal Distribution

    PERFORM Normal distribution calculations

    ASSESS Normality

    Learning Objectives (cont'd)

  • 5/27/2018 Statistics -- Normal Distribution

    4/63

    +

    Describing Location in a Distribution

    Measuring Position: Percentiles

    One way to describe the location of a value in a distribution

    is to tell what percent of observations are less than it.

    Definition:

    The pth percentile of a distribution is the value with p percent of the observations less than it.

    6 | 7
    7 | 2334
    7 | 5777899
    8 | 00123334
    8 | 569
    9 | 03

    Deja earned a score of 86 on her test. How did she perform relative to the rest of the class?

    Example

    Her score was greater than 21 of the 25 observations. Since 21 of the 25, or 84%, of the scores are below hers, Deja is at the 84th percentile in the class's test score distribution.

    6 | 7
    7 | 2334
    7 | 5777899
    8 | 00123334
    8 | 569
    9 | 03

    What percentile is the person who earned a 72? What percentile is the person who earned a 93?

    What is the percentile of the two students who earned an 80?

    If two observations have the same value, they will be at the same percentile. To find the percentile, calculate the percent of the values in the distribution that are below both values.

    Percentiles should be whole numbers, so if you get a decimal, you should

    round to the nearest integer.
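The percentile definition on this slide can be sketched in a few lines of Python. This is only an illustration: the list `scores` is read off the stemplot above, and the helper name `percentile_of` is made up for this example.

```python
# Test scores read off the stemplot (stem = tens digit, leaf = ones digit).
scores = [67,
          72, 73, 73, 74,
          75, 77, 77, 77, 78, 79, 79,
          80, 80, 81, 82, 83, 83, 83, 84,
          85, 86, 89,
          90, 93]

def percentile_of(value, data):
    """Percent of observations strictly less than `value`, as a whole number."""
    below = sum(1 for x in data if x < value)
    return round(100 * below / len(data))

for s in (72, 80, 86, 93):
    print(s, percentile_of(s, scores))
```

Because only values strictly below the score are counted, tied scores (such as the two 80s) land at the same percentile.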

  • 5/27/2018 Statistics -- Normal Distribution

    5/63

    +

    Describing Location in a Distribution

    Measuring Position: Percentiles

    Just knowing that Deja is 6 points above average doesn't tell you much about her location in the distribution. Depending on the spread of the distribution, Deja might be just barely above average or really far above average. This is why we need to incorporate a measure of spread (standard deviation) to have a really good understanding of how far above the mean she is.

  • 5/27/2018 Statistics -- Normal Distribution

    6/63

    +

    Describing Location in a Distribution

    Measuring Position: Percentiles

    The stemplot below shows the number of wins for each of

    the 30 Major League Baseball teams in 2009.

    Find the percentiles for the following teams:

    1) The Colorado Rockies, who won 92 games.

    2) The New York Yankees, who won 103 games.

    3) The Kansas City Royals and Cleveland Indians, who both

    won 65 games.

  • 5/27/2018 Statistics -- Normal Distribution

    7/63

    +

    Describing Location in a Distribution

    Measuring Position: z-Scores

    A z-score tells us how many standard deviations from the

    mean an observation falls, and in what direction.

    Definition:

    If x is an observation from a distribution that has known mean and standard deviation, the standardized value of x is:

    $$z = \frac{x - \text{mean}}{\text{standard deviation}}$$

    A standardized value is often called a z-score.

    Deja earned a score of 86 on her test. The class mean is 80 and the standard deviation is 6.07. What is her standardized score?

    $$z = \frac{x - \text{mean}}{\text{standard deviation}} = \frac{86 - 80}{6.07} \approx 0.99$$
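The standardization formula translates directly into code. The sketch below is illustrative only: `z_score` is a made-up helper name, and the second call uses a hypothetical score of 74 to show a value below the mean.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies from the mean; the sign gives the direction."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# Deja's test: score 86, class mean 80, standard deviation 6.07.
print(round(z_score(86, 80, 6.07), 2))

# Hypothetical classmate who scored 74 on the same test (below the mean).
print(round(z_score(74, 80, 6.07), 2))
```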

  • 5/27/2018 Statistics -- Normal Distribution

    8/63

    +

    Describing Location in a Distribution

    Using z-scores for Comparison

    We can use z-scores to compare the position of individuals in

    different distributions.

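    Deja earned a score of 86 on her statistics test. The class mean was 80 and the standard deviation was 6.07. She earned a score of 82 on her chemistry test. The chemistry scores had a fairly symmetric distribution with a mean of 76 and a standard deviation of 4. On which test did Deja perform better relative to the rest of her class?

    Example

    $$z_{\text{stats}} = \frac{86 - 80}{6.07} \approx 0.99$$

    $$z_{\text{chem}} = \frac{82 - 76}{4} = 1.5$$

    Her chemistry score is 1.5 standard deviations above the mean, compared with 0.99 for statistics, so Deja did better on the chemistry test relative to her class.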

  • 5/27/2018 Statistics -- Normal Distribution

    9/63

    +

    Describing Location in a Distribution

    Using z-scores for Comparison

    We can use z-scores to compare the position of individuals in

    different distributions.

    Now go back to Slide 6 and find and interpret the z-scores for the

    following teams:

    a) The New York Yankees with 103 wins.

    b) The New York Mets, with 70 wins.

  • 5/27/2018 Statistics -- Normal Distribution

    10/63

    +

    Describing Location in a Distribution

    CHECK YOUR UNDERSTANDING

    Ms. Raskin's Statistics class recorded their heights on a dotplot.

    1) Find and interpret the z-score of the student who is 65 inches tall.

    2) Find and interpret the z-score of the student who is 74 inches tall.

    3) The student who is 76 inches tall is on the basketball team. His height translates to a z-score of -0.85 in the team's height distribution. The standard deviation of the team members' heights is 3.5 inches. What is the mean of the team members' heights?

  • 5/27/2018 Statistics -- Normal Distribution

    11/63

    +

    Describing Location in a Distribution

    CHECK YOUR UNDERSTANDING

    On one test your class achieved an average grade of 80 with

    a standard deviation of 8 points.

    1) If you got a 96, what is your z-score?

    2) Your best friend got a 76. What is her z-score?

    3) What test grade has a z-score of +1.5?

    4) Ms. Raskin calls home whenever a student's z-score is worse than -2.0. What grade earns that phone call?
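Questions 3 and 4 run the z-score formula in reverse: solving z = (x - mean)/sd for x gives x = mean + z * sd. A minimal sketch, with `value_from_z` as an illustrative helper name:

```python
def value_from_z(z, mean, sd):
    """Invert z = (x - mean) / sd to recover the raw value x."""
    return mean + z * sd

# Class test from this slide: mean 80, standard deviation 8.
print(value_from_z(1.5, 80, 8))    # grade with a z-score of +1.5
print(value_from_z(-2.0, 80, 8))   # grade at the z = -2.0 cutoff
```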

  • 5/27/2018 Statistics -- Normal Distribution

    12/63

    +

    Describing Location in a Distribution

    CHECK YOUR UNDERSTANDING

    In the last chapter, we made boxplots for the number of calories and the fiber content in 23 kinds of Kellogg's cereals. Now think about the sugar content. Those cereals average 7.6 grams of sugar per serving with a standard deviation of 4.5 grams.

    1) Find the z-scores for the following cereals and describe what the z-score tells you about that cereal:

       a) Frosted Flakes: 11g of sugar

       b) Apple Jacks: 14g of sugar

       c) Crispix: 3g of sugar

    2) The z-score for Honey Smacks' sugar content is a very high 3.87! How many grams of sugar are in one serving?

    3) Product 19 is very low in sugar, with a z-score of -0.8. How many grams of sugar are in a serving of this cereal?

  • 5/27/2018 Statistics -- Normal Distribution

    13/63

    +

    Describing Location in a Distribution

    CHECK YOUR UNDERSTANDING

    The calorie content for 23 varieties of Kellogg's cereals averages 109 calories per serving with a standard deviation of 22.2 calories. The mean fiber content for these cereals is 2.7 grams per serving with a standard deviation of 3.2 grams. A serving of Kellogg's All-Bran with Extra Fiber has a very low 50 calories and a very high 14 grams of fiber. Which is more remarkable, the calorie content or the fiber content? Explain.

  • 5/27/2018 Statistics -- Normal Distribution

    14/63

    +

    Describing Location in a Distribution

    CHECK YOUR UNDERSTANDING

    Ms. Raskin has just announced that the lower of your two test scores will be dropped (JK!). You got a 90 on Test 1 and an 80 on Test 2. You're all set to drop the 80 until she announces that she grades on a curve. She standardizes the scores in order to decide which is the lower one. If the mean on the first test was 88 with a standard deviation of 4 and the mean on the second was 75 with a standard deviation of 5, which one will be dropped?

    Is this fair?

  • 5/27/2018 Statistics -- Normal Distribution

    15/63

    +

    Describing Location in a Distribution

    CHECK YOUR UNDERSTANDING

    The men's combined skiing event in the Winter Olympics consists of two races: a downhill and a slalom. Times for the two events are added together, and the skier with the lowest total time wins. In the 2006 Winter Olympics, the mean slalom time was 94.2714 seconds with a standard deviation of 1.8356 seconds. Ted Ligety of the United States, who won the gold medal with a combined time of 189.35 seconds, skied the slalom in 87.93 seconds and the downhill in 101.42 seconds. On which race did he do better compared with the competition?

  • 5/27/2018 Statistics -- Normal Distribution

    16/63

    +

    Describing Location in a Distribution

    CHECK YOUR UNDERSTANDING

    A single-season home run record for Major League Baseball has been set just 3 times since Babe Ruth hit 60 home runs in 1927. In an absolute sense, Barry Bonds had the best performance of these four players, since he hit the most home runs in a single season. However, in a relative sense, this may not be true. Baseball historians suggest that hitting a home run has been easier in some eras than others. This is due to many factors, including quality of batters, quality of pitchers, hardness of the baseball, dimensions of the parks, and possible use of performance-enhancing drugs. To make a fair comparison, we should see how these performances rate relative to those of other hitters during the same year.

  • 5/27/2018 Statistics -- Normal Distribution

    17/63

    +

    Describing Location in a Distribution

    CHECK YOUR UNDERSTANDING

    Compute the standardized scores for each performance. Which player had the most outstanding performance relative to his peers?

  • 5/27/2018 Statistics -- Normal Distribution

    18/63

    +

    Normal Distributions

    That z-score joint:

    A z-score gives us an indication of how unusual a value is

    because it tells us how far it is from the mean.

    A data value that sits right at the mean has a z-score of 0.

    A z-score of 1 means that the data value is 1 standard

    deviation above the mean.

    A z-score of -1 means that the data value is 1 standard deviation below the mean.

  • 5/27/2018 Statistics -- Normal Distribution

    19/63

    +

    Normal Distributions

    What happens when a z-score is REALLY BIG?

    How far from 0 does a z-score have to be to be interesting or

    unusual? There is no universal standard, but the larger (+/-)

    a z-score is, the more unusual it is.

    There is no universal standard for z-scores, but there is a model that shows up over and over in Statistics. This model is called the Normal Model.

    "All models are wrong, but some are useful!"

    - George Box, statistician

  • 5/27/2018 Statistics -- Normal Distribution

    20/63

    +

    Normal Distributions

    Normal Distributions

    You may have heard of bell-shaped curves. Statisticians call them Normal models. Normal models are appropriate for distributions whose shapes are unimodal and roughly symmetric.

    All Normal curves are symmetric, single-peaked, and bell-shaped.

    A specific Normal curve is described by giving its mean μ and standard deviation σ.

    [Figure: two Normal curves, showing the mean μ and standard deviation σ.]

  • 5/27/2018 Statistics -- Normal Distribution

    21/63

    +

    Normal Distributions

    Definition:

    A Normal distribution is described by a Normal density curve. Any particular Normal distribution is completely specified by two numbers: its mean μ and standard deviation σ.

    The mean of a Normal distribution is the center of the symmetric Normal curve.

    The standard deviation is the distance from the center to the change-of-curvature points on either side.

    We abbreviate the Normal distribution with mean μ and standard deviation σ as N(μ, σ).

    Normal distributions are good descriptions for some distributions of real data (test scores, characteristics of biological populations, etc.).

    Normal distributions are good approximations of the results of many kinds of chance outcomes.

    Many statistical inference procedures are based on Normal distributions.

  • 5/27/2018 Statistics -- Normal Distribution

    22/63

    +

    Normal Distributions

    Although there are many Normal curves, they all have properties

    in common.

    The 68-95-99.7 Rule

    Definition: The 68-95-99.7 Rule

    In the Normal distribution with mean μ and standard deviation σ:

    Approximately 68% of the observations fall within σ of μ.

    Approximately 95% of the observations fall within 2σ of μ.

    Approximately 99.7% of the observations fall within 3σ of μ.
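The three percentages in the rule are rounded areas under the Normal curve. As a quick check, the sketch below uses Python's standard-library `statistics.NormalDist` (Python 3.8 or later); the helper name `area_within` is illustrative.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard Normal curve between -k and +k standard deviations."""
    return standard.cdf(k) - standard.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {area_within(k):.4f}")
```

The printed areas round to 68%, 95%, and 99.7%, which is where the rule gets its name.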

  • 5/27/2018 Statistics -- Normal Distribution

    23/63

    +

    Normal Distributions

    The distribution of Iowa Test of Basic Skills (ITBS) vocabulary scores for 7th-grade students in Gary, Indiana, is close to Normal. Suppose the distribution is N(6.84, 1.55).

    a) Sketch the Normal density curve for this distribution.

    b) What percent of ITBS vocabulary scores are less than 3.74?

    c) What percent of the scores are between 5.29 and 9.94?

    Example
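One way to approach parts (b) and (c) is to express each cutoff in standard-deviation units and then read the areas off the 68-95-99.7 Rule. The sketch below assumes that approach; the helper name `sd_units` is illustrative.

```python
mu, sigma = 6.84, 1.55  # N(6.84, 1.55) from the slide

def sd_units(x):
    """How many standard deviations x lies from the mean, rounded to 2 places."""
    return round((x - mu) / sigma, 2)

# The three cutoffs in the example fall on whole standard deviations.
print(sd_units(3.74), sd_units(5.29), sd_units(9.94))

# (b) Below mu - 2*sigma: half of the 5% that lies outside the central 95%.
pct_below_374 = (100 - 95) / 2

# (c) Between mu - 1*sigma and mu + 2*sigma: half of 68% plus half of 95%.
pct_between = 68 / 2 + 95 / 2

print(pct_below_374, pct_between)
```

These are rule-based approximations; areas from a table or software will differ slightly.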

  • 5/27/2018 Statistics -- Normal Distribution

    24/63

    +

    Normal Distributions

    The distribution of heights of young women aged 18 to 24 is approximately Normal: N(64.5, 2.5).

    a) Sketch the Normal density curve for this distribution.

    b) What percent of young women have heights greater than 67 inches?

    c) What percent of young women have heights between 67 and 72 inches?

    Example

  • 5/27/2018 Statistics -- Normal Distribution

    25/63

    +

    Normal Distributions

    The average clean-up time for a crew of a medium-sized firm

    is 84.0 hours and the standard deviation is 6.8 hours.

    Assuming a Normal distribution, within what time interval will

    95% of the clean-up times fall?

    Over a long period of time, a farmer notes that the eggs produced by his chickens have a mean weight of 60g and a standard deviation of 15g. If eggs are classified by weight and small eggs are those having a weight of less than 45g, what percent of the farmer's eggs will be classified as small?

    Eggs are classified as jumbo if their weight is 90g or more. What percent of the farmer's eggs will be jumbo?

    CHECK YOUR UNDERSTANDING

  • 5/27/2018 Statistics -- Normal Distribution

    26/63

    +

    Normal Distributions

    As a group, the Dutch are among the tallest people in the world. The average Dutch man is 184cm tall, just over 6 feet! The standard deviation of men's heights is about 8cm. Assuming the distribution is approximately Normal, use the 68-95-99.7 rule to sketch a model for the heights of Dutch men. Label the axis clearly and indicate appropriate percentages.

    CHECK YOUR UNDERSTANDING

    Based on this model, what percentage of all Dutch men should be over 2 meters (6'6") tall?

  • 5/27/2018 Statistics -- Normal Distribution

    27/63

    +

    Normal Distributions

    Let's say it takes you 20 minutes, on average, to drive to school, with a standard deviation of 2 minutes. Suppose a Normal model is appropriate for the distribution of driving times. Based on the 68-95-99.7 Rule:

    About how often will it take you between 18 and 22 minutes to get to school?

    How often will you arrive at school in less than 22 minutes?

    How often will it take you more than 24 minutes?

    CHECK YOUR UNDERSTANDING

  • 5/27/2018 Statistics -- Normal Distribution

    28/63

    +

    Normal Distributions

    In the 2006 Winter Olympics men's combined event, Jean-Baptiste Grange of France skied the slalom in 88.46 seconds, about 1 standard deviation faster than the mean. If a Normal model is useful in describing slalom times, about how many of the 35 skiers finishing the event would you expect skied the slalom faster than Jean-Baptiste?

    CHECK YOUR UNDERSTANDING

  • 5/27/2018 Statistics -- Normal Distribution

    29/63

    +

    Normal Distributions

    Scores on the Wechsler Adult Intelligence Scale for the 20-34

    year old age group are normally distributed with a mean of 110

    and standard deviation of 25. Use the 68-95-99.7 rule to

    determine what percent of people score

    Above 110

    Above 150

    Below 85

    Below 185

    CHECK YOUR UNDERSTANDING

  • 5/27/2018 Statistics -- Normal Distribution

    30/63

    +

    Normal Distributions

    EPA fuel economy estimates for automobiles tested recently

    predicted a mean of 24.8mpg and a standard deviation of 6.2

    for highway driving. Assume a Normal model to determine

    The range of gas mileage for the central 68% of cars

    The percent of autos getting more than 31 mpg

    The percent of autos getting between 31 and 37.2 mpg

    Why you shouldn't use these numbers to predict gas mileage for in-town driving

    CHECK YOUR UNDERSTANDING

  • 5/27/2018 Statistics -- Normal Distribution

    31/63

    +

    Normal Distributions

    The Standard Normal Distribution

    All Normal distributions are the same if we measure in units of size σ from the mean μ as center.

    We can standardize these data by changing to z-scores:

    $$z = \frac{x - \mu}{\sigma}$$

    If the variable we standardize has a Normal distribution, then so does the new variable z.

    This new distribution is called the Standard Normal Distribution.

  • 5/27/2018 Statistics -- Normal Distribution

    32/63

    +

    Normal Distributions

    The Standard Normal Distribution

    All Normal distributions are the same if we measure in units of size σ from the mean μ as center.

    Definition:

    The standard Normal distribution is the Normal distribution with mean 0 and standard deviation 1.

    If a variable x has any Normal distribution N(μ, σ) with mean μ and standard deviation σ, then the standardized variable

    $$z = \frac{x - \mu}{\sigma}$$

    has the standard Normal distribution, N(0, 1).

  • 5/27/2018 Statistics -- Normal Distribution

    33/63

    +

    Normal Distributions

    The Standard Normal Table

    Because all Normal distributions are the same when we standardize, we can find areas under any Normal curve from a single table.

    Definition: The Standard Normal Table

    Table A is a table of areas under the standard Normal curve. The table entry for each value z is the area under the curve to the left of z.

    Suppose we want to find the proportion of observations from the standard Normal distribution that are less than 0.81.

    We can use Table A:

    z     .00     .01     .02
    0.7   .7580   .7611   .7642
    0.8   .7881   .7910   .7939
    0.9   .8159   .8186   .8212

    P(z < 0.81) = .7910
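A Table A entry is the standard Normal cumulative area to the left of z, rounded to four decimal places. The sketch below reproduces the excerpt with the standard-library `statistics.NormalDist`; the helper name `table_a_entry` is illustrative and is not part of any textbook software.

```python
from statistics import NormalDist

def table_a_entry(z):
    """Area under the standard Normal curve to the left of z, to 4 decimal places."""
    return round(NormalDist().cdf(z), 4)

# Rebuild the three rows of the excerpt: row label plus column offset gives z.
for row in (0.7, 0.8, 0.9):
    entries = [table_a_entry(row + col) for col in (0.00, 0.01, 0.02)]
    print(row, entries)

print("P(z < 0.81) =", table_a_entry(0.81))
```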

  • 5/27/2018 Statistics -- Normal Distribution

    34/63

    +

    Normal Distributions

    Finding Areas Under the Standard Normal Curve

    Find the proportion of observations from the standard Normal distribution that

    are between -1.25 and 0.81.

    Example

    Can you find the same proportion using a different approach?

    1 - (0.1056 + 0.2090) = 1 - 0.3146 = 0.6854
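Both routes to this area can be checked numerically: subtract the two left-hand areas, or subtract both tails from 1. A minimal sketch using the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

standard = NormalDist()  # standard Normal: mean 0, standard deviation 1

left_of_low = standard.cdf(-1.25)   # area to the left of z = -1.25
left_of_high = standard.cdf(0.81)   # area to the left of z = 0.81
right_of_high = 1 - left_of_high    # area to the right of z = 0.81

# Approach 1: difference of the two left-hand areas.
by_subtraction = left_of_high - left_of_low

# Approach 2: total area minus both tails.
by_complement = 1 - (left_of_low + right_of_high)

print(round(by_subtraction, 4), round(by_complement, 4))
```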

  • 5/27/2018 Statistics -- Normal Distribution

    35/63

    +

    Normal Distributions

    Finding Areas Under the Standard Normal Curve

    Find the proportion of observations from the standard Normal distribution that

    are greater than -1.78.

    Example

  • 5/27/2018 Statistics -- Normal Distribution

    36/63

    +

    Normal Distributions

    Finding Areas Under the Standard Normal Curve

    Find the proportion of observations in a Normal distribution that are more than

    1.53 standard deviations above the mean.

    Example

    The table entry 0.9370 is for the area to the left of z = 1.53.

    The area to the right of z = 1.53 is 1 - 0.9370 = 0.0630.

  • 5/27/2018 Statistics -- Normal Distribution

    37/63

    +

    Normal Distributions

    Finding Areas Under the Standard Normal Curve

    Find the proportion of observations in a Normal distribution that are between -0.58 and 1.79.

    Example

    Area to the left of z = 1.79 is 0.9633.

    Area to the left of z = -0.58 is 0.2810.

    Area between z = -0.58 and z = 1.79 is 0.9633 - 0.2810 = 0.6823.

  • 5/27/2018 Statistics -- Normal Distribution

    38/63

    +

    Normal Distributions

    Remember those tall Dutch men? Their mean height was

    184cm and the standard deviation was 8cm. Answer each of

    these questions by sketching a Normal model, shading the

    appropriate area, finding the z-scores, and using the table to

    determine the percentage.

    What percent of Dutch men should be less than 190cm tall?

    What percent of Dutch men should be between 170 and

    180cm tall?

    What fraction of Dutch men should be over 198cm tall?

    CHECK YOUR UNDERSTANDING

  • 5/27/2018 Statistics -- Normal Distribution

    39/63

    +

    Normal Distributions

    Working Backward: Finding z-scores from Percentiles

    Sometimes we start with areas and need to find the corresponding z-score or

    even the original data value.

    What z-score represents the first quartile in a Normal Model?

  • 5/27/2018 Statistics -- Normal Distribution

    40/63

    +

    Normal Distributions

    Working Backward: Finding z-scores from Percentiles

    For each, sketch the standard Normal distribution, shade the area

    described, and find the z-score cutpoints

    1. The lowest 40% of the distribution

    2. The highest 30% of the distribution

    3. The highest 2% of the distribution

    4. The middle 30% of the distribution

  • 5/27/2018 Statistics -- Normal Distribution

    41/63

    +

    Normal Distributions

    Working Backward: Finding z-scores from Percentiles

    To Find a z-score from a percentile:

    1. Look for the percentile (as a decimal) in the middle of the z-table and read off the corresponding z-score.

    2. Plug the z-score, the mean, and the standard deviation into the formula: z = (x - μ)/σ

    3. Solve for x.
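The three steps can be sketched with the standard library's inverse Normal CDF standing in for the reverse table lookup. The helper names are illustrative, and `inv_cdf` returns more decimal places than a printed table would.

```python
from statistics import NormalDist

def z_from_percentile(p):
    """Step 1: z-score with proportion p (0 < p < 1) of the standard Normal curve to its left."""
    return NormalDist().inv_cdf(p)

def value_from_percentile(p, mu, sigma):
    """Steps 2 and 3: solve z = (x - mu) / sigma for x."""
    return mu + z_from_percentile(p) * sigma

# First quartile of a Normal model: 25% of the area lies to the left.
print(round(z_from_percentile(0.25), 2))
```

Because the Normal curve is symmetric, the third quartile has the same z-score with the opposite sign.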

  • 5/27/2018 Statistics -- Normal Distribution

    42/63

    +

    Normal Distributions

    Going Back to those Dutch men

    Let's think about the Normal model for the heights of Dutch men one more time. Remember that their mean height was 184cm and the standard deviation was 8cm. Answer each of these questions by sketching a Normal model, shading the appropriate area, finding the cutpoint z-score, and then determining the cutpoint height.

    1. How tall are the tallest 10% of all Dutch men?

    2. How tall are the shortest 20% of Dutch men?

    3. How tall are the middle 50% of Dutch men?

  • 5/27/2018 Statistics -- Normal Distribution

    43/63

    +

    Normal Distributions

    Normal Distribution Calculations

    State: Express the problem in terms of the observed variable x.

    Plan: Draw a picture of the distribution and shade the area of interest under the curve.

    Do: Perform calculations.

    Standardize x to restate the problem in terms of a standard Normal variable z.

    Use Table A and the fact that the total area under the curve is 1 to find the required area under the standard Normal curve.

    Conclude: Write your conclusion in the context of the problem.

    How to Solve Problems Involving Normal Distributions

  • 5/27/2018 Statistics -- Normal Distribution

    44/63

    +

    Normal Distributions

    Normal Distribution Calculations

    When Tiger Woods hits his driver, the distance the ball travels can be described by N(304, 8). What percent of Tiger's drives travel between 305 and 325 yards?

    When x = 305, $z = \frac{305 - 304}{8} \approx 0.13$

    When x = 325, $z = \frac{325 - 304}{8} \approx 2.63$

    Using Table A, we can find the area to the left of z = 2.63 and the area to the left of z = 0.13.

    0.9957 - 0.5517 = 0.4440. About 44% of Tiger's drives travel between 305 and 325 yards.
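The "Do" step of this example can be mirrored in code. The sketch below imitates the table method: it rounds each z-score to two places (half up, as a person would by hand) and each area to four places before subtracting. The function names are illustrative, and the unrounded answer is computed alongside for comparison.

```python
from decimal import Decimal, ROUND_HALF_UP
from statistics import NormalDist

def round_z(z):
    """Round a z-score to 2 decimal places, with halves rounded away from zero."""
    return float(Decimal(str(z)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))

def table_proportion_between(low, high, mu, sigma):
    """Standardize both cutoffs, look up Table A style areas, and subtract."""
    standard = NormalDist()
    area_low = round(standard.cdf(round_z((low - mu) / sigma)), 4)
    area_high = round(standard.cdf(round_z((high - mu) / sigma)), 4)
    return round(area_high - area_low, 4)

drives = NormalDist(mu=304, sigma=8)  # N(304, 8) from the slide

table_answer = table_proportion_between(305, 325, 304, 8)
exact_answer = drives.cdf(325) - drives.cdf(305)  # no intermediate rounding

print(table_answer, round(exact_answer, 4))
```

The two answers differ in the third decimal place only because the table method rounds the z-scores first; both support the conclusion of roughly 44% to 45%.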

  • 5/27/2018 Statistics -- Normal Distribution

    45/63

    +

    Normal Distributions

    Normal Distribution Calculations

    In the 2008 Wimbledon tennis tournament, Rafael Nadal averaged 115 miles per hour on his first serves: N(115, 6). About what proportion of his first serves would you expect to exceed 120mph?

    State: Let x = the speed of Nadal's first serve. The variable x has a Normal distribution with μ = 115 and σ = 6. We want the proportion of first serves with x ≥ 120.

    Plan: [Figure: Normal curve marked at x = 120, z = 0.83.]

    Do: When x = 120, $z = \frac{120 - 115}{6} \approx 0.83$

    Looking up a z-score of 0.83 shows us that the area less than z = 0.83 is 0.7967. This means that the area to the right of z = 0.83 is 1 - 0.7967 = 0.2033.

    Conclude: About 20% of Nadal's first serves will travel more than 120mph.

  • 5/27/2018 Statistics -- Normal Distribution

    46/63

    +

    Normal Distributions

    Normal Distribution Calculations

    In the 2008 Wimbledon tennis tournament, Rafael Nadal averaged 115 miles per hour on his first serves: N(115, 6). About what proportion of his first serves are between 100 and 110mph?

    State: Let x = the speed of Nadal's first serve. The variable x has a Normal distribution with μ = 115 and σ = 6. We want the proportion of first serves with 100 ≤ x ≤ 110.

    Plan: [Figure: Normal curve marked at x = 100, z = -2.50 and x = 110, z = -0.83.]

    Do: When x = 100, $z = \frac{100 - 115}{6} = -2.50$

    When x = 110, $z = \frac{110 - 115}{6} \approx -0.83$

    Looking up a z-score of -2.50 shows us that the area less than z = -2.50 is 0.0062. Looking up a z-score of -0.83 shows us that the area less than z = -0.83 is 0.2033. Thus, the area between z = -2.50 and z = -0.83 is 0.2033 - 0.0062 = 0.1971.

    Conclude: About 20% of Nadal's first serves will travel between 100 and 110mph.

  • 5/27/2018 Statistics -- Normal Distribution

    47/63

    +

    Normal Distributions

    Normal Distribution Calculations: Cholesterol in

    Teenage Boys

    High levels of cholesterol in the blood increase the risk of

    heart disease. For 14-year-old boys, the distribution of blood

    cholesterol is approximately Normal: N(170, 30). What is the first quartile of the distribution of blood cholesterol?

  • 5/27/2018 Statistics -- Normal Distribution

    48/63

    +

    Normal Distributions

    High levels of cholesterol in the blood increase the risk of

    heart disease. For 14-year-old boys, the distribution of blood

    cholesterol is approximately Normal: N(170, 30). Cholesterol levels above 240mg/dL may require medical attention. What

    percent of 14-year-old boys have more than 240mg/dL of

    cholesterol? What percent of 14-year-old boys have blood

    cholesterol between 200 and 240 mg/dL?

    CHECK YOUR UNDERSTANDING

  • 5/27/2018 Statistics -- Normal Distribution

    49/63

    +

    Normal Distributions

    Assessing Normality

    The Normal distributions provide good models for some distributions of real data. Many statistical inference procedures are based on the assumption that the population is approximately Normally distributed. Consequently, we need a strategy for assessing Normality.

    Plot the data.

    Make a dotplot, stemplot, or histogram and see if the graph is

    approximately symmetric and bell-shaped.

    Check whether the data follow the 68-95-99.7 rule.

    Count how many observations fall within one, two, and three

    standard deviations of the mean and check to see if these

    percents are close to the 68%, 95%, and 99.7% targets for a

    Normal distribution.

  • 5/27/2018 Statistics -- Normal Distribution

    50/63

    +

    Normal Distributions

    Normal Probability Plots

    Most software packages can construct Normal probability plots. These plots are constructed by plotting each observation in a data set against its corresponding percentile's z-score.

    Interpreting Normal Probability Plots

    If the points on a Normal probability plot lie close to a straight line, the plot indicates that the data are approximately Normal. Systematic deviations from a straight line indicate a non-Normal distribution. Outliers appear as points that are far away from the overall pattern of the plot.
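The points of a Normal probability plot can be computed without plotting software. The sketch below makes one assumption the slide does not specify: the i-th smallest of n observations is assigned the percentile (i - 0.5)/n, one common convention among several. The data are the 25 test scores from the percentile example earlier in the chapter, and the function name is illustrative.

```python
from statistics import NormalDist

def normal_probability_points(data):
    """Pair each observation (sorted) with the z-score of its assumed percentile."""
    ordered = sorted(data)
    n = len(ordered)
    standard = NormalDist()
    # Assumed plotting position: i-th smallest value sits at percentile (i - 0.5) / n.
    return [(standard.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(ordered, start=1)]

scores = [67, 72, 73, 73, 74, 75, 77, 77, 77, 78, 79, 79, 80,
          80, 81, 82, 83, 83, 83, 84, 85, 86, 89, 90, 93]

points = normal_probability_points(scores)
for z, x in points[:3]:
    print(f"z = {z:+.2f}  score = {x}")
```

Plotting the scores against these z-scores and looking for a roughly straight line is the visual check described on this slide.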

  • 5/27/2018 Statistics -- Normal Distribution

    51/63

    +

    Normal Distributions

    Assessing Normality

    Find the proportion of observations from the standard Normal distribution that

    are between -1.25 and 0.81.

    Example

    Can you find the same proportion using a different approach?

    1 - (0.1056 + 0.2090) = 1 - 0.3146 = 0.6854

  • 5/27/2018 Statistics -- Normal Distribution

    52/63

    +

    Normal Distributions

    Assessing Normality: Is Unemployment

    Normal?

    At right are the data on unemployment rates in the 50 states in November, 2009.

    Below is a histogram showing

    the data.


     4 | 1 5
     5 | 0
     6 | 3 3 4 4 6 7 7 7 9
     7 | 0 0 2 4 4 4 8
     8 | 0 0 2 2 4 5 5 6 7 8 9
     9 | 1 2 5 6 6 7
    10 | 2 3 5 6 6 8 9
    11 | 1 5
    12 | 3 3 3 7
    13 |
    14 | 7

    Key: 4|1 = 4.1%

    Mean = 8.682 StDev = 2.225

  • 5/27/2018 Statistics -- Normal Distribution

    53/63

    +

    Normal Distributions

    Assessing Normality: Is Unemployment

    Normal?

    Refer to the table at right:

    Mean = 8.682 StDev = 2.225

    Mean ± 1SD    6.457 to 10.907    36/50 = 72%
    Mean ± 2SD    4.232 to 13.132    48/50 = 96%
    Mean ± 3SD    2.007 to 15.357    50/50 = 100%

    These percents are quite close to the 68%, 95%, and 99.7% targets for a Normal distribution.
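The counts in this table can be reproduced from the stemplot on the previous slide. In the sketch below, the dictionary encodes that stemplot (stem = whole percent, leaf = tenths), the mean and standard deviation are the values stated on the slide, and the helper name `count_within` is illustrative.

```python
# Unemployment rates (%) for the 50 states, transcribed from the stemplot.
stemplot = {
    4: "15", 5: "0", 6: "334467779", 7: "0024448", 8: "00224556789",
    9: "125667", 10: "2356689", 11: "15", 12: "3337", 13: "", 14: "7",
}
rates = [stem + int(leaf) / 10 for stem, leaves in stemplot.items() for leaf in leaves]

mean, sd = 8.682, 2.225  # summary statistics as given on the slide

def count_within(data, center, spread, k):
    """Number of observations within k standard deviations of the mean."""
    low, high = center - k * spread, center + k * spread
    return sum(1 for x in data if low <= x <= high)

for k in (1, 2, 3):
    inside = count_within(rates, mean, sd, k)
    print(f"Mean ± {k}SD: {inside}/{len(rates)} = {100 * inside / len(rates):.0f}%")
```

The same function can be pointed at any of the data sets on the following slides by swapping in their values, mean, and standard deviation.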

  • 5/27/2018 Statistics -- Normal Distribution

    54/63

    +

    Normal Distributions

    Assessing Normality: Were Last Year's Seniors' GPAs Normal?

    0.78   2.00   2.31   2.79   2.93   3.23   3.46   3.89
    0.81   2.00   2.39   2.80   2.93   3.23   3.48   3.92
    0.92   2.04   2.53   2.83   3.00   3.27   3.50   3.98
    1.28   2.20   2.55   2.84   3.11   3.34   3.79   4.15
    1.28   2.28   2.63   2.86   3.12   3.36   3.79
    1.78   2.29   2.77   2.88   3.13   3.38   3.79

    Mean ± 1SD

    Mean ± 2SD

    Mean ± 3SD

  • 5/27/2018 Statistics -- Normal Distribution

    55/63

    +

    Normal Distributions

    Assessing Normality: Are your GPAs Normal?

    1.22   1.36   1.39   1.45   1.49   1.60   1.63   1.95
    1.95   2.00   2.00   2.03   2.06   2.26   2.41   2.44
    2.45   2.46   2.49   2.51   2.53   2.64   2.75   2.80
    2.83   2.83   2.94   2.94   2.99   2.99   3.11   3.14
    3.16   3.23   3.27   3.31   3.31   3.37   3.45   3.48
    3.55   3.75   4.04   4.06   4.33   4.35

    Mean ± 1SD

    Mean ± 2SD

    Mean ± 3SD

    Mean = 2.70 St.Dev. = 0.81

  • 5/27/2018 Statistics -- Normal Distribution

    56/63

    +

    Normal Distributions

    Assessing Normality: No Space in the Fridge?

    The measurements listed below describe the useable capacity (in cubic feet) of a sample of 36 side-by-side refrigerators. (Source: Consumer Reports, May 2010.) Are the data close to Normal?

    12.9   13.7   14.1   14.2   14.5   14.5   14.6   14.7   15.1
    15.2   15.3   15.3   15.3   15.3   15.5   15.6   15.6   15.8
    16.0   16.0   16.2   16.2   16.3   16.4   16.5   16.6   16.6
    16.6   16.8   17.0   17.0   17.2   17.4   17.4   17.9   18.4

    Mean ± 1SD

    Mean ± 2SD

    Mean ± 3SD

  • 5/27/2018 Statistics -- Normal Distribution

    57/63

    +

    Normal Distributions

    Assessing Normality: Are Twizzlers Normal?

    Ms. Raskin will give you a Twizzler. Measure its length to the

    nearest cm. Are Twizzlers Normal?

    Mean ± 1SD

    Mean ± 2SD

    Mean ± 3SD

  • 5/27/2018 Statistics -- Normal Distribution

    58/63

    +

    Normal Distributions

    What Can Go Wrong?

    Don't use a Normal model when the distribution is

    not unimodal and symmetric.

  • 5/27/2018 Statistics -- Normal Distribution

    59/63

    +

    Normal Distributions

    What Can Go Wrong?

    Don't use the mean and standard deviation when outliers are present. Both mean and standard deviation can be distorted by outliers.

    Don't round off too soon.

    Don't round your results in the middle of a calculation.

    Don't worry about minor differences in results.

    +

  • 5/27/2018 Statistics -- Normal Distribution

    60/63

    +Ch. 6

    Normal Distributions

    In this chapter, we learned that

    The Normal Distributions are described by a special family of bell-shaped, symmetric density curves called Normal curves. The mean μ and standard deviation σ completely specify a Normal distribution N(μ, σ). The mean is the center of the curve, and σ is the distance from μ to the change-of-curvature points on either side.

    All Normal distributions obey the 68-95-99.7 Rule, which describes

    what percent of observations lie within one, two, and three standard

    deviations of the mean.

    Summary

    +

  • 5/27/2018 Statistics -- Normal Distribution

    61/63

    +Ch. 6

    Normal Distributions

    All Normal distributions are the same when measurements are standardized. The standard Normal distribution has mean μ = 0 and standard deviation σ = 1.

    Table A gives percentiles for the standard Normal curve. By standardizing, we can use Table A to determine the percentile for a given z-score or the z-score corresponding to a given percentile in any Normal distribution.

    To assess Normality for a given set of data, we first observe its shape. We then check how well the data fit the 68-95-99.7 rule. We can also construct and interpret a Normal probability plot.

    Summary (cont'd)

    +

  • 5/27/2018 Statistics -- Normal Distribution

    62/63

    +Ch. 6

    Describing Location in a Distribution

    There are two ways of describing an individual's location within a distribution: the percentile and the z-score.

    A cumulative relative frequency graph allows us to examine location within a distribution.

    It is common to transform data, especially when changing units of measurement. Transforming data can affect the shape, center, and spread of a distribution.

    We can sometimes describe the overall pattern of a distribution by a density curve (an idealized description of a distribution that smooths out the irregularities in the actual data).

    Summary (cont'd)

    +

  • 5/27/2018 Statistics -- Normal Distribution

    63/63

    +Looking Ahead

    We'll learn how to describe relationships between two quantitative variables

    We'll study

    Scatterplots and correlation

    Least-squares regression

    In the next Chapter