MODULE 09 Frequency Distributions

Transcript of MODULE 09 Frequency Distributions

  • 7/23/2019 MODULE 09 Frequency Distributions

    Module 9: Frequency Distributions

    This module includes descriptions of frequency distributions, frequency tables,
    histograms and frequency polygons. Also included are stem and leaf plots.

    Reviewed 05 May 05 / MODULE 9

    Frequency Distribution

    A frequency distribution is the organization of a data set into contiguous,
    mutually exclusive intervals so that the number or proportion of observations
    falling in each interval is apparent.
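
    As a minimal sketch of this definition, the snippet below counts observations in
    half-open intervals, so that the intervals touch but never overlap. The function
    name, its parameters and the sample values are assumptions made for illustration;
    they do not come from the slides.

```python
from collections import Counter


def frequency_distribution(values, start, width, n_intervals):
    """Count observations in contiguous, mutually exclusive intervals.

    Interval i covers [start + i*width, start + (i+1)*width): the lower
    edge is included and the upper edge is excluded, so no observation
    can fall into two intervals.
    """
    counts = Counter()
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_intervals:
            counts[i] += 1
    total = sum(counts.values())
    rows = []
    for i in range(n_intervals):
        lo = start + i * width
        proportion = counts[i] / total if total else 0.0
        rows.append((lo, lo + width, counts[i], proportion))
    return rows


# Hypothetical observations, chosen only for illustration.
sample = [1.0, 1.5, 2.0, 2.5, 3.9, 4.0, 5.2, 5.9]
table = frequency_distribution(sample, start=0, width=2, n_intervals=3)
for lo, hi, n, p in table:
    print(f"[{lo}, {hi}): n = {n}, proportion = {p:.3f}")
```

    Each row carries both the number and the proportion of observations, the two
    quantities the definition asks to make apparent.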

    Frequency Distribution Example

    An example of a frequency distribution is the age distribution of one of the classes
    for an in-class version of this course. The class included n = 121 students, with the
    age distribution as shown on a following slide.

    Two things are notable about this distribution. One has to do with the way we measure
    age, specified as age at last birthday, not age at nearest birthday.

    The other is the ages listed as 35+ years, with 21 students having this age. This is
    an interval with indefinite length and it will require special handling.

    Age     No.
    21        6
    22       16
    23       11
    24        9
    25       17
    26       13
    27        6
    28        5
    29        4
    30        3
    31        1
    32        4
    33        3
    34        2
    35+      21
    Total   121

    Intervals for a Frequency Distribution

    • Use not less than 6 intervals, generally 8-15
    • Use equal width intervals if feasible and appropriate
    • Avoid intervals with indefinite length, if possible
    • Select interval width by dividing difference between smallest and largest
      observation by 10
    • Take into account the manner measurements were made when setting up intervals

    For the age distribution example, the ages range from 21 years to 35 years and older;
    to construct at least 6 intervals for this age range, each interval has to be at least
    2 years wide. To create the intervals, we need to consider the midpoint of the
    intervals. The midpoint for the age interval for persons 21 years old is 21.5 years
    since they will not become 22 until their 22nd birthday. Thus, the age interval for
    persons 21 and 22 years old goes from 21 to 23, with 22 as the midpoint.

    21          22          23
             Midpoint

    The intervals will be written as 21-23, 23-25, 25-27, etc. We will deal with the 35+
    interval in a subsequent slide.
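
    The midpoint reasoning for ages recorded at last birthday can be sketched as follows.
    The helper name age_interval is hypothetical; the two calls reproduce the slide's
    figures (21.5 for a single year of age, and 21 to 23 with midpoint 22 for ages 21
    and 22).

```python
def age_interval(first_age, years_wide):
    """Return (lower, upper, midpoint) for ages recorded at last birthday.

    Someone recorded as age a is anywhere in [a, a + 1), so a group of
    consecutive recorded ages starting at first_age spans
    [first_age, first_age + years_wide).
    """
    lower = first_age
    upper = first_age + years_wide
    return lower, upper, (lower + upper) / 2


print(age_interval(21, 1))  # one recorded age: interval 21 to 22
print(age_interval(21, 2))  # recorded ages 21 and 22: interval 21 to 23
```

    Had age been recorded at nearest birthday instead, the interval for age 21 would run
    from 20.5 to 21.5, which is why the measurement method matters when setting up
    intervals.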

    Age     No.   Interval Midpoint   Sum     %   Cum. %
    21        6          22            22    18     18
    22       16
    23       11          24            20    17     35
    24        9
    25       17          26            30    25     60
    26       13
    27        6          28            11     9     69
    28        5
    29        4          30             7     6     75
    30        3
    31        1          32             5     4     79
    32        4
    33        3          34             5     4     83
    34        2
    35+      21          43            21    17    100
    Total   121
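
    The Sum and % columns can be checked with a few lines of arithmetic. This is only a
    sketch: the interval sums below are the values as best they can be read from the
    table above, and the running total of rounded percentages is an assumption about how
    the last column was built.

```python
# Interval midpoints and interval sums as read from the table above
# (assumed transcription; 43 is the midpoint used for the open-ended interval).
midpoints = [22, 24, 26, 28, 30, 32, 34, 43]
sums = [22, 20, 30, 11, 7, 5, 5, 21]

n = sum(sums)

# Percentage of the class in each interval, rounded to whole percents.
percents = [round(100 * s / n) for s in sums]

# Running total of the rounded percentages.
cumulative = []
running = 0
for p in percents:
    running += p
    cumulative.append(running)

for m, s, p, c in zip(midpoints, sums, percents, cumulative):
    print(f"{m:>3} {s:>4} {p:>4} {c:>5}")
```

    Rounded percentages happen to add up to exactly 100 here; with other data they can
    miss by a point or two, in which case cumulating the unrounded values is safer.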

    Histogram

    Use
    • Graph of a frequency distribution
    • For continuous variables

    Rules
    • No space between bars
    • Equal areas must represent equal percentages or numbers
    • Percentages often preferred to numbers for vertical axis

    From a Frequency Table to a Histogram

    Histo comes from the Greek word for cell. In this case, the reference is to a cell
    like those in bee honeycombs, all of which have an equal area when viewed from the
    top. That is, a histogram is an equal cell or equal area graph, with each percentage
    represented by the same area. To prepare a histogram from a frequency table, we must
    first decide the shape and area for the cell that represents a single %. We can then
    stack the appropriate number of these cells on top of each other to represent the
    percentage of the frequency distribution located within the interval of interest.

    For this example, we will make each cell be two years wide and one unit high so that
    each percentage point for a specific interval will look something like

    [Figure: a single cell, two years wide and one unit high.]

    The 21 to 23 years interval of the age frequency distribution includes 18% of the
    observations, with a midpoint of 22; so for this interval, we need to draw 18 cells,
    as shown below.

    [Figure: first bar of the histogram. Vertical axis: Percentage of students;
    horizontal axis: Interval Midpoint. Each yellow block represents 1 cell; thus the
    21 to 23 years interval requires 18 blocks.]

    [Figure: histogram of the age distribution for the 2-year intervals. Vertical axis:
    Percentage of Students; horizontal axis: Interval Midpoint. Bar labels: 18, 17, 25,
    9, 6, 4 and 4 percent.]

    Dealing with our Interval with Indefinite Length

    • The area shown for an indefinite interval should be on the same basis as it is for
      other intervals. Our indefinite interval contains 17% of the distribution so it
      needs to include the area equivalent to that of 17 cells, where each cell is
      2 years wide by 1% tall.
    • The width of the indefinite interval is 16 years so that it is 8 cells wide.
    • The height of the bar for this interval thus needs to be 17/8 or two and 1/8 %.
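
    The equal-area rule behind this calculation can be written as a short sketch. The
    function name bar_height and its cell_width default are assumptions for
    illustration; the numbers are the ones used on the slide (17% spread over a bar
    16 years wide, with cells 2 years wide).

```python
from fractions import Fraction


def bar_height(percent, interval_width, cell_width=2):
    """Height, in percent units, of a bar whose area must equal `percent`
    cells, each cell being `cell_width` years wide and 1% tall."""
    cells_wide = Fraction(interval_width, cell_width)
    return Fraction(percent) / cells_wide


print(bar_height(18, 2))           # ordinary 2-year interval keeps its full height
print(bar_height(17, 16))          # open-ended interval drawn 16 years wide
print(float(bar_height(17, 16)))   # the same height as a decimal
```

    Exact fractions are used so the result stays in the "two and 1/8" form of the slide
    rather than a rounded decimal.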

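    [Figure: histogram of the age distribution including the Indefinite Interval.
    Vertical axis: Percentage of students; horizontal axis: Age Interval Midpoint,
    marked every 2 years from 20 to 54. Bar labels: 18, 17, 25, 9, 6, 4 and 4 percent
    for the 2-year intervals; the indefinite interval is drawn 8 cells wide, each
    2 1/8 % tall.]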

    Source: Am J Public Health, April 2004;94:559

    Source: Am J Public Health, June 2004;94:559

    Data Source: Am J Public Health, June 2004;94:559

    A revised version of the histogram on the previous slide; to deal with the indefinite
    interval for patients who are > 74 years old, we made an arbitrary decision that the
    oldest patient was 85 years old.

    [Figure: histogram. Vertical axis: Number of Patients (0 to 60); horizontal axis:
    Age, years, labeled with the midpoint of each interval (19, 30, 40, 50, 60, 70, 80).
    Largest value: 85.]

    [Figure: histogram. Vertical axis: Number of Patients (0 to 60); horizontal axis:
    Age, years, labeled with the midpoint of each interval (19, 30, 40, 50, 60, 70, 87).
    Largest value: 100.]

    Data Source: Am J Public Health, June 2004;94:559

    A histogram of data from the AJPH, June 2004; 94:559 article with oldest patient
    assumed to be 100 years old.

    Age Distribution of the U.S. Population, by Sex: 1950

    [Figure: population pyramid. Horizontal axis: Millions (15 to 0 on the Male side,
    0 to 15 on the Female side); vertical axis: Age Interval, in 5-year groups from <5
    to 85+. Source: US CENSUS BUREAU.]

    Age Distribution of the U.S. Population, by Sex: 2000

    [Figure: population pyramid comparing Year 1950 with Year 2000. Horizontal axis:
    Millions (15 to 0 on the Male side, 0 to 15 on the Female side); vertical axis:
    Age Interval, in 5-year groups from <5 to 85+. Source: US CENSUS BUREAU.]

    Age Distribution of the U.S. Population, by Sex: 2050

    [Figure: population pyramid comparing Year 2000 with Year 2050. Horizontal axis:
    Millions (15 to 0 on the Male side, 0 to 15 on the Female side); vertical axis:
    Age Interval, in 5-year groups from <5 to 85+. Source: US CENSUS BUREAU.]

    Frequency Polygon

    A frequency polygon is prepared by connecting the midpoints of the tops of the
    histogram bars with straight lines in such a manner that the area covered by the
    resulting figure includes 100% of the histogram area.
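
    One common way to satisfy the equal-area requirement, sketched below for equal-width
    intervals only, is to close the polygon at zero height one interval width beyond the
    first and last midpoints. That closing rule and the function names are assumptions,
    not something stated on the slide; the midpoints and percentages are those of the
    2-year intervals of the age example as read from this transcript.

```python
def frequency_polygon(midpoints, heights, width):
    """Vertices of a frequency polygon for equal-width histogram bars.

    The polygon is closed by adding a zero-height point one interval
    width before the first midpoint and one after the last.
    """
    xs = [midpoints[0] - width] + list(midpoints) + [midpoints[-1] + width]
    ys = [0] + list(heights) + [0]
    return list(zip(xs, ys))


def area_under(points):
    """Area under the straight-line segments (trapezoid rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Equal-width part of the age example: 2-year intervals, heights in percent.
mids = [22, 24, 26, 28, 30, 32, 34]
pcts = [18, 17, 25, 9, 6, 4, 4]
poly = frequency_polygon(mids, pcts, width=2)

histogram_area = sum(2 * p for p in pcts)
print(area_under(poly), histogram_area)
```

    The two printed areas agree because each triangle cut off a bar is matched by an
    equal triangle added next to it; with unequal widths, such as the indefinite
    interval, this simple closing rule no longer guarantees equal areas.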

    Frequency Polygon for Age Example

    [Figure: frequency polygon of the age distribution. Vertical axis: Percentage of
    students; horizontal axis: Age Interval Midpoint, marked every 2 years from 20
    to 54.]

    Source: Am J Public Health, Aug 2001;91:1209

    )ource:Am J Public Health, Sept !9"#;##:$"4

    )ource:Am J Public Health, Sept !9"#;##:$"4

    )ource:Am J Public Health, Sept !9"#;##:$"4

    Stem and Leaf Plot

    [Figure: stem and leaf plot with columns Stem / Leaf and Count; the stems are listed
    in decreasing order. The callouts on the plot read as follows.]

    Stem: The stem contains the leading digits of the observations; because we have
    3 digit numbers (up to 282) for this data set, the stem includes the first two
    digits of the numbers. In general the stem consists of n-1 digits, where n is the
    number of digits in the number. For the last two sets of numbers the stem contains
    only 1 digit because the observations are 2 digit numbers.

    Leaf: The leaf always contains the last digit of each observation with the same
    leading digits, arranged in increasing order of magnitude from 0 to 9 to form the
    leaf.

    Count: Gives the number of digits in the leaf of each stem.

    Stem and Leaf Plot

    • A chart that displays a frequency distribution similar to a histogram.
    • A stem and leaf plot shows
        • The spread of the data
        • The mode
        • Whether the distribution is skewed
        • Whether there are gaps in the data
        • Whether there are any unusual data points

    Stem and Leaf Plot

    A stem and leaf plot typically consists of three columns, the first two being
    separated by a single blank space. The first column is the stem or leading digits.
    The second is the leaf which represents the values in the interval following the
    stem digits. The third column indicates the number of data values or count in the
    interval.
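
    The three-column layout can be illustrated with a minimal sketch. It assumes
    non-negative whole-number observations, takes the last digit as the leaf and all
    leading digits as the stem, and lists stems from largest to smallest, as in the
    slides. The function name and the sample values are hypothetical and are not the
    cholesterol data.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return rows of (stem, leaf string, count), largest stem first.

    The stem is every digit except the last; the leaf is the last digit.
    Leaves are sorted in increasing order within each stem, and stems
    with no observations are kept so that gaps stay visible.
    """
    groups = defaultdict(list)
    for v in values:
        stem, leaf = divmod(int(v), 10)
        groups[stem].append(leaf)
    rows = []
    top, bottom = max(groups), min(groups)
    for stem in range(top, bottom - 1, -1):
        leaves = sorted(groups.get(stem, []))
        rows.append((stem, "".join(str(d) for d in leaves), len(leaves)))
    return rows


# Hypothetical values, chosen only for illustration.
data = [132, 131, 127, 120, 105, 101, 98]
rows = stem_and_leaf(data)
for stem, leaves, count in rows:
    print(f"{stem:>2} {leaves:<4} {count}")
```

    Keeping the empty stem 11 in the output is a design choice: it lets the plot show
    gaps in the data, one of the things a stem and leaf plot is meant to reveal.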

    Stem and Leaf Plot Example

    Question: Display the information in the table below in a Stem and Leaf Plot.

    Stem Leaf Plot for Blood Cholesterol Data

    [Figure: SAS Output, a stem and leaf plot of the blood cholesterol data with columns
    Stem / Leaf and Count, stems listed in decreasing order.]

    Histogram and Stem & Leaf plot for the Cholesterol Data

    [Figure: histogram of the cholesterol data (Std. Dev = 28.66, Mean = 167.6,
    N = 129.00) shown alongside the stem and leaf plot of the same data.]

    Note: The Stem & Leaf plot is plotted in decreasing order of magnitude, while the
    Histogram is plotted in increasing order.