Mark - Q1 Review F09


  • 8/4/2019 Mark - Q1 Review F09


    A. Stats Starts Here (all page references from Intro Stats text)

    1. what is statistics?

    a. art of distilling meaning from data

    b. a way of reasoning

    c. tools and methods

    d. an aid to understanding the world

    e. making sense of variation

    2. what is data?

    a. who?

    b. what?

    c. when?

    d. where?

    e. why?

    f. how?


    B. What is a Categorical Variable?

    A variable (sometimes called a qualitative variable) that describes an element of a population by category, such as eye color, political affiliation, rank in a traditional college, or gender.

    (Refer to Titanic Example from Chapter 3)

    1. graphic displays of categorical data

    a. frequency table (p. 21)

    b. relative frequency table (p. 21)

    c. bar charts (p. 23)

    d. relative frequency bar chart (p. 23)

    e. pie chart (p. 23)

    f. segmented bar chart (p. 31)

    2. contingency table (p. 24)

    a. marginal distributions (p.25)

    b. conditional distribution (p.27)

    c. independent variables (p.29)


    C. What is a Quantitative Variable?

    Variables recorded in numbers that we use as numbers, such as incomes, heights, weights, ages, and counts. Quantitative variables have measurement units. Units tell how a quantitative value has been measured.

    1. graphic displays of quantitative data

    a. distribution (p. 49)

    b. frequency histogram (p. 49)

    c. relative frequency histogram (p. 50)

    d. gap (p. 50)

    e. stem and leaf display (p. 50)

    f. dotplot (p. 52)

    2. describing the distribution

    a. shape (p. 52)

    b. center (p. 53, 56)

    c. spread (p. 53, 57)

    d. mode (p. 53)

    e. unimodal (p. 53)

    f. uniform (p.53)

    g. tails (p. 54)

    h. skewed (p. 54)

    i. outliers (p. 54)


    D. Summarizing Quantitative Data via Three Measures

    1. measures of centrality

    2. measures of variability

    3. measures of position

    1. measures of centrality

    a. mean (p.62)

    b. median (p.56)

    2. measures of dispersion

    a. range (p.57)

    b. variance (p.64)

    c. standard deviation (p.64)

    3. measures of position

    a. quartiles (p.58)

    b. interquartile range (IQR) (p.58)

    c. five number summary (p.60)

    d. upper and lower fence (p. 90)

    e. outlier (p.91)

    f. boxplot (p.90)


    FIND THE MEASURES OF:

    1 CENTRALITY

    FOR THE FOLLOWING AGES OF A GROUP OF

    EXTENSION SCHOOL STUDENTS:

    19 24 28 28 34 36 38 57

    a) What is the mean age of these students?

    b) What is the median age of these students?

    c) what is the mode of this data set?


    FIND THE MEASURES OF:

    2 DISPERSION OR VARIABILITY

    FOR THE FOLLOWING AGES OF A GROUP OF EXTENSION SCHOOL STUDENTS:

    19 24 28 28 34 36 38 57

    a) What is the range of these ages?

    b) What is the variance of these ages?

    c) What is the standard deviation of these ages?


    FIND THE MEASURES OF:

    3 POSITION

    FOR THE FOLLOWING AGES OF A GROUP OF EXTENSION SCHOOL STUDENTS:

    19 24 28 28 34 36 38 57

    a) what are the quartiles of this set of ages?

    b) what is the interquartile range of this set of ages?

    c) show the five number summary of this set of ages

    d) construct a boxplot (see hand-written boxplot below)

    e) find the upper and lower fences for this set of ages

    f) Are there any outliers in this set of data?


    Answers:

    FOR THE FOLLOWING AGES OF A GROUP OF EXTENSION SCHOOL

    STUDENTS,

    19 24 28 28 34 36 38 57

    FIND THE MEASURES OF:

    1 CENTRALITY

    a) What is the mean age of these students? 33

    b) What is the median age of these students? 31

    c) what is the mode of this data set? 28

    2 DISPERSION OR VARIABILITY

    a) What is the range of these ages? 38

    b) What is the variance of these ages? 134

    c) What is the standard deviation of these ages? 11.5758

    3 POSITION

    a) what are the quartiles of this set of ages? Q1 = 26 Q2 = 31 Q3 = 37

    b) what is the interquartile range of this set of ages? 11

    c) show the five number summary of this set of ages

    H = 57

    Q3 = 37

    Q2 = 31

    Q1 =26

    L = 19

    d) construct a boxplot (see hand drawn boxplot below)

    e) find the upper and lower fences for this set of ages

    upper fence = 53.5 lower fence = 9.5

    f) Are there any outliers in this set of data?

    yes -- 57 (it lies above the upper fence of 53.5)
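The answers above can be checked with a short Python sketch. It is only an illustration: the helper name `quartiles` is made up for this example, and it assumes the split-the-data-in-half quartile rule that reproduces the answer key. Other conventions, including the one behind Python's own `statistics.quantiles`, can give slightly different quartiles. The variance and standard deviation here are the sample versions (dividing by n - 1), which is what matches the values 134 and 11.5758.

```python
from statistics import mean, median, mode, variance, stdev

ages = [19, 24, 28, 28, 34, 36, 38, 57]

def quartiles(values):
    """Return Q1, Q2, Q3 as medians of the lower half, whole set, and upper half.

    Assumption: for an odd count, the middle value is left out of both halves.
    """
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]
    upper = s[-half:]
    return median(lower), median(s), median(upper)

q1, q2, q3 = quartiles(ages)
iqr = q3 - q1

# Fences sit 1.5 IQRs beyond the quartiles; values outside them are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [a for a in ages if a < lower_fence or a > upper_fence]

print("mean", mean(ages), "median", median(ages), "mode", mode(ages))
print("range", max(ages) - min(ages))
print("variance", variance(ages), "sd", round(stdev(ages), 4))
print("five-number summary", (min(ages), q1, q2, q3, max(ages)))
print("fences", lower_fence, upper_fence, "outliers", outliers)
```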


    E. The Standard Deviation as a Ruler and the Normal Model

    1. normal model -- 68-95-99.7 Rule

    [Figure: Normal curve marking the central 68%, 95%, and 99.7% of values, with the horizontal axis running from 3 standard deviations below the mean to 3 standard deviations above it]

    In a Normal model, about 68% of the values fall within one standard deviation of the mean, about 95% of the values fall within two standard deviations of the mean, and about 99.7% of the values fall within three standard deviations of the mean.
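The three percentages in the rule can be reproduced numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `share_within` is an assumption made for this illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a Normal model lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
```

The printed proportions are 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.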


    2. calculating the z-score

    Formula

    The standard score is

    $$z = \frac{y - \mu}{\sigma}$$

    where:

    $y$ is a raw score to be standardized; $\mu$ is the mean of the population;

    $\sigma$ is the standard deviation of the population.
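As a minimal sketch, the standard score (raw score minus the population mean, divided by the population standard deviation) can be written as a small function. The function name `z_score` and the sample inputs are assumptions for illustration; the mean of 125 and standard deviation of 15 are borrowed from the blood pressure question later in this review.

```python
def z_score(y, mu, sigma):
    """Standardize a raw score y given a population mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (y - mu) / sigma

# Illustrative values: population mean 125, standard deviation 15
print(z_score(110, 125, 15))  # -1.0, one standard deviation below the mean
print(z_score(155, 125, 15))  # 2.0, two standard deviations above the mean
```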

    3. the z-score table (example of partial table)

    Positive z-scores:

    z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
    0.0   0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
    0.1   0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
    0.2   0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
    0.3   0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
    0.4   0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
    0.5   0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
    0.6   0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
    0.7   0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
    0.8   0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
    0.9   0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
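Each entry in a table like this is the area under the standard Normal curve to the left of z, with the row giving z to one decimal place and the column giving the second decimal. A hedged sketch of how such rows could be regenerated with Python's standard library follows; the helper name `table_row` is hypothetical.

```python
from statistics import NormalDist

phi = NormalDist().cdf  # cumulative area to the left of z under the standard Normal curve

def table_row(tenths):
    """Areas for z = tenths + 0.00, 0.01, ..., 0.09, rounded to 4 decimal places."""
    return [round(phi(round(tenths + h / 100, 2)), 4) for h in range(10)]

# Print the rows for z = 0.0 through 0.9
for t in range(10):
    row = table_row(t / 10)
    print(f"{t / 10:.1f}  " + "  ".join(f"{p:.4f}" for p in row))
```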


    4. finding percentages using z-scores

    DRAW A PICTURE

    a. above a point

    b. below a point

    c. between two points

    d. beyond two points

    5. finding a point if the percentages are provided
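Items 4 and 5 are inverse tasks: going from a z-score to an area, and from an area back to a z-score. The sketch below shows both directions for the standard Normal model. All function names are assumptions made for this example, and the z-values are arbitrary.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal model: mean 0, standard deviation 1

def below(z):
    """Area to the left of z."""
    return Z.cdf(z)

def above(z):
    """Area to the right of z."""
    return 1 - Z.cdf(z)

def between(z_low, z_high):
    """Area between two z-scores."""
    return Z.cdf(z_high) - Z.cdf(z_low)

def beyond(z_low, z_high):
    """Area in both tails: below z_low plus above z_high."""
    return 1 - between(z_low, z_high)

def cutoff(area_below):
    """z-score that has the given proportion of values below it."""
    return Z.inv_cdf(area_below)

print(round(below(0.5), 4))       # 0.6915
print(round(above(0.5), 4))       # 0.3085
print(round(between(-1, 1), 4))   # 0.6827
print(round(beyond(-2, 2), 4))    # 0.0455
print(round(cutoff(0.90), 2))     # 1.28
```

Drawing the picture first still matters: it tells you which of these four areas the question is asking for.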

    6. normal probability plot

    Sample Plot

    The points on this plot form a nearly linear pattern, which indicates that the normal distribution is a good model for this data set.


    7. what can go wrong?

    a. don't use a normal model if the distribution is not unimodal/symmetric

    b. don't use mean and standard deviation if outliers are present

    c. don't round off too soon

    d. don't round your results in the middle of a calculation

    e. don't worry about minor differences in results


    Normal Model Question

    In a particular population of sedentary adults (i.e., adults aged 20-40 who spend less than 5 hours per week in active exercise), the distribution of systolic blood pressure is approximately normal with a mean of 125 and a standard deviation of 15. Suppose that a sedentary young adult is randomly selected from this population.

    a. Find the probability that this person will have a systolic blood pressure of 110 or less.

    b. Would you expect this person to have a systolic blood pressure below 80? Explain fully and show calculations to justify your answer.

    c. What is the probability that this person will have a systolic blood pressure higher than 150?

    d. Assume that high blood pressure is defined as a systolic blood pressure in the highest 10% of the population. What value would be the cutoff for high blood pressure in this population of sedentary young adults?
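One way to check work on this question is to model the population directly, as in the sketch below. It is not an official answer key. The variable names are assumptions, and the second part is read here as asking about a pressure below 80, which is also an assumption.

```python
from statistics import NormalDist

# Normal model from the question: mean 125, standard deviation 15
bp = NormalDist(mu=125, sigma=15)

p_110_or_less = bp.cdf(110)      # z = (110 - 125) / 15 = -1
p_below_80 = bp.cdf(80)          # z = (80 - 125) / 15 = -3
p_above_150 = 1 - bp.cdf(150)    # z = (150 - 125) / 15, about 1.67
top_10_cutoff = bp.inv_cdf(0.90) # value with 90% of the population below it

print(f"P(BP <= 110) = {p_110_or_less:.4f}")
print(f"P(BP < 80)   = {p_below_80:.4f}")
print(f"P(BP > 150)  = {p_above_150:.4f}")
print(f"cutoff for the highest 10% = {top_10_cutoff:.1f}")
```

A pressure below 80 sits three standard deviations under the mean, so by the 68-95-99.7 rule it would be very unusual. Results from a printed z-table may differ slightly in the last digit because the table rounds z to two decimal places.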
