BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

  • date post

    19-Dec-2015

Page 1: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

BASIC STATISTICS For the HEALTH SCIENCES

Fifth Edition

By Kuzma

Page 2: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

“It’s all Greek to me!!!!”

$\Sigma$ = Summation

$\sum x_i = x_1 + x_2 + x_3 + x_4 + x_5 + x_6 = 5 + 4 + 6 + 5 + 2 + 8 = 30$

Page 3: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

CHAPTER 4

Summarizing Data

Page 4: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

OUTLINE

4.1 MEASURES OF CENTRAL TENDENCY

Explains why the selection of an appropriate sample has an important bearing on the reliability of inferences about a population

4.2 MEASURES OF VARIATION

Describes several measures of variation or variability, including the standard deviation

4.3 COEFFICIENT OF VARIATION

Defines the coefficient of variation, useful in comparing levels of variation

4.4 MEASURING AND INTERPRETING SKEWNESS

Explains how to measure skewness and how to determine if a distribution is symmetrical or skewed

4.5 MEANS AND STANDARD DEVIATIONS OF POPULATIONS

Contrasts the equations for the parameters of a population with those for the statistics of a sample

Page 5: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

LEARNING OBJECTIVES

1. Compute and distinguish between the uses of measures of central tendency: mean, median, and mode

2. Compute and list some uses for measures of variation: range, variance, and standard deviation

3. Compare sets of data by computing their coefficients of variation

4. Be able to compute the mean and standard deviation for grouped and ungrouped data

5. Determine if a data set is symmetrical or skewed

6. Understand the distinction between the population mean and the sample mean

Page 6: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

MEASURES OF CENTRAL TENDENCY

A. The Mean

1. The arithmetic or simple mean is computed by summing all the observations in the sample and dividing the sum by the number of observations

2. There are also harmonic and geometric means

3. The arithmetic mean may also be considered the balance point, or fulcrum

4. Symbolically, the mean is represented by $\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$

B. The Median

1. The observation that divides the distribution into two equal parts

2. Considered the most typical observation in a distribution

3. That value above which there are the same number of observations as below

Page 7: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

MEASURES OF CENTRAL TENDENCY

C.                  The Mode

1.                    Observation that occurs most frequently

2.                    If all values are different, there is no mode

D. Which Average Should You Use?

1. The arithmetic mean is the most commonly used

2. The median gives the typical observation for a distribution – good for skewed data such as income

Page 8: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

MEASURES OF VARIATION

A. Range

1. The difference in value between the highest (maximum) and lowest (minimum) observation

Range = $x_{\max} - x_{\min}$

2. Can be computed easily but is not very useful because it considers only the extremes

Page 9: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

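MEASURES OF VARIATION

B. Mean Deviation

1. The average deviation of all observations from the mean

2. The sum of the absolute values of the deviations from the mean, divided by the number of observations

Mean Deviation = $\frac{|x_1 - \bar{x}| + |x_2 - \bar{x}| + \dots + |x_n - \bar{x}|}{n}$

or

Mean Deviation = $\frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$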

Page 10: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

MEASURES OF VARIATION

C. Standard Deviation

1. By far the most widely used measure of variation

2. The square root of the variance of the observations

3. Computed by:

- squaring each deviation from the mean

- adding them up

- dividing their sum by one less than the sample size (n – 1)

- taking the square root of the result

Page 11: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

MEASURES OF VARIATION

$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

or

$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Page 12: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

x_i      (x_i - x̄)    (x_i - x̄)²
1        -5           25
2        -4           16
4        -2           4
7        +1           1
10       +4           16
12       +6           36
Σx_i = 36     0       Σ(x_i - x̄)² = 98

n = 6, Mean = 36/6 = 6

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{98}{6 - 1} = \frac{98}{5} = 19.6$

Standard Deviation = $\sqrt{19.6}$ = 4.43
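The hand calculation in this table can be checked with a few lines of Python. This is only a sketch using the six observations from the slide; the variable names are illustrative and not part of the original material.

```python
import math
import statistics

# Six observations from the worked table.
x = [1, 2, 4, 7, 10, 12]
n = len(x)
mean = sum(x) / n                                    # 36 / 6 = 6.0

value_range = max(x) - min(x)                        # highest minus lowest
deviations = [xi - mean for xi in x]                 # these sum to zero
mean_deviation = sum(abs(d) for d in deviations) / n
sum_of_squares = sum(d ** 2 for d in deviations)     # sum of squared deviations
variance = sum_of_squares / (n - 1)                  # divide by n - 1
std_dev = math.sqrt(variance)

print(value_range, round(mean_deviation, 2), variance, round(std_dev, 2))

# The standard library's sample variance agrees with the hand calculation.
assert math.isclose(variance, statistics.variance(x))
```

Dividing by n - 1 rather than n is what makes this the sample variance; `statistics.variance` uses the same convention.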

Page 13: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Variance and Standard Deviation

Variance = (Standard Deviation)²

$\sqrt{\text{Variance}}$ = Standard Deviation

Page 14: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Raw data and results of cholesterol levels in 26 subjects

Number of observations (N): 26
Initial HDL values: 31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57, 58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90 mg/dl
Highest value: 90 mg/dl
Lowest value: 31 mg/dl
Mode: 47, 48, 58, 60 mg/dl
Median: (57 + 58)/2 = 57.5 mg/dl
Sum of the values (Σx_i): 1496 mg/dl
Mean (x̄): 1496/26 = 57.5 mg/dl
Interquartile range: 64 – 48 = 16 mg/dl
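As a cross-check on the summary values for the HDL data, the measures of central tendency can be recomputed with Python's standard `statistics` module. This is a minimal sketch; `multimode` needs Python 3.8 or later, and the variable names are illustrative.

```python
import statistics

# The 26 HDL values (mg/dl) from the slide.
hdl = [31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57,
       58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90]

n = len(hdl)
total = sum(hdl)
mean = statistics.mean(hdl)            # sum of values divided by n
median = statistics.median(hdl)        # average of the 13th and 14th values
modes = statistics.multimode(hdl)      # every value tied for most frequent

print(n, total, round(mean, 1), median, modes)
```

The data have four modes because 47, 48, 58 and 60 each occur twice, which illustrates why the mode is often a weak summary for small samples.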

Page 15: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Measures of dispersion based on the Mean.

Mean deviation = $\frac{\sum |x_i - \bar{x}|}{N}$

Variance = $s^2 = \frac{\sum (x_i - \bar{x})^2}{N - 1}$

Standard deviation = $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N - 1}}$

Degrees of Freedom = N – 1

Page 16: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Raw data and results of cholesterol levels in 26 subjects

Number of observations (N): 26
Initial HDL values: 31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57, 58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90 mg/dl
Highest value: 90 mg/dl
Lowest value: 31 mg/dl
Mode: 47, 48, 58, 60 mg/dl
Median: (57 + 58)/2 = 57.5 mg/dl
Sum of the values (Σx_i): 1496 mg/dl
Mean (x̄): 1496/26 = 57.5 mg/dl
Interquartile range: 64 – 48 = 16 mg/dl
Sum of squares (TSS): 4,298.46 (mg/dl)²
Variance (s²): 4,298.46/25 = 171.94 (mg/dl)²
Standard deviation (s): √171.94 = 13.1 mg/dl

Page 17: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

MEASURES OF VARIATION

D. Computing Central Tendency Using SPSS

1. To calculate the mean and standard deviation:

a. Go to the menu and choose Analyze

b. Select Descriptives (under Descriptive Statistics)

- Select the variables you want to analyze

Page 18: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

COEFFICIENT OF VARIATION

A. Defined as the ratio of the standard deviation to the absolute value of the mean, expressed as a percentage

$CV = \frac{s}{|\bar{x}|} \times 100\%$

B. Depicts the size of the standard deviation in comparison to its mean

C. Possible to use it to compare the relative variation of even unrelated quantities
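To illustrate how the coefficient of variation lets two data sets be compared on a common scale, the sketch below applies the definition to the HDL values and to the six-value sample used earlier in the chapter. The function name `coefficient_of_variation` is an illustrative choice, not something taken from the text.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the absolute mean."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)        # sample SD (n - 1 in the denominator)
    return s / abs(mean) * 100

hdl = [31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57,
       58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90]
small_sample = [1, 2, 4, 7, 10, 12]

cv_hdl = coefficient_of_variation(hdl)
cv_small = coefficient_of_variation(small_sample)
print(f"HDL: {cv_hdl:.1f}%   six-value sample: {cv_small:.1f}%")
```

Although the HDL data have the larger standard deviation in absolute terms, their variation relative to the mean is much smaller than that of the six-value sample.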

Page 19: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

MEASURING AND INTERPRETING SKEWNESS

A. Skewed – a distribution that has a marked or pronounced asymmetry

B. Use SPSS to measure skewness

1. Open the dialog box titled Options

2. Select Skewness

3. Two columns appear under Skewness

a. Statistic

b. Std. Error

4. The Statistic column gives the actual measure of skewness; the Std. Error column gives the standard error of skewness
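For readers without SPSS, a skewness statistic can be sketched in Python. The formula below is the adjusted Fisher-Pearson coefficient, $G_1 = \frac{n}{(n-1)(n-2)} \sum \left(\frac{x_i - \bar{x}}{s}\right)^3$. Treating it as the quantity a statistics package reports is an assumption here, since the slides do not say which formula is applied. The function name and the two small data sets are hypothetical.

```python
import statistics

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (an assumed choice of formula)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = statistics.fmean(values)
    s = statistics.stdev(values)                     # sample standard deviation
    cubed = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

symmetric = [1, 2, 3, 4, 5]          # hypothetical symmetrical data
right_skewed = [1, 2, 2, 3, 10]      # hypothetical data with a long right tail

print(sample_skewness(symmetric), sample_skewness(right_skewed))
```

A value near zero suggests a symmetrical distribution, a positive value a tail to the right, and a negative value a tail to the left.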

Page 20: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

MEANS AND STANDARD DEVIATIONS OF POPULATIONS

A. Population Mean

1. The population mean (µ) is defined as the sum of the values divided by the number of observations for the entire population (N)

2. The population variance (σ²) is the sum of the squared deviations from the population mean, divided by N

3. The sample mean (x̄) is an estimate of the population mean and is the sum of the values in the sample divided by n, the number of observations in the sample alone

4. Whereas the population variance divides by N, the sample variance (s²) is the sum of the squared deviations from the sample mean divided by n – 1
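The difference between the two denominators can be seen directly in Python, where `statistics.pvariance` divides by N and `statistics.variance` divides by n - 1. The sketch reuses the six-value data set from the worked standard deviation example; treating it first as a whole population and then as a sample is purely for illustration.

```python
import statistics

x = [1, 2, 4, 7, 10, 12]

population_variance = statistics.pvariance(x)   # 98 / 6, divides by N
sample_variance = statistics.variance(x)        # 98 / 5, divides by n - 1

print(round(population_variance, 2), round(sample_variance, 2))
print(round(statistics.pstdev(x), 2), round(statistics.stdev(x), 2))
```

The sample figure is always the larger of the two, and the gap shrinks as the number of observations grows.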

Page 21: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

CONCLUSION

In describing data by use of a summary measure, it is important to select the measure of central tendency that most accurately represents the data. A key factor is to determine whether the data are symmetrical or skewed. Data are most commonly represented by two summary measures – one to indicate central tendency and one to indicate variation. The most commonly used pair is the arithmetic mean and the standard deviation.

Page 22: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

CHAPTER 5

Probability

Page 23: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

OUTLINE

5.1 WHAT IS PROBABILITY

Discusses the concept of probability as a measure of the likelihood of occurrence of a particular event

5.2 COMPLEMENTARY EVENTS

Demonstrates how to calculate probability when the events are complementary

5.3 PROBABILITY RULES

Solves problems involving the probability of compound events by use of the addition rule or the multiplication rule, or conditional probability

5.4 COUNTING RULES

Explains how to compute the number of possible ways an event can occur by use of permutations and combinations

5.5 PROBABILITY DISTRIBUTIONS

Illustrates the concept of a probability distribution, which lists the probabilities associated with the various outcomes of a variable

5.6 LOTTERY PROBABILITY AND SAMPLING

Uses lottery probabilities to illustrate sampling with and without replacement

5.7 BINOMIAL DISTRIBUTION

Describes a common distribution having only two possible outcomes on each trial

Page 24: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

LEARNING OBJECTIVES

1. Define probability and compute it in a given situation

2. State the basic properties of probability

3. Select and apply the appropriate probability rule for a given situation

4. Distinguish between mutually exclusive events and independent events

5. Distinguish between permutations and combinations, and be able to compute them for various events

6. Explain what a probability distribution is, and state its major use

7. State the probabilities by using a binomial distribution

8. Interpret the symbols in the binomial term

Page 25: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

WHAT IS PROBABILITY

A. Applies exclusively to a future event

B. Probability statements are numeric, defined in the range 0 to 1, never more and never less

C. Defined as the ratio of the number of ways the specified event can occur to the total number of equally likely events that can occur

D. P(E) can be defined as the proportion of times a favorable event will occur in a long series of repeated trials:

$P(E) = \frac{n}{N} = \frac{\text{number of favorable outcomes}}{\text{number of possible outcomes}}$

Page 26: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Probability

Probabilities are always stated as values between 0 and 1. (Rule 1)

The sum of all probabilities must equal 100%

Three basic rules:

The independence rule

The product rule

The addition rule

Page 27: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Mutually exclusive

Event A is mutually exclusive of event B if, when event A occurs, event B cannot occur and, conversely, when event B occurs, event A cannot occur.

Rule 2: $P(\bar{A}) = 1 - P(A)$

Page 28: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Independence

Coin 1    Coin 2
H         H
H         T
T         H
T         T

• The first law of probability: (rule 5)

– The probability of the occurrence of any one of two or more mutually exclusive events is equal to the sum of their separate probabilities.

If a coin is flipped two times in a row and heads shows on the first flip, is there a better chance that heads will show on the second flip?

No. The flips are independent, so P(H | H) = P(H | T)

Page 29: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

COMPLEMENTARY EVENTS

A. Event $\bar{A}$ is the complement of event A

B. P(A) = sum of probabilities of outcomes in A

C. $P(\bar{A})$ = sum of probabilities of outcomes in $\bar{A}$

And

$P(A) + P(\bar{A}) = 1$

Therefore

$P(\bar{A}) = 1 - P(A)$

Page 30: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

[Figure: Venn diagram of a 52-card deck: 4 aces and 48 other cards. Total 52 cards.]

Page 31: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Conditional Probability

Suppose that in the case of drawing cards, the first card drawn is not replaced.

If the first card drawn is an ace, what is the probability that the second card will be an ace?

What is the probability the second will be a deuce?

P(B | A) = P(AB) / P(A)

The vertical line “|” between B and A indicates that the event to the right of the vertical line (event A) is a condition potentially influencing the probability of the event to the left (event B)

Page 32: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Probability of an ace on the first draw: 4/52 = 1/13 = .0769

Page 33: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

[Figure: Venn diagram of a 52-card deck: 4 aces and 48 other cards. Total 52 cards.]

Page 34: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Probability of an ace on the first draw: 4/52 = 1/13 = .0769

Probability of an ace on the second draw without replacement: 3/51 = .0588

Page 35: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

[Figure: Venn diagram of the 51 cards remaining after an ace has been drawn: 3 aces and 48 other cards. Total 51 cards.]

Page 36: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Probability of an ace on the first draw: 4/52 = 1/13 = .0769

Probability of an ace on the second draw without replacement: 3/51 = .0588

Probability of a deuce on the second draw without replacement: 4/51 = .0784
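These card probabilities can be confirmed by brute force. The sketch below lists every ordered pair of cards drawn without replacement and applies the definition P(B | A) = P(AB) / P(A). Modelling the deck by rank only (aces, deuces, everything else) is a simplification made for this illustration, and the helper `prob` is a hypothetical name.

```python
from fractions import Fraction
from itertools import permutations

# Deck modelled by rank only: 4 aces, 4 deuces, 44 other cards.
deck = ["A"] * 4 + ["2"] * 4 + ["other"] * 44

# Every ordered (first card, second card) pair drawn without replacement.
draws = [(deck[i], deck[j]) for i, j in permutations(range(len(deck)), 2)]
total = len(draws)                                   # 52 * 51 ordered pairs

def prob(event):
    """Probability of an event as a fraction of the equally likely draws."""
    return Fraction(sum(1 for d in draws if event(d)), total)

p_a = prob(lambda d: d[0] == "A")                    # ace on the first draw
p_ace_then_ace = prob(lambda d: d == ("A", "A"))     # P(AB), B = ace second
p_ace_then_deuce = prob(lambda d: d == ("A", "2"))   # P(AB), B = deuce second

p_ace_given_ace = p_ace_then_ace / p_a               # P(B | A) = P(AB) / P(A)
p_deuce_given_ace = p_ace_then_deuce / p_a

print(p_a, p_ace_given_ace, p_deuce_given_ace)
print(round(float(p_ace_given_ace), 4), round(float(p_deuce_given_ace), 4))
```

Using `Fraction` keeps the results exact, so 3/51 appears in lowest terms as 1/17.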

Page 37: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

[Figure: Venn diagram of the 51 cards remaining after an ace has been drawn, with the 4 deuces marked. Total 51 cards.]

Page 38: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Product (Multiplication Theorem)

The second law of probability: rule 6

The probability of the simultaneous or successive occurrence of two or more independent events is equal to the product of their separate probabilities.

What is the probability of drawing an ace three times in a row with replacement?

1/13 x 1/13 x 1/13 = 1/2197

P(AB) = P(A) x P(B | A)
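A short sketch of the product rule in Python, contrasting independent draws (with replacement) with dependent draws (without replacement), where each factor is a conditional probability. The without-replacement case is an added illustration and not a figure from the slides.

```python
from fractions import Fraction

# With replacement the draws are independent, so the probabilities multiply.
p_ace = Fraction(4, 52)
three_aces_with_replacement = p_ace ** 3

# Without replacement each factor is conditional on the earlier draws:
# P(ABC) = P(A) x P(B | A) x P(C | AB)
three_aces_without_replacement = (
    Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)
)

print(three_aces_with_replacement, three_aces_without_replacement)
```

Removing aces from the deck makes each further ace less likely, so the without-replacement probability is the smaller of the two.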

Page 39: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Bayes’ Rule

Conditional probability: P(B | A) = P(AB) / P(A)

and

Multiplication theorem: P(AB) = P(A | B) P(B)

Bayes’ Rule: P(B | A) = P(A | B) P(B) / P(A)
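A small numerical check of Bayes' rule, using a standard 52-card deck as an assumed example: let A be "the card is an ace" and B be "the card is a club". The events are chosen for illustration and do not come from the slide.

```python
from fractions import Fraction

p_a = Fraction(4, 52)            # P(A): the card is an ace
p_b = Fraction(13, 52)           # P(B): the card is a club
p_a_given_b = Fraction(1, 13)    # P(A | B): one of the 13 clubs is an ace

# Bayes' rule: P(B | A) = P(A | B) P(B) / P(A)
p_b_given_a = p_a_given_b * p_b / p_a

# Direct count for comparison: one of the four aces is a club.
direct = Fraction(1, 4)
print(p_b_given_a, direct)
```

Both routes give the same answer, which is all the rule claims: it rewrites one conditional probability in terms of the other.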

Page 40: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Addition

All the different possible probabilities must add up to 1.0 (100%)

A probability must always be ≤ 1.0

p + (1 – p) = 1.0

P(A or B) = P(A) + P(B) – P(AB)

Page 41: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Addition Theorem

[Figure: Venn diagram of a 52-card deck showing the regions A not B, AB, and B not A. Total 52 cards.]

Page 42: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Addition Theorem

[Figure: Venn diagram of a 52-card deck showing the regions A not B, AB, and B not A, with A = ace and B = club. Total 52 cards.]

The probability of drawing an ace or a club:

P(A or B) = P(A) + P(B) – P(AB)

P(Ace or Club) = 4/52 + 13/52 – 1/52 = 16/52
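The ace-or-club calculation can be verified by building the deck and counting. This sketch uses Python sets, so the overlap AB (the ace of clubs) is the set intersection; the rank and suit labels are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(ranks, suits))                 # 52 (rank, suit) pairs

aces = {card for card in deck if card[0] == "A"}
clubs = {card for card in deck if card[1] == "clubs"}

def p(event):
    """Probability of drawing a card that belongs to the event set."""
    return Fraction(len(event), len(deck))

# Addition rule: P(A or B) = P(A) + P(B) - P(AB)
p_ace_or_club = p(aces) + p(clubs) - p(aces & clubs)

print(p_ace_or_club, p(aces | clubs))
```

Subtracting P(AB) prevents the ace of clubs from being counted twice; the result, 16/52, reduces to 4/13.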

Page 43: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

PROBABILITY RULES

B. The Addition Rule:

The probability that event A or event B (or both) will occur equals the sum of the probabilities of each individual event minus the probability of both

$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$

Page 44: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

COUNTING RULES

A. Rule 1: Number of Ways

If event A can occur in n1 distinct ways and event B can occur in n2 ways, then the event consisting of A and B can occur in n1 × n2 ways

B. Rule 2: Permutations

1. In determining the number of ways in which you can arrange a group of objects, you must first know whether the order of arrangement plays a role.

2. A permutation is a selection of r objects from a group of n objects, taking the order of selection into account

C. Rule 3: Combinations

A selection of a subgroup of distinct objects, with order not being important
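The three counting rules map onto standard formulas: the number of permutations of r objects from n is n!/(n - r)!, and the number of combinations is n!/(r!(n - r)!). The sketch below evaluates them with Python's `math` module (Python 3.8 or later); the numbers chosen are hypothetical.

```python
import math

# Rule 1: if A can occur in 3 ways and B in 4 ways (hypothetical counts),
# then A and B together can occur in 3 x 4 ways.
ways_a, ways_b = 3, 4
total_ways = ways_a * ways_b

# Rules 2 and 3: choose r = 2 objects from n = 5 (hypothetical sizes).
n, r = 5, 2
n_permutations = math.perm(n, r)      # order matters: n! / (n - r)!
n_combinations = math.comb(n, r)      # order ignored: n! / (r! (n - r)!)

# Each combination of r objects can be arranged in r! different orders.
assert n_permutations == n_combinations * math.factorial(r)

print(total_ways, n_permutations, n_combinations)
```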

Page 45: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

PROBABILITY DISTRIBUTIONS

A. Defined as a complete list of all possible outcomes, together with the probability of each

B. Random Variable – a variable whose value is determined by the outcome of a chance experiment

C. A probability distribution is a list of the probabilities associated with the values of the random variable obtained in an experiment

Page 46: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

LOTTERY PROBABILITY AND SAMPLING

A. Sampling without replacement –

Once a number has been drawn (or a subject selected), that number may not be drawn again

B. Sampling with replacement –

The same number (or subject) may be picked multiple times
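The two sampling schemes correspond to two different functions in Python's `random` module. The sketch below draws five numbers each way from a hypothetical drum of 53 numbered balls; the seed value is arbitrary and only makes the run repeatable.

```python
import random

rng = random.Random(2015)       # arbitrary fixed seed so the run is repeatable
numbers = range(1, 54)          # hypothetical drum of balls numbered 1 to 53

# Without replacement: a drawn number cannot come up again.
without_replacement = rng.sample(numbers, k=5)

# With replacement: each draw is made from the full drum, so repeats can occur.
with_replacement = rng.choices(numbers, k=5)

print(without_replacement, with_replacement)
```

`sample` is guaranteed to return five distinct numbers, whereas `choices` carries no such guarantee.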

Page 47: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

BINOMIAL DISTRIBUTION

A. A binomial distribution serves as a model for outcomes limited to two choices (e.g. sick or well, dead or alive…)

B. Expansion of the binomial term $(p + q)^n$, where

1. p is the probability of a successful outcome

2. q = 1 – p is the probability of an unsuccessful outcome, and

3. n is the number of trials or attempts

Page 48: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

BINOMIAL DISTRIBUTION

C. The binomial expansion is applicable provided that

1. Each trial has only two possible outcomes – success or failure

2. The outcome of each trial is independent of the outcomes of any other trial

3. The probability of success, p, is constant from trial to trial

Page 49: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Binomial expansion

If two coins are tossed at once, four possibilities may occur.

Coin 1    Coin 2    Probability
H         H         1/4
H         T         1/4
T         H         1/4
T         T         1/4
Total               4/4 = 1

1/2

Page 50: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Binomial expansion

If three coins are tossed at once, eight possibilities may occur.

Row    First coin    Second coin    Third coin
1      H             H              H
2      H             H              T
3      H             T              H
4      T             H              H
5      H             T              T
6      T             H              T
7      T             T              H
8      T             T              T

Page 51: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

BINOMIAL DISTRIBUTION

D. In general, the probability of an event consisting of r successes out of n trials is

$P(r \text{ successes}) = \frac{n!}{r!\,(n-r)!}\, p^r q^{n-r}$

where

n = the number of trials in an experiment

r = the number of successes

n – r = the number of failures

p = the probability of success

q = 1 – p, the probability of failure
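The binomial formula translates directly into code. As a sketch, the function below (the name `binomial_probability` is illustrative) evaluates it for three tosses of a fair coin, counting heads as a success, and uses exact fractions so the results can be compared with a table of the eight possible outcomes.

```python
from fractions import Fraction
from math import comb

def binomial_probability(r, n, p):
    """P(r successes in n trials) = n! / (r! (n - r)!) * p^r * q^(n - r)."""
    q = 1 - p                                   # probability of failure
    return comb(n, r) * p ** r * q ** (n - r)

p_heads = Fraction(1, 2)                        # fair coin, heads = success
distribution = [binomial_probability(r, 3, p_heads) for r in range(4)]

print(distribution)        # probabilities of 0, 1, 2 and 3 heads
print(sum(distribution))   # the terms of the expansion sum to 1
```

With three coins, exactly one of the eight equally likely outcomes has no heads, three have one head, three have two heads, and one has three heads.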

Page 52: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Factorials

1!    1                  1
2!    1*2                2
3!    1*2*3              6
4!    1*2*3*4            24
5!    1*2*3*4*5          120
6!    1*2*3*4*5*6        720
7!    1*2*3*4*5*6*7      5040
8!    1*2*3*4*5*6*7*8    40320

Page 53: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

BINOMIAL DISTRIBUTION

The expression

$\frac{n!}{r!\,(n-r)!}\, p^r q^{n-r}$

is a term from the binomial expansion

Page 54: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

BINOMIAL DISTRIBUTION

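The entire expansion lists the terms for r successes and (n – r) failures from the binomial distribution. For n = 3:

$(q + p)^3 = q^3 + \frac{3!}{1!\,(3-1)!}\, q^2 p + \frac{3!}{2!\,(3-2)!}\, q p^2 + p^3$

$= q^3 + 3q^2p + 3qp^2 + p^3$

$= P(3F) + P(1S, 2F) + P(2S, 1F) + P(3S)$

where S denotes a success and F a failure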

Page 55: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Six coins?

A table would be complex.

Formula?

$(p + q)^n = p^n + \frac{n}{1}\, p^{n-1} q + \frac{n(n-1)}{(1)(2)}\, p^{n-2} q^2 + \dots + q^n$

There is an easier way! (Pascal’s Triangle)

Page 56: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

Pascal’s Triangle

n    Binomial Coefficient    Denominator of p
1    1 1                     2
2    1 2 1                   4
3    1 3 3 1                 8
4    1 4 6 4 1               16
5    1 5 10 10 5 1           32
6    1 6 15 20 15 6 1        64

To extend the table to 7 trials, begin and end the new row with the value 1. Each remaining cell on line 7 is the sum of the two adjacent numbers on line 6 that it sits between.
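The row-building procedure is easy to express in code. In the sketch below, the function `pascal_row` (an illustrative name) starts from a single 1 and repeatedly forms the next row by placing 1 at each end and summing adjacent pairs in between.

```python
def pascal_row(n):
    """Return the binomial coefficients for n trials (row n of the triangle)."""
    row = [1]                                        # row for n = 0
    for _ in range(n):
        # 1 at each end, sums of adjacent pairs in between
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]
    return row

row_6 = pascal_row(6)
row_7 = pascal_row(7)

print(row_6, sum(row_6))
print(row_7, sum(row_7))
```

The sum of each row is 2 to the power n, which is the denominator column of the table: with a fair coin, each coefficient divided by that sum is the probability of the corresponding number of heads.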

Page 57: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

PowerBall Lottery

53 white balls, 5 balls selected: 53!/(5! 48!) = 2,869,685 possible combinations, so the probability of matching all five is 1/2,869,685

42 red balls, 1 Powerball selected: 1/42

Jackpot: 1/2,869,685 × 1/42 = 1/120,526,770
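The jackpot odds follow from the combinations rule and the product rule. The sketch below recomputes them, assuming the game described on the slide: 5 white balls drawn from 53 and 1 red ball drawn from 42.

```python
from fractions import Fraction
from math import comb

white_balls, white_drawn = 53, 5
red_balls = 42

white_combinations = comb(white_balls, white_drawn)   # order does not matter
p_white = Fraction(1, white_combinations)             # match all five white balls
p_red = Fraction(1, red_balls)                        # match the red Powerball

# The two draws are independent, so the product rule applies.
p_jackpot = p_white * p_red

print(white_combinations, p_jackpot)
```

The denominator of `p_jackpot` equals the jackpot figure on the slide, 120,526,770, which is 2,869,685 multiplied by 42.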

Page 58: BASIC STATISTICS For the HEALTH SCIENCES Fifth Edition By Kuzma.

CONCLUSION

Probability measures the likelihood that a particular event will or will not occur. In a long series of trials, probability is the ratio of the number of favorable outcomes to the total number of equally likely outcomes. Permutations and combinations are useful in determining the number of outcomes. If compound events are involved, we need to select and apply the addition rule or the multiplication rule to compute probabilities. The outcomes of an experiment together with their respective probabilities constitute a probability distribution. One very common probability distribution is the binomial distribution. It presents the probabilities of various numbers of successes in trials where there are only two possible outcomes to each trial.