SADC Course in Statistics The binomial distribution (Session 06)


Page 1: SADC Course in Statistics The binomial distribution (Session 06)

SADC Course in Statistics

The binomial distribution

(Session 06)

Page 2: SADC Course in Statistics The binomial distribution (Session 06)

2To put your footer here go to View > Header and Footer

Learning Objectives

At the end of this session you will be able to:

• describe the binomial probability distribution including the underlying assumptions

• calculate binomial probabilities for simple situations

• apply the binomial model in appropriate practical situations

Page 3: SADC Course in Statistics The binomial distribution (Session 06)

Study of child-headed households

• One devastating effect of the HIV and AIDS pandemic is the emergence of child-headed households, i.e. ones where both parents have died and the children are left to fend for themselves.

• Suppose it is of interest to study in greater detail those households that are child-headed.

• The statistical techniques that may be employed first require knowledge of the distributional pattern of the random variable X, the number of child-headed households.

Page 4: SADC Course in Statistics The binomial distribution (Session 06)

A probability distribution for X

• Interest is in the distribution of X = number of child-headed households.

• Under certain assumptions, X has a binomial distribution.

• To introduce this distribution, we first deal with a simpler (but related) distribution.

Page 5: SADC Course in Statistics The binomial distribution (Session 06)

The Bernoulli Distribution

The simplest probability distribution is one describing the behaviour of a dichotomous (binary) random variable, i.e. one with two possible outcomes, e.g. (Success, Failure), (Yes, No), (Female, Male), etc.

Outcome    Value of random variable    Probability
Success    1                           p
Failure    0                           1-p
Total                                  1

Page 6: SADC Course in Statistics The binomial distribution (Session 06)

Background

In general, we have a sequence of n trials, each with just two possible outcomes, e.g. visiting n households in turn and recording whether each one is child-headed.

Call one outcome a “success”, the other a “failure”. Let the probability of a success be p.

The word success is a generic term used to represent the outcome of interest, e.g. if a sampled household is child-headed we call it a “success” because that is the outcome of interest.

Page 7: SADC Course in Statistics The binomial distribution (Session 06)

Basics and terminology

Let X be the number of successes in n trials.

X is said to have a binomial distribution if:

• The probability of success p is the same for each trial.

• The trials have independent outcomes.

In the context of our example, X = number of child-headed HHs from n HHs sampled.

p = probability of a HH being child-headed.

Under what conditions would X be binomial?
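The definition above treats X as a count of successes over n independent trials that share one success probability p. A minimal simulation sketch of that idea follows. The values n = 10 and p = 0.2, the seed, the number of repeats and the function name count_successes are all illustrative assumptions, not part of the course material.

```python
import random
from collections import Counter

def count_successes(n, p, rng):
    """Run n independent trials, each a success with probability p,
    and return the number of successes."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(2024)   # fixed, arbitrary seed so the run is repeatable
n, p = 10, 0.2              # illustrative values, not estimates from real data
repeats = 10_000            # how many times the n trials are repeated

counts = [count_successes(n, p, rng) for _ in range(repeats)]
freq = Counter(counts)

# Relative frequency of each possible value of X (0, 1, ..., n).
for k in range(n + 1):
    print(k, freq[k] / repeats)
```

With many repeats, the relative frequencies should settle near the binomial probabilities for the chosen n and p.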

Page 8: SADC Course in Statistics The binomial distribution (Session 06)

Binomial Probability Distribution

The probability of finding k successes out of n trials is given by

$$P(X = k) = \frac{n!}{k!(n-k)!} \, p^k (1-p)^{n-k}, \qquad k = 0, 1, \ldots, n.$$

Here n! = n(n-1)(n-2) … (3)(2)(1); 0! = 1.

Thus, for example, 4! = 4 x 3 x 2 x 1 = 24.

Exercise: If p=0.2 and n=10, confirm that

$$P(X = 3) = \frac{10!}{3!(10-3)!} \, (0.2)^3 (1-0.2)^{10-3} \approx 0.2.$$
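One way to check the exercise is to evaluate the formula directly. The sketch below assumes Python 3.8 or later (for math.comb, which returns n!/(k!(n-k)!)); the function name binomial_pmf is an illustrative choice.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# The exercise: n = 10, p = 0.2, k = 3.
prob = binomial_pmf(3, 10, 0.2)
print(round(prob, 4))  # 0.2013, i.e. about 0.2
```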

Page 9: SADC Course in Statistics The binomial distribution (Session 06)

Binomial Probability Distribution

In computing binomial probabilities, the value of p is often unknown.

It is then estimated by the proportion of successes in the sample, i.e.

$$\hat{p} = \frac{\text{observed no. of successes (say } r\text{) in } n \text{ trials}}{\text{total sample size, } n},$$

i.e.

$$\hat{p} = \frac{r}{n}.$$

The following graphs show binomial probabilities for n=10 and differing values of p.
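As a small aside on the estimate p̂ = r/n above, here is a sketch of the calculation. The counts used (12 child-headed households out of 400 sampled) are hypothetical and chosen only for illustration; estimate_p is likewise an illustrative name.

```python
def estimate_p(r, n):
    """Estimate the success probability as the sample proportion r / n."""
    if n <= 0:
        raise ValueError("n must be a positive number of trials")
    if not 0 <= r <= n:
        raise ValueError("r must lie between 0 and n")
    return r / n

# Hypothetical survey: 12 successes observed in 400 trials.
p_hat = estimate_p(12, 400)
print(p_hat)  # 0.03
```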

Page 10: SADC Course in Statistics The binomial distribution (Session 06)

Example 1: Left-handedness

Suppose the probability of a person being left-handed is p=0.2.

Let X be the number of left-handed persons in a group of 10.

The graph shows the probability of 0, 1, 2, … left-handed persons.

[Figure: Binomial distribution with p = 0.2. Horizontal axis: X (0 to 10); vertical axis: Probability (0.00 to 0.35).]

There are 11 possible outcomes (X = 0, 1, …, 10). The graph shows P(X=2)≈0.3, P(X=3)≈0.2, and P(X>6) almost 0.

Page 11: SADC Course in Statistics The binomial distribution (Session 06)

Example 2: Tossing a coin

A coin is tossed. The probability of getting a head is p=0.5.

Let X be the number of heads in 10 tosses of the coin.

The graph shows the probability of getting 0, 1, 2, … heads.

[Figure: Binomial distribution with p = 0.5. Horizontal axis: X (0 to 10); vertical axis: Probability (0.00 to 0.30).]

The distribution is symmetrical. We find P(X=2)=P(X=8)=0.044, P(X=3)=P(X=7)=0.12, etc.

Page 12: SADC Course in Statistics The binomial distribution (Session 06)

Example 3: Selecting a rural village

The ratio of rural villages to urban villages is 4:1, so the probability that a randomly selected village is rural is p = 4/5 = 0.8.

Suppose 10 villages are selected at random. Let X be the number of rural villages selected.

The graph shows the probability of getting 0, 1, 2, … rural villages.

[Figure: Binomial distribution with p = 0.8. Horizontal axis: X (0 to 10); vertical axis: Probability (0.00 to 0.35).]

The distribution is now concentrated to the right.

Here P(X<4) is almost zero.
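The values read off the three graphs can be reproduced from the binomial formula. The sketch below tabulates P(X = k) for n = 10 and p = 0.2, 0.5 and 0.8, the values used in Examples 1 to 3. It assumes Python 3.8 or later for math.comb; binomial_pmf is an illustrative name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 10
for p in (0.2, 0.5, 0.8):
    # Probabilities of X = 0, 1, ..., n for this value of p.
    probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
    print(f"p = {p}: " + " ".join(f"{q:.3f}" for q in probs))

# Tail probabilities mentioned in Examples 1 and 3.
upper_tail_p02 = sum(binomial_pmf(k, n, 0.2) for k in range(7, n + 1))  # P(X > 6), p = 0.2
lower_tail_p08 = sum(binomial_pmf(k, n, 0.8) for k in range(0, 4))      # P(X < 4), p = 0.8
print(upper_tail_p02, lower_tail_p08)
```

The row for p = 0.5 reads the same forwards and backwards, which is the symmetry noted in Example 2. The rows for p = 0.2 and p = 0.8 are mirror images of each other.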

Page 13: SADC Course in Statistics The binomial distribution (Session 06)

Properties of Binomial Distribution

The mean (average) of the binomial distribution with parameters n and p is np.

e.g. Suppose n = 1000 households are sampled and the probability of selecting a child-headed HH is p=0.03. Then the mean number of child-headed HHs is 1000 x 0.03 = 30.

Recall that the mean = expected value of X. Thus

$$E(X) = \sum_{x=0}^{n} x \, \frac{n!}{x!(n-x)!} \, p^x (1-p)^{n-x} = np.$$

Note: Since the binomial is a probability distribution,

$$\sum_{x=0}^{n} \frac{n!}{x!(n-x)!} \, p^x (1-p)^{n-x} = 1.$$
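Both sums above can be checked numerically for the household example (n = 1000, p = 0.03). This is a sketch rather than a proof: it evaluates the sums term by term in floating point, assuming Python 3.8 or later for math.comb. The name binomial_pmf is illustrative.

```python
from math import comb, isclose

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial distribution with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 1000, 0.03
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(probs)                               # should be 1
mean = sum(x * q for x, q in enumerate(probs))   # should be n * p = 30

print(isclose(total, 1.0, rel_tol=1e-9))   # True
print(isclose(mean, n * p, rel_tol=1e-9))  # True
```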

Page 14: SADC Course in Statistics The binomial distribution (Session 06)

Further Properties:

The standard deviation of the binomial distribution is

$$\sqrt{np(1-p)}.$$

For n=1000, p=0.2 the standard deviation is therefore [1000 x 0.2 x 0.8]^½ = 12.65.

The theoretical derivation is given below.

$$E(X^2) = \sum_{x=0}^{n} x^2 \, \frac{n!}{x!(n-x)!} \, p^x (1-p)^{n-x}.$$

The above can be shown to be

$$E(X^2) = np(1-p) + n^2p^2.$$

Hence

$$\mathrm{Var}(X) = E(X^2) - n^2p^2 = np(1-p),$$

and the standard deviation is $\sqrt{np(1-p)}$.
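The worked figure of 12.65 and the identity Var(X) = E(X²) − n²p² can be checked numerically for n = 1000 and p = 0.2. The sketch below compares the closed-form standard deviation with the value obtained by summing over the distribution. It assumes Python 3.8 or later for math.comb; binomial_pmf is an illustrative name.

```python
from math import comb, sqrt, isclose

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial distribution with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 1000, 0.2
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# E(X^2) by direct summation, then Var(X) = E(X^2) - (np)^2.
e_x2 = sum(x**2 * q for x, q in enumerate(probs))
variance = e_x2 - (n * p) ** 2

sd_formula = sqrt(n * p * (1 - p))   # closed form: sqrt(np(1-p))
sd_summed = sqrt(variance)

print(round(sd_formula, 2))                          # 12.65
print(isclose(sd_summed, sd_formula, rel_tol=1e-6))  # True
```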

Page 15: SADC Course in Statistics The binomial distribution (Session 06)

Practical work follows to ensure learning objectives are achieved…