Probability Probability; Sampling Distribution of Mean, Standard Error of the Mean;...

Post on 17-Dec-2015


Probability

Probability; Sampling Distribution of Mean, Standard Error of the Mean; Representativeness of the Sample Mean

Probability – Frequency View

Probability is long-run relative frequency.

Same as relative frequency in the population.

Dice toss: p(1) = p(2) = … = p(6) = 1/6

Coin flip: p(Head) = p(Tail) = .5

Probability & Decision Making

Decision making is like gambling: go with what is likely.

Lady tasting tea in England. Was the milk poured first or second?

5 cups of tea to taste. What is the probability she gets it right?

If you cannot tell the difference, how likely will you be right on all cups?

Cups   Probability Correct   Computation
1      .5                    ½
2      .25                   ½*½
3      .125                  ½*½*½
4      .0625                 ½*½*½*½
5      .03125                ½*½*½*½*½

How many cups would it take to convince you? Convention in social science is a probability of .05 or less. Using this standard, she would have to get all 5 right (p = .03125) to be convincing in her ability; 4 right (p = .0625) would not be enough. She did; they were.
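The cup arithmetic can be sketched in a few lines of Python. This is only an illustration of the reasoning on the slide: the function names are made up here, and each guess is assumed to be an independent 50/50 coin flip.

```python
ALPHA = 0.05  # conventional cutoff mentioned on the slide


def p_all_correct(cups: int) -> float:
    """Probability of guessing every cup right when each guess is a 50/50 flip."""
    return 0.5 ** cups


def cups_needed(alpha: float = ALPHA) -> int:
    """Smallest number of cups for which a perfect score has probability below alpha."""
    cups = 1
    while p_all_correct(cups) >= alpha:
        cups += 1
    return cups


for n in range(1, 6):
    print(n, p_all_correct(n))
print("cups needed:", cups_needed())
```

With the .05 convention, four cups are not enough (.0625) and five are (.03125).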

Frequency Distribution of the Mean

What is the distribution of means if we roll a die once?

What is the distribution of means if we roll a die twice and take the average? Three times? (See Excel file ‘dice’)

Dice

[Figure: frequency distributions for 1 die (raw data) and for the average of 2 dice and of 3 dice (sampling distributions of means).]

Distribution     M     SD
1 Die            3.5   1.87
Ave of 2 Dice    3.5   1.23
Ave of 3 Dice    3.5   .99

Notice the mean, standard deviation, and shape of the distributions.
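The dice comparison can be reproduced without the Excel file by listing every equally likely outcome. The sketch below is not the author's spreadsheet; it assumes fair six-sided dice and uses the sample standard deviation (`statistics.stdev`), and the helper name `averages` is invented for this example.

```python
from itertools import product
from statistics import mean, stdev

FACES = range(1, 7)  # a fair six-sided die


def averages(n_dice: int) -> list[float]:
    """Average of the faces for every equally likely outcome of n_dice dice."""
    return [sum(roll) / n_dice for roll in product(FACES, repeat=n_dice)]


for n in (1, 2, 3):
    avgs = averages(n)
    print(f"{n} dice: M = {mean(avgs):.2f}, SD = {stdev(avgs):.2f}")
```

The mean stays at 3.5 while the spread shrinks as more dice are averaged (roughly 1.87, 1.22, and 0.99), close to the values shown on the slide.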

Sampling Distribution

Notion of trials, experiments, replications.

Coin toss example (5 flips, # heads).

Repeated estimation of the mean.

A sampling distribution is a distribution of a statistic (not raw data) over all possible samples. It is the same as the distribution over an infinite number of trials. Recall the dice example.
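For the coin toss example, "all possible samples" is small enough to list. The sketch below assumes a fair coin and treats the number of heads in 5 flips as the statistic; the variable names are illustrative only.

```python
from collections import Counter
from itertools import product

FLIPS = 5

# Every equally likely sequence of 5 flips (1 = head, 0 = tail): 2**5 = 32 sequences.
sequences = list(product((0, 1), repeat=FLIPS))

# The statistic is the number of heads in each sequence.
heads_counts = Counter(sum(seq) for seq in sequences)

# Relative frequency of each value of the statistic over all possible samples.
sampling_dist = {k: heads_counts[k] / len(sequences) for k in range(FLIPS + 1)}

for heads, p in sampling_dist.items():
    print(f"{heads} heads: {p:.5f}")
```

The resulting table is the sampling distribution of the statistic, not a distribution of raw flips.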

Estimator

We use statistics to estimate parameters.

Most often, $\bar{X}$ estimates $\mu$.

Suppose we want to estimate the mean height of students at USF.

Sample students, estimate $\mu$.

Accuracy of the estimate depends mostly upon N and SD.

Example of Height

Hypothetical data: $\mu = 66$; $\sigma = 4$.

Note that the graph shows the population.

[Figure: RAW DATA, Height of USF Students. Relative frequency vs. height in inches (50 to 82).]

Raw Data vs. Sampling Distribution

[Figure: Two Distributions, Raw and Sampling. Relative frequency vs. height in inches (50 to 80) for the raw data and for means of samples of N = 50.]

Note middle and spread of the two distributions. How do they compare?
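One way to see the comparison is a small simulation. This is a sketch under assumptions that are not on the slide: the population is taken to be normal with the hypothetical mean of 66 and SD of 4 from the height example, and the number of replications and the seed are arbitrary choices.

```python
import random
from statistics import mean, stdev

MU, SIGMA = 66, 4      # hypothetical population values from the height example
N = 50                 # sample size used on the slide
REPLICATIONS = 2000    # assumption: how many repeated samples to draw

rng = random.Random(0)  # fixed seed so the run is repeatable

# Raw data: individual heights drawn from the assumed normal population.
raw = [rng.gauss(MU, SIGMA) for _ in range(REPLICATIONS)]

# Sampling distribution: the mean of each of many samples of size N.
means = [mean(rng.gauss(MU, SIGMA) for _ in range(N)) for _ in range(REPLICATIONS)]

print(f"raw data: middle = {mean(raw):.2f}, spread = {stdev(raw):.2f}")
print(f"means:    middle = {mean(means):.2f}, spread = {stdev(means):.2f}")
```

Both distributions should center near 66, while the means should be far less spread out than the raw heights.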

Definition of Bias

Statisticians have worked out properties of sampling distributions

Middle and spread of sampling distribution are known.

If the mean of the sampling distribution equals the parameter, the statistic is unbiased (otherwise, it’s biased). The sample mean is unbiased.

Best estimate of $\mu$ is $\bar{X}$.
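The definition of bias can be checked directly on a population small enough to enumerate. Everything in this sketch is hypothetical: the four-value population, the sample size of 2, and the use of the sample maximum as a deliberately poor estimator of the mean, included only for contrast.

```python
from itertools import product
from statistics import mean

# Hypothetical tiny population, chosen only so every sample can be listed.
population = [2, 4, 6, 8]
mu = mean(population)  # the parameter being estimated

# All possible samples of size 2, drawn with replacement (16 samples).
samples = list(product(population, repeat=2))


def bias(statistic) -> float:
    """Mean of the statistic's sampling distribution minus the parameter."""
    return mean(statistic(s) for s in samples) - mu


print("bias of sample mean:", bias(mean))
print("bias of sample max: ", bias(max))
```

The sample mean comes out with zero bias; the sample maximum overshoots the population mean, so as an estimator of the mean it is biased.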

Definition of Standard Error

The standard deviation of the sampling distribution is the standard error. For the mean, it indicates the average distance of the statistic from the parameter.

[Figure: Standard Error. Raw data and means (N = 50) plotted against height in inches (50 to 80); the spread of the means is the standard error of the mean.]

Formula: Standard Error of Mean

To compute the SEM, use:

$\sigma_{\bar{X}} = \dfrac{\sigma_X}{\sqrt{N}}$

For our example:

$\sigma_{\bar{X}} = \dfrac{4}{\sqrt{50}} = .57$

Standard error = SD of means = .57
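The same computation as a short Python sketch, using the hypothetical height values (population SD of 4, samples of 50). The function name is an illustrative choice.

```python
import math


def standard_error_of_mean(sigma: float, n: int) -> float:
    """Population SD divided by the square root of the sample size."""
    return sigma / math.sqrt(n)


sem = standard_error_of_mean(sigma=4, n=50)
print(f"SEM = {sem:.2f}")  # prints SEM = 0.57
```

Because N sits under a square root, quadrupling the sample size only halves the standard error.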

Review

What is a sampling distribution?

What is bias?

What is the standard error of a statistic?

Suppose we repeatedly sampled 100 people at a time instead of 50 for height at USF. What would be the mean of the sampling distribution? What would be the standard deviation of the sampling distribution?

Definition

A sampling distribution is a distribution of _____?

1) parameters

2) samples

3) statistics

4) variables

Definition

What is the standard error of the mean?

1) average distance of standard from the error

2) average distance of raw data (X) from the data average (X-bar)

3) square root of the sampling distribution of the variance

4) standard deviation of the sampling distribution of the mean

Computation

If the population mean is 50, the population standard deviation is 2, and the sample size is 100, what is the standard error of the mean?

1) .2

2) .5

3) 2

4) 10

Deciding whether a Sample represents a Population

$z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$

We can use the normal distribution to figure the probability of a sample mean. If the sample mean is very unlikely (has a low probability) we conclude the sample does not represent the population. If it is likely, we conclude it does.

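Suppose we grab a sample of 49 students and their mean GPA is 3.7. We know the population mean is 3.2 and the population SD is .35. Is the sample representative?

$\sigma_{\bar{X}} = \dfrac{.35}{\sqrt{49}} = \dfrac{.35}{7} = .05$

$z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \dfrac{3.7 - 3.2}{.05} = \dfrac{.5}{.05} = 10$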

Representativeness: degree to which the sample distribution resembles the population distribution.

Likely?

[Figure: Standard Normal Curve. Probability (relative frequency) vs. scores in standard deviations from mu (-3 to 3). Areas: 50 percent on each side of the mean; 34.13% between 0 and 1; 13.59% between 1 and 2; 2.15% between 2 and 3.]

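$z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \dfrac{3.7 - 3.2}{.05} = \dfrac{.5}{.05} = 10$

Area beyond z = 10 = ?

From the normal distribution: $p \approx 7.6 \times 10^{-24}$ (one tail)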

Recall that anything beyond z = 2 is rare; anything beyond z = 3 is remote.
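The GPA example can be checked numerically. The sketch below assumes the values used in the z computation (sample mean 3.7, population mean 3.2, population SD .35, N = 49) and gets the tail area from the complementary error function rather than a printed z table; the function names are invented for this example.

```python
import math


def z_for_sample_mean(sample_mean: float, pop_mean: float, pop_sd: float, n: int) -> float:
    """Distance of the sample mean from the population mean, in standard errors."""
    sem = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / sem


def upper_tail_area(z: float) -> float:
    """Area under the standard normal curve beyond z (one tail)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


z = z_for_sample_mean(sample_mean=3.7, pop_mean=3.2, pop_sd=0.35, n=49)
print(f"z = {z:.2f}")
print(f"area beyond z = {upper_tail_area(z):.2e}")
```

A z of about 10 leaves a tail area on the order of 10^-24, far past the "remote" mark of z = 3.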

Rejection Region

Place in the curve that is unlikely if the scenario is true. Area totals to probability.

[Figure: Standard Normal Curve with the rejection region marked in the tails. Probability (relative frequency) vs. scores in standard deviations from mu (-3 to 3).]

Convention is p = .05. The 5 percent of the area least likely to occur if the scenario is true is the rejection region. In most cases, the extremes of both tails are the places for the rejection region: the bottom 2.5 percent and the top 2.5 percent. The sample is unrepresentative if it falls far from the center. For z, the border is +/- 1.96 for p = .05 with 2 tails. For 1 tail, it is 1.65.
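The borders of the rejection region can be recovered from the standard normal curve itself. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

ALPHA = 0.05
std_normal = NormalDist()  # mean 0, SD 1

# Two tails: 2.5 percent in each tail, so the upper border cuts off the top 2.5 percent.
two_tail_border = std_normal.inv_cdf(1 - ALPHA / 2)

# One tail: all 5 percent sits in the upper tail.
one_tail_border = std_normal.inv_cdf(1 - ALPHA)

print(f"two-tailed border: +/- {two_tail_border:.2f}")
print(f"one-tailed border: {one_tail_border:.2f}")
```

The one-tailed value is about 1.645, which tables usually round to the 1.65 quoted on the slide.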

Review

We know the population mean is 50 and the population standard deviation is 10. We grab 100 people at random and find the mean of the sample is 45. Does the sample represent the population?
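A sketch of how the review question could be worked, assuming the two-tailed border of 1.96 from the rejection-region slide. The function name and the returned pair are choices made for this example, not part of the original deck.

```python
import math

CRITICAL_Z = 1.96  # two-tailed border for p = .05


def represents_population(sample_mean, pop_mean, pop_sd, n, critical_z=CRITICAL_Z):
    """Return (z, verdict); verdict is True when z falls outside the rejection region."""
    sem = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / sem
    return z, abs(z) < critical_z


z, ok = represents_population(sample_mean=45, pop_mean=50, pop_sd=10, n=100)
print(f"z = {z:.1f}, represents population: {ok}")
```

Here the standard error is 10/10 = 1, so the sample mean sits 5 standard errors below the population mean, well inside the rejection region.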