Economics 105: Statistics

Post on 05-Jan-2016


Economics 105: Statistics
• Review #1 to be handed out Tuesday, due the following Tuesday in class. Take-home, closed-book, closed-notes, untimed; must use Excel or a calculator (and transfer answers to the exam paper).
• Formula sheet rules: No words, in English or otherwise. Only formulas/equations. No proofs. Symbols like B for Binomial are okay. Front & back of 1 sheet of paper. Excel help is okay.
• Equation editor can be useful
• Go over GH 6, GH 7 & 8 due Tuesday

Probability Distributions
• Discrete Probability Distributions: Bernoulli, Binomial, Hypergeometric, Poisson
• Continuous Probability Distributions: Normal, Uniform, Exponential

Exponential Distribution
• Graph
• Useful for waiting time, duration, or queuing problems
• Memoryless property
• Find the probability that no student arrives in the next hour.
• Find the probability that a student arrives in the next 5 minutes.
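The arrival rate behind the two questions above is not recoverable from the slide text, so the sketch below assumes a hypothetical rate of 3 students per hour (the name `RATE_PER_HOUR` and its value are illustrative only). It uses the standard exponential results P(T > t) = e^(−λt) and P(T ≤ t) = 1 − e^(−λt), with t measured in hours.

```python
import math

# Hypothetical arrival rate in students per hour; the slide's own value is unknown.
RATE_PER_HOUR = 3.0

def prob_no_arrival(rate, t):
    """P(T > t): no arrival during the next t hours."""
    return math.exp(-rate * t)

def prob_arrival_within(rate, t):
    """P(T <= t): at least one arrival during the next t hours."""
    return 1.0 - math.exp(-rate * t)

p_none_next_hour = prob_no_arrival(RATE_PER_HOUR, 1.0)
p_within_5_min = prob_arrival_within(RATE_PER_HOUR, 5 / 60)

# Memoryless property: P(T > s + t | T > s) equals P(T > t).
s, t = 0.5, 0.25
conditional = prob_no_arrival(RATE_PER_HOUR, s + t) / prob_no_arrival(RATE_PER_HOUR, s)

print(round(p_none_next_hour, 4))  # 0.0498
print(round(p_within_5_min, 4))    # 0.2212
```

With a different rate only `RATE_PER_HOUR` changes; the 5 minutes must still be converted to hours (5/60) so that the time unit matches the rate.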


Normal Distribution
• Let $X \sim N(\mu, \sigma^2)$
• The p.d.f. is given by $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, for $-\infty < x < \infty$
• “The bell curve”, also sometimes called the Gaussian distribution after Carl Friedrich Gauss
• http://cnx.rice.edu/content/m11161/latest/#java
• Reading the table … pages 915 in BLK, 11th edition. Note that the numbers across the top (i.e., at the top of each column) are the SECOND digit after the decimal.

The Normal Distribution
• ‘Bell Shaped’
• Symmetrical
• Mean, Median and Mode are Equal

Location is determined by the mean, μ

Spread is determined by the standard deviation, σ

The random variable has an infinite theoretical range: −∞ to +∞

[Figure: bell curve f(X) against X, centered at μ with spread σ; Mean = Median = Mode]

By varying the parameters μ and σ, we obtain different normal distributions

Many Normal Distributions

Standardized Normal Distribution

[Figure: standardized normal curve f(Z) against Z, centered at 0 with standard deviation 1]

Values above the mean have positive Z-values, values below the mean have negative Z-values

The Z distribution always has mean = 0 and standard deviation = 1

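Example
• Convention: $X \sim N(\mu, \sigma^2)$, so the second parameter is the variance
• If X ~ N(100, 2500), then the Z value for X = 200 is

$Z = \frac{X - \mu}{\sigma} = \frac{200 - 100}{50} = 2.0$

• This says that X = 200 is two standard deviations (2 increments of 50 units) above the mean of 100.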
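The standardization in this example can be checked with a few lines of Python. This is a minimal sketch; the helper name `z_score` is illustrative, and it assumes the second parameter of N(100, 2500) is the variance, as the 50-unit increments imply.

```python
import math

def z_score(x, mu, variance):
    """Standardize x for X ~ N(mu, variance); the third argument is the variance, not sigma."""
    return (x - mu) / math.sqrt(variance)

z = z_score(200, 100, 2500)
print(z)  # 2.0
```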

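Comparing X and Z units

[Figure: one normal curve with two horizontal scales. X scale marked at 100 and 200 (μ = 100, σ = 50); Z scale marked at 0 and 2.00 (μ = 0, σ = 1)]

Note that the distribution is the same, only the scale has changed.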

Finding Normal Probabilities

Probability is measured by the area under the curve

$P(a \le X \le b) = P(a < X < b)$

(Note that the probability of any individual value is zero)

[Figure: curve f(X) against X with the area between a and b shaded]

Probability as Area Under the Curve

[Figure: curve f(X) with an area of 0.5 on each side of the mean]

The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean, half is below

Empirical Rules

What can we say about the distribution of values around the mean? There are some general rules:

• μ ± 1σ encloses about 68% of X’s (68.26%)

[Figure: curve f(X) against X with the region from μ−1σ to μ+1σ shaded]

The Empirical Rule (continued)

• μ ± 2σ covers about 95% of X’s (95.44%)

• μ ± 3σ covers about 99.7% of X’s (99.73%)
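These coverage percentages can be reproduced from the standard normal c.d.f. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `coverage` is illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def coverage(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal X."""
    return STD_NORMAL.cdf(k) - STD_NORMAL.cdf(-k)

for k in (1, 2, 3):
    print(k, round(coverage(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The slide figures of 68.26% and 95.44% are what you get by subtracting four-decimal table entries (0.8413 − 0.1587 and 0.9772 − 0.0228), so they differ from the values above in the last digit.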

The Standardized Normal Table

• The Cumulative Standardized Normal table in the textbook (Appendix table E.2) gives the probability less than a desired value for Z (i.e., from negative infinity to Z)

Example:

P(Z < 2.00) = 0.9772

[Figure: standard normal curve with the area to the left of Z = 2.00 shaded, equal to 0.9772]

The Standardized Normal Table (continued)

The value within the table gives the probability from Z = −∞ up to the desired Z value

The row shows the value of Z to the first decimal point

The column gives the value of Z to the second decimal point

Example: row 2.0, column 0.00 gives P(Z < 2.00) = 0.9772

Z      0.00    0.01    0.02    …
0.0
0.1
.
.
.
2.0    .9772
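The row-and-column lookup can be mimicked in code. This is a sketch using the standard library's `statistics.NormalDist`; the function name `table_entry` is hypothetical, and it assumes that for negative rows the column digit moves Z further from zero (row -0.8, column .04 means Z = -0.84), as in the printed table.

```python
import math
from statistics import NormalDist

def table_entry(row, col):
    """Cumulative probability P(Z < z), where row is Z to one decimal
    and col is the second decimal (0.00 to 0.09)."""
    z = row + math.copysign(col, row)  # the column digit takes the sign of the row
    return round(NormalDist().cdf(z), 4)

print(table_entry(2.0, 0.00))   # 0.9772
print(table_entry(0.1, 0.02))   # 0.5478
print(table_entry(-0.8, 0.04))  # 0.2005
```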

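Finding Normal Probabilities

• Suppose X ~ N(8, 25), so μ = 8 and σ = 5. Find P(X < 8.6)

$Z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8.0}{5.0} = 0.12$

[Figure: the same curve on two scales. X scale marked at 8 and 8.6 (μ = 8, σ = 5); Z scale marked at 0 and 0.12 (μ = 0, σ = 1)]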

Finding Normal Probabilities (continued)

Solution: Finding P(Z < 0.12)

Standardized Normal Probability Table (Portion)

Z      .00     .01     .02
0.0    .5000   .5040   .5080
0.1    .5398   .5438   .5478
0.2    .5793   .5832   .5871
0.3    .6179   .6217   .6255

P(X < 8.6) = P(Z < 0.12) = 0.5478
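The table lookup can be cross-checked numerically. The sketch below reads N(8, 25) as variance 25, so σ = 5, which is what Z = 0.12 requires; it uses `statistics.NormalDist` from the standard library.

```python
from statistics import NormalDist

X = NormalDist(mu=8.0, sigma=5.0)  # N(8, 25): variance 25, so sigma = 5

z = (8.6 - X.mean) / X.stdev       # standardize 8.6
p_from_x = X.cdf(8.6)              # P(X < 8.6) computed directly
p_from_z = NormalDist().cdf(z)     # P(Z < 0.12) on the standard normal

print(round(z, 2))         # 0.12
print(round(p_from_x, 4))  # 0.5478
```

Both routes give the same probability, which is the point of standardizing: only the scale changes.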

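Finding the X value for a Known Probability

Example:
• Suppose X ~ N(8, 25)
• Find the X value so that only 20% of all values are below this X

[Figure: normal curve with area 0.2000 shaded in the lower tail. X scale marked at the unknown X and 8.0; Z scale marked at the unknown Z and 0]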

Find the Z value for 20% in the Lower Tail

• 20% area in the lower tail is consistent with a Z value of -0.84

Standardized Normal Probability Table (Portion)

Z       …   .03     .04     .05
-0.9    …   .1762   .1736   .1711
-0.8    …   .2033   .2005   .1977
-0.7    …   .2327   .2296   .2266

[Figure: normal curve with area 0.2000 shaded in the lower tail. X scale marked at the unknown X and 8.0; Z scale marked at -0.84 and 0]

Finding the X value

1. Find the Z value for the known probability

2. Convert to X units using the formula:

$X = \mu + Z\sigma = 8.0 + (-0.84)(5.0) = 3.80$

So 20% of the values from a distribution with mean 8.0 and standard deviation 5.0 are less than 3.80
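The two steps can be sketched with the inverse c.d.f., which replaces the backwards table lookup. This assumes Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=8.0, sigma=5.0)

# Step 1: Z value with 20% of the area below it.
z = NormalDist().inv_cdf(0.20)

# Step 2: convert to X units with X = mu + Z * sigma.
x = X.mean + z * X.stdev

print(round(z, 4))  # -0.8416
print(round(x, 2))  # 3.79
```

The unrounded Z gives about 3.79; rounding Z to the table value of -0.84 first gives the 3.80 quoted above.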

More Examples
• If Z ~ N(0,1), find P(-1 < Z < 1)
• If W ~ N(3,4), find P(-1 < W < 1)
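A quick way to check answers to these two exercises is sketched below. It assumes N(3, 4) follows the N(μ, σ²) convention, so W has σ = 2; the helper name `prob_between` is illustrative.

```python
from statistics import NormalDist

def prob_between(dist, a, b):
    """P(a < X < b) as a difference of cumulative probabilities."""
    return dist.cdf(b) - dist.cdf(a)

Z = NormalDist(mu=0.0, sigma=1.0)
W = NormalDist(mu=3.0, sigma=2.0)  # N(3, 4): variance 4, so sigma = 2

p_z = prob_between(Z, -1, 1)
p_w = prob_between(W, -1, 1)  # same as P(-2 < Z < -1) after standardizing

print(round(p_z, 4))  # 0.6827
print(round(p_w, 4))  # 0.1359
```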

Evaluating Normality
• Construct charts or graphs
– For small- or moderate-sized data sets, do the stem-and-leaf display and box-and-whisker plot look symmetric?
– For large data sets, does the histogram or polygon appear bell-shaped?
• Compute descriptive summary measures
– Do the mean, median and mode have similar values?
– Is the interquartile range approximately 1.33 σ?
– Is the range approximately 6 σ?

Evaluating Normality (continued)
• Observe the distribution of the data set
– Do approximately 2/3 of the observations lie within mean ± 1 standard deviation?
– Do approximately 80% of the observations lie within mean ± 1.28 standard deviations?
– Do approximately 95% of the observations lie within mean ± 2 standard deviations?
• Evaluate normal probability plot
– Is the normal probability plot approximately linear with positive slope?
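The "observe the distribution" checks can be sketched as a small function that reports the share of observations within k standard deviations of the mean. The sample below is made up purely for illustration, and `share_within` is a hypothetical helper name.

```python
from statistics import mean, stdev

def share_within(data, k):
    """Fraction of observations within k sample standard deviations of the sample mean."""
    m, s = mean(data), stdev(data)
    return sum(1 for x in data if abs(x - m) <= k * s) / len(data)

# Made-up sample, not taken from the course materials.
sample = [2, 4, 4, 5, 5, 5, 6, 6, 7, 8]

for k in (1, 1.28, 2):
    print(k, share_within(sample, k))
# 1 0.7
# 1.28 0.8
# 2 1.0
```

For roughly normal data the three shares should come out near 2/3, 80% and 95%; with only ten observations the comparison is necessarily coarse.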

The Normal Probability Plot
• Normal probability plot
– Arrange data into ordered array
– Find corresponding standardized normal quantile values
– Plot the pairs of points with observed data values on the vertical axis and the standardized normal quantile values on the horizontal axis
– Evaluate the plot for evidence of linearity

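The Normal Probability Plot (continued)

A normal probability plot for data from a normal distribution will be approximately linear:

[Figure: observed X values (vertical axis, ticks at 30, 60, 90) against Z (horizontal axis, -2 to 2), with the points falling close to a straight line]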

The Normal Probability Plot

Cumulative area increment: 1/(9+1) = 1/10

Data X   Order   Cumulative area   Corresponding Z score   Data X
1        1       0.1               -1.281551939            1
4        2       0.2               -0.841621042            4
12       3       0.3               -0.524400458            12
23       4       0.4               -0.253347241            23
55       5       0.5               5.47142E-10             55
67       6       0.6               0.253347241             67
75       7       0.7               0.524400458             75
87       8       0.8               0.841621042             87
112      9       0.9               1.281551939             112
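The Z-score column can be regenerated from the data: the i-th ordered value out of n gets cumulative area i/(n+1), and the Z score is the standard normal quantile of that area. This is a sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

data = [1, 4, 12, 23, 55, 67, 75, 87, 112]
n = len(data)
std_normal = NormalDist()

rows = []
for order, x in enumerate(sorted(data), start=1):
    area = order / (n + 1)          # cumulative area, in steps of 1/(n+1)
    z = std_normal.inv_cdf(area)    # corresponding standardized normal quantile
    rows.append((x, order, area, z))

for x, order, area, z in rows:
    print(f"{x:>4} {order:>2} {area:.1f} {z: .6f}")
```

Plotting the first column against the last gives the normal probability plot. The computed quantiles agree with the table to about six decimal places; the table's 5.47142E-10 for the middle value is a numerically zero result, presumably from the spreadsheet that produced it.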

Normal Probability Plot (continued)

[Figure: three normal probability plots, titled Left-Skewed, Right-Skewed and Rectangular, each with X (ticks at 30, 60, 90) on the vertical axis and Z (-2 to 2) on the horizontal axis]

Nonlinear plots indicate a deviation from normality

Other Continuous Distributions

Source: wikipedia pages