Exam 1 is two weeks from today (March 9 th ) in class 15% of your grade Covers chapters 1-6 and the...
Uploaded by: megan-cain
Announcement
• Exam 1 is two weeks from today (March 9th), in class.
• 15% of your grade.
• Covers chapters 1-6 and the central limit theorem.
• By the end of this week I will post on the web: practice problems, old exams, and the specific sections that are not included on the exam.
• You will be allowed to bring in one page of notes and a calculator. I’ll provide normal probability tables.
• Today: Continue with central limit theorem.
Central Limit Theorem
• Example One:
– Drive-through window at a bank.
– Consider transaction times: Xi = transaction time for person i.
– E(Xi) = 6 minutes and Var(Xi) = 32 minutes². Transaction time for each person is independent.
– Thirty customers show up on Saturday morning.
1. What is the probability that the total of all the transaction times is greater than 200 minutes?
2. What is the probability that the average transaction time is between 5.9 and 6.1 minutes?
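The two questions in Example One can be worked numerically. The following is a minimal sketch, assuming the central limit theorem approximation is acceptable for n = 30; the variable names are illustrative and not from the lecture, and it uses Python's standard-library `statistics.NormalDist` in place of the printed normal tables.

```python
from math import sqrt
from statistics import NormalDist

n = 30       # customers on Saturday morning
mu = 6.0     # E(Xi), in minutes
var = 32.0   # Var(Xi), in minutes^2

# CLT (approximate): total ~ N(n*mu, n*var) and average ~ N(mu, var/n).
# NormalDist takes a standard deviation, so take square roots of the variances.
total = NormalDist(mu=n * mu, sigma=sqrt(n * var))
average = NormalDist(mu=mu, sigma=sqrt(var / n))

# Question 1: Pr(total > 200)
p_total_gt_200 = 1 - total.cdf(200)

# Question 2: Pr(5.9 < average < 6.1)
p_avg_between = average.cdf(6.1) - average.cdf(5.9)

print(f"Pr(total > 200)         ~ {p_total_gt_200:.3f}")
print(f"Pr(5.9 < average < 6.1) ~ {p_avg_between:.3f}")
```

The second probability is small because the standard deviation of the average (about 1.03 minutes) is much wider than the 0.2-minute interval.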
The “general interpretation” of the Central Limit Theorem suggests that measurements that are the result of a large number of factors tend to be normally distributed.
Are heights and weights normally distributed in the adult US population?
Use data to see. Create a histogram and superimpose a normal distribution over it. “Control” for gender by doing this for each gender.
Dataset: NHANES (National Health and Nutrition Examination Survey).
About 16,000 adults were examined in “mobile examination centers”. The adults were sampled to reflect the demographics of the US.
Hundreds of measurements were made on each person.
Histogram is from data.
Blue lines are normal pdfs with means = 173.3 and 159.9 and std devs = 7.6 and 7.3 (the means and std devs come from the data).
Medians are 173.3 and 159.9.
The data appear to be normally distributed (approximately)… We’ll see other ways to assess normality later in the semester.
Histogram is from data.
Blue lines are normal pdfs with means = 80.4 and 70.8 and std devs = 16.7 and 17.9 (the means and std devs come from the data).
Medians are 78.4 and 67.5.
The data do not appear to be normally distributed… (note the difference between mean and median.) Why wouldn’t you expect normality here?
Central Limit Theorem
• Example:
– 5 chemists independently synthesize a compound 1 time each.
– Each reaction should produce 10 ml of a substance.
– Historically, the amount produced by each reaction has been normally distributed with std dev 0.5 ml.
1. What’s the probability that less than 49.8 ml of the substance is made in total?
2. What’s the probability that the average amount produced is more than 10.1 ml?
3. Suppose the average amount produced is more than 11.0 ml. Is that a rare event? Why or why not? If more than 11.0 ml are made, what might that suggest?
Answer:
• Central limit theorem:
If E(Xi) = μ and Var(Xi) = σ² for all i (and the Xi are independent), then, approximately for large n (and exactly if the Xi are normal):
X1+…+Xn ~ N(nμ, nσ²)
(X1+…+Xn)/n ~ N(μ, σ²/n)
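A quick simulation can make the statement above concrete. This is a sketch under assumptions that are not from the lecture: the Xi are drawn from an exponential distribution with mean 1 and variance 1 (a clearly non-normal choice), with n = 30 terms per sum and 20,000 repetitions.

```python
import random
from statistics import fmean, pvariance

random.seed(0)  # fixed seed so the run is repeatable

n = 30          # number of terms in each sum
mu = 1.0        # E(Xi) for an exponential with rate 1
sigma2 = 1.0    # Var(Xi) for the same distribution
reps = 20_000   # number of simulated sums

# Each entry is one realization of X1 + ... + Xn.
sums = [sum(random.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

sim_mean = fmean(sums)      # should be close to n * mu
sim_var = pvariance(sums)   # should be close to n * sigma2

# If the sums are roughly normal, about 68% of them should fall within
# one standard deviation, sqrt(n * sigma2), of the mean n * mu.
sd = (n * sigma2) ** 0.5
within_1sd = sum(abs(s - n * mu) <= sd for s in sums) / reps

print(f"mean of sums:     {sim_mean:.2f} (CLT: {n * mu})")
print(f"variance of sums: {sim_var:.2f} (CLT: {n * sigma2})")
print(f"share within 1 sd: {within_1sd:.3f} (normal: about 0.68)")
```

Swapping `random.expovariate` for another generator with known mean and variance is one way to see how quickly the approximation takes hold for different starting distributions.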
Lab:
1. Let Y = total amount made. Each reaction has std dev 0.5, so variance 0.5² = 0.25, and Y ~ N(5*10, 5*0.25) = N(50, 1.25) (by CLT), with std dev √1.25 ≈ 1.12.
Pr(Y < 49.8) = Pr[(Y-50)/1.12 < (49.8-50)/1.12] = Pr(Z < -0.18) ≈ 0.43
2. Let W = average amount made. W ~ N(10, 0.25/5) = N(10, 0.05) (by CLT), with std dev √0.05 ≈ 0.22.
Pr(W > 10.1) = Pr[Z > (10.1 - 10)/0.22] = Pr(Z > 0.45) ≈ 0.33
Lab (continued)
3. One definition of rare: it’s a rare event if Pr(W > 11.0) is small (i.e. if the probability of seeing 11.0 or something more extreme is small).
Pr(W > 11) = Pr[Z > (11-10)/0.22] = Pr(Z > 4.47) = approximately zero.
This suggests that perhaps the true mean is not 10, or the true std dev is not 0.5, or the amounts are not normally distributed.
Sample size: 1006 (source: gallup.com)
• Let Xi = 1 if person i thinks the President is hiding something and 0 otherwise.
• Suppose E(Xi) = p and Var(Xi) = p(1-p) and each person’s opinion is independent.
• Let Y = total number of “yeses” = X1+…+X1006
• Y ~ Bin(1006, p)
• Suppose p = 0.36 (this is the estimate…)
• What is Pr(Y < 352)?
Note that this definition turns three outcomes into two outcomes.
Normal Approximation to the binomial CDF
– Exact answer: Pr(Y < 352) = Pr(Y=0) + … + Pr(Y=351), where Pr(Y=k) = (1006 choose k) 0.36^k 0.64^(1006-k)
– Even with computers, as n gets large, computing things like this can become difficult. (1006 is OK, but how about 1,000,000?)
– Idea: Use the central limit theorem to approximate this probability.
– Y is approximately N[1006*0.36, 1006*(0.36)*(0.64)] = N(362.16, 231.8) (by central limit theorem), with std dev √231.8 ≈ 15.2.
– Pr[(Y-362.16)/15.2 < (352-362.16)/15.2] = Pr(Z < -0.67) = 0.25
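For n = 1006 the exact binomial sum is still cheap to compute, so it can be compared directly with the normal approximation. This is a rough sketch using only the Python standard library; the variable names are illustrative, and the direct floating-point sum is an assumption that holds at this n but would underflow or overflow for much larger n.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 1006, 0.36, 352

# Exact: Pr(Y < 352) = sum over j = 0..351 of C(n, j) p^j (1-p)^(n-j)
exact = sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k))

# Normal approximation from the CLT: Y is approximately N(np, np(1-p)).
mean = n * p                   # 362.16
sd = sqrt(n * p * (1 - p))     # about 15.2
approx = NormalDist(mean, sd).cdf(k)

print(f"exact binomial sum:   {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```

The two numbers should agree to roughly two decimal places, which is the level of accuracy the table lookup gives.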
The black “step function” is a plot of the Bin(1006, 0.36) pmf versus Y (integers).
The blue line is a plot of the Normal(362.16, 231.8) pdf.
The area under the blue curve to the left of 352 is approximately equal to the sum of the areas of the rectangles (black step function) to the left of 352.
Comments about the normal approximation to the binomial:
Rule of thumb is that it’s OK if np > 5 and n(1-p) > 5.
“Continuity correction”
Y is binomial, so it takes only integer values.
If we use the normal approximation to the probability that Y ≤ k, we should calculate Pr(Y < k + 0.5).
If we use the normal approximation to the probability that Y ≥ k, we should calculate Pr(Y > k - 0.5).
(see picture on board)
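The effect of the continuity correction is easiest to see with a small n. The numbers below (n = 20, p = 0.5, k = 8) are hypothetical and chosen only for illustration; they are not from the lecture. The sketch compares the exact Pr(Y ≤ k) with the normal approximation evaluated at k and at k + 0.5.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 20, 0.5, 8   # hypothetical values for illustration

# Rule of thumb from the slide: np > 5 and n(1-p) > 5.
rule_ok = n * p > 5 and n * (1 - p) > 5

# Exact binomial probability Pr(Y <= k).
exact = sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

normal = NormalDist(n * p, sqrt(n * p * (1 - p)))
uncorrected = normal.cdf(k)        # normal CDF evaluated at k
corrected = normal.cdf(k + 0.5)    # continuity correction: evaluate at k + 0.5

print(f"rule of thumb satisfied: {rule_ok}")
print(f"exact Pr(Y <= {k}):       {exact:.4f}")
print(f"normal, no correction:   {uncorrected:.4f}")
print(f"normal, with correction: {corrected:.4f}")
```

In this case the corrected value lands much closer to the exact probability, because the extra half unit accounts for the full width of the rectangle at k.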