Stat 155, Section 2, Last Time
• Continuous Random Variables
– Probabilities modeled with areas
• Normal Curve
– Calculate in Excel: NORMDIST & NORMINV
• Means, i.e. Expected Values
– Useful for “average over many plays”
• Independence of Random Variables
Reading In Textbook
Approximate Reading for Today’s Material:
Pages 277-286, 291-305
Approximate Reading for Next Class:
Pages 291-305, 334-351
Midterm I - Results
Preliminary comments:
• Circled numbers are points taken off
• Total for each problem in brackets
• Points evenly divided among parts
• Page total in lower right corner
• Check those sum to total on front
• Overall score out of 100 points
Midterm I - Results
Interpretation of Scores:
• Too early for letter grades
• These will change a lot:
– Some with good grades will relax
– Some with bad grades will wake up
• Don’t believe “A & C” average to “B”
Midterm I - Results
Too early for letter grades:
Recall previous scatterplot
[Scatterplot: Intro Statistics Scores; x-axis: Midterm 1; y-axis: Midterm 2]
Midterm I - Results
Interpretation of Scores:
• 85 – 100 Very Pleased
• 65 – 84 OK
• 0 – 64 Recommend Drop Course
(if not, let’s talk personally…)
Midterm I - Results
Histogram of results:
Overall I’m very pleased relative to other courses
[Histogram: Stor 155, Sec. 2, Midterm 1; x-axis: Scores; y-axis: Frequency]
Variance of Random Variables
Again consider discrete random variables:
Where distribution is summarized by a table,
Values $x_1$ $x_2$ … $x_k$
Prob. $p_1$ $p_2$ … $p_k$
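Variance of Random Variables
Again connect via frequentist approach:
$\mathrm{var}(X_1, \ldots, X_n) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$
$= \frac{1}{n-1} \left[ (X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2 \right]$
$= \frac{\#\{X_i = x_1\} (x_1 - \bar{X})^2 + \cdots + \#\{X_i = x_k\} (x_k - \bar{X})^2}{n-1}$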
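Variance of Random Variables
Again connect via frequentist approach:
$\mathrm{var}(X_1, \ldots, X_n) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$
$= \frac{\#\{X_i = x_1\}}{n-1} (x_1 - \bar{X})^2 + \cdots + \frac{\#\{X_i = x_k\}}{n-1} (x_k - \bar{X})^2$
$\approx p_1 (x_1 - \mu_X)^2 + \cdots + p_k (x_k - \mu_X)^2$
$= \sum_{i=1}^{k} p_i (x_i - \mu_X)^2$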
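The frequentist connection can be checked numerically. The sketch below is only an illustration: it simulates many plays of the game used in this lecture (winnings -4, 0, 9 with probabilities 1/2, 1/6, 1/3) and compares the sample variance with the probability-weighted sum. The seed and the number of plays are arbitrary choices, not from the slides.

```python
import random

# Game table from the lecture: winnings and their probabilities.
values = [-4, 0, 9]
probs = [1 / 2, 1 / 6, 1 / 3]

# Probability-weighted mean and variance computed from the table.
mu = sum(p * x for x, p in zip(values, probs))
var_theory = sum(p * (x - mu) ** 2 for x, p in zip(values, probs))

# Simulate many independent plays (seed fixed only for reproducibility).
rng = random.Random(155)
n = 200_000
plays = rng.choices(values, weights=probs, k=n)

# Sample variance with the 1/(n-1) divisor, as in the derivation.
xbar = sum(plays) / n
var_sample = sum((x - xbar) ** 2 for x in plays) / (n - 1)

print("table variance: ", var_theory)
print("sample variance:", var_sample)
```

With this many plays the two numbers should agree closely, which is the sense of the approximation step above.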
Variance of Random Variables
So define:
Variance of a distribution (random variable)
As: $\sigma_X^2 = \sum_{j=1}^{k} p_j (x_j - \mu_X)^2$
Variance of Random Variables
E. g. above game:
Winning -4 0 9
Prob. 1/2 1/6 1/3
$\sigma_X^2 = \frac{1}{2}(-4 - 1)^2 + \frac{1}{6}(0 - 1)^2 + \frac{1}{3}(9 - 1)^2$
=(1/2)*5^2+(1/6)*1^2+(1/3)*8^2
Note: one acceptable Excel form, e.g. for exam (but there are many)
Standard Deviation
Recall standard deviation is square root of variance (same units as data)
E. g. above game:
Winning -4 0 9
Prob. 1/2 1/6 1/3
=sqrt((1/2)*5^2+(1/6)*1^2+(1/3)*8^2)
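For readers working outside Excel, the same variance and standard deviation can be computed from the table. This is a minimal sketch, not the course's required form; exact fractions are used so the arithmetic is visible.

```python
from fractions import Fraction as F
from math import sqrt

# Game table from the lecture.
values = [-4, 0, 9]
probs = [F(1, 2), F(1, 6), F(1, 3)]

# Mean: probability-weighted average of the winnings.
mean = sum(p * x for x, p in zip(values, probs))

# Variance: probability-weighted average of squared deviations from the mean.
variance = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))

# Standard deviation: square root of the variance (same units as winnings).
sd = sqrt(variance)

print(mean, variance)  # 1 34
print(round(sd, 3))    # 5.831
```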
Variance of Random Variables
HW:
C16: Find the variance and standard
deviation of the distribution in 4.59.
(0.752, 0.867)
Properties of Variance
i. Linear transformation: $\sigma_{aX+b}^2 = a^2 \sigma_X^2$
I.e. “ignore shifts”: var(X + b) = var(X)
(makes sense)
And scales come through squared
(recall s.d. on scale of data, var is square)
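The linear transformation rule can be checked directly on the game table. In this sketch the scale a = 3 and shift b = 7 are arbitrary illustrative values, not from the slides.

```python
from fractions import Fraction as F

def variance(values, probs):
    """Variance of a discrete distribution given values and probabilities."""
    mean = sum(p * x for x, p in zip(values, probs))
    return sum(p * (x - mean) ** 2 for x, p in zip(values, probs))

# Game table from the lecture.
values = [-4, 0, 9]
probs = [F(1, 2), F(1, 6), F(1, 3)]

a, b = 3, 7  # arbitrary illustrative scale and shift

var_x = variance(values, probs)
var_shifted = variance([x + b for x in values], probs)     # shift only
var_linear = variance([a * x + b for x in values], probs)  # scale and shift

print(var_x, var_shifted, var_linear)  # 34 34 306
```

The shift leaves the variance at 34, while the scale multiplies it by a squared (9 times 34 is 306).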
Properties of Variance
ii. For X and Y independent (important!): $\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2$
I. e. Variance of sum is sum of variances
Here is where variance is “more natural” than standard deviation:
$\sigma_{X+Y} = \sqrt{\sigma_X^2 + \sigma_Y^2}$
Properties of Variance
E. g. above game:
Winning -4 0 9
Prob. 1/2 1/6 1/3
Recall “double the stakes”, gave same mean, as “play twice”, but seems different
Doubling: $\sigma_{2X}^2 = 4 \sigma_X^2$
Play twice, independently: $\sigma_{X_1 + X_2}^2 = \sigma_{X_1}^2 + \sigma_{X_2}^2 = 2 \sigma_X^2$
Note: playing more reduces uncertainty
(var quantifies this idea, will do more later)
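To see the difference between doubling and playing twice, one can build both distributions from the game table and compute their variances. This is a sketch under the stated independence assumption; the helper name `mean_var` is just for illustration.

```python
from fractions import Fraction as F
from itertools import product

def mean_var(dist):
    """Mean and variance of a distribution given as {value: probability}."""
    mean = sum(p * x for x, p in dist.items())
    var = sum(p * (x - mean) ** 2 for x, p in dist.items())
    return mean, var

# Game table from the lecture.
single = {-4: F(1, 2), 0: F(1, 6), 9: F(1, 3)}

# "Double the stakes": every winning is multiplied by 2.
doubled = {2 * x: p for x, p in single.items()}

# "Play twice, independently": add two plays, multiplying probabilities.
twice = {}
for (x1, p1), (x2, p2) in product(single.items(), repeat=2):
    twice[x1 + x2] = twice.get(x1 + x2, 0) + p1 * p2

print(mean_var(single))   # mean 1, variance 34
print(mean_var(doubled))  # mean 2, variance 136
print(mean_var(twice))    # mean 2, variance 68
```

Both options have mean 2, but doubling has variance 4 times 34 while two independent plays have variance 2 times 34.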
Variance of Random Variables
HW:
C17: Suppose that the random variable X
models winter daily maximum
temperatures, and that X has mean 5°C
and standard deviation 10°C. Let Y be
the temp. in degrees Fahrenheit.
(a) What is the mean of Y? (41°F)
Hint: Recall the conversion: C=(5/9)(F-32)
Variance of Random Variables
HW:
C17: (cont.)
(b) What is the standard deviation of Y?
(18°F)
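The answers in parentheses follow from the linear transformation rules: solving the hint for F gives F = (9/5)C + 32, so the mean is shifted and scaled while the standard deviation is only scaled. A short check, offered as a sketch:

```python
from fractions import Fraction as F

# Given in the problem: X in degrees Celsius.
mean_c, sd_c = 5, 10

# Conversion Y = a*X + b, from solving C = (5/9)(F - 32) for F.
a, b = F(9, 5), 32

mean_f = a * mean_c + b  # the mean picks up both scale and shift
sd_f = abs(a) * sd_c     # the s.d. ignores the shift and scales by |a|

print(mean_f, sd_f)  # 41 18
```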
And now for something completely different
Recall distribution of majors of students in this course:
[Bar chart: Stat 155, Section 2, Majors; categories: Business / Man., Biology, Public Policy / Health, Pharm / Nursing, Journalism / Comm., Env. Sci., Other, Undecided; y-axis: Frequency]
And now for something completely different
Couldn’t find any great jokes, so…
And now for something completely different
An Interesting and Relevant Issue:
• “Places Rated”
• Rankings Published by Several…
• We’ve been #1?
• Are we great or what?
Will take a careful look later
Chapter 5
Sampling Distributions
Idea: Extend probability tools to distributions
we care about:
(i) Counts in Political Polls
(ii) Measurement Error
Counts in Political Polls
Useful model: Binomial Distribution
Setting: n independent trials of an
experiment with outcomes “Success” and
“Failure”, with P{S} = p.
Say X = #S’s has a “Binomial(n,p)
distribution”, and write “X ~ Bi(n,p)”
(parameters, like for Normal dist.)
Binomial Distributions
Models much more than political polls:
E.g. Coin tossing
(recall saw “independence” was good)
E.g. Shooting free throws (in basketball)
• Is p always the same?
• Really independent? (turns out to be OK)
Binomial Distributions
HW on Binomial Assumptions:
5.1, 5.2 (a. no, n?, b. yes, c. yes)
Binomial Distributions
Could work out a formula for Binomial Probs,
but results are summarized in Excel function:
BINOMDIST
Example of Use:
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg19.xls
Binomial Probs in EXCEL
To compute P{X=x}, for X ~ Bi(n,p):
=BINOMDIST(x, n, p, cumulative)
Cumulative:
P{X=x}: false
P{X<=x}: true
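For comparison outside Excel, the two modes of BINOMDIST can be sketched in a few lines. The function name `binomdist` and the example values n = 4, p = 0.5 are illustrative choices; this is not Excel's implementation, only the same binomial formula for integer x.

```python
from math import comb

def binomdist(x, n, p, cumulative):
    """Sketch of BINOMDIST(x, n, p, cumulative) for integer x and X ~ Bi(n, p)."""
    def pmf(k):
        # P{X = k}: choose which k trials succeed, times their probability.
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    if cumulative:
        # P{X <= x}: add up the point probabilities from 0 through x.
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)

print(binomdist(2, 4, 0.5, False))  # P{X = 2}  = 0.375
print(binomdist(2, 4, 0.5, True))   # P{X <= 2} = 0.6875
```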
Binomial Probs in EXCEL
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg19.xls
Check this spreadsheet for details of other
parts, and some important variations
Binomial Probs in EXCEL
Next time:
More slides on BINOMDIST,
And illustrate things like P{X < 3} = P{X <= 2}, etc.
Using a number line, and filled in dots…
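Because X takes only whole-number values, strict and non-strict inequalities can be converted into the cumulative form P{X <= x}. The sketch below shows a few such conversions; n = 10 and p = 0.3 are illustrative values, not from the lecture.

```python
from math import comb

def pmf(k, n, p):
    """P{X = k} for X ~ Bi(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def cdf(x, n, p):
    """P{X <= x} for X ~ Bi(n, p), the cumulative form."""
    return sum(pmf(k, n, p) for k in range(x + 1))

n, p = 10, 0.3  # illustrative values only

p_less_than_3 = cdf(2, n, p)             # P{X < 3}  = P{X <= 2}
p_at_least_3 = 1 - cdf(2, n, p)          # P{X >= 3} = 1 - P{X <= 2}
p_3_to_5 = cdf(5, n, p) - cdf(2, n, p)   # P{3 <= X <= 5} = P{X <= 5} - P{X <= 2}

print(p_less_than_3, p_at_least_3, p_3_to_5)
```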
Binomial Probs in EXCEL
HW:
5.3
5.4 (0.194)
Rework, using the Binomial Distribution:
4.52c,d
Binomial Distribution
“Shape” of Binomial Distribution:
Use Probability Histogram
Just a bar graph, where heights are
probabilities
Note: connected to previous histogram
by frequentist view
(via histogram of repeated samples)
Binomial Distribution
Study Distribution Shapes using Excel
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg20.xls
Part I: different p, note several ranges of p
are shown
Part II: different n, note really “live in different
areas”
Binomial Distribution
A look under the hood
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg20.xls
Create probability histograms by:
– Create Column of xs (e.g. B9:B29)
– Create Probs (using BINOMDIST, C9:J29)
– Plot with Chart Wizard
(Click Chart & Chart Wizard,
follow steps, check “series” carefully)
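The same three steps (a column of x values, a column of probabilities, a bar plot) can be sketched without a spreadsheet. This assumes n = 20, a guess based on the 21-row range B9:B29, and the p values shown are illustrative; the bars are plain text rather than a chart.

```python
from math import comb

def binomial_probs(n, p):
    """P{X = x} for x = 0..n, like a column of BINOMDIST(x, n, p, FALSE)."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

def text_histogram(probs, width=60):
    """One row per x: the value, its probability, and a proportional bar."""
    top = max(probs)
    return [f"{x:2d} {q:6.4f} {'#' * round(width * q / top)}"
            for x, q in enumerate(probs)]

n = 20                      # assumed from the 21 rows in B9:B29
for p in (0.1, 0.5, 0.9):   # illustrative values of p
    print(f"p = {p}")
    print("\n".join(text_histogram(binomial_probs(n, p))))
```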
Binomial Distribution
With some calculation, can show:
For X ~ Bi(n,p):
Mean: $\mu_X = EX = np$ (# trials x P{S})
Variance: $\sigma_X^2 = \mathrm{var}(X) = np(1-p)$
S. D.: $\sigma_X = \sqrt{np(1-p)}$
Relate to (center & spread) of each histo:
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg20.xls
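These formulas can be checked against the general definitions of mean and variance by summing over the binomial table. The sketch uses illustrative values n = 10 and p = 3/10 with exact fractions, so the agreement is exact rather than approximate.

```python
from fractions import Fraction as F
from math import comb

n, p = 10, F(3, 10)  # illustrative values only

# Binomial table: P{X = x} for x = 0..n.
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# Mean and variance from the general definitions for a discrete distribution.
mean = sum(x * q for x, q in enumerate(pmf))
var = sum(q * (x - mean) ** 2 for x, q in enumerate(pmf))

print(mean, n * p)           # 3 3
print(var, n * p * (1 - p))  # 21/10 21/10
```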
Binomial Distribution
HW on Mean and Variance:
5.5
Binomial Distribution
E.g.: Class HW on %Males at UNC:
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg17.xls
Note Theoretical Means in E115:H115,
Compare to Sample Means in E110:H110:
Q1: Sample Mean smaller – course not representative
Q2: Sample Mean bigger – bias toward males
Q3: Sample Mean bigger – bias toward males
Q4: Sample Mean close
Which differences are “significant”?
Binomial Distribution
E.g.: Class HW on %Males at UNC:
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg17.xls
Note Theoretical SDs in E116:H116,
Compare to Sample SDs in E112:H112:
Q1: Sample SDs smaller – course population smaller
Q2: Sample SDs bigger – variety of doors (different p)
Q3: Sample SDs bigger – variety of choices (diff. p?)
Q4: Sample SDs close
Which differences are “significant”?
Binomial Distribution
E.g.: Class HW on %Males at UNC:
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg17.xls
Probability Histograms (see 3rd column of plots),
Good view of above ideas (for samples):
Q1: mean too small, not enough spread
Q2: mean too big, too spread
Q3: mean too big, too spread
Q4: looks “about right”…
Binomial Distribution
HW:
5.13
5.19
And now for something completely different
An Interesting and Relevant Issue:
• “Places Rated”
• Rankings Published by Several…
• We’ve been #1?
• Are we great or what?
Will take a careful look now
And now for something completely different
Interesting Article:
Analysis of Data from the Places Rated Almanac
By: Richard A. Becker; Lorraine Denby; Robert McGill; Allan R. Wilks
Published in: The American Statistician, Vol. 41, No. 3. (Aug., 1987), pp. 169-186.
Hyperlink to JSTOR
And now for something completely different
Main Ideas:
• For data base used in ratings
• Did careful analysis
• In an unbiased way
• Studied several aspects
• An interesting issue:
Who was “best”?
And now for something completely different
Who was “best”?
• Data base had 8 factors
• How should we weight them?
• Evenly?
• Other choices?
• Just choose some?
(typical approach)
• Can we make our city “best”?
And now for something completely different
Who was “best”?
• Approach:
Consider all possible ratings
(i.e. all sets of weights)
• Which places can be #1?
• Which places can be “worst”?
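The "all sets of weights" idea can be illustrated on a toy example. Everything in this sketch is hypothetical: the four places, their three factor scores, and the coarse grid of weights are made up for illustration and are not the Places Rated data or the article's actual method.

```python
from itertools import product

# HYPOTHETICAL scores (higher is better) for four made-up places on three
# factors; these are NOT data from the Places Rated Almanac.
scores = {
    "A": (9, 2, 2),
    "B": (2, 9, 3),
    "C": (5, 5, 5),
    "D": (3, 3, 8),
}

STEPS = 10  # weights are multiples of 1/10 that sum to 1

can_be_best, can_be_worst = set(), set()
for i, j in product(range(STEPS + 1), repeat=2):
    k = STEPS - i - j
    if k < 0:
        continue  # skip pairs that leave no room for a nonnegative third weight
    weights = (i, j, k)  # integer weights; dividing by STEPS keeps the same ranks
    totals = {place: sum(w * s for w, s in zip(weights, vals))
              for place, vals in scores.items()}
    hi, lo = max(totals.values()), min(totals.values())
    # Ties count for every place sharing the top (or bottom) total.
    can_be_best |= {c for c, t in totals.items() if t == hi}
    can_be_worst |= {c for c, t in totals.items() if t == lo}

print(sorted(can_be_best))   # ['A', 'B', 'C', 'D']
print(sorted(can_be_worst))  # ['A', 'B', 'D']
```

In this toy table every place can be made #1 by some choice of weights, three of them can also be made worst, and the evenly balanced place C is never worst.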
And now for something completely different
Which places can be #1?
• 134 cities are “best”
• Including Raleigh Durham area
Which places can be “worst”?
• Even longer list here
• But Raleigh Durham not here
And now for something completely different
Which places can be #1?
Which places can be “worst”?
Interesting fact:
Several cities on both lists!
And now for something completely different
Some conclusions:
• Be very skeptical of such ratings?
• Ask: what happens if weights change?
• Think: what motivates the rater?
• Understand how other people can have different opinions
(Just different “personal weights”)