Stat 155, Section 2, Last Time


Page 1: Stat 155,  Section 2, Last Time

Stat 155, Section 2, Last Time

• Continuous Random Variables

– Probabilities modeled with areas

• Normal Curve

– Calculate in Excel: NORMDIST & NORMINV

• Means, i.e. Expected Values

– Useful for “average over many plays”

• Independence of Random Variables

Page 2: Stat 155,  Section 2, Last Time

Reading In Textbook

Approximate Reading for Today’s Material:

Pages 277-286, 291-305

Approximate Reading for Next Class:

Pages 291-305, 334-351

Page 3: Stat 155,  Section 2, Last Time

Midterm I - Results

Preliminary comments:

• Circled numbers are points taken off

• Total for each problem in brackets

• Points evenly divided among parts

• Page total in lower right corner

• Check those sum to total on front

• Overall score out of 100 points

Page 4: Stat 155,  Section 2, Last Time

Midterm I - Results

Interpretation of Scores:

• Too early for letter grades

• These will change a lot:

– Some with good grades will relax

– Some with bad grades will wake up

• Don’t believe “A & C” average to “B”

Page 5: Stat 155,  Section 2, Last Time

Midterm I - Results

Too early for letter grades: recall previous scatterplot

[Figure: scatterplot “Intro Statistics Scores”; horizontal axis: Midterm 1 (20 to 100), vertical axis: Midterm 2 (20 to 100)]

Page 6: Stat 155,  Section 2, Last Time

Midterm I - Results

Interpretation of Scores:

• 85 – 100 Very Pleased

Page 7: Stat 155,  Section 2, Last Time

Midterm I - Results

Interpretation of Scores:

• 85 – 100 Very Pleased

• 65 – 84 OK

Page 8: Stat 155,  Section 2, Last Time

Midterm I - Results

Interpretation of Scores:

• 85 – 100 Very Pleased

• 65 – 84 OK

• 0 – 64 Recommend Drop Course

(if not, let’s talk personally…)

Page 9: Stat 155,  Section 2, Last Time

Midterm I - Results

Histogram of Results:

Overall I’m very pleased relative to other courses

[Figure: histogram “Stor 155, Sec. 2, Midterm 1”; horizontal axis: Scores (25 to 105), vertical axis: Frequency (0 to 25)]

Page 10: Stat 155,  Section 2, Last Time

Variance of Random Variables

Again consider discrete random variables:

Where distribution is summarized by a table,

Values x1 x2 … xk

Prob. p1 p2 … pk

Page 11: Stat 155,  Section 2, Last Time

Variance of Random Variables

Again connect via frequentist approach:

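$$\mathrm{var}(X_1, \ldots, X_n) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$

$$= \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n-1}$$

$$= \frac{\#\{X_i = x_1\}\,(x_1 - \bar{X})^2 + \cdots + \#\{X_i = x_k\}\,(x_k - \bar{X})^2}{n-1}$$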

Page 12: Stat 155,  Section 2, Last Time

Variance of Random Variables

Again connect via frequentist approach:

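$$\mathrm{var}(X_1, \ldots, X_n) = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$

$$= \frac{\#\{X_i = x_1\}}{n-1}\,(x_1 - \bar{X})^2 + \cdots + \frac{\#\{X_i = x_k\}}{n-1}\,(x_k - \bar{X})^2$$

$$\approx p_1 (x_1 - \mu_X)^2 + \cdots + p_k (x_k - \mu_X)^2$$

$$= \sum_{i=1}^{k} p_i (x_i - \mu_X)^2$$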
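The frequentist connection above can be checked numerically. The following is a minimal sketch, assuming the winnings table used on the later slides (-4, 0, 9 with probabilities 1/2, 1/6, 1/3); the function names, the seed and the number of draws are illustrative choices, not part of the course material.

```python
import random

# Distribution table from the game on the later slides.
values = [-4, 0, 9]
probs = [1 / 2, 1 / 6, 1 / 3]

def theoretical_variance(values, probs):
    """Sum of p_i * (x_i - mu)^2 over the table."""
    mu = sum(p * x for x, p in zip(values, probs))
    return sum(p * (x - mu) ** 2 for x, p in zip(values, probs))

def sample_variance(draws):
    """Usual sample variance, with the n - 1 divisor."""
    n = len(draws)
    xbar = sum(draws) / n
    return sum((x - xbar) ** 2 for x in draws) / (n - 1)

# Fixed, arbitrary seed so the run is repeatable.
rng = random.Random(155)
draws = rng.choices(values, weights=probs, k=100_000)

print(theoretical_variance(values, probs))  # table-based variance
print(sample_variance(draws))               # should land close to it
```

With many plays the relative frequencies settle near the p's, so the two printed numbers should be close but not identical.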

Page 13: Stat 155,  Section 2, Last Time

Variance of Random Variables

So define:

Variance of a distribution (or of a random variable) as:

$$\sigma_X^2 = \sum_{j=1}^{k} p_j (x_j - \mu_X)^2$$

Page 14: Stat 155,  Section 2, Last Time

Variance of Random Variables

E. g. above game:

Winning -4 0 9

Prob. 1/2 1/6 1/3

$$\sigma_X^2 = \tfrac{1}{2}(-4 - 1)^2 + \tfrac{1}{6}(0 - 1)^2 + \tfrac{1}{3}(9 - 1)^2$$

=(1/2)*5^2+(1/6)*1^2+(1/3)*8^2

Note: one acceptable Excel form, e.g. for exam (but there are many)

Page 15: Stat 155,  Section 2, Last Time

Standard Deviation

Recall standard deviation is square root of

variance (same units as data)

E. g. above game:

Standard Deviation

=sqrt((1/2)*5^2+(1/6)*1^2+(1/3)*8^2)

Winning -4 0 9

Prob. 1/2 1/6 1/3

Page 16: Stat 155,  Section 2, Last Time

Variance of Random Variables

HW:

C16: Find the variance and standard deviation of the distribution in 4.59.

(0.752, 0.867)

Page 17: Stat 155,  Section 2, Last Time

Properties of Variance

i. Linear transformation

$$\sigma_{aX+b}^2 = a^2 \sigma_X^2$$

I.e. “ignore shifts”: var(X + b) = var(X)

(makes sense)

And scales come through squared

(recall s.d. on scale of data, var is square)

Page 18: Stat 155,  Section 2, Last Time

Properties of Variance

ii. For X and Y independent (important!)

$$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2$$

I. e. Variance of sum is sum of variances

Here is where variance is “more natural” than standard deviation:

$$\sigma_{X+Y} = \sqrt{\sigma_X^2 + \sigma_Y^2}$$

Page 19: Stat 155,  Section 2, Last Time

Properties of Variance

E. g. above game:

Winning -4 0 9

Prob. 1/2 1/6 1/3

Recall “double the stakes” gave same mean as “play twice”, but seems different

Doubling: $\sigma_{2X}^2 = 4 \sigma_X^2$

Play twice, independently: $\sigma_{X_1 + X_2}^2 = \sigma_{X_1}^2 + \sigma_{X_2}^2 = 2 \sigma_X^2$

Note: playing more reduces uncertainty

(var quantifies this idea, will do more later)
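To see the difference between doubling the stakes and playing twice, both distributions can be built explicitly from the game table. This is a small sketch using exact fractions; the helper name `mean_var` and the dictionary layout are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

winnings = [-4, 0, 9]
probs = [Fraction(1, 2), Fraction(1, 6), Fraction(1, 3)]

def mean_var(dist):
    """Mean and variance of a distribution given as {value: probability}."""
    mu = sum(p * x for x, p in dist.items())
    var = sum(p * (x - mu) ** 2 for x, p in dist.items())
    return mu, var

single = dict(zip(winnings, probs))

# "Double the stakes": every outcome is multiplied by 2.
doubled = {2 * x: p for x, p in single.items()}

# "Play twice, independently": probabilities multiply, winnings add.
twice = {}
for (x1, p1), (x2, p2) in product(single.items(), repeat=2):
    twice[x1 + x2] = twice.get(x1 + x2, 0) + p1 * p2

print(mean_var(single))   # mean 1, variance 34
print(mean_var(doubled))  # mean 2, variance 136 = 4 * 34
print(mean_var(twice))    # mean 2, variance 68 = 2 * 34
```

Both versions have mean 2, but doubling gives variance 136 while two independent plays give only 68.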

Page 20: Stat 155,  Section 2, Last Time

Variance of Random Variables

HW:

C17: Suppose that the random variable X models winter daily maximum temperatures, and that X has mean 5°C and standard deviation 10°C. Let Y be the temp. in degrees Fahrenheit.

(a) What is the mean of Y? (41°F)

Hint: Recall the conversion: C=(5/9)(F-32)

Page 21: Stat 155,  Section 2, Last Time

Variance of Random Variables

HW:

C17: (cont.)

(b) What is the standard deviation of Y? (18°F)
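The two answers to C17 follow from the linear transformation rule, since solving the hint for F gives Y = (9/5)X + 32. A brief sketch of the arithmetic, with variable names chosen here for illustration:

```python
# Y = a*X + b converts Celsius to Fahrenheit: F = (9/5)*C + 32.
a, b = 9 / 5, 32

mean_c, sd_c = 5, 10          # given in the problem, in degrees C

mean_f = a * mean_c + b       # the mean picks up both the scale and the shift
sd_f = abs(a) * sd_c          # the shift b drops out; the scale enters as |a|
var_f = a ** 2 * sd_c ** 2    # equivalently, the variance scales by a^2

print(mean_f, sd_f, var_f)
```

The shift of 32 changes the mean but has no effect on the standard deviation.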

Page 22: Stat 155,  Section 2, Last Time

And now for something completely different

Recall distribution of majors of students in this course:

[Figure: bar chart “Stat 155, Section 2, Majors”; vertical axis: Frequency (0 to 0.4); categories: Business / Man., Biology, Public Policy / Health, Pharm / Nursing, Journalism / Comm., Env. Sci., Other, Undecided]

Page 23: Stat 155,  Section 2, Last Time

And now for something completely different

Couldn’t

Find

Any

Great

Jokes,

So…

Page 24: Stat 155,  Section 2, Last Time

And now for something completely different

An Interesting and Relevant Issue:

• “Places Rated”

• Rankings Published by Several…

• We’ve been #1?

• Are we great or what?

Will take a careful look later

Page 25: Stat 155,  Section 2, Last Time

Chapter 5

Sampling Distributions

Idea: Extend probability tools to distributions

we care about:

(i) Counts in Political Polls

(ii) Measurement Error

Page 26: Stat 155,  Section 2, Last Time

Counts in Political Polls

Useful model: Binomial Distribution

Setting: n independent trials of an

experiment with outcomes “Success” and

“Failure”, with P{S} = p.

Say X = #S’s has a “Binomial(n,p)

distribution”, and write “X ~ Bi(n,p)”

(parameters, like for Normal dist.)

Page 27: Stat 155,  Section 2, Last Time

Binomial Distributions

Models much more than political polls:

E.g. Coin tossing

(recall saw “independence” was good)

E.g. Shooting free throws (in basketball)

• Is p always the same?

• Really independent? (turns out to be OK)

Page 28: Stat 155,  Section 2, Last Time

Binomial Distributions

HW on Binomial Assumptions:

5.1, 5.2 (a. no, n?, b. yes, c. yes)

Page 29: Stat 155,  Section 2, Last Time

Binomial Distributions

Could work out a formula for Binomial Probs,

but results are summarized in Excel function:

BINOMDIST

Example of Use:

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg19.xls

Page 30: Stat 155,  Section 2, Last Time

Binomial Probs in EXCEL

To compute P{X=x}, for X ~ Bi(n,p):

BINOMDIST arguments, in order: x, n, p

Page 31: Stat 155,  Section 2, Last Time

Binomial Probs in EXCEL

To compute P{X=x}, for X ~ Bi(n,p):

Cumulative:

P{X=x}: false

P{X<=x}: true
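For readers without Excel, the same two probabilities can be computed from the Binomial formula. The sketch below is a Python stand-in written for illustration; `binomdist` is a hypothetical helper that mimics the argument order (x, n, p, cumulative) and is not Excel's own implementation. The example values are also assumptions, not taken from the slides.

```python
from math import comb

def binomdist(x, n, p, cumulative):
    """P{X = x} if cumulative is False, P{X <= x} if True, for X ~ Bi(n, p)."""
    def pmf(k):
        # Number of ways to place k successes, times the probability of each.
        return comb(n, k) * p ** k * (1 - p) ** (n - k)
    if cumulative:
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)

# Illustrative values: 10 tosses of a fair coin.
print(binomdist(3, 10, 0.5, False))  # P{X = 3}
print(binomdist(3, 10, 0.5, True))   # P{X <= 3}
print(binomdist(2, 10, 0.5, True))   # P{X < 3}, i.e. P{X <= 2}
```

Because X only takes whole-number values, a strict inequality such as P{X < 3} is handled by lowering x by one and using the cumulative form.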

Page 32: Stat 155,  Section 2, Last Time

Binomial Probs in EXCEL

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg19.xls

Check this spreadsheet for details of other

parts, and some important variations

Page 33: Stat 155,  Section 2, Last Time

Binomial Probs in EXCEL

Next time:

More slides on BINOMDIST,

And illustrate things like P{X < 3} = P{X <= 2}, etc.

Using a number line, and filled in dots…

Page 34: Stat 155,  Section 2, Last Time

Binomial Probs in EXCEL

HW:

5.3

5.4 (0.194)

Rework, using the Binomial Distribution:

4.52c,d

Page 35: Stat 155,  Section 2, Last Time

Binomial Distribution

“Shape” of Binomial Distribution:

Use Probability Histogram

Just a bar graph, where heights are

probabilities

Note: connected to previous histogram

by frequentist view

(via histogram of repeated samples)

Page 36: Stat 155,  Section 2, Last Time

Binomial Distribution

Study Distribution Shapes using Excel

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg20.xls

Part I: different p, note several ranges of p are shown

Part II: different n, note really “live in different areas”

Page 37: Stat 155,  Section 2, Last Time

Binomial Distribution

A look under the hood

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg20.xls

Create probability histograms by:

– Create Column of xs (e.g. B9:B29)

– Create Probs (using BINOMDIST, C9:J29)

– Plot with Chart Wizard

(Click Chart & Chart Wizard

Follow steps, check “series” carefully)
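The same three steps (column of x's, column of probabilities, plot) can be sketched outside Excel. The following is a rough text-only version, assuming illustrative values of n and p rather than those in the spreadsheet; the function names are hypothetical.

```python
from math import comb

def binomial_probs(n, p):
    """List of P{X = x} for x = 0, ..., n, with X ~ Bi(n, p)."""
    return [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

def text_histogram(n, p, width=50):
    """One line per x; the bar length is proportional to the probability."""
    lines = []
    for x, prob in enumerate(binomial_probs(n, p)):
        bar = "#" * round(prob * width)
        lines.append(f"{x:3d} {prob:6.3f} {bar}")
    return "\n".join(lines)

# Illustrative n and p values, not taken from the spreadsheet.
for p in (0.1, 0.5, 0.9):
    print(f"n = 20, p = {p}")
    print(text_histogram(20, p))
```

Changing p moves the bulk of the bars left or right, which is the effect Part I of the spreadsheet is meant to show.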

Page 38: Stat 155,  Section 2, Last Time

Binomial Distribution

With some calculation, can show:

For X ~ Bi(n,p):

Mean: $\mu_X = EX = np$ (# trials x P{S})

Variance: $\sigma_X^2 = \mathrm{var}(X) = np(1-p)$

S. D.: $\sigma_X = \sqrt{np(1-p)}$

Relate to (center & spread) of each histo:

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg20.xls
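The “some calculation” can be spot-checked by applying the general definitions of mean and variance directly to the Binomial table. A minimal sketch, with n and p chosen arbitrarily for illustration and `binomial_moments` a hypothetical helper name:

```python
from math import comb, sqrt

def binomial_moments(n, p):
    """Mean, variance and s.d. of Bi(n, p), summed directly from the table."""
    probs = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]
    mean = sum(x * q for x, q in enumerate(probs))
    var = sum((x - mean) ** 2 * q for x, q in enumerate(probs))
    return mean, var, sqrt(var)

n, p = 20, 0.3  # illustrative values, not from the slides
mean, var, sd = binomial_moments(n, p)

# Each pair should agree up to floating-point rounding.
print(mean, n * p)
print(var, n * p * (1 - p))
print(sd, sqrt(n * p * (1 - p)))
```

This is a numerical check for one choice of n and p, not a proof of the formulas.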

Page 39: Stat 155,  Section 2, Last Time

Binomial Distribution

HW on Mean and Variance:

5.5

Page 40: Stat 155,  Section 2, Last Time

Binomial Distribution

E.g.: Class HW on %Males at UNC:

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg17.xls

Note Theoretical Means in E115:H115,

Compare to Sample Means in E110:H110:

Q1: Sample Mean smaller – course not representative

Q2: Sample Mean bigger – bias toward males

Q3: Sample Mean bigger – bias toward males

Q4: Sample Mean close

Which differences are “significant”?

Page 41: Stat 155,  Section 2, Last Time

Binomial Distribution

E.g.: Class HW on %Males at UNC:

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg17.xls

Note Theoretical SDs in E116:H116,

Compare to Sample SDs in E112:H112:

Q1: Sample SDs smaller – course population smaller

Q2: Sample SDs bigger – variety of doors (different p)

Q3: Sample SDs bigger – variety of choices (diff. p?)

Q4: Sample SDs close

Which differences are “significant”?

Page 42: Stat 155,  Section 2, Last Time

Binomial Distribution

E.g.: Class HW on %Males at UNC:

http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg17.xls

Probability Histograms (see 3rd column of plots),

Good view of above ideas (for samples):

Q1: mean too small, not enough spread

Q2: mean too big, too spread

Q3: mean too big, too spread

Q4: looks “about right”…

Page 43: Stat 155,  Section 2, Last Time

Binomial Distribution

HW:

5.13

5.19

Page 44: Stat 155,  Section 2, Last Time

And now for something completely different

An Interesting and Relevant Issue:

• “Places Rated”

• Rankings Published by Several…

• We’ve been #1?

• Are we great or what?

Will take a careful look now

Page 45: Stat 155,  Section 2, Last Time

And now for something completely different

Interesting Article:

Analysis of Data from the Places Rated Almanac

By: Richard A. Becker; Lorraine Denby; Robert McGill; Allan R. Wilks

Published in: The American Statistician, Vol. 41, No. 3. (Aug., 1987), pp. 169-186.

Hyperlink to JSTOR

Page 46: Stat 155,  Section 2, Last Time

And now for something completely different

Main Ideas:

• For data base used in ratings

• Did careful analysis

• In an unbiased way

• Studied several aspects

• An interesting issue:

Who was “best”?

Page 47: Stat 155,  Section 2, Last Time

And now for something completely different

Who was “best”?

• Data base had 8 factors

• How should we weight them?

• Evenly?

• Other choices?

• Just choose some?

(typical approach)

• Can we make our city “best”?

Page 48: Stat 155,  Section 2, Last Time

And now for something completely different

Who was “best”?

• Approach:

Consider all possible ratings

(i.e. all sets of weights)

• Which places can be #1?

• Which places can be “worst”?
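The idea of trying all sets of weights can be sketched on a toy example. Everything below is hypothetical: the four places, their scores and the three factors are made up for illustration (the real data base has 8 factors), and a coarse grid of weights stands in for “all possible” weights.

```python
from itertools import product

# Hypothetical scores (higher is better) for four made-up places on three
# factors. These are NOT data from the Places Rated Almanac.
scores = {
    "A": (9, 2, 3),
    "B": (3, 9, 2),
    "C": (5, 5, 5),
    "D": (4, 4, 4),
}

def weight_grid(n_factors, steps):
    """All non-negative integer weight vectors whose entries sum to `steps`."""
    for ks in product(range(steps + 1), repeat=n_factors):
        if sum(ks) == steps:
            yield ks

def places_at_extreme(scores, steps=10):
    """Places that come out top, or bottom, for at least one weight vector."""
    best, worst = set(), set()
    for w in weight_grid(3, steps):
        totals = {name: sum(wi * si for wi, si in zip(w, s))
                  for name, s in scores.items()}
        hi, lo = max(totals.values()), min(totals.values())
        best |= {name for name, t in totals.items() if t == hi}
        worst |= {name for name, t in totals.items() if t == lo}
    return best, worst

best, worst = places_at_extreme(scores)
print(sorted(best))   # places that can be #1 under some weights
print(sorted(worst))  # places that can be last under some weights
```

In this toy table the specialists A and B show up in both sets, while the all-round place C can be first but never last.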

Page 49: Stat 155,  Section 2, Last Time

And now for something completely different

Which places can be #1?

• 134 cities are “best”

• Including Raleigh Durham area

Which places can be “worst”?

• Even longer list here

• But Raleigh Durham not here

Page 50: Stat 155,  Section 2, Last Time

And now for something completely different

Which places can be #1?

Which places can be “worst”?

Interesting fact:

Several cities on both lists!

Page 51: Stat 155,  Section 2, Last Time

And now for something completely different

Some conclusions:

• Be very skeptical of such ratings?

• Ask: what happens if weights change?

• Think: what motivates the rater?

• Understand how other people can have different opinions

(Just different “personal weights”)