Last Lecture: Histograms: –Definition –Interpretation in terms of probability –Estimate of...


Page 1:

Last Lecture:

• Histograms:
  – Definition
  – Interpretation in terms of probability
  – Estimate of distribution function

• Sample Means, Sample Medians, and Sample Variances / Standard Deviations (also known as “statistics”)
  – Definitions
  – Interpretations
  – Estimates of “true” values

• This Thursday, see www.math.umass.edu/~jstauden/ for homework 2.

• In the HW, you’ll also learn about percentiles and boxplots (from Chapters 2 and 3).

• We’ll learn a lot more about the stuff in the first 3 chapters later in the semester…

Page 2:

Economic Growth Rate is an Estimate

• If 100 economists were asked to estimate the growth of the economy last quarter, a histogram of the estimates might look like this:

[Figure: histogram of the estimates. x-axis: ests, y-axis: Percent]

Sample mean growth estimate is about: 0.002 (0.2%)

Min = -0.002, Max = 0.006, Range = 0.008

Distribution is bell shaped. Fact: Most of the data falls w/in mean +/- 2 std dev for bell shaped distributions, so the range covers about 4 std devs.

As a result, the sample std dev of the estimates is about 0.008/4 = 0.002

(Could also calculate s = 0.00173 in this case…)

Page 3:

Probability (starting chapter 4)

• A probability is a number between zero and one that is assigned to an event.

• The higher the probability, the more likely the event.

Notation: Pr( event occurs )

• If the “experiment” that generates the event were repeated many times, the probability describes the fraction of the time it would occur.
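The "repeated many times" idea can be tried out with a small simulation. This is only a sketch: the experiment (rolling a fair six-sided die and watching for a 1), the function name, and the fixed seed are assumptions made for illustration.

```python
import random

def relative_frequency(trials, seed=0):
    """Fraction of simulated fair-die rolls that come up 1."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) == 1)
    return hits / trials

# The fraction should settle near 1/6 (about 0.167) as trials grow.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

With few trials the fraction can be far from 1/6; with many trials it should be close.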

Page 4:

One way to think about probability:

[Figure: a box containing an oval labeled “Event 1”]

Box represents all possible events

Total area of the box is one (think about why)

Pr( Event 1 occurs ) = area of oval
Pr( Event 1 does not occur ) = 1 – Pr( Event 1 occurs )

Page 5:

Suppose there are 2 events that are “independent”

(we’ll define independence later…)

[Figure: a box containing two overlapping ovals labeled “Event 1” and “Event 2”]

Box represents all possible events

Pr( Event 1 and Event 2 ) = area of overlap
= Pr( Event 1 ) * Pr( Event 2 )

Pr( Event 1 or Event 2 )
= Pr( Event 1 ) + Pr( Event 2 ) – Pr( Event 1 and Event 2 )
= Pr( Event 1 ) + Pr( Event 2 ) – Pr( Event 1 ) * Pr( Event 2 )
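The "and" and "or" rules for independent events can be written as two small functions. This is a sketch only: the function names and the probabilities 1/2 and 1/5 are made up for illustration, and exact fractions are used so no rounding gets in the way.

```python
from fractions import Fraction

def pr_and(p1, p2):
    """Pr(Event 1 and Event 2), valid only when the events are independent."""
    return p1 * p2

def pr_or(p1, p2):
    """Pr(Event 1 or Event 2) for independent events."""
    return p1 + p2 - pr_and(p1, p2)

# Hypothetical probabilities, chosen only for illustration.
p1, p2 = Fraction(1, 2), Fraction(1, 5)
print(pr_and(p1, p2))  # 1/10
print(pr_or(p1, p2))   # 3/5
```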

Page 6:

Example with dice

Pr( rolling a 1 on 1 die in one roll ) = 1/6
Pr( rolling a 2 on 1 die in one roll ) = 1/6
Pr( rolling a 3 on 1 die in one roll ) = 1/6
Pr( rolling a 4 on 1 die in one roll ) = 1/6
Pr( rolling a 5 on 1 die in one roll ) = 1/6
Pr( rolling a 6 on 1 die in one roll ) = 1/6

Pr( rolling less than 4 in one roll )
= Pr( rolling 1 or 2 or 3 in one roll )
= Pr( rolling 1 in one roll ) + Pr( rolling 2 in one roll ) + Pr( rolling 3 in one roll ) – Pr( rolling a 1 and a 2 and a 3 in one roll )
= 1/6 + 1/6 + 1/6 – 0 = 1/2

(One roll cannot show two different numbers, so every overlap between these events has probability 0.)

Page 7:

Example with 2 dice

                     Outcome on die 1
                     1  2  3  4  5  6
Outcome on die 2  1  .  .  .  .  .  x
                  2  .  .  .  .  x  .
                  3  .  .  .  x  .  .
                  4  .  .  x  .  .  .
                  5  .  x  .  .  .  .
                  6  x  .  .  .  .  .

Each square is a possible event.

Pr( any specific event ) = 1/6 * 1/6 = 1/36

Pr( rolling a seven in total ) = 6/36 = 1/6
(squares w/ x’s in them are 7s)

Page 8:

Example with 2 dice (a related interpretation)

Pr( rolling a seven in total ) = (number of ways to roll a seven) / (number of possible outcomes)

In general, when all possible outcomes are equally probable,

Pr( event ) = (number of ways that the event could occur) / (number of possible outcomes)
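The counting rule can be checked by listing all outcomes for two dice. A minimal sketch; the variable names are just for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (die 1, die 2) outcomes.
outcomes = list(product(range(1, 7), repeat=2))
sevens = [(a, b) for a, b in outcomes if a + b == 7]

pr_seven = Fraction(len(sevens), len(outcomes))
print(len(sevens), len(outcomes), pr_seven)  # 6 36 1/6
```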

What’s a simple expression for Pr( event doesn’t occur )? (hint: it involves Pr( event )…)

Page 9:

Aside: Odds

• “Odds” are related to probabilities. More specifically, the odds of an event are:

“Pr( event does not occur ) / Pr( event occurs ) to 1”

• At start of last football season, “the odds”* that the Patriots would win the Super Bowl were: 250 to 1.

(1 - pr(Pats win)) / pr(Pats win) = 250
(1 - pr(Pats win)) = 250 * pr(Pats win)
1 = 251 * pr(Pats win)
pr(Pats win) = 1/251

• Q: How were odds determined?
• A: Doing that well is how casinos make money. The precise methods are proprietary. One way is to try to estimate the probabilities from historical data…

*i.e. “the odds” = what some casino in NV thought.
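The algebra in the Patriots example works for any "odds to 1" figure. Here is a small sketch of the conversion in both directions; the function names are made up for illustration.

```python
from fractions import Fraction

def odds_against_to_probability(odds):
    """Convert "odds to 1" against an event into Pr(event occurs)."""
    # odds = (1 - p) / p  =>  p = 1 / (odds + 1)
    return Fraction(1, 1) / (Fraction(odds) + 1)

def probability_to_odds_against(p):
    """Inverse direction: Pr(does not occur) / Pr(occurs)."""
    return (1 - p) / p

p_win = odds_against_to_probability(250)
print(p_win)                               # 1/251
print(probability_to_odds_against(p_win))  # 250
```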

Page 10:

Example 2: Researcher mates 2 fruit flies and observes the traits of 300 offspring:

                          wing size
                      normal   miniature
eye color  normal       140        6
           vermilion      3      151

What is pr(a fly in the experiment has normal eye color and normal wing size)?

What is pr(a fly in the experiment has vermilion eyes)?

What is pr(a fly in the experiment has vermilion eyes, miniature wings, or both)?

= WAYS EVENT CAN OCCUR / TOTAL NUMBER OF EVENTS
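One way to work the three questions is to count table cells directly. This is a sketch under the assumption that the table reads as 140 normal/normal, 6 normal-eyed/miniature, 3 vermilion/normal, and 151 vermilion/miniature; the helper name pr is made up.

```python
from fractions import Fraction

# Counts from the table, keyed by (eye color, wing size).
counts = {
    ("normal", "normal"): 140,
    ("normal", "miniature"): 6,
    ("vermilion", "normal"): 3,
    ("vermilion", "miniature"): 151,
}
total = sum(counts.values())  # 300 flies

def pr(event):
    """Pr(event) = flies for which event(eye, wing) is true / total flies."""
    ways = sum(n for (eye, wing), n in counts.items() if event(eye, wing))
    return Fraction(ways, total)

print(pr(lambda eye, wing: eye == "normal" and wing == "normal"))       # 7/15
print(pr(lambda eye, wing: eye == "vermilion"))                         # 77/150
print(pr(lambda eye, wing: eye == "vermilion" or wing == "miniature"))  # 8/15
```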

Page 11:

Independence

Definition:

events A and B are independent if

Pr(event A and B) = Pr(A)*Pr(B)

Idea:

Does whether A occurs or not give you any information about whether B occurs or not?

If yes, then A and B are not independent.

Page 12:

Independence: example

Consider the following example from a latex glove manufacturer. Each number is a count of 8 hour manufacturing shifts.

Weather        8 hour shifts that produce defects    8 hour shifts that produce no defects
Raining                       90                                   9200
Not Raining                   80                                  15000

Q: Are defects and weather independent?

Page 13:

Pr( rain ) = shifts w/ rain / total shifts
= (90 + 9200) / (90 + 9200 + 80 + 15000) = .38

Pr( defect ) = shifts w/ defect / total shifts
= (90 + 80) / (90 + 9200 + 80 + 15000) = .00698

Pr( defect and rain ) = shifts w/ defects and rain / total shifts
= 90 / (90 + 9200 + 80 + 15000) = .00369

.00369 does not equal (.38) * (.00698) = .00265
So the events are not independent in this sample. (Humidity is related to defects…)
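The same independence check can be done from the four shift counts. A minimal sketch; the dictionary layout and variable names are assumptions for illustration.

```python
# Shift counts from the glove example, keyed by (weather, outcome).
shifts = {
    ("rain", "defect"): 90,
    ("rain", "no defect"): 9200,
    ("no rain", "defect"): 80,
    ("no rain", "no defect"): 15000,
}
total = sum(shifts.values())

pr_rain = (shifts[("rain", "defect")] + shifts[("rain", "no defect")]) / total
pr_defect = (shifts[("rain", "defect")] + shifts[("no rain", "defect")]) / total
pr_both = shifts[("rain", "defect")] / total

print(round(pr_rain, 2), round(pr_defect, 5), round(pr_both, 5))
# If the events were independent, this product would match pr_both.
print(round(pr_rain * pr_defect, 5))
```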

Page 14:

Random Variables

Let X be a number whose value depends on the outcome of a “chance event”

Examples:

A poll is asked of 100 people
X = 0 if person 1 answers no and 1 if yes
or
X = total number of yeses

X = measurement of a board with a ruler
X = weight of a randomly selected cat

Page 15:

A probability distribution function (pdf) is associated with every random variable.

Assume for now that X is discrete (takes values mappable to the integers or a subset of the integers). The probability distribution function is:

Pr( X = a number ) (argument is “a number”, output is probability)

p(k) = Pr( X = k)

Capital letter = random variable

Lower case letter = number

Page 16:

Properties of pdfs

p(k) is greater than or equal to 0 for any k

p(k) is less than or equal to 1 for any k

sum of p(k) over all possible k’s = 1

The pdf is a model for how X behaves.

Note that histograms estimate pdfs from data.

Histograms, sample means, sample variances etc show how observations of X actually behave.

Page 17:

Ways to determine PDFs:

1) Given in a table

2) Given by a formula. There are “famous ones”: binomial, Poisson, hypergeometric,…

Page 18:

PDF – probabilities in a table

Let X = coffee cart line length at 10am

Line Length (k)    Pr(k)
0                  0.10
1                  0.20
2                  0.25
3                  0.30
4                  0.15

Suppose a line of more than 4 people is impossible.

Pr( X >= 0 ) =
Pr( X > 3 ) =
Pr( X >= 1 ) =
Pr( X < 1 ) =

If you observe the line length on several days and make a histogram, then it will be “close” to pr(k). It gets “closer” as the number of days increases.
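The claim that the histogram gets "closer" to pr(k) can be illustrated with simulated days. This is only a sketch: the simulated observations stand in for real line counts, and the function name and seed are assumptions.

```python
import random
from collections import Counter

line_lengths = [0, 1, 2, 3, 4]
pr = [0.10, 0.20, 0.25, 0.30, 0.15]  # pr(k) from the table

def empirical_frequencies(days, seed=0):
    """Simulate `days` observations and return the observed fraction for each k."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    draws = rng.choices(line_lengths, weights=pr, k=days)
    counts = Counter(draws)
    return [counts[k] / days for k in line_lengths]

# Compare each printed row with pr = [0.10, 0.20, 0.25, 0.30, 0.15].
for days in (20, 2_000, 200_000):
    print(days, [round(f, 3) for f in empirical_frequencies(days)])
```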

Page 19:

Associated with PDFs are true Means and Variances (& true std devs)

idea: pdf provides model. True means and variances are attributes of the model…

• True Mean = sum( k*p(k) ) where sum is over all the possible k’s .

• True Variance = sum(p(k) * (k-mean)^2) where sum is over all possible k’s.

• Line length:
Mean = E(X) = 0*.1 + 1*.2 + 2*.25 + 3*.3 + 4*.15 = 2.2
Variance = Var(X) = (0-2.2)^2*.1 + (1-2.2)^2*.2 + (2-2.2)^2*.25 + (3-2.2)^2*.3 + (4-2.2)^2*.15 = 1.46

• Sample means and Sample variances are calculated from datasets.
• True means and True variances are part of the theoretical model for the data.

• KEY IDEA: as the size of the dataset becomes larger, the Sample means and variances get closer to the true means and variances…
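The true mean and variance formulas translate directly into code. A small sketch using the line length pdf; the function names are made up for illustration.

```python
from math import sqrt

pdf = {0: 0.10, 1: 0.20, 2: 0.25, 3: 0.30, 4: 0.15}  # p(k) for line length

def true_mean(pdf):
    """sum(k * p(k)) over all possible k."""
    return sum(k * p for k, p in pdf.items())

def true_variance(pdf):
    """sum(p(k) * (k - mean)^2) over all possible k."""
    mu = true_mean(pdf)
    return sum(p * (k - mu) ** 2 for k, p in pdf.items())

mean = true_mean(pdf)
var = true_variance(pdf)
print(round(mean, 2), round(var, 2), round(sqrt(var), 3))  # mean, variance, std dev
```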

Page 20:

Powerball Example:

# Winners:     0    1    2    3    4    5    6
Probability:   8%   21%  26%  21%  13%  9%   2%

(These are estimates based on historical data, but assume that they are the truth for the sake of the example.)

Probability that I am a winner if I buy 1 ticket = 1/80 million (= 1/80M).Jackpot (pre tax) = $200 million. (if >1 person wins, jackpot is divided).

Assume whether or not I win is independent of the number of winners.

Let X = millions of dollars I win from one ticket.

PDF:
x        0    200         100         66.7        50          40          33.3
pr(x)    ?    .2283/80M   .2826/80M   .2283/80M   .1413/80M   .0978/80M   .0217/80M

1) What does “?” equal? (and how did I compute the other pr(x)’s)? (see next slide for ans)
2) mu = E(X) = sum(x * pr(x)). In dollars this is about $1.26. (you can confirm this)
3) Var(X) = sum((x – mu)^2 * pr(x))
4) Interpretation: If I play powerball a lot when there is a $200 million jackpot, then I can expect to win $1.26 on average.
5) If tickets are a dollar each, why doesn’t Powerball lose money? (These numbers are all based on real data.)

Page 21:

Answers to question on previous slide

• The ? = (80million-1)/80million
  – You know this is true since the probability that I do not win is one minus the probability that I win (and the probability that I win is given to be 1/80 million).

• How did I compute the pr(x)’s:
  – The probability that I win $200million =
    Pr(I win and there is only one winner given that there is at least 1 winner)
    = Pr(I win) * Pr(there is only one winner given that there is at least one winner) (uses independence)
    = (1/80million) * (0.21/(.21+.26+.21+.13+.09+.02)) (uses the rule for conditional probability on page 141)
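The $1.26 figure can be confirmed by repeating the $200 million calculation for every possible number of winners. This is a sketch of that arithmetic; the variable names are made up, and the payoffs use the exact split 200/k rather than the rounded 66.7 and 33.3.

```python
# Probability of each number of winners (from the Powerball slide).
pr_winners = {0: 0.08, 1: 0.21, 2: 0.26, 3: 0.21, 4: 0.13, 5: 0.09, 6: 0.02}
PR_I_WIN = 1 / 80_000_000   # one ticket
JACKPOT = 200_000_000       # dollars, pre tax

# Pr(at least one winner), used to condition on "somebody won".
pr_at_least_one = sum(p for k, p in pr_winners.items() if k >= 1)

expected_dollars = 0.0
for k, p in pr_winners.items():
    if k == 0:
        continue  # if I win, there is at least one winner
    pr_k_given_someone_won = p / pr_at_least_one
    # I win (independent of k) and split the jackpot k ways.
    expected_dollars += (JACKPOT / k) * PR_I_WIN * pr_k_given_someone_won

pr_win_nothing = 1 - PR_I_WIN  # the "?" entry
print(round(expected_dollars, 2))  # about 1.26
```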

Page 22:

Cumulative Probability:

• A cumulative probability is the probability that X is less than or equal to some number.

• Ex: powerball (here X = number of winners):
• Pr(there are 3 or fewer winners)
= Pr(X<=3)
= Pr(X=0 or X=1 or X=2 or X=3)
= Pr(no winners) + Pr(1 winner) + Pr(2 winners) + Pr(3 winners)
= 8% + 21% + 26% + 21% = 76%

• Notation: F(3)=Pr(X <= 3) (F(k) = Pr(X<=k) is called the Cumulative Distribution Function or CDF)

• If this helps, think of F(k) as the integral of the PDF from 0 to k (for a discrete X, the sum of p(j) for j = 0,…,k).

• Note: Pr(X > 3) = 1-Pr(X<=3) (careful about > and <=…)
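For a discrete X the CDF is just a running sum of the PDF. A minimal sketch using the number-of-winners probabilities; the variable names are assumptions.

```python
from itertools import accumulate

# PDF for the number of Powerball winners, k = 0,...,6.
pdf = [0.08, 0.21, 0.26, 0.21, 0.13, 0.09, 0.02]

# F(k) = Pr(X <= k) is the running sum of the PDF.
cdf = list(accumulate(pdf))

F3 = cdf[3]
print(round(F3, 2))      # Pr(X <= 3), about 0.76
print(round(1 - F3, 2))  # Pr(X > 3), about 0.24
```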

Page 23:

Graphically:

[Figure: bar chart of the PDF for the random variable that represents the number of winners. x-axis: Number of Winners (k), 0 to 6; y-axis: Pr( Winners = k ). The bars for k = 0,…,3 are shaded.]

Pr( X <= 3 ) = sum of the areas of the shaded regions
= 1 – Pr( X > 3 )
= 1 – sum of the areas of the white regions

Page 24:

“Famous” PDFs

• Binomial: “X~bin(n,p)”
  – Setup
    • Let X = number of successes out of n identical trials
    • n identical independent trials
    • Each trial results in a success w/ probability p or failure with probability q=1-p
    • X could possibly be 0,…,n
  – PDF:
    • Pr(X = k) = (n choose k) * p^k * q^(n-k)
    • (n choose k) = number of ways to choose k things from n things = n! / (k! (n-k)!)
    • Note that n! = n*(n-1)*…*2*1
    • Also, 0! = 1
  – Expectation = E(X) = np
    Variance = Var(X) = npq
    StdDev = sqrt(Var(X))

Page 25:

Example:
• Suppose each person in a 5 person class comes with probability 0.85.
• Let X = number of people in class on a given day.
• What’s the probability 4 people show up one day?
• X~bin(5,0.85)
• Pr(X = 4) = (5 choose 4) * 0.85^4 * 0.15^1
= 5 * 0.85^4 * 0.15^1
= 0.3915047
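The class attendance number can be reproduced with the binomial formula. A short sketch using Python's standard math.comb for "n choose k"; the function name binomial_pdf is made up.

```python
from math import comb, sqrt

def binomial_pdf(k, n, p):
    """Pr(X = k) for X ~ bin(n, p)."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

n, p = 5, 0.85
print(round(binomial_pdf(4, n, p), 7))  # 0.3915047
print(n * p, n * p * (1 - p), sqrt(n * p * (1 - p)))  # E(X), Var(X), StdDev
```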

Page 26:

Why the binomial pdf is correct:

Example:
• 5 Students. Each attends with probability 0.85. What’s the probability of exactly 4 successes?
• There are 5 choose 4 ( = 5!/(4!*1!) = 5 ) possible configurations of students (YYYYN, YYYNY, etc).
• Each configuration has probability 0.85^4 * 0.15^1
  – 0.85^4 = probability of 4 people coming
  – 0.15^1 = probability of 1 person not coming
  (remember the “and” rule for independent events)
• Pr(X = 4) = 5 * 0.85^4 * 0.15^1 = 39% (we’re using the “or” rule here: person 1 doesn’t come or person 2 doesn’t come, or…)
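The argument above can be checked by brute force: list every attendance pattern with exactly 4 students, apply the "and" rule within each pattern, and the "or" rule across patterns. This is a sketch only; the "Y"/"N" encoding follows the slide, and the variable names are assumptions.

```python
from itertools import product

P_ATTEND = 0.85
N_STUDENTS = 5

total = 0.0
configs_with_4 = []
for config in product("YN", repeat=N_STUDENTS):  # e.g. ('Y','Y','Y','Y','N')
    if config.count("Y") != 4:
        continue
    # "and" rule: multiply each student's probability (independent events).
    pr_config = 1.0
    for s in config:
        pr_config *= P_ATTEND if s == "Y" else 1 - P_ATTEND
    configs_with_4.append("".join(config))
    total += pr_config  # "or" rule: the configurations cannot overlap

print(configs_with_4)   # the 5 configurations with exactly 4 attendees
print(round(total, 4))  # 0.3915
```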

Page 27:

“Famous” PDFs

• Poisson: “X~Pois(r)”
  – Setup
    • Let X = number of occurrences of an event in time or space
    • Events are expected to occur at rate r
    • X could possibly be 0,1,2,…
  – PDF:
    • Pr(X = k) = r^k * e^(-r) / k!
    • e is 2.718…
    • Note that 0! = 1
  – Expectation = E(X) = r
    Variance = Var(X) = r
    StdDev = sqrt(Var(X))

One could show why the Poisson PDF is correct, but the math is more involved. If you’re interested, come talk to me sometime.

Page 28:

Example:

• Inspect an experimental rat’s brain for tumorous cells. You expect 10 tumorous cells in 60 mm^3 of brain. What’s the probability that you see either 2 or 3 tumorous cells in 10 mm^3?

• X = tumorous cells found in 10 mm^3 of brain. X~Poisson(5/3) (rate per 60 mm^3 is 10, so rate per 10 mm^3 is 10/6 = 5/3)

• Pr(X = 2 or 3) = Pr(X = 2) + Pr(X = 3) = (5/3)^2 * e^(-5/3) / 2! + (5/3)^3 * e^(-5/3) / 3! = 41%
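The 41% can be reproduced from the Poisson formula. A minimal sketch; the function name poisson_pdf is made up for illustration.

```python
from math import exp, factorial

def poisson_pdf(k, r):
    """Pr(X = k) for X ~ Pois(r)."""
    return r**k * exp(-r) / factorial(k)

rate = 10 / 6  # expected tumorous cells per 10 mm^3
pr_2_or_3 = poisson_pdf(2, rate) + poisson_pdf(3, rate)
print(round(pr_2_or_3, 2))  # 0.41
```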

Page 29:

“Famous” PDFs

• Hypergeometric: “X~Hyp(N,M,n)”
  – Setup
    • There are a total of N items. M are of type A and N-M are of type B. n items are chosen at random without replacement.
    • Let X = number of chosen items that are type A
  – Pr(X = k) = (M choose k)*(N-M choose n-k)/(N choose n)
  – Remember: (n choose k) = number of ways to choose k things from n things = n! / (k! (n-k)!)
    • Note that 0! = 1
  – Note that binomial is like the hypergeometric, but the binomial is with replacement… (which results in a fixed p)

Page 30:

Hypergeometric Example

• Cards: probability of being dealt a flush in hearts in a hand of poker (flush = all cards of same suit)
• X = number of hearts in the hand
• N = 52
• M = 52/4 = 13
• n = 5
• Want Pr(X=5)

(13 choose 5) * (39 choose 0) / (52 choose 5) = 1287 * 1 / 2598960 = 0.0004951981 (NOTE THAT THIS NUMBER IS DIFFERENT FROM WHAT I WROTE ON THE BOARD IN THE CLASS)

What’s the probability of getting a flush in any suit?

• (see minitab:calc:Probability Distributions: Hypergeometric)
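Outside of Minitab, the same number can be computed from the hypergeometric formula. This sketch uses Python's math.comb; the function name is made up, and the "any suit" line assumes a flush here simply means all five cards share a suit (so straight flushes are counted too).

```python
from math import comb

def hypergeometric_pdf(k, N, M, n):
    """Pr(X = k) for X ~ Hyp(N, M, n): k type-A items in n draws without replacement."""
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

pr_heart_flush = hypergeometric_pdf(5, N=52, M=13, n=5)
print(comb(13, 5), comb(52, 5))  # 1287 2598960
print(pr_heart_flush)            # about 0.000495

# The four suits give non-overlapping events, so the "or" rule adds them.
pr_any_flush = 4 * pr_heart_flush
print(pr_any_flush)
```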

Page 31:

• For each of the following:
  – What is the random variable?
  – What is its distribution and what are numbers for its parameters?
  – What is the probability that is being asked for?
  – How can it be computed from the probability distribution function?

Page 32:

More Examples:

• There are 4 security checkpoints. The probability of being searched at any one is 0.2. You may be searched more than once and all searches are independent. What’s the probability of being searched at least one time?

• 50 geese in a flock of 200 are tagged by a wildlife biologist. The next year, 10 geese from the flock are captured. Assume the flock still has 200 geese and no tags are lost. What’s the probability that at least 5 of the recaptured geese have tags?

• Suppose a written test has 5 True/False questions. Passing = at least 3 correct answers and the test can be taken at most 3 times. (Assume no learning occurs between tests if one fails!)
  – If one randomly guesses, what’s the probability of passing?
  – What’s the probability that someone who randomly guesses will eventually pass?

• An overloaded server receives an average of 25 emails per second at 12:00PM. If it receives more than 30 emails in a second, it will crash. What’s the probability of a crash at 12:00PM on a given day (based on the traffic in the previous 1 second)?
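One possible way to set these up in code is sketched below. The choice of distribution for each problem is an assumed reading of the problem, not an official answer key (binomial for the checkpoints and the test, hypergeometric for the tagged birds, Poisson for the server), and the helper names are made up.

```python
from math import comb, exp, factorial

def binomial_pdf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeometric_pdf(k, N, M, n):
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

def poisson_pdf(k, r):
    return r**k * exp(-r) / factorial(k)

# Checkpoints: X ~ bin(4, 0.2); Pr(X >= 1) = 1 - Pr(X = 0).
pr_searched = 1 - binomial_pdf(0, 4, 0.2)

# Tagged birds: X ~ Hyp(200, 50, 10); Pr(X >= 5).
pr_tags = sum(hypergeometric_pdf(k, 200, 50, 10) for k in range(5, 11))

# True/False test: X ~ bin(5, 0.5); pass on one attempt if X >= 3.
pr_pass_once = sum(binomial_pdf(k, 5, 0.5) for k in range(3, 6))
# Up to 3 independent attempts: fail only if all three attempts fail.
pr_pass_eventually = 1 - (1 - pr_pass_once) ** 3

# Server: X ~ Pois(25); Pr(X > 30) = 1 - Pr(X <= 30).
pr_crash = 1 - sum(poisson_pdf(k, 25) for k in range(31))

print(pr_searched, pr_tags, pr_pass_once, pr_pass_eventually, pr_crash)
```

Each "at least" or "more than" probability is computed either as a sum of PDF values or as one minus the complement, whichever needs fewer terms.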