Random Thoughts 2012 (COMP 066)

Jan-Michael Frahm, Jared Heinly


Page 1: Random Thoughts 2012 (COMP 066)

Random Thoughts 2012 (COMP 066)

Jan-Michael Frahm, Jared Heinly

Page 2: Random Thoughts 2012 (COMP 066)

Values to Summarize Data

• Mean (EXCEL: AVERAGE(<range>))
• Can informally be seen as the middle of the data
• Be careful: summary values do not always tell the whole story; outliers influence the mean (significantly)
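To see how strongly a single outlier can pull the mean, here is a minimal Python sketch. The salary figures are invented for illustration and are not from the course; `statistics.mean` plays the role of Excel's AVERAGE.

```python
from statistics import mean

# Hypothetical salaries in thousands of dollars (made-up illustration data).
salaries = [40, 42, 45, 47, 50]

# The same data with one extreme value added.
salaries_with_outlier = salaries + [500]

mean_without = mean(salaries)
mean_with = mean(salaries_with_outlier)

print("mean without outlier:", mean_without)
print("mean with outlier:   ", mean_with)
```

One added value more than doubles the mean here, even though five of the six values are unchanged.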

Page 3: Random Thoughts 2012 (COMP 066)

Median

• Median (EXCEL: MEDIAN(<range>))
1) Order the data from smallest to largest
2) If the dataset has an odd number of values, the median is the one in the middle. If it has an even number of values, the median is the average of the middle two
• Which measure should be used, mean or median? Reporting both is never a problem
• Always ask for the other if given only one
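The two median steps above can be sketched in Python as follows. The function name `median_by_steps` and the sample data are hypothetical; the result is compared against the standard library's `statistics.median`.

```python
from statistics import median

def median_by_steps(data):
    """Median following the two steps: sort, then take the middle."""
    ordered = sorted(data)          # step 1: order smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                  # odd count: the single middle value
        return ordered[mid]
    # even count: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

# Made-up data with one outlier; the median barely notices it.
sample = [40, 42, 45, 47, 50, 500]
print(median_by_steps(sample), median(sample))
```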

Page 4: Random Thoughts 2012 (COMP 066)

Measure of Variability

• Standard deviation (EXCEL: STDEV.S(<range>))
1. Find the average of the data
2. Subtract the average from each data value
3. Square the differences
4. Divide the sum of squares by the number of data values minus one (this is also called the variance)
5. Take the square root of the variance
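The five steps above translate directly into code. This is a minimal sketch with a hypothetical function name and made-up data; it is checked against `statistics.stdev`, which also divides by n - 1.

```python
from math import sqrt
from statistics import stdev

def sample_std(data):
    """Sample standard deviation, following the five steps."""
    n = len(data)
    average = sum(data) / n                       # step 1
    differences = [x - average for x in data]     # step 2
    squares = [d * d for d in differences]        # step 3
    variance = sum(squares) / (n - 1)             # step 4
    return sqrt(variance)                         # step 5

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustration data, not from the slides
print(sample_std(data), stdev(data))
```

A dataset whose values are all equal gives 0, the smallest possible standard deviation.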

Page 5: Random Thoughts 2012 (COMP 066)

Standard Deviation Properties

• Can never be negative
• Smallest possible value is 0
• Affected by outliers
• Same unit as original data

Page 6: Random Thoughts 2012 (COMP 066)

Percentile

• k-th percentile
1. Order all numbers in the dataset
2. Multiply k percent times the number of data points n; round up if not a whole number
3. Find the value at the position computed in step 2. If step 2 gave a whole number without rounding, the k-th percentile is the average of that value and the next one; otherwise it is that value itself
• Median is the 50th percentile
• A percentile is not a percent; it is a number that is a certain percentage of the way through the dataset
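Here is a sketch of the percentile steps in Python, under one assumed reading: when k percent of n is not a whole number, round up and take the value at that position; when it is a whole number, average that value with the next one. This reading is chosen because it makes the 50th percentile agree with the median for both odd and even n. The function name is hypothetical, and k is assumed to be an integer with 0 < k < 100.

```python
def percentile(data, k):
    """k-th percentile for integer k with 0 < k < 100 (assumed reading)."""
    if not 0 < k < 100:
        raise ValueError("k must be strictly between 0 and 100")
    ordered = sorted(data)                   # step 1
    n = len(ordered)
    # step 2 in integer arithmetic: k% of n = whole part + remainder/100
    whole, remainder = divmod(k * n, 100)
    if remainder:
        # not a whole number: round up to position whole + 1 (1-based),
        # which is index `whole` in a 0-based list
        return ordered[whole]
    # whole number: average the value at that position and the next one
    return (ordered[whole - 1] + ordered[whole]) / 2

print(percentile([1, 2, 3, 4, 5], 50))
print(percentile([1, 2, 3, 4], 50))
```

Excel's PERCENTILE functions and other software use interpolation rules that can give slightly different answers on small datasets.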

Page 7: Random Thoughts 2012 (COMP 066)

Coincidences

• Recall the bet that two people in the room have the same birthday
• Was it a bad bet to make?
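Whether the bet is a good one depends on the size of the room. A minimal sketch, assuming 365 equally likely birthdays and independent people (leap days and seasonal effects are ignored); the function names are hypothetical.

```python
def prob_shared_birthday(people, days=365):
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        # the (i+1)-th person must avoid the i birthdays already taken
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

def smallest_group(target=0.5, days=365):
    """Smallest group size whose match probability reaches `target`."""
    n = 1
    while prob_shared_birthday(n, days) < target:
        n += 1
    return n

print(smallest_group())
print(prob_shared_birthday(40))
```

Under these assumptions the bet is better than even once the room holds 23 people.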

Page 8: Random Thoughts 2012 (COMP 066)

Coincidences

• Johnny Carson example from Paulos' book: In order to have a 50% probability of someone in the room having a particular birthday, you need 253 people.
• Does this make sense?
• Wouldn’t you need only 50% of 366 people, which is 183?
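The 253 figure can be checked numerically. This sketch assumes 365 equally likely birthdays and independent people; the function names are hypothetical. The key point is that people in the room can share birthdays with each other, so 183 people do not cover 183 distinct dates.

```python
def prob_someone_has_date(people, days=365):
    """Probability that at least one person has one particular birthday."""
    # each person misses the date with probability (days - 1) / days
    return 1.0 - ((days - 1) / days) ** people

def people_needed(target=0.5, days=365):
    """Smallest room size that reaches the target probability."""
    n = 1
    while prob_someone_has_date(n, days) < target:
        n += 1
    return n

print(people_needed())
print(prob_someone_has_date(183))
```

With 183 people the probability is only about 39%, not 50%.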

Page 9: Random Thoughts 2012 (COMP 066)

Coincidences

• 1000 letters, 1000 mailboxes, random assignment
• Probability of at least 1 getting to correct destination
• Why is it 63%?

Page 10: Random Thoughts 2012 (COMP 066)

Coincidences

• 1000 letters, 1000 random addresses (allowing duplicates), 1000 mailboxes, random assignment
• Probability of at least 1 getting to correct destination
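One way to read this variant, stated here as an assumption: each letter independently has a 1 in 1000 chance of landing in the mailbox that matches its address. Under that reading the letters do not constrain each other, and the probability has a simple closed form. The function name is hypothetical.

```python
def prob_at_least_one_hit(n):
    """P(at least one of n independent letters lands correctly),
    assuming each letter is correct with probability 1/n."""
    p_all_miss = (1.0 - 1.0 / n) ** n
    return 1.0 - p_all_miss

print(prob_at_least_one_hit(1000))
```

The answer is again close to 63%, because (1 - 1/n)^n approaches 1/e as n grows.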

Page 11: Random Thoughts 2012 (COMP 066)

Coincidences

• 1000 letters, 1000 mailboxes, random assignment
• Probability of at least 1 getting to correct destination
• Why is it 63%?
• Derangement: a permutation in which no element appears in its original position
  Complex calculation, but as the number of elements increases, the probability of at least one match approaches 1 - 1/e ≈ 63%
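The exact value can be computed with the standard inclusion-exclusion formula for derangements: the probability that no letter is correct is the sum of (-1)^k / k! for k from 0 to n. A minimal sketch with hypothetical function names:

```python
import math

def prob_no_letter_correct(n):
    """P(random permutation of n items is a derangement)."""
    total = 0.0
    term = 1.0                    # (-1)^k / k! for k = 0
    for k in range(n + 1):
        total += term
        term *= -1.0 / (k + 1)    # next term of the alternating series
    return total

def prob_at_least_one_correct(n):
    return 1.0 - prob_no_letter_correct(n)

for n in (2, 3, 10, 1000):
    print(n, prob_at_least_one_correct(n))
print("1 - 1/e =", 1 - math.exp(-1))
```

The series converges so quickly that the result is already within a fraction of a percent of 1 - 1/e for 10 letters.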

Page 12: Random Thoughts 2012 (COMP 066)

Pigeonhole Principle

• If n items are put into m pigeonholes with n > m, at least one pigeonhole must have more than 1 item

Source: http://en.wikipedia.org/wiki/File:TooManyPigeons.jpg

Page 13: Random Thoughts 2012 (COMP 066)

Pigeonhole Principle

• 1.54 million people in Philadelphia
• At most 500,000 hairs are on a person’s head
• What is the minimum number of people that have the same number of hairs on their head?
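The generalized pigeonhole principle answers this: with n items in m holes, some hole holds at least ceil(n / m) items. A sketch using the slide's numbers, assuming hair counts range over 0 to 500,000 (so 500,001 possible values); the function name is hypothetical.

```python
def guaranteed_in_one_hole(items, holes):
    """Minimum size of the fullest pigeonhole: ceil(items / holes)."""
    return -(-items // holes)     # ceiling division with integers

people = 1_540_000                # population figure from the slide
hair_counts = 500_000 + 1         # possible counts 0..500,000 (assumption)

same_count = guaranteed_in_one_hole(people, hair_counts)
print(same_count)
```

So at least 4 people in the city must have exactly the same number of hairs.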

Page 14: Random Thoughts 2012 (COMP 066)

Chance Encounters

• Probability that two people from the USA know someone in common, i.e., they are “linked” via one person
• Assumption: there are 300 million people in the USA
• Assumption: each person knows 1500 other people
• Probability that two people from the USA are linked via 2 individuals
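A rough estimate under the slide's two assumptions, plus one more that is not on the slide: acquaintances are spread uniformly at random over the population and do not overlap. Real social circles are clustered, so this is a sketch rather than a realistic model. Function names are hypothetical.

```python
POPULATION = 300_000_000   # from the slide
KNOWN = 1500               # from the slide

def prob_common_acquaintance(population=POPULATION, known=KNOWN):
    """P(A and B share at least one acquaintance), random-mixing model."""
    # each of B's acquaintances is one of A's with probability known/population
    p_single = known / population
    return 1.0 - (1.0 - p_single) ** known

def prob_linked_via_two(population=POPULATION, known=KNOWN):
    """P(some acquaintance of B is a friend-of-a-friend of A)."""
    reach = known * known          # A's friends of friends, ignoring overlap
    p_single = reach / population
    return 1.0 - (1.0 - p_single) ** known

print(prob_common_acquaintance())
print(prob_linked_via_two())
```

Under these assumptions a link through one person is unlikely (under 1%), while a link through two people is almost certain.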

Page 15: Random Thoughts 2012 (COMP 066)

Degrees of Separation

• Six degrees of separation
  There are on average 6 links between any 2 people on earth
• Six degrees of Kevin Bacon, Bacon number
  Determine the number of links (movies acted in) between a random actor and Kevin Bacon
• Assume 2 million actors
• Assume each actor has acted with 80 others
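A back-of-the-envelope estimate for the Bacon number under the slide's assumptions: if every actor brings 80 new co-stars (no overlap, which is an added simplification), count how many links it takes before the reachable set covers all 2 million actors. The function name is hypothetical.

```python
def links_needed(population=2_000_000, contacts=80):
    """Links required for the reachable set to cover the population,
    assuming each person adds `contacts` previously unreached people."""
    reached = 1        # the starting actor
    frontier = 1       # people first reached at the current distance
    links = 0
    while reached < population:
        frontier *= contacts
        reached += frontier
        links += 1
    return links

print(links_needed())
```

Because real co-star sets overlap heavily, this count is an optimistic lower estimate.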

Page 16: Random Thoughts 2012 (COMP 066)

Expected Value

• Expected value = Σ (probability of event * value of event)
• Ex: pay $1 to play a game, 10% chance of winning $5, 40% chance of winning $1
• Expected Value = -1 + 0.1 * 5 + 0.4 * 1 = -$0.10
• Ex: Dice game
  Keep earning points until you roll a 1. When does your expected value of points stop increasing?
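Both examples can be worked through in a few lines. The first part reproduces the $1 game. The second part needs rules the slide does not spell out, so the code assumes Pig-style rules: rolling 2 to 6 adds the face value to your points, and rolling a 1 loses everything accumulated so far. Function names are hypothetical; `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

def expected_value(outcomes, cost=0):
    """Sum of probability * value over all outcomes, minus the entry cost."""
    return sum(p * v for p, v in outcomes) - cost

# Pay $1; 10% chance of winning $5, 40% chance of winning $1.
game = [(Fraction(1, 10), 5), (Fraction(4, 10), 1)]
ev_game = expected_value(game, cost=1)

def expected_gain_next_roll(points):
    """Expected change in points from one more roll (assumed Pig-style rules)."""
    gain = sum(Fraction(face, 6) for face in range(2, 7))  # faces 2..6
    loss = Fraction(points, 6)                              # a 1 loses it all
    return gain - loss

def stop_threshold():
    """Smallest point total at which another roll no longer helps on average."""
    points = 0
    while expected_gain_next_roll(points) > 0:
        points += 1
    return points

print(float(ev_game))
print(stop_threshold())
```

Under the assumed rules, rolling again stops paying off on average once you hold 20 points.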

Page 17: Random Thoughts 2012 (COMP 066)

Blood Testing

• 1% of people have disease
• Need to test 100 samples of blood
• Probability that all samples are healthy
• What if we pool the blood into 2 sets of 50 each and then test? What is the expected number of tests?
• Can we do better?
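These questions can be explored numerically. The sketch assumes samples are independent, each diseased with probability 1%, and that when a pooled test comes back positive every sample in that pool is then retested individually; that retest rule is an assumption, not something the slide states. Function names are hypothetical.

```python
def prob_all_healthy(samples, rate=0.01):
    """Probability that none of `samples` independent samples is diseased."""
    return (1.0 - rate) ** samples

def expected_tests(total, pool_size, rate=0.01):
    """Expected tests when `total` samples are split into equal pools."""
    if total % pool_size:
        raise ValueError("pool_size must divide total evenly")
    pools = total // pool_size
    p_pool_positive = 1.0 - prob_all_healthy(pool_size, rate)
    # one test per pool, plus pool_size retests whenever the pool is positive
    return pools * (1 + pool_size * p_pool_positive)

print(prob_all_healthy(100))
print(expected_tests(100, 50))

# Try every pool size (of at least 2) that divides 100 evenly.
candidates = [g for g in range(2, 101) if 100 % g == 0]
best = min(candidates, key=lambda g: expected_tests(100, g))
print(best, expected_tests(100, best))
```

Two pools of 50 already cut the expected work to about 41.5 tests, and pools of 10 bring it below 20.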