Better than a coin toss

ARE YOU BETTER THAN A COIN TOSS? BY JOHN OLIVER AND RICHARD WARBURTON

description

So you’re a big data and distributed systems “expert”. You’ve collected 500 billion data points and thrown them into sci-lib-of-the-week, you’re using Hadoop backed by those cool AWS GPU instances, you’ve let it grind away for days, and it has spat out the answer to life, the universe and everything. But is it really better than a coin toss? How do you validate whether your data analysis algorithm works? Are you learning a solution to your problems or just the data you already have? What problems can you encounter when analysing your data? How do you solve them, and what can you do easily under the time pressures of a business environment?

Transcript of Better than a coin toss

Page 1: Better than a coin toss

ARE YOU BETTER THAN A COIN TOSS?

BY JOHN OLIVER AND RICHARD WARBURTON

Page 2: Better than a coin toss
Page 3: Better than a coin toss

WHO ARE WE?

Page 4: Better than a coin toss

Why you should care

The Fundamentals

Practical Problems

Applying the Theory

Page 5: Better than a coin toss
Page 6: Better than a coin toss
Page 7: Better than a coin toss
Page 8: Better than a coin toss

"EXPERTS" AREN'T VERY GOOD

Page 9: Better than a coin toss
Page 10: Better than a coin toss

BIG DATA SOLVES ALL KNOWN PROBLEMS

Page 11: Better than a coin toss

BIG DATA SOLVES ALL KNOWN PROBLEMS

... HELPS

Page 12: Better than a coin toss

VALIDATION = TESTS FOR DATA

Page 13: Better than a coin toss

PART 1: FUNDAMENTALS

Page 14: Better than a coin toss

NULL HYPOTHESIS

Until proven otherwise there is no relationship between phenomena

Page 15: Better than a coin toss

WHEN YOU HEAR "WOLF!" THERE IS A WOLF NEARBY

                         Cry "Wolf!"      Stay Quiet
Wolf Nearby              Ok               False Negative
It's really a chicken!   False Positive   Ok

Page 16: Better than a coin toss

WHY IS THIS IMPORTANT?

Page 17: Better than a coin toss

It is better that ten guilty persons escape than that one innocent suffer

- William Blackstone

Page 18: Better than a coin toss
Page 19: Better than a coin toss

STATIC ANALYSIS

Page 20: Better than a coin toss

COST BENEFIT ANALYSIS

Costs a lot to jail an innocent man
Costs very little to show someone an inappropriate house
Credibility, Liberty, Morality are also costs

Page 21: Better than a coin toss

CHOOSE THE RIGHT MEASUREMENT

There's more than one concept of accuracy

Page 22: Better than a coin toss

RECALL

number of true positives / number of actually true values

Page 23: Better than a coin toss

PRECISION

number of true positives / number of predicted true values

Page 24: Better than a coin toss

F MEASURE
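
The slide gives no formula, so here is a minimal sketch of the usual definition: the F measure is the weighted harmonic mean of the precision and recall defined on the previous two slides. The function names, the `beta` parameter and the choice to return 0.0 when a ratio is undefined are assumptions made for this illustration, not something taken from the talk.

```python
def precision(tp: int, fp: int) -> float:
    """True positives divided by everything predicted positive."""
    predicted_positive = tp + fp
    # Assumption: report 0.0 when nothing was predicted positive.
    return tp / predicted_positive if predicted_positive else 0.0


def recall(tp: int, fn: int) -> float:
    """True positives divided by everything that is actually positive."""
    actual_positive = tp + fn
    # Assumption: report 0.0 when there are no actual positives.
    return tp / actual_positive if actual_positive else 0.0


def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (beta=1 gives F1)."""
    p = precision(tp, fp)
    r = recall(tp, fn)
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)
```

With `beta` above 1 recall counts for more, and with `beta` below 1 precision counts for more, which is one way to encode the cost-benefit trade-off discussed earlier.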

Page 25: Better than a coin toss

CASE STUDY: MEMORY LEAKS

About 10% of our dataset had memory leaks

Predict "never leaks memory" ~= 0.9 accuracy, but F1 = 0

Our algorithm ~= 0.9 accuracy and F1 ~= 0.9
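
To see why the "never leaks memory" predictor scores well on accuracy and still has F1 = 0, here is a small sketch on made-up labels. The 100-item dataset with 10 leaks is hypothetical and only mirrors the 10% figure above; it is not the authors' data.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fp, fn, tn) for two equal-length lists of booleans."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    fn = sum(a and (not p) for a, p in zip(actual, predicted))
    tn = sum((not a) and (not p) for a, p in zip(actual, predicted))
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def f1(tp, fp, fn):
    # Equivalent to the harmonic mean of precision and recall.
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical data: 10 leaking programs out of 100.
actual = [True] * 10 + [False] * 90
never_leaks = [False] * 100  # the trivial predictor

tp, fp, fn, tn = confusion_counts(actual, never_leaks)
baseline_accuracy = accuracy(tp, fp, fn, tn)  # 0.9
baseline_f1 = f1(tp, fp, fn)                  # 0.0
```

The trivial predictor never finds a single leak, so it has no true positives, and F1 drops to zero even though 90% of its answers are correct.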

Page 26: Better than a coin toss

PROBLEM: RELIABILITY OF MEASUREMENT

Page 27: Better than a coin toss

RULE OF THUMB

If it looks like random noise, it probably is random noise.

Page 28: Better than a coin toss

SOLUTION: CHECK YOUR DATA

Low Standard Deviation

Coefficient of Variation = Standard Deviation / Mean
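
A minimal sketch of the coefficient of variation using only the standard library. The function name is made up for this example, and using the sample standard deviation (rather than the population one) is an assumption the slide does not settle.

```python
import statistics


def coefficient_of_variation(samples):
    """Standard deviation divided by mean, as on the slide."""
    mean = statistics.mean(samples)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return statistics.stdev(samples) / mean


# Hypothetical repeated timings of the same benchmark, in milliseconds.
timings_ms = [10, 12, 11, 9, 13]
cv = coefficient_of_variation(timings_ms)
```

Because the result is a ratio, it can be compared across measurements with different units or scales, which a raw standard deviation cannot.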

Page 29: Better than a coin toss

CAVEAT: NON-NORMAL DISTRIBUTIONS

Page 30: Better than a coin toss
Page 31: Better than a coin toss

SOLUTION: GO MAD

Page 32: Better than a coin toss

MEDIAN ABSOLUTE DEVIATION
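
As a sketch of what the median absolute deviation computes: take the median, measure how far each sample is from it, then take the median of those distances. The function name and the sample values are illustrative assumptions.

```python
import statistics


def median_absolute_deviation(samples):
    """Median of the absolute distances from the median."""
    samples = list(samples)
    centre = statistics.median(samples)
    return statistics.median(abs(x - centre) for x in samples)


# Hypothetical measurements with one wild outlier.
readings = [1, 2, 3, 4, 100]
mad = median_absolute_deviation(readings)  # 1
spread = statistics.stdev(readings)        # dominated by the outlier
```

The single outlier inflates the standard deviation but leaves the MAD almost untouched, which is why it suits the non-normal data mentioned in the caveat.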

Page 33: Better than a coin toss

PROBLEM: EXPERIMENTAL FLUKES

Page 34: Better than a coin toss

IS YOUR A/B TEST A HEISEN TEST?

Page 35: Better than a coin toss

SOLUTION: P-VALUE
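
One way to make the p-value concrete, in keeping with the title of the talk, is an exact one-sided binomial test against a fair coin. This is a sketch under the assumption that each trial is an independent yes/no outcome; the function name is invented here, and the talk does not say which test the authors used.

```python
from math import comb


def p_value_vs_coin_toss(successes: int, trials: int) -> float:
    """Probability that a fair coin gets at least `successes` out of `trials`.

    This is the one-sided p-value under the null hypothesis that the
    system being tested is no better than tossing a coin.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return tail / 2 ** trials


# Hypothetical result: 60 correct predictions out of 100.
p = p_value_vs_coin_toss(60, 100)  # about 0.028
```

A small p-value says that a coin would rarely do this well by luck. It does not say how much better than a coin the system is, so it is best read alongside the measures from the earlier slides.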

Page 36: Better than a coin toss

SCIENCE WORKS - B****ES!

Page 37: Better than a coin toss

PART 2: PRACTICAL PROBLEMS

Page 38: Better than a coin toss

PROBLEM: FALSE PROPHETS

Page 39: Better than a coin toss

I'M AN EXPERT, LISTEN TO ME!

Page 40: Better than a coin toss

SOLUTION: ESTABLISH GOALS AND HYPOTHESIS THEN TEST SOLUTIONS

Page 41: Better than a coin toss

PROBLEM: CODE QUALITY

The math works :-) the code does not :-(

@headinthebox

Page 42: Better than a coin toss

GROWTH IN A TIME OF DEBT

Page 43: Better than a coin toss

SOLUTION: SOFTWARE ENGINEERING PRACTICES

Page 44: Better than a coin toss

Everyone Lies

- House

Page 45: Better than a coin toss

SOLUTION: UNDERSTAND BIASES AND DESIGN AROUND THEM

Page 46: Better than a coin toss

Gay couples should have an equal right to get married, not just to have civil partnerships

Populus: 65% vs 27%

Marriage should continue to be defined as a life-long exclusive commitment between a man and a woman

Comres + Catholic Voices: 22% vs 70%

Page 47: Better than a coin toss

ACQUIESCENCE BIAS

Answer yes if there’s a positive connotation

Page 48: Better than a coin toss

REMOVAL OF PARTICULAR ADVERTISING AND SPONSORSHIP BANS

FOR: 1045 AGAINST: 731 ABSTAIN: 121 Motion Carried

MAINTAINING AN ETHICAL UNION BY REAFFIRMING ADVERTISING AND SPONSORSHIP BANS

FOR: 858 AGAINST: 755 ABSTAIN: 166 Motion Carried

Page 49: Better than a coin toss

SOLUTION: PHRASE QUESTIONS NEUTRALLY

And only have one question

Page 50: Better than a coin toss

SOCIAL DESIRABILITY

Poor people overestimate their income, rich people underestimate it.

Page 51: Better than a coin toss

SOLUTIONS

Anonymisation
Confidentiality
Randomized Response
Bogus Pipeline
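
Of the techniques listed, randomized response is the easiest to show in code. The sketch below assumes one common "forced yes" design, which the slides do not specify: each respondent privately flips a coin, answers truthfully on heads and says "yes" regardless on tails. No individual answer is revealing, yet the true rate can still be recovered from the totals. The function and parameter names are made up for this example.

```python
def estimate_true_rate(yes_answers: int, respondents: int,
                       p_truthful: float = 0.5) -> float:
    """Estimate the true proportion of 'yes' under a forced-yes design.

    With probability `p_truthful` a respondent answers honestly, and
    otherwise answers 'yes', so:
        observed_yes = p_truthful * true_rate + (1 - p_truthful)
    Solving for true_rate gives the estimator below.
    """
    if respondents <= 0:
        raise ValueError("need at least one respondent")
    if not 0 < p_truthful <= 1:
        raise ValueError("p_truthful must be in (0, 1]")
    observed_yes = yes_answers / respondents
    estimate = (observed_yes - (1 - p_truthful)) / p_truthful
    # Sampling noise can push the raw estimate outside [0, 1]; clamp it.
    return min(1.0, max(0.0, estimate))


# Hypothetical survey: 600 of 1000 respondents said "yes".
rate = estimate_true_rate(600, 1000)  # roughly 0.2
```

The price of the extra privacy is variance: half the answers carry no information, so the survey needs more respondents for the same precision.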

Page 52: Better than a coin toss

BIAS TOWARDS THE FIRST ANSWER OF A QUESTION

Make sure to randomise the order of answers

Page 53: Better than a coin toss

WHAT WILL THE NEXT CRISIS IN WASHINGTON BE?

Fight over the debt ceiling
Difficulty averting automatic cuts to the Pentagon
Failure to pass basic budget bills
All of the above

http://www.foxnews.com/politics/elections/2012/you-decide/what-will-next-crisis-washington-be

Page 54: Better than a coin toss

PROBLEM: CORRELATION DOESN’T IMPLY CAUSALITY

Page 55: Better than a coin toss

DATABASE AND NETWORK ACTIVITY CORRELATING

Performance Diagnosis: was actually a GC Problem.

Page 56: Better than a coin toss

SOLUTION: DOMAIN KNOWLEDGE

Page 57: Better than a coin toss
Page 58: Better than a coin toss

SOLUTIONS

Use domain knowledge - ask Pilots
Stratified sample sets
Measure outcomes - are planes surviving more?

Page 59: Better than a coin toss

BE RIGOROUS

Page 60: Better than a coin toss

PART 3: APPLYING THE THEORY

Page 61: Better than a coin toss

CORRELATION

A MEASURE OF THE STRENGTH OF DEPENDENCE BETWEEN TWO VARIABLES

Page 62: Better than a coin toss

PEARSON CORRELATION

Err... Just look it up

(Assumes linear relationship)

Page 63: Better than a coin toss

Range   Strength
<0.4    Weak/No Correlation
<0.7    Some Correlation
>0.7    Strong Correlation
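
For readers who would rather not look it up, here is a minimal sketch of the Pearson coefficient together with the rule-of-thumb bands from the table above. Two choices are assumptions of this example rather than anything stated on the slides: the bands are applied to the absolute value so that negative correlations are classified too, and a value of exactly 0.7 is treated as strong.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)


def strength(r: float) -> str:
    """Map a coefficient onto the rule-of-thumb bands."""
    magnitude = abs(r)  # assumption: sign is ignored when judging strength
    if magnitude < 0.4:
        return "weak/no correlation"
    if magnitude < 0.7:
        return "some correlation"
    return "strong correlation"
```

Keep the caveat from the previous slide in mind: the coefficient only captures linear relationships, so a low value does not rule out a strong non-linear one.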

Page 64: Better than a coin toss

CASE STUDY: PERFORMANCE PROBLEM WITH HIGH SYSTEM TIME

Hypothesis: caused by Disk I/O

Page 65: Better than a coin toss

Correlation Strength: 0.78453

Page 66: Better than a coin toss

MACHINE LEARNING

Application of statistics to learn a relationship

Page 67: Better than a coin toss

HOW MANY CLUSTERS?

Page 68: Better than a coin toss

WHERE'S THE ELBOW?

Page 69: Better than a coin toss

FITTING

Page 70: Better than a coin toss

FITTING

Page 71: Better than a coin toss

SOLUTION: CROSS VALIDATION
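
A sketch of the splitting step behind k-fold cross validation, assuming the samples are independent so that a random shuffle is appropriate. Only the index bookkeeping is shown; training and scoring a model on each split is left out. The function name and the `seed` parameter are invented for this example.

```python
import random


def k_fold_indices(n_samples: int, k: int, seed: int = 0):
    """Yield (training_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i
                    for idx in fold]
        yield training, held_out


splits = list(k_fold_indices(10, 5))
```

Every sample is held out exactly once, so the average validation error over the folds uses all of the data without ever scoring a model on something it was trained on.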

Page 72: Better than a coin toss
Page 73: Better than a coin toss

CHOOSE CROSS VALIDATION DATA WISELY

Page 74: Better than a coin toss

SELF VALIDATING

Ensemble methods - Train lots of weak classifiers and merge

Page 75: Better than a coin toss

RANDOM FOREST AND BAGGING

Divide the data into bootstrap sets

Use the rest for calculating error
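
The two lines above describe what is usually called the out-of-bag estimate. A minimal sketch of the sampling step follows, with the model training itself omitted; the function name is an assumption of this example.

```python
import random


def bootstrap_split(n_samples: int, rng: random.Random):
    """Draw one bootstrap set and return it with its out-of-bag indices."""
    # Sample n indices with replacement: some repeat, some never appear.
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    # Everything that was never drawn is "the rest", used to measure error.
    out_of_bag = [i for i in range(n_samples) if i not in chosen]
    return in_bag, out_of_bag


rng = random.Random(42)
in_bag, out_of_bag = bootstrap_split(100, rng)
```

For large datasets roughly 37% of the samples (about 1/e) end up out of bag for any one bootstrap set, so each weak classifier gets a sizeable validation set for free.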

Page 76: Better than a coin toss

LEARNING CURVES

Page 77: Better than a coin toss
Page 78: Better than a coin toss
Page 79: Better than a coin toss

HOW MUCH IS TOO MUCH?

Page 80: Better than a coin toss
Page 81: Better than a coin toss
Page 82: Better than a coin toss
Page 83: Better than a coin toss
Page 84: Better than a coin toss

MONITOR PRODUCTION DATA... IT CHANGES

Does it look like the same data that you learnt with?
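
One possible way to ask that question in code is the two-sample Kolmogorov-Smirnov statistic: the largest gap between the empirical distributions of a feature in the training data and in production. This is a technique swapped in for illustration, not one named in the talk, and the function name is hypothetical.

```python
from bisect import bisect_right


def ks_statistic(training, production) -> float:
    """Largest gap between the two empirical CDFs, from 0.0 to 1.0."""
    a = sorted(training)
    b = sorted(production)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The CDFs are step functions, so the largest gap occurs at a data point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap
```

A value near 0 suggests production still resembles the training data, and a value near 1 suggests it has drifted. Where to set the alert threshold depends on sample sizes and on how costly a stale model is, so it is left open here.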

Page 85: Better than a coin toss

A/B TEST NEW SYSTEMS

Satisfaction/Profit/Traffic...

Page 86: Better than a coin toss

COMMON THREADS

Training set errors are misleading
Cross Validation, Production Monitored Values are the ones that really matter
Visualise and compare these errors

Page 87: Better than a coin toss

CONCLUSION

Analytics are increasingly important
Wide variety of statistical and practical tips to get them right
Have fun and Best of luck!

Page 88: Better than a coin toss

@johno_oliver @RichardWarburto

QUESTIONS?

http://insightfullogic.com