How A/B Tests Lie to Us and How to Drive Genuine Improvement

SIX PUZZLES: How A/B tests lie to us, and how to drive genuine improvement. Jason Cohen, founder & CTO, http://wpengine.com


The first puzzle

654 out of 24,483 visits 85% chance-to-beat

What do stats really tell us?

654 out of 24,483 visits 85% chance-to-beat

Finding the Cheaters

105 athletes. 100 are clean. 5 are juicing. Test is 95% accurate.

Finding the Cheaters

                             IN REALITY
TEST RESULT (95% accuracy)   Clean (100)   Juicing (5)
Clean                        95            0
Juicing                      5             5
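The athlete example above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming "95% accurate" means the test is right 95% of the time for clean athletes and for juicing athletes alike; the function name `flagged_breakdown` is made up for illustration. It works with expected counts, so the juicers caught come out as 4.75 rather than the rounded 5 on the slide.

```python
def flagged_breakdown(clean, juicing, accuracy):
    """Expected counts among athletes the test flags as juicing.

    Assumption: the same accuracy applies to clean and juicing athletes.
    """
    false_positives = clean * (1 - accuracy)  # clean athletes wrongly flagged
    true_positives = juicing * accuracy       # juicers correctly flagged
    flagged = false_positives + true_positives
    return false_positives, true_positives, true_positives / flagged


fp, tp, share = flagged_breakdown(clean=100, juicing=5, accuracy=0.95)
print(f"clean athletes flagged as juicing: {fp:.2f}")
print(f"juicing athletes flagged as juicing: {tp:.2f}")
print(f"share of flagged athletes who really juice: {share:.1%}")
```

Roughly half of the athletes the test flags are clean, even though the test is "95% accurate". That is the sense in which a rare event gets lost in the noise.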

Insight: Rare events are indistinguishable from random noise

758 out of 24,483 visits 85% chance-to-beat

Easy significance testing with hamsters

Trust in the Hamster.

bit.ly/abhamster

Easy significance testing

N = A + B = total # of conversions.

D = (A - B)/2 = half the difference.

Significant if D² > N.

bit.ly/abhamster

N = 758 + 685 = 1443

D = (758 - 685) / 2 = 36.5 ≈ 37

D² = 1369 > 1443?

No! Not significant.
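The hamster rule is simple enough to put in a function. The sketch below follows the slide's definitions of N and D; the name `hamster_significant` is made up here, and the rule is assumed to apply when both variations received about the same traffic. It uses the unrounded D = 36.5, which gives the same verdict as the slide's rounded 37.

```python
def hamster_significant(a, b):
    """Apply the rule from the slides: significant if D^2 > N.

    a, b: conversion counts for the two variations
    (assumed to have received roughly equal traffic).
    """
    n = a + b        # total number of conversions
    d = (a - b) / 2  # half the difference
    return d * d > n


a, b = 758, 685
d = (a - b) / 2
print(f"N = {a + b}, D = {d}, D^2 = {d * d}")
print("Significant" if hamster_significant(a, b) else "Not significant")
```

For equal traffic, D² > N is the same as (A - B)² / (A + B) > 4, which is close to the usual chi-squared cutoff of 3.84 for 95% confidence with one degree of freedom.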

Final Result: As predicted

Wasted another month

[Chart: OtherInbox Splash Page, plotted from 12/2/09 to 11/8/11; vertical axis ticks from 30 to 70]

Lesson #1 Beware the Rare

Lesson #2 Trust the Hamster

When statistically-significant rodentia still aren’t enough...

41 Shades of Blue


Does this...

...get more clicks than this?

2 Shades of Blue

with 1 winner at a 95% confidence level,

chance of a false-positive: 5%

3 Shades of Blue

with 1 winner at a 95% confidence level,

chance of a false-positive: 9%

10 Shades of Blue

with 1 winner at a 95% confidence level,

chance of a false-positive: 40%

41 Shades of Blue

with 1 winner at a 95% confidence level,

chance of a false-positive: 88%
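The growth in false positives can be reproduced with one formula. This is a rough sketch, assuming every comparison is independent and that no shade is truly better than another; the function name `chance_of_false_positive` is made up for illustration. Under those assumptions, the chance that at least one of m comparisons "wins" by luck at a 95% confidence level is 1 - 0.95^m.

```python
def chance_of_false_positive(comparisons, confidence=0.95):
    """Chance that at least one comparison is a false positive.

    Assumptions: comparisons are independent and there is
    no real difference between any of the variations.
    """
    return 1 - confidence ** comparisons


for m in (1, 2, 10, 41):
    rate = chance_of_false_positive(m)
    print(f"{m:>2} comparisons: {rate:.1%} chance of a false positive")
```

The slide's figures are close to what this formula gives when the number of comparisons is taken as the number of shades, or one fewer.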

The False-Positive Guarantee

Oops, you’re doing this.

Insight: Real experiments test theories.

Lesson #3 Test theories, not headlines

Poking around your analytics

Bible Code!


Edwards v. Aguillard

War and Peace

GA reveals ideas, not facts

Lesson #4 Exploration yields new theories, not new truth.

A Unifying Concept

Headwinds to improvement

significance → huge N → N/A

high failure rate → many tests → false-positives

rare events → worse on both!

Summary: Subtlety is the Enemy

Insight: You don’t want “Subtle” anyway!

Lesson #5 Seek Huge Effects
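The hamster rule from earlier (significant if D² > N) shows why subtle effects need a huge N. The sketch below is an illustration, not a power calculation: it assumes the control gets c conversions and the variation gets exactly c × (1 + r), where r is the relative lift, with equal traffic to both. Then N = c(2 + r) and D = c·r/2, so D² > N requires c > 4(2 + r)/r². The name `min_control_conversions` is made up here.

```python
def min_control_conversions(relative_lift):
    """Control conversions needed before the hamster rule calls a lift significant.

    Assumes the variation converts exactly (1 + relative_lift) times as often
    as the control, with equal traffic to both. From D^2 > N with
    N = c * (2 + r) and D = c * r / 2, this gives c > 4 * (2 + r) / r^2.
    """
    r = relative_lift
    return 4 * (2 + r) / (r * r)


for lift in (0.05, 0.10, 0.50, 1.00):
    needed = min_control_conversions(lift)
    print(f"{lift:.0%} lift: more than {needed:,.0f} control conversions")
```

Under these assumptions a 5% lift needs thousands of conversions per variation, while a doubling needs only about a dozen.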

Can you A/B test your way to big results?

1,116

250

4,100

1,151

Individuals: 67% off

Group: 3% off

Wisdom of the Crowd

“I’m as clueless as you, but in the equal and opposite direction.”

What’s the funniest joke?

And the winner is…

Jokes, Pizza, Design, Features, Voice, Values, Culture

Lesson #6 Crowds find objective truth but kill creativity and greatness

Think for yourself. Test theories. Seek big effects.

Jason Cohen, founder & CTO http://wpengine.com


Q & A

Jason Cohen, founder & CTO

http://wpengine.com