Power of Small Data

www.globalbigdataconference.com Twitter : @bigdataconf


Intended for Knowledge Sharing only

Disclaimer: Participation in this summit is purely on a personal basis and does not represent VISA in any form or manner. The talk is based on learnings from work across industries and firms. Care has been taken to ensure that no proprietary or work-related information of any firm is used in any material.


Quick recap of what it is

https://imgflip.com/memegenerator/


What love is to the soul, actionability is to analytics: the reason for it all!

WHAT IS IT AFTER ALL??


• Specific answer to the question
• Easy to understand
• Timely & available (whenever, wherever & however needed)
• Trustworthy & reliable
• Scalable & repeatable

…seems like ‘data size’ doesn’t matter, so why did we end up with Big data?


Yeah, why so much emphasis on more data?

ABILITY TO CHECK MORE HYPOTHESES…

Additional data from across the board increases the chances of testing more hypotheses, taking us closer to causality…

DATA SOURCES • TYPES OF INSIGHTS
TRANSACTION DATA • Are overall transactions going up or down? Where are the transactions happening?
CLICK STREAM DATA (MOBILE & WEB) • How are consumers interacting with the website/app (drop-offs, clicks, time spent, etc.)?
SENTIMENT/SOCIAL DATA • Social media, NPS surveys and media mentions help in gauging true consumer reactions
SERVER LOGS DATA • How are consumers interacting with various functions on the front end?
LOCATION DATA • Are consumers using the product in-store or on the move?
PROMOTIONS DATA • How are consumers reacting to various marketing campaigns?
INDUSTRY DATA • Benchmarking against industry performance

SENSITIVITY OF STUDIES

Parse out trends from sensitive data (small variations) or….

[Chart: The required sample size for significance drops as the 'test' delta increases. X-axis: Delta to Test (0.01% to 10.00%). Y-axis: Required Sample Size (0M to 600M). Data labels: 0.01%: 537M; 0.03%: 86M; 0.05%: 21M; 10.00%: 0M]
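The relationship on this slide can be illustrated with a small sketch. It assumes a two-sided two-proportion z-test under the normal approximation, with a 5% significance level and 80% power; the slide does not state the test, the baseline rate or these settings, so the function name `required_sample_size` and all of its defaults are hypothetical and the output is not expected to reproduce the chart's figures.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(baseline, delta, alpha=0.05, power=0.80):
    """Approximate per-group sample size to detect an absolute lift `delta`.

    Assumes a two-sided two-proportion z-test (normal approximation).
    `baseline` is the control conversion rate; all defaults are illustrative.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p1, p2 = baseline, baseline + delta
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


if __name__ == "__main__":
    # Hypothetical 10% baseline rate; deltas taken from the chart's x-axis.
    for delta in (0.0001, 0.0005, 0.01, 0.10):
        n = required_sample_size(0.10, delta)
        print(f"delta={delta:.2%}  per-group n={n:,}")
```

Because the required size scales with 1/delta², halving the delta to be detected roughly quadruples the sample needed, which is why very small deltas push the requirement into the hundreds of millions.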

SENSITIVITY OF STUDIES contd…

…to parse out signal from noisy data

[Chart: ...higher the error tolerance, lower is the required size. X-axis: Acceptable error threshold (0.0%, 0.1%, 0.5%, 1.0%, 5.0%, 10.0%, 20.0%, 25.0%). Y-axis: Required Sample Size (0K to 60K). Bar labels, in axis order: 52K, 38K, 27K, 23K, 13K, 9K, 6K, 5K]
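One common way to relate error tolerance to sample size is Cochran's formula for estimating a proportion, optionally with a finite population correction. The sketch below uses that formula purely as an illustration; the slide does not say how its figures were computed, so the function name `sample_size_for_margin`, the 95% confidence level and the worst-case proportion of 0.5 are all assumptions.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_margin(margin, p=0.5, confidence=0.95, population=None):
    """Sample size needed to estimate a proportion within +/- `margin`.

    Uses Cochran's formula n0 = z^2 * p * (1 - p) / margin^2 and, when a
    finite `population` is given, the correction n = n0 / (1 + (n0 - 1) / N).
    """
    if not 0 < margin < 1:
        raise ValueError("margin must be strictly between 0 and 1")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return ceil(n0)


if __name__ == "__main__":
    # Tolerances loosely following the chart's x-axis (0.0% is excluded
    # because the uncorrected formula is undefined at zero error).
    for margin in (0.001, 0.005, 0.01, 0.05, 0.10, 0.25):
        print(f"margin={margin:.1%}  n={sample_size_for_margin(margin):,}")
```

The required size falls with the square of the tolerated error, so relaxing the tolerance from 1% to 10% cuts the sample by roughly a factor of 100 under these assumptions.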

IN GOD WE TRUST, OTHERS PLEASE BRING YOUR VALIDATION RESULTS!

Cross-validating across multiple data samples ensures the reliability of results…

Steps
1 Pre-work & Kickoff
2 Translation to Analytical Framework
3 Data Collection and Preparation
4 Analysis, Validation & Verification
5 Actionable insights and impact sizing
6 A/B Testing
7 Rollouts

Bootstrapping and/or mutually exclusive samples (in-time & out-of-time)
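The bootstrapping idea named on this slide can be sketched as a percentile bootstrap confidence interval. This is a minimal illustration using only the Python standard library; the function name `bootstrap_ci`, the resample count and the sample data are hypothetical and not taken from the talk.

```python
import random
from statistics import mean


def bootstrap_ci(values, stat=mean, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap interval for `stat` computed on `values`.

    Resamples the data with replacement, recomputes the statistic each time,
    and reads the interval off the sorted resampled estimates.
    """
    rng = random.Random(seed)
    n = len(values)
    estimates = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower_idx = int((1 - confidence) / 2 * n_resamples)
    upper_idx = int((1 + confidence) / 2 * n_resamples) - 1
    return estimates[lower_idx], estimates[upper_idx]


if __name__ == "__main__":
    # Hypothetical order values; in practice this would be the analysis sample.
    data_rng = random.Random(1)
    order_values = [data_rng.gauss(50, 12) for _ in range(200)]
    low, high = bootstrap_ci(order_values)
    print(f"mean={mean(order_values):.2f}  95% CI=({low:.2f}, {high:.2f})")
```

A wide interval on a key metric suggests the result may not hold up on a fresh sample, which is the kind of check that in-time and out-of-time validation samples are also meant to provide.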

TAILORING IS CARING

Sufficient sample sizes across major business segments help micro-targeting (quickly hitting significance with MVTs)…


gerardnico.com


But ‘overkill’ is a thing too…

LAW OF DIMINISHING RETURNS


Source: http://insight.nau.edu/downloads/Sample%20Size%20and%20Modeling%20Accuracy.pdf
Original authors of the study: James Morgan, Robert Dougherty, Allan Hilchie and Bern Carey, all of the Center for Data Insights, Northern Arizona University

COMPLEXITY COST


VOLUME, VELOCITY, VARIETY

PLATFORM COST

DATA PREP COST (incl ETL)

VERACITY CHECKS

ANALYSES COST

SCORING & DELIVERY COST

It all quickly adds up…

NOT ALL THAT SHINES IS GOLD…


80% of the world’s data is unstructured, and a sizeable chunk of it may be unusable too…

www.lostateminor.com

WA’I’TING & WA’S’TING DIFFER BY JUST A SINGLE LETTER…


Some problems need to be addressed quickly based on absolute counts, consistency, trends or severity…

TOO MUCH DATA & TOO MUCH FITTING


If the model looks too good to be true, it probably is…

blog.algotrading101.com
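A toy illustration of the "too good to be true" warning: the sketch below builds a deliberately over-fitted "model" that memorizes its training rows, then scores it on a holdout sample. The labels are random coin flips, so there is no real signal; the names `memorizing_model` and `accuracy` and the data are hypothetical, chosen only to show the gap between training and holdout performance.

```python
import random


def memorizing_model(train_rows):
    """Return a predictor that memorizes every training label by row id."""
    lookup = {row_id: label for row_id, label in train_rows}
    # For ids never seen in training, fall back to the majority training label.
    positives = sum(label for _, label in train_rows)
    fallback = 1 if positives * 2 >= len(train_rows) else 0
    return lambda row_id: lookup.get(row_id, fallback)


def accuracy(predict, rows):
    """Share of rows whose predicted label matches the actual label."""
    return sum(predict(row_id) == label for row_id, label in rows) / len(rows)


rng = random.Random(42)
# Labels are pure coin flips, so no model can genuinely learn anything here.
rows = [(row_id, int(rng.random() < 0.5)) for row_id in range(2000)]
train_rows, holdout_rows = rows[:1000], rows[1000:]

predict = memorizing_model(train_rows)
train_accuracy = accuracy(predict, train_rows)
holdout_accuracy = accuracy(predict, holdout_rows)
print(f"train accuracy:   {train_accuracy:.2f}")
print(f"holdout accuracy: {holdout_accuracy:.2f}")
```

The training score is perfect by construction, while the holdout score stays near chance. A large gap of this kind between in-sample and out-of-sample results is the usual symptom of over-fitting.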


Some ways to maximize insight generation from smaller data size…

USUALLY EXPLORATORY PRE-WORK HELPS SET THE STAGE…

Steps
1 Pre-work & Kickoff
2 Translation to Analytical Framework
3 Data Collection and Preparation
4 Analysis, Validation & Verification
5 Actionable insights and impact sizing
6 A/B Testing
7 Rollouts

Strategic need, Est. impact, RoI, Resources, alternatives/proxies/historical precedents/learnings

ADDITIONAL TYPE OF INSIGHTS ENRICHES IT WITH ACTIONABILITY…


A SUGGESTED DECISION MATRIX ON WHEN TO CHOOSE WHAT


THE EXECUTION FRAMEWORK: LEARN, LISTEN & TEST

Strategy
Data Instrumentation
Data Platform
Reporting
Analytics
Research
Test & Optimize
Data Products

Iterative Loop

• Focus on Big Wins
• Reduced Wastage
• Quick Fixes
• Adaptability
• Assured execution
• Learning for future initiatives

ANALYTICS IS TRANSFORMING FROM A DATA “WEAKNESS” TO MORE “DOMAIN” ACUMEN ROLES…


Parting words…

SUMMARY


1 Actionability is generating insights that can be used quickly & decisively

2 Information needs to be broader and touch complementary areas (to explain variance better) to get nearer to causality

3 “Learn-Listen-Test” helps quickly validate & incorporate learnings for a tangible business impact

4 Pre-work (Knowledge Management Framework) can help target resources/fix scope/data needs to the right problems

5 The success criteria for analytics roles are shifting to an “enable business first” mode


Appendix

THANK YOU!


Would love to hear from you on any of the following forums…

https://twitter.com/decisions_2_0

http://www.slideshare.net/RamkumarRavichandran

https://www.youtube.com/channel/UCODSVC0WQws607clv0k8mQA/videos

http://www.odbms.org/2015/01/ramkumar-ravichandran-visa/

https://www.linkedin.com/pub/ramkumar-ravichandran/10/545/67a

RAMKUMAR RAVICHANDRAN


Disclaimer: Participation is purely on a personal basis and does not represent VISA, Inc. in any form or manner. The talk is based on learnings from work across industries and firms. Care has been taken to ensure that no proprietary or work-related information of any firm is used in any material.

Director, Insights at Visa, Inc. Enables decision making at the Executive/Product/Marketing level via actionable insights derived from data.
