Data Driven Decisions seminar

Post on 11-Apr-2017

Veldkant 33A, Kontich ● info@infofarm.be ● www.infofarm.be

Data Science Company

Data Driven Decisions with loosely structured data

InfoFarm seminar 25/11/2015

Overview

9:30 9:40 10:00 10:30 11:45

Coffee & welcome

Dark data

External data

To structure or not to structure?

• Log files
• Text mining
• Network analysis
• Image processing

Wrap up and lunch

About us

Data Science Big Data

www.infofarm.be

Building (Big) Data (Science) solutions

– Recommendation engines, Prediction models, Automated classification, …

– Custom-made data applications

Development

Domain knowledge

Data Science

Visualization

Completing the puzzle

Our approach

Acquire
• Gather the data you need

Prepare
• Identify use cases
• Clean data
• Enrich data

Analyze
• Explorative Data Analysis (EDA)
• Formulate hypotheses
• Hypothesis testing

Act
• Implement
• Automate
• Integrate
• Add extra data gathering
• Rollout

Our approach

“Don’t run before you can walk”

Collect

Describe

Discover

Predict

Advise

This is where the hype around Big Data and Data Science generates unrealistic expectations!

Our customers

Data Driven Decisions

Versus business knowledge

Business Knowledge
• Acquired by experience
• (Assumed) insights
• RISK: too much bias towards past experience and gut feeling

Data Science
• Complementary to business knowledge
• Confirmative or new insights
• Data-driven decision making
• RISK: too naive data interpretation, disconnected from the business

It’s all about asking the right question!

Finding those questions

What do you want?

What do you have?

What is feasible?

What is implementable?

What can you get?

Where to start?

The key point is spotting opportunities to outperform your competitors!

What to dream?

How about the elephant in the room?

Unused but valuable data sources

Hidden – Forgotten – Underestimated

Inaccessible

Terrifying

Or all of the above

Internal

ERP SYSTEM

Financial management

Supply Chain Management

Manufacturing Resource Planning

Human Resource Management

Customer Relationship Management

Secondary use of

External

And many, many more…

When all comes together…

WEB SERVER LOGS
Which customers looked at similar products?

ORDER HISTORY
Which complementary products does the customer own?

EXTERNAL DATA
Reviews or critics?

CRM INFORMATION
Typical profile of a customer who responds to campaigns for a similar product?

Analyzing non-relational data

To structure or not to structure…

Count

Parse

Who’s connected to who? Who bought what?

Red

Green

Blue

Log files

From weblogs

How long?

When?

What?

What will you buy???

Will you buy???
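The "How long? When? What?" questions can be sketched in a few lines of Python. This is a minimal illustration, not the method shown at the seminar: it assumes the web server writes Common Log Format, treats each host as one visitor, and uses made-up sample lines. The names `LOG_PATTERN`, `summarize_visits` and `SAMPLE_LOG` are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed input: Common Log Format, e.g.
# host ident user [time] "METHOD path PROTOCOL" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def summarize_visits(lines):
    """Per host: when the first hit arrived, how long the visit lasted, what was viewed."""
    visits = defaultdict(lambda: {"times": [], "pages": []})
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        visits[match.group("host")]["times"].append(when)
        visits[match.group("host")]["pages"].append(match.group("path"))

    summary = {}
    for host, visit in visits.items():
        first, last = min(visit["times"]), max(visit["times"])
        summary[host] = {
            "first_seen": first,                                 # When?
            "duration_seconds": (last - first).total_seconds(),  # How long?
            "pages": visit["pages"],                             # What?
        }
    return summary


# Made-up sample lines, for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [25/Nov/2015:09:30:00 +0100] "GET /products/1 HTTP/1.1" 200 512',
    '10.0.0.1 - - [25/Nov/2015:09:32:30 +0100] "GET /cart HTTP/1.1" 200 256',
    '10.0.0.2 - - [25/Nov/2015:10:00:00 +0100] "GET / HTTP/1.1" 200 1024',
    'this line is malformed and is skipped',
]

summary = summarize_visits(SAMPLE_LOG)
for host, info in summary.items():
    print(host, info["duration_seconds"], info["pages"])
```

Answering "Will you buy?" would need a model trained on such summaries together with order data, which this sketch does not attempt.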

Over usage logs

Uses of Google
• When you're too lazy to type in ".com"
• Finding porn
• Finding useful information
• Spell checking

To performance logs

Demo: the use of external logs

External weather logs

Coordinates?

Missing data

Do not blindly trust your data

Outliers Missing
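A quick audit along these lines could look like the sketch below. It is an illustration under stated assumptions rather than the check used in the demo: missing readings are represented as `None`, outliers are flagged with the common 1.5 × IQR rule, and the temperature values are invented. `audit_readings` and `temps` are hypothetical names.

```python
import statistics


def audit_readings(values, k=1.5):
    """Return (indices of missing readings, indices of IQR outliers)."""
    missing = [i for i, v in enumerate(values) if v is None]
    present = [v for v in values if v is not None]

    # Quartiles of the readings that are present (inclusive method).
    q1, _, q3 = statistics.quantiles(present, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr

    # A reading is flagged when it falls outside [low, high].
    outliers = [
        i for i, v in enumerate(values)
        if v is not None and not (low <= v <= high)
    ]
    return missing, outliers


# Invented temperature readings: two gaps and one implausible value.
temps = [11.5, 12.0, None, 11.8, 99.9, 12.3, None, 11.9]

missing_idx, outlier_idx = audit_readings(temps)
print("missing at:", missing_idx)
print("outliers at:", outlier_idx)
```

Flagged readings are candidates for a closer look, not automatic deletions; whether to drop, correct or impute them depends on the use case.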

But do not give up too soon

(Plot axes: Latitude, Longitude)

As usage is still possible

Text mining

From named entity extraction

Over sentiment analysis

To topic extraction

Text

Topic

What should we communicate about?

Network analysis

From network optimization

Over mathematical mumbo-jumbo

To (predictive) network analysis

And recommenders
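One simple way to turn "who bought what?" data into recommendations is to count which products are bought together. The sketch below is an assumption for illustration, not the recommender presented at the seminar; the customers, products and the names `co_purchase_counts` and `recommend` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(orders):
    """Count how many customers bought each pair of products."""
    counts = Counter()
    for items in orders.values():
        # Sorting makes (a, b) and (b, a) land on the same key.
        for a, b in combinations(sorted(set(items)), 2):
            counts[(a, b)] += 1
    return counts


def recommend(product, counts, top_n=3):
    """Rank the products most often bought together with `product`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == product:
            scores[b] += n
        elif b == product:
            scores[a] += n
    return [name for name, _ in scores.most_common(top_n)]


# Invented purchase history: customer -> products bought.
orders = {
    "alice": ["laptop", "mouse", "bag"],
    "bob": ["laptop", "mouse"],
    "carol": ["laptop", "bag", "dock"],
    "dave": ["mouse", "dock"],
    "erin": ["laptop", "mouse"],
}

counts = co_purchase_counts(orders)
print(recommend("laptop", counts))
```

The customer-product pairs form a bipartite network, so the same data can also feed the network analysis techniques mentioned above.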

Image processing

From human taught

To self-learned
