Some working definitions…

• ‘Data Mining’ and ‘Knowledge Discovery in Databases’ (KDD) are used interchangeably
• Data mining = the discovery of interesting, meaningful and actionable patterns hidden in large amounts of data
• Multidisciplinary field originating from artificial intelligence, pattern recognition, statistics, machine learning, econometrics, …

Data mining is a process…

• Business objectives
• Model Development
  – Model objective
  – Data collection & preparation
  – Model construction
  – Model evaluation
  – Combining models with business knowledge into decision logic
• Model / decision logic deployment
• Model / decision logic monitoring

Data mining is a process… a marketing example

• Business objectives
  – Cross sell MMS bundle to lapsed users / non users
• Model Development
  – Model objective
    • For consumers with no MMS bundle in past 6 months, predict MMS bundle ownership yes/no in next three months
  – Data collection & preparation
    • All fields for all active customers as of end APR05; remove all customers with MMS bundle in NOV04-APR05; left join MMS Bundle field from MAY05, JUNE05, JULY05
  – Model construction
    • Build various models to predict MMS Bundle MAY or JUNE or JULY = ‘N’ on 70% of the data
  – Model evaluation
    • Evaluate predictive power on the 70% of the data used for model development and on the 30% test set
  – Combining models with business knowledge into decision logic
    • Target the top 30% and randomly test two propositions (50 MMS for 5 Euro; 100 MMS for 7.50 Euro) across two channels (direct mail and SMS)
• Model / decision logic deployment
  – Run the campaign
• Model / decision logic monitoring
  – Compare predictions against actual response to evaluate model quality and robustness
  – What propositions / channels work best
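The data preparation, 70/30 split and campaign targeting steps of this example could be sketched roughly as follows. This is a minimal illustration in plain Python, not the procedure actually used: the data layout (dictionaries keyed by customer id), the function names `prepare`, `split` and `assign_cells`, and the choice to code the target as 'Y' when a bundle appears in any outcome month are all assumptions.

```python
import random

HISTORY = ["NOV04", "DEC04", "JAN05", "FEB05", "MAR05", "APR05"]
OUTCOME = ["MAY05", "JUNE05", "JULY05"]
PROPOSITIONS = ["50 MMS for 5 Euro", "100 MMS for 7.50 Euro"]
CHANNELS = ["direct mail", "SMS"]


def prepare(customers, bundle):
    """Build the modeling table (hypothetical layout).

    customers: {customer_id: {field: value}} for active customers, end APR05
    bundle:    {customer_id: {month: 'Y' or 'N'}} MMS bundle flags per month
    """
    rows = []
    for cid, fields in customers.items():
        flags = bundle.get(cid, {})
        # Remove customers who had an MMS bundle in NOV04-APR05.
        if any(flags.get(month) == "Y" for month in HISTORY):
            continue
        # Left-join semantics: customers without outcome records are kept
        # and simply get target 'N'.
        owned = any(flags.get(month) == "Y" for month in OUTCOME)
        rows.append({"id": cid, **fields, "target": "Y" if owned else "N"})
    return rows


def split(rows, train_share=0.7, seed=0):
    """Shuffle and split into a development set and a test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_share * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def assign_cells(scored, top_share=0.3, seed=0):
    """scored: list of (customer_id, model_score) pairs.

    Target the top-scoring share and randomly assign each targeted customer
    a (proposition, channel) test cell.
    """
    rng = random.Random(seed)
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    n_target = int(round(top_share * len(ranked)))
    cells = [(p, c) for p in PROPOSITIONS for c in CHANNELS]
    return {cid: rng.choice(cells) for cid, _ in ranked[:n_target]}
```

Comparing the response per cell after the campaign then addresses the monitoring question of which propositions and channels work best.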

Data mining tasks

• Undirected, explorative, descriptive, ‘unsupervised’ data mining
  – Matching & search
  – Profile & rule extraction
  – Clustering & segmentation; dimension reduction
• Directed, predictive, ‘supervised’ data mining
  – Predictive modeling

Data mining task example: Clustering & segmentation

[Figures: Looking Glass screenshots showing the start, an intermediate result and the final result. Source: Sentient Information Systems (www.sentient.nl)]

Data mining task example: predictive modeling

[Figure: a model built from past experience (data on good and bad behaviour) assigns each case a score on a 1 to 10 scale whose ends are labelled ‘worse business’ and ‘better business’; Case A scores 7 and Case B scores 4]

Collected data

Income   Age   Children
60K      38    2
30K      23    1
30K      29    0
...      ...   ...
120K     55    2

score = (0 x Income) + (-1 x Age) + (25 x Children)

Data mining task example: predictive modeling

Income   Age   Children   Status   Value   Score
60K      38    2          Good     100     12
30K      23    1          Good     45      2
30K      29    0          Bad      -80     -29
...      ...   ...        ...      ...     ...
120K     55    2          Bad      -40     -5
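The scoring rule score = (0 x Income) + (-1 x Age) + (25 x Children) can be applied to the listed customers with a few lines of Python. This is only a sketch: the dictionary layout and the names `WEIGHTS` and `score` are illustrative, and incomes are written out in full (60K as 60000).

```python
# Weights taken from the scoring rule on the slide.
WEIGHTS = {"income": 0, "age": -1, "children": 25}


def score(customer):
    """Weighted sum of the customer's attributes."""
    return sum(weight * customer[name] for name, weight in WEIGHTS.items())


customers = [
    {"income": 60_000, "age": 38, "children": 2},
    {"income": 30_000, "age": 23, "children": 1},
    {"income": 30_000, "age": 29, "children": 0},
    {"income": 120_000, "age": 55, "children": 2},
]

scores = [score(c) for c in customers]

# Rank customers from highest to lowest score, e.g. to pick whom to target.
ranked = sorted(customers, key=score, reverse=True)
```

Because the weight on income is 0, income has no influence on the ranking under this particular rule.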

Data mining techniques for predictive modeling

• Linear and logistic regression
• Decision trees
• Neural Networks
• Nearest Neighbor
• Genetic Algorithms
• …

Linear Regression Models

score = (0 x Income) + (-1 x Age) + (25 x Children)

Regression in pattern space

[Figure: pattern space with axes age and income, showing class ‘circle’ and class ‘square’]

Only a single line available in pattern space to separate classes
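Why a regression model yields a single separating line can be seen by thresholding a linear score: the set of points where the score equals the threshold is a straight line in the age/income plane. The sketch below assumes hypothetical weights, a threshold of zero and an arbitrary assignment of ‘circle’ to the non-negative side.

```python
def make_linear_classifier(w_age, w_income, bias):
    """Return a classifier that thresholds a linear score at zero."""
    def classify(age, income):
        value = w_age * age + w_income * income + bias
        # Which class sits on which side is an arbitrary choice here.
        return "circle" if value >= 0 else "square"
    return classify


def boundary_income(age, w_age, w_income, bias):
    """Income on the decision line for a given age (needs w_income != 0).

    Solves w_age * age + w_income * income + bias = 0 for income.
    """
    return -(w_age * age + bias) / w_income


# Hypothetical weights: income measured in thousands.
classify = make_linear_classifier(w_age=1.0, w_income=-1.0, bias=0.0)
```

With these weights `classify(40, 30)` falls on the ‘circle’ side and `classify(30, 60)` on the ‘square’ side; no choice of weights can bend the boundary.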

Decision Trees

[Figure: decision tree; branches are labelled yes / no]

• 20000 customers, response 1%: Income > 150000?
  – 18800 customers: Purchases > 10? (etc.)
  – 1200 customers: balance > 50000?
    • 800 customers, response 1,8%
    • 400 customers, response 0,1%

Decision Trees in Pattern Space

[Figure: pattern space with axes age and income]

Line pieces perpendicular to axes

Each line is a split in the tree, two answers to a question

Decision Trees in Pattern Space

[Figure: pattern space with axes age and weight]

The goal of the classifier is to separate the classes (circle, square) on the basis of the attributes age and income

Each line corresponds to a split in the tree

Decision areas are ‘tiles’ in pattern space
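A single split of a tree can be sketched as a search over axis-parallel thresholds. The function below is a simplified illustration and not a full tree learner: it picks the split with the fewest misclassifications when each side predicts its majority class, whereas practical algorithms typically use criteria such as information gain or Gini impurity. The data points are made up.

```python
from collections import Counter


def best_split(rows, labels):
    """Return (attribute_index, threshold, errors) of the best single split.

    Candidate thresholds are midpoints between consecutive distinct values,
    so both sides of every candidate split are non-empty.
    """
    best = None
    for attr in range(len(rows[0])):
        values = sorted({row[attr] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[attr] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[attr] > threshold]
            # Errors = instances that are not in the majority class of their side.
            errors = sum(
                len(side) - Counter(side).most_common(1)[0][1]
                for side in (left, right)
            )
            if best is None or errors < best[2]:
                best = (attr, threshold, errors)
    return best


# Hypothetical (age, income in thousands) points.
rows = [(25, 20), (30, 60), (45, 25), (50, 70)]
labels = ["circle", "square", "circle", "square"]
split = best_split(rows, labels)
```

Applying the same search recursively to each side produces further lines perpendicular to the axes, which is what carves pattern space into ‘tiles’.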

Nearest Neighbor

• The data itself is the classification model, so there is no abstraction such as a tree
• For a given instance x, search for the k instances that are most similar to x
• Classify x as the most frequently occurring class among the k most similar instances
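The procedure above can be sketched in a few lines. This is a minimal version that assumes ‘most similar’ means smallest Euclidean distance on unscaled attributes; the function name `knn_classify` and the example data are illustrative.

```python
import math
from collections import Counter


def knn_classify(x, instances, labels, k=3):
    """Classify x by majority vote among its k nearest stored instances."""
    # Indices of the stored instances, ordered by distance to x.
    by_distance = sorted(
        range(len(instances)),
        key=lambda i: math.dist(x, instances[i]),
    )
    votes = Counter(labels[i] for i in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical (age, weight) instances: the data itself is the model.
instances = [(25, 60), (27, 65), (30, 62), (50, 90), (55, 95), (52, 88)]
labels = ["circle", "circle", "circle", "square", "square", "square"]
```

For instance, `knn_classify((28, 63), instances, labels)` votes among the three closest stored points. In practice attributes are usually rescaled first so that no single attribute dominates the distance.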

Nearest Neighbor in Pattern Space

Classification

[Figure: pattern space with axes e.g. age and e.g. weight; a marker denotes the new instance]

Any decision area possible

Condition: enough data available

Nearest Neighbor in Pattern Space

Prediction

[Figure: pattern space with axes e.g. age and e.g. weight]

Any decision area possible

Condition: enough data available

Example classification algorithm 3: Neural Networks

• Inspired by neuronal computation in the brain (McCulloch & Pitts 1943 (!))
• Input (attributes) is coded as activation on the input layer neurons; activation feeds forward through a network of weighted links between neurons and causes activations on the output neurons (for instance diabetic yes/no)
• Algorithm learns to find optimal weights using the training instances and a general learning rule

[Figure: network diagram; input: e.g. customer characteristics; output: e.g. response]

• Example simple network (2 layers)
• Probability of being diabetic = f(age * weight_age + body mass index * weight_body_mass_index)
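The simple two-layer network and a basic learning rule can be sketched as follows. Several things here are assumptions rather than part of the slides: f is taken to be the logistic sigmoid, the learning rule is plain gradient descent on the log loss (often called the delta rule), and the training data are made-up, standardised values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(age, bmi, w_age, w_bmi):
    """Probability of being diabetic = f(age * w_age + bmi * w_bmi)."""
    return sigmoid(age * w_age + bmi * w_bmi)


def train(instances, targets, rate=0.1, epochs=200):
    """Adjust the two weights one training instance at a time."""
    w_age, w_bmi = 0.0, 0.0
    for _ in range(epochs):
        for (age, bmi), target in zip(instances, targets):
            error = target - predict(age, bmi, w_age, w_bmi)
            # Move each weight in proportion to the error and its input.
            w_age += rate * error * age
            w_bmi += rate * error * bmi
    return w_age, w_bmi


# Hypothetical standardised (age, body mass index) values; 1 = diabetic.
instances = [(-1.0, -0.8), (-0.5, -1.0), (0.6, 0.9), (1.2, 0.7)]
targets = [0, 0, 1, 1]
w_age, w_bmi = train(instances, targets)
```

After training, `predict(1.2, 0.7, w_age, w_bmi)` is above 0.5 and `predict(-1.0, -0.8, w_age, w_bmi)` is below it.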

Neural Networks

[Figure: simple network with inputs age and body_mass_index, connected through weight_age and weight_body_mass_index to the output ‘Probability of being diabetic’]

Neural Networks in Pattern Space

Classification

[Figure: pattern space with axes e.g. age and e.g. weight]

Simple network: only a line available (why?) to separate classes

Multilayer network: any classification boundary possible
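One way to approach the ‘why?’: a simple network thresholds a weighted sum of its inputs, and the points where a weighted sum equals a constant form a straight line. The sketch below illustrates the difference on the XOR pattern, which no single line separates. It assumes step activations and a bias term, and the multilayer weights are set by hand rather than learned.

```python
from itertools import product

# XOR: the two classes cannot be separated by one straight line.
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def step(z):
    return 1 if z > 0 else 0


def single_unit(x1, x2, w1, w2, bias):
    """Simple network: one weighted sum, one threshold, hence one line."""
    return step(w1 * x1 + w2 * x2 + bias)


def multilayer(x1, x2):
    """Hand-wired network with two hidden units (OR and AND)."""
    h_or = step(x1 + x2 - 0.5)
    h_and = step(x1 + x2 - 1.5)
    # Output fires when OR is on but AND is off.
    return step(h_or - h_and - 0.5)


# Try every weight combination on a coarse grid from -4 to 4.
grid = [i / 2 for i in range(-8, 9)]
single_unit_solves_xor = any(
    all(single_unit(x1, x2, w1, w2, b) == y for (x1, x2), y in XOR.items())
    for w1, w2, b in product(grid, repeat=3)
)
multilayer_solves_xor = all(
    multilayer(x1, x2) == y for (x1, x2), y in XOR.items()
)
```

The grid search is only illustrative; that no single unit can reproduce XOR holds for all weights, not just those on the grid.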

Dilbert’s Perspective on Data Mining