Data mining by example - building predictive model using microsoft decision trees

Data Mining By Example – Building Predictive Model Using Microsoft Decision Trees by Shaoli Lu

Data Mining By Example – Building Predictive Model Using Microsoft Decision Trees

by Shaoli Lu

Microsoft Decision Trees

• Developed by the Microsoft Research team, the Microsoft Decision Trees algorithm is a hybrid decision tree algorithm that supports both classification and regression
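
For the classification case, a decision tree picks the attribute whose split best separates the target classes. The Python sketch below scores a split with Shannon entropy (information gain). It is a generic illustration, not Microsoft's implementation, and the `prospects` records and their field names are made up for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting `rows` on `attribute`."""
    before = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    # Weighted average entropy of the groups produced by the split.
    after = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return before - after


# Hypothetical prospects: "buyer" is the target, the other fields are inputs.
prospects = [
    {"cars": 0, "region": "Europe", "buyer": 1},
    {"cars": 0, "region": "Pacific", "buyer": 1},
    {"cars": 2, "region": "Europe", "buyer": 0},
    {"cars": 2, "region": "Pacific", "buyer": 0},
]

gain_cars = information_gain(prospects, "cars", "buyer")
gain_region = information_gain(prospects, "region", "buyer")
```

In this toy data `cars` separates buyers from non-buyers perfectly while `region` tells us nothing, so a tree built this way would split on `cars` first.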

Goal

• To predict a prospect’s likelihood of purchasing a bike

Prerequisites

• A SQL Server instance (2005 or later)

• SQL Server Analysis Services (SSAS) installed in Multidimensional mode (this hosts the mining structures and lets you browse them; a cube is not required for data mining)

• AdventureWorksDW database attached (download from CodePlex; pick the release that matches your SQL Server version)

• Visual Studio 2010 or above with SQL Server Data Tools (SSDT) installed

My Demo Setup

• Visual Studio 2010

• SQL Server 2012

Create Data Mining Project

• Name the project DM Decision Trees (DM = Data Mining)

Create Data Source and Impersonation

Create Data Source View

Create Mining Structure

• Choose Microsoft Decision Trees model

• Select Data Source View

• Choose training data

• Select Input/Predict columns

• Set content types

• Set Holdout percentage

• Name the mining structure and model
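
The holdout percentage reserves part of the source rows for testing, so the model is trained on the remainder only. The wizard does this for you; the Python sketch below just shows the idea. The function name `split_holdout`, the 30 percent value, and the fixed seed are illustrative assumptions, not settings taken from the demo.

```python
import random


def split_holdout(rows, holdout_percent=30, seed=0):
    """Randomly split rows into (training, holdout) lists.

    holdout_percent is the share of rows kept back for testing.
    """
    if not 0 <= holdout_percent < 100:
        raise ValueError("holdout_percent must be in [0, 100)")
    shuffled = list(rows)
    # A seeded generator makes the split repeatable between runs.
    random.Random(seed).shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_percent / 100)
    return shuffled[n_holdout:], shuffled[:n_holdout]


# Hypothetical case keys standing in for rows of the training table.
training, holdout = split_holdout(range(1000), holdout_percent=30)
```

Every row lands in exactly one of the two sets, which is what makes the later accuracy test meaningful: the model never sees the holdout rows during training.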

Deploy the mining structure and model

Process the mining model

• This is also called “training the model”

Mining Model Viewer

• Identify dominant attributes

• Splits further to the left (closer to the root of the tree) involve the more important attributes

• Rich visualization is good for data exploration as well

Mining Model Accuracy Chart

• This step is called “testing the model”; it uses the holdout data

• Lift chart

• Profit chart
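
To make the two charts concrete, here is a rough Python sketch of the arithmetic behind them. It assumes each holdout case has a predicted buy probability and a known outcome. The function names, the cost and revenue figures, and the ten sample cases are all made up for illustration; the actual charts are produced by the tool.

```python
def rank_by_score(scored):
    """Sort (probability, actual) pairs from most to least likely buyer."""
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def lift_points(scored, step_percent=10):
    """Return (percent contacted, percent of all buyers captured) pairs."""
    ranked = rank_by_score(scored)
    total_buyers = sum(actual for _, actual in ranked)
    if total_buyers == 0:
        raise ValueError("no buyers in the test data")
    points = []
    for pct in range(step_percent, 101, step_percent):
        n = round(len(ranked) * pct / 100)
        captured = sum(actual for _, actual in ranked[:n])
        points.append((pct, 100 * captured / total_buyers))
    return points


def profit(scored, percent_contacted, revenue_per_buyer,
           cost_per_contact, fixed_cost):
    """Profit from contacting only the top-scored share of prospects."""
    ranked = rank_by_score(scored)
    n = round(len(ranked) * percent_contacted / 100)
    buyers = sum(actual for _, actual in ranked[:n])
    return buyers * revenue_per_buyer - n * cost_per_contact - fixed_cost


# Hypothetical holdout cases: (predicted probability, 1 if bought else 0).
cases = [(0.9, 1), (0.8, 1), (0.7, 1), (0.6, 0), (0.5, 1),
         (0.4, 0), (0.3, 0), (0.2, 0), (0.1, 0), (0.0, 0)]

curve = lift_points(cases)
campaign_profit = profit(cases, 50, revenue_per_buyer=100,
                         cost_per_contact=10, fixed_cost=50)
```

With these sample cases, contacting the top 50 percent already reaches every buyer, whereas random selection would be expected to reach about half of them. The profit figure puts a cost on each contact, which is why it can peak well before 100 percent of prospects are contacted.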

Mining Model Prediction

• Singleton query

• Mass prediction
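
A singleton query passes the attributes of one prospect inline and asks the model for a prediction. In DMX this is typically written with NATURAL PREDICTION JOIN. The Python sketch below only builds the query text; the model name `Bike Buyer DT` and the column names are hypothetical and must be replaced with the names used in your own mining model. Sending the query to the server needs a client library, which is not shown here.

```python
def _bracket(name):
    """Wrap an identifier in square brackets."""
    # Names containing brackets are rejected rather than guessing at escaping.
    if "[" in name or "]" in name:
        raise ValueError(f"unsupported identifier: {name!r}")
    return f"[{name}]"


def _literal(value):
    """Render a Python number or string as a query literal."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        # Strings containing quotes are rejected rather than escaped.
        if "'" in value:
            raise ValueError(f"unsupported string value: {value!r}")
        return f"'{value}'"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def singleton_query(model, target, inputs):
    """Build the text of a singleton prediction query for one prospect."""
    columns = ",\n        ".join(
        f"{_literal(value)} AS {_bracket(name)}"
        for name, value in inputs.items()
    )
    return (
        f"SELECT Predict({_bracket(target)}), "
        f"PredictProbability({_bracket(target)})\n"
        f"FROM {_bracket(model)}\n"
        f"NATURAL PREDICTION JOIN\n"
        f"(SELECT {columns}) AS t"
    )


# Hypothetical model and column names, for illustration only.
query = singleton_query(
    "Bike Buyer DT",
    "Bike Buyer",
    {"Age": 35, "Commute Distance": "0-1 Miles"},
)
print(query)
```

A mass prediction follows the same pattern, except that the inline SELECT is replaced by a query over a table of prospects, so every row in that table gets a prediction.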

Browse mining model on SQL Server

• Decision trees

• Dependency network

Summary

• Microsoft Decision Trees is a powerful data mining model, yet it is easy to build, train and use

• Can perform both Singleton (e.g. embed in an app) and Mass Predictions (e.g. targeted marketing)

• Holdout data can be used to test the trained model

• Rich visualizations such as Lift/Profit Charts and Dependency Network can facilitate analysis and data exploration

• A relational database can be used for data mining; a cube is not required

The End