Stay Classy: Maximizing Promotion Returns Using Classification
Katherine Sanborn
Manager, Business Analytics
Kellogg Company
#AnalyticsX
Copyright © 2016, SAS Institute Inc. All rights reserved.
Project beginnings
• Exploratory project using SAS Enterprise Miner (EM) to understand underlying patterns in our sales data
• Objective is to determine promotion attributes important to incremental lift
• We would typically use time-series forecasting with this data, but are exploring new uses for the data and tools
Agenda
• Data
• Methodology
• Visual Analytics Exploration
• Modeling Results
• Accuracy Assessment
• Lessons Learned
• Conclusion
Data
• Three years of point of sale data
• Our products and major competitive products
• Incremental and base variables including pricing, merchandising, and volumetric measures
– Merchandising includes display and feature measures
Methodology
• Define incremental lift
– Base sales
– Incremental sales
• Start with three groups segmented by incremental lift
• Use SEMMA (Sample, Explore, Modify, Model, Assess) as the process to build and assess models
– Enterprise Guide (EG), EM, and Visual Analytics used in various stages of the project
Visual Analytics Exploration
Models from RPM Diagram
Custom EM Models from Project
Accuracy Assessment
Which is better?
(A.) Model 1: 85% Accuracy (15% misclassification)
(B.) Model 2: 70% Accuracy (30% misclassification)
(C.) Model 3: 75% Accuracy (25% misclassification)

Items assigned to each class (100 items in total):
            Actual            Model 1's         Model 2's         Model 3's
            Classification    Classification    Classification    Classification
Class X:    80 items          95 items          50 items          55 items
Class Y:    20 items           5 items          50 items          45 items

Confusion matrices (rows: actual class, columns: model's classification):

Model 1: Assign almost everything to X (85% accuracy)
              X     Y
Actual X     80     0
Actual Y     15     5

Model 2: 50/50 Chance Assignment (70% accuracy)
              X     Y
Actual X     50    30
Actual Y      0    20

Model 3: Decision Tree Assignment (75% accuracy)
              X     Y
Actual X     55    25
Actual Y      0    20
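The accuracy figures above can be reproduced from the three confusion matrices. The following is a minimal Python sketch, not part of the original slides; the names `CONFUSION`, `accuracy`, and `predicted_totals` are illustrative assumptions. It also recovers how many items each model assigned to X and Y, which makes it easier to see that Model 1 reaches 85% largely by assigning almost everything to the majority class.

```python
# Confusion matrices from the slide: rows = actual class (X, Y),
# columns = the model's classification (X, Y).
CONFUSION = {
    "Model 1": [[80, 0], [15, 5]],
    "Model 2": [[50, 30], [0, 20]],
    "Model 3": [[55, 25], [0, 20]],
}


def accuracy(matrix):
    """Share of items on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def predicted_totals(matrix):
    """Column sums: how many items the model assigned to each class."""
    return [sum(col) for col in zip(*matrix)]


for name, matrix in CONFUSION.items():
    print(f"{name}: accuracy={accuracy(matrix):.0%}, "
          f"assigned to X/Y={predicted_totals(matrix)}")
```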
Accuracy Assessment
Cohen’s Kappa & Cohen’s Weighted Kappa
• Takes into consideration chance agreement
• Allows for weighting more significant misclassifications

            Kappa                    Accuracy
Model 1     1 - 15 / 23 = 0.348      85%
Model 2     1 - 30 / 50 = 0.400      70%
Model 3     1 - 25 / 47 = 0.468      75%

Which model is “better”?
It depends on the cost of the misclassification

Model 1 classification counts (rows: actual, columns: model's classification)
              X     Y     Total
Actual X     80     0      80
Actual Y     15     5      20

Model 1 Expectation Matrix
              X                        Y                       Total
Actual X      95 * 80 / 100 = 76.00    5 * 80 / 100 = 4.00      80
Actual Y      95 * 20 / 100 = 19.00    5 * 20 / 100 = 1.00      20
Total         95                       5                       100

Weight Matrix
              X     Y
Actual X      0     1
Actual Y      1     0
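The Kappa column follows the pattern 1 - (observed misclassifications) / (misclassifications expected by chance), where the chance expectation for each cell is row total * column total / grand total. The following is a minimal Python sketch of that calculation, not part of the original slides; the function names `expected_counts` and `cohens_kappa` are illustrative assumptions.

```python
def expected_counts(observed):
    """Chance-agreement counts: row total * column total / grand total."""
    n = sum(sum(row) for row in observed)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    return [[r * c / n for c in col_totals] for r in row_totals]


def cohens_kappa(observed):
    """Unweighted kappa: 1 - observed / expected off-diagonal counts."""
    expected = expected_counts(observed)
    k = len(observed)
    off_diagonal = [(i, j) for i in range(k) for j in range(k) if i != j]
    observed_errors = sum(observed[i][j] for i, j in off_diagonal)
    expected_errors = sum(expected[i][j] for i, j in off_diagonal)
    return 1 - observed_errors / expected_errors


# Rows = actual class (X, Y), columns = model's classification (X, Y).
models = {
    "Model 1": [[80, 0], [15, 5]],
    "Model 2": [[50, 30], [0, 20]],
    "Model 3": [[55, 25], [0, 20]],
}

print(expected_counts(models["Model 1"]))  # [[76.0, 4.0], [19.0, 1.0]]
for name, matrix in models.items():
    print(name, round(cohens_kappa(matrix), 3))  # 0.348, 0.4, 0.468
```

On this measure Model 3 ranks first and Model 1 last, the reverse of the ranking by raw accuracy.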
Accuracy Assessment
Rapid Predictive Modeler Models:           Accuracy    Kappa    W.Kappa
Main Effects Regression                     64.7%      40.8%     47.7%
Forward Selection Regression                64.5%      40.4%     47.2%
Stepwise Regression                         64.5%      40.6%     47.4%
Decision Tree                               61.7%      39.2%     47.4%
Backwards Selection Regression              65.0%      41.3%     47.9%
Neural Network                              65.2%      41.4%     48.3%
Ensemble Champion                           65.0%      42.7%     49.0%

Enterprise Miner Workflow Models:          Accuracy    Kappa    W.Kappa
Gradient Boosting                           66.0%      42.7%     49.0%
Logistic Regression (no interactions)       68.1%      47.1%     53.3%
Neural Network                              68.4%      47.7%     54.2%
Regular Linear Regression                   68.2%      47.3%     53.4%
Ensemble Models                             68.2%      47.3%     53.6%

Weighted Matrix
      A   B   C
A     0   1   2
B     1   0   1
C     2   1   0
Penalizes misclassification of A as C and C as A more than others
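The weight matrix above matches linear weights, where the penalty for each cell is the distance |i - j| between the actual and assigned group. The following is a minimal Python sketch of a weighted kappa using such penalty weights, not part of the original slides. The three-class confusion matrix `HYPOTHETICAL` is invented purely for illustration and is not project data, the function names are assumptions, and whether the W.Kappa column was computed in exactly this way is also an assumption.

```python
def linear_weights(k):
    """Penalty weights |i - j|; for k = 3 this reproduces the slide's matrix."""
    return [[abs(i - j) for j in range(k)] for i in range(k)]


def weighted_kappa(observed, weights):
    """1 - (weighted observed errors) / (weighted errors expected by chance)."""
    n = sum(sum(row) for row in observed)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    k = len(observed)
    cells = [(i, j) for i in range(k) for j in range(k)]
    observed_cost = sum(weights[i][j] * observed[i][j] for i, j in cells)
    expected_cost = sum(
        weights[i][j] * row_totals[i] * col_totals[j] / n for i, j in cells
    )
    return 1 - observed_cost / expected_cost


WEIGHTS = linear_weights(3)  # [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
ZERO_ONE = [[0 if i == j else 1 for j in range(3)] for i in range(3)]

# HYPOTHETICAL counts (rows = actual A/B/C, columns = assigned A/B/C);
# invented for illustration only, not results from the project.
HYPOTHETICAL = [[30, 8, 2], [6, 25, 9], [3, 7, 10]]

print("Kappa:  ", round(weighted_kappa(HYPOTHETICAL, ZERO_ONE), 3))
print("W.Kappa:", round(weighted_kappa(HYPOTHETICAL, WEIGHTS), 3))
```

With 0/1 weights the function reduces to the unweighted kappa, so the two columns differ only in how heavily A/C confusions are penalized relative to adjacent-group confusions.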
Lessons Learned
• Certain impactful promotional measures drive incremental lift
– Including feature and display variables, and price discounts
• Additional promotion attributes are needed to fine-tune classification
Conclusion
• You can start data mining with RPM and EM today
• Gather your data and start with RPM