Storytelling with Data to Executives 09212016 - JMP User Community · 2019-11-30 · Demographics...

Storytelling with Data to Executives JIM GRAYSON (AUGUSTA UNIVERSITY) MIA STEPHENS (JMP – DIVISION OF SAS)

Consider This Scenario

Discovery Summit 2016

A bank is struggling with the way it decides who is a good credit risk and asks for your help to develop a model.

Modeling Approach

You follow the Business Analytics Process.

Business Problem

Business Analytics Process (may loop back at any step):
• Define the Problem
• Prepare for Modeling
• Modeling
• Deploy Model
• Monitor Performance

From Building Better Models with JMP Pro, Grayson, Gardner and Stephens, 2015.

Data Preparation

Key Activities:
• Determine which data are needed
• Compile (or collect new) data
• Explore, examine and understand data
• Assess data quality
• Clean and transform data
• Define features
• Reduce dimensionality
• Create training, validation and test sets

Key Tools:
• SQL/Query
• Data table structuring - join, concatenate, update, stack, summarize,…
• Summary statistics and graphical displays, interactive tools and filtering
• Multivariate procedures (clustering, PCA,…)
• Transformations, creating derived variables
• Missing data utilities, outlier analysis, recoding, binning
• Creating holdout set(s)
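The last activity, creating training, validation and test sets, can be sketched in a few lines. The slides do this step in JMP; the Python below is only an illustration. The 60/20/20 fractions, the seed, and the function name `stratified_split` are assumptions, not values taken from the presentation.

```python
import random

def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=1):
    """Tag each row as 'train', 'validation' or 'test'.

    Rows are shuffled and cut separately within each class, so every
    subset keeps roughly the same good/bad mix as the full data set.
    """
    rng = random.Random(seed)
    names = ("train", "validation", "test")
    roles = [None] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = round(fractions[0] * len(idx))
        n_valid = round(fractions[1] * len(idx))
        cuts = (0, n_train, n_train + n_valid, len(idx))
        for name, lo, hi in zip(names, cuts, cuts[1:]):
            for i in idx[lo:hi]:
                roles[i] = name
    return roles

# Hypothetical labels with the same 700 good / 300 bad mix as the data set.
labels = ["good"] * 700 + ["bad"] * 300
roles = stratified_split(labels)
counts = {name: roles.count(name) for name in ("train", "validation", "test")}
print(counts)
```

Splitting within each class is a design choice: with a 70/30 response, a purely random split could leave the validation set with a noticeably different share of bad risks.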

Modeling

Key Activities:
• Choose the appropriate modeling method or methods
• Fit one or more models
• Evaluate the performance of each model using validation statistics (misclassification, RMSE, Rsquare)
• Choose the best model or set of models to address the analytics problem (and ultimately the business problem)
• Create ensemble models

Key Tools:
• Multiple Regression
• Logistic Regression
• Naïve Bayes
• kNN
• Classification and Regression Trees
• Bootstrap Forests and Boosted Trees
• Neural Networks
• Generalized Linear Models
• Survival Models
• Forecasting/Time Series
• Model Comparison
• Text Mining
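The validation statistics named above (misclassification, RMSE, Rsquare) can be illustrated for a two-class response as follows. This is a minimal sketch, not JMP's implementation: the five observations are made up, the 0.5 cutoff is an assumption, and the R-square shown is the entropy form (one minus the ratio of model to baseline negative log-likelihood).

```python
import math

def misclassification_rate(actual, predicted):
    """Share of rows whose predicted class differs from the actual class."""
    return sum(a != p for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, prob_good):
    """Root mean squared difference between the 0/1 outcome and P(good)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, prob_good)) / len(actual))

def neg_log_likelihood(actual, prob_good):
    """Sum of -log(probability assigned to the class that occurred)."""
    return -sum(math.log(p if a == 1 else 1 - p) for a, p in zip(actual, prob_good))

def entropy_rsquare(actual, prob_good):
    """1 - NLL(model) / NLL(baseline that always predicts the base rate)."""
    base_rate = sum(actual) / len(actual)
    baseline = neg_log_likelihood(actual, [base_rate] * len(actual))
    return 1 - neg_log_likelihood(actual, prob_good) / baseline

# Made-up validation rows: 1 = good risk, 0 = not good risk.
actual = [1, 1, 1, 0, 0]
prob_good = [0.9, 0.8, 0.4, 0.3, 0.6]
predicted = [1 if p >= 0.5 else 0 for p in prob_good]  # assumed 0.5 cutoff

print(misclassification_rate(actual, predicted))
print(round(rmse(actual, prob_good), 4))
print(round(entropy_rsquare(actual, prob_good), 4))
```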

The Data

• German Credit data set available at https://archive.ics.uci.edu/ml/datasets/Statlog+(German+Credit+Data)
• Contains observations on 30 variables for 1000 past applicants.
• Each applicant rated as either a “good credit” (700 cases) or a “bad credit” (300 cases)

JMP

Presentation of Results

You have developed a model for identifying good credit risk applicants.

You present your modeling results to the executive team.

You Present This Information …

The best model, from a profit perspective, is a Two Stage Forward Selection, with an average profit of 0.1315.

Measures of Fit for RESPONSE

Creator | Entropy RSquare | Generalized RSquare | Mean -Log p | RMSE | Mean Abs Dev | Misclassification Rate | N | Average Profit | AUC
Fit Ordinal Logistic | 0.1716 | 0.2681 | 0.5061 | 0.4081 | 0.3137 | 0.2550 | 200 | 0.0857 | 0.7929
Partition | 0.1002 | 0.1633 | 0.5497 | 0.4322 | 0.3560 | 0.3150 | 200 | 0.0747 | 0.7090
Bootstrap Forest | 0.2242 | 0.3397 | 0.4739 | 0.3953 | 0.3377 | 0.2300 | 200 | 0.1118 | 0.8207
Boosted Tree | 0.1999 | 0.3072 | 0.4888 | 0.4030 | 0.3448 | 0.2550 | 200 | 0.1135 | 0.8058
Neural | 0.2417 | 0.3625 | 0.4632 | 0.3912 | 0.2974 | 0.2250 | 200 | 0.1153 | 0.8276
Fit Generalized Two Stage Forward Selection | 0.3543 | 0.4982 | 0.3944 | 0.3555 | 0.2594 | 0.1750 | 200 | 0.1283 | 0.8750
Fit Generalized Two Stage Forward Selection | 0.3378 | 0.4794 | 0.4045 | 0.3620 | 0.2729 | 0.2050 | 200 | 0.1315 | 0.8658
Fit Generalized Double Lasso | 0.3760 | 0.5222 | 0.3812 | 0.3543 | 0.2599 | 0.1800 | 200 | 0.12 | 0.8823

And This Information…

Predictor: Fit Generalized Two Stage Forward Selection

Predicted Count (columns: Actual RESPONSE)
Predicted | Good Risk | Not Good Risk
Good Risk | 124 | 25
Not Good Risk | 16 | 35

Predicted Rate (columns: Actual RESPONSE)
Predicted | Good Risk | Not Good Risk
Good Risk | 0.886 | 0.417
Not Good Risk | 0.114 | 0.583

Decision Count (columns: Actual RESPONSE)
Decision | Good Risk | Not Good Risk
Good Risk | 98 | 8
Not Good Risk | 42 | 52

Decision Rate (columns: Actual RESPONSE)
Decision | Good Risk | Not Good Risk
Good Risk | 0.700 | 0.133
Not Good Risk | 0.300 | 0.867

Misclassification Rate: 0.2500
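The rates and the misclassification rate on this slide follow directly from the counts. The sketch below recomputes them from the four counts in each matrix; the dictionary layout and function names are illustrative choices, not JMP output.

```python
# Counts as read from the slide; keys are (called class, actual class).
predicted_counts = {("good", "good"): 124, ("good", "bad"): 25,
                    ("bad", "good"): 16, ("bad", "bad"): 35}
decision_counts = {("good", "good"): 98, ("good", "bad"): 8,
                   ("bad", "good"): 42, ("bad", "bad"): 52}

def column_rates(counts):
    """Divide each count by the total for its actual class (its column)."""
    totals = {}
    for (_, actual), n in counts.items():
        totals[actual] = totals.get(actual, 0) + n
    return {key: n / totals[key[1]] for key, n in counts.items()}

def misclassification_rate(counts):
    """Share of all rows where the called class differs from the actual class."""
    wrong = sum(n for (called, actual), n in counts.items() if called != actual)
    return wrong / sum(counts.values())

print(round(misclassification_rate(decision_counts), 4))   # 0.25
print(round(misclassification_rate(predicted_counts), 4))  # 0.205
print({k: round(v, 3) for k, v in column_rates(decision_counts).items()})
```

The 0.2500 on the slide comes from the Decision matrix, (42 + 8) / 200. The Predicted matrix gives (16 + 25) / 200 = 0.2050, which is the misclassification rate listed for this model in the Measures of Fit table.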

And This Information…

[Figure: ROC Curve for RESPONSE=Good Risk. Vertical axis: Sensitivity (0.00 to 1.00). Horizontal axis: 1-Specificity (0.00 to 1.00).]

Predictor | AUC
Prob[Good Risk] | 0.7929
Prob(RESPONSE==Good Risk) | 0.7090
Prob(RESPONSE==Good Risk)_1 | 0.8207
Prob(RESPONSE==Good Risk)_2 | 0.8058
Probability( RESPONSE=Good Risk ) | 0.8276
Probability( RESPONSE=Good Risk )_1 | 0.8750
Probability( RESPONSE=Good Risk )_2 | 0.8658
Probability( RESPONSE=Good Risk )_3 | 0.8823
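For readers unfamiliar with AUC, one way to read it is as the share of (good risk, not good risk) pairs in which the good risk receives the higher predicted probability. The sketch below computes it that way on made-up scores; it is not how JMP produces the values above, and the function name `auc` is only illustrative.

```python
def auc(scores_good, scores_not_good):
    """Share of (good, not-good) pairs ranked correctly; ties count one half."""
    wins = 0.0
    for g in scores_good:
        for b in scores_not_good:
            if g > b:
                wins += 1.0
            elif g == b:
                wins += 0.5
    return wins / (len(scores_good) * len(scores_not_good))

# Made-up predicted probabilities of "good risk" for five applicants.
scores_good = [0.9, 0.8, 0.4]
scores_not_good = [0.3, 0.6]
print(round(auc(scores_good, scores_not_good), 4))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.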

What’s The Problem?

• We are proud of our technical work – we want to show our skills and worth to the organization – and we don’t want to “over-sell”
• We use our technical results - which are not understandable to a non-technical audience – to provide full disclosure and understanding
• Non-technical audience cannot bridge the gap for how this “technical jargon” answers their problem – seems irrelevant to what they really want to know – THE ANSWER

Recommendations

• Best practices for storytelling to executives
• Example presentation for executives

SENIOR ANALYST (WISE OLD OWLS)
Advice on communicating analytic results to senior executives from Jeff Cline, “Owl speaks lion”, ORMS Today, August 2016

• Have a five-minute version and a two-minute version
• Clearly answer: What? So What? What now?
• Limit your presentation slides: Save brilliance for back-up slides
• Admit ignorance when you don’t know
• Be prepared to talk without slides
• Send your presentation ahead
• Practice and murder board before briefing (with a parliament of owls)

SENIOR EXECUTIVES (OLD LIONS)
Advice on communicating analytic results to senior executives from Jeff Cline, “Owl speaks lion”, ORMS Today, August 2016

• If I have only five minutes, so do you
• Don’t put the executive back in math class
• It is not necessary to share with me everything you have learned in reaching this point in your life
• Don’t raise an issue unless you also provide recommendations
• Give me the main points early
• If you can’t answer the question, say so and get back with me. Anything else is a waste of time
• More pictures, fewer words

Best Practices

Nancy Duarte – “How to Present to Senior Executives” [HBR]

• Summarize up front (high level findings, conclusions, recommendations, call to action)
• Set expectations (summary and discussion)
• Create summary slides (10% rule; rest in appendix)
• Give them what they asked for (answer specific request directly)
• Rehearse (run slides by honest coach)

Best Practices

Lisa Morgan – “Data Storytelling: What It Is, Why It Matters” [IW]

• General Storytelling Rules Apply (beginning, middle, end)
• Consider the Audience (don’t use one size fits all presentation)
• Collaborate (interdisciplinary activity)
• Avoid Distractions (address a specific goal; iceberg rule)

Charts: Two Questions -> Four Types

• CONCEPTUAL and DECLARATIVE: Idea illustration
• CONCEPTUAL and EXPLORATORY: Idea generation
• DATA-DRIVEN and DECLARATIVE: Everyday dataviz
• DATA-DRIVEN and EXPLORATORY: Visual discovery

Adapted from Good Charts by Scott Berinato, p. 76.

Two Questions -> Four Types

Axes: DECLARATIVE vs. EXPLORATORY; CONCEPTUAL vs. DATA-DRIVEN

• Know the audience
• Keep it simple
• Make idea, not design, pop

Adapted from Good Charts by Scott Berinato, p. 76.

Sample Presentation
5-10 MINUTE PRESENTATION TO SENIOR EXECUTIVES

German Credit Modeling
AUGUSTA ANALYTICS

Complete Report

§ Executive Summary
§ Appendix - Modeling Methodology and Key Results

Executive Summary

§ Objectives
§ Current State
§ Future State
§ Summary

Objectives

Business Objective: Improve net profits of loans by better identifying “good” customers.

Modeling Objective: Develop a classification model to predict if an applicant is a good or bad credit risk.

Data Resources

Financial Resources
• Checking account balance
• Savings account balance
• Credit history
• Credit duration (months)
• Credit amount
• Installment rate as % disposable income
• Owns real estate
• Owns no property

Credit Purpose
• New car
• Used car
• Furniture
• Radio / TV
• Education
• Retraining

Demographics
• Employment duration
• Age
• Rents
• Owns residence
• Job category
• Number of dependents
• Years at present residence
• Telephone in name

Credit Information
• Co-applicant
• Guarantor
• Number existing credits

Current State

Average loan ~ $20,000

Current Unit Gain ~ ($0.055)

Current Revenue Per Loan ~ ($1100)

70% Good Risks, 30% Bad Risks
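The current-state figures can be reproduced from the 70% / 30% split. The sketch assumes that every applicant is currently approved and uses the payoffs from the Developed Model slide (+0.35 per unit loaned to a good risk, -1.0 per unit loaned to a bad risk). The approve-everyone policy is an assumption, chosen because it reproduces the numbers on the slide.

```python
GAIN_GOOD = 0.35       # payoff per unit loaned to a good risk (Developed Model slide)
LOSS_BAD = -1.0        # payoff per unit loaned to a bad risk (Developed Model slide)
AVERAGE_LOAN = 20_000  # average loan size in dollars

def unit_gain_approve_all(share_good):
    """Expected payoff per unit loaned when every applicant is approved (assumed policy)."""
    return share_good * GAIN_GOOD + (1 - share_good) * LOSS_BAD

current_unit_gain = unit_gain_approve_all(0.70)
current_per_loan = current_unit_gain * AVERAGE_LOAN

print(round(current_unit_gain, 4))  # -0.055
print(round(current_per_loan))      # -1100
```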

Developed Model

Developed model to maximize profits:
• Maximize Revenues
• Minimize Losses

Payoff per unit loaned when the Prediction is Good Credit:
• Reality is Good Credit (prediction TRUE): +0.35
• Reality is Bad Credit (prediction FALSE): -1.0
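These two payoffs imply a probability cutoff for approving a loan. The slides do not state a cutoff. The derivation below assumes that declining an applicant has a payoff of 0, and writes p for the model's predicted probability that the applicant is a good risk (notation introduced here for illustration).

```latex
% Expected payoff of approving, per unit loaned:
\[
E[\text{payoff} \mid \text{approve}] = 0.35\,p - 1.0\,(1 - p) = 1.35\,p - 1
\]
% Approving beats declining (assumed payoff 0) when this is positive:
\[
1.35\,p - 1 > 0 \iff p > \frac{1}{1.35} \approx 0.741
\]
```

Under these assumptions an applicant is worth approving only when the predicted probability of being a good risk exceeds about 0.74, well above a 0.5 cutoff. This is consistent with the Decision Count matrix approving fewer applicants (106) than the Predicted Count matrix labels as good risks (149).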

Future State

Average loan ~ $20,000

Predicted Average Unit Gain ~ $0.1315

Predicted Average Revenue Per Loan ~ $2630
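The predicted unit gain can be traced back to the Decision Count matrix shown earlier: 98 good risks and 8 bad risks approved out of 200 validation cases. A short check, assuming declined applicants contribute a payoff of 0 (an assumption that reproduces the slide's figure):

```python
GAIN_GOOD = 0.35       # payoff per unit loaned to a good risk
LOSS_BAD = -1.0        # payoff per unit loaned to a bad risk
AVERAGE_LOAN = 20_000  # average loan size in dollars

# Decision counts for the validation set, from the Decision Count matrix.
approved_good, approved_bad = 98, 8
declined_good, declined_bad = 42, 52
n_applicants = approved_good + approved_bad + declined_good + declined_bad

# Declined applicants are assumed to contribute 0.
model_unit_gain = (approved_good * GAIN_GOOD + approved_bad * LOSS_BAD) / n_applicants
model_per_loan = model_unit_gain * AVERAGE_LOAN

print(round(model_unit_gain, 4))  # 0.1315
print(round(model_per_loan))      # 2630
```

Note that the average is taken over all 200 applicants, not only over the 106 approved loans.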

Final Results

Measure | Current State | Classification Model
Unit Gain (Loss) | -$0.055 | $0.1315
Revenue (Loss) Per 1,000 Customers | -$1,100,000 | $2,630,000

Net Revenue Improvement Per 1,000 Customers: $3,730,000
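The improvement figure is the difference between the two unit gains, scaled by the average loan of $20,000 and by 1,000 customers. As a worked check:

```latex
\[
1000 \times \$20{,}000 \times \bigl(0.1315 - (-0.055)\bigr)
  = 1000 \times \$20{,}000 \times 0.1865
  = \$3{,}730{,}000
\]
```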

Summary

Current State:
Average loss per loan ~ ($1100)

Modeling Results:
Developed a classification model to maximize net profits
Estimated average gain per loan ~ $2630

Key Drivers:
Co-Applicant for Loan, Owns Residence, Rents, Number of Existing Credits, and Interactions Between Many Factors

Appendix to Executive Summary

§ JMP 13 Web Reports
§ JMP 13 Dashboards

Summary – Key Points

§ Summarize up front
§ Don’t put the executive back in math class
§ It is not necessary to share with the executive everything you have learned in reaching this point in your life
§ Give them what they asked for

Resource List

1. “How to Present to Senior Executives” by Nancy Duarte, HBR (Communications), October 4, 2012.
2. “Create a Presentation Your Audience Will Care About” by Nancy Duarte, HBR (Communications), October 10, 2012.
3. “Do Your Slides Pass the Glance Test?” by Nancy Duarte, HBR (Communications), October 22, 2012.
4. “Structure Your Presentation Like a Story” by Nancy Duarte, HBR (Communications), October 31, 2012.
5. “Data Storytelling: What It Is, Why It Matters” by Lisa Morgan, Information Week (Commentary), May 30, 2016. [http://www.informationweek.com/big-data/big-data-analytics/data-storytelling-what-it-is-why-it-matters/a/d-id/1325544 | last accessed June 30 2016]
6. Good Charts by Scott Berinato, Harvard Business School Publishing 2016.
7. “Owl speaks lion” by Jeff Kline, ORMS Today, August 2016.