Marketing Analytics: A Practical Guide to Real Marketing Science


Transcript of Marketing Analytics: A Practical Guide to Real Marketing Science


Praise for Marketing Analytics

‘For those MBAs who barely passed their quantitative marketing and statistics classes without truly understanding the content, Marketing Analytics provides everything managers and executives need to know presented as a conversation with examples to boot! You’ll definitely sound smarter in the boardroom after reading this book!’

James Mourey, PhD and assistant professor of marketing at DePaul University (Chicago)

‘Marketing Analytics is a must-read for analytics practitioners and marketing managers seeking a comprehensive overview of the most actionable techniques that virtually any organization can apply to gain immediate benefits. Rather than complicate the book with technical details that may not be of interest to all readers, Dr Grigsby succinctly illustrates the concepts with real examples and provides references for analysts needing deeper guidance or theory. I wish Marketing Analytics had been published 15 years ago – it would’ve saved me a lot of independent research!’

W Dean Vogt, Jr, marketing research and analytics practitioner

‘Marketing Analytics is a practical guidebook written in a conversational tone that makes complex theories easily understood. The author’s experience in the industry combined with his inherent gift for explaining everything a successful marketing analyst needs to know makes this book a must-read.’

Katy Richardson, Founder and Principal, 214 Creative

‘This is a great book for practitioners who have learned plenty of theories and want to learn how to apply methodologies. It is also a great, easy-to-read resource for anyone who does not have a deep theoretical background but wants to learn how analytics work in real life.’

Ingrid Guo, VP, Analytics, and Managing Director, Javelin Marketing Group (Beijing)

‘Mike’s writing is straightforward and entertaining. He brings a conversational and relatable tone and approach to some fairly complex material. Sometimes marketers can take themselves a little too seriously, especially when it comes to the mathematical side of things. Mike’s work reminds us to lighten up and have fun with it.’

Katy Rollings, PhD, loyalty analyst at GameStop

‘The book summarizes all the critical topics in a consumer-focused analytic approach, and the cases are fun to read.’

Ernan Haruvy, PhD, Professor of Marketing, UT Dallas

‘This book gives a broad overview of marketing analytics to people who don’t have any related background… Examples are explained to give readers a clearer idea. I think the book is worth a read for anyone who wants to become a marketing analyst.’

Yuan Fang, MSc (marketing analytics candidate)

‘In one sentence, the role of marketing is to determine who the organization can serve and how it can best be done. To this end, Mike Grigsby escorts the reader through the difficult process of understanding, explaining, and anticipating customer behaviour, aptly delivered with the no-nonsense authority earned by veterans of marketing success. If Marketing Analytics is the class, I’m sitting front row!’

Allyn White, PhD

‘In his book Marketing Analytics, Mike Grigsby takes passionate marketing strategists on a practical, real-life journey for solving common marketing challenges. By combining the concepts and knowledge areas of statistics, marketing strategy and consumer behaviour, Mike recommends scientific and innovative solutions to common marketing problems in the current business environment. Every chapter is an interesting journey for the reader.

What I like most about the book is its simplicity and how it applies to real work-related situations in which almost all of us have been involved while practising marketing of any sort. I also like how Mike talks about tangible measurements of strategic recommended marketing solutions as well as how they add value to companies’ strategic endeavours. I highly recommend reading this book as it adds a completely new dimension to marketing science.’

Kristina Domazetoska, project manager and implementation consultant at Insala – Talent Development and Mentoring Solutions

‘Mike’s book is the right blend of theory applied to the real world, large-scale data problems of marketing. It’s exactly the book I wish I’d had when I started out in this field.’

Jeff Weiner, Senior Director, Channel and Employee Analytics – US Region, Aimia

‘I love your book! It offers a truly accessible guide to the basics and practice of marketing analytics. I especially like how you bring in your correct insights on e.g. the overreliance on competitive (vs consumer) behavior in marketing strategy.’

Koen H Pauwels, Associate Professor at the Tuck School of Business, Dartmouth and Özyeğin University, Istanbul

‘I found Marketing Analytics interesting and easy to comprehend. It has lucid descriptions along with the illustrations, which complement the text. Even a layman can understand, as there is no jargon or technical language used.’

Sunpreet Kaur Sahni, Assistant Professor at GNIMT, PhD (marketing) Ludhiana, Punjab, India

‘This is an excellent read for people in the industry who work in strategy and marketing. This is one of the first books that I have read that covers the entire spectrum from demand, segmentation, targeting, and how results can be calculated. In an age where marketing is becoming more and more sophisticated, this book provides the tools and the mathematics behind the facts. Marketing Analytics is written with a scientific voice, but was very readable, with the science wrapped into everyday activities, based on a character we can all relate to, that are derived from these formulas, ultimately driving ROI.’

Elizabeth Johnson, VP, Shopper Marketing – Digital Solutions Retailigence

‘I strongly recommend Marketing Analytics to both beginners and folks who don’t have much background in statistics. A very precise book. Complicated topics around statistics, marketing and modelling are condensed very well in a much-simplified language, along with real-world examples and business cases, which makes it amusing to read and gives clear understanding about applications of the concepts. The book sets the ground with exactly what one needs to know from statistics as well as marketing, and runs through how these two, coupled with analytics, can help solve real-world business problems. Later, it also covers Market Research topics and concludes with the Capstone, covering application of all the methodologies to Digital Analytics. I believe that Marketing Analytics will be a handy reference or manual for students as well as marketing analytics professionals.’

Sasmit Khokale, MS (MIS), Analytics Practitioner

Note on the Ebook Edition

For an optimal reading experience, please view large tables and figures in landscape mode.

This ebook published in 2015 by

Kogan Page Limited
2nd Floor, 45 Gee Street
London EC1V 3RS
United Kingdom

www.koganpage.com

© Mike Grigsby, 2015

E-ISBN 9780749474188

Full imprint details

Contents

Foreword
Preface
Introduction

PART ONE Overview

01 A (little) statistical review
Measures of central tendency
Measures of dispersion
The normal distribution
Relations among two variables: covariance and correlation
Probability and the sampling distribution
Conclusion
Checklist: You’ll be the smartest person in the room if you…

02 Brief principles of consumer behaviour and marketing strategy
Introduction
Consumer behaviour as the basis for marketing strategy
Overview of consumer behaviour
Overview of marketing strategy
Conclusion
Checklist: You’ll be the smartest person in the room if you…

PART TWO Dependent variable techniques

03 Modelling dependent variable techniques (with one equation): what are the things that drive demand?
Introduction
Dependent equation type vs inter-relationship type statistics
Deterministic vs probabilistic equations
Business case
Results applied to business case
Modelling elasticity
Technical notes
Highlight: Segmentation and elasticity modelling can maximize revenue in a retail/medical clinic chain: field test results
Abstract
The problem and some background
Description of the dataset
First: segmentation
Then: elasticity modelling
Last: test vs control
Discussion
Conclusion
Checklist: You’ll be the smartest person in the room if you…

04 Who is most likely to buy and how do I target?
Introduction
Conceptual notes
Business case
Results applied to the model
Lift charts
Using the model – collinearity overview
Variable diagnostics
Highlight: Using logistic regression for market basket analysis
Abstract
What is a market basket?
Logistic regression
How to estimate/predict the market basket
Conclusion
Checklist: You’ll be the smartest person in the room if you…

05 When are my customers most likely to buy?
Introduction
Conceptual overview of survival analysis
Business case
More about survival analysis
Model output and interpretation
Conclusion
Highlight: Lifetime value: how predictive analysis is superior to descriptive analysis
Abstract
Descriptive analysis
Predictive analysis
An example
Checklist: You’ll be the smartest person in the room if you…

06 Modelling dependent variable techniques (with more than one equation)
Introduction
What are simultaneous equations?
Why go to the trouble of using simultaneous equations?
Desirable properties of estimators
Business case
Conclusion
Checklist: You’ll be the smartest person in the room if you…

PART THREE Inter-relationship techniques

07 Modelling inter-relationship techniques: what does my (customer) market look like?
Introduction
Introduction to segmentation
What is segmentation? What is a segment?
Why segment? Strategic uses of segmentation
The four Ps of strategic marketing
Criteria for actionable segmentation
A priori or not?
Conceptual process
Checklist: You’ll be the smartest person in the room if you…

08 Segmentation: tools and techniques
Overview
Metrics of successful segmentation
General analytic techniques
Business case
Analytics
Comments/details on individual segments
K-means compared to LCA
Highlight: Why Go Beyond RFM?
Abstract
What is RFM?
What is behavioural segmentation?
What does behavioural segmentation provide that RFM does not?
Conclusion
Segmentation techniques
Checklist: You’ll be the smartest person in the room if you…

PART FOUR Other

09 Marketing research
Introduction
How is survey data different than database data?
Missing value imputation
Combating respondent fatigue
A far too brief account of conjoint analysis
Structural equation modelling (SEM)
Checklist: You’ll be the smartest person in the room if you…

10 Statistical testing: how do I know what works?
Everyone wants to test
Sample size equation: use the lift measure
A/B testing and full factorial differences
Business case
Checklist: You’ll be the smartest person in the room if you…

PART FIVE Capstone

11 Capstone: focusing on digital analytics
Introduction
Modelling engagement
Business case
Model conception
How do I model multiple channels?
Conclusion

PART SIX Conclusion

12 The Finale: what should you take away from this?
Any other stories/soapbox rants?
What things have I learned that I’d like to pass on to you?
What other things should you take away from all this?

Glossary

Bibliography and further reading

Index

Test banks and data sets relating to chapters are available online at: www.koganpage.com/MarketingAnalytics

Foreword

In Marketing Analytics Mike Grigsby provides a new way of thinking about solving marketing and business problems, with a practical set of solutions. This relevant guide is intended for practitioners across a variety of fields, but is rigorous enough to satisfy the appetite of scholars as well.

I can certainly appreciate Mike’s motivations for the book. This book is his way of giving back to the analytics community by offering advice and step-by-step guidance for ways to solve some of the most common situations, opportunities, and problems in marketing. He knows what works for entry, mid-level, and very experienced career analytics professionals, because this is the kind of guide he would have liked at these stages.

While Mike’s education includes a PhD in Marketing Science, he also pulls from his vast experiences from his start as an Analyst, through his journey to VP of Analytics, to walk the reader through the types of questions and business challenges we face in the analytics field on a regular basis. His authority on the subject matter is obvious, and his enthusiasm is contagious, and best captured by my favourite sentence of his book: ‘Now let’s look at some data and run a model, because that’s where all the fun is.’

What this education and experience means for the rest of us is that we have a well-informed author providing us with insight into the realities of what is needed from the exciting work we do, and how we can not only provide better decision making, but also move the needle on important theoretical and methodological approaches in Analytics.

More specifically, Marketing Analytics covers both inter-relational and dependency-driven analytics and modelling to solve marketing problems. In a light and conversational style (both engaging and surprising) Mike argues that, ultimately, all markets rely on a strong understanding of the ever-changing, difficult to predict, sometimes fuzzy, and elusive minds and hearts of consumers. Anything we can do to better arm ourselves as marketers to develop this understanding is certainly time well spent. Consumers can and should be the focal point of great strategy, operational standards of excellence and processes, tactical decisions, product design, and so much more, which is why it makes perfect sense to better understand not just consumer behaviours, but also consumer thoughts, opinions, and feelings, particularly related to your vertical, competitors, and brand.

After a review of seminal work on consumer behaviour, and an overview of general statistics and statistical techniques, Marketing Analytics dives into realistic business scenarios with the clever use of corporate dialogue between Scott, our fictitious analyst, and his boss. As our protagonist progresses through his career, we see an improvement in his toolkit of analytical techniques. He moves from an entry-level analyst in a cubicle to a senior leader of analytics with staff. The problems become more challenging, and the process for choosing the analytics to apply to the situations presented is an uncanny reflection of reality – at least based on my experiences.

What I appreciate absolutely most about this work though is the full spectrum of problem solving, not just analytics in a vacuum. Mike walks us from the initial moment when a problem is identified, through communication of that problem, framing by the Analytics team, technique selection and execution (from the straightforward to somewhat advanced), communication of results, and usefulness to the company. This rare and certainly more complete picture warrants a title such as Problem Solving using Marketing Analytics in lieu of the shorter title Mike chose.

Marketing Analytics will have you rethinking your methods, developing more innovative ways to progress your marketing analytics techniques, and adjusting your communication practices. Finally, a book we all can use!

Dr Beverly Wright, VP, Analytics, BKV Consulting

Preface

We’ll start by trying to get a few things straight. I did not set out to write a (typical) textbook. I’ll mention some textbooks down the line that might be helpful in some areas, but this is too slim for an academic tome. Leaf through it and you’ll not find any mathematical proofs, nor are there pages upon pages of equations. This is meant to be a gentle overview – more conceptual than statistical – for the marketing analyst who just needs to know how to get on with their job. That is, it’s for those who are, or hope to be, practitioners. This is written with practitioners in mind.

Introduction

Who is the intended audience for this book?

This is not meant to be an academic tome filled with mathematical minutiae and cluttered with statistical mumbo-jumbo. There will need to be an equation now and then, but if your interest is econometric rigour, you’re in the wrong place. A couple of good books for that are Econometric Analysis by William H. Greene (1993) and Econometric Models, Techniques and Applications by Michael Intriligator, Ronald G. Bodkin and Cheng Hsiao (1996). So, this book is not aimed at the statistician, although there will be a fair amount of verbiage about statistics.

This is not meant to be a replacement for a programming manual, even though there will be SAS code sprinkled in now and then. If you’re all about BI (business intelligence), which means mostly reporting and visualizing data, this is not for you.

This will not be a marketing strategy guide, but be aware that as mathematics is the handmaiden of science, marketing analytics is the handmaiden of marketing strategy. There is no point to analytics unless it has a strategic payoff. It’s not what is interesting to the analyst, but what is impactful to the business that is the focus of marketing science.

So, at whom is this book aimed? Not necessarily at the professional econometrician/statistician, but there ought to be some satisfaction here for them. Primarily, the aim is at the practitioner (or those who will be). The intended audience is the business analyst who has to pull a targeted list, the campaign manager who needs to know which promotion worked best, the marketer who must DE-market some segment of her customers to gain efficiency, the marketing researcher who needs to design and implement a satisfaction survey, the pricing analyst who has to set optimal prices between products and brands, etc.

What is marketing science?

As alluded to above, marketing science is the analytic arm of marketing. Marketing science (interchangeable with marketing analytics) seeks to quantify causality. Marketing science is not an oxymoron (like military intelligence, happily married or jumbo shrimp) but is a necessary (although not sufficient) part of marketing strategy. It is more than simply designing campaign test cells. Its overall purpose is to decrease the chance of marketers making a wrong decision. It cannot replace managerial judgment, but it can offer boundaries and guard rails to inform strategic decisions. It encompasses areas from marketing research all the way to database marketing.

Why is marketing science important?

Marketing science quantifies the causality of consumer behaviour. If you don’t know already, consumer behaviour is the centre-point, the hub, the pivot around which all marketing hinges. Any ‘marketing’ that is not about consumer behaviour (understanding it, incenting it, changing it, etc.) is probably heading down the wrong road.

Marketing science gives input/information to the organization. This information is necessary for the very survival of the firm. Much like an organism requires information from its environment in order to change, adapt and evolve, an organization needs to know how its operating environment changes. To not collect and act and evolve based on this information would be death. To survive, for both the organization and the organism, insights (from data) are required. Yes, this is reasoning by analogy but you see what I mean.

Marketing science teases out strategy. Unless you know what causes what, you will not know which lever to pull. Marketing science tells you, for instance, that this segment is sensitive to price, this cohort prefers this marcom (marketing communication) vehicle, this group is under competitive pressure, this population is not loyal, and so on. Knowing which lever to pull (by different consumer groups) allows optimization of your portfolio.

What kind of people in what jobs use marketing science?

Most people in marketing science (also called decision science, analytics, CRM, direct/database marketing, insights, research, etc.) have a quantitative bent. Their education is typically some combination involving statistics, econometrics/economics, mathematics, programming/computer science, business/marketing/marketing research, strategy, intelligence, operations, etc. Their experience certainly touches any and all parts of the above. The ideal analytic person has a strong quantitative orientation as well as a feel for consumer behaviour and the strategies that affect it. As in all marketing, consumer behaviour is the focal point of marketing science.

Marketing science is usually practised in firms that have a CRM or direct/database marketing component, or firms that do marketing research and need to undertake analytics on the survey responses. Forecasting is a part of marketing science, as well as design of experiments (DOE), web analytics and even choice behaviour (conjoint). In short, any quantitative analysis applied to economic/marketing data will have a marketing science application. So while the subjects of analysis are fairly broad, the number of (typical) analytic techniques tends to be fairly narrow. See Consumer Insight by Stone, Bond and Foss (2004) to get a view of this in action.

Why do I think I have something to say about marketing science?

Fair question. My whole career has been involved in marketing analytics. For more than 25 years I’ve done direct marketing, CRM, database marketing, marketing research, decision sciences, forecasting, segmentation, design of experiments and all the rest. While my BBA and MBA are in finance, my PhD is in marketing science. I’ve published a few trade and academic articles, I’ve taught school at both graduate and undergraduate levels and I’ve spoken at conferences, all involved in marketing science. I’ve done all this for firms like Dell, HP, the Gap and Sprint, as well as consultancies like Targetbase. Over the years I’ve gathered a few opinions that I’d like to share with y’all. And yes, I’ve been in Texas for over 15 years.

What is the approach/philosophy of this book?

As with most non-fiction writers, I wrote this because I would have loved to have had it, or something like it, earlier. What I had in mind did not actually exist, as far as I knew.

I had been a practitioner for decades and there were times I just wanted to know what I should do, what analytic technique would best solve the problem I had. I did not need a mathematically-oriented econometrics textbook (like Greene’s, or Kmenta’s Elements of Econometrics (1986), as great as they each are). I did not need a list of statistical techniques (like Multivariate Data Analysis by Hair et al (1998) or Multivariate Statistical Analysis by Sam Kash Kachigan (1991)), as great as each of them also are. What I needed was a (simple) explanation of which technique would address the marketing problem I was working on. I wanted something direct, accessible, and easy to understand so I could use it and then explain it. It was okay if the book went into more technical details later, but first I needed something conceptual to guide in solving a particular problem. What I needed was a marketing-focused book explaining how to use statistical/econometric techniques on marketing problems. It would be ideal if it showed examples and case studies doing just that. Voilà.

Generally this book has the same point of view as books like Peter Kennedy’s A Guide to Econometrics (1998) and Glenn L. Urban and Steven H. Star’s Advanced Marketing Strategy (1991). That is, the techniques will be described in two or three levels. The first is really just conceptual, devoid of mathematics, and the aim is to understand. The next level is more technical, and will use SAS or something else as needed to illustrate what is involved, how to interpret it, etc. Then the final level, if there is one, will be rather technical and aimed really only at the professional. And there will be business cases to offer examples of how analytics solves marketing questions.

One thing I like about Stephan Sorger’s 2013 book, Marketing Analytics, is that in the opening pages he champions action-ability. Marketing science has to be about action-ability. I know some academic purists will read the following pages and gasp that I occasionally allow ‘bad stats’ to creep in. (For example, it is well known that forecasting often is improved if collinear independent variables are found. Shock!) But the point is that even an imperfect model is far more valuable than waiting for academic ivory tower purity. Business is about time and money and even a cloudy insight can help improve targeting. Put simply, this book, and marketing science, is ultimately about what works, not what will be published in an academic research paper.

All of the above will be cast in terms of business problems, that is, in terms of marketing questions. For example, a marketer, say, needs to target his market and he has to learn to do segmentation. Or she has to manage a group that will do segmentation for her (a consultant) and needs to know something about it in order to question it intelligently. The problem will be addressed in terms of what segmentation is, what it means to strategy, why do it, etc. Then a description of several analytic techniques used for segmentation will be detailed. Then a fairly involved and technical discussion will show additional statistical output, and an example or two will be shown. This output will use SAS (or SPSS, etc.) as necessary. This will also help guide students as they prepare to become analysts.

Therefore, the philosophy is to present a business case (a need to answer the marketing questions) and describe conceptually various marketing science techniques (in two or three increasingly detailed levels) that can answer those questions. Then output (from, say, SAS) will be developed that shows how the technique works, how to interpret it and how to use it to solve the business problem. Finally, more technical details may be shown, as needed. Okay?

So, on to a little statistical review.

Part one

Overview

01

A (little) statistical review

Measures of central tendency
Measures of dispersion
The normal distribution
Relations among two variables: covariance and correlation
Probability and the sampling distribution
Conclusion
Checklist: You’ll be the smartest person in the room if you…

You knew we had to do this, have a general review of basic statistics. I promise, it’ll be mostly conceptual, a gentle reminder of what we learned in Introductory Statistics. Also note the Definition Boxes helping to describe key terms, point out jargon, etc.

Measures of central tendency

First we’ll deal with simple descriptive statistics, confined to one variable. We’ll start with measures of central tendency.

Measures of central tendency include the mean, median and mode.

Mean: a descriptive statistic, a measure of central tendency, the mean is a calculation summing up the value of all the observations and dividing by the number of observations.

The mean is calculated as:

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

That is, sum all the observations up (all the individual Xs) and then divide by the number of observations (n). This is commonly called ‘the average’ but I’d like to offer a different view of ‘average’.

Average: the most representative measure of central tendency, NOT necessarily the mean.

Average is the measure of central tendency, the number most likely to occur, the most representative number. That is, it might not be the mean; it could be the median or even the mode. This is our first incursion into a statistical way of thinking.

I’d like to persuade you that it’s possible, for example, that the median is more representative than the mean, in some cases – and that in those cases the median is the average, the most representative number.

Median: the middle observation in an odd number of observations, or the mean of the middle two observations in an even number of observations.

The median is, by definition, the number in the middle, the 50th percentile, that value that has just as many observations above it as below it.

Consider home sales prices via Figure 1.1. The mean is 141,000 but the median is 110,000. Which number is most representative? I submit it is not the mean, but the median. I also submit that the best measure of central tendency, in this example, is the median. Therefore the median is the average. I know that’s not what you learned in third grade, but get used to it. Statistics has a way of turning one slightly askew.

Figure 1.1 Home sales prices

Just to be clear, I suggest that the measure of central tendency that best describes the histogram above should be called ‘average’. Mode is the number that appears most often, median is the observation in the middle and mean is the observations summed over their count.

Mode: the number that appears most often.

Average is the most representative number. Of course it doesn’t help this argument that Excel uses =AVERAGE() as the function to calculate the mean instead of =MEAN(). I’ve tried asking Bill about it but he’s not returned my calls, so far.
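To make the mean-versus-median point concrete, here is a minimal Python sketch (Python is an assumption here; the book itself leans on SAS). The price list is made up so that its mean and median match the 141,000 and 110,000 quoted above; it is not the data behind Figure 1.1.

```python
from statistics import mean, median, mode

# Hypothetical home sale prices (illustrative only, NOT the Figure 1.1 data).
# Most homes sell near 110,000; two expensive sales drag the mean upward.
prices = [95_000, 100_000, 110_000, 110_000, 110_000,
          120_000, 125_000, 210_000, 289_000]

price_mean = mean(prices)      # sum of the prices divided by their count
price_median = median(prices)  # middle value once the prices are sorted
price_mode = mode(prices)      # the price that appears most often

print("mean  :", price_mean)
print("median:", price_median)
print("mode  :", price_mode)
```

With a long right tail like this one, the median and mode sit where most of the sales are, while the mean is pulled toward the two expensive homes.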

Measures of dispersion

Measures of central tendency alone do not adequately describe the variable (a variable is a thing that varies, like home sales prices). The other dimension of a variable is dispersion, or spread.

There are three measures of dispersion: range, variance and standard deviation.

Range: a measure of dispersion or spread, calculated as the maximum value less the minimum value.

Range is easy. It’s simply the minimum (smallest value) observation subtracted from the maximum (largest value). It’s not particularly useful, especially in a marketing context.

Variance is another measure of dispersion or spread.

Variance: a measure of spread, calculated as the summed square of each observation less the mean, divided by the count of observations less one.

Conceptually it takes each observation and subtracts the mean of all the observations from it, then squares each of those differences and adds up the squares. That quantity is divided by n – 1, the total number of observations, less one. The formula is below. Note this is the sample formula, not the formula for the population.

$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

(Note that X-bar is the symbol for sample mean, while µ would be the symbol to use for population mean; s would be the symbol to use for sample standard deviation and σ would be the symbol to use for population standard deviation.)

Now, what does variance tell us? Unfortunately, not much. It says that (from Table 1.1) this variable of 18 observations has a mean of 25 and a variance, or spread, of 173.6. But variance gets us to the standard deviation, which DOES mean something.

Table 1.1 Variance

X         X – mean    Squared
2         –23         529.3
5         –20         400.3
8         –17         289.2
10.9      –14.1       199.3
13.9      –11.1       123.6
16.9      –8.1        65.9
19.9      –5.1        26.2
22.9      –2.1        4.5
25.9      0.9         0.8
28.9      3.9         15.1
31.9      6.9         47.4
33        8           63.9
34        9           80.9
35        10          99.9
36        11          120.9
39        14          195.8
42        17          288.8
45        20          399.7

Mean = 25.0           Sum = 2,951.3
Count = 18            Variance = 173.6

Standard deviation: the square root of variance.

Standard deviation is calculated by taking the square root of variance. In this case the square root of 173.6 is 13.17. Now, what does 13.17 mean? It describes spread or dispersion in the same units as the variable itself, and there are known qualities of a standard deviation. In a fairly normal distribution dispersion is spread around the mean (which equals the mode which equals the median). That is, there is a symmetrical spread around the mean of 25. In this case the spread is 25 +/– 13.17. That means that, in general, one standard deviation (+/– 13.17) from the mean will contain about 68% of all observations: see Figure 1.2. (The central limit theorem, discussed under the normal distribution, says that as the count increases the distribution of sample means approaches normal.) In a normal (bell-shaped) curve, 50% of all observations fall to the left of the mean and 50% of all observations fall to the right of the mean. Knowing the standard deviation gives information about the variable that cannot be obtained any other way.

Figure 1.2 Standard deviation
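The arithmetic behind Table 1.1 can be retraced with a short Python sketch (Python is an assumption; the book's own examples use SAS). It uses the X values as printed in the table, which are rounded, so the results agree with the table only to about one decimal place.

```python
from math import sqrt

# X column of Table 1.1, as printed (the book's values are rounded).
x = [2, 5, 8, 10.9, 13.9, 16.9, 19.9, 22.9, 25.9,
     28.9, 31.9, 33, 34, 35, 36, 39, 42, 45]

n = len(x)
x_bar = sum(x) / n                                   # sample mean
squared_deviations = [(xi - x_bar) ** 2 for xi in x]  # (X - mean) squared
sample_variance = sum(squared_deviations) / (n - 1)   # divide by n - 1
sample_sd = sqrt(sample_variance)                     # standard deviation

print(f"count    = {n}")
print(f"mean     = {x_bar:.1f}")
print(f"variance = {sample_variance:.1f}")
print(f"std dev  = {sample_sd:.2f}")
```

Dividing by n – 1 rather than n is what makes this the sample formula; Python's built-in `statistics.variance` and `statistics.stdev` make the same choice.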

So, saying a variable has a mean of 25 and a standard deviation of 13.17 automatically means that (if it is roughly normal) about 68% of all observations are between 11.8 and 38.2. This immediately tells me that if I find an observation that is < 11.8, it is a little rare, or unusual, given that 68% will be > 11.8 (and < 38.2).

So, one standard deviation accounts for 34% below the mean and 34% above the mean. The second standard deviation accounts for another 14% on each side and the third accounts for about 2%. This means that three standard deviations to the left of the mean account for 34% + 14% + 2%, or nearly 50% of all observations. Likewise for the positive/right side of the mean.

As an example, it is well known that IQ has a mean of 100 and a standard deviation of about 15. This means that 34% of the population should fall between 100 and 115, because one standard deviation above the mean is 100 + 15, or 115. The second standard deviation accounts for another 14%, so 48% (34% + 14%) of the population should be between 100 and 130. Finally, only about 2% will be more than 2 standard deviations above the mean, that is, have an IQ > 130. So you see how useful the standard deviation is. It immediately gives more information about the spread, or how likely or unusual particular observations are. For example, if we had an IQ test that showed 150, this is a VERY rare event, in that it’s in the realm of more than 3 standard deviations: 100–115 is 1, 115–130 is 2, 130–145 is 3 and 150 is 3.33 standard deviations above the mean.
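The rounded 34% / 14% / 2% bands can be checked against the exact normal curve. The sketch below is only an illustration in Python (an assumption; it is not from the book) and uses the standard library's `NormalDist` with the IQ mean of 100 and standard deviation of 15 quoted above.

```python
from statistics import NormalDist

# IQ treated as a normal variable with the mean and standard deviation
# quoted in the text.
iq = NormalDist(mu=100, sigma=15)

within_one_sd = iq.cdf(115) - iq.cdf(85)   # mean +/- 1 standard deviation
first_band = iq.cdf(115) - iq.cdf(100)     # 100 to 115
second_band = iq.cdf(130) - iq.cdf(115)    # 115 to 130
third_band = iq.cdf(145) - iq.cdf(130)     # 130 to 145
above_130 = 1 - iq.cdf(130)                # more than 2 sd above the mean
above_150 = 1 - iq.cdf(150)                # more than 3.33 sd above the mean

for label, share in [("85 to 115", within_one_sd),
                     ("100 to 115", first_band),
                     ("115 to 130", second_band),
                     ("130 to 145", third_band),
                     ("above 130", above_130),
                     ("above 150", above_150)]:
    print(f"{label:<12}{share:8.2%}")
```

The exact shares come out close to the rounded figures in the text, and an IQ above 150 turns out to be rarer than one person in a thousand.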

The normal distribution

I’ve already mentioned the normal distribution but let’s say a couple more clarifying things about it. The normal distribution is the traditional bell-shaped curve. One characteristic of a normal distribution is that the mean and the median and the mode are virtually the same number. The normal distribution is symmetric about the measure of central tendency (mean, median and mode) and the standard deviation describes the spread, as above.

Let’s also mention the central limit theorem. This simply means that as n, or the count, increases, the distribution of the sample mean approaches a normal distribution, whatever the shape of the underlying variable. This is what allows us to treat many statistics as approximately normal.
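A small simulation can make the central limit theorem less abstract. This is a hedged sketch in Python (an assumption, not the book's method): the exponential population, the sample size of 30 and the seed are all arbitrary choices made for illustration. It uses the gap between mean and median as a rough signal of skew, since the two coincide in a symmetric, normal-looking distribution.

```python
import random
from statistics import fmean, median

random.seed(42)  # fixed seed so the run is repeatable


def draw():
    """One observation from a right-skewed population (exponential, mean 1)."""
    return random.expovariate(1.0)


# 10,000 raw observations: strongly skewed, so mean and median differ.
raw = [draw() for _ in range(10_000)]

# 10,000 sample means, each computed from 30 fresh observations.
sample_means = [fmean(draw() for _ in range(30)) for _ in range(10_000)]

raw_gap = fmean(raw) - median(raw)
means_gap = fmean(sample_means) - median(sample_means)

print(f"raw data:     mean - median = {raw_gap:.3f}")
print(f"sample means: mean - median = {means_gap:.3f}")
```

The raw observations stay skewed however many are collected; it is the distribution of the sample means that tightens into a nearly symmetric bell shape.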

Now for a quick word about z-scores as this will be handy later.

Z-score: a metric describing how many standard deviations an observation is from its mean.

A z-score is a measure of the number of standard deviations an observation is from its mean. It converts an observation into the number of standard deviations above or below the mean by taking the observation (Xi) and subtracting the mean from it and then dividing that quantity by the standard deviation:

$$z_i = \frac{X_i - \bar{X}}{s}$$

In terms of IQ, an observation of 107.5 will have a z-score of (107.5 – 100)/15, or 0.5. This means that an IQ of 107.5 is one-half a standard deviation above the mean. Since 34% of observations (from 100–115) lie within one standard deviation above the mean, and the normal curve is tallest near the mean, slightly more than half of that 34%, about 19%, lies between the mean and a z-score of 0.5. This means this observation sits about 19 percentage points above the midpoint (the 50th percentile), or is greater than about 69% of the population. Note that the remaining 31% or so are above this observation.
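The z-score arithmetic, and the share of the population that falls below a given z-score, can be sketched as follows. This is an illustrative Python snippet (an assumption, not from the book); the helper name `z_score` is hypothetical. Because the normal curve is not flat between the mean and one standard deviation, the exact share is a little higher than a simple halving of 34% would suggest.

```python
from statistics import NormalDist


def z_score(observation, mean, sd):
    """Number of standard deviations the observation lies from the mean."""
    return (observation - mean) / sd


z = z_score(107.5, mean=100, sd=15)   # the IQ example from the text
share_below = NormalDist().cdf(z)     # standard normal: mean 0, sd 1
share_above = 1 - share_below

print(f"z-score     = {z}")
print(f"share below = {share_below:.1%}")
print(f"share above = {share_above:.1%}")
```

The same two lines work for any observation once the mean and standard deviation are known, which is what makes z-scores a convenient common scale.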

Relationsamongtwovariables:covarianceandcorrelationAlloftheabovedescriptivediscussionswereaboutonevariable.Rememberthatavariableisanitemthattakesonmultiplevalues.Thatis,avariableisathingthatvaries.Nowlet’stalkabouthavingtwovariablesandthedescriptivemeasuresofthem.

Covariance

Covariance, like variance, is how one variable varies in terms of another variable.

Covariance: the dispersion or spread of two variables.

It, like variance, does not mean much; it’s just a number. It has no scale, nor boundaries, and interpretation is minimal. The formula is:

$$\mathrm{Cov}(X, Y) = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})$$

It merely describes how each X observation varies from its mean, in terms of how each Y observation varies from its mean. Then sum these up and divide by n, the count. Again, the number is nearly irrelevant.

Say we have the data set in Table 1.2. Note the covariance is 77.05, which again means very little.

Table 1.2 Covariance and correlation

X      Y
2      3
4      5
6      7
8      9
9      9
11     11
11     8
13     10
15     12
17     14
19     16
21     22
22     22
24     11
26     12
28     22
30     24
32     26
33     28
33     39

Covar = 77.05
Correl = 87.90%

Correlation

Correlation, like standard deviation, does have a meaning, and an important one.

Correlation: A measure of both strength and direction, calculated as the covariance of X and Y divided by the standard deviation of X * the standard deviation of Y.

Correlation expresses both strength and direction of the two variables. It ranges from –100% to +100%. A negative correlation means that as, say, X goes up, Y tends to go down. A very strong positive correlation (say 80% or 90%) means that as X goes up, Y goes up very consistently: when X rises by one standard deviation, Y tends to rise by about 0.8 or 0.9 of its own standard deviation. Note that in Table 1.2 the correlation is 87.9%, which is a very strong relationship between X and Y. To go from covariance to correlation, covariance is divided by the standard deviation of X multiplied by the standard deviation of Y. The formula is:

$$r_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \, \sigma_Y}$$
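The two measures can be computed directly from the Table 1.2 data. The sketch below follows the verbal definitions in the text (deviations from the mean, summed and divided by n); the function names are illustrative, not from any particular library.

```python
from math import sqrt

# Data from Table 1.2.
x = [2, 4, 6, 8, 9, 11, 11, 13, 15, 17, 19, 21, 22, 24, 26, 28, 30, 32, 33, 33]
y = [3, 5, 7, 9, 9, 11, 8, 10, 12, 14, 16, 22, 22, 11, 12, 22, 24, 26, 28, 39]

def mean(values):
    return sum(values) / len(values)

def covariance(a, b):
    """Average product of deviations from the means (dividing by n)."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / len(a)

def std_dev(a):
    """Standard deviation, dividing by n to match the covariance above."""
    return sqrt(covariance(a, a))

def correlation(a, b):
    """Covariance scaled by both standard deviations."""
    return covariance(a, b) / (std_dev(a) * std_dev(b))

cov_xy = covariance(x, y)
corr_xy = correlation(x, y)
print(f"Covar  = {cov_xy:.2f}")
print(f"Correl = {corr_xy:.2%}")
```

The covariance matches the 77.05 in the table. The correlation lands within a few hundredths of a percentage point of the printed 87.90%; the small gap is presumably rounding in the source.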

Probability and the sampling distribution

Probability is an important concept in statistics of course and I’ll only touch on it here.

First, let’s talk about two kinds of thinking: deductive and inductive. Deductive thinking is what you are most familiar with: based on rules of logic and conclusions from causality. Because of this thing, this conclusion must be true. However, statistical thinking is inductive, not deductive. Inductive thinking reasons from sample to population. That is, statistics is about inferences and generalizing the conclusion. This is where probability comes in. Typically, in marketing, we never have the whole population of a data set: we have a sample.

Here’s where it gets a little theoretical. Say we have a sample of data on X that contains 1,000 observations with a mean of 50. Now, theoretically, we could have an infinite number of samples that have a variety of means. Indeed, we never know where our sample is (with its mean of 50) in the total possibility of samples. If we did have a large number of samples drawn from the population and we calculated the means of those samples, that would constitute a sampling distribution.

For example, say we have a barrel containing 100,000 marbles. That is the whole population. 10% of these marbles are red and 90% of these marbles are white. We can only draw a sample of 100 at a time and calculate the share of red marbles.

In this case (contrived as it is) we KNOW the average share of red marbles drawn, overall, will be 10%. But note – and this is important – there is no guarantee that any one of our samples of 100 will actually be 10%. It could be 5% (3.39% of the time it will be) and it could be 14% (5.13% of the time it will be). It will, of course, on average, be 10%. Indeed, only 13.19% of the samples drawn will actually be 10%! The binomial distribution tells us the above facts.
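The percentages quoted for the marble example follow from the binomial formula. Here is a minimal sketch, assuming each of the 100 draws is independent with a 10% chance of red (a reasonable approximation when the barrel holds 100,000 marbles); `binomial_probability` is an illustrative name.

```python
from math import comb

def binomial_probability(successes, draws, p):
    """Probability of exactly `successes` red marbles in `draws` draws."""
    return comb(draws, successes) * p**successes * (1 - p) ** (draws - successes)

draws, p_red = 100, 0.10

for reds in (5, 10, 14):
    prob = binomial_probability(reds, draws, p_red)
    print(f"{reds} red out of {draws}: {prob:.2%}")
```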

Therefore, we could have drawn an unusual sample that had only 5% red marbles. This would occur 3.39% of the time, roughly 1 out of 30. That’s not that rare. And we have in actuality no way to really know how likely the sample we have is to contain the population mean of 10%. This is where confidence intervals come in, which will be dealt with later in statistical testing.

Conclusion

That’s all I want to mention in terms of statistical background. More will be applied later. Now let’s get on with the fun.

Checklist

You’ll be the smartest person in the room if you:

Remember three measures of central tendency: mean, median and mode.

Remember three measures of dispersion: range, variance and standard deviation.

Constantly point out the real definition of average as ‘the most representative number’, that is, it might NOT necessarily be the mean.

Always look at a metric in terms of both central tendency as well as dispersion.

Think of a z-score as a measure of the likelihood of an observation occurring.

Observe that correlation is about two dimensions: strength and direction.

02

Brief principles of consumer behaviour and marketing strategy

Introduction

Consumer behaviour as the basis for marketing strategy

Overview of consumer behaviour

Overview of marketing strategy

Conclusion

Checklist: You’ll be the smartest person in the room if you…

Introduction

You will note that I have tied two subjects together in this chapter: consumer behaviour and marketing strategy. That’s because marketing strategy is all about understanding consumer behaviour and incentivizing it in such a way that the firm and the consumer both win. I know a lot of marketers will be saying, ‘But what about competitors? Are they not part of marketing strategy?’ And the answer is, ‘No, not really.’ I am aware of the gasps this will cause.

By understanding consumer behaviour, part of that insight will come from what experience consumers have with competitors, but the focus is on consumer, not competitive, behaviour. I know John Nash and his work in game theory takes a back seat in my view, but this is on purpose. Much like the financial motto ‘watch the pennies and the dollars will follow’, I say, ‘focus on the consumer and competitive understanding will follow’.

Just to be clear, marketing science should be at the consumer level, NOT the competitive level. By focusing on competitors you automatically move from a marketing point of view toward a financial/economic point of view.

Consumer behaviour as the basis for marketing strategy

In marketing, the consumer is central

I like to use Steven P. Schnaars’ Marketing Strategy because of the focus on consumer behaviour (Schnaars, 1997). And because he’s right. A marketing orientation is consumer-centric; anything else is by definition NOT marketing. Marketing drives financial results and in order to be marketing-oriented there must be a consumer-centric focus. That means all marketing activities are geared to learn and understand consumer (and ultimately customer) behaviour.

The marketing concept does not mean giving the consumer (only) what they want, because:

1. the consumer’s wants can be widely divergent;
2. the consumer’s wants contradict the firm’s minimum needs; and
3. the consumer might not know what they want.

It is marketing’s job to learn and understand and incentivize consumer behaviour to a win-win position.

The objection from product-centric marketers

As a fair argument, consumer-centricity runs contra to product managers. Product managers focus on developing products and THEN finding consumers to buy them. (Immediate examples that spring to mind come from technology, such as original HP, Apple, etc.) This sometimes works, but often it does not. The poster child for product focus regardless of what consumers think they want is Chrysler’s minivan strategy. The story is that Chrysler chief Lee Iacocca wanted to design and produce the minivan but the market research they did told him there was no demand for it. Consumers were confused by the ‘halfway between a car and a conversion (full-size) van’ and were not interested in it. Iacocca went ahead and designed and built it and it basically saved Chrysler. What is the point? One point is that consumers do not always know what they want, especially with a new/innovative product they have no experience with. The second point is that not everyone has the genius of Lee Iacocca.

Overview of consumer behaviour

Background of consumer behaviour

A simple view of consumer behaviour is best understood in the microeconomic analysis of ‘the consumer problem’. This is generally summarized in three questions:

1. What are consumers’ preferences (in terms of goods/services)?
2. What are consumers’ constraints (allocating limited budgets)?
3. Given limited resources, what are consumers’ choices?

This assumes that consumers are rational and have a desire to maximize their satisfaction.

Let’s talk about general assumptions of consumer preferences. The first is that preferences are complete, meaning consumers can compare and rank all products. The second assumption is that preferences are transitive. This is the mathematic requirement that if X is preferred to Y and Y is preferred to Z then X is preferred to Z. The third assumption is that products are desirable (a ‘good’ is good or of value). This means that more is better (costs notwithstanding).

A quick look into the assumptions above makes it clear that they are made in order to do the mathematics. This ultimately means that curves will be produced (the bane of most microeconomics students) that lend themselves to simple graphics. This immediately leads into using the calculus for analytic reasons. Calculus requires smooth curves and twice differentiability in order to work. THIS means that some heroic assumptions indeed are required, especially ceteris paribus (holding all other things constant).

The decision process

Consumers go through a shopping-purchasing process, using decision analytics to come to a choice. It should be recognized that not all decisions are equally important or complex. Based on the risk of a wrong choice, either extended problem solving or limited problem solving will tend to be used.

Extended problem solving is used when the cost of the product is high, or the product will be lived with for a long time, or it’s the initial purchase, etc. Something about the choice requires more thought, evaluation and rigour.

Limited problem solving is of course the opposite. When products are inexpensive, short lived, not really important or with low risk of a ‘wrong’ decision, limited problem solving is used. Often one or more of the (below) steps are omitted. The choice is more automatic. The choice is usually reduced to a rule: what experience the consumer has had before, what brand they have disliked, what price is low enough, what their neighbours have told them, etc.

The typical decision process in terms of consumer behaviour (for example, see Consumer Behavior by Engel, Blackwell and Miniard, 1995) is about need recognition, search for information, information processing, alternative evaluation, purchase and post-purchase evaluation. There are marketing opportunities along each step to influence and incent.

Need recognition

The initiator of the consumer decision process is need recognition. This is a realization that there is a ‘cognitive dissonance’ between some ideal state and the current state. There is much advertising around need arousal. From educating consumers on real needs (survival, satisfaction) to informing consumers about pseudo-needs (‘jump on the bandwagon – all of your friends have already done it!’) need arousal is where it starts.

Search for information

Now the consumer recalls what they have heard or what they know about the product to infer, depending on whether the product requires limited or extensive engagement, an ability to make a decision. Obviously advertising and branding come into play here, informing consumers of benefits, differentiation, etc.

Information processing

The next step is for the consumer to absorb what information they have and what facts they know. Most marketing messaging strategies prefer for consumers to NOT process information, but to recall such things as positive brand exposure, satisfaction from previous interactions or emotional loyalty. If consumers do not ‘process’ information (ie critically evaluate costs and benefits) then they can use brand equity/satisfaction to make the shorthand decision. It is marketing science’s job to find those that are considering, distinct from those that have ‘already decided’.

Pre-purchase alternative evaluation

Now, after information has been processed, comes the critical final comparison: does the potential product have attributes the consumer considers greater than the consumer’s standards? That is, given budgetary standards, what is the product likely to offer in terms of satisfaction (economic utility) after the consumer has decided it is above minimum qualifications?

Purchase

Finally, the whole point of the marketing funnel is purchase. A sale is the last piece. This is the decision of the consumer based on the shopping process described above. The actual purchase action carries within it all the above (and below) processes and all of the actual and perceived product attributes.

Post-purchase evaluation

But the consumer decision process does not (usually) end with purchase. Generally there is a comparison of the utility the consumer thought (hoped) would be gained from consuming the product with the actual (perceived) satisfaction received from the product. That is, the creation of loyalty starts post purchase.

Now, with consumer behaviour centrally located, let’s think about a firm’s strategy. Keep the differences between competitive moves and consumer behaviour firmly in mind.

Overview of marketing strategy

The above was to focus on consumer behaviour. Marketing, to be marketing, is about understanding and incentivizing consumer behaviour in such a way that both the consumer and the firm get what they want. Consumers want a product that they need when they need it at a price that gives them value through a channel they prefer. Firms want loyalty, customer satisfaction and growth. Since a market is a place where buyers and sellers meet, marketing is the function that moves the buyers and sellers toward each other.

Given the above, it should be noted that marketing strategy has evolved (primarily via microeconomics) to a firm vs. firm rivalry. That is, marketing strategy is in danger of forgetting the focus on consumer behaviour and jumping deep into something like game theory wherein one firm competes with another firm.

Everything that follows about marketing strategy can be thought of as an indirect consequence of firm vs. firm based on a direct consequence of focusing on consumer behaviour. That is, fighting a firm means incentivizing consumers. Think of it as an iceberg: what is seen (firms competing) is the tip above the surface, but what is really happening that moves the iceberg is unseen (from the other firm’s point of view) below the surface (incentivizing consumers).

Types of marketing strategy

Everyone should be aware of Michael Porter and his monumental article and book about competitive strategy (Porter, 1979/1980). This is where marketing strategy became a discipline.

First Porter detailed factors creating competitive intensity. (To make an obvious point: what are firms competing over? Consumer loyalty.) These factors are the bargaining power of suppliers, the bargaining power of buyers, the threat of new entrants, the rivalry among existing firms and the threat of substitute products:

The bargaining power of buyers means firms lose profit from powerful buyers demanding lower prices. This means consumers are sensitive to price.

The bargaining power of suppliers means firms lose profit due to potential increased factor (input) prices. Suppliers only have bargaining power because a firm’s margins are low, because a firm cannot raise prices, because consumers are sensitive to price.

The threat of new entrants lowers profits due to new competitors entering the market. Again, consumers are sensitive to price and very informed about the other firm’s offerings.

The intensity of rivalry causes lower prices because of the zero sum game supplied by consumers. There are only a certain number of potential loyal customers and if a firm gains one then another firm loses that one.

The threat of substitute products invites consumers to choose among the lower-priced products.

Note how all of this strategy (which appears like firms fighting other firms) is actually based on consumer behaviour. Am I putting too fine a point on this? Maybe, but it does help us focus, right?

Based on these factors a firm can ascertain the intensity of competition. The more competitive the industry is, the more a firm must be a price taker, that is, they have little market power, meaning little control over price. This affects the amount of profit each firm in the industry can expect. Given this, a firm can evaluate their strengths and weaknesses and decide how to compete. Or not.

Porter then did a brilliant thing: he devised, based on the above, three generic strategies. A firm can compete on costs (be the low-cost provider), a firm can differentiate and focus on high-end products or a firm can segment and focus on a smaller, niche part of the market. The point is the firm needs to create and adhere to a particular strategy. Often firms are diluted and do everything at once.

However, Treacy and Wiersema took Porter’s framework and evolved it (Treacy and Wiersema, 1997). They too came up with three strategies (disciplines): operational excellence (basically a focus on lower costs), product leadership (a focus on higher-end differentiated products) and customer intimacy (a differentiation/segmentation strategy). You can see their use and extension of Porter’s ideas. Both have the same bottom line: firms should be disciplined and concentrate their efforts corporate-wide on primarily one (and only one) strategic focus.

Applied to consumer behaviour

Stephan Sorger’s excellent Marketing Analytics (Sorger, 2013) has a brief description of competitive moves, both offensive and defensive. Summaries of each move, but applied via consumer behaviour, are now considered.

Defensive reactions to competitor moves

Bypass attack (the attacking firm expands into one of our product areas): the correct counter is for us to constantly explore new areas. Remember Theodore Levitt’s Marketing Myopia (Levitt, 1960)? If not, re-read it; you know you had to in school.

Encirclement attack (the attacking firm tries to overpower us with larger forces): the correct counter is to message how our products are superior/unique and of more value. This requires a constant monitoring of message effectiveness.

Flank attack (the attacking firm tries to exploit our weaknesses): the correct counter is to not have any weaknesses. This again requires monitoring and messaging the uniqueness/value of our products.

Frontal attack (the attacking firm aims at our strength): the correct counter is to attack back in the attacking firm’s territory. Obviously this is a rarely used technique.

Offensive actions

New market segments: this uses behavioural segmentation (see the latter chapters on segmentation) and incents consumer behaviour for a win-win relationship.

Go-to-market approaches: this learns about consumers’ preferences in terms of bundling, channels, buying plans, etc.

Differentiating functionality: this approach extends consumers’ needs by offering product and purchase combinations most compelling to potential customers.

Conclusion

The above was a brief introduction on both consumer behaviour and how that behaviour applies to marketing strategy. The over-arching point is that marketing science (and marketing research, marketing strategy, etc.) should all be focused on consumer behaviour. Good marketing is consumer-centric. Have you heard that before?

Checklist

You’ll be the smartest person in the room if you:

Remember that in marketing, the consumer is central, NOT THE FIRM.

Point out the consumer’s problem is always how to maximize utilization/satisfaction while managing a limited budget.

Think about the consumer’s decision process while undertaking all analytic projects.

Recall that strategy is a focus on consumer behaviour, not competitive behaviour.

Remember that both Porter and Treacy and Wiersema provide three general strategies.

Observe that competitive combat can be thought of in terms of consumer behaviour.

Part two

Dependent variable techniques

03

Modelling dependent variable techniques (with one equation)

What are the things that drive demand?

Introduction

Dependent equation type vs inter-relationship type statistics

Deterministic vs probabilistic equations

Business case

Results applied to business case

Modelling elasticity

Technical notes

Highlight: Segmentation and elasticity modelling can maximize revenue in a retail/medical clinic chain: field test results

Checklist: You’ll be the smartest person in the room if you…

Introduction

Now, on to the first marketing problem: determining and quantifying those things that drive demand. Marketing is about consumer behaviour (which I’ve touched on but about which I will have more to say later) and the point of marketing is about incentivizing consumers to purchase. These purchases (typically units) are what economists call demand. (By the way, finance is more about supply and the two together are supply and demand. Remember back in Beginning Economics?)

Dependent equation type vs inter-relationship type statistics

Before we dive into the problem at hand, it might be good to back up and give some simple definitions. There are two kinds of (general) statistical techniques: the dependent equation type and the inter-relationship type. Dependent type statistics deal with explicit equations (which can either be deterministic or probabilistic, see below). Inter-relationship techniques are not equations, but the variance between variables. These will be covered/defined later but are types of factor analysis and segmentation. Clearly this current chapter is about an equation.

Deterministic vs probabilistic equations

Now let’s talk about two kinds of equations: deterministic and probabilistic. Deterministic is algebraic (y = mx + b) and the left side exactly equals the right side.

Profit = Revenue – expenses.

If you know two of the quantities you can algebraically solve for the third. This is NOT the kind of equation dealt with in statistics. Of course not.

Statistics deals with probabilistic equations:

Y = a + bXi + e.

Here Y is the dependent variable (say, sales, units or transactions), a is the constant or intercept, X is some independent variable(s) (say, price, advertising, seasonality), b is the coefficient or slope and e is the random error term. It’s this random error term that makes this equation a probabilistic one. Y does not exactly = a + bXi because there is some random disturbance (e) that must be accounted for. Think of it as Y, on average, equals some intercept plus bXi.

As an example, say Sales = constant + price * slope + error, that is, Sales = a + Price * b + e. Note that Y (sales) depends on price, +/–.

BUSINESS CASE

Ok, say we have a guy, Scott, who’s an analytic manager at a PC manufacturing firm. Scott has an MS in economics and has been doing analytics for four years. He started mostly as an SAS programmer and has only recently been using statistical analysis to give insights to drive marketing science.

Scott is called into his boss’s office. His boss is a good strategist with a direct marketing background but is not well versed in econometrics/analytics, etc.

‘Scott’, the boss says, ‘we need to find a way to predict our unit sales. More than that, we need something to help us understand what drives our unit sales. Something that we can use as a lever to help increase sales over the quarter.’

‘A demand model.’ Scott says. ‘Units are a function of, what, price, advertising?’

‘Sure.’

Scott gulps and says, ‘I’ll see what I can do.’

That night he thinks about it and has some ideas. He’ll first have to think about causality (‘Demand is caused by…’) and then he’ll have to get appropriate data.

It’s smart to formulate a theoretic model first, regardless of what data you may or may not have. First, try to understand the data-generating process (‘this is caused by that, and maybe that, etc.’) and then see what data, or proxies for data, can be used to actually construct the model.

It’s also wise to hypothesize the signs of the (independent, right-hand side) variables you think significant in causing your dependent variable to vary. Remember that the dependent variable (left side of the equation) is dependent upon the independent variable(s) (right side of the equation).

For instance, it’s well known that price is probably a significant variable in a unit-demand model and that the sign should be negative. That is, as price goes up, units, on average, should go down. This is the law of demand, the only law in all of economics – except the one that most economic forecasts will be wrong. (‘Economists have predicted 12 of the last 7 recessions.’)

(For you sticklers, yes, there is a ‘Giffen good’. This is an odd product whereby an increasing price causes an increase in demand. Strictly, Giffen goods are inferior staple goods; the luxury items often mentioned in this context, like fine art or wine, are usually called Veblen goods. For the vast majority of products most marketers work on, however, these normal goods are ruled by the law of demand: price goes up, quantity (units) goes down.)

So Scott thinks that price and advertising spend are important in generating demand. Also that there should be something about the season. He’s on the consumer side of the business and it has strong back-to-school and Christmas seasonal spikes.

He thinks he can easily get the number of units sold and the average price of those units. Seasonality is easy; it’s just a variable to account for time of the year, say quarterly. Advertising spend (for the consumer market) might be a little tougher but let’s say he is able to twist some arms and eventually secure a guess as to the average amount of advertising spent on the consumer market, by quarter.

This will be a time series model since it has season and quarterly units, average prices and advertising spend, by time period, quarterly. (There will be some econometric suggestions on time series modelling in the technical section, particularly pertaining to serial correlation.)

For now, let’s make sure there’s a good grasp of the problem. Scott will use a dependent variable technique called ordinary regression (ordinary least squares, OLS) to understand (quantify) how season, advertising spend and price cause (explain the movement of) units sold. This is called a structural analysis: he is trying to understand the structure of the data-generating process. He is attempting to quantify how price, advertising spend and season explain, or cause (most of) the movement in unit sales.

When he’s through he’ll be able to say whether or not advertising spend is significant in causing unit sales (he’ll have to make certain no advertisers are in earshot when he does) and whether December is positive and January negative in terms of moving unit sales, etc.

Now, Scott is ready to design the ordinary regression model.

Conceptual notes

Ordinary regression is a common, well-understood and well-researched statistical technique that has been around over 200 years. Remember that regression is a dependent variable technique, Y = a + bXi + e, where e is a random error term not specifically seen but whose impact is felt in the distribution of the variables.

Ordinary regression: a statistical technique whereby a dependent variable depends on the movement of one or more independent variables (plus an error term).

Simple regression has one independent variable and multiple regression has more than one independent variable, that is:

y = a + b1x1 + b2x2 + … + bnxn, etc.

Scott’s model for his boss will use multiple regression because he has more than one independent variable.

The output of the model will have estimates of how large each variable’s impact is (we’ll see its coefficient or slope) and whether it’s significant or not (based on its variance). This is the heart of structural analysis, quantifying the structure of the demand for PCs.

So, Scott collected data (see Table 3.1) and ran the model

Units = price + advertising

and now sees how the model fits.

Table 3.1 Demand model data

Quarter   Unit sales   Avg price   Ad spend
1         50           1,400       6,250
2         52.5         1,250       6,565
3         55.7         1,199       6,999
4         62.3         1,099       7,799
1         52.5         1,299       6,555
2         59           1,200       7,333
3         58.2         1,211       7,266
..        ..           ..          ..

There is one general measure of goodness of fit: R². R² is the square of the correlation coefficient, in this case the correlation of actual units and predicted units. While correlation measures strength and direction, R² measures shared variance (explanatory power) and ranges from 0%–100%.
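To make the definition concrete, here is a small sketch that fits a one-variable regression and computes R² two ways: as the squared correlation of actual and predicted units, and as the share of variance explained. It uses only the seven rows visible in Table 3.1 and only price as a predictor, so it is an illustration of the calculation and will not reproduce the results of Scott’s model. All function names are hypothetical.

```python
from math import sqrt

# The seven rows shown in Table 3.1 (units and average price only).
units = [50, 52.5, 55.7, 62.3, 52.5, 59, 58.2]
price = [1400, 1250, 1199, 1099, 1299, 1200, 1211]

def mean(values):
    return sum(values) / len(values)

def fit_simple_ols(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def correlation(a, b):
    a_bar, b_bar = mean(a), mean(b)
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)

intercept, slope = fit_simple_ols(price, units)
predicted = [intercept + slope * p for p in price]

# R-squared as the squared correlation of actual and predicted units.
r2_from_correlation = correlation(units, predicted) ** 2

# R-squared as the share of the variance in units that the model explains.
sse = sum((u - p) ** 2 for u, p in zip(units, predicted))
sst = sum((u - mean(units)) ** 2 for u in units)
r2_from_variance = 1 - sse / sst

print(f"slope on price: {slope:.4f}")
print(f"R2 (squared correlation): {r2_from_correlation:.1%}")
print(f"R2 (variance explained):  {r2_from_variance:.1%}")
```

For a least-squares fit with an intercept the two calculations agree, which is why R² can be read as explained variance.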

(An interesting but rather useless bit of trivia is why R² is called R². Yes, R² is the square of R, and R is the correlation coefficient. Correlation is symbolized as the Greek letter rho, ρ. Why? In Greek numerals α = 1, β = 2, etc., and ρ = 100 (kind of like Roman numerals, I = 1, II = 2, C = 100, etc.). Remember that the range of correlation is from –100% to +100%. ρ = rho and in English = R. Now impress your analytic friends.)

Note the data is quarterly, which we’ll address soon enough. Scott runs ordinary regression and finds the output as Table 3.2.

Table 3.2 Ordinary regression

              Ad spend   Avg price   Constant
Coefficient   0.0007     –0.0412     101.83
Stand err     0.0003     0.0047
R²            83%
t-ratio       2.72       –8.67

The first row is the estimated coefficient, or slope. Note that price is negative, as hypothesized. The second row is the standard error, an estimate of the standard deviation of the coefficient estimate, which is a measure of dispersion.

Standard error: an estimate of the standard deviation of a statistic; for a sample mean it is calculated as the standard deviation divided by the square root of the number of observations.

Let’s talk about significance, shall we? In marketing we operate at 95% confidence. Remember z-scores? 1.96 is the z-score for 95% confidence, which is the same as a p-value < 0.05. So, if the absolute value of a t-ratio (which in this case is the coefficient divided by its standard error) is > 1.96, the variable is considered significant. The t-ratio tests the hypothesis that the variable’s true impact is 0; significance means that, if the impact really were 0, there would be less than a 5% chance of seeing a t-ratio this far from 0. 95% of all standard-normal observations will be within +/– 1.96 z-scores.

Notice that advertising spend has a coefficient of 0.0007 (rounded) and a standard error of 0.0003 (rounded). The t-ratio (coefficient divided by its standard error) is 2.72 which is > 1.96 so it is said to be positive and significant. (‘Whew’ the advertisers say.) Likewise price is significant (< –1.96) and negative, as expected.

Now let’s mention fit; how well the model does with just these two variables. R² is the general measure of goodness of fit and in this case is 83%. That is, 83% of the variance between actual and predicted units is shared, or 83% of the movement of the actual dependent variable is ‘explained’ by the independent variables. This can be interpreted as 83% of the movement in the unit sales can be attributed to price and advertising spend. This seems pretty good; that’s a fairly high amount of explanatory power. That’s probably why Scott’s boss wanted him to do this model.

The next step is for Scott to add seasonality, which he hypothesized to be a variable that impacts consumer PC units sold. Scott has quarterly data so this is easy to do. The new model will be units = price + advertising + season.

Let’s talk about dummy variables (binary variables, those with only two values, 1 or 0). These are often called ‘slope shifters’, although strictly speaking an additive dummy shifts the intercept: when turned ‘on’ as a 1, it moves the whole regression line up or down by the size of its coefficient. The idea of a binary variable is to account for changes in two states of nature: on or off, yes or no, purchase or not, respond or not, q1 or not, etc.

Scott’s model is a quarterly model so rather than use one variable called quarter with four values (1, 2, 3, 4) he uses a model with three dummy (binary) variables, q2, q3 and q4, each 0 or 1. This allows him to quantify the impact of the quarter itself. Table 3.3 shows part of the data set.

Table 3.3 Quarterly model

Quarter   Unit sales   Avg price   Ad spend   Q2   Q3   Q4
1         50           1,400       6,250      0    0    0
2         52.5         1,250       6,565      1    0    0
3         55.7         1,199       6,999      0    1    0
4         62.3         1,099       7,799      0    0    1
1         52.5         1,299       6,555      0    0    0
2         59           1,200       7,333      1    0    0
3         58.2         1,211       7,266      0    1    0
4         64.8         999         8,111      0    0    1
1         55           1,299       6,877      0    0    0
2         61.5         1,166       7,688      1    0    0
..        ..           ..          ..         ..   ..   ..

A brief technical note

When using binary variables that form a system, you cannot use them all. That is, for a quarterly model you have to drop one of the quarters. Otherwise the model won’t solve (effectively trying to divide by 0) and you will have fallen into the ‘dummy trap’. So Scott decides to drop q1, which means the interpretation of the coefficients on the quarters amounts to comparing each quarter to q1. That is, q1 is the baseline.
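As a rough sketch of the coding step, the quarter column from Table 3.3 can be turned into the three dummy variables like this. The function name `quarter_dummies` is illustrative; q1 gets no column of its own, so it acts as the baseline.

```python
# Quarter column from the ten rows shown in Table 3.3.
quarters = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2]

def quarter_dummies(quarter):
    """Code a quarter as three 0/1 variables; q1 is the dropped baseline."""
    return {
        "q2": int(quarter == 2),
        "q3": int(quarter == 3),
        "q4": int(quarter == 4),
    }

rows = [quarter_dummies(q) for q in quarters]

for quarter, row in zip(quarters, rows):
    print(quarter, row["q2"], row["q3"], row["q4"])
```

A q1 row is all zeros, which is what keeps the dummies from summing to the constant term and avoids the dummy trap.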

Now let’s talk about the new model’s (Table 3.4) output and diagnostics. Note first that R² improved to 95%, which means adding quarterly data improved the fit of the model. That is, price, advertising spend and season now explain 95% of the movement in unit sales, which is outstanding. It’s a better model. Note the change in price and advertising coefficients.

Table 3.4 Regression output

              Q4      Q3      Q2      Ad spend   Avg price   Constant
Coefficient   3.825   2.689   1.533   0.0011     –0.0275     80.7153
Stand err     1.36    1.157   0.997   0.0003     0.0064      9.8496
R²            95%
t-ratio       2.81    2.32    1.54    4.1        –4.3        8.19

Now, for what it means and how it can be used, the results of the output will be applied next.

Results applied to business case

So now, what does all this tell us? Analytics without application to an actionable strategy is meaningless, much like special effects in a movie without a plot. Looking at the output again, Scott can make some actionable and important structural comments.

Again the R² as a measure of fit is 95%, which means the independent variables do a very good job explaining the movement of unit sales. All of the variables are significant at the 95% level (where the absolute value of the t-ratio is > 1.96) except q2. The coefficients on the variables all have the expected signs. Comparing the quarters to q1 (which was dropped to avoid the dummy trap), Scott sees that they are all positive, which means they are all greater than q1, on average.
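The significance check can be retraced from the printed numbers in Table 3.4. This is only a sketch using the rounded coefficients and standard errors as they appear in the table, so a recomputed t-ratio can differ slightly from the printed one (advertising spend in particular, where both inputs are heavily rounded). The dictionary keys are illustrative labels.

```python
# Coefficients and standard errors as printed (rounded) in Table 3.4.
coefficient = {
    "q4": 3.825, "q3": 2.689, "q2": 1.533,
    "ad_spend": 0.0011, "avg_price": -0.0275, "constant": 80.7153,
}
std_error = {
    "q4": 1.36, "q3": 1.157, "q2": 0.997,
    "ad_spend": 0.0003, "avg_price": 0.0064, "constant": 9.8496,
}

CRITICAL_VALUE = 1.96  # 95% confidence, as used in the text

# t-ratio is the coefficient divided by its standard error.
t_ratio = {name: coefficient[name] / std_error[name] for name in coefficient}
significant = {name: abs(t) > CRITICAL_VALUE for name, t in t_ratio.items()}

for name in coefficient:
    flag = "significant" if significant[name] else "not significant"
    print(f"{name:10s} t = {t_ratio[name]:6.2f}  {flag}")
```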

The powerful thing about ordinary regression is that it parcels out the impact OF each independent variable, taking into account all the other variables. That is, it holds all other variables constant and quantifies the impact of each and every variable, one at a time. This means that, when taking all variables into account, q4 tends to add about 3.825 units more than q1. This is why a binary variable is often called a ‘slope shifter’ (more precisely, it shifts the intercept): just turning ‘on’ q4 adds 3.825 units, regardless of what else is happening in price or advertising spend. Given the very strong seasonal pattern of unit sales these quarterly estimates seem reasonable.

Advertising has a significant and positive impact on unit sales. 0.0011 as a coefficient means every 1,000 increase in advertising spend tends to increase units by 1.1.

Now let’s look at price. The price coefficient is negative, as expected, at –0.0275. When price moves up by, say, 100, units tend to decrease by 2.75. Now, how can this be useful? Just knowing the quantification is valuable, but more important is to calculate price elasticity.
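One way to see these interpretations is to write the fitted equation from Table 3.4 as a function and change one input at a time. This is a sketch using the rounded coefficients as printed; `predict_units` is a made-up name, and the example inputs are taken from the first row of Table 3.3.

```python
# Rounded coefficients from Table 3.4; q1 is the baseline quarter.
CONSTANT = 80.7153
PRICE_COEF = -0.0275
AD_COEF = 0.0011
QUARTER_COEF = {1: 0.0, 2: 1.533, 3: 2.689, 4: 3.825}

def predict_units(avg_price, ad_spend, quarter):
    """Predicted unit sales from the fitted regression equation."""
    return (CONSTANT
            + PRICE_COEF * avg_price
            + AD_COEF * ad_spend
            + QUARTER_COEF[quarter])

base = predict_units(1400, 6250, 1)

# Change one thing at a time, holding everything else constant.
price_effect = predict_units(1500, 6250, 1) - base    # price up by 100
ad_effect = predict_units(1400, 7250, 1) - base       # ad spend up by 1,000
q4_effect = predict_units(1400, 6250, 4) - base       # q4 instead of q1

print(f"baseline prediction: {base:.2f}")
print(f"price +100: {price_effect:+.2f} units")
print(f"ad spend +1,000: {ad_effect:+.2f} units")
print(f"q4 vs q1: {q4_effect:+.3f} units")
```

The baseline prediction of roughly 49 units sits close to the 50 actually recorded for that quarter in Table 3.3.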

Modelling elasticity

Elasticity is a microeconomic calculation that shows the percent change in response given a percent change in stimulus, or in this case, the percent change in units sold given a percent change in price.

Elasticity: a metric with no scale or dimension, calculated as the percent change in an output variable given a percent change in an input variable.

Using a regression equation, the calculation of elasticity is: price coefficient * average price / average quantity (units).

Average price is 1,102 and average quantity of units sold is 63, so the price elasticity calculated here is:

–0.0275 * 1,102 / 63 = –0.48
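The same arithmetic can be written as a small reusable function. This is only a sketch: the function and variable names are illustrative, and the inputs are the coefficient and averages quoted above.

```python
def elasticity_at_means(coefficient, mean_input, mean_output):
    """Elasticity = regression coefficient * (mean of input / mean of output)."""
    return coefficient * mean_input / mean_output

price_coefficient = -0.0275   # price coefficient from the regression output
average_price = 1102          # average price over the sample
average_units = 63            # average units sold over the sample

price_elasticity = elasticity_at_means(price_coefficient, average_price, average_units)
print(round(price_elasticity, 2))  # -0.48
```

The same function works for any other variable in the model (advertising, for example) by swapping in that variable's coefficient and average.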

This means that if price increases by, say, 10%, units sold will decrease by about 4.8%. This is strategically lucrative information allowing Scott and his team to optimize pricing to maximize units sold. There will be more on this topic later.

As a quick review, remember that there are two types of elasticity: elastic and inelastic.

Elastic demand: a place on the demand curve where a change in an input variable produces more than that change in an output variable.

Inelasticity means that an X% increase in price causes a <X% decrease in units sold.

Inelastic demand: a place on the demand curve where a change in an input variable produces less than that change in an output variable.

That is, if price were to increase by, say, 10%, units would decrease (remember the law of demand: if price goes up, quantity goes down) by less than 10%. Meaning, if |elasticity| < 1.00 the demand is inelastic (think of it as units being insensitive to a price change). If |elasticity| > 1.00 the demand is elastic.

The simple reason why elasticity is important to know is that it tells what happens to total revenue, in terms of pricing. In an inelastic demand curve total revenue follows price. So if price were to increase, total revenue would increase. See Table 3.5 below for a mathematical example.

Table 3.5 Elasticity, inelasticity, and total revenue

Inelastic   0.075     Increase price by    10.00%
p1          10.00     p2         11.00     10.00%
u1          1,000     u2         993       –0.75%
tr1         10,000    tr2        10,918    9.20%

Elastic     1.25      Increase price by    10.00%
p1          10.00     p2         11.00     10.00%
u1          1,000     u2         875       –12.50%
tr1         10,000    tr2        9,625     –3.80%
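The rows of Table 3.5 can be reproduced with a few lines of code. This is a minimal sketch, assuming units respond proportionally (new units = old units * (1 + elasticity * percent price change)); the function name is illustrative and the elasticities are entered with their negative sign.

```python
def revenue_after_price_change(price, units, elasticity, price_change):
    """Return (new_price, new_units, new_revenue) after a fractional price change.

    `elasticity` is the signed own-price elasticity (negative under the law of demand).
    """
    new_price = price * (1 + price_change)
    new_units = units * (1 + elasticity * price_change)
    return new_price, new_units, new_price * new_units

for label, elasticity in [("inelastic", -0.075), ("elastic", -1.25)]:
    p2, u2, tr2 = revenue_after_price_change(10.00, 1000, elasticity, 0.10)
    change = tr2 / (10.00 * 1000) - 1
    print(f"{label}: p2={p2:.2f} u2={u2:.1f} tr2={tr2:,.1f} change={change:+.2%}")
```

With the inelastic value, revenue rises with the price increase; with the elastic value, revenue falls, which is the pattern the table shows (the table rounds units and percentages).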

Let me add one quick note about elasticity modelling, something which is a common mistake. It is well known that if the natural logarithm is taken for all data (dependent as well as independent variables) then the elasticity calculation does not have to be done. Elasticity can be read right off the coefficient. That is, the beta coefficient IS the elasticity.

ln(y) = b1 ln(x1) + b2 ln(x2) … + bn ln(xn)

The problem with this is that, while the calculation is easier (taking the price means and the unit means, etc. is not required), modelling all the data in natural logs specifically assumes a constant elasticity. This assumption seems heroic indeed. To say there is the same response to a 5% price change as there is to a 25% price change would strike most marketers as inappropriate. A model in logs would have a constantly concave curve to the origin throughout. For more on modelling elasticity from a marketing point of view, see an article I wrote that appeared in the Canadian Journal of Marketing Research, called ‘Modeling Elasticity’ (Grigsby, 2002).
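To see the constant-elasticity assumption at work, the sketch below compares a linear demand curve with a log-log one at three prices. The numbers are assumptions made for illustration: the linear intercept is chosen so the line passes through the sample means (price 1,102, units 63) and is not the fitted constant, and the log-log beta is simply set to –0.48.

```python
# Linear demand: units = intercept + slope * price.
slope = -0.0275
intercept = 63 - slope * 1102   # illustrative: forces the line through the means

def linear_elasticity(price):
    units = intercept + slope * price
    return slope * price / units

# Log-log demand: ln(units) = ln(scale) + beta * ln(price), i.e. units = scale * price**beta.
beta = -0.48
scale = 63 / 1102 ** beta

def loglog_elasticity(price, step=1e-6):
    # Numerical d ln(units) / d ln(price) using a small proportional price step.
    units_lo = scale * price ** beta
    units_hi = scale * (price * (1 + step)) ** beta
    return (units_hi / units_lo - 1) / step

for price in (900, 1102, 1300):
    print(price, round(linear_elasticity(price), 2), round(loglog_elasticity(price), 2))
```

In the linear model the elasticity grows in absolute size as price rises; in the log-log model it stays at beta no matter where on the curve it is measured.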

Using the model

How is the ordinary regression equation used? That is, how are predicted units calculated?

Note: Figure 3.1 shows the actual as well as the predicted unit sales. The graph shows how well the predicted sales fit the actual sales. The equation is:

Y = a + b1x1 + b2x2 … + bnxn

or

Units = constant + b1*q2 + b2*q3 + b3*q4 + b4*price + b5*advert

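Figure 3.1 Actual and predicted unit sales

For the second observation (Table 3.6), where q2 = 1, this means:

80.7 + (1.5*1) + (2.6*0) + (3.8*0) – (0.0275*1,250) + (0.0011*6,565) ≈ 55.0

which matches the predicted sales of 55.2 in Table 3.6 up to rounding of the coefficients.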

Table 3.6 Average price and ad spend

Quarter   Unit sales   Avg price   Ad spend   Q2   Q3   Q4   Predicted sales
1         50.0         1,400       6,250      0    0    0    49.2
2         52.5         1,250       6,565      1    0    0    55.2
3         55.7         1,199       6,999      0    1    0    58.2
4         62.3         1,099       7,799      0    0    1    63.0
1         52.5         1,299       6,555      0    0    0    52.3
2         59.0         1,200       7,333      1    0    0    57.5
3         58.2         1,211       7,266      0    1    0    58.2
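The predicted column can be approximated by applying the regression equation to each row. This is a sketch, assuming the coefficient values quoted in the text (constant 80.7, q2 1.5, q3 2.6, q4 3.825, price –0.0275, advertising 0.0011); because those are rounded, the results land within a few tenths of the table rather than matching it exactly.

```python
coefficients = {
    "constant": 80.7, "q2": 1.5, "q3": 2.6, "q4": 3.825,
    "price": -0.0275, "advert": 0.0011,
}

# (avg price, ad spend, q2, q3, q4, predicted sales reported in Table 3.6)
rows = [
    (1400, 6250, 0, 0, 0, 49.2),
    (1250, 6565, 1, 0, 0, 55.2),
    (1199, 6999, 0, 1, 0, 58.2),
    (1099, 7799, 0, 0, 1, 63.0),
    (1299, 6555, 0, 0, 0, 52.3),
    (1200, 7333, 1, 0, 0, 57.5),
    (1211, 7266, 0, 1, 0, 58.2),
]

def predict_units(price, advert, q2, q3, q4):
    c = coefficients
    return (c["constant"] + c["q2"] * q2 + c["q3"] * q3 + c["q4"] * q4
            + c["price"] * price + c["advert"] * advert)

predictions = [predict_units(*row[:5]) for row in rows]
for row, predicted in zip(rows, predictions):
    print(f"table {row[5]:.1f}  recomputed {predicted:.1f}")
```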

Technical notes

We’ll go over some detailed background information involving modelling in general and regression in particular now. This will be a little more technical and only necessary for a fuller understanding.

First, be aware that regression carries with it some ‘baggage’: some assumptions that, if violated (and some almost always are to some extent), leave the model with shortcomings, bias, etc. As alluded to earlier, one of the best books on econometrics is Peter Kennedy’s 1998 work A Guide to Econometrics. This is because he explains things first conceptually and then adds more technical/statistical detail, for those that want/need it. He covers the assumptions and failings of the assumptions of regression as well as anyone. My philosophy in this book is similar and this section will add some technical, but not necessarily mathematical, details.

The assumptions

The first assumption – dealing with functional form – is that the dependent variable (unit sales, above) can be modelled as a linear equation. This dependent variable ‘depends’ on the independent variables (season, price and advertising, as above) and some random error term.

The second assumption – dealing with the error term – is that the average value of the error term is zero.

The third assumption – dealing with the error term – is that the error term has similar variance scattered across all the independent variables (homoscedasticity) and that the error term in one period is not correlated with an error term in another (later) period (no serial (or auto) correlation).

The fourth assumption – dealing with independent variables – is that the independent variables are fixed in repeated samples.

The fifth assumption – dealing with independent variables – is that there is no exact correlation between the independent variables (no perfect collinearity).

Each of these assumptions is required for the regression model to work, to be interpretable, to be unbiased, efficient, consistent, etc. A failure of any of these assumptions means something has to be done to the model in order to account for the consequences of a violation of the assumption(s). That is, good model building requires a test for every assumption and, if the model fails the test, a correction to the model must be applied. All this requires an understanding of the consequences of violating every assumption.

All of these will be dealt with as we go through the business cases. But for now, let’s just deal with serial correlation. Serial correlation means the error term in period x is correlated with the error term in period x + 1, all the way through the whole data set. Serial correlation is very common in time series and must be dealt with.

A simple test, called the Durbin-Watson test, is easy to run in SAS to ascertain the extent of serial correlation. If the result of the test is about 2.00 there is not enough autocorrelation to worry about.
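For readers without SAS, the Durbin-Watson statistic is easy to compute directly from a model's residuals: it is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below uses made-up residual series purely to show how the statistic behaves; they are not from the unit-sales model.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: squared successive differences over squared residuals."""
    numerator = sum((curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Made-up residuals, for illustration only.
drifting = [1.0, 0.8, 0.9, 0.5, -0.2, -0.6, -0.9, -0.7]      # neighbours look alike
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # neighbours flip sign

print(round(durbin_watson(drifting), 2))     # well below 2: positive serial correlation
print(round(durbin_watson(alternating), 2))  # well above 2: negative serial correlation
```

Values near 2 indicate little first-order serial correlation; values toward 0 or toward 4 indicate positive or negative serial correlation respectively.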


The consequence of a violation of the assumption of no error term correlation is that the standard errors are biased downward, that is, the standard errors tend to be smaller than they should be. This means that the t-ratios (measures of significance) will be larger (appear more significant) than they really are. This is a problem.

The correction for serial correlation (at least for a 1-period correlation) is called Cochrane-Orcutt (although the SAS output actually does a Yule-Walker estimate, which simply means it has ways to put the first observation back into the data set) and it basically transforms all the data by the correlation of the 1-period lag of the error term. The model is re-run and Durbin-Watson is re-run and those results used.
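The transformation step can be sketched as follows. This is an illustration of the general idea only, not the SAS procedure: it estimates the lag-1 correlation (rho) from residuals and quasi-differences a series, dropping the first observation as plain Cochrane-Orcutt does. The data are the seven rows of Table 3.6 (actual minus predicted sales), far too few for a real estimate.

```python
def lag1_autocorrelation(residuals):
    """Estimate rho as the slope of e_t regressed on e_{t-1} (no intercept)."""
    numerator = sum(prev * curr for prev, curr in zip(residuals, residuals[1:]))
    denominator = sum(prev ** 2 for prev in residuals[:-1])
    return numerator / denominator

def quasi_difference(series, rho):
    """Transform x_t into x_t - rho * x_{t-1}; the first observation is dropped."""
    return [curr - rho * prev for prev, curr in zip(series, series[1:])]

unit_sales = [50.0, 52.5, 55.7, 62.3, 52.5, 59.0, 58.2]
predicted = [49.2, 55.2, 58.2, 63.0, 52.3, 57.5, 58.2]
residuals = [actual - fitted for actual, fitted in zip(unit_sales, predicted)]

rho = lag1_autocorrelation(residuals)
transformed_sales = quasi_difference(unit_sales, rho)
print(round(rho, 2), [round(value, 1) for value in transformed_sales])
```

Every variable in the model (including the constant) would be transformed the same way before the regression and the Durbin-Watson test are re-run.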

See Tables 3.7 and 3.8: D-W moved from 1.08 to 1.93, which is near 2.0. This seems to indicate the model transformation worked. Note the change in coefficients: price went from –.0256 to –.0274. Note the standard error went from .006 to .004 and significance increased.

Table 3.7 Serial correlation

Variable        Estimate    Standard error   T value
Intercept       78.47       6.41             12.24
Price           –0.0256     0.006            –4.27
Advertising     0.001109    0.00019          5.65
Q2              1.5723      0.7422           2.12
Q3              2.9698      1.0038           2.96
Q4              4.357       0.8948           4.87
R2              98.61%
Durbin-Watson   1.08

Table 3.8 Serial correlation

Variable        Estimate    Standard error   T value
Intercept       78.47       6.41             12.24
Price           –0.0274     0.004            –6.17
Advertising     0.001109    0.00019          5.65
Q2              1.5723      0.7422           2.12
Q3              2.9698      1.0038           2.96
Q4              4.357       0.8948           4.87
R2              98.61%
Durbin-Watson   1.93

Now that the serial correlation has been taken care of, confidence in interpretation of the impacts of the model has improved. A quick note though about serial correlation and the diagnostics/corrections I’ve just mentioned. While most serial correlation is lagged on one period (called an autoregressive 1 or AR(1) process) this does not mean that there cannot be other serial correlation problems. Part of it is about the kind of data given. If it is daily data there will often be an AR(7) process. This means there is stronger correlation between periods lagged by 7 than periods lagged by 1. If there is monthly data there will often be an AR(12) process, etc. Thus, keep in mind the D-W stat is really only appropriate for an AR(1). That is, if the data is daily, each Monday would tend to be correlated with all other Mondays, etc. This means serial correlation of an AR(7) type, and not an AR(1). Thus, daily data tends to be lagged by 7 observations, monthly data tends to be lagged by 12 observations, quarterly data by 4, etc.

HIGHLIGHT

SEGMENTATION AND ELASTICITY MODELLING CAN MAXIMIZE REVENUE IN A RETAIL/MEDICAL CLINIC CHAIN: FIELD TEST RESULTS

Abstract

Most medical products or services are thought to be insensitive to price. This does not mean the best way to maximize revenue is to unilaterally raise every price indiscriminately for all regions in all clinics for all products or services. There should be some customers, some regions, some segments, some clinics, some products or services that are sensitive to price. Marketing analytics needs to give guidance to exploit these opportunities.

Using transactional and survey data from a large national retail/medical chain, I collected information that included, by customer and by clinic, the number of units, price paid and revenue realized for each product/service purchased over a two-year period. There was a telephone survey administered to contact three competing clinics around each of the firm’s clinics and ascertain competitive prices charged for certain ‘shopped’ products/services. Thus, a data set was created that had both own- and cross-price of several products or services.

Because much of a customer’s purchasing behaviour could be attributed to clinic differences (staffing, employee courtesy, location, growth, operational discounts, etc.), clinic segmentation was done. To emphasize, this was created to account for clinics influencing (causing) some customer behaviour other than responses to own- and cross-price. For example, one segment proved to be large (in terms of number of clinics), suburban and serving mostly loyal customers. Another segment was fairly small, urban and serving rather sick patients with customers who were mostly dissatisfied and had a high number of defectors. Obviously controlling for these differences was important.

After segmentation, elasticity modelling was done on each segment for selected products or services. This output showed that some segments and some products or services are sensitive to price; others are not. This details the ineffectiveness of simply raising prices on all products/services across the chain. In order to maximize revenue, prices should be lowered on a product in a clinic that is sensitive to price. This sensitivity comes from lack of loyalty, lack of long-term commitment, knowledge of competing prices, a customer’s budget, etc.

After the analysis was finished and shown to the firm’s management, they put a 90-day test vs control in place. They chose selected (shopped) products/segments and regions to test. After 90 days, the test clinics out-performed the control clinics, in terms of average net revenue, by >10%. This seems to indicate that there are analytic ways to exploit price sensitivity in order to maximize revenue.

The problem and some background

Given a particular chain of retail/medical clinics across the nation, pricing practices were notoriously simplistic: raise prices on nearly every product or service, for every clinic, in every region, about the same amount, every year. Growth was achieved for a time but over the last handful of years customer satisfaction began to dip, defections increased, loyalty decreased, employee satisfaction/courtesy decreased, it was more and more difficult to operationally enforce price increases and the firm overall had minimal growth and larger and larger uses of discounts, etc. Much of the deterioration in these metrics was root-caused back to pricing policies. So the primary marketing problem was to understand to what extent pricing affected total revenue. That is, could price sensitivity be discovered differently by segment or region, for different products or services, to allow the firm to exploit those differences?

Pricing mostly follows one of two practices. The first, cost-plus, is a financial decision based on the input cost of the products or services and incorporating margin into the final price. This is the typical approach, especially in terms of products or services thought to be insensitive to price (eg emergencies, radiology, major surgery, etc.). The other pricing avenue is for shopped products or services. These are products or services thought to possibly be more sensitive to price (exams, discretionary vaccines, etc.). For these products or services a survey was created and three competing clinics around each of the firm’s clinics were called and asked what prices they charged. Then the firm typically increased their own prices (very much operationally as cost-plus) but with an understanding of where the competition priced those same products or services. They sometimes listened to an individual clinic’s request or protest for a less-than-typical price increase.

Description of the data set

The transactional database provided own-firm behavioural data at the customer level. This could be rolled up to the clinic level. The transactional data included: products/services purchased, price paid for each, discount applied, total revenue, number of visits, time between visits, ailment/complaint, clinic visited, staffing, etc.

The clinic data included aggregations of the above, as well as trade area, location (rural vs urban), staffing and demographics from the census data mapped to zip code level. Also available was certain market research survey data. These included customer satisfaction/loyalty and defection surveys, employee satisfaction surveys, etc.

Most interesting was the competitive survey data. This survey asked three competitors near each of the firm’s clinics what prices they charged for shopped products. Shopped products are those believed to be more price sensitive and included exams, vaccines, minor surgery, etc. Thus, for each of the firm’s clinics, they looked at own prices paid by customers for every product/service (both shopped and other) as well as three competitors’ prices charged for selected shopped products/services. The own-price data allowed elasticity modelling to be undertaken, and the cross-price data showed an interesting cause from competitive pressures. Sometimes these competitive pressures made a difference on own price sensitivity and sometimes not. This provided lucrative opportunities for marketing strategy.

First: segmentation

Why segment?

The first step was to do clinic segmentation.

Segmentation: a marketing strategy aimed at dividing the market into sub-markets, wherein each member in each segment is very similar by some measure to each other and very dissimilar to members in all other segments.

This is because consumers’ behaviour, in some ways, may be caused by a clinic’s performance, staffing, culture, etc. That is, what might look like a consumer’s choice might be more caused by a clinic’s firmographics. The data set contained all revenue and product transactions that could be rolled up by clinic. This meant that year-over-year growth, discounting changes, customer visits, etc., could be useful metrics. Also important was the location of a clinic (rural, urban, etc.). So there was a lot of knowledge about the clinic and its performance and it was these things that it was necessary to control for in the elasticity models.

Because latent class analysis (LCA) has become the gold standard these last ten years, LCA was used as a segmentation technique. It has proven far superior to typical techniques (such as k-means, a segmentation algorithm discussed later), especially in outputting maximally differentiated segments. An obvious point: the more differentiated the segments are, the more unique the marketing strategies that can be created for each segment.

Profile output

After running LCA on the clinic data, the profile below was created (see Table 3.9). A couple of comments on the segments, particularly those to be used in the field test. Segment 1 is the largest (in terms of number of clinics included) and has the largest percent of annual revenue. Segment 1 is most heavily situated in suburban areas and market research shows them to have the most loyal customers. Segment 2 is the next-to-largest but only brings in about half of their fair share of revenue. Segment 4, while small, represents >20% of overall revenue and is mostly in urban areas. Market research reveals this segment to be the least satisfied and contains the most defectors. These differences help account for customers’ sensitivity to price, as will be shown in the models later.

Table 3.9 Elasticity modelling

               Segment 1   Segment 2   Segment 4
% Market       36%         34%         7%
% Revenue      41%         19%         21%
# of clients   5,743       3,671       15,087
Rev/visit      135         120         215
% Suburb       56%         51%         45%
% Rural        13%         20%         3%
% Urban        31%         29%         52%

Then: elasticity modelling

Overview of elasticity modelling

Let’s go back to beginning microeconomics: price elasticity is the metric that measures the percent change in an output variable (typically units) from a percent change in an input variable, in this case (net) price. If the percent change in units is more than 100% of the percent change in price, that demand is called elastic. If it is less than 100%, that demand is called inelastic. This is an unfortunate term. The clear concept is one of sensitivity. That is, how sensitive are customers who purchase units to a change in price? If there is, say, a 10% change in price and customers respond by changing their purchases by <10%, they are clearly insensitive to price. If there is, say, a 10% change in price and customers respond by changing their purchases by >10%, they are sensitive to price.

But this is not the key point, at least in terms of marketing strategy. The law of demand is that price and units are inversely correlated (remember the downward sloping demand curve?). Units will always go the opposite direction of a price change. But the real issue is what happens to revenue. Since revenue is price * units, if demand is inelastic, revenue will follow the price direction. If demand is elastic, revenue will follow the unit direction. Thus, to increase revenue in an inelastic demand curve, price should increase. To increase revenue in an elastic demand curve, price should decrease.

From point elasticity to modelling elasticity

Most of us were taught in microeconomics the simple idea of point elasticity. Point elasticity is the percent difference between (x, y) points. That is, the percent change in units given a percent change in price. Say price goes from 9 to 11, and units go from 1,000 to 850. The point elasticity is calculated as [(850 – 1,000)/1,000] / [(11 – 9)/9], which is –0.68 (–68%). Note the percent change in units is 15%, from a percent change in price of 22%. Obviously units are a smaller change (less sensitive) than the price change so this (point) demand is inelastic. That is, the elasticity at this point on the demand curve is insensitive to price. Note that as the demand curve goes from a high price to a low price, the slope changes and the sensitivity changes. This is the key marketing strategy issue.
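The point elasticity in this example can be checked with a short function. It is a minimal sketch using the starting price and starting units as the bases for the percent changes, as in the calculation above; the function name is illustrative.

```python
def point_elasticity(price_before, price_after, units_before, units_after):
    pct_change_units = (units_after - units_before) / units_before
    pct_change_price = (price_after - price_before) / price_before
    return pct_change_units / pct_change_price

elasticity = point_elasticity(9, 11, 1000, 850)
print(round(elasticity, 3))  # -0.675

# Absolute value below 1.00 means inelastic demand at this point on the curve.
print("inelastic" if abs(elasticity) < 1 else "elastic")
```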

Thus elasticity is a marginal function over an average function. The overall mathematical concept of ‘marginal’ is the average slope of a curve, which is a derivative. So calculating the overall average elasticity requires the derivative of the units by price function (ie, the demand curve) measured at the means, meaning:

Elasticity = dQ/dP * average price / average units.

So mathematically the derivative represents the average slope of the demand function. In a statistical model (that accounts for random error) the same concept applies: a marginal function over an average function. In a statistical (regression) model the beta coefficient is the average slope, thus:

Elasticity = β_price * average price / average units.

A quick note on a mathematically correct but practically incorrect concept: modelling elasticity in logs. It’s true that if the natural log is taken of both demand and price, there is no calculation at the means; the beta coefficient is the elasticity. However – and this is important – running a model in natural logs also implies a very wrong assumption: constant elasticity. This means there is the same impact at a small price change as at a large price change and no marketer believes that. Thus, modelling in natural logs is never recommended.

Own-price vs cross-price and substitutes

Now comes the interesting part of this data set. It has competitor prices! A survey was done asking three competitors nearest to each clinic the prices they charged for ‘shopped products’. These products are assumed to be generally price sensitive. I took the highest competitor price and the lowest competitor price and used that as cross-price data for every (shopped) product. Thus the demand model (by segment) for each shopped product will be:

Units = f(own-price, high cross-price, low cross-price, etc.)

The reason competitive prices are so interesting is because of two things. First, competitive prices are causes of behaviour. Second, if a competitor is a strong substitute, strategic choices reveal themselves.

A competitor is regarded as a substitute if the coefficient on their cross-price is positive. This means there is a positive correlation with a firm’s own demand. Thus, if the competition is a substitute and chooses to raise their prices, our own demand will increase because their customers will tend to flow to our demand (with lower prices). If the competitor is a substitute and chooses to lower their prices, our own demand will decrease because customers will tend to flow out of our demand (with higher prices). Thus, knowing if a competitor is a substitute gives explanatory power to the model as well as a potential strategic lever.

But the real issue is how strong a substitute a competitor is. This strength is revealed in the cross-price coefficients. Say for a particular demand model the coefficient on own price is –1.50 and the coefficient on high cross-price is +1.10. Own price has the expected negative correlation (own price goes up, (own) units go down). High cross-price is positive, meaning in this case the high-price competitor is a substitute. If own elasticity is price sensitive and we lower our prices, the high competitors can lower their prices as well, decreasing our demand. But note that they are not a strong substitute. A strong substitute will not only have a positive coefficient but that coefficient will be (in absolute value) > the own price coefficient. Meaning, in the above example, if we lower our prices by 10% we expect our demand to increase by 15%. If the competitor matches our price change and lowers by 10%, that will affect our demand by 11%, that is, they were not a strong substitute.

However, if our own price coefficient was –1.50 and the high-price competitor coefficient was instead +3.00, a very different story unfolds. If we lower our prices by 10% our demand will go up by 15%. But the strong substitute can lower their price by 5% and impact our units by 15% (5% * 3 = 15%). Or if they also lower by 10% and match us that will impact our units by 30%! Clearly this strong competitor is far more powerful than the first scenario. Note also that none of this ‘game theory’ knowledge is possible without cross prices.
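The two scenarios can be put side by side in code. This is a sketch under a simplifying assumption that is not stated in the text: the own-price effect and the cross-price effect are added together to give a net change in units. The function name and scenario labels are illustrative.

```python
def demand_change(own_elasticity, own_price_change, cross_elasticity, competitor_price_change):
    """Approximate fractional change in our units, assuming the two effects simply add."""
    return own_elasticity * own_price_change + cross_elasticity * competitor_price_change

own = -1.50
scenarios = [
    ("weak substitute, competitor matches our 10% cut", 1.10, -0.10),
    ("strong substitute, competitor cuts only 5%", 3.00, -0.05),
    ("strong substitute, competitor matches our 10% cut", 3.00, -0.10),
]
net_changes = {}
for label, cross, competitor_change in scenarios:
    net_changes[label] = demand_change(own, -0.10, cross, competitor_change)
    print(f"{label}: net change in units {net_changes[label]:+.1%}")
```

Against the weak substitute a matched cut still leaves a small net gain; against the strong substitute a cut half the size is enough to cancel the gain, and a matched cut turns it into a loss.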

Modelling output by segment

The next four tables show elasticity modelling results by segment for four selected shopped products. (In the field test only vaccines (two), minor surgery and exams were used.) Following each table are notes on strategic uses.

Table 3.10 Elasticity modelling

Vaccine x           Seg 1     Seg 2     Seg 4
Vaccine x firm      –0.377    –1.842    –3.702
Vaccine x comp hi   –0.839    0.062     1.326
Vaccine x comp lo   –0.078    –0.167    –0.757

Segment 1: An elasticity < |1.00| (0.377, in absolute terms) means this product for this segment has a demand that is inelastic. This segment is loyal (via market research) and no competitor is a substitute (no positive cross-price elasticity). Therefore increase price.

A few details on segment 1 vaccine x calculations follow. For own-price elasticity, the firm’s price was 28, the own price coefficient was –1.2 and the average units were 89. Thus own price elasticity is –0.377 = –1.2 * 28/89. High competitor price elasticity is calculated as –0.839 = –1.915 * 39/89 and low price competitor elasticity is –0.078 = –0.33 * 21/89. All other calculations are similar.
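These three calculations follow the same formula and can be checked together. The sketch below uses only the coefficients, prices and average units quoted in the paragraph above; the dictionary keys are illustrative labels. Results agree with Table 3.10 to within rounding in the third decimal.

```python
def elasticity(coefficient, average_price, average_units):
    return coefficient * average_price / average_units

average_units = 89  # segment 1 average units of vaccine x

# (coefficient, average price) for each price series, as quoted in the text
inputs = {
    "own price": (-1.2, 28),
    "high competitor price": (-1.915, 39),
    "low competitor price": (-0.33, 21),
}
results = {name: elasticity(coef, price, average_units)
           for name, (coef, price) in inputs.items()}
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```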

Segment 2: An elasticity > |1.00| (1.842, in absolute terms) means this product for this segment has a demand that is elastic. The high competitor is a weak substitute (0.062). Therefore decrease price.

Segment 4: An elasticity > |1.00| (3.702, in absolute terms) means this product for this segment has a demand that is elastic. This segment tends to be dissatisfied with a high number of defectors (via market research). The high competitor is a weak substitute (1.326). Therefore decrease price.

Table 3.11 Further elasticity modelling

Vaccine y           Seg 1     Seg 2     Seg 4
Vaccine y firm      –0.214    –0.361    –0.406
Vaccine y comp hi   0.275     0.018     0.109
Vaccine y comp lo   0.196     0.123     0.864


Segment 1: An elasticity < |1.00| (0.214, in absolute terms) means this product for this segment has a demand that is inelastic. This segment is loyal (via market research) and the low competitor is a weak substitute. The high competitor is a strong substitute. Note the positive 0.275 is > absolute 0.214, meaning the high competitor can match/retaliate against the firm with a smaller price decrease. Therefore test (remember this segment is loyal) increasing price.

Segment 2: An elasticity < |1.00| (0.361, in absolute terms) means this product for this segment has a demand that is inelastic. While both competitors are substitutes, they each are weak. Therefore test increasing price.

Segment 4: An elasticity < |1.00| (0.406, in absolute terms) means this product for this segment has a demand that is (surprisingly) inelastic. This segment tends to be dissatisfied with a high number of defectors (via market research). While both competitors are substitutes, the low competitor is a strong substitute. Therefore cautiously test increasing price.

Table 3.12 Further elasticity modelling

Minor surgery      Seg 1    Seg 2    Seg 4
Min surg firm      –0.57    –0.17    –1.09
Min surg comp hi   0.202    0.475    –0.59
Min surg comp lo   –0.06    0.291    0.215

Segment 1: An elasticity < |1.00| (0.573, in absolute terms) means this product for this segment has a demand that is inelastic. This segment is loyal (via market research) and the high competitor is a weak substitute. Therefore test increasing price.

Segment 2: An elasticity < |1.00| (0.173, in absolute terms) means this product for this segment has a demand that is inelastic. Both competitors are strong substitutes. Therefore (cautiously) test increasing price.

Segment 4: An elasticity > |1.00| (1.090, in absolute terms) means this product for this segment has a demand that is (barely) elastic. This segment tends to be dissatisfied with a high number of defectors (via market research). The low competitor is a weak substitute. Therefore test decreasing price.

Table 3.13 Further elasticity modelling

Exams          Seg 1    Seg 2    Seg 4
Exam firm      –0.1     –0.03    –0.1
Exam comp hi   0.008    0.075    0.096
Exam comp lo   –0.02    –0.03    0.023

Segment 1: An elasticity < |1.00| (0.100, in absolute terms) means this product for this segment has a demand that is inelastic. This segment is loyal (via market research) and the high competitor is a weak substitute. Therefore test increasing price.

Segment 2: An elasticity < |1.00| (0.025, in absolute terms) means this product for this segment has a demand that is inelastic. The high competitor is a strong substitute. Therefore test increasing price.

Segment 4: An elasticity < |1.00| (0.095, in absolute terms) means this product for this segment has a demand that is inelastic. This segment tends to be dissatisfied with a high number of defectors (via market research). Both competitors are substitutes and the high competitor is a strong substitute. Therefore (cautiously) test increasing price.

The above analysis shows how elasticity can be used as a strategic weapon. Because it involves both own-price (customers’ sensitivity) as well as cross-price (potential competitors’ retaliation) the strategic levers are lucrative.

Example of elasticity guidance

Now let’s talk about transferring the modelling from the segment level to the clinic level, where pricing guidance needs to be. The basic idea was to use the segment model’s price coefficient and apply that to the elasticity calculation by clinic. That is, elasticity at the segment level:

Segment elasticity = segment price-coefficient * segment average price / segment average quantity.

Translating elasticity to (each) clinic:

Clinic elasticity = segment price-coefficient * clinic average price / clinic average quantity.
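The translation from segment to clinic can be sketched in a few lines. The segment coefficient is the segment 1 vaccine x value quoted earlier; the clinic names, prices and quantities are hypothetical and invented only to show how the same coefficient produces different clinic elasticities.

```python
def clinic_elasticity(segment_price_coefficient, clinic_average_price, clinic_average_quantity):
    return segment_price_coefficient * clinic_average_price / clinic_average_quantity

segment_price_coefficient = -1.2   # segment 1 vaccine x coefficient quoted earlier

# Hypothetical clinics: names, prices and quantities are invented for illustration.
clinics = {
    "clinic_a": {"average_price": 28, "average_quantity": 89},
    "clinic_b": {"average_price": 32, "average_quantity": 60},
}

clinic_results = {
    name: clinic_elasticity(segment_price_coefficient,
                            data["average_price"], data["average_quantity"])
    for name, data in clinics.items()
}
for name, value in clinic_results.items():
    print(f"{name}: {value:.3f}")
```

A clinic that charges more and sells fewer units than the segment average ends up with a larger elasticity (in absolute terms), so its pricing guidance can differ from that of its segment peers.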

Now let’s look at a particular clinic’s test results. This clinic is in segment 4, a very price sensitive segment. Guidance for vaccine x (at this clinic) was to decrease price by 6%. This decrease brought the clinic’s price position down from the highest (compared to the surrounding competitors) to a middle-priced option. The high competitor was a weak substitute, so strong retaliation was thought unlikely.

For the vaccine x product, during the 90-day field test, this clinic generated 2,292 in vaccine x revenue and sold 84 units, making average net revenue of 27.28. The matched control cell was 25.86, giving a 5.48% test-over-control result. This comes from two things interacting: first, this segment in general is sensitive to price and second, this clinic has no (strong) substitutes. Thus guidance was to decrease price with no fear of retaliation from the competitors.

Look at another particular clinic’s test results. This clinic is in segment 1, a price insensitive segment. Guidance for exams (at this clinic) was to increase price by 2%. This increase brought the clinic’s price position up from the middle (compared to the surrounding competitors) to the highest-priced option. Remember this segment tends to be very loyal. The high competitor was a weak substitute, so strong retaliation was thought unlikely.

For the exam product, during the 90-day field test, this clinic generated 27,882 in exam revenue and sold 499 units, making average net revenue of 55.88. The matched control cell was 47.41, giving a 17.85% test-over-control result. This comes from two things interacting: first, this segment in general is insensitive to price and second, this segment and this clinic have no (strong) substitutes. Thus guidance was to increase price with no fear of retaliation from either the customers or competitors.
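The test-over-control figures for both clinics come from the same two steps: revenue divided by units, then the ratio to the matched control cell. The sketch below recomputes them from the numbers quoted above; the function name is illustrative, and the recomputed lifts agree with the reported 5.48% and 17.85% to within rounding.

```python
def lift_over_control(test_revenue, test_units, control_average_net_revenue):
    """Return (average net revenue in the test cell, fractional lift over control)."""
    test_average = test_revenue / test_units
    return test_average, test_average / control_average_net_revenue - 1

cells = {
    "vaccine x (segment 4 clinic)": (2292, 84, 25.86),
    "exams (segment 1 clinic)": (27882, 499, 47.41),
}
lift_results = {name: lift_over_control(*values) for name, values in cells.items()}
for name, (average, lift) in lift_results.items():
    print(f"{name}: average net revenue {average:.2f}, lift over control {lift:.2%}")
```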

Last: test vs control

There were nearly 100 clinics that met criteria to be part of the field test. There were about 25 test clinics and 75 control clinics. The test clinics would get the elasticity guidance and the control clinics would continue business as usual.

Matched cells by region by segment were designed. The test metric was average net revenue (by region, by segment, by product, etc.). The overall result was that the test clinics outperformed the control clinics, in terms of average net revenue, by >10% in 90 days. Of course regions and segments and products had a distribution of results. One region was extremely positive, another region was slightly negative, one segment (segment 1, the loyal segment) was very positive and segment 4 (the dissatisfied segment) was less so. Such a strong overall result indicates elasticity analysis can help guide optimal pricing.

Discussion

Is there game theory in the medical services world? Most practitioners would probably say not really, their job is more about patient care than competition. However, one interesting example that might contradict common wisdom comes from this study.

There happened to be two clinics, call them X and Y, which each came from the same region, the same segment 4, but one had a strong substitute (low) competitor and the other did not. For exams, both clinics were given a price decrease of 4%. The clinic that faced the strong competitor (clinic X) had one half the average net revenue gains vs control as clinic Y. This might indicate the low competitor around clinic X also lowered their exam prices (next survey will verify) but because they were a strong substitute they only needed to lower by 1% to negatively impact the firm’s 4% price decrease.

It seems that at least for the shopped products, prices in the medical services area are NOT so insensitive. It also seems that some kind of ‘game theory’ might go on, especially in close locale, to respond and retaliate with price changes. That was probably why the competitive survey was done in the first place.

Conclusion

Why is elasticity modelling so rarely done? In my nearly 30 years of marketing analysis over a wide variety of firms in many different industries, elasticity modelling (as discussed here) is virtually never done. Often there are surveys on prices and purchasing, etc. But this is self-reported and probably self-serving (‘Yes, your prices are too high!’). Another common and slightly better marketing research technique is conjoint analysis. It is somewhat artificial and still self-reported but analytically controls for such things.

My point is that if there is real behaviour – real purchasing responses based on real price changes – in the transactional database, why would THOSE data elements not be best to measure price sensitivity? The answer seems to be that translating what was learned in microeconomics into statistical analysis is a wide step and not usually taught. That is, going from point elasticity to statistically modelling elasticity is knowledge not easily gained. Note, however, the steps are quite straightforward and the modelling is not difficult. Perhaps this chapter is one way to get elasticity modelling used more in practice, especially given the potential benefits.

Checklist

You’ll be the smartest person in the room if you:

Remember there are two types of statistical analysis: dependent variable types and inter-relationship types.

Recall that there are two types of equations: deterministic and probabilistic.

Observe that regression (ordinary least squares, OLS) is a dependent variable type analysis using independent variables to explain the movement in a dependent variable.

Point out that R2 is a measure of goodness of fit; it shows both explanatory power and shared variance between the actual dependent variable and the predicted dependent variable.

Remember that the t-ratio is an indication of statistical significance.

Always avoid the ‘dummy trap’; keep one less binary variable in a system (eg, in a quarterly model only use three not four quarters).


Thinkintermsofthetwokindsofelasticity:inelasticandelasticdemandcurves.

Focusontherealissueofelasticity:whatimpactithasontotalrevenue(notunits).

Rememberpriceandunitsarenegativelycorrelated.Inaninelasticdemandcurvetotalrevenuefollowsprice;inanelasticdemandcurvetotalrevenuefollowsunits.Toincreasetotalrevenueinaninelasticdemandcurvepriceshouldincrease;toincreasetotalrevenueinanelasticdemandcurvepriceshoulddecrease.

Rememberthatregressioncomeswithassumptions.
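The revenue rule in the checklist can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name, the baseline price and units, and the two elasticity values (-0.5 and -2.0) are assumptions made up for the example, and it treats elasticity as constant over the price change.

```python
def revenue_after_price_change(price, units, elasticity, pct_price_change):
    """Total revenue after a price change, assuming a constant point elasticity.

    elasticity is the percentage change in units per 1% change in price
    (negative, since price and units move in opposite directions).
    pct_price_change is a fraction, e.g. 0.10 for a 10% increase.
    """
    new_price = price * (1 + pct_price_change)
    # Units respond in the opposite direction, scaled by the elasticity.
    new_units = units * (1 + elasticity * pct_price_change)
    return new_price * new_units


baseline = 10.0 * 100  # hypothetical revenue before any price change
inelastic = revenue_after_price_change(10.0, 100, -0.5, 0.10)
elastic = revenue_after_price_change(10.0, 100, -2.0, 0.10)
print(baseline, round(inelastic, 2), round(elastic, 2))
```

With the inelastic value, a 10% price increase raises total revenue; with the elastic value, the same increase lowers it, which is the checklist rule in numbers.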

04

Who is most likely to buy and how do I target?

Introduction

Conceptual notes

Business case

Results applied to the model

Lift charts

Using the model – collinearity overview

Variable diagnostics

Highlight: Using logistic regression for market basket analysis

Checklist: You’ll be the smartest person in the room if you…

Introduction

The next marketing question is around targeting, particularly who is likely to buy. Note that this question is statistically the same as ‘Who is likely to respond (to a message, an offer, etc.)?’ This probability question is the centre of marketing science in that it involves understanding choice behaviour. The typical technique involved (especially for database/direct marketing) is logistic regression.

Conceptual notes

Logistic regression has a lot of similarities to ordinary regression. They both have a dependent variable, they both have independent variables, they are both single equations, and they both have diagnostics around the impact of independent variables on the dependent variable as well as ‘fit’ diagnostics.

But their differences are also many. Logistic regression has a dependent variable that takes on only two values (as opposed to a continuous range): 0 or 1. That is, it’s binary. Logistic regression does not use the criterion of ‘minimizing the sum of the squared errors’ (which is ordinary least squares, or OLS) to calculate the coefficients, but rather maximum likelihood, found by an iterative numerical search. The interpretation of the coefficients is different: odds ratios ($e^{\beta}$) are typically used, and fit is not about a predicted vs. an actual dependent variable.

Maximum likelihood: an estimation technique (as opposed to ordinary least squares) that finds the estimates that maximize the likelihood of observing the given sample.

As a slight detail, another important difference between logistic regression and ordinary regression is that logistic regression actually models the ‘logit’ rather than the dependent variable. A logit is $\ln\left(p/(1-p)\right)$, where $p$ is the probability of the event; that is, the log of the odds of the event occurring. Recall that ordinary regression models the dependent variable itself.
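To make the logit concrete, here is a minimal sketch using only Python's standard library. The function names are illustrative, not from any package; the point is simply that the logit and the s-curve are inverses of each other.

```python
import math


def logit(p):
    """Log of the odds for a probability p strictly between 0 and 1."""
    return math.log(p / (1 - p))


def inverse_logit(z):
    """Map a logit back to a probability between 0 and 1 (the s-curve)."""
    return 1 / (1 + math.exp(-z))


for p in (0.1, 0.5, 0.9):
    z = logit(p)
    # Converting the logit back should recover the original probability.
    print(p, round(z, 3), round(inverse_logit(z), 3))
```

A probability of 0.5 corresponds to a logit of zero; probabilities above 0.5 give positive logits and those below give negative ones.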

(By the way, yes, there is a technique that can model more than two values, but not continuous. That is, the dependent variable might have 3 or 4 or 5, etc., values. This technique is called multinomial logit (discriminant analysis will do this as well) but we will not cover it except to say it’s the same as logistic regression, but the dependent variable has codes for multiple different values, rather than only 0 or 1.)

All of the above means that the output of logistic regression is a probability between 0% and 100%, whereas the output of ordinary regression is an estimated (predicted) value to fit the actual dependent variable. Figure 4.1 shows a plot of actual events (the 0s and the 1s) as well as the logistic (s-curve).

Figure 4.1 Actual events and logistics

Now let’s look at some data and run a model, because that’s where all the fun is.

BUSINESS CASE

Now Scott’s boss, very impressed with what he did on demand modelling, calls Scott into his office.

‘Scott, we need to better target those likely to buy our products. We send out millions of catalogues, based on magazine subscriber lists, but the response rate is too small. What can we do to make our mailing ROI better?’

Scott thinks for a minute. The response rate was too small? Response rate is the rate of response, which is the number of those that responded (purchased), divided by the total number that got the communication. It’s an overall metric of success.

‘We want to target those likely to purchase based on a collection of characteristics. We have both customers and non-customers in our database, from the subscriber lists we’ve been mailing, so we could model the probability to respond based on clone or lookalike modelling.’

‘What does that mean?’ the boss asks.

‘I’ll have to dig into it a bit more but I know we can develop a regression-type model that scores the database with different probabilities to purchase for each name. We can sort the database by probability to purchase and only mail as deep as ROI limits.’

‘Sounds good. Get to work on that and get back to me when you have something.’ With that the boss swivels in his chair so Scott knows the conversation is over.

Results applied to the model

Note Table 4.1 overleaf, which shows the simplified dataset. This is a list of customers that purchased and those that did not purchase. Scott has data on which campaigns they each received, as well as some demographics. The objective is to figure out which of the non-purchasers ‘look like’ those that did purchase and re-mail them, perhaps with the same campaign (if we find one that was effective) or design another campaign.

Table 4.1 Simplified dataset

Id    Revenue  Purchase  Campaign a  Campaign b  Campaign c  Income  Size hh  Educ
999   1500     1         1           0           1           150000  1        19
1001  1400     1         1           0           1           137500  1        19
1003  1250     1         1           0           0           125000  2        15
1005  1100     1         1           0           0           112500  2        13
1007  2100     1         0           1           0           145000  3        16
1009  849      1         0           0           0           132500  3        17
1010  750      1         0           0           0           165000  3        16
1011  700      1         0           0           0           152500  3        9
1013  550      1         1           0           1           140000  4        15
1015  850      1         1           0           1           127500  4        18
1017  450      1         1           0           1           115000  4        17
1019  0        0         0           0           1           102500  5        16
1021  0        0         0           0           1           99000   6        15
1023  0        0         0           1           1           86500   7        16
1025  0        0         0           1           1           74000   6        15
1027  0        0         0           1           1           61500   5        14
1029  0        0         0           1           1           49000   4        13
1033  0        0         1           0           1           111000  4        12
1034  0        0         0           0           1           98500   3        11
1035  0        0         0           0           1           86000   3        10

The end result will be to score the database with ‘probability to purchase’ in order to understand what (statistically) works and strategize what to do next time. This is the cornerstone of direct (database) marketing.

Using the (contrived) dataset, you can run proc logistic descending in SAS. See Table 4.2 for the output of the coefficients. These coefficients are not interpreted the same way as in ordinary regression.

Table 4.2 Coefficient output

Intercept   -57.9
Campaign a  -8.48
Campaign b  16.52
Campaign c  -9.96
Income      0.001
Size hh     -3.41
Education   0.2

Because logistic regression is curvilinear and bound by 0 and 1, the independent variables do not affect the dependent variable in the same way as in ordinary regression. The actual impact is

$e^{\text{coefficient}}$.

For example, education’s coefficient is 0.200. The impact would be:

$e^{0.200} \approx 1.221$, that is, $2.71828^{0.200}$.

This means that for every year of added education, the odds of purchase are multiplied by about 1.221, an increase of roughly 22.1%. This metric is called the odds ratio. (Strictly, it is the odds of purchase, not the probability itself, that rise by this percentage.) This obviously has targeting implications: aim our product at the most highly educated families possible. Note that two of the three campaigns are negative (which tends to decrease probability to purchase) so this also adds credence to needing better targeting.
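As a quick check on the exponentiation, the sketch below applies `math.exp` to each coefficient in Table 4.2. The dictionary and helper names are illustrative only; for the education coefficient of 0.2, Python returns roughly 1.221.

```python
import math

# Coefficients copied from Table 4.2 (intercept omitted).
coefficients = {
    "Campaign a": -8.48,
    "Campaign b": 16.52,
    "Campaign c": -9.96,
    "Income": 0.001,
    "Size hh": -3.41,
    "Education": 0.2,
}


def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit increase in the variable."""
    return math.exp(coefficient)


def pct_change_in_odds(coefficient):
    """The same effect expressed as a percentage change in the odds."""
    return (math.exp(coefficient) - 1) * 100


for name, beta in coefficients.items():
    print(f"{name}: odds ratio {odds_ratio(beta):.4g}, "
          f"change in odds {pct_change_in_odds(beta):.1f}%")
```

An odds ratio above 1 means the variable raises the odds of purchase; below 1 means it lowers them, which is how the negative campaign coefficients show up.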

For logistic regression, there is not really a goodness of fit measure like $R^2$ in OLS. Logit outputs a probability between 0 and 1, while the actual dependent variable is either 0 or 1. Often the ‘confusion matrix’ is used, and predictive accuracy is a sign of a good model. Table 4.3 shows the confusion matrix of the above model. (The confusion matrix from SAS uses ‘ctable’ as an option.) Say there are 10,000 observations.

Table 4.3 Confusion matrix

                      Actual non-events  Actual events
Predicted non-events  1,000              1,750
Predicted events      500                6,750

The total number of events (purchases) is 6,750 + 1,750 or 8,500. The model predicted only 6,750 + 500 or 7,250. The total accuracy of the model is the actual events predicted correctly plus the actual non-events predicted correctly, meaning (6,750 + 1,000)/10,000 = 7,750/10,000 = 77.5%. The number of false positives is 500 (the model predicted 500 people would have the event that did not). This is an important measure in direct marketing, in terms of the cost of a wrong mailing.
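The arithmetic on the confusion matrix can be sketched as a small helper. The function and key names below are illustrative; the four counts are the ones in Table 4.3.

```python
def confusion_summary(true_neg, false_neg, false_pos, true_pos):
    """Summarize a 2x2 confusion matrix given its four cell counts."""
    total = true_neg + false_neg + false_pos + true_pos
    return {
        "actual_events": true_pos + false_neg,
        "predicted_events": true_pos + false_pos,
        # Accuracy counts both kinds of correct prediction.
        "accuracy": (true_pos + true_neg) / total,
        "false_positives": false_pos,
    }


# Cell counts from Table 4.3.
summary = confusion_summary(true_neg=1000, false_neg=1750,
                            false_pos=500, true_pos=6750)
print(summary)
```

The false positive count is the figure to watch in direct mail, since each one is a catalogue sent to someone who did not buy.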

As an analytic ‘trick’ it often helps to determine if the dependent variable (sales, in this case) has any abnormal observations. Remember the z-score? This is a fast and simple way to check if an observation is ‘out of bounds’. The z-score is calculated as ((observation – mean)/standard deviation).

Let’s say the mean of revenue is 358.45 and the standard deviation of revenue is 569.72. So, if you run this calculation for all the observations on revenue you will see that (Table 4.1) id #1007 gives ((2,100 – 358.45)/569.72) = 3.057. This means that observation is more than 3 standard deviations from the mean, a very non-normal observation. It is common to add a new variable, call it ‘positive outlier’, which takes the value of 0 as long as the z-score on sales is below 3.00 and the value of 1 if the z-score is 3.00 or more. Use this new variable as another independent variable to help account for outliers. Some of the coefficients should change and the fit usually improves. This new variable can be seen as a flag for influential observations.
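A minimal sketch of the z-score check follows. The function names and the default threshold argument are assumptions for illustration; the mean and standard deviation are the figures quoted in the text, and with them the z-score for id 1007 works out to about 3.057.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations an observation lies from the mean."""
    return (value - mean) / std_dev


def positive_outlier_flag(value, mean, std_dev, threshold=3.0):
    """Return 1 when the z-score reaches the threshold, otherwise 0."""
    return 1 if z_score(value, mean, std_dev) >= threshold else 0


MEAN_REVENUE = 358.45  # figures quoted in the text
STD_REVENUE = 569.72

z_1007 = z_score(2100, MEAN_REVENUE, STD_REVENUE)
print(round(z_1007, 3), positive_outlier_flag(2100, MEAN_REVENUE, STD_REVENUE))
```

The flag can then be added to the dataset as one more 0/1 column before the model is re-run.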

Table 4.4 New variables

Intercept   -51.9
Influence   15.54
Campaign a  -6.06
Campaign b  16.6
Campaign c  -9.07
Income      0.002
Size hh     -1.65
Education   0.211

Note the mostly slight changes in coefficients. This ought to mean predictive accuracy increases. Note the updated confusion matrix below.

Table 4.5 Updated confusion matrix

                      Actual non-events  Actual events
Predicted non-events  1,250              1,000
Predicted events      250                7,500

The total number of events (purchases) is still 8,500 but note the shift in accuracy. The model now predicts 7,500 + 250 = 7,750. The total accuracy of the model is the actual events predicted correctly plus the actual non-events predicted correctly, meaning (7,500 + 1,250)/10,000 = 8,750/10,000 = 87.5%. The number of false positives is 250 (the model predicted 250 people would have the event that did not). This is an important measure in direct marketing, in terms of the cost of a wrong mailing. The model improved because of accounting for influential observations.

Lift charts

A common and important tool, especially in direct/database marketing, is the lift (or gain) chart.

Lift/gains chart: a visual device to aid in interpreting how a model performs. It compares by deciles the model’s predictive power to random.

This is a simple analytic device to ascertain general fit as well as a targeting aid in terms of how deep to mail.

The general procedure is to run the model and output the probability to respond. Sort the database by probability to respond and divide into 10 equal ‘buckets’. Then count the number of actual responders in each decile. If the model is a good one, there will be a lot more responders in the upper deciles and a lot fewer responders in the lower deciles.

As an example, say the average response rate is 5%. We have 10,000 total observations (customers). Each decile has 1,000 customers in it; some of them have responded and some of them have not. Overall there are 500 responders (500/10,000 = 5%). So, randomly, we would expect on average 50 in each decile. Instead, because the model works, say there are 250 in decile 1 and it decreases until the bottom decile has only one responder in it. The ‘lift’ is defined as the number of responders in each decile divided by the average (expected) number of responders. In decile 1 this means 250/50 = 500%. This shows us that the first decile has a lift of 5X, that there are five times more responders there than average. It also says that those in the top decile who did not respond are very good targets, since again, they all ‘look alike’. This is an indication the model can discriminate the responders from the non-responders.
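The decile procedure can be sketched in plain Python. Everything here is illustrative: the function name is made up, and the 100 toy customers (with 10 responders placed mostly near the top of the ranking) are invented so that the result is easy to follow by hand.

```python
def decile_lift(scores, responded):
    """Return the lift for each of 10 deciles, highest-scored decile first.

    scores holds the model probability for each customer and responded holds
    1 for an actual responder and 0 otherwise. Assumes at least one responder.
    """
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    counts = [0] * 10
    for rank, (_, flag) in enumerate(ranked):
        counts[rank * 10 // n] += flag  # bucket 0 is the top decile
    expected = sum(counts) / 10  # responders per decile if targeting were random
    return [count / expected for count in counts]


# Toy data: customer at rank 0 has the highest score, rank 99 the lowest.
scores = [100 - rank for rank in range(100)]
responder_ranks = {0, 1, 2, 3, 4, 10, 20, 30, 40, 50}
responded = [1 if rank in responder_ranks else 0 for rank in range(100)]

lifts = decile_lift(scores, responded)
print(lifts)
```

In the toy data the top decile holds five of the ten responders against an expected one, so its lift is 5, the same ratio as 250/50 in the example above.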

Figure 4.2 Lift chart

Note that in each decile there are 1,000 customers. 250 already responded in decile 1. All of the customers in decile 1 have a high probability of responding (they are the top 10% by score). There are 750 more potential targets in decile 1 that have NOT responded. This is the place to focus targeting and this is why it’s called ‘clone modelling’.

To briefly address the database marketing question, ‘How deep do I mail?’ let’s look at the lift chart above. This is an accumulation of actual responders compared to expected responders. Depending on budget, etc., this lift chart helps to target. Most database marketers will mail as far as any decile out-responds the average. That is, until the lift is below 100%. Another way of saying this is to mail until the maximum distance between the curves is achieved. However, as a practical matter, most direct marketers (especially cataloguers) have a set budget and can only AFFORD to mail so deep, regardless of the statistical performance of the model. Thus, most of the attention is on the first one or two deciles.

Using the model – collinearity overview

Another very common issue that must be dealt with in (especially) regression modelling is collinearity.

Collinearity: a measure of how variables are correlated with each other.

Collinearity is defined as two or more independent variables that are more correlated with each other than either of them is with the dependent variable. That is, if there are, say, two independent variables in the model, damaging collinearity is if X1 and X2 are more correlated than X1 and Y and/or X2 and Y. Mathematically:

$\rho(X_1, X_2) > \rho(Y, X_1)$ or $\rho(X_1, X_2) > \rho(Y, X_2)$, where $\rho$ = correlation.

The consequences of collinearity are that, while the parameter estimates of each independent variable remain unbiased, the standard errors are too wide. This means when significance testing is calculated (parameter estimate/standard error of the estimate) for a t-ratio (or a Wald ratio) these variables tend to show less significance than they really have. This is because the standard error is too large. Collinearity can also flip the signs of coefficients, which returns nonsensical results. Thus, collinearity must be tested and dealt with.

A quick note on overly simplistic ‘diagnostics’ I’ve seen in practice follows. It’s possible to run a correlation matrix on the variables and obtain the (simple Pearson) correlation coefficient for each pair. This does NOT check for damaging collinearity; this is a check for simple (linear) correlation. I’ve seen analysts just run the matrix and drop (yes, drop!) an independent variable just because the correlation of it and another independent variable is, say, greater than 80%. (Where did they get 80%? This is arbitrary and beneath anyone calling themselves analytic.) Ok, off the soapbox.

The above ‘testing’ is irksome because real testing (with SAS and SPSS) is relatively easy. VIF is the most common. Run proc reg and include ‘/ vif’ as a model option. VIF is the variance inflation factor. Basically it regresses each independent variable on all other independent variables and displays a metric. This metric is $1/(1 - R^2)$. If this metric is > 10.0 (indicating an $R^2$ of > 90%) then as a rule of thumb, some variable is too collinear to ignore. That is, if there are three independent variables in the model, x1, x2 and x3, VIF will regress x1 = f(x2 and x3) and show $R^2$, then x2 = f(x1 and x3) and show $R^2$ and last x3 = f(x1 and x2) and show $R^2$.
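For readers without SAS to hand, the VIF calculation can be sketched in plain Python: regress each independent variable on the others, take the $R^2$, and compute $1/(1 - R^2)$. This is an illustrative sketch, not the SAS implementation; the function names and the three small columns of data are made up, and the regression is solved with basic Gaussian elimination on the normal equations.

```python
def r_squared(y, predictors):
    """R-squared from an OLS regression of y on the predictors plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    a = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    c = [sum(row[i] * y_i for row, y_i in zip(rows, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        c[col], c[pivot] = c[pivot], c[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for j in range(col, k):
                a[r][j] -= factor * a[col][j]
            c[r] -= factor * c[col]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (c[i] - tail) / a[i][i]
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in rows]
    mean_y = sum(y) / n
    total_ss = sum((v - mean_y) ** 2 for v in y)
    error_ss = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1 - error_ss / total_ss


def variance_inflation_factors(columns):
    """VIF for each column: 1 / (1 - R^2) from regressing it on all the others.

    Perfectly collinear columns are not handled (they would divide by zero).
    """
    vifs = []
    for j, target in enumerate(columns):
        others = [col for i, col in enumerate(columns) if i != j]
        vifs.append(1 / (1 - r_squared(target, others)))
    return vifs


# Hypothetical independent variables, one list per variable.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
x3 = [3, 5, 1, 4, 2]
vifs = variance_inflation_factors([x1, x2, x3])
print([round(v, 2) for v in vifs])
```

Any value above 10 in the returned list would trip the rule of thumb described above.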

Note that we are not really testing for collinearity (because there will nearly ALWAYS be some collinearity). We are testing for collinearity bad enough to cause a problem (called ill conditioning).

The recommended approach is to include variables that make theoretic sense. If VIF indicates a variable is causing a problem but there is a strong reason for that variable to be included, one of the other variables should be examined instead. (It is important to note that dropping a variable is NOT the first course of action. Simply dropping a variable is arbitrary (and very simplistic) analytics.) That is, a stronger, more defendable model results from a strategic understanding of the data generating process, not from statistical diagnostics alone. The science of modelling would emphasize diagnostics; the art of modelling would emphasize balance and business impact. Did I mention sometimes in a practical business environment ‘bad statistics’ are allowed, balanced against running a business? Gasp!

Depending on the issues and data, etc., other possible solutions exist. Putting all the independent variables in a factor matrix would keep the variables’ variance intact but the factors are, by definition, orthogonal (uncorrelated).

Another (correcting) technique is called ridge regression (typically using Stein estimates) and requires special software (in SAS ‘proc reg data=x.x outvif outest=xx ridge=0 to 1 by 0.01; model y=x1 x2’ etc.) and expertise to use. In general, it trades collinearity for bias in the parameter estimates. Again, the balance is in knowing the coefficients are now biased but a drastic reduction in collinearity results. Is it worth it? Sorry, but the answer is, it depends.

While VIF is helpful, the condition index has become (since Belsley, Kuh and Welsch’s 1980 book Regression Diagnostics) the state of the art in collinearity diagnostics. The maths behind it is fascinating but many textbooks will illuminate that. We will focus on an example. The approach, without getting TOO mathematical, is to calculate the condition index of each variable. The condition index is the square root of the largest eigenvalue (called the characteristic root) divided by each variable’s eigenvalue, that is, $\sqrt{\lambda_{\max}/\lambda_i}$. (An eigenvalue is the variance of each principal component when used in the correlation matrix.) The eigenvalues add up to the number of variables (including the intercept): see Table 4.6 below. This is a powerful diagnostic because a set of eigenvalues of relatively equal magnitude indicates that there is little collinearity. A small number of large eigenvalues indicates that a small number of component variables describe most of the variability of the variables. A zero eigenvalue implies perfect collinearity and (this is important) very small eigenvalues mean there is severe collinearity. Again, an eigenvalue near 0.00 indicates collinearity. As a rule of thumb, a condition index > 30 indicates severe collinearity.

Table 4.6 Variance

Ind var  Eigenvalue  Cond index  Prop inter  Prop X1  Prop X2  Prop X3  Prop X4  Prop X5  Prop X6
Inter    6.861       1.000       0.000       0.000    0.000    0.000    0.000    0.000    0.000
X1       0.082       9.142       0.000       0.000    0.000    0.091    0.014    0.000    0.000
X2       0.046       12.256      0.000       0.000    0.000    0.064    0.001    0.000    0.000
X3       0.011       25.337      0.000       0.000    0.000    0.427    0.065    0.001    0.000
X4       0.000       230.420     0.000       0.000    0.000    0.115    0.006    0.016    0.456
X5       0.000       1048.100    0.000       0.000    0.831    0.000    0.225    0.328    0.504
X6       0.000       432750.000  0.999       1.000    0.160    0.320    0.689    0.655    0.038

Common outputs along with the VIF and condition index are the proportions of variance (see Table 4.6). This proportion of variance shows the percentage of the variance of the coefficient associated with each eigenvalue. A high proportion of variance reveals a strong association with the eigenvalue.

Let’s talk about Table 4.6. First look at the condition index. The eigenvalue on the intercept is 6.86 and the first condition index is the square root of 6.86/6.86 = 1.00. Now the second condition index is the square root of 6.86/0.082 = 9.142. The diagnostics indicate that there are as many collinearity problems as there are condition indexes > 30, or in this case there may be three problems (230.42, 1048.1 and 432750). Look to the proportion of variance columns. Any proportion > 0.50 is a red flag. Look at the last row, X6. Variable X6 is related to the intercept, X1, X4 and X5. X5 is related to X2 (0.831) and X6 (0.504). This indicates X6 is the most problematic variable. Something ought to be done about that.
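The condition index arithmetic is easy to reproduce. The sketch below takes the eigenvalues as given (it does not compute them from data) and uses the first four rounded eigenvalues printed in Table 4.6, so its results differ slightly from the table's own condition indexes, which were presumably computed from unrounded values. The function names are illustrative.

```python
import math


def condition_indexes(eigenvalues):
    """Square root of the largest eigenvalue divided by each eigenvalue."""
    largest = max(eigenvalues)
    return [math.sqrt(largest / value) for value in eigenvalues]


def severe_collinearity(eigenvalues, threshold=30.0):
    """Positions whose condition index exceeds the rule-of-thumb threshold."""
    return [i for i, ci in enumerate(condition_indexes(eigenvalues)) if ci > threshold]


# First four eigenvalues from Table 4.6, as printed (rounded to three decimals).
eigenvalues = [6.861, 0.082, 0.046, 0.011]
indexes = condition_indexes(eigenvalues)
print([round(ci, 2) for ci in indexes])
print(severe_collinearity(eigenvalues))
```

The remaining rows of the table show eigenvalues of 0.000 after rounding, which is why their condition indexes are so large and why they cannot be recomputed from the printed figures.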

Possible solutions might mean combining X5 and X6 into a factor and using the resulting factor as a variable instead of X5 and X6 as currently measured. This is because factors are by construction uncorrelated (we call it orthogonal). Another option would be to transform (especially) X6, either taking its exponent, or square root, or something else. The point is to try to find an X6-like variable correlated with the dependent variable but LESS CORRELATED with, especially, X5. Are you able to get a larger sample? Can you take differences in X6, rather than just the raw measure? And yes, if there is a theoretical reason, you can drop X6 and re-run the model and see what you have. Dropping a variable is a last resort. Have I mentioned that?

A brief procedural note

On probably most of the analytic techniques we’ll talk about, certain assumptions are built in. That is, regression has many assumptions about linearity, normality, etc. While for OLS I mentioned one assumption (especially for time series data) was no serial correlation, this same assumption is applied to logistic regression as well. Most regression techniques use most of these assumptions. So while in logit I showed how to test and correct for collinearity, this same test needs to be applied in OLS as well. It just happened to come up during our discussion of logistic regression.

This means that in reality, for every regression technique used, every assumption should be checked and every violation of assumptions should be tested for and corrected, if possible. This goes for OLS, logit and anything else. Okay?

Variable diagnostics

As in all regression, a significance test is performed on the independent variables but because logit is non-linear, the t-test becomes the Wald test (which is the t-test squared, so $1.96^2 = 3.84$ at 95%). The p-value still needs to be < 0.05.

Pseudo $R^2$

Logistic regression does not have an $R^2$ statistic. This freaks a lot of people out but that’s why I showed the ‘confusion matrix’, which is a measure of goodness of fit. Remember (from OLS) $R^2$ is the shared variance between the actual dependent variable and the predicted dependent variable. The more variance these two share, the closer the predicted and actual dependent variables are. Remember OLS outputs an estimated dependent variable. Logistic regression does NOT output an estimated dependent variable. The actual dependent variable is 0 or 1. The ‘logit’ is the natural log of the odds, $\ln\left(p/(1-p)\right)$, where $p$ is the probability of the event. So there can be no ‘estimated’ dependent variable. If you HAVE to have some measure of goodness of fit I’d suggest using the log likelihood on the covariates and intercept. SPSS and SAS both output the -2LL on the intercept only and the -2LL on the intercept and covariates. Think of the -2LL on the intercept only as TSS (total sum of squares) and the -2LL on the intercept and covariates as the unexplained (error) sum of squares. Just as $R^2$ is 1 minus the error sum of squares over TSS, 1 minus the ratio of the two -2LL values will give an indication (called a pseudo-$R^2$) for those that need that metric.
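One common way to turn the two -2LL values into a single number is McFadden's pseudo-$R^2$: 1 minus the -2LL of the fitted model divided by the -2LL of the intercept-only model. The sketch below is an illustration of that ratio; the function name and the two -2LL values are hypothetical and do not come from the model in this chapter.

```python
def mcfadden_pseudo_r2(neg2ll_intercept_only, neg2ll_with_covariates):
    """1 - (-2LL of the fitted model) / (-2LL of the intercept-only model)."""
    return 1 - neg2ll_with_covariates / neg2ll_intercept_only


# Hypothetical -2LL values, as they might be read off SAS or SPSS output.
pseudo_r2 = mcfadden_pseudo_r2(neg2ll_intercept_only=1000.0,
                               neg2ll_with_covariates=700.0)
print(round(pseudo_r2, 3))
```

A model whose covariates add nothing leaves the -2LL unchanged and scores 0; the closer the fitted -2LL gets to zero, the closer the measure gets to 1.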

Scoring the database with probability formula

Typically after a logistic regression is run, especially in a database marketing process, the model has to be applied to score the database. Yes, SAS now has ‘proc score’ but I want you to be able to do it yourself and to understand what’s happening. It’s old fashioned but you will know more.

Say we have the below (Table 4.7) model with probability to purchase. That is, the dependent variable is purchase = 1 for the event and purchase = 0 for the non-event. Because of the logistic curve bounding between 0 and 1, the formula is probability = $1/(1 + e^{-Z})$ where $Z = \alpha + \beta X_i$. For this model that means:

$\text{Probability} = 1 / \left(1 + 2.71828^{-(4.566 - 0.003 X_1 + 1.265 X_2 + 0.003 X_3)}\right)$

This returns a probability between 0% and 100% for each customer (2.71828 = e). So apply this formula to your database and each customer will have a score (that can be used for a lift chart, see above) for probability to purchase.

Table 4.7 Probability to purchase

Independent variable  Parameter estimate
Intercept             4.566
X1                    -0.003
X2                    1.265
X3                    0.003
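Scoring by hand with the Table 4.7 estimates can be sketched as follows. The coefficient values come from the table; the function name and the three customers with their X1, X2 and X3 values are hypothetical, chosen only to show the sort.

```python
import math

# Parameter estimates from Table 4.7.
COEFFICIENTS = {"intercept": 4.566, "x1": -0.003, "x2": 1.265, "x3": 0.003}


def probability_to_purchase(x1, x2, x3, coef=COEFFICIENTS):
    """Apply 1 / (1 + e^-Z) with Z = intercept + the weighted inputs."""
    z = coef["intercept"] + coef["x1"] * x1 + coef["x2"] * x2 + coef["x3"] * x3
    return 1 / (1 + math.exp(-z))


# Hypothetical customers; the input values are invented for illustration.
customers = [
    {"id": "A", "x1": 1200, "x2": 0, "x3": 150},
    {"id": "B", "x1": 2500, "x2": 1, "x3": 50},
    {"id": "C", "x1": 500, "x2": 1, "x3": 300},
]

for customer in customers:
    customer["score"] = probability_to_purchase(
        customer["x1"], customer["x2"], customer["x3"])

# Highest probability first, ready for decile cuts or a mailing list.
ranked = sorted(customers, key=lambda c: c["score"], reverse=True)
print([(c["id"], round(c["score"], 3)) for c in ranked])
```

Sorting by the score is the step that feeds the lift chart: the top of the sorted list is decile 1.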

HIGHLIGHT

USING LOGISTIC REGRESSION FOR MARKET BASKET ANALYSIS

Abstract

In general, market basket analysis is a backward-looking exercise. It uses descriptive analysis (frequencies, correlation, mathematical KPIs, etc.) and outputs those products that tend to be purchased together. That gives no insights into what marketers should do with that output. Predictive analytics, using logistic regression, shows how much the probability of a product purchase increases/decreases given another product purchase. This gives marketers a strategic lever to use in bundling, etc.

What is a market basket?

In economics, a market basket is a fixed collection of items that consumers buy. This is used for metrics like CPI (inflation) etc. In marketing, a market basket is any two or more items bought together.

Market basket analysis is used, especially in retail/CPG, to bundle and offer promotions and gain insight in shopping/purchasing patterns. ‘Market basket analysis’ does not, by itself, describe HOW the analysis is done. That is, there is no associated technique with those words.

How is it usually done?

There are three general uses of data: descriptive, predictive and prescriptive. Descriptive is about the past, predictive uses statistical analysis to calculate a change on an output variable (eg, sales) given a change in an input variable (say, price) and prescriptive is a system that tries to optimize some metric (typically profit, etc.). Descriptive data (means, frequencies, KPIs, etc.) is a necessary, but not usually sufficient, step. Always get to at least the predictive step as soon as possible. Note that predictive here does not necessarily mean forecasted into the future. Structural analysis uses models to simulate the market, and estimate (predict) what causes what to happen. That is, using regression, a change in price shows what is the estimated (predicted) change in sales.

Market basket analysis often uses descriptive techniques. Sometimes it is just a ‘report’ of what percent of items are purchased together. Affinity analysis (a slight step above) is mathematical, not statistical. Affinity analysis simply calculates the percent of time combinations of products are purchased together. Obviously there is no probability involved. It is concerned with the rate of products purchased together, and not with a distribution around that association. It is very common and very useful but NOT predictive, and therefore NOT so actionable.
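For contrast with the predictive approach, the descriptive affinity calculation is just counting. In the minimal sketch below, the function name and the four toy baskets are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def affinity(baskets):
    """Share of all baskets in which each pair of products appears together."""
    pair_counts = Counter()
    for basket in baskets:
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    total = len(baskets)
    return {pair: count / total for pair, count in pair_counts.items()}


# Toy transactions, one list of product categories per basket.
baskets = [
    ["furniture", "home decor"],
    ["furniture", "home decor", "entertainment"],
    ["entertainment"],
    ["furniture", "entertainment"],
]

rates = affinity(baskets)
for pair, rate in sorted(rates.items()):
    print(pair, f"{rate:.0%}")
```

The output is a rate of co-purchase and nothing more: there is no estimate of how one purchase changes the likelihood of another, and no significance test around it.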

Logistic regression

Let’s talk about logistic regression. This is an ancient and well-known statistical technique, probably the analytic pillar upon which database marketing has been built. It is similar to ordinary regression in that there is a dependent variable that depends on one or more independent variables. There is a coefficient (although interpretation is not the same) and there is a (type of) t-test around each independent variable for significance.

The differences are that the dependent variable is binary (having two values, 0 or 1) in logistic and continuous in ordinary regression, and that interpreting the coefficients requires exponentiation. Because the dependent variable is binary, the result is heteroskedasticity. There is no (real) $R^2$, and ‘fit’ is about classification.

How to estimate/predict the market basket

The use of logistic regression in terms of market basket becomes obvious when it is understood that the predicted dependent variable is a probability. The formula to estimate probability from logistic regression is:

$P(i) = 1/(1 + e^{-Z})$

where $Z = \alpha + \beta X_i$. This means that the independent variables can be products purchased in a market basket to predict likelihood to purchase another product as the dependent variable. Note that there is not an issue of causality here, ie, presupposing that one (independent product) causes the purchase of the dependent product, only that they are associated together. The above means specifically taking each (major) category of product (focus driven by strategy) and running a separate model for each, putting in all significant other products as independent variables. For example, say we have only three products, x, y and z. The idea is to design three models and test significance of each, meaning using logistic regression:

x = f(y, z)

y = f(x, z)

z = f(x, y).

Of course other variables can go into the model as appropriate but the interest is whether or not the independent (product) variables are significant in predicting (and to what extent) the probability of purchasing the dependent product variable. Of course, after significance is achieved, the insights generated are around the sign of the independent variable, ie, does the independent product increase or decrease the probability of purchasing the dependent product.
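Fitting a full logistic regression needs a statistics package, but the simplest case can be sketched by hand. When a model has one binary independent product and an intercept, the maximum likelihood coefficient equals the log of the odds ratio from the 2x2 table of the two products. The sketch below covers only that one-predictor case, not the multi-product models described above; the function name and the ten toy baskets are invented for illustration.

```python
import math


def one_predictor_logit(baskets, dependent, independent):
    """Coefficient and odds ratio for dependent = f(independent), both binary.

    Returns None when any cell of the 2x2 table is empty, because the
    coefficient is then not finite.
    """
    both = sum(1 for b in baskets if dependent in b and independent in b)
    only_ind = sum(1 for b in baskets if dependent not in b and independent in b)
    only_dep = sum(1 for b in baskets if dependent in b and independent not in b)
    neither = sum(1 for b in baskets if dependent not in b and independent not in b)
    if 0 in (both, only_ind, only_dep, neither):
        return None
    odds_ratio = (both * neither) / (only_ind * only_dep)
    return math.log(odds_ratio), odds_ratio


# Toy baskets, one set of product categories per customer.
baskets = [
    {"furniture", "home decor"},
    {"furniture", "home decor"},
    {"furniture", "home decor"},
    {"furniture"},
    {"home decor"},
    {"newborn"},
    {"newborn"},
    {"newborn", "home decor"},
    {"newborn", "furniture"},
    {"electronics"},
]

for product in ("furniture", "newborn"):
    coefficient, ratio = one_predictor_logit(baskets, "home decor", product)
    print(product, round(coefficient, 3), round(ratio, 3))
```

A positive coefficient (odds ratio above 1) means buying the independent product goes with higher odds of buying the dependent one; a negative coefficient means lower odds. A real analysis would also test each coefficient for significance.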

An example

As a simple example, say we are analysing a retail store, with categories of products like consumer electronics, women’s accessories, newborn and infant items, etc. Thus, using logistic regression, a series of models should be run. That is:

consumer electronics = f(women’s accessories, jewellery and watches, furniture, entertainment, etc.)

This means the independent variables are binary, coded as a ‘1’ if the customer bought that category and a ‘0’ if not. Table 4.8 details the output for all of the models. Note that other independent variables can be included in the model, if significant. These would often be seasonality, consumer confidence, promotions sent, etc.

Table 4.8 Associated probabilities

Each row is one model (the dependent product); each column is an independent product.

Model (dependent)      Consumer electronics  Women’s accessories  Newborn, infant, etc.  Jewellery, watches  Furniture  Home décor  Entertainment
Consumer electronics   XXX                   Insig                Insig                  -23%                34%        26%         98%
Women’s accessories    Insig                 XXX                  39%                    68%                 22%        21%         Insig
Newborn, infant, etc.  Insig                 43%                  XXX                    -11%                -21%       -31%        29%
Jewellery, watches     -29%                  71%                  -22%                   XXX                 12%        24%         -11%
Furniture              31%                   18%                  -17%                   9%                  XXX        115%        37%
Home décor             29%                   24%                  -37%                   21%                 121%       XXX         31%
Entertainment          85%                   Insig                31%                    -9%                 41%        29%         XXX
Sporting goods         18%                   -37%                 -29%                   -29%                24%        9%          33%

To interpret, look at, say, the home décor model. If a customer bought consumer electronics, that increases the probability of buying home décor by 29%. If a customer bought newborn/infant items, that decreases the probability of buying home décor by 37%. If a customer bought furniture, that increases the probability of buying home décor by 121%. This has implications, especially for bundling and messaging. That is, offering, say, home décor and furniture together makes great sense, but offering home décor and newborn/infant items does not make sense.

And here is a special note about products purchased together. If it is known, via the above, that home décor and furniture tend to go together, these can be and should be bundled together, messaged together, etc. But there is no reason to PROMOTE them together or to discount them together because they are purchased together anyway.

Conclusion

The above detailed a simple (and more powerful) way to do market basket analysis. If given a choice, always go beyond mere descriptive techniques and apply predictive techniques.

Checklist

You’ll be the smartest person in the room if you:

Can differentiate between logistic and ordinary regression. Logistic and ordinary regression are similar in that both are single equations having a dependent variable explained by one or more independent variables. They are dissimilar in that ordinary regression has a continuous dependent variable while logistic regression has a binary variable; ordinary regression uses least squares to estimate the coefficients while logistic regression uses maximum likelihood.

Remember that logistic regression predicts a probability of an event.

Always test for outliers/influential observations using z-scores.

Point out that the ‘confusion matrix’ is a means of assessing goodness of fit.

Observe that lift/gain charts are used as a measure of modelling efficacy as well as (eg in direct mail) depth of mailing.

Remember to always check/correct for collinearity.

Suggest logistic regression as a way to model market baskets.

05

When are my customers most likely to buy?

Introduction

Conceptual overview of survival analysis

Business case

More about survival analysis

Model output and interpretation

Conclusion

Highlight: Lifetime value: how predictive analysis is superior to descriptive analysis

Checklist: You’ll be the smartest person in the room if you…

Introduction

Survival analysis is an especially interesting and powerful technique. In terms of marketing science it is relatively new, mostly getting exposure in these last 20 years or so. It answers a very important and particular question: ‘WHEN is an event (purchase, response, churn, etc.) most likely to occur?’ I’d submit this is a more relevant question than ‘HOW LIKELY is an event (purchase, response, churn, etc.) to occur?’ That is, a customer may be VERY likely to purchase but not for 10 months. Is timing information of value? Of course it is; remember, time is money.

Beware though. Given the increase in actionable information, it should be no surprise that survival analysis is more complex than logistic regression. Remember how much more complex logistic regression was than ordinary regression?

Conceptual overview of survival analysis

Survival analysis (via proportional hazards modelling) was essentially invented by Sir David Cox in 1972 with his seminal and oft-quoted paper, ‘Regression Models and Life Tables’ in the Journal of the Royal Statistical Society (Cox, 1972). It’s important to note this technique was specifically designed to study time until event problems. This came out of biostatistics and the event of study was typically death. That’s why it’s called ‘survival analysis’. Get it?

The general use case was in drug treatment. There would be, say, a drug study where a panel was divided into two groups; one group got the new drug and the other group did not. Every month the test subjects were called and basically asked, ‘Are you still alive?’ and their survival was tracked. There would be two curves developed, one following the treatment group and another following the non-treatment group. If the treatment tended to work the time until event (death) was increased.

One major issue involved censored observations. It’s an easy matter to compare the average survival times of the treatment vs. the non-treatment group.

Censored observation: that observation wherein we do not know its status. Typically the event has not occurred yet or was lost in some way.

But what about those subjects that dropped out of the study because they moved away or lost contact? Or the study ended and not everyone has died yet? Each of these involves censored observations. The question about what to do with these kinds of observations is why Cox regression was created: a non-parametric partial likelihood technique, which he called proportional hazards. It deals with censored observations, which are those patients that have an unknown time until event status. This unknown time until event can be caused by either not having the event at the time of the analysis or losing contact with the patient.

What about those subjects that died from another cause and not the cause the test drug was treating? Are there other variables (covariates) that influence (increase or decrease) the time until the event? These questions involve extensions of the general survival model. The first is about competing risks and the second is about regression involving independent variables. These will be dealt with soon enough.

BUSINESS CASE

At the end of the year Scott called his team and the marketing organization together for a review and brainstorming exercise. This is something Scott believed every smart analytics pro should do. He was especially interested in how the analytic team was perceived as providing value last year and what might be done differently in the upcoming year.

During the meeting the marketing managers complimented Scott and his team for providing actionable insights. The results gave most of them a good bonus and they wanted to get another one this year. They did not all completely understand the technical details and Scott made the culture around that okay. He tried to make his team viewed as consultants: accessible, conversational and engaged with the broader organization.

‘Thanks’, Scott said and turned to the director of consumer marketing, Stacy. ‘Where can we improve? What targeting would help you and your team?’

‘Well,wehaveaprettygoodprocessnow.Wepulllistsbasedonlikelihoodtorespond.It’sworkedwell.’

‘Yeah,I’mgladofthat.Theliftchartsfromlogithelpedusmailonlyasdeepasweneededto.’

‘ThisgivesusthebestROIinthecompany.’

‘Butisthatallwecando?Justtargetthosemostlikelytorespond?’Scottasked.

‘Whatelseisthere?’Stacyasked,checkingherphone.

‘Yeah,I’mnotsure’,Scottsaid.‘Whatdoyouneedtoknowtodoyourjob?Whatiftherewerenorestrictionsondataorfeasibilityoranythingelse?Youhaveamagicbuttonthatifyoupushityouwouldknowtheonethingthatwouldallowyoutodoyourjobbetter,betterthaneverbefore,aknowledgethatgivesyouatremendousadvantage.’

‘Easy!’Kristinasaid.‘IfIknewwhatproducteachcustomerwouldpurchaseinwhatorder,thatis,ifIknewWHENhewouldpurchaseadesktop,oranotebook,Iwouldnotsendalotofuselesscataloguesore-mailstohim.I’dsendtohimthemostcompellingmarcomatjusttherighttimewithjusttherightpromotionandjusttherightmessagingtomaximizehispurchase.’

Theyalllookedather.Thentheynoddedtheirheads.KristinahadtalkedwithScottaboutjoininghisteamaftershegraduates.

‘Itsoundslikesciencefiction’,Stacysaid.‘Wewouldgetalistofcustomerswithamostlikelytimetopurchaseeachproduct?’

Scottrubbedhischin.‘Yes.It’sapredictionofwheneachcustomerisgoingtopurchaseeachproduct.’

‘But’,saidMark,‘whatdoesthatmean?Before?’MarkwasananalystonScott’steam.‘Wewanttopredictwhenthey’llpurchase?’

‘Ithinkso’,Scottsaid.‘Predictwhenthey’llbuyadesktop,whenthey’llbuyanotebook,etc.’

‘Imaginehavingthedatabasescoredwiththenumberofdaysuntileachcustomerislikelytobuypersonalelectronics,adesktop,etc.’Kristinasaid.‘We’djustsortthedatabasebyproductsandthosemorelikelytobuysoonerwouldgetthecommunication.’

‘Butdoesthatmeanusingregression,orlogit,orwhat?’

‘Idon’tknow’,Scottsaid.‘Whatdowedoaboutpredictingthosewhohavenotpurchasedaproduct?Isthisprobabilitytobuyateachdistincttimeperiod?’

They all left the meeting excited about the new metric (time until purchase) but Scott was wondering what technique would answer that question. If they used ordinary regression, the dependent variable would be ‘number of days until purchase of a desktop’ based on some zero-day, say January first two years ago. Those that purchased a desktop would have the event at that many days. Those that did not purchase a desktop gave Scott a choice. Either he would cap the number of days at now, say two years from the zero date, which means, say, 730, if they were on file from the zero date onward. That is, those that have not purchased a desktop would be forced to have the event at 730 days. Not a good choice. The other option would be to delete those that did not purchase a desktop. Also not a good choice.

Rule numero uno: never ever under any circumstances delete data. Never. Ever. This is an ‘Off with their heads!’ crime (unless of course the data is wrong or an outlier).

Ignoring the time-until-event dependent variable could give rise to logistic regression. That is, those receiving a 1 if they did purchase a desktop and a 0 if they did not. This puts him right back into probability, and they all agreed that timing was a more strategic option. So Scott concluded that both OLS and logit have severe faults in terms of time until event problems.

It’s important to make a clarification about a trap a lot of people fall into. Survival analysis is a technique specifically designed to estimate and understand time until event problems. The underlying assumption is that each time period is independent of each other time period. That is, the prediction has no ‘memory’. Some under-educated/under-experienced analysts think that if we are, say, trying to predict what month an event will happen they can do 12 logits and have one model for January, another for February, etc. The collected data would have a 1 if the customer purchased in January and a 0 if not; likewise, if the model was for February a customer would have a 1 if they purchased in February and a 0 if not. This seems like it would work, right? Wrong. February is not independent of January. In order for the customer to buy in February they had to decide NOT to buy in January. See? This is why logit is inappropriate.

Now for you academicians, yes, logistic regression is appropriate for a small subset of a particular problem. If the data is periodic (an event that can only occur at regular and specific intervals) then, yes, logistic regression can be used to estimate survival analyses. This requires a whole different kind of data set, one where each row is not a customer but a time period with an event. I’d still suggest even then, why not just use survival analysis (in SAS, lifereg or phreg)?
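Here is a minimal sketch, in Python rather than SAS, of that ‘whole different kind of data set’: one row per customer per period at risk. The field names (`id`, `event_period`) and the 12-month window are assumptions made for illustration, not anything from Scott’s database.

```python
def to_person_period(customers, n_periods=12):
    """Expand one row per customer into one row per customer per period at risk.

    Each customer is a dict with hypothetical keys:
      'id'           - customer identifier
      'event_period' - 1-based period of purchase, or None if no purchase yet
    """
    rows = []
    for cust in customers:
        # Censored customers (no purchase) stay at risk through the last period
        last = cust["event_period"] or n_periods
        for period in range(1, last + 1):
            rows.append({
                "id": cust["id"],
                "period": period,
                # 1 only in the period the purchase happened, 0 while still waiting
                "event": int(cust["event_period"] == period),
            })
    return rows


customers = [
    {"id": "A", "event_period": 2},     # bought in February
    {"id": "B", "event_period": None},  # has not bought: censored
]
rows = to_person_period(customers)
```

Customer A contributes a January row with a 0 and a February row with a 1, then drops out of the data. That is how the February estimate gets to ‘remember’ that the customer did not buy in January, which 12 separate logits cannot do.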

More about survival analysis

As mentioned, survival analysis came from biostatistics in the early 1970s, where the subject studied was an event: death. Survival analysis is about modelling the time until an event. In biostatistics the event is typically death but in marketing the event can be response, purchase, churn, etc.

Due to the nature of survival studies, there are a couple of characteristics that are endemic to this technique. As alluded to earlier, the dependent variable is time until event, so time is built into the analysis. The second endemic thing to survival analysis is observations that are censored. A censored observation is either an observation that has not had the event or an observation that was lost to the study and there is no knowledge of having the event or not – but we do know at some point in time that the observation has not had the event.

In marketing it’s common for the event to be a purchase. Imagine scoring a database of customers with time until purchase. That is far more actionable than, from logistic regression, probability of purchase.

Let’s talk about censored observations. What can be done about them? Remember we do not know what happened to these observations. We could delete them. That would be simple, but depending on how many there are that might be throwing away a lot of data. Also, they might be the most interesting data of all, so deleting them is probably a bad idea. (And, remember the ‘Off with their heads!’ crime mentioned previously.) We could just give the maximum time until an event to all those that have not had the event. This would also be a bad idea, especially if a large portion of the sample is censored, as is often the case. (It can be shown that throwing away a lot of censored data will bias any results.) Thus, we need a technique that can deal with censored data. Also, deleting censored observations ignores a lot of information. While we don’t know when (or even if) the customer, say, purchased, we do know as of a certain time that they did NOT purchase. So we have part of their curve, part of their information, part of their behaviour. This should not ever be deleted. This is why Cox invented partial likelihood.
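To see why both shortcuts are bad ideas, here is a small simulation sketch. Everything in it is an assumption for illustration: the true average of 400 days, the exponential shape of the purchase times and the roughly two-year observation window.

```python
import random

random.seed(42)
TRUE_MEAN_DAYS = 400   # assumed true average time until purchase
WINDOW_DAYS = 730      # assumed observation window (about two years)

# Simulated 'true' purchase times, which the analyst never fully sees
true_times = [random.expovariate(1 / TRUE_MEAN_DAYS) for _ in range(5000)]

# What the analyst actually sees: (days observed, 1 if purchased else 0)
observed = [(min(t, WINDOW_DAYS), int(t <= WINDOW_DAYS)) for t in true_times]


def mean(values):
    return sum(values) / len(values)


# Option 1: delete the censored customers
deleted_mean = mean([days for days, event in observed if event == 1])
# Option 2: pretend censored customers purchased on the last day of the window
capped_mean = mean([days for days, _ in observed])
# The benchmark the analyst is trying to recover
actual_mean = mean(true_times)
```

Both shortcuts come in below `actual_mean`, and deleting is the worse of the two, because the customers who take longest to buy are exactly the ones being thrown away or cut short. Keeping each record as a (time, event) pair, as `observed` does, is the form that survival techniques work from.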

Figure 5.1 General survival curve

The above is a general survival curve. The vertical axis is a count of those in the ‘risk set’ and it starts out with 100%. That is, at time 0 everyone is ‘at risk’ of having the event and no one has had the event. At day 1, that is, after one day, one person died (had the event) and there are now 99 that are at risk. No one died for 3 days until 9 had the event at day 5, etc. Note that at about day 12, 29 had the event.

Now note Figure 5.2. One survival curve is the same as above, but the other one is ‘further out’. Note that 50% of the first curve is reached at 14 days, but the second curve does not reach 50% until 28 days. That is, they ‘live longer’.

Figure 5.2 Survival analysis

Survival analysis is a type of regression, but with a twist. In its most common form, proportional hazards, it does not use full maximum likelihood, but partial likelihood. (Parametric models such as lifereg are fitted by ordinary maximum likelihood.) The dependent variable is now two parts: time until the event and whether the event has occurred or not. This allows the use of censored observations.
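To make the two-part dependent variable concrete, here is a minimal sketch of a survival curve built from (time, event) pairs. It uses the Kaplan-Meier product-limit estimator, which the text does not discuss and which is swapped in here because it needs no covariates or distribution; it is not what lifereg or phreg do internally. The five observations are made up.

```python
def kaplan_meier(observations):
    """Return [(time, survival probability)] at each distinct event time.

    observations: list of (time, event) pairs, where event = 1 if the event
    occurred at `time` and 0 if the observation was censored at `time`.
    """
    at_risk = len(observations)
    survival = 1.0
    curve = []
    for t in sorted({time for time, _ in observations}):
        events = sum(1 for time, event in observations if time == t and event == 1)
        leaving = sum(1 for time, _ in observations if time == t)
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored cases both leave the risk set
    return curve


def median_time(curve):
    """First time at which the survival curve falls to 50% or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # more than half the sample is still event-free


data = [(1, 1), (2, 0), (3, 1), (4, 1), (5, 0)]  # hypothetical weeks until purchase
curve = kaplan_meier(data)
```

Notice that the customer censored at week 2 is never counted as a purchase, but still counts in the risk set at week 1. That is the ‘part of their curve’ that deleting them would throw away.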

The above graphs are survival graphs. Much of Cox regression is not about the survival curve, but the hazard rate. The hazard is derived from the survival curve: it is the instantaneous risk that an event will occur at some particular time, given that it has not occurred before then. Think of metrics like miles per hour as analogous to the hazard rate. At 40 miles per hour you will travel 40 miles in one hour if speed remains the same. The hazard quantifies the rate of the event in each period of time.

SAS does both survival modelling (with proc lifereg) and hazard modelling (as proc phreg). SPSS only does hazard modelling (as Cox regression). Lifereg handles left and interval censoring as well as right censoring, while phreg does only right censoring (this is not usually an issue for marketing). With lifereg a distribution must be specified, but with phreg (as it’s semi-parametric) there is no distribution. This is one of the advantages of phreg. The other advantage of phreg is that it incorporates time-varying independent variables, while lifereg does not. (This also is not usually much of an issue for marketing.)

I typically use lifereg as it easily outputs a time-until-event prediction, it models the survival curve and it is relatively easy to understand and interpret. That’s what we’ll demonstrate here.

I might mention that survival analysis is not just about the time until event prediction. As with all regressions the independent variables are strategic levers. Say we find that for every 1,000 e-mails we send, purchases tend to happen three days sooner. Do you see the financial implications here? How valuable is it to know you have incentivized a group of customers in making purchases earlier? If this does not interest you then you are in the wrong career field.

Model output and interpretation

So Scott’s team investigated survival analysis and concluded it was worth a shot. It seemed to give a way to answer the key question, ‘WHEN is a customer most likely to purchase a desktop?’

Table 5.1 Final desktop model, lifereg

Independent variables      Beta     e^B     (e^B)-1   Avg TTE
Any previous purchase      –0.001   0.999   –0.001    –0.012
Recent online visit        –0.014   0.987   –0.013    –0.148
# Direct mails              0.157   1.17     0.17      1.865
# E-mails opened           –0.011   0.989   –0.011    –0.12
# E-mails clicked          –0.033   0.968   –0.032    –0.352
Income                     –0.051   0.95    –0.05     –0.547
Size household             –0.038   0.963   –0.037    –0.408
Education                  –0.023   0.977   –0.023    –0.249
Blue collar occupation      0.151   1.163    0.163     1.792
# Promotions sent          –0.006   0.994   –0.006    –0.066
Purch desktop < year        2.09    8.085    7.085    77.934

The table above lists the final desktop model using lifereg. The variables are all significant at the 95% level. The first column is the name of the independent variable. The interpretation of lifereg coefficients requires transformations. This gets the parameter estimates into a form to make strategic interpretation.

The next column is the beta coefficient. This is what SAS outputs but, as with logistic regression, is not very meaningful. A negative coefficient tends to bring the event of a desktop purchase in; a positive coefficient tends to push the event (desktop purchase) out. This is a regression output so in that regard interpretation is the same, ceteris paribus.

To get percent impacts on time until event (TTE), each beta coefficient must be exponentiated, e^B. That’s the third column. The next column subtracts 1 from it and converts it into a percentage. Note that, for example, ‘recent online visit’ e^B is a 0.987 impact on time, or, if 1 is subtracted, shows a 1.3% decrease in average TTE. To convert that to a scale – say the average is 11 weeks – multiply by 11, which gives the –0.148 weeks in the last column. (The table works from unrounded values; the rounded –0.013 * 11 gives –0.143.) The interpretation is that if a customer had a recent online visit that tends to pull in (shorten) TTE by 0.148 weeks. Not real impactful but it makes sense, right?

Notice the last variable, ‘purch desktop < year’. See how it’s positive, 2.09? This means if the customer has purchased a desktop in the last year the time until (another) desktop purchase is pushed out by ((e^B) – 1) * 11 = 77.934 weeks. See how this works? See how strategically insightful survival analysis can be? You can build a business case around marcom sent (cost of marcom) and decreasing the time until purchase (revenue realized sooner).
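The column arithmetic in Table 5.1 can be sketched in a few lines of Python. The function name is my own; the betas are the rounded values printed in the table and the 11-week average is the one used in the text.

```python
import math

AVG_TTE_WEEKS = 11  # average time until event used in the text


def tte_impact(beta, avg_tte=AVG_TTE_WEEKS):
    """Return (e^B, (e^B) - 1, change in average TTE in weeks) for one coefficient."""
    multiplier = math.exp(beta)
    pct_change = multiplier - 1
    return multiplier, pct_change, pct_change * avg_tte


betas = {  # rounded betas as printed in Table 5.1
    "Recent online visit": -0.014,
    "# Direct mails": 0.157,
    "Purch desktop < year": 2.09,
}
for name, beta in betas.items():
    mult, pct, weeks = tte_impact(beta)
    print(f"{name:22s} e^B={mult:.3f}  pct={pct:+.3f}  weeks={weeks:+.3f}")
```

Because the printed betas are rounded to three decimals, the smaller effects can differ from the table in the last digit or two; the large ‘purch desktop < year’ effect reproduces the 77.934 weeks.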

As typically used on a database, each customer is scored with time until the event, in this case, time until a desktop purchase. The database is sorted and a list is designed with those most likely to purchase next (see Table 5.2 below). This time until event (TTE) is the median (50th percentile).

Table 5.2 Time until event (in weeks)

Customer ID   TTE
1000          3.365
1002          3.702
1004          4.072
1006          4.479
1011          5.151
1013          5.923
1015          6.812
1017          7.834
1022          9.009
1024          10.36
1026          12.43
1030          14.92

Note that customer 1000 is expected to purchase a desktop in about 3.4 weeks and customer 1030 is expected to purchase a desktop in about 14.9 weeks. Using survival analysis (in SAS, proc lifereg) allowed Scott’s team to score the database with those likely to purchase sooner. This list is more actionable than using logistic regression, where the score is just probability to purchase.

Now let’s talk about competing risks. While survival analysis is about death, the study usually is interested in ONE kind of death, or death from ONE cause. That is, the biostat study is about, say, death by heart attack and not about death by cancer or death by a car accident. But it’s true that in a study of death by heart attack a patient is also at risk for other kinds of death. This is called competing risks.

In the marketing arena, while the focus might be on a purchase event for, say, a desktop PC, the customer is also ‘at risk’ for purchasing other things, like a notebook or consumer electronics. Fortunately, this is an easy job of just coding the events of interest. That is, Scott can code for an event as DT (desktop) purchase, with all else coded as a non-event. He can do another model as a purchase event of, say, notebooks, and all else is a non-event, that is, all other things are censored. Thus Table 5.3 shows three models, having a purchase event for desktop, notebook and consumer electronics.

Table 5.3 Three model comparison

Customer ID   TT desktop purch   TT notebook purch   TT consumer electronics purch
1000          3.365              75.66               39.51
1002          3.702              88.2                45.95
1004          4.072              111.2               55.66
1006          4.479              15.05               19.66
1011          5.151              13.07               9.109
1013          5.923              9.945               7.934
1015          6.812              22.24               144.5
1017          7.834              3.011               5.422
1022          9.009              2.613               5.811
1024          10.36              1.989               6.174
1026          12.43              4.448               8.44
1030          14.92              0.602               7.76
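One way a table like Table 5.3 might be put to work is to pick, for each customer, the product with the shortest predicted time until purchase and then sort the contact list by that time. This is a sketch of that idea using three rows from the table; the function and variable names are my own.

```python
# Predicted weeks until purchase for three customers in Table 5.3
scores = {
    1000: {"desktop": 3.365, "notebook": 75.66, "consumer electronics": 39.51},
    1017: {"desktop": 7.834, "notebook": 3.011, "consumer electronics": 5.422},
    1030: {"desktop": 14.92, "notebook": 0.602, "consumer electronics": 7.76},
}


def next_purchase(product_ttes):
    """Return (product, weeks) for the product with the shortest predicted TTE."""
    product = min(product_ttes, key=product_ttes.get)
    return product, product_ttes[product]


# Contact list: customers ordered by how soon their next purchase is expected
contact_list = sorted(
    ((cust, *next_purchase(ttes)) for cust, ttes in scores.items()),
    key=lambda row: row[2],
)
```

Sorted only on desktop TTE, customer 1030 sits at the bottom of the list. Looked at across all three models, the same customer jumps to the top, with a notebook message rather than a desktop one.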

A little technical background

First, something to note about lifereg is that it requires you to give it a distribution. (Phreg does not require that you give it a distribution, something a lot of analysts like.) In using lifereg, I’d suggest testing all distributions, and the one that fits the best (lowest BIC or highest log likelihood) is the one to use. Another view would be to acknowledge that the distribution has a shape and ascertain what shape makes sense given the data you’re using.

Pseudo R2

While R2 as a metric makes no sense (same as with logistic regression) a lot of analysts like some kind of R2. To review, R2 in OLS is the shared variance between the actual dependent variable and the predicted dependent variable. In survival analysis there is no predicted dependent variable. Most folks use the median as the prediction and that’s okay. I’d suggest running a simple model with, and without, covariates. That is, in SAS with proc lifereg, run the model without the covariates (independent variables) and collect the –2 log likelihood stat. Then run the model with the covariates and collect the –2LL stat and divide. This metric (by analogy) shows the percent of explained over the percent unexplained.
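Here is a small sketch of that comparison. The text says only to ‘divide’, so dividing the with-covariates value by the without-covariates value is my assumption, and the –2LL numbers are hypothetical. The second quantity returned, one minus the ratio, is the McFadden-style pseudo R2 that many packages report; it is an addition, not something the text prescribes.

```python
def likelihood_ratio_stats(neg2ll_null, neg2ll_model):
    """Compare the -2 log likelihood of a model with and without covariates.

    Returns (ratio, one_minus_ratio):
      ratio           - the 'divide' step described in the text (assumed direction)
      one_minus_ratio - McFadden-style pseudo R2 (an addition, not from the text)
    """
    ratio = neg2ll_model / neg2ll_null
    return ratio, 1 - ratio


# Hypothetical -2LL values, for illustration only
ratio, pseudo_r2 = likelihood_ratio_stats(neg2ll_null=5200.0, neg2ll_model=4160.0)
```

With these made-up numbers the covariates take the –2LL down to 80% of its no-covariate value, a pseudo R2 of 0.2. Treat it strictly as an analogy: the ratio only behaves like an R2 when both –2LL values are positive, which is not guaranteed for continuous-time models.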

Conclusion

Survival analysis is not a common topic in marketing analytics and it should be. While it’s true that marketers and biostatisticians (where survival analysis originated) do not move in the same circles, I’ve now given you some of the basics, so go and get to work.

HIGHLIGHT

LIFETIME VALUE: HOW PREDICTIVE ANALYSIS IS SUPERIOR TO DESCRIPTIVE ANALYSIS

Abstract

Typically lifetime value (LTV) is but a calculation using historical data. This calculation makes some rather heroic assumptions to project into the future but gives no insights into why a customer is lower valued, or how to make a customer higher valued. Using predictive techniques, here survival analysis, gives an indication as to what causes purchases to happen sooner, and thus how to increase LTV.

Descriptive analysis

Lifetime value (LTV) is typically done as just a calculation, using past (historical) data. That is, it’s only descriptive.

While there are many versions of LTV (depending on data, industry, interest, etc.) the following is conceptually applied to all. LTV, via descriptive analysis, works as follows:

1. It uses historical data to sum up each customer’s total revenue.
2. This sum then has subtracted from it some costs: typically cost to serve, cost to market, cost of goods sold, etc.
3. This net revenue is then converted into an annual average amount and depicted as a cash flow.
4. These cash flows are assumed to continue into the future and diminish over time (depending on durability, sales cycle, etc.) often decreasing arbitrarily by say 10% each year until they are effectively zero.
5. These (future, diminished) cash flows are then summed up and discounted (usually by weighted average cost of capital) to get their net present value (NPV).
6. This NPV is called LTV. This calculation is applied to each customer.

Thus each customer has a value associated with it. The typical use is for marketers to find the ‘high-valued’ customers (based on past purchases). These high-valued customers get most of the communications, promotions/discounts and marketing efforts. Descriptive analysis is merely about targeting those already engaged, much like RFM (recency, frequency, monetary), which we will discuss later.

This seems to be a good starting point but, as is usual with descriptive analysis, contributes nothing informative. Why is one customer more valuable, and will they continue to be? Is it possible to extract additional value, but at what cost? Is it possible to garner more revenue from a lower valued customer because they are more loyal or cost less to serve? What part of the marketing mix is each customer most sensitive to? LTV (as described above) gives no implications for strategy. The only strategy is to offer and promote to (only) the high-valued customers.

Predictive analysis

How would LTV change using predictive analysis instead of descriptive analysis? First note that while LTV is a future-oriented metric, descriptive analysis uses historical (past) data and the entire metric is built on that, with assumptions about the future applied unilaterally to every customer. Predictive analysis specifically thrusts LTV into the future (where it belongs) by using independent variables to predict the next time until purchase. Since the major customer behaviour driving LTV is timing, amount and number of purchases, a statistical technique needs to be used that predicts time until an event. (Ordinary regression predicting the LTV amount ignores timing and number of purchases.)

Survival analysis is a technique designed specifically to study time until event problems. It has timing built into it and thus a future view is already embedded in the algorithm. This removes much of the arbitrariness of typical (descriptive) LTV calculations.

So, what about using survival analysis to see which independent variables, say, bring in a purchase? Decreasing time until purchase tends to increase LTV. While survival analysis can predict the next time until purchase, the strategic value of survival analysis is in using the independent variables to CHANGE the timing of purchases. That is, descriptive analysis shows what happened; predictive analysis gives a glimpse of what might CHANGE the future.

Strategy using LTV dictates understanding the causes of customer value: why a customer purchases, what increases/decreases the time until purchase, probability of purchasing at future times, etc. Then when these insights are learned, marketing levers (shown as independent variables) are exploited to extract additional value from each customer. This means knowing that one customer is, say, sensitive to price and that a discount will tend to decrease their time until purchase. That is, they will purchase sooner (maybe purchase larger total amounts and maybe purchase more often) with a discount. Another customer prefers, say, product X and product Y bundled together to increase the probability of purchase and this bundling decreases their time until purchase. This insight allows different strategies for different customer needs and sensitivities. Survival analysis applied to each customer yields insights to understand and incentivize changes in behaviour.

This means just assuming the past behaviour will continue into the future (as descriptive analysis does), with no idea why, is no longer necessary. It’s possible for descriptive and predictive analysis to give contradictory answers. Which is why ‘crawling’ might be detrimental to ‘walking’.

If a firm can get a customer to purchase sooner, there is an increased chance of adding purchases – depending on the product. But even if the number of purchases is not increased, the firm getting revenue sooner will add to their financial value (time is money).

Also a business case can be created by showing the trade-off in giving up, say, margin but obtaining revenue faster. This means strategy can revolve around optimization of cost balanced against customer value.

The idea is to model next time until purchase, the baseline, and see how to improve that. How is this carried out? A behaviourally-based method would be to segment the customers (based on behaviour) and apply a survival model to each segment and score each individual customer. By behaviour we typically mean purchasing (amount, timing, share of products, etc.) metrics and marcom (open and click, direct mail coupons, etc.) responses.

An example

Let’s use an example. Table 5.4 shows two customers from two different behavioural segments. Customer XXX purchases every 88 days with an annual revenue of 43,958, costs of 7,296 for a net revenue of 36,662. Say the second year is exactly the same. So year one discounted at 9% is NPV of 33,635 and year two discounted at 9% for two years is 30,857 for a total LTV of 64,492. Customer YYY has similar calculations for LTV of 87,898.

Table 5.4 Comparison of customers from different behavioural segments

Customer  Days between purchases  Annual purchases  Total revenue  Total costs  Net rev YR1  Net rev YR2  YR1 Disc  YR2 Disc  LTV at 9%
XXX       88                      4.148             43,958         7,296        36,662       36,662       33,635    30,857    64,492
YYY       58                      6.293             62,289         12,322       49,967       49,967       45,842    42,056    87,898
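The discounting behind Table 5.4 can be sketched as follows. The function name is my own; the revenue, cost and 9% figures come from the table, and the two-year horizon with identical net revenue in each year is the simplification the example itself makes.

```python
def discounted_ltv(net_revenue_by_year, rate=0.09):
    """Net present value of yearly net revenues, with year 1 discounted one period."""
    return sum(
        net / (1 + rate) ** year
        for year, net in enumerate(net_revenue_by_year, start=1)
    )


customers = {
    "XXX": {"revenue": 43_958, "costs": 7_296},
    "YYY": {"revenue": 62_289, "costs": 12_322},
}
ltv = {}
for name, c in customers.items():
    net = c["revenue"] - c["costs"]        # same net revenue assumed in both years
    ltv[name] = discounted_ltv([net, net])
```

Up to rounding, this reproduces the 64,492 and 87,898 in the table. Notice that nothing in the calculation says why the two customers differ, which is the point of the discussion that follows.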

The above (using descriptive analysis) would have marketers targeting customer YYY with >23,000 value over customer XXX. But do we know anything about WHY customer XXX is so much lower valued? Is there anything that can be done to make them higher valued?

Applying a survival model to each segment outputs independent variables and shows their effect on the dependent variable. In this case the dependent variable is (average) time until purchase. Say the independent variables (which defined the behavioural segments) are things like price discounts, product bundling, seasonal messages, adding additional direct mail catalogues and offering online exclusives. The segmentation should separate customers based on behaviour and the survival models should show how different levels of independent variables drive different strategies.

Table 5.5 shows results of survival modelling on the two different customers that come from two different segments. The independent variables are price discounts of 10%, product bundling, etc. The TTE is time until event and shows what happens to time until purchase based on changing one of the independent variables. For example, for customer XXX, giving a price discount of 10% on average decreases their time until purchase by 14 days. Giving YYY a 10% discount decreases their time until purchase by only 2 days. This means XXX is far more sensitive to price than YYY – which would not be known by descriptive analysis alone. Likewise giving XXX more direct mail catalogues pushes out their TTE but pulls in YYY by 2 days. Note also that very few of the marketing levers affect YYY very much. We are already getting nearly all from YYY that we can, and no marketing effort does very much to impact the TTE. However, with XXX there are several things that can be done to bring in their purchases. Again, none of these would be known without survival modelling on each behavioural segment.

Table 5.5 Results of survival modelling

Variables              XXX TTE   YYY TTE
Price discount 10%     –14       –2
Product bundling       –4        12
Seasonal message       6         5
Five more catalogues   11        –2
Online exclusive       –11       3

Table 5.6 below shows new LTV calculations on XXX after using survival modelling results. We decreased TTE by 24 days, by using some combinations of discounts, bundling and online exclusives, etc. Note now the LTV for XXX (after using predictive analysis) is greater than YYY.

Table 5.6 LTV calculations

Customer  Days between purchases  Annual purchases  Total revenue  Total costs  Net rev YR1  Net rev YR2  YR1 Disc  YR2 Disc  LTV at 9%
XXX       64                      5.703             60,442         10,032       50,410       50,410       46,248    42,429    88,677
YYY       58                      6.293             62,289         12,322       49,967       49,967       45,842    42,056    87,898
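The move from 64,492 to 88,677 for customer XXX can be retraced with a short sketch. The function names are my own; the 88 days, the 24-day reduction and the two net revenue figures are taken from the text and tables, and the two-year horizon at 9% is the one the example uses.

```python
def annual_purchases(days_between):
    """Purchases per year implied by the average gap between purchases."""
    return 365 / days_between


def two_year_ltv(net_per_year, rate=0.09):
    """Two years of identical net revenue, discounted at `rate`."""
    return net_per_year / (1 + rate) + net_per_year / (1 + rate) ** 2


before_days, tte_reduction = 88, 24           # from Table 5.4 and the text
after_days = before_days - tte_reduction      # 64 days between purchases

ltv_before = two_year_ltv(36_662)             # net revenue from Table 5.4
ltv_after = two_year_ltv(50_410)              # net revenue from Table 5.6
uplift = ltv_after - ltv_before
```

Shortening the gap from 88 to 64 days lifts annual purchases from about 4.15 to about 5.70. At the new net revenue, year one discounts to roughly 46,248 and year two to roughly 42,429, which sum to the 88,677 reported for XXX and edge past YYY’s 87,898.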

What survival analysis offers, in addition to marketing strategy levers, is a financially optimal scenario, particularly in terms of costs to market. That is, customer XXX responds to a discount. It’s possible to calculate and test what is the (just) needed threshold of discounts to bring a purchase in by so many days with the estimated level of revenue. This ends up being a cost/benefit analysis that makes marketers think about strategy. This is the advantage of predictive analysis – giving marketers strategic options.

Checklist

You’ll be the smartest person in the room if you:

Point out that ‘time until an event’ is a more relevant marketing question than ‘probability of an event’.

Remember that survival analysis came out of biostatistics and is somewhat rare in marketing, but very powerful.

Observe that there are two ‘flavours’ of survival analysis: lifereg and proportional hazards. Lifereg models the survival curve and proportional hazards models the hazard rate.

Champion competing risks, a natural output of survival analysis. In marketing, this gives time until various events or time until multiple products purchased, etc.

Understand that predictive lifetime value (using survival analysis) is more insightful than descriptive lifetime value.

06

Modelling dependent variable techniques (with more than one equation)

Introduction

What are simultaneous equations?

Why go to the trouble of using simultaneous equations?

Desirable properties of estimators

Business case

Checklist: You’ll be the smartest person in the room if you…

Introduction

So far we’ve dealt with one equation, a rather simple point of view. Of course, consumer behaviour is anything but simple. Marketing science is designed to understand, predict and ultimately incentivize/change consumer behaviour. This requires techniques that are as complicated as that behaviour is sophisticated. This is where simultaneous equations come in, as a more realistic model of behaviour.

Simultaneous equations: a system of more than one dependent variable-type equation, often sharing several independent variables.

What are simultaneous equations?

Simply put, simultaneous equations are systems of equations. You had this in algebra. It’s important. This begins to build a simulation of an entire process. It’s done in macroeconomics (remember the Keynesian equations?) and it can be done in marketing.

Predetermined and exogenous variables

There are two kinds of variables: predetermined (lagged endogenous and exogenous) and endogenous variables. Generally, exogenous are variables determined OUTSIDE the system of equations and endogenous are determined INSIDE the system of equations. (Think of endogenous variables as being explained by the model.) This comes in handy to know when using the rule in the identification problem below. (The identification problem is a GIANT pain in the neck but the model cannot be estimated without going through these hoops.)

This is important because a predetermined variable is one that is contemporaneously uncorrelated with the error term in its equation. Note how this ties up with causality. If Y is caused by X then Y cannot be an independent variable in contemporaneously predicting/explaining X.

Say we have a system common in economics:

Q (demand) = D(I) + D(price) + Income + D(error)

Q (supply) = S(I) + S(price) + S(error)

Note that the variables Q and price are endogenous (computed within the system) and income is exogenous. That is, income is given. (D(I) is the intercept in the demand equation and S(I) is the intercept in the supply equation.) These equations are called structural forms of the model. Algebraically, these structural forms can be solved for endogenous variables giving a reduced form of the equations.

Reduced form equations: in econometrics, models solved in terms of endogenous variables.

That is:
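A sketch of that algebra follows, under assumed notation: $\alpha_0, \alpha_1, \alpha_2$ are the demand intercept, price coefficient and income coefficient, $\beta_0, \beta_1$ are the supply intercept and price coefficient, $Y$ is income, and $u$, $v$ are the demand and supply errors. None of these symbols appear in the text, which writes the income term without a coefficient (set $\alpha_2 = 1$ to match). Setting demand equal to supply and solving gives:

```latex
\begin{aligned}
\text{Demand: } Q &= \alpha_0 + \alpha_1 P + \alpha_2 Y + u \\
\text{Supply: } Q &= \beta_0 + \beta_1 P + v \\[4pt]
P &= \frac{\alpha_0 - \beta_0}{\beta_1 - \alpha_1}
   + \frac{\alpha_2}{\beta_1 - \alpha_1}\,Y
   + \frac{u - v}{\beta_1 - \alpha_1} \\[4pt]
Q &= \frac{\alpha_0\beta_1 - \alpha_1\beta_0}{\beta_1 - \alpha_1}
   + \frac{\alpha_2\beta_1}{\beta_1 - \alpha_1}\,Y
   + \frac{\beta_1 u - \alpha_1 v}{\beta_1 - \alpha_1}
\end{aligned}
```

Both endogenous variables end up written only in terms of income and the two error terms. In particular, price carries both $u$ and $v$ inside it, which is what causes the trouble when price is used as a regressor.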

The reduced form of the equations shows how the endogenous variables (those determined within the system) DEPEND on the predetermined variables and error terms. That is, the values of Q and P are explicitly determined by income and errors. This means that income is given to us.

Note that the endogenous variable price appears as an independent variable in each equation. In fact, it is NOT independent, it depends on income and error terms and this is the issue. It is specifically correlated with its own (contemporaneous) error term. Correlation of an independent variable and its error term leads to inconsistent results.
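A quick simulation makes the point. Everything here is an assumption for illustration: the demand and supply coefficients, the normal errors and the sample size. Price and quantity are generated from a market that clears every period, then quantity is regressed on price with ordinary single-equation OLS, as if price really were independent.

```python
import random

random.seed(7)
A0, A1, A2 = 10.0, -1.0, 0.5   # assumed demand: Q = A0 + A1*P + A2*income + u
B0, B1 = 2.0, 1.0              # assumed supply: Q = B0 + B1*P + v

prices, quantities = [], []
for _ in range(5000):
    income = random.gauss(10, 2)
    u, v = random.gauss(0, 1), random.gauss(0, 1)
    # Market-clearing price and quantity: both depend on BOTH error terms
    price = (A0 - B0 + A2 * income + u - v) / (B1 - A1)
    quantity = B0 + B1 * price + v
    prices.append(price)
    quantities.append(quantity)


def ols_slope(x, y):
    """Slope from a simple one-variable least squares regression."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


# Single-equation OLS of quantity on price, as if price were independent
naive_supply_slope = ols_slope(prices, quantities)
```

The true supply slope in this set-up is 1.0, yet the OLS estimate settles near one third, and collecting more data does not fix it. The estimate is homing in on the wrong number because price is correlated with the supply error.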

Why go to the trouble of using simultaneous equations?

First, because it’s fun. Also note that if a system should be modelled with simultaneous equations and IS NOT, the parameter estimates are INCONSISTENT! Lastly, insights are more realistic. The simulation suggests the appropriate complexity.

Conceptual basics

Generally, any economic model has to have the number of variables with values to be explained to be equal to the number of independent relationships in the model. This is the identification problem.

Many textbooks (Kmenta, Kennedy, Greene, etc.) can give the mathematical derivation for the solution of simultaneous equations. The general problem is that there have to be enough known variables to ‘fix’ each unknown quantity estimated. That is, there needs to be a rule. The good news is that there is. Here is the rule for solving the identification problem:

The number of predetermined variables excluded from the equation MUST be >= the number of endogenous variables included in the equation, less one.

Let’s use this rule on the supply-demand equation above:

Q (demand) = D(I) + D(price) + Income + D(error)

Q (supply) = S(I) + S(price) + S(error)

Demand: the number of predetermined variables excluded = zero. Income is the only predetermined variable and it IS NOT excluded from the demand equation. The number of endogenous variables included less one = 2 – 1 = 1. The two endogenous variables are quantity and price. So the number of predetermined variables excluded from the equation = 0 and this is < the number of endogenous variables included in the equation, less one. Therefore the demand equation is under-identified.

Supply: the number of predetermined variables excluded = one. Income is the only predetermined variable and it is excluded from the supply equation. The number of endogenous variables included less one = 2 – 1 = 1. The two endogenous variables are quantity and price. So the number of predetermined variables excluded from the equation = 1 and this is equal to the number of endogenous variables included in the equation, less one. Therefore the supply equation is exactly identified.
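The counting rule is mechanical enough to sketch in code. The function name and the variable labels are my own, and the ‘over-identified’ label for the strictly-greater case is the standard term rather than one used in the text.

```python
def order_condition(all_predetermined, included_predetermined, included_endogenous):
    """Apply the counting rule from the text to one equation."""
    # Predetermined variables in the system that this equation leaves out
    excluded = len(set(all_predetermined) - set(included_predetermined))
    # Endogenous variables in this equation, less one
    needed = len(included_endogenous) - 1
    if excluded < needed:
        return "under-identified"
    if excluded == needed:
        return "exactly identified"
    return "over-identified"


predetermined = ["income"]  # the only predetermined variable in the system
demand = order_condition(predetermined, ["income"], ["quantity", "price"])
supply = order_condition(predetermined, [], ["quantity", "price"])
```

This returns ‘under-identified’ for demand and ‘exactly identified’ for supply, matching the walk-through above. Keep in mind that this counting rule is a necessary check, not a guarantee; the textbooks mentioned earlier cover the additional rank condition.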

Desirable properties of estimators

We have not talked about (and it’s about time we did) what are the desirable properties of estimators. That is, we have spent effort estimating coefficients on, say, price and advertising but have not discussed how to know if the estimator is ‘good’. That is the purpose of the following brief description. If you need a fuller (more theoretically statistical) background virtually any econometrics textbook will suffice. (As mentioned in the introduction to this book, I personally like Kmenta’s Elements of Econometrics and Kennedy’s A Guide to Econometrics.)

Unbiasedness

A desirable property most econometricians agree on is unbiasedness. Unbiasedness has to do with the sampling distribution (remember the statistical introduction chapter? You didn’t think that would ever be mentioned again, did you?).

If we take an unlimited number of samples, estimate whatever coefficient we’re interested in from each one, and plot the distribution of those estimates, what we would end up with is the sampling distribution of the beta coefficient of that variable. If the estimator is unbiased, the average of these estimates is the correct value of the beta coefficient. Honest. Now what does this mean? It means the estimator of beta is said to be unbiased if the mean of the (very large number of samples) sampling distribution is the same value as the true beta coefficient. That is, if the average value of the estimated beta in repeated sampling is beta, then the estimator for beta is unbiased. Note that this does NOT mean that the estimated value of beta is the correct value of beta. It means ON AVERAGE the estimated value of beta will be the value of beta. Sounds like double talk, huh?
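If that still sounds like double talk, a small repeated-sampling sketch may help. The true slope of 2.0, the error spread and the 2,000 replications are all assumptions chosen for illustration; the X values are held fixed across samples.

```python
import random

random.seed(1)
TRUE_BETA = 2.0                        # assumed population slope
X = [float(i) for i in range(1, 21)]   # fixed in repeated samples


def ols_slope(x, y):
    """Slope from a simple one-variable least squares regression."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


estimates = []
for _ in range(2000):                  # 2,000 repeated samples
    y = [1.0 + TRUE_BETA * xi + random.gauss(0, 5) for xi in X]
    estimates.append(ols_slope(X, y))

mean_estimate = sum(estimates) / len(estimates)
# The single sample whose estimate landed furthest from the truth
worst_estimate = max(estimates, key=lambda b: abs(b - TRUE_BETA))
```

The average of the 2,000 estimates sits very close to 2.0, which is unbiasedness at work. The worst single sample is noticeably off, and in practice you would have no way of knowing whether that was the one you were holding.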

The obvious question is how do you know if your estimator is unbiased? That is unfortunately a very mathematically complex discussion. The short answer is: it depends on how the data is generated and it depends a lot on the distribution of the error term of the model. Remember statistics uses inductive thinking (not deductive thinking) so it is viewed from inferences, indirectly. That is, an estimator, say, via regression, is designed with these properties in mind. Thus these properties produce assumptions to take into account how the data is generated and what that does to the disturbance and hence what that means for the sampling distribution. As an example, for regression, the assumptions are:

1. The dependent variable actually DEPENDS on a linear combination of independent variables and coefficients.
2. The average of the error term is zero.
3. The error terms have no serial correlation and have the same variance (across all values of the independent variables).
4. The independent variables are fixed in repeated samples, often called non-stochastic X.
5. There is no perfect collinearity between the independent variables.

In a very real way, econometric modelling is all about dealing with (detecting and correcting) violations of the above assumptions. Just to make the obvious point: these assumptions are made so that the sampling distribution of the parameter estimates has desirable properties, such as unbiasedness. Now, how important is unbiasedness? Some econometricians claim it is VERY important and they spend all their time and effort around that (and other properties). I myself take little comfort in unbiasedness. I want to know if the estimators are biased or not, maybe even a guess as to how much, but in the real world, it is not often of much practical matter. This is because you could have theoretically any number of samples and while the mean of the sampling distribution IS the real beta, you never really know which sample you have. It’s possible you have an unusually bad sample. And in the real world you are not usually able to take many samples, indeed you usually only have ONE, the one in front of you.

EfficiencyWhatisoftenmoremeaningful,afterunbiasedness,inmanycases,isefficiency.Thatis,anestimatorthathasminimumvarianceofalltheunbiasedestimators.Insimpletermsitmeansthatestimator,ofalltheunbiasedestimators,hasthesmallestvariance.

Consistency
Unbiasedness and efficiency are about the sampling distribution of the estimated coefficient and do not depend on the size of the sample. Asymptotic properties are about the sampling distribution of the estimated coefficient in large samples. Consistency is an asymptotic (large sample) property.

Because the sampling distribution changes as the sample size increases, the mean and the variance can change. Consistency is the property that the sampling distribution of the estimated beta will collapse to the point of the population beta value, as sample size increases to infinity.
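To make consistency a little more concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the true beta of 2.0, the noise level, the sample sizes and the helper names are invented, and a one-variable OLS slope stands in for whatever estimator is actually being used. The point is only that the spread of the estimates shrinks as the sample grows.

```python
import random
import statistics


def ols_slope(xs, ys):
    """Closed-form OLS slope for a one-variable regression."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate_slopes(n, reps, true_beta=2.0, seed=0):
    """Draw `reps` independent samples of size n and return the slope estimates."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        xs = [rng.uniform(0, 10) for _ in range(n)]
        # Hypothetical data-generating process: y = 1 + beta * x + noise
        ys = [1.0 + true_beta * x + rng.gauss(0, 5) for x in xs]
        slopes.append(ols_slope(xs, ys))
    return slopes


# Spread (standard deviation) of the sampling distribution at each sample size
slopes_by_n = {n: simulate_slopes(n, reps=200) for n in (20, 200, 2000)}
spread_by_n = {n: statistics.pstdev(s) for n, s in slopes_by_n.items()}

for n, spread in spread_by_n.items():
    print(f"n={n:>5}  std dev of slope estimates = {spread:.4f}")
```

With very large samples, as in database marketing, the estimates bunch tightly around the true value, which is why consistency is a comfort even when only one sample is available.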

Consistency is something I like a lot, because (in database marketing, for example) we typically work with very large samples and therefore can take comfort in the sampling properties of the estimators.

Why am I bringing all the above up now? Because in simultaneous equations, the only property the estimators can have (because the independent variables will NOT be fixed in repeated samples, that is, the non-stochastic X assumption is violated) will be consistency.

BUSINESS CASE
Scott’s boss called him into his office. The subject of the meeting invite was ‘Cannibalization?’

‘Scott, our pricing teams are always at war, as you know. We have always felt that one product could cannibalize another with wild pricings from the product teams.’

‘Yeah, we talk about that every quarter.’

‘What I wondered was, given your success at quantifying so much of our marketing operations, can we do something about cannibalization?’

‘What do you mean, “do something about it”?’

‘Can we put together some model of optimization? What prices SHOULD the three product teams charge, in order to maximize our overall revenue?’

‘So it’s pricing for the enterprise instead of pricing for the product. That sounds like a very complicated problem.’

‘But it is similar to the elasticity modelling that you did, especially in terms of substitutes, right?’

‘Yeah, I think so. I’m not sure how to get the demand of each product into the regression. I’ll have to research it.’

‘Great, thanks. E-mail me tomorrow your ideas.’

Scott looked at him and blinked. His boss turned his chair around and went back to looking over his other e-mails. Scott got up and went back to his place, a little bewildered.

Could it be just having a demand equation for, say, desktops that included the price of desktops as well as the prices of notebooks and workstations? That did not seem like it took into account all of the information available. That is, there must be cross-equation correlation, meaning consumers feel the prices of notebooks change as they shop for a desktop, etc. What Scott needed was a way to simultaneously model the impact of each product’s price on each product’s demand.

What Scott needed, in other words, is a demand system. It is a set of three simultaneous equations that are solved (naturally) simultaneously. This set of equations posits that the demand (quantity) of each product is impacted by the own-price of the product as well as the cross-price of the other products.
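A demand system of this kind can be sketched in linear form as below. This is an assumed specification for illustration only: the symbols are not from the original, with Q for quantity, P for price, the subscripts DT, NB and WS for desktops, notebooks and workstations, X for the other (predetermined) variables in each equation and epsilon for the error term.

```latex
\begin{aligned}
Q_{DT} &= \alpha_{1} + \beta_{11} P_{DT} + \beta_{12} P_{NB} + \beta_{13} P_{WS} + \gamma_{1}^{\top} X_{1} + \varepsilon_{1} \\
Q_{NB} &= \alpha_{2} + \beta_{21} P_{DT} + \beta_{22} P_{NB} + \beta_{23} P_{WS} + \gamma_{2}^{\top} X_{2} + \varepsilon_{2} \\
Q_{WS} &= \alpha_{3} + \beta_{31} P_{DT} + \beta_{32} P_{NB} + \beta_{33} P_{WS} + \gamma_{3}^{\top} X_{3} + \varepsilon_{3}
\end{aligned}
```

In this notation the own-price coefficients (beta 11, beta 22, beta 33) would be expected to be negative, and a positive cross-price coefficient marks the other product as a substitute.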

Note that the approach here will be fairly brief and econometrically oriented. For a detailed mathematical and microeconomically oriented treatment, see Angus Deaton and John Muellbauer’s outstanding 1980 work Economics and Consumer Behavior. In that book they thoroughly detail consumer demand and demand systems wherein they ultimately posit the (unfortunately named) Almost Ideal Demand System (AIDS).

So Scott researched simultaneous equations. Right away it was obvious that this technique violates the OLS assumption of independent variables fixed in repeated samples, or non-stochastic X. That is, the solution for the independent variables in one equation depended on the values of the independent variables in the other equations. This ultimately meant the only desirable property (not unbiasedness, not efficiency) was consistency. That is, simultaneous equations have desirable asymptotic properties.


Scott found another issue resulting from simultaneous equations: the problem of identification. He had to apply the rule (mentioned above) that each equation be at least just identified. Recall the rule for identification is:

The number of predetermined variables excluded from the equation must be >= the number of endogenous variables included in the equation, less one.

Now Scott had to put together the equations from the data he collected. He got weekly data on desktop, notebook and workstation sales (units) for the last three years. He got total revenue of each as well, which would give him average price (price = total revenue / units). He would use seasonality and consumer confidence. He collected number of direct mails sent and the number of e-mails sent, opened and clicked by week.

Scott put together the results from the model (Table 6.1). Note the identification status on all is ‘overidentified’. For desktops: the number of predetermined variables excluded is 4 (number of e-mails, number of visits, January and October) and the number of endogenous variables included (less one) is 3 (quantity of desktops, price of desktops, price of notebooks and price of workstations). Thus, 4 > 3. For notebooks: the number of predetermined variables excluded is 4 (number of direct mails, consumer confidence, December and October) and the number of endogenous variables included (less one) is 3 (quantity of notebooks, price of desktops, price of notebooks and price of workstations). Thus, 4 > 3. For workstations: the number of predetermined variables excluded is 6 (number of e-mails, number of direct mails, number of visits, consumer confidence, December and August) and the number of endogenous variables included (less one) is 3 (quantity of workstations, price of desktops, price of notebooks and price of workstations). Thus, 6 > 3.

Table 6.1 Model results (XX = variable excluded from that equation)

              Price DT  Price NB  Price WS  # DMs  # EMs  # Visits  Cons conf  Jan   Dec   Oct   Aug
Quantity DT   -1.2      2.3       0.4       3.7    XX     XX        5.3        XX    1.2   XX    0.5
Quantity NB   1.1       -2        0.2       XX     6.2    2.2       XX         -0.8  XX    XX    2.9
Quantity WS   0.2       0.8       -2.6      XX     XX     XX        XX         -1.1  XX    -1.9  XX
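The identification counts can be checked mechanically from the table. The sketch below transcribes Table 6.1 (with None standing in for XX) and applies the order condition as stated in the text. The variable and function names are my own, and treating the three prices plus the equation's own quantity as the included endogenous variables follows the counting used in the text. Keep in mind the order condition is a necessary check, not a sufficient one; the rank condition is a separate matter.

```python
# Table 6.1 transcribed; None marks a variable excluded from that equation ("XX").
COLUMNS = ["price_dt", "price_nb", "price_ws", "n_dms", "n_ems", "n_visits",
           "cons_conf", "jan", "dec", "oct", "aug"]

TABLE_6_1 = {
    "desktops":     [-1.2, 2.3, 0.4, 3.7, None, None, 5.3, None, 1.2, None, 0.5],
    "notebooks":    [1.1, -2.0, 0.2, None, 6.2, 2.2, None, -0.8, None, None, 2.9],
    "workstations": [0.2, 0.8, -2.6, None, None, None, None, -1.1, None, -1.9, None],
}

# Assumption, following the text: prices are endogenous, everything else is predetermined.
PREDETERMINED = set(COLUMNS[3:])
ENDOGENOUS_INCLUDED = 4  # own quantity plus the three prices


def order_condition(row):
    """Return the excluded predetermined variables and the identification status."""
    excluded = [name for name, coef in zip(COLUMNS, row)
                if name in PREDETERMINED and coef is None]
    required = ENDOGENOUS_INCLUDED - 1
    if len(excluded) > required:
        status = "overidentified"
    elif len(excluded) == required:
        status = "just identified"
    else:
        status = "underidentified"
    return excluded, status


identification = {product: order_condition(row) for product, row in TABLE_6_1.items()}
for product, (excluded, status) in identification.items():
    print(f"{product}: {len(excluded)} excluded {excluded} -> {status}")
```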

Now, what does Table 6.1 mean? This was designed as an optimal pricing problem. What does the model tell Scott?

First, since the focus is on pricing and specifically cannibalization, look at the desktop model. The price coefficient is negative, as we’d expect: price goes up, quantity goes down. Now notice the coefficient on notebook price. It’s positive (+2.3). This means notebooks are seen (by desktop buyers) as a potential substitute. Note that notebook price is positively related to the demand for desktops, so if notebook prices go down, the quantity of desktops will GO DOWN as well. This is key strategic information. It means the pricing people cannot (and never could) price in a vacuum. Remember Hazlitt’s book Economics in One Lesson (1979)? The lesson was that everything is (directly or indirectly) connected. What happens with notebook prices affects what happens to desktop demand. This means a portfolio approach should be taken and not a silo approach. Note as well that, in the desktop equation, workstations are also a substitute, but less so (+0.4). It’s obvious that this information can be used to maximize total profit. It might be that one particular brand (or product) will subsidize others, but a successful firm will operate as an enterprise. Similar conclusions hold for the other products, in terms of pricing.
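The cross-price reading can be turned into a small what-if calculation. The sketch below takes only the price coefficients from Table 6.1 and computes the change in each product's quantity for a given set of price changes, holding everything else fixed. It assumes the model is linear in levels and that a price change of 1 means one unit of whatever the price is measured in; the table gives no units, so treat the magnitudes as illustrative. The function name is hypothetical.

```python
PRODUCTS = ["desktops", "notebooks", "workstations"]

# Price coefficients from Table 6.1: PRICE_COEFS[quantity equation][price variable]
PRICE_COEFS = {
    "desktops":     {"desktops": -1.2, "notebooks": 2.3, "workstations": 0.4},
    "notebooks":    {"desktops": 1.1, "notebooks": -2.0, "workstations": 0.2},
    "workstations": {"desktops": 0.2, "notebooks": 0.8, "workstations": -2.6},
}


def quantity_changes(price_changes):
    """Change in each product's quantity for the given price changes, other variables fixed."""
    return {
        q: sum(PRICE_COEFS[q][p] * price_changes.get(p, 0.0) for p in PRODUCTS)
        for q in PRODUCTS
    }


# Cut the notebook price by one unit and leave the other prices alone.
effect = quantity_changes({"notebooks": -1.0})
for product, change in effect.items():
    print(f"{product}: {change:+.1f}")
```

A notebook price cut lifts notebook quantity but pulls desktop and workstation quantities down, which is the cannibalization the pricing teams argue about every quarter.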

The other independent variables are interpreted likewise. Consumer confidence and number of direct mails are positive in influencing desktop sales but not in the other products. For notebooks, e-mails and visits are positive but January seasonality is negative (August, per Table 6.1, is positive). For workstations both January and October are negative. All of this is strategically lucrative. For example, don’t send e-mails to desktop targets, don’t send direct mails to notebook targets and don’t do much marcom in January.

Scott used the above model to help reorganize the pricing teams. They began to price as an enterprise and not in silos. Not all of them liked it at first but the increases in revenue (which translated into bonuses for them) helped to assuage their misgivings.

Conclusion

Simultaneous equations can quantify phenomena and can give answers impossible to get otherwise. Yes, it’s difficult, requires specialized software and a high level of expertise. But, as the business case above shows, how else would the firm know about optimizing prices across products or brands? In short, the price is worth it.

Checklist

You’ll be the smartest person in the room if you:

Learn to enjoy the added complexity that simultaneous equations bring to analytics; it better matches consumer behaviour.

Remember that simultaneous equations use two kinds of variables: predetermined (lagged endogenous and exogenous) and endogenous variables.

Point out that estimators have desirable properties: unbiasedness, efficiency, consistency, etc.

Observe that econometrics is really all about detecting and correcting violations of assumptions (linearity, normality, spherical error terms, etc.).

Prove that simultaneous equations can be used for optimal pricing and understanding cannibalization between products, brands, etc.

Part three

Inter-relationship techniques

07

Modelling inter-relationship techniques
What does my (customer) market look like?

Introduction

Introduction to segmentation

What is segmentation? What is a segment?

Why segment? Strategic uses of segmentation

The four Ps of strategic marketing

Criteria for actionable segmentation

A priori or not?

Conceptual process

Checklist: You’ll be the smartest person in the room if you…

Introduction
As mentioned earlier, there are two general types of multivariate analysis: dependent variable techniques and inter-relationship techniques. Most of the first part of this book has been concerned with dependent variable techniques. These include all of the types of regression (ordinary, logistic, survival modelling, etc.), as well as discriminant analysis, conjoint analysis, etc.

The point of dependent variable techniques is to understand to what extent the dependent variable depends on the independent variables. That is, how does price impact units, where units is the dependent variable (something we are trying to understand or explain) and price is the independent variable, a variable that is hypothesized to cause the movement in the dependent variable.

Inter-relationship techniques have a completely different point of view. These include multivariate algorithms like factor analysis, segmentation, multi-dimensional scaling, etc. Inter-relationship techniques are trying to understand how variables (price, product purchases, advertising spend, etc.) interact (inter-relate) together. Remember how factor analysis was used to correct for collinearity in regression? It did this by extracting the variance of the independent variables in such a way that each factor (which contained the variables) was uncorrelated with all other factors, that is, the inter-relationship between the independent variables was constructed to form factors.

This section will spend considerable effort on an inter-relationship technique that is of utmost interest and importance to marketing: segmentation.

Introduction to segmentation
Ok. This introductory chapter is designed to detail some of the strategic uses and necessities of segmentation. The chapter following this will dive into more of the analytic techniques and what segmentation output may look like. Segmentation is often the biggest analytic project available and one that provides potentially more strategic insights than any other. Plus, it’s fun!

What is segmentation? What is a segment?
A good place to start is to make sure we know what we’re talking about. Radical, I know. By definition, segmentation is a process of taxonomy, a way to divide something into parts, a way to separate a market into sub-markets. It can be called things like ‘clustering’ or ‘partitioning’. Thus, a market segment (cluster) is a sub-set of the market (or customer market, or database, etc.).

Segmentation: in marketing strategy, a method of sub-dividing the population into similar sub-markets for better targeting, etc.

The general definition of a segment is that members are ‘homogeneous within and heterogeneous between’. That means that a good segmentation solution will have all the members (say, customers) within a segment be very similar to each other but very dissimilar to all members of all other segments. Homogeneous means ‘same’ and heterogeneous means ‘different’.

It’s possible to have very advanced statistical algorithms to accomplish this, or it can be a very crude business rule. The next chapter will mention a few statistical techniques for doing segmentation. Note that a business rule could simply be, ‘Separate the database into four parts: highest use, medium use, low use and no use of our product’. This managerial fiat has been (and still is) used by many companies.

RFM (recency, frequency and monetary variables) is another simple business rule: separate the database into, say, deciles based on three metrics: how recently a customer purchased, how frequently a customer purchased and how much money a customer spent. Many companies are not doing much more than this, in terms of segmentation. These companies are certainly not marketing companies because techniques like RFM are really from a financial, and not a customer, point of view. Therefore, a segment is that entity wherein all members assigned to that segment are, by some definition, alike.
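To show just how simple a business rule RFM is, here is a minimal sketch. The customer records, field names and the `rank_bins` helper are all hypothetical, and three bins are used instead of deciles only because the toy data has six customers; with a real database you would set `n_bins=10`.

```python
def rank_bins(values, n_bins, higher_is_better=True):
    """Assign each value a bin from 1 (worst) to n_bins (best) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values) + 1
    return bins


# Hypothetical customer summary records
customers = [
    {"id": "A", "recency_days": 5, "frequency": 12, "monetary": 900.0},
    {"id": "B", "recency_days": 40, "frequency": 3, "monetary": 150.0},
    {"id": "C", "recency_days": 200, "frequency": 1, "monetary": 40.0},
    {"id": "D", "recency_days": 15, "frequency": 8, "monetary": 1200.0},
    {"id": "E", "recency_days": 90, "frequency": 2, "monetary": 75.0},
    {"id": "F", "recency_days": 365, "frequency": 1, "monetary": 20.0},
]

N_BINS = 3
# Fewer days since last purchase is better, so recency is ranked in reverse.
r = rank_bins([c["recency_days"] for c in customers], N_BINS, higher_is_better=False)
f = rank_bins([c["frequency"] for c in customers], N_BINS)
m = rank_bins([c["monetary"] for c in customers], N_BINS)

rfm_scores = {c["id"]: (r[i], f[i], m[i]) for i, c in enumerate(customers)}
for cust_id, score in rfm_scores.items():
    print(cust_id, score)
```

Notice that every input is something the firm cares about (how recently, how often, how much). Nothing here says why the customer behaves that way.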

Why segment? Strategic uses of segmentation

So, why segment at all? There are three typical uses of segmentation: finding similar members, making modelling better and (most important) using marketing strategy to attack each segment differently.

Finding homogeneous members is a valuable use of a statistical technique. The business problem tends to be: find all those that are ‘alike’ and see how, say, satisfaction differs between them, or find all those that are ‘homogeneous’ by some measure and see how usage varies between them.

A simple example might be in, say, telecommunications, where we are looking at churn (attrition) rates. We want to understand the motivation of churn, what behaviour can predict churn. So, conduct segmentation and identify customers in each segment that are alike in all important ways to the business (products, usage, demographics, channel preferences, etc.) and show different churn rates by segment. Note that churn is not the variable that all segments are alike on; churn is what we are trying to understand. Thus we control for several influences (all members within a segment are alike) and now can see high versus low churners, after all other significant variables have been controlled for.

A second usage, also sophisticated and nuanced, is to use segmentation to improve modelling. In the above churn example, say segmentation was done and we want to predict churn. We run a separate regression model for each segment and find that different independent variables affect churn differently. This will be far more accurate (and actionable) than one (average) model applied to everyone without segmentation. This approach takes advantage of the different reasons to churn. One segment might churn due to dropped calls, another might churn because of the price of the plan and another is sensitive to their bill based on calls, minutes and data used. Thus, each model will exploit these differences and be far more accurate than otherwise. The more accurate the model, the greater the insights; the greater the understanding, the more obvious the strategy of how to combat churn in each segment.

But from a marketing point of view, the reason to segment is the simple answer that not everyone is alike; not all customers are the same. One size does not fit all.

I’d even offer a tweak on ‘segmentation’ at this point. Market segmentation uses the marketing concept, where the customer is king and strategy is therefore customer-centric. Note that an algorithm like RFM is from the firm’s (financial) point of view with metrics that are important to the firm. RFM is about designing value tiers based on a financial perspective (see Chapter 8 highlight, ‘Why go beyond RFM?’).

Since marketing segmentation should be from the customer’s point of view, why do segmentation? That is, how does ‘one size does not fit all’ operate in terms of customer-centricity?

Generally, it’s based on recognizing that different customers have different sensitivities. These different sensitivities cause them to behave differently because they are motivated differently.

This means considerable effort needs to be applied to learn what makes each behavioural segment a segment. (The specific techniques to do this are explained in the next chapter.) It means developing a strategy to exploit these different sensitivities and motivations.

Usually there is a segment sensitive to price, and a segment not sensitive to price. Often there is a segment that prefers one channel (say online) and a segment that prefers another channel (say offline). Typically one segment will have high penetration of product X while another segment will have high penetration of product Y. One segment needs to be communicated to differently (style, imaging, messaging, etc.) than another segment. Note that this is far more involved than a simple business rule.

The idea is that if a segment is sensitive to, say, price, then those members should get a discount or a better offer, in order to maximize their probability to purchase (their demand is elastic). The segment that is not sensitive to price (because they are loyal, wealthy, have no substitutes available, etc.) should not be given the discount because they don’t need it in order to purchase.

I know the above adds complexity to the analysis. But note that consumer behaviour IS complex. Behaviour incorporates simultaneous motivations and multidimensional factors, sometimes nearly irrational (remember Dan Ariely’s book, Predictably Irrational?).

Understanding consumer behaviour requires a complex, sophisticated solution, if the goal is to do marketing, if the goal is to be customer-centric. A simpler solution won’t work. It is much like the problem that happens when we take a three-dimensional globe of the earth and spread it out over a two-dimensional space. Greenland is now way off in size; the world is wrong. Being overly simplistic produces wrong results, just as applying a univariate solution to a multivariate problem will produce wrong results.

For the MBA (which seems to need a list à la PowerPoint) I’d suggest the following as benefits of segmentation:

Marketing Research: learning WHY. Segmentation provides a rationale for behaviour.

Marketing Strategy: targeting by product, price, promotion and place. Strategy uses the marketing mix by exploiting segment differences.

Marketing Communications: messaging and positioning. Some segments need a transactional style of communication; other segments need a relationship style of communication. One size does not fit all.

Marketing Economics: imperfect competition leads to price makers. With the firm communicating just the right product at just the right price in just the right channel at just the right time to the most needy target, such compelling offers give the firm nearly monopolistic power.

The four Ps of strategic marketing
Segmentation is part of a strategic marketing process called the four Ps of strategic marketing, coined by Philip Kotler. Kotler is probably the most widely recognized marketing guru in the world, essentially creating the discipline of marketing as separate from economics and psychology. He wrote many textbooks including Marketing Management (1967), now in its 14th edition, which has been used for decades as the pillar of all marketing education.

Most marketers are aware of the four Ps of tactical marketing: product, price, promotion and place. These are often called the ‘marketing mix’. But before these are applied, a marketing strategy should be developed, based on the four Ps of strategic marketing.

Partition
The first step is to partition the market by applying a (behavioural) segmentation algorithm to divide the market into sub-markets. This means recognizing strategically that one size does not fit all, and understanding that each segment requires a different treatment to maximize revenue/profit or satisfaction/loyalty.

Probe
This second step is usually about additional data. Often this may come from marketing research, probing for attitudes about the brand, its competitors, shopping and purchasing behaviour, etc. Sometimes it can come from demographic overlay data, which is especially valuable if it includes lifestyle information. Last, probing data can come from created variables from the database itself. These tend to be around velocity (time between purchases) or share of products penetrated (what percent does the customer buy of category X, what percent of category Y, etc.), seasonality, consumer confidence and inflation, etc.

Prioritize
This step is a financial analysis of the resulting segments. Which are most profitable, which are growing fastest, which require more effort to keep or cost to serve, etc.? Part of the point of this step is to find those that we might decide to DE-market, that is, those that are not worth the effort to communicate to.

Position
Positioning is about using all of the above insights and applying an appropriate message, or the correct look and feel and style. This is the tool that allows the creation of compelling messages based on a segment’s specific sensitivities. This marketing communication is often called marcom. This incorporates the four Ps of tactical marketing.

Criteria for actionable segmentation
I’ve always thought the list below guided a segmentation project that ended up being actionable. This too probably came from Philip Kotler (as do most things that are good and important in modern marketing).

Identifiability. In order to be actionable each segment has to be identifiable. Often this is the process of scoring the database with each customer having a probability of belonging to each segment.

Substantiality. Each segment needs to be substantial enough (large enough) to make marketing to it worthwhile. Thus there’s a balance between distinctiveness and size.

Accessibility. Not only do the members of the segment have to be identifiable, they have to be accessible. That is, there has to be a way to get to them in terms of marketing efforts. This typically requires having contact info: e-mail, direct mail, SMS, etc.

Stability. Segment membership should not change drastically. The things that define the segments should be stable so that marketing strategy is predictable over time. Segmentation assumes there will be no drastic shocks in demand, or radical changes in technology, etc., in the foreseeable future.

Responsiveness. To be actionable, the segmentation must drive responses. If marcom data is one of the segmentation dimensions, this is usually achievable.

A priori or not?
As this is a practitioner’s guide to marketing science, it should come as no surprise that I advocate statistical analysis to perform segmentation. However, it’s a fact that sometimes there are (top-down) dictums that define segments. These are managerial fiats that demand a market be based (a priori) on managerial judgment, rather than some analytic technique. The usual dimensions by which managers want to artificially define their market tend to be usage, profit, satisfaction, size, growth, etc. Analytically, this is a univariate approach to what is clearly a multivariate problem.

In my opinion, there is a place for managerial judgment, but it is NOT in segment definition. After the segments are defined, then managerial judgment should ascertain if the solution makes sense, if the segments themselves are actionable.


Conceptual process

Settle on a (marketing/customer) strategy
The general first step in behavioural segmentation is one of strategy. After the firm establishes goals, a strategy needs to be in place to reach those goals. There should be a champion, a business leader, a stakeholder that is the ultimate user of the segmentation.

Analytics needs to recognize that a segmentation not driven by strategy is akin to a body without a skeleton. Strategy supports everything. A very different segmentation should result if the strategy is about market share as opposed to a strategy about net margin.

A strategy discussion should revolve around customer behaviour. What is the customer’s mindset? What is the behaviour we are trying to understand? What incentive are we employing? Any good segmentation solution should tie together customer behaviour and marketing strategy. Remember, marketing is customer-centric.

Collect appropriate (behavioural) data
The next analytic step in behavioural segmentation is to collect appropriate (behavioural) data. This tends to be generally around transactions (purchases) and marcom responses.

A few comments ought to be made about what is meant by ‘behavioural data’. My theory of consumer behaviour (and it’s okay if you don’t agree) is to envision four levels (see Figure 7.1): primary motivations, experiential motivations, behaviours and results. Results (typically financial) are caused by behaviours (usually some kind of transactions (purchases) and marcom responses), which are caused by one or both (primary and experiential) motivations. Primary motivations (price valuation, attitudes about lifestyle, tastes and preferences, etc.) are generally psychographic and not really seen. They are motivational causes (searching, need arousal, etc.) without brand interaction. Experiential motivations tend to have brand interaction and are another motivator to additional behaviours that ultimately cause (financial) results. These motivations are things like loyalty, engagement, satisfaction, etc. Note that engagement is an experiential cause (there has been interaction with the brand) and is not a behaviour. Engagement would be metrics like recency and frequency. There will be more on this topic when we discuss RFM (see Chapter 8 highlight). I’ll warn you this is one of my soapboxes.

Figure 7.1 (figure not reproduced; recovered labels: Revenue, Growth, Margin)


Usually transactions and marcom responses (from direct mail, e-mail, etc.) are the main dimensions of behavioural segmentation. Often additional variables are created from these dimensions.

We want to know how many times a customer purchased, how much each time, what products were purchased, what categories each product purchased belonged to, etc. Often valuable profiling variables go along with this, including net margin on each purchase, cost of goods sold, etc. We want to know the number of transactions over a period of time, the number of units and if any discounts were applied to these transactions.

In terms of marcom responses we want to collect what kind of vehicle (direct mail, e-mail, etc.), opens, clicks, website visits, store purchases, discounts used, etc. We want to know when each vehicle was sent and what category of product was featured on each vehicle. Any versioning needs to be collected, and any offers/promotions, etc., need to be annotated in the database. All of this data surrounding transactions and responses is the basis of customer behaviour.

Generally we expect to find a segment that is heavily penetrated in one type of category (broad products purchased) but not another and this will be different by more than one segment. It bears repeating: one segment is heavily penetrated by category X, while another is heavily penetrated by category Y, etc. We also expect to find one or more segments that prefer e-mail or online but not direct mail, or vice versa. We typically find a segment that is sensitive to price and one that is not sensitive to price. These insights come differently from these behavioural dimensions.

Create/use additional data

Now comes the fun part. Here you can create additional data. This data at least takes the form of seasonality variables, calculated time between each purchase, time between categories purchased, peaks and valleys of transactions and units and revenue, share of categories (percent of baby products compared to total, percent of entertainment categories compared to total), etc. There should be metrics like number of units and transactions per customer, percent of discounts per customer, top two or three categories purchased per customer, etc. All of these can be used/tested in the segmentation.
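A minimal sketch of creating a few of these variables from a transaction file follows. The transaction rows, column order and function name are hypothetical; the sketch computes number of transactions, average days between purchases (velocity) and revenue share by category for each customer.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction rows: (customer_id, purchase_date, category, revenue)
transactions = [
    ("c1", date(2023, 1, 5), "baby", 40.0),
    ("c1", date(2023, 1, 25), "baby", 60.0),
    ("c1", date(2023, 3, 6), "entertainment", 100.0),
    ("c2", date(2023, 2, 1), "entertainment", 30.0),
    ("c2", date(2023, 2, 11), "entertainment", 70.0),
]


def customer_features(rows):
    """Build per-customer derived variables from raw transactions."""
    by_customer = defaultdict(list)
    for cust, day, category, revenue in rows:
        by_customer[cust].append((day, category, revenue))

    features = {}
    for cust, items in by_customer.items():
        items.sort(key=lambda item: item[0])  # order purchases by date
        days = [day for day, _, _ in items]
        gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]

        revenue_by_cat = defaultdict(float)
        for _, category, revenue in items:
            revenue_by_cat[category] += revenue
        total = sum(revenue_by_cat.values())

        features[cust] = {
            "n_transactions": len(items),
            # Velocity: None when the customer has only one purchase
            "avg_days_between": sum(gaps) / len(gaps) if gaps else None,
            "category_share": {c: r / total for c, r in revenue_by_cat.items()},
        }
    return features


features = customer_features(transactions)
for cust, feats in features.items():
    print(cust, feats)
```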

As for marcom, there should be a host of metrics around marcom type and offer and time until purchase. There should be business rules tying a campaign to a purchase. There should be variables indicating categories featured on the cover, or subject lines, or offers and promotions.

Note how all of the above expand behavioural data. But there are other sources of data as well. Often primary marketing research is used. This tends to be around satisfaction or loyalty, something about competitive substitutes, maybe marcom awareness or importance of each marcom vehicle.

Third party overlay data is a rich source of additional insights into fleshing out the segments. This is often matched data like demographics, interests, attitudes, lifestyles, etc. This data is typically most helpful when it deals with attitudes or lifestyle, but demographics can be interesting as well. Again all of this additional data is about fleshing out the segments and trying to understand the mindset/rationale of each segment.

Run the algorithm
As mentioned, the algorithm discussion will be covered in depth in the next chapter, but a few comments can be made now, particularly in terms of process. Note that the algorithm is guided by strategy and uses (defining or segmenting) variables based on strategy.

The algorithm is the analytic guts of segmentation and care should be taken in choosing which technique to use. The algorithm should be fast and non-arbitrary. Analytically, we are trying to achieve maximum separation (segment distinctiveness).

The ultimate idea of segmentation is to level a different strategy against each segment. Therefore each segment should have a different reason for BEING a segment. The algorithm needs to provide diagnostics to guide optimization. The general metric of success is ‘homogeneous within and heterogeneous between’ segments. There have been many such metrics offered (SAS, via proc discrim, uses ‘the logarithm of the determinant of the covariance matrix’ as a metric of success). In the profiling, the differentiation of each segment should make itself clear.

Just to stack the deck, let me define what a good algorithm for segmentation should be. It should be multivariable, multivariate, and probabilistic. It should be multivariable because consumer behaviour is most certainly explained by more than one variable, and it should be multivariate because these variables impact consumer behaviour simultaneously, interacting with each other. It should be probabilistic because consumer behaviour is probabilistic; it has a distribution and at some point that behaviour can even be irrational. Gasp!

Profile the output
Profiling is what we show to other people to prove that the solution does discriminate between segments. Generally the means and/or frequencies of each key variable (especially transactions and marcom responses) are shown to quickly gauge differences by each segment. Note that the more distinct each segment is the more obvious a strategy (for each segment) becomes.

To show the means of KPIs (key performance indicators) by segment is common, but often another metric teases out differences better. Using indexes often makes distinctiveness obvious faster. That is, take each segment’s mean and divide by the total mean. For example, say segment one has average revenue of 1,500 and segment two has average revenue of 750 and the total average (all segments together) is 1,000. Dividing segment one by the total is 1,500 / 1,000 = 1.5, that is, segment one has revenue 50% above average. Note also that segment two is 750 / 1,000 = 0.75, meaning that segment two contributes revenue 25% less than average. Applying indexes to all metrics by segment immediately shows differences. This is especially obvious where small numbers are concerned. As another example, say segment one has a response rate of 1.9% and the overall grand total response rate is 1.5%. While these numbers (segment one to total) are only 0.4 percentage points apart, note that the index of segment one / total is 1.9% / 1.5% = 1.27, showing that segment one is 27% greater than average. This is why we like to (and should) use indexes.
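The index arithmetic is easy to apply to every metric at once. Here is a small sketch using the numbers from the examples above; the function name and segment labels are my own.

```python
def index_to_total(segment_values, total_value):
    """Divide each segment's mean by the overall mean to get an index."""
    return {segment: value / total_value for segment, value in segment_values.items()}


# Average revenue by segment versus the overall average of 1,000
revenue_index = index_to_total({"segment_1": 1500, "segment_2": 750}, 1000)

# Response rate of segment one (1.9%) versus the overall rate (1.5%)
response_index = index_to_total({"segment_1": 0.019}, 0.015)

print(revenue_index)
print({segment: round(idx, 2) for segment, idx in response_index.items()})
```

An index of 1.0 is average, so anything well above or below 1.0 jumps out of a profiling table without the reader having to compare raw numbers on different scales.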

While seeing drastic differences in each segment is very satisfying, the most enjoyable part of profiling often is the NAMING of each segment. First you must realize that naming a segment helps distinguish the segments. The more segments you have the more important this becomes.

I have a couple of suggestions about naming segments; take them as you see fit. Sometimes the naming of segments is left to the creative department and that’s okay. But usually analytics has to come up with names.

Each name should be only two or three words, if possible. They should be more informative than something like ‘High Revenue Segment’ or ‘Low Response Segment’. They should incorporate two or three similar dimensions. Either keep most of them to product and marcom response dimensions, or keep them along a strategic dimension or two (high growth, cost to serve, net margin, etc.). It’s tempting to name them playfully but this still has to be usable. That is, while ‘Bohemian Mix’ is fun, what does it mean strategically or from a marketing point of view?


Model to score database (if from a sample)
The next step, if the segmentation was done on a sample, is to score the database with each customer’s probability to belong to each segment. This is often carried out quickly with discriminant analysis. Apply (in SAS) proc discrim to the sample and get the equations that score each customer into a segment. (Discriminant analysis is a common technique, once categories (segments) are defined, to fit variables in equations to predict category (segment) membership.) Then run these equations against the database.

If this is accurate enough (whatever ‘accurate enough’ means) then you’re good to go. But discrim sometimes is NOT accurate enough. I myself think this is because you have to use the same variables (although with different weights) on each segment. This can be inefficient. There is also the assumption inherent in proc discrim about the same variance across segments, which is hardly ever true, so you may need to turn to another technique.

I have often settled for logistic regression, where a different equation scores each segment. That is, if I have five segments, the first logit will be with a binary dependent variable: 1 if the customer is in segment one and 0 if not. The second logit will be a 1 if the customer is in segment two and a 0 if not. Then I put in variables to maximize probability of each segment and I remove those variables that are insignificant and run all equations against all customers. Each customer will have a probability to belong to each segment and the maximum score wins, ie, the segment that has the highest probability is the segment to which the customer is assigned.
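The scoring step of this one-logit-per-segment approach can be sketched as follows. The coefficients, variable names and customers are all hypothetical (in practice each equation would be fitted on the segmented sample first, which is not shown here), and three segments are used instead of five to keep it short. Note that each equation is allowed to use different variables.

```python
import math

# Hypothetical fitted logit coefficients, one equation per segment.
SEGMENT_MODELS = {
    "segment_1": {"intercept": -2.0, "n_transactions": 0.30},
    "segment_2": {"intercept": -1.0, "email_clicks": 0.50},
    "segment_3": {"intercept": 0.5, "discount_share": -3.0},
}


def logistic(z):
    """Convert a logit score into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def score_customer(customer):
    """Run every segment equation against one customer; the maximum probability wins."""
    probabilities = {}
    for segment, coefs in SEGMENT_MODELS.items():
        z = coefs["intercept"] + sum(
            weight * customer.get(variable, 0.0)
            for variable, weight in coefs.items()
            if variable != "intercept"
        )
        probabilities[segment] = logistic(z)
    assigned = max(probabilities, key=probabilities.get)
    return assigned, probabilities


heavy_buyer = {"n_transactions": 12, "email_clicks": 1, "discount_share": 0.8}
email_fan = {"n_transactions": 1, "email_clicks": 6, "discount_share": 0.5}

for name, customer in (("heavy_buyer", heavy_buyer), ("email_fan", email_fan)):
    segment, probs = score_customer(customer)
    print(name, segment, {s: round(p, 3) for s, p in probs.items()})
```

Because these are separate binary equations, the probabilities for one customer will not generally sum to one. That is fine for assignment, since only the ranking matters.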

Test and learn
The typical last step is to create a test and learn plan. This is generally a broad-based test design, aimed at learning which elements drive results, which is directly informed by the segmentation insights.

Note Chapter 10 on design of experiments (DOE). The overall idea here is to develop a testing plan to take advantage of segmentation. The first thing to test is typically selection/targeting. That is, pull a sample of those likely to belong to a very highly profitable, heavy usage segment and do a mailing to them and compare revenue and responses to some general control group. These high-end segments should drastically out-perform the business as usual (BAU) group.
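One simple way to make the segment-versus-control comparison is to look at lift in response rate and check it with a two-proportion z-test. To be clear, the z-test is my choice for this sketch and is not prescribed by the text, and the mailing counts below are invented.

```python
import math


def compare_response(test_responders, test_n, control_responders, control_n):
    """Lift and two-sided two-proportion z-test for test versus control response rates."""
    p_test = test_responders / test_n
    p_control = control_responders / control_n
    pooled = (test_responders + control_responders) / (test_n + control_n)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / test_n + 1 / control_n))
    z = (p_test - p_control) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation
    return {
        "test_rate": p_test,
        "control_rate": p_control,
        "lift": p_test / p_control - 1,
        "z": z,
        "p_value": p_value,
    }


# Hypothetical mailing: 10,000 pieces to the high-end segment, 10,000 to BAU control
result = compare_response(190, 10_000, 150, 10_000)
print({key: round(value, 4) for key, value in result.items()})
```

The same comparison can be run on revenue per piece mailed, although that calls for a test on means and not on proportions.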

Acommonnextstep(dependingonstrategy,etc.)mightbepromotionaltesting.Thiswouldusuallyfollowwithelasticitymodellingbysegment.Oftenoneormoresegmentsarefoundtobeinsensitivetopriceandoneormoresegmentsarefoundtobesensitivetoprice.Thetesthereistoofferpromotionsanddetermineifthesegmentinsensitivetopricewillstillpurchaseevenwithalowerdiscount.Thismeansthefirmdoesnothavetogiveawaymargintogetthesameamountofpurchases.

Other typical tests revolve around product categories, channel preference and messaging. A full factorial design could get much learning immediately and then marcom could be aimed appropriately. The general idea is that if a segment is, say, heavily penetrated in product X, send them a product X message. If a segment might have a propensity for product Y (given product X) do a test and see how to incentivize broader category purchases. The next chapter will go through a detailed example of what this testing might mean.

Checklist

You’ll be the smartest person in the room if you:

Point out that segmentation is a strategic, not an analytic, exercise.

Remember that segmentation is mostly a marketing construct.

Argue that segmentation is about what’s important to a consumer, not what’s important to a firm.

Recall that segmentation gives insights into marketing research, marketing strategy, marketing communications and marketing economics.

Observe the four Ps of strategic marketing: partition, probe, prioritize and position.

Uncompromisingly demand that RFM be viewed as a service to the firm, not a service to the consumer.

Require each segment to have its own story rationale for why it is a segment. There should be a different strategy levelled at each segment, otherwise there is no point in being a segment.


08

Segmentation: tools and techniques

Overview

Metrics of successful segmentation

General analytic techniques

Business case

Analytics

Comments/details on individual segments

K-means compared to LCA

Highlight: Why Go Beyond RFM?

Segmentation techniques

Checklist: You’ll be the smartest person in the room if you…

Overview

The previous chapter was meant to be a general/strategic overview of segmentation. This chapter is designed to show the analytic aspects of it, which is the heart of the segmentation process. Analytics is the fulcrum of the whole project.

A few books to note, in terms of the analytics of segmentation, would be Segmentation and Positioning for Strategic Marketing Decisions by James H. Myers (1996), Market Segmentation by Michel Wedel and Wagner A. Kamakura (1998) and Advanced Methods of Market Research, edited by Richard P. Bagozzi (2002), especially the chapters ‘The CHAID Approach to Segmentation Modelling’ and ‘Cluster Analysis in Market Research’. Note also the papers of Jay Magidson (2002) from the Statistical Innovations website (www.statisticalinnovations.com).

Metrics of successful segmentation

As mentioned earlier, the general idea of successful segmentation is ‘homogeneous within and heterogeneous between’. There are several possible approaches to quantifying this goal. Generally, a ratio compares the members within a segment with all the members not in that segment, and the smaller the better. This helps us to compare a 3-segment solution with a 4-segment solution, or a 4-segment solution using variables a–f with a 4-segment solution using variables d–j. SAS (via proc discrim) has the ‘log of the determinant of the covariance matrix’. This is a good metric to use in comparing solutions even if it’s a badly-named one.
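One simple way to put a number on ‘homogeneous within and heterogeneous between’ is sketched below. It is an assumption of this example, not the SAS log-determinant metric: it divides the within-segment sum of squares by the between-segment sum of squares, so smaller is better. The data points and labels are made up.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length tuples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def within_between_ratio(rows, labels):
    """Within-segment spread divided by between-segment spread (smaller is better)."""
    grand = mean_vector(rows)
    groups = {}
    for row, lab in zip(rows, labels):
        groups.setdefault(lab, []).append(row)
    within = between = 0.0
    for members in groups.values():
        centre = mean_vector(members)
        within += sum(sq_dist(r, centre) for r in members)
        between += len(members) * sq_dist(centre, grand)
    return float("inf") if between == 0 else within / between

rows = [(1, 1), (1, 2), (9, 9), (9, 8)]              # made-up customers
tight = within_between_ratio(rows, [0, 0, 1, 1])     # sensible grouping
loose = within_between_ratio(rows, [0, 0, 0, 1])     # poor grouping
print(tight, loose)
```

The same function can be run on a 3-segment and a 4-segment solution, or on two variable sets, to compare them on one scale.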

General analytic techniques

Business rules

There may be a place for business-rule segmentation. If data is sparse, underpopulated, or very few dimensions are available, there’s little point trying to do an analytic segmentation. There’s nothing for the algorithm to operate on.

I (again) caution against a managerial fiat. I have had managers who invested themselves in the segmentation design. They have told me how to define the segments. This is typically flawed. I wouldn’t say to ignore management’s knowledge/intuition of their market and their customers. My advice is to go through the segmentation process, do the analytics and see what the results look like. Typically the analytic results are appealing and more compelling than managerial judgment. This is because a manager’s dictum is around one or two or at most three dimensions, arbitrarily defined. But the analytic output optimizes the variables and separation is the mathematical ‘best’. It would be unlikely that one person’s intuition could out-perform a statistical algorithm. I would even say that if an analytic output is very different than a manager’s point of view, that manager has a lot to learn about his own market. The statistical algorithm encourages learning. Most often managerial fiat is about usage (high, medium and low), satisfaction, net profit, etc. None of these require/allow much investigation into WHY the results are what they are. None of these require an understanding of consumer behaviour.

This is why RFM (recency, frequency, and monetary) is so insidious. It is a business rule, it’s appealing, it is based on data and it works. It is ultimately a (typically financial) manager’s point of view. It does not encourage learning. Marketing strategy is reduced to nothing more than migrating lower value tiers into higher value tiers.

A good overview of segmentation, from the managerial role and not the analytical role, is Art Weinstein’s book, Market Segmentation (1994), which provides a good discussion of segmentation based on business rules.

CHAID

CHAID (chi-squared automatic interaction detection) is an improvement over AID (automatic interaction detection). Strictly speaking, CHAID is a dependent variable technique, NOT an inter-relationship technique. I’m including it here because CHAID is often used as a segmentation solution.

This brings us to the first question: ‘Why use a dependent variable technique in terms of segmentation?’ My answer is that it is inappropriate. A dependent variable technique is designed to understand (predict) what causes a dependent variable to move. By definition, segmentation is not about explaining the movement in some dependent variable.

OK. How does it work? While there are many variations of the algorithm, in general it works the following way. CHAID takes the dependent variable, looks at the independent variables and finds the one independent variable that ‘splits’ the dependent variable best. ‘Best’ here is per the chi-squared test. (AID was based on the F-test, which is the ratio of explained variance over unexplained variance and is used (in modelling) as a threshold that proves the model is better than random.) It then takes that (second level) variable and searches the remaining independent variables to test which one best splits that second level variable. It does this until the number of levels assigned is reached, or until there is no improvement in convergence.

Below is a simple example (Figure 8.1). Product revenue is the dependent variable and CHAID is run and the best split is found to be income. Income is split into two groups: high income and low income. The next best variable is response rate, where each income level has two different response rates. High income is split in terms of response rate > 9% and response rate > 6% and < 9%. Low income is split between < 2% and > 2% and < 6%. Thus this simplified example would show four segments: high income high response, high income medium response, low income medium response and low income low response.

Figure 8.1 CHAID output
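The first CHAID step, finding the one variable that best splits the dependent variable, can be sketched as follows. This is a simplification under stated assumptions: it ranks predictors by the raw Pearson chi-squared statistic, whereas CHAID implementations generally compare adjusted p-values and also merge categories. The records and variable names are invented for illustration.

```python
from collections import Counter

def chi_square(xs, ys):
    """Pearson chi-squared statistic for two categorical lists of equal length."""
    n = len(xs)
    row, col = Counter(xs), Counter(ys)
    cell = Counter(zip(xs, ys))
    stat = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n
            stat += (cell[(r, c)] - expected) ** 2 / expected
    return stat

def best_split(records, target, predictors):
    """Return the predictor with the largest chi-squared against the target."""
    ys = [rec[target] for rec in records]
    scores = {p: chi_square([rec[p] for rec in records], ys) for p in predictors}
    return max(scores, key=scores.get), scores

rows = [("high", "north", "high"), ("high", "south", "high"),
        ("high", "north", "high"), ("high", "south", "low"),
        ("low", "north", "low"), ("low", "north", "low"),
        ("low", "south", "high"), ("low", "south", "low")]
records = [dict(zip(("income", "region", "revenue"), r)) for r in rows]
best, scores = best_split(records, "revenue", ["income", "region"])
print(best, scores)
```

Repeating the same search inside each resulting group gives the next level of the tree.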

The advantages of CHAID are that it is simple, easy to use and easy to explain. It provides a stunning visual to show how to interpret its output.

The disadvantages are many. First, it is not a model in the statistical/mathematical sense of the word, but a heuristic, a guide. This means the analysis tends to be unstable; that is, different samples can produce wildly different results. There are no coefficients that show significance, there are no signs on the variables (positive or negative) and there is no real measure of fit.

CHAID is a popular technique, due to its ease and simplicity. I would offer it is not appropriate for segmentation. Its best use is probably in terms of data exploration. I would caution, however, that this can become a crutch and might encourage you to bypass your own brain. I remember when someone who worked for me was assigned to build a regression model. She had CHAID on her PC so she was running all kinds of CHAID output and had many pages of tree diagrams. After a while I asked how it was going and she was still exploring the data. She had hundreds of variables and she said she had no real idea about what caused what. She claimed she needed CHAID to mine the data because she had no clue what variables might cause/explain the movement in the dependent variable. I told her that if she, as the analyst, truly had no idea whatsoever as to what might cause or explain the movement in the dependent variable (in this case sales) then she was not the right person to do the model. As analyst you MUST have some idea of the data-generating process and you MUST have some idea about ‘this causes that’, eg price changes cause changes in demand. So, use CHAID for designing structure, not explaining causality.

Hierarchical clustering

Hierarchical clustering IS an inter-relationship technique. It also has a graphical display but unlike CHAID it is NOT visually appealing.

Hierarchical clustering calculates a ‘nearness metric’, a type of similarity via some inter-relationship variables. There are many options how to do this but conceptually the idea is that some observations (say customers) are ‘close to each other’ based on some similar variables. Then a dendrogram (a horizontal tree structure) is produced and the analyst chooses how to divide the resultant graphics. See Figure 8.2.

Figure 8.2 Hierarchical clustering – dendrogram


Note that, for instance, observations 34 and 56 are joined together (because they are similar) and these are next joined to observation 111. Now there are three observations in this cluster. As the number of observations increases the graphic is less and less usable. One disadvantage is that the analyst is required to (arbitrarily) decide where to break the clusters off. That is, it ultimately is up to the analyst to choose how many and which observations are in the final clusters. Arbitrary choice is NOT based on analytics, but intuition.
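The joining process behind a dendrogram can be sketched as below. The linkage rule is an assumption of this example (single linkage, ie the distance between two clusters is the distance between their closest members); other nearness metrics are equally valid. Note that `n_clusters` is exactly the arbitrary cut the analyst has to supply.

```python
def single_linkage(points, n_clusters):
    """Repeatedly join the two closest clusters until n_clusters remain."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(points[a], points[b])) ** 0.5

    clusters = [[i] for i in range(len(points))]
    merges = []                                   # the history a dendrogram draws
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((list(clusters[i]), list(clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merges

points = [(0, 0), (0, 1), (0, 3), (10, 10), (10, 11)]   # made-up observations
clusters, merges = single_linkage(points, n_clusters=2)
print(clusters)
print(merges)
```

Every pair of clusters is compared at every step, which is why the approach becomes slow, and the picture unreadable, as the number of observations grows.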

An advantage of hierarchical clustering is it calculates the distance of every observation from all other observations, so the starting ‘seeds’ are mathematically distinct. Often hierarchical clustering is used for nothing else than these starting seeds as an input into another algorithm. Note well James H. Myers’ book on segmentation (Myers, 1996), which has a very good and conceptual treatment of hierarchical clustering.

K-means clustering

K-means is probably the most popular (analytic) segmentation technique. SAS (using proc fastclus) and SPSS (using partitioning) have very powerful algorithms to do K-means clustering. K-means is easy to do, fairly easy to understand and explain and the output is compelling. K-means works and has been in use for over 50 years.

K-means was invented by zoologists in the 1960s for phylum classification. While E W Forgy, R C Jancey and M R Anderberg were early algorithm designers (1960s) it was James MacQueen (1967) who coined the term ‘K-means’. It’s called K-means because K is the number of clusters and the centroids are the means of the clusters. Note they were trying to decide, based on an animal’s (particularly a butterfly’s) characteristics, to which phylum they belonged. They wanted an algorithm for taxonomy.

The general algorithm (and as with all other techniques, there are various versions) is as follows:

1. Set up: choose number of clusters, choose some kind of ‘maximum distance’ to define cluster membership and choose which clustering variables to use.

2. Find the first observation that has all the clustering variables populated and call this cluster 1.

3. Find the next observation that has all the clustering variables populated and test how far away this observation is from the first observation. If it’s far enough away then call this cluster 2.

4. Find the next observation that has all the clustering variables populated and test how far away this observation is from the first and second observations (clusters). If it’s far enough away then call this cluster 3. Continue with steps 2–4 until the number of clusters chosen is defined.

5. Go to the next observation and test which cluster it is closest to and assign that observation to that cluster.

6. Continue with step 5 until all observations that have the clustering variables populated have been assigned.
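The six steps above can be sketched literally as follows. The variable names, the data and the `min_seed_distance` value are hypothetical, and this is a single pass exactly as the steps describe; standard K-means implementations go on to recompute each cluster mean and reassign until nothing moves, which is left out here.

```python
import math

def complete(obs, variables):
    """True if every clustering variable is populated."""
    return all(obs.get(v) is not None for v in variables)

def distance(a, b, variables):
    """Euclidean distance over the clustering variables."""
    return math.sqrt(sum((a[v] - b[v]) ** 2 for v in variables))

def sequential_clusters(observations, variables, k, min_seed_distance):
    usable = [o for o in observations if complete(o, variables)]
    seeds = []
    for obs in usable:                               # steps 2-4: pick the seeds
        if len(seeds) == k:
            break
        if all(distance(obs, s, variables) >= min_seed_distance for s in seeds):
            seeds.append(obs)
    assignments = []
    for obs in usable:                               # steps 5-6: nearest seed wins
        nearest = min(range(len(seeds)),
                      key=lambda i: distance(obs, seeds[i], variables))
        assignments.append(nearest)
    return seeds, assignments

customers = [
    {"visits": 1, "spend": 10}, {"visits": 2, "spend": 12},
    {"visits": None, "spend": 50},                   # skipped: not fully populated
    {"visits": 9, "spend": 90}, {"visits": 8, "spend": 85},
    {"visits": 5, "spend": 40},
]
variables = ["visits", "spend"]
seeds, assignments = sequential_clusters(customers, variables, 2, 20)
seeds_rev, _ = sequential_clusters(customers[::-1], variables, 2, 20)
print(seeds, assignments)
print(seeds_rev)
```

Running it on the reversed file picks different seeds, which is the order dependence discussed in the text.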

There are several things good about this algorithm. It is very fast and can handle a large amount of data. It works. It will achieve some kind of separation.

There are many disadvantages. Personally, I HATE the arbitrariness of what the analyst must decide. As stated above, the analyst tells the algorithm how many clusters to form (as if he knows). There is little (analytically) to base this important criterion on. Second, he has to tell the algorithm what variables to use to define the clusters. Again, as if he knows which variables matter. This is an extremely important choice. The clusters are DEFINED based on this arbitrary choice.

Another disadvantage with K-means is that there are no real diagnostics on how well it fits, how well it predicts and how well it scores those observations (customers) into each segment. Because it’s based on Euclidean distance each observation is placed in the segment it is ‘closest to’. There is no likelihood metric. Suppose a customer is new on file, or has some unusual behaviour. This customer might not exhibit real segment behaviour but is placed somewhere, regardless.

Because of these arbitrary choices (and the fact that K-means gives no diagnostics to aid these choices) most clustering projects end up with the analyst generating many solutions. He will do a 4 and a 5 and a 6 and a 7 and an 8-cluster solution. He will use in each variables 1–5 and then variables 5–10 and then variables 10–12, etc. Because there are no real diagnostics to guide him he will output reams of paper and share these piles of profiles with his peers and the ultimate users of the segmentation and basically throw up his hands and say, ‘What do you think? Which of these 20 outputs do you like the best?’ And then maybe somebody will decide what they like, typically for strategic reasons. Note the subjectivity here?

Another obvious disadvantage (given the algorithm above) is that if the order of the data set is different, the K-means solution will be different. Some algorithms improve on this by not just going down the list, but taking a random observation as each starting seed. This is better, but the same problem remains. Re-order, or re-do, the algorithm – with the same number of clusters and the same variables – and the output will be (very) different. This should strike all analytic people as a great problem.

A last problem with K-means is that it is not an optimizing algorithm. It does not try to maximize/minimize anything. It has no generally controlling objective.

Therefore, I would suggest that K-means is not a viable option for actionable segmentation. The algorithm is too arbitrary and the output is subjective, something most good analysts abhor.

Latent class analysis

Latent class analysis (LCA) is a massive improvement on all the above. It is now the state of the art in segmentation. To me, the best software for this is Latent Gold from Statistical Innovations. Jay Magidson is a genius and has written some of the best articles on it. Especially see ‘A nontechnical introduction to latent class models’ (2002) and ‘Latent class models for clustering: a comparison with K-means’ (2002).

LCA takes a completely different view of segmentation. Rather than, as in the case of K-means, where the variables define the segments, LCA assumes the scores on the variables are caused by the (hidden) segment. That is, LCA posits a latent (categorical) variable (segment membership) that maximizes the likelihood of observing the scores seen on the variables.

It then runs this taxonomy and creates a probability of each observation belonging to each segment. The segment that has the highest probability is the segment into which the observation is placed. This means LCA is a statistical technique and not a mathematical (like hierarchical or K-means clustering) technique.

There are some disadvantages of LCA. SAS does not do it, at least not as a proc. SPSS does not do it either: you have to buy special software. Statistical Innovations created Latent Gold, which has probably become the gold standard (get it, ‘gold’?). It also requires some training and some expertise, but Latent Gold is menu driven and very easy to use. Also, like the light bulb, it is not true that you have to understand all of the intricate details in order to use it. Some training is required, but the results are well worth it.

The advantages have been alluded to but just to be clear, LCA has a LOT of advantages. Ultimately segmentation’s usefulness is about strategy. The better the distinctiveness the more obviously a strategy becomes levelled on each segment.

However, there are several important analytic advantages, especially in the way Latent Gold articulates the algorithm. First, LCA tells you the optimal number of segments. You do not have to guess. LCA uses the BIC (Bayes Information Criterion) and –LL (negative log likelihood) and error rate to give you diagnostics as to what is the ‘best’ number of segments given these scores on these variables and this data set.

Second, LCA gives indications as to which variables are significant in the segmentation solution. You do not have to guess. Any variable that has an R² < 10% can be deemed insignificant.

Third, LCA produces an output that scores every observation with the probability of belonging to each segment. If observation #1 has a probability of belonging to segment 1 of 95% and probability of belonging to segment 2 of 5% it’s pretty obvious to which segment that observation belongs. Observation #1 exhibits very strong segment 1 behaviour. But what about observation #2 that has a probability of belonging to segment 1 of 55% and probability of belonging to segment 2 of 45%? This observation does not demonstrate very strong segment behaviour, for any segment. Under K-means this observation would likely be assigned to segment 1. But LCA gives you a diagnostic. Typically some assumption should be made. It’s usually something like, any observation that does not score at least 70% likelihood of belonging to any segment should be eliminated from the output. Those observations are placed in some other bucket to be dealt with in some other way. There should not be more than 5% of these outliers, given most marketing models are at 95% confidence. A good solution will have far less than 5% outliers.
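The 70% rule can be sketched as a small filter over the membership probabilities. The input format (a dictionary of probabilities per observation) is an assumption made for this example and is not the output layout of any particular package; the two observations are the ones described above.

```python
def assign_with_threshold(posteriors, threshold=0.70):
    """Assign each observation to its most likely segment, or set it aside."""
    assigned, set_aside = {}, []
    for obs_id, probs in posteriors.items():
        segment = max(probs, key=probs.get)
        if probs[segment] >= threshold:
            assigned[obs_id] = segment
        else:
            set_aside.append(obs_id)          # weak segment behaviour
    return assigned, set_aside

posteriors = {
    1: {"seg1": 0.95, "seg2": 0.05},
    2: {"seg1": 0.55, "seg2": 0.45},
}
assigned, set_aside = assign_with_threshold(posteriors)
share_set_aside = len(set_aside) / len(posteriors)
print(assigned, set_aside, share_set_aside)
```

On a real sample the last figure is the one to watch: per the guideline above it should stay well under 5%.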

These diagnostics make the analytics very fast and very clean. They also make the segmentation solution very distinct. As mentioned, this is the hallmark of a good segmentation solution: distinctiveness. But this is not just valuable for the analyst; it is of utmost importance to the strategist. The more distinct the segmentation solution the clearer each strategy becomes.

Table 8.1 Latent class analysis

RFM CHAID K-means LCA

Multivariable XX XX XX XX

Customer-centric XX

Multivariate XX XX

Probabilistic XX

BUSINESS CASE

Scott’s boss called him into the office. He looked around while his boss played with the phone, which always irritated Scott.

‘So Scott’, his boss said, grudgingly looking up from his smartphone. ‘We are ready to make a major push in consumer strategy. We’ve added consumer electronics to our product mix and now want to dive deeper.’

‘That sounds good. What does that mean for my group?’

‘We’d like to explore versioning our direct mail catalogues, positioning our e-mails more strategically, etc. We all remember your ONE SIZE DOES NOT FIT ALL speech at the offsite last quarter.’

‘Yeah, sorry, there had been a few cocktails and…’

‘No, it’s right on. We’re talking about initiating a customer market segmentation project and you are slated to lead it.’

Scott gulped. That would be a lot of work. It would be a lot of fun and very visible. ‘I’ll start putting a team together and begin to go through the process.’

Scott went back to his office (he’d been promoted by now) and sketched out a process, outputting a segmentation based on consumer behaviour. He wrote on his whiteboard a list of steps and then invited stakeholders to a collection of meetings. They were starting a big project: customer segmentation.

Strategize

The first step in behavioural segmentation is to strategize. This tends to be a view from two lenses: marketing strategy and consumer behaviour. These two should not be contradictory.

Scott’s team met and there was some discussion but Scott provided leadership on goals based on the mantra of Peter Drucker, the legendary management guru who created business management as a distinct and separate discipline. Drucker said there are only three metrics that make any business sense: increasing revenue, increasing customer satisfaction and decreasing expenses. If you are working on a project that cannot tie to at least one of these metrics you should ask yourself whether you really should be doing that project. Scott’s team decided their marketing strategy for the segmentation would be increasing net profit margin. The whole point for each segment was strategizing cross-sell/up-sell opportunities. This was a departure from last year’s strategy of mostly acquiring customers. They realized how expensive acquisition can be.

In terms of consumer behaviour, Scott’s team hypothesized potential consumer segments. There would likely be one or more generally sensitive to price, one or more having different product penetrations, one or more reacting to compelling messages designed for them and one or more that prefer one channel over another. This is just using tactical marketing (product, price, promotion and place) differentially against each segment.

The real issue was in terms of behaviour. They talked long about what caused the behaviours they would see. They rationalized there might be a consumer segment heavily into games and entertainment, or another consumer segment very high tech/web-centric/early adopters, etc. There might be another segment needing a relationship, more on the low-tech side, needing their hands held through the techno-babble. They knew most of their (behavioural) data would be transactions and marcom responses.

So the team thought that, given the marketing strategy of increasing net revenue and the various potential consumer behaviour segments, a strategy could be levelled differently at each segment. That is, a completely different communication style would be used on, say, a price-sensitive, low-tech consumer as opposed to a heavy gamer. Scott thought there was a lot of excitement and buy-in for this output.

Collect behavioural data

Scott went to his database team and they talked about what data they had. First they had to define a consumer (as opposed to a small business, eg, a sole proprietorship) but that was fairly straightforward. Then they talked about data.

Scott wanted behavioural data, specifically transactions and marcom responses. They talked about two or three years of history. The PC consumer business has a strong seasonality (peaking in August and even more in December) and Scott had already learned how seasonality had to be taken into account.

In terms of transactions, the issue was what kind of granularity was needed. They decided they needed only broad product categories – laptops, desktops and workstations (very few consumers would buy a server) – and only go one level below this, eg, high-end desktop versus scaled-back desktop, and so on. They’d add consumer electronics, which included televisions, printers, software (personal productivity, games, etc.), digital cameras, accessories, etc. They’d include product details as well as gross revenue and discounts applied, net revenue, number of purchases, time between purchases, months the product(s) were purchased, etc.

Thinking about marcom responses (a sign of behaviour and an indication of engagement) they talked about both direct mail and e-mail. They would mostly ignore social media/in-bound marketing because of difficulty in matching customers, and web banner/advertising (again, it cannot be tied directly to a particular customer). They knew to whom they sent a catalogue, when they sent it, what was on the cover and what offers/promotions were inside each one. Each catalogue had a unique 800 phone number, so when the customers rang, the call centre would know which catalogue had driven (at least) that inquiry. If a promotion was used online those were also unique to each catalogue. The same data was available for e-mail. Each was sent to a particular e-mail address and they could keep track of each open and click, etc. So again, there was a lot of data.

Collect additional data

The next step was to collect additional data. This could come from several possible sources. It could come from creating/deriving data from the database. It could come from overlay data and from primary market research data.

From the consumer database they created additional variables. These included monthly dummy variables for seasonality. They calculated time between purchases, they derived typical market baskets and they put together share of products, that is, percent of desktops, percent of consumer electronics, and so on.
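A minimal sketch of how such derived variables might be built for one customer follows. The input layout (a list of date and category pairs), the feature names and the sample purchases are all hypothetical.

```python
from datetime import date

def derive_features(purchases):
    """purchases: list of (date, category) pairs for a single customer."""
    purchases = sorted(purchases)
    dates = [d for d, _ in purchases]
    gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
    features = {"avg_days_between": sum(gaps) / len(gaps) if gaps else None}
    for month in range(1, 13):                        # monthly dummy variables
        features[f"bought_m{month:02d}"] = int(any(d.month == month for d in dates))
    categories = [c for _, c in purchases]
    for cat in set(categories):                       # share of products
        features[f"share_{cat}"] = categories.count(cat) / len(categories)
    return features

purchases = [
    (date(2014, 8, 1), "desktop"),
    (date(2014, 12, 1), "electronics"),
    (date(2014, 8, 31), "desktop"),
]
features = derive_features(purchases)
print(features)
```

Each customer ends up as one row of such features, which is the shape a segmentation algorithm expects.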

They purchased overlay data. This included both demographics (such as age, education, income, gender, size of household and occupation) as well as lifestyle and interest variables. They hoped these would flesh out the segments. This data was pretty well matched, at about 80% to their consumer database.

There was a limited amount of primary marketing research but Scott found a few studies that could be helpful (especially in the Probe phase of the four Ps of strategic marketing). They had done a customer satisfaction study and an awareness study. These studies each took customer names from the database and, while not well represented, could be matched to the transaction file.

Analytics

Collect data and sample

Note there are two kinds of variables in this environment: segmenting variables and profiling variables. Segmenting variables are those used to create the segments, while profiling variables are everything else. The primary marketing research data will be profiling variables, as they are too underpopulated to be used as segmenting variables. Most of the demographics will be profiling variables, as demographics are typically not useful in defining segments. But the other (behavioural) variables will go through the algorithm and be tested as to whether or not they are significant and if so will be kept as segmenting variables. Note that anything that is not a segmenting variable will be a profiling variable.

What’s next is what Scott has been most looking forward to: the analytics. There are several steps in this process and they are all enjoyable.

So first he would have to take a sample. LCA cannot operate on millions (or even hundreds of thousands) of records. The algorithm would take years to converge. So he chooses a random sample of, say, 20,000 customer records. These records have been matched with transactions and marcom responses, derived data and overlay data and (where possible) primary marketing research data.

Usually there is no need to worry about oversampling (a certain variable) or stratifying, etc.

Oversampling: a sampling technique forcing a particular metric to be overrepresented (larger) in the sample than in simple random sampling. This is done because a simple random sample would produce too few of that particular metric.

Stratifying: a sampling technique choosing observations based on the distribution of another metric. This is done to ensure the sample contains adequate observations of that particular metric.

In typical consumer marketing a simple random sample is fine. Take a look at any good general statistics book for sampling, etc., such as Statistical Analysis for Decision Making, by Morris Hamburg (1987).

Normalize

Now, even though not strictly necessary, is the time to weed out non-normality. I like to do this step to ensure against strange or weird data elements. So, there are two stages.

The first stage is simply to test every variable for ‘non-normality’. This generally means taking the z-score of each variable or standardizing each variable, then deleting any observation that has a score beyond +/–3.0 standard deviations. (Three standard deviations covers about 99.7% of the observations in a normal distribution, so anything beyond that is very NON-normal.) These are clearly non-normal data elements and there should not be very many of them. Some people replace these outliers with the mean but if there are enough observations this is not necessary and a little too arbitrary for my taste.
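The first stage can be sketched as a z-score filter. The data and the variable name are made up, and the population standard deviation is used here as a simplifying assumption.

```python
from statistics import mean, pstdev

def drop_univariate_outliers(rows, variables, limit=3.0):
    """Keep only rows whose z-score is within +/- limit on every variable."""
    stats = {}
    for v in variables:
        values = [r[v] for r in rows]
        stats[v] = (mean(values), pstdev(values))
    kept = []
    for r in rows:
        ok = True
        for v in variables:
            mu, sd = stats[v]
            if sd > 0 and abs((r[v] - mu) / sd) > limit:
                ok = False                    # more than 3 standard deviations out
        if ok:
            kept.append(r)
    return kept

# Nineteen ordinary customers and one extreme spender (hypothetical values).
rows = [{"spend": 10.0} for _ in range(19)] + [{"spend": 1000.0}]
kept = drop_univariate_outliers(rows, ["spend"])
print(len(rows), len(kept))
```

The observation is deleted rather than replaced with the mean, in line with the preference stated above.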

For the second stage I will have to ask you to make sure you’re sitting down. Remember how I’ve clamoured about how bad K-means is and how it’s not a good solution? Well now I’m asking you to use K-means to test for normality.

The idea is to run K-means with a LOT of clusters, like 100 or so. Use the (typically behavioural) variables that make most sense to you in defining the clusters. All we are trying to do is form clusters that are unusual in terms of behavioural motivations. So now with, say, 100 clusters, those clusters that are very small (like having only a few customers in them) are by multivariate definition ‘unusual’. These observations should be eliminated. The point is that while we’ve looked at any single variable being unusual, this technique uses a multivariable approach to find a group of customers moving in such a way to be non-normal. That’s why these observations (customers) are deleted from further analysis.
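Once every customer carries a cluster number from that many-cluster run, dropping the tiny clusters is a short step, sketched here. The `min_size` cut-off of 5 and the cluster labels are assumptions for illustration; the clustering itself is taken as already done.

```python
from collections import Counter

def rows_in_large_clusters(cluster_ids, min_size=5):
    """Return the positions of observations whose cluster has at least min_size members."""
    sizes = Counter(cluster_ids)
    return [i for i, c in enumerate(cluster_ids) if sizes[c] >= min_size]

# Hypothetical labels: two ordinary clusters and one containing only two customers.
cluster_ids = [0] * 10 + [1] * 8 + [2] * 2
keep = rows_in_large_clusters(cluster_ids)
print(len(cluster_ids), len(keep))
```

The two customers in the small cluster are the multivariate ‘unusual’ ones and are left out of the later analysis.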

Note that we are trying to understand the normal market. That’s why there is effort put forth to detect non-normality. Because we have a sample it’s even more important to ascertain unusual scores on variables or unusual customer behaviour and eliminate it.

So, let’s say that Scott and his team did the above process and their sample went from 20,000 to 18,000. Then he randomly splits this 18,000 into two files, A and B. This will be a test file and a validation file for later.
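A random split into files A and B might look like this. Customer identifiers are stood in for by integers, and the fixed `seed` is an assumption added so that the split can be reproduced.

```python
import random

def split_in_two(ids, seed=42):
    """Shuffle the records and cut them into two equal halves, A and B."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

file_a, file_b = split_in_two(range(18000))
print(len(file_a), len(file_b))
```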

Run LCA

Now Scott feeds test file A into the software and is ready to run LCA. He first chooses to run a solution creating segments 2 through 9, just to narrow down where things are. LCA shows diagnostics (BIC, LL, etc., see above) to help with the optimal number of segments (see Table 8.2). Note that the BIC goes down and is at a minimum at six segments. This tells Scott six segments are probably the right number. The BIC is the Bayes Information Criterion. Think of it as an area of error (essentially negative probability) with the smaller the area the better. Whichever cluster has the smallest error (in terms of predicting membership) the better it is.

Table 8.2 Bayes Information Criterion

             BIC
2 cluster    92,454
3 cluster    79,546
4 cluster    61,565
5 cluster    59,605
6 cluster    58,456
7 cluster    58,989
8 cluster    59,650
9 cluster    60,056
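Reading the table amounts to picking the solution with the smallest BIC, as the sketch below does with the figures from Table 8.2. The `bic` helper uses the standard definition (minus twice the log likelihood plus the number of parameters times the log of the sample size); how any given package counts parameters is not shown here.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Standard BIC: -2 * LL + (number of parameters) * ln(number of observations)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# BIC values by number of clusters, copied from Table 8.2.
table_8_2 = {2: 92454, 3: 79546, 4: 61565, 5: 59605,
             6: 58456, 7: 58989, 8: 59650, 9: 60056}
best_k = min(table_8_2, key=table_8_2.get)
print(best_k)
```

The penalty term is what stops the criterion from always favouring more segments.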


Now he runs the second model, after deleting those variables that are insignificant and comes up with Table 8.3.

Table 8.3 Bayes Information Criterion: second model

             BIC
3 cluster    64,466
4 cluster    56,550
5 cluster    41,058
6 cluster    40,611
7 cluster    57,089
8 cluster    58,067

The variables he uses also give diagnostics as to which are significant. Note Table 8.4 below, showing R² < 10% for most of the demographics. These Scott removes.

Table 8.4 List of variables removed

                             R²
Age                          0.05
Education (years)            0.07
Income                       0.01
Size household               0.02
Occupation – blue collar     0.05
Occupation – white collar    0.04
Occupation – agriculture     0.02
Occupation – government      0.01
Occupation – unemployed      0.02
Ethnicity – asian            0.02
Ethnicity – white            0.02
Ethnicity – black            0.01
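The R² screen is a simple threshold, sketched below with the values from Table 8.4. One extra variable with a made-up R² is added, and clearly labelled as hypothetical, only so the example shows something being kept.

```python
# R-squared by variable, copied from Table 8.4.
r_squared = {
    "Age": 0.05, "Education (years)": 0.07, "Income": 0.01, "Size household": 0.02,
    "Occupation - blue collar": 0.05, "Occupation - white collar": 0.04,
    "Occupation - agriculture": 0.02, "Occupation - government": 0.01,
    "Occupation - unemployed": 0.02, "Ethnicity - asian": 0.02,
    "Ethnicity - white": 0.02, "Ethnicity - black": 0.01,
}
# Hypothetical behavioural variable, not from the book's output.
r_squared["pct_purchases_online (hypothetical)"] = 0.45

THRESHOLD = 0.10
removed = [name for name, r2 in r_squared.items() if r2 < THRESHOLD]
kept = [name for name, r2 in r_squared.items() if r2 >= THRESHOLD]
print(len(removed), kept)
```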

This is part of the modelling exercise: put variables in, run the segment solutions, see where BIC is best, look at significance and remove those that are insignificant, etc. While this seems time consuming, it ends up being far faster than, say, K-means, mostly because there is absolutely a good solution at the end, not an arbitrary quagmire of undifferentiated clusters.

The variables that end up being significant include:

Figure 8.3 Significant Variables

Note that these variables are behavioural, as expected. Revenue variables are not even tested, as they are the RESULT of behaviour. Demographics typically are not significant and are also not behavioural. Of course, any and all of these variables can be used for profiling.

The next step is to correct for white noise, using bivariate residuals. This step adds a large number of parameters and will slow the analysis down. Way down. Analytically, all three dimensions are nudged simultaneously: find the number of segments, find the significant variables and correct with bivariate residuals.

The next step is to mark those bivariate residuals. These are indications of some pattern remaining that the independent variables are not eliminating. The bivariate residuals should be checked down to about 3.84. This is the 95% level of confidence (remember the 95% z-score for linear models is 1.96 and 3.84 = 1.96 × 1.96, which is the chi-squared critical value with one degree of freedom).
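The 3.84 cut-off can be reproduced from the normal distribution, and then used to flag variable pairs. The residual values and variable names below are hypothetical; only the threshold comes from the text.

```python
from statistics import NormalDist

z_95 = NormalDist().inv_cdf(0.975)      # two-sided 95% z-score, about 1.96
cutoff = z_95 ** 2                      # about 3.84

def pairs_to_mark(bivariate_residuals, cutoff):
    """Return the variable pairs whose residual is still above the cut-off."""
    return sorted(pair for pair, value in bivariate_residuals.items() if value > cutoff)

# Hypothetical residuals for two pairs of segmenting variables.
residuals = {("em_open", "em_click"): 12.4, ("dm_sent", "q4_share"): 1.2}
marked = pairs_to_mark(residuals, cutoff)
print(round(cutoff, 2), marked)
```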

The common last step is to run the second file through using the same number of segments, six, and the same variables found to be significant. Check the bivariate residuals and look at the two outputs. They should appear essentially the same. I usually do not statistically ‘test’ this sameness, I just look at it. I have never seen the two results to be different in any meaningful way.

Profile and output

The profile generally uses all the variables. Often there is a ‘top-down’ view and a ‘bottom-up’ view, or a strategy view and a tactical view, or a general view and a specific view. Below is the strategic, top-down or general view of the six segments. This lens puts the segments together, to compare and contrast, all at once, looking at KPIs.

Table 8.5 General view of six segments

                                Seg 1    Seg 2    Seg 3    Seg 4    Seg 5    Seg 6
% of market                       30%      24%      19%      15%       9%       3%
% of revenue                      32%      39%       9%      17%       2%       0%
# Total purch                   14.49    25.64     8.88    18.17     7.95     9.65
Rev DT purch                    3,150    4,730      999    2,592      352       81
Rev NB purch                    2,320      720      680    1,152      630      168
Rev total purch                 6,281    9,786    2,742    6,811    1,393    1,154
# DM sent                        13.5      9.1     19.5      5.6      6.8      9.5
# EM sent                        15.9     17.8      9.1     12.9     15.5     12.8
# EM open                         1.4      3.2      0.4      4.5      1.7      2.6
# EM click                        0.1      0.4        0      2.3      0.3      0.2
# Prod purch call centre          3.6      2.6        8      0.9        2      3.9
# Prod purch online              10.9     23.1      0.9     17.3        6      5.8
Education (years)                19.1     12.9     11.8     17.9     13.8     13.8
$ Income                         185K      60K      45K     125K      15K      75K
% Q4 purchase                     25%      70%      83%      14%      15%      41%
Avg time between purch            6.5      3.1     16.5      4.2      9.4     15.4
Avg time between web visits       3.2      2.1      9.5      1.9      3.9      8.5

A few quick comments can be made on the above output. First is that some demographics are shown. This is typical. Remember that while demographics are not statistically significant in designing the segmentation, they might still be of use in fleshing out the segments (and advertisers seem to love demographics). The first stage is partitioning and the second stage is probing. Adding additional data is part of the probing stage.

Let’s look at the segmentation solution. Segment 1 is the largest in terms of market and each segment is successively smaller with segment 6 the smallest at 3%. The story is how segment size compares to percent of revenue generated. Note that segment 2 contributes 39% of the revenue with only 24% of the market. Note that segment 5, conversely, is not pulling its fair share having 9% of the market but generating only 2% of the revenue. These metrics begin to let Scott know where he should put his resources and which segments are ‘worth’ marketing to. See the graph below.

Figure 8.4 % of market vs % of revenue

* Does not add to 100% due to rounding.
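The ‘fair share’ comparison can be turned into a simple index, revenue share divided by market share times 100, using the first two rows of Table 8.5. The index itself is an illustrative device, not a metric the book defines; anything above 100 is contributing more revenue than its size would suggest.

```python
# Percent of market and percent of revenue by segment, from Table 8.5.
pct_market = {1: 30, 2: 24, 3: 19, 4: 15, 5: 9, 6: 3}
pct_revenue = {1: 32, 2: 39, 3: 9, 4: 17, 5: 2, 6: 0}

revenue_index = {
    seg: round(100 * pct_revenue[seg] / pct_market[seg], 1) for seg in pct_market
}
over_fair_share = [seg for seg, idx in revenue_index.items() if idx > 100]
print(revenue_index, over_fair_share)
```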

Another story displays itself around channel preference. Segment 2 and segment 4 seem to be very web-centric, while segment 3 is NOT one that pursues online purchases. Segment 4 opens 4.5 of the 12.9 e-mails sent to them, whereas segment 3 opens 0.4 of the 9.1 e-mails sent to them. Segment 2 purchases 23.1 of their 25.64 products online (and segment 4 purchases 17.3 of their 18.17 products online) but again segment 3 purchases only 0.9 of their 8.88 products online. These are clear behavioural differences.
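These channel comparisons are easier to see as rates. The sketch below computes an e-mail open rate and an online purchase share per segment from the Table 8.5 figures; the two rate names are labels chosen for this example.

```python
# Per-segment averages copied from Table 8.5.
em_sent = {1: 15.9, 2: 17.8, 3: 9.1, 4: 12.9, 5: 15.5, 6: 12.8}
em_open = {1: 1.4, 2: 3.2, 3: 0.4, 4: 4.5, 5: 1.7, 6: 2.6}
total_purch = {1: 14.49, 2: 25.64, 3: 8.88, 4: 18.17, 5: 7.95, 6: 9.65}
online_purch = {1: 10.9, 2: 23.1, 3: 0.9, 4: 17.3, 5: 6.0, 6: 5.8}

open_rate = {s: em_open[s] / em_sent[s] for s in em_sent}
online_share = {s: online_purch[s] / total_purch[s] for s in total_purch}

most_engaged = max(open_rate, key=open_rate.get)
least_online = min(online_share, key=online_share.get)
print(most_engaged, least_online)
print({s: round(v, 2) for s, v in online_share.items()})
```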

Segment 1 has the highest and segment 5 (mostly students, see details below) has the lowest income. Segment 1 has the most education and segment 3 the least education. The figures below show occupations and other demographics.

Comments/details on individual segments

A few notes and observations on each segment follow.

Segment 1

Segment 1 is the largest segment (30% of the market) and contributes 32% of the revenue.

Segment 1 purchases more desktops (3.5) and notebooks (2.9) than any other segment. They have a high penetration of productivity software (twice the average) and are probably heavily invested in smartphone and tablet ownership, which means they are very high-tech comfortable.

Segment 1 receives the second-highest number of direct mails and e-mails sent. It’s interesting to note, however, that they have the next-to-lowest ratio of e-mails clicked to e-mails opened, at about 7% (0.1 clicks on 1.4 opens).

Segment 1 has the largest size household (4.1) and most (70%) white collar occupations. They have the highest income and highest education. They are youngish and probably could be called yuppies.

Segment 2

Segment 2 is the next-to-largest segment (24% of the market) and contributes more than their fair share of the revenue at 39%.

Segment 2 pays by far the highest desktop prices (75% above average) and has nearly four times higher than average gaming software purchases. They make almost no productivity purchases, but a lot of accessory (nearly three times average) and phone purchases (nearly twice average).

Segment 2 shows the next-to-highest number of e-mail opens and the highest number of products purchased online, 88% above average. This segment calls the call centre the next-to-lowest number of times from the catalogue but has the highest number of calls from e-mails and they have the most online configurations.

This segment is the gamers! They tend to be young and single with the next-to-smallest size of household. They purchase all of the gaming accessories: headphones, joystick, etc.

Segment 3

Segment 3 makes up 19% of the customer market but only accounts for 9% of the revenue. This segment does not come close to pulling its weight.

Segment 3 purchases a large amount of digital cameras (nearly twice average) and 50% more phones. When they do purchase they tend to buy low-end entry-level technology, which is one reason their revenue contribution is so low.

Segment 3 receives the highest number of catalogues and the lowest number of e-mails. This segment opens fewer and clicks less than any other. Segment 3 needs a (direct mail) discount in order to purchase.

Segment 3 calls from direct mail more and purchases from the call centre more than any other segment. Conversely, this segment calls from e-mail less and purchases online less than any other segment.

Segment 3 needs hand-holding. They are low tech and need a relationship to foster a purchase. They tend to be African-American, with a high percentage of blue collar and government occupations. This segment has the least education. They call the call centre with complaints more than any other segment and tend to purchase mostly during the Christmas season.

Segment 4

Segment 4 is 15% of the market and generates 17% of the revenue.

Segment 4 purchases next-to-most desktops and next-to-most notebooks. They are very high tech, purchasing the most TVs, cameras, network and other accessories.

This segment has the highest e-mail opens and by far (over four times average) the most e-mail clicks of any segment. They purchase fewer products from the call centre than any other segment and have the next-to-most products purchased online. They have the shortest time between web visits.

Segment 4 is very web-centric and probably believes ‘print is dead!’ They tend to be Asian, very high tech, with engineering white collar occupations. They would be early adopters, with next-to-highest education compared to other segments. They ignore direct mail and make most of their purchases online.

Segment 5

This segment is the least successful, being 9% of the market but only pulling 2% of the revenue.

Segment 5 purchases low-end products (few desktops, largely notebooks), mostly during back-to-school sales and usually with a discount. They purchase nearly zero consumer electronics.

Segment 5 receives the next-to-least number of direct mails and makes the next-to-least call centre purchases.

Segment 5 appears to be mostly students, single, unemployed, low income, etc.

Segment 6

Segment 6 is only 3% of the market and generates <1% of the revenue.

Segment 6 really only purchases accessories and occasional items, spare parts, etc.

This segment is not really engaged in our brand, does not really respond to communications, etc. Segment 6 does not visit our website much and has the next-to-longest time between purchases (only segment 3 waits longer). This segment might be a target to DE-market to. Note the high percentage of agricultural occupations.

Tables 8.6 and 8.7 present some details by segment, as referenced above.

Table 8.6 Details by segment

Segment 1 Segment 2 Segment 3 Segment 4 Segment 5 Segment 6

%ofmarket 30% 24% 19% 15% 9% 3%

%ofrevenue 32% 39% 9% 17% 2% 0%

NumDTpurch 3.5 2.2 1.11 2.88 0.88 0.09

NumNBpurch 2.9 1.2 0.85 1.44 1.05 0.21

Numelectronics–TVpurch 0.11 1.15 0.09 1.35 0.05 0.21

Numelectronics–camerapurch 0.02 0.05 1.06 1.88 0.24 0.45

Numelectronics–printerpurch 1.38 1.06 1.15 1.19 1.09 0.29

Numelectronics–accessorypurch 1.2 5.5 0.08 1.08 0.29 1.87

Numelectronics–phonepurch 0.03 1.21 0.99 0.89 0.09 0.35

Numelectronics–sw–gamepurch 0.02 9.55 0.08 0.09 0.68 0.65

Numelectronics–sw–productivepurch 4.1 0.09 1.06 2.21 0.24 0.87

Numother–networkpurch 1.1 1.02 1.54 2.89 1.98 0.87

Numother–accessoriespurch 0.11 1.55 0.22 1.59 1.08 1.54

Numother–otherpurch 0.02 1.06 0.65 0.68 0.28 2.25

Numtotalpurch 14.49 25.64 8.88 18.17 7.95 9.65

RevDTpurch 3,150 4,730 999 2,592 352 81

RevNBpurch 2,320 720 680 1,152 630 168

Revelectronics–TVpurch 127 1,811 104 1,553 30 242

Revelectronics–camerapurch 7 15 371 658 60 158

Revelectronics–printerpurch 207 105 173 179 82 44

Revelectronics–accessorypurch 90 853 6 81 19 140

Revelectronics–phonepurch 7 454 223 200 14 79

Revelectronics–sw–gamepurch 1 716 5 6 37 42

Revelectronics–sw–productivepurch 308 2 80 166 18 65

Revother–networkpurch 61 97 85 159 109 48

Revother–accessoriespurch 4 271 8 56 38 54

Revother–otherpurch 0 12 10 10 4 34

Revtotalpurch 6,281 9,786 2,742 6,811 1,393 1,154

Table 8.7 Additional details by segment

Segment 1 Segment 2 Segment 3 Segment 4 Segment 5 Segment 6

NumberDMsent 13.5 9.1 19.5 5.6 6.8 9.5

NumberEMsent 15.9 17.8 9.1 12.9 15.5 12.8

NumberEMopen 1.4 3.2 0.4 4.5 1.7 2.6

NumberEMclick 0.1 0.4 0 2.3 0.3 0.2

Numberprodpurchcallcenter 3.6 2.6 8 0.9 2 3.9

Numberprodpurchonline 10.9 23.1 0.9 17.3 6 5.8

NumberDMdiscount 8.1 5.5 11.7 3.4 4.1 5.7

NumberEMdiscount 11.1 12.5 6.4 9 10.9 9

NumberDMcall 1.2 0.8 15.9 0.2 3.9 9.5

NumberEMcall 9.4 12.8 2.1 3.4 8.4 4.8

Numonlineconfig 5.5 21.5 0.7 16.5 12.6 0.4

Numbercallcenterpurch 3.6 2.6 8 0.9 2 3.9

Numbercallcentercomplaint 2.1 0.9 5.6 3.2 1.2 0.5

Age 28.9 25.5 41.9 30.1 21.2 38.9

Education(years) 19.1 12.9 11.8 17.9 13.8 13.8

Income 185,000 60,000 45,000 125,000 15,250 75,000

Sizehh 4.1 1.2 3.9 3.7 1.1 3.1

Occupation–bluecollar 20% 19% 60% 18% 13% 25%

Occupation–whitecollar 70% 38% 1% 65% 5% 35%

Occupation–agriculture 4% 5% 2% 1% 5% 18%

Occupation–government 3% 28% 25% 15% 15% 11%

Occupation–unemployed 1% 8% 10% 1% 60% 10%

Ethnicity–asian 15% 5% 2% 21% 7% 1%

Ethnicity–white 55% 65% 35% 41% 70% 80%

Ethnicity–black 20% 15% 35% 8% 10% 11%

Q1purchase 30% 4% 6% 20% 5% 1%

Q2purchase 25% 10% 5% 31% 5% 3%

Q3purchase 20% 15% 5% 33% 75% 55%

Q4purchase 25% 70% 83% 14% 15% 41%

Avgtimebetweenpurch(months) 6.5 3.1 16.5 4.2 9.4 15.4

Avgtimebetweenwebvisits(weeks) 3.2 2.1 9.5 1.9 3.9 8.5

Naming the segments

One of the most enjoyable exercises ever is the naming of the segments. A common way to do it is through revenue and products. This is the desktop segment and this is the low-tech segment, etc. Another possibility is with marcom. This is the direct mail responders and this is the e-mail preference segment, etc. Both of these are probably too simplistic.

Each segment name should have only two or three words to describe it: desktop devotees, gamers, life starters, web-centrics, etc. The idea is to be descriptive as well as memorable.

K-means compared to LCA

The comparison below came from Scott’s debate with other analytic folks. Some of them had learned K-means and, because LCA was new to them, did not really understand or trust it. Therefore Scott ran LCA and told the K-means team the number of segments he found and which variables to use. Note that these two pieces of information (how many segments and which variables are significant) would not ever be information K-means would have. Thus he gave the K-means team two HUGE advantages. Each team ran the algorithm and produced the KPIs in Table 8.8.

Table 8.8 KPIs

LCA output Segment 1 Segment 2 Segment 3 Segment 4 Segment 5 Segment 6 hi/low

%ofmarket 30% 24% 19% 15% 9% 3% 12

%ofrevenue 32% 39% 9% 17% 2% 0% 81.44

Numtotalpurch 14.49 25.64 8.88 18.17 7.95 9.65 3.23

RevDTpurch 3,150 4,730 999 2,592 352 81 58.4

RevNBpurch 2,320 720 680 1,152 630 168 13.81

Revtotalpurch 6,281 9,786 2,742 6,811 1,393 1,154 8.48

NumberDMsent 13.5 9.1 19.5 5.6 6.8 9.5 3.48

NumberEMsent 15.9 17.8 9.1 12.9 15.5 12.8 1.96

NumberEMopen 1.4 3.2 0.4 4.5 1.7 2.6 12.4

NumberEMclick 0.1 0.4 0 2.3 0.3 0.2 124.04

Numberprodpurchcallcentre 3.6 2.6 8 0.9 2 3.9 8.8

Numberprodpurchonline 10.9 23.1 0.9 17.3 6 5.8 25.99

Education(years) 19.1 12.9 11.8 17.9 13.8 13.8 1.62

Income 185,000 60,000 45,000 125,000 15,250 75,000 12.13

Q4purchase 25% 70% 83% 14% 15% 41% 5.93

Timebetweenpurch(months) 6.5 3.1 16.5 4.2 9.4 15.4 5.32

Timebetweenvisits(weeks) 3.2 2.1 9.5 1.9 3.9 8.5 5

K-means output Segment 1 Segment 2 Segment 3 Segment 4 Segment 5 Segment 6 hi/low

%ofmarket 24% 19% 17% 16% 15% 9% 2.67

%ofrevenue 19% 15% 17% 19% 18% 13% 1.45

Numtotalpurch 14.1 17.7 16.2 14.8 16.9 17.2 1.26

RevDTpurch 1,901 2,490 3,498 4,021 2,011 2,666 2.12

RevNBpurch 1,344 1,108 1,655 1,100 1,100 911 1.82

Revtotalpurch 4,992 5,006 6,271 7,509 7,489 9,200 1.84

NumberDMsent 10.1 11 11.2 12.8 12.9 15.1 1.5

NumberEMsent 11.9 15.2 16.4 15.2 14.9 15 1.38

NumberEMopen 1.8 2.2 2.3 2.2 2.1 2.8 1.56

NumberEMclick 0.61 0.66 0.54 0.52 0.51 0.26 2.54

Numberprodpurchcallcentre 3.1 3.6 3.7 3.9 3.4 4.9 1.58

Numberprodpurchonline 9.1 10.2 12.4 17.1 13.5 13.6 1.88

Numtotalpurch 12.2 13.8 16.1 21.0 16.9 18.5 1.73

Education(years) 16.3 16.4 15.1 13.1 15.3 15.5 1.25

Income 109,655 109,166 98,066 98,054 97,112 88,055 1.25

Q4purchase 39% 34% 61% 44% 44% 55% 1.79

Timebetweenpurch(months) 6.6 7.5 7.7 9.1 8.1 7.9 1.38

Timebetweenvisits(weeks) 3.8 4.1 4.5 4.6 3.5 4.9 1.4

Notice in the top LCA table the variable ‘Numtotalpurch’. This table shows the averages by segment. Segment 2 on average purchases the most items, with 25.64, and segment 5 purchases the least items on average, with 7.95. Look at the last column and see the high/low: 25.64/7.95 = 3.23. That is a measure of range, or dispersion.

See the lower part of the table which uses K-means. It is the same data, same number of segments and same variables used as significant. The high/low of Numtotalpurch is much smaller than that from LCA. A high of 17.7 and a low of 14.1 give a range of only 1.26. This is a typical difference. K-means output would work; LCA is simply better, more distinct and ultimately produces a clearer strategy.
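The hi/low column is simple to reproduce. The sketch below recomputes it for two KPIs using the segment averages printed in Table 8.8. The function name `hi_low` and the choice to return infinity when a segment average is zero are assumptions made here, and ratios computed from the rounded table values will not always match a printed ratio that came from unrounded data.

```python
# Segment averages from Table 8.8 (six segments per KPI).
lca = {
    "Numtotalpurch": [14.49, 25.64, 8.88, 18.17, 7.95, 9.65],
    "NumberDMsent": [13.5, 9.1, 19.5, 5.6, 6.8, 9.5],
}
kmeans = {
    "Numtotalpurch": [14.1, 17.7, 16.2, 14.8, 16.9, 17.2],
    "NumberDMsent": [10.1, 11, 11.2, 12.8, 12.9, 15.1],
}


def hi_low(values):
    """Ratio of the largest to the smallest segment average."""
    low = min(values)
    if low == 0:
        # Assumption: treat a zero minimum as unbounded separation.
        return float("inf")
    return max(values) / low


for kpi in lca:
    print(f"{kpi}: LCA {hi_low(lca[kpi]):.2f} vs K-means {hi_low(kmeans[kpi]):.2f}")
```

The larger the ratio, the further apart the most and least extreme segments sit on that KPI, which is why the LCA solution reads as the more distinct of the two.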

Another fairly common finding comparing K-means to LCA is in terms of segment size. LCA produces segments ranging from 30% to 3%, but K-means ranges only from 24% to 9%. This is because K-means produces roughly spherical clusters, which tend to be of similar size. There is no marketing theory that would hypothesize the segments should be of about the same size.

Scott convinced the team that the LCA output was the obvious way to go.

Elasticity modelling

One very natural and helpful exercise after segmentation is to do elasticity modelling. (Remember Chapter 3 on demand went through the modelling detail.) This shows different price sensitivities by segment. That is, one segment will likely be sensitive to price and another segment will likely NOT be sensitive to price, etc. This allows for very lucrative strategies. Review earlier chapters for how elasticity modelling is typically done.

What Scott found was that segment 1 is not sensitive to price. This segment does not require a discount in order to purchase. He found conversely that segments 3 and 5 are very sensitive to price. These are the segments that will only buy with some kind of promotion.

Test and learn plan

The last step tends to be putting together some kind of testing plan. We will cover statistical details later in the book, but the concept is straightforward.

The idea is to corroborate the sensitivities the segmentation found. That is, if a segment is sensitive to price, test that. If a segment prefers a particular channel, test that, etc.

Usually selection is tested first, then promotion and then channel or product category, etc. These are usually in a test versus control situation.

HIGHLIGHT

WHY GO BEYOND RFM? (This article was published in a different format in Marketing Insights, April 2014)

Abstract

While RFM (recency, frequency and monetary) is used by many firms, it in fact has limited marketing usage. It is really only about engagement. It is valuable for a short-term, financial orientation but as organizations grow and become more complex a more sophisticated analytic technique is needed. RFM requires no marketing strategy and as firms increase in complexity there needs to be an increase in strategic planning. Segmentation is the right tool for both.

RFM has been a pillar of database marketing for 75 years. It can easily identify your ‘best’ customers. It works. So why go beyond RFM? To answer that, let’s make sure we all know what we’re talking about.

What is RFM?

One definition could be, ‘An essential tool for identifying an organization’s best customers is the recency/frequency/monetary formula.’ RFM came about more than 75 years ago for use by direct marketers. It was especially popular when database marketing pioneers (such as Stan Rapp, Tom Collins, David Shepherd and Arthur Hughes) started writing their books and advocating database marketing (as the next generation of direct marketing) nearly 50 years ago. It became a popular way to make a database build (an expensive project) return a profit. Thus, the most pressing need was to satisfy finance.

Jackson and Wang wrote, ‘In order to identify your best customers, you need to be able to look at customer data using recency, frequency and monetary analysis (RFM)…’ (Jackson and Wang, 1997). Again the focus is on identifying your best customers. But, it is not marketing’s job to just identify your ‘best’ customers. ‘Best’ is a continuum and should be based on far more than merely past financial metrics.

The usual way RFM is put into place, although there are an infinite number of permutations, ends up incorporating three scores. First, sort the database in terms of most recent transactions and score the top 20%, say, with a 5 and on down to the bottom 20% with a 1. Then re-sort the database based on frequency, maybe with the number of transactions in a year. Again, the top 20% get a 5 and the bottom 20% get a 1. The last step is to re-sort the database on, say, sales dollar volume. The top 20% get a 5 and the bottom 20% get a 1. Now, sum the three columns (R + F + M) and each customer will have a total ranging from 15 to 3. The highest scores are the ‘best’ customers.

Table 8.9 Customer totals

Customer ID R F M Total

999 3 2 1 6

1001 5 3 3 11

1003 4 4 2 10

1005 1 5 2 8

1007 1 4 1 6

1009 2 4 3 9

1010 3 4 4 11

1012 2 3 5 10

1014 3 1 5 9

1016 4 1 4 9

1017 5 2 3 10

1018 4 3 4 11

1020 4 4 3 11

1022 3 5 3 11

1024 2 4 2 8

1026 1 3 5 9
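The scoring that produces a table like this can be sketched in a few lines. This is one of the many permutations mentioned above, not a definitive implementation: the ten customers and their values are made up for illustration, and the helper `quintile_scores` is hypothetical. It ranks each metric, gives the top 20% a 5 and the bottom 20% a 1, and sums the three scores.

```python
def quintile_scores(values, higher_is_better=True):
    """Score each value from 5 (top 20%) down to 1 (bottom 20%) by rank.

    Hypothetical helper; ties are broken by file order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i], reverse=higher_is_better)
    scores = [0] * n
    for rank, i in enumerate(order):
        scores[i] = 5 - (rank * 5) // n
    return scores


# Made-up data for ten customers (not from the book).
days_since_last = [5, 40, 200, 12, 90, 365, 30, 150, 60, 7]
transactions = [12, 3, 1, 8, 4, 1, 6, 2, 5, 10]
sales = [900, 250, 60, 1200, 300, 40, 700, 120, 450, 800]

r = quintile_scores(days_since_last, higher_is_better=False)  # fewer days = more recent
f = quintile_scores(transactions)
m = quintile_scores(sales)
totals = [ri + fi + mi for ri, fi, mi in zip(r, f, m)]

for cust, (ri, fi, mi, total) in enumerate(zip(r, f, m, totals), start=1):
    print(f"Customer {cust}: R={ri} F={fi} M={mi} Total={total}")
```

Note that nothing in the code knows why a customer scored high or low; it only sorts and sums.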

Note that this ‘best’ is entirely from the firm’s point of view. The focus is not about customer behaviour, not about what the customer needs, why those with a high score are so involved or why those with a low score are not so engaged. The point is to make a (financial) return on the database, not to understand customer behaviour. That is, the motivation is financial and not marketing.

RFM works as a method of finding those most engaged. It works to a certain extent, and that extent is selection and targeting. RFM is simple and easy to use, easy to understand, easy to explain and easy to implement. It requires no analytic expertise. It doesn’t really even require marketers, only a database and a programmer.

Say you re-score the database every month, in anticipation of sending out the new catalogue. That means that every month each customer potentially changes RFM value tiers. After every time period a new score is run and a new migration emerges. Note that you cannot learn why a customer changed their purchasing patterns, why they decreased their buying, why they made fewer purchases or why the time between purchases changed. Much like the tip of an iceberg, only the blatant results are seen and RFM gives nothing in the way of understanding the underlying motivations that caused the resultant actions. There can be no rationale as to customer behaviour because the purpose of the algorithm used was not for understanding customer behaviour. RFM uses the three financial metrics and does not use an algorithm that differentiates customer behaviour.

Because RFM cannot increase engagement (it only benefits from whatever level of involvement, brand loyalty, satisfaction, etc. you inherited at the time – with no idea WHY) it tends to make marketers passive. There is no relationship building because there is no customer understanding. That is, because RFM cannot provide a rationale as to what makes one value tier behave the way they do, marketing strategists cannot actively incentivize deeper engagement.

RFM is a good first step, but to make a great step requires something beyond RFM. Marketers require behavioural segmentation in order to practise marketing.

What is behavioural segmentation?

Behavioural segmentation (BS) quickly followed RFM, due to the frustrations that RFM produced good, but not great, results. As with most things, complex analysis requires complex analytic tools and expertise. BS was put into place to apply marketing concepts when using a database for marketing purposes.

In order to institute a marketing strategy, there needs to be a process. Kotler recommended the four Ps of strategic marketing: Partition, Probe, Prioritize and Position. Partitioning is the process of segmentation.

While it’s mathematically true that partitioning only requires a business rule (RFM is a business rule) to divide the market into sub-markets, behavioural segmentation is a specific analytic strategy. It uses customer behaviour to define the segments and it uses a statistical technique that maximally differentiates the segments. James H. Myers even says, ‘Many people believe that market segmentation is the key strategic concept in marketing today’.

BS is from the customer’s point of view, using customer transactions and marcom response data to specifically understand what’s important to customers. It is based on the marketing concept of customer-centricity. BS works for all strategic marketing activities: selection targeting, optimal price discounting, channel preference/customer journey, product penetration/category management, etc. BS allows a marketer to do more than mere targeting.

An important point might be made here. Behaviours are caused by motivations, both primary and experiential. Behaviours are purchases, visits, product usage and penetration, opens, clicks and marcom responses, etc. These behaviours cause financial results, revenue, growth, lifetime value and margin.

Primary motivations would be unseen things like attitudes, tastes and preferences, lifestyle, value set on price, channel preferences, benefits or need arousal. There are experiential, secondary causes of behaviour, typically based on some brand exposure. These are not behaviours, but cause subsequent behaviours. These secondary causes would be things like loyalty, engagement, satisfaction, courtesy or velocity. Note that RFM uses recency and frequency, metrics of engagement, which is a secondary cause. RFM also uses monetary metrics, which are resultant financial measures. Thus RFM does not use behavioural data, but engagement and financial data. These are very different than behavioural data used in BS. One simple way to distinguish behavioural data from secondary data is that behaviours are nouns: purchases, responses, etc. Note that secondary causes are adjectives: engagement metrics, loyal customers, recent transactions, frequently purchased, etc.

BS typically requires analytic expertise to implement. Behavioural segmentation is a statistical output (see the box on page 164).

One critical difference between BS and RFM is that in a behavioural segmentation members typically do not change groups. That is, the behaviour that defines a segment evolves very slowly. For example, if one person is sensitive to price, her defining behaviour will not really change. She is sensitive to price even after she has a baby, she is sensitive to price as she ages, or if she gets a puppy, or buys a new house. Her products purchased might change, her interests in certain campaigns might change, but her defining behaviour will not change. This is one of the advantages of BS over RFM. This is what drives your learning about the segments. BS provides such insights that each segment generates a rationale, a story, as to why it’s unique enough to BE a segment.

While RFM uses only three dimensions, BS uses any and all behavioural dimensions that best differentiate the segments. It typically requires far more than three variables to optimally distinguish a market.

Because marketing mix testing can be done on each segment (using product, price, promotion and place) the insights generated make for differentiated marketing strategies for each segment. To test if RFM tiers drive behaviour is probably inappropriate, because tier membership potentially changes every time period. Much like studies that proclaim, ‘women who smoke give birth to babies with low birth weight’, there is spurious correlation going on. Just as another dimension (socio-economic, culture, etc.) might be the real (unseen) cause of the low birth weight and NOT necessarily (only) the smoking, so there are other (unseen) dimensions of behaviour at work when RFM is used to explain, say, campaign responses. That is, the response is not caused by the RFM tier, but some other motivation.

In short, BS goes far beyond RFM. The insights and resultant strategies are typically worth it.

What does behavioural segmentation provide that RFM does not?

As mentioned, BS delivers a cohort of segment members that are maximally differentiated from other segment members. Because these members typically do not change segments, various marketing strategies can be levelled at each segment to maximize cross-sell, up-sell, ROI, margin, loyalty, satisfaction, etc.

BS identifies variables that optimally define each segment’s unique sensitivities. For example, one segment might be defined by channel preference, another by price sensitivity, another by differing product penetrations and another by a preferred marcom vehicle. This knowledge, in and of itself, generates vast insights into segment motivations. These insights allow for a differentiated positioning of each segment based on each segment’s key differentiators. You get away from trying to incentivize customers out of the ‘bad’ tiers and into the ‘good’ tiers. In BS, there are no good or bad tiers. Your job is now to understand how to maximize each segment based on what drives that segment’s behaviour, rather than focus on only migration. Thus, BS gives you a test-and-learn plan.

Because of the insights provided, knowledge is gained of each segment’s prime pain points, which means that each segment can be treated with the right message, at the right time, with the right offer and at the right price. This kind of positioning creates a ‘segment of one’ in the customer’s mind. This uniqueness differentiates the firm, perhaps even to the extent of moving it away from heavy competition and toward monopolistic competition. This means you approach a degree of market power that amounts to becoming a price maker.

Because BS provides such insights it tends to make marketers very active in understanding motivations. This tends to generate very lucrative strategies for each segment.

Conclusion

What are the advantages of RFM? It’s fast, simple and easy to use, explain and implement. What are the disadvantages of behavioural segmentation? It requires analytic expertise to generate, is more costly and takes longer to do.

BS takes behavioural variables and uses them for the purpose of understanding customer behaviour, and it uses a statistical algorithm to maximally differentiate each segment based on behaviour (see box overleaf). As mentioned, the vast majority of marketers that evolve from RFM to BS say it’s worth it, and their margins agree.

Segmentation techniques

There are three characteristics that distinguish behavioural segmentation from RFM: BS uses (typically) more behavioural data, BS uses the data for the specific purpose of understanding customer behaviour and BS uses statistical techniques to maximally separate the segments.

There are two general philosophies in analysis: supervised and unsupervised techniques. Unsupervised techniques almost eliminate the analyst from the analysis. These are neural networks, machine learning, chaos theory, etc. Philosophically, it seems on the wrong track to run a technique requiring little analytic strategy. It’s also well known that neural network techniques suffer from over-fitting and difficulty in explaining what the model means (usually because of the hundreds of additional/transformational variables neural networking tends to create). Therefore, unsupervised techniques are not recommended.

Of those techniques that require some kind of analytic input, a short comparison from RFM to CHAID to K-means to Latent Class is instructive. RFM is multivariable (typically using three variables) but it is not multivariate; that is, it does not use the three dimensions simultaneously. RFM is mathematical and could not be a statistically valid option.

CHAID (chi-squared automatic interaction detection) is sometimes offered as a segmentation solution. It is a tree-like structure that splits the nodes based on the chi-square test. While CHAID is fast and simple (and probably better than RFM) it cannot be optimal. CHAID is not a statistical model but a heuristic, a guideline. It brings with it no diagnostics and little intelligence.

K-means (also called partition, iterative, or clustering) is another fast and simple technique. The typical algorithm requires you to decide on the number of clusters (as if you know) and decide which variables to use to design the clusters (as if you know). K-means gives no diagnostics to aid in these important criteria, leaving it to your arbitrary intuition.

So, after the number of clusters is decided, along with which variables to use for clustering, the algorithm goes to the first observation (eg customer on the data set) that has all the variables populated, calculates the centroid (average of all the variables in dimensional space) and labels this cluster 1. It goes to the next observation that is populated, calculates the centroid and ascertains how far away (based on the square root Euclidean distance) the second observation is from the first. If it’s ‘far enough’ away (based on criteria the analyst gives or a default) to be defined as its own cluster, it is. It continues through the data set until the number of clusters supplied is created and all of the observations are classified into one (mutually exclusive) cluster.

Note: 1) It is not statistical, but mathematical. It uses the square root Euclidean distance to assign cluster membership. 2) Cluster centroids (and hence clusters) are highly dependent on the order of the data set. If the data set is re-sorted there will likely be very different segments. 3) It offers little in the way of diagnostics. 4) Because the clusters are naturally spherical (owing to assignments based on distance from a centroid) the clusters tend to be of similar size, which seems an unlikely assumption in a real market. While K-means is a step above RFM and CHAID, it clearly suffers from many shortcomings.
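The order dependence in point 2 is easy to demonstrate. The sketch below is a deliberately simplified version of the procedure just described (pick seeds in file order, then assign every observation to its nearest seed); it is not the full iterative K-means found in statistical packages, which also re-estimates centroids. The six customers, the threshold of 3 and the function names are all assumptions for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two observations."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sequential_clusters(rows, k, min_separation):
    """Pick up to k seeds in file order, then assign each row to its nearest seed."""
    seeds = []
    for row in rows:
        if len(seeds) == k:
            break
        # A row becomes a new seed only if it is far enough from every existing seed.
        if all(distance(row, seed) >= min_separation for seed in seeds):
            seeds.append(row)
    labels = [min(range(len(seeds)), key=lambda j: distance(row, seeds[j]))
              for row in rows]
    return seeds, labels


def partition(rows, labels):
    """Group rows by label so two runs can be compared regardless of label numbers."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, set()).add(row)
    return {frozenset(members) for members in groups.values()}


# Hypothetical customers: (number of purchases, months between purchases).
customers = [(1, 1), (2, 1), (5, 5), (6, 5), (9, 1), (10, 1)]

_, labels_forward = sequential_clusters(customers, k=2, min_separation=3)
reordered = list(reversed(customers))
_, labels_reversed = sequential_clusters(reordered, k=2, min_separation=3)

same_segments = (partition(customers, labels_forward)
                 == partition(reordered, labels_reversed))
print("Same segments after re-sorting?", same_segments)
```

With these made-up points, reversing the file changes which customers end up grouped together, even though the data, the number of clusters and the threshold are identical.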

Latent class analysis (LCA) has been around for 50 years, but in the last 20 has really caught on. LCA is a Bayesian (maximum likelihood) technique which is statistical in nature. Because customer behaviour is probabilistic (even irrational) a statistical technique better matches behaviour than a mathematical technique. It has diagnostics to find the optimal number of segments. It has diagnostics to find which variables are significant for the segmentation.

LCA applies a probability score to every observation (customer on the data set) to belong to each segment. For example, it’s one thing if customer A is 95% likely to belong to segment 1 and only 5% likely to belong to segment 2. There is an obvious conclusion. But what if, owing to the customer being either newer on file or having displayed some unusual patterns, it is scored at 55% likely to belong to segment 1 and 45% likely to belong to segment 2? This is not so clear. LCA gives you the ability to remove from the segment assignments any of those that do not show strong segment behaviour. This should typically be a very small percentage of the file but the ability to ‘know’ where each customer most likely belongs is very important strategically.
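Once the membership probabilities exist, setting aside the unclear cases is a simple rule. The sketch below assumes the probabilities have already been produced by an LCA package; it does not fit a latent class model. The 0.7 cut-off and the function name `assign_segments` are hypothetical choices for illustration, not a recommendation from the text.

```python
def assign_segments(posteriors, min_probability=0.7):
    """Assign each customer to the most likely segment, or None if unclear.

    posteriors maps a customer id to a list of membership probabilities,
    one per segment (segment 1 first). The threshold is an assumption.
    """
    assignments = {}
    for customer, probs in posteriors.items():
        best = max(range(len(probs)), key=lambda j: probs[j])
        if probs[best] >= min_probability:
            assignments[customer] = best + 1  # segments are numbered from 1
        else:
            assignments[customer] = None      # not a strong enough member of any segment
    return assignments


# The two cases from the text: customer A is clear, customer B is not.
posteriors = {"A": [0.95, 0.05], "B": [0.55, 0.45]}
assignments = assign_segments(posteriors)
print(assignments)
```

Customer A lands in segment 1, while customer B is left unassigned until more behaviour is observed.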

It has been proved often, but by none better than Jay Magidson and Jeroen K. Vermunt, that LCA is vastly superior to K-means in terms of segment identification and separation (Magidson and Vermunt, 2002). Given the advantages of LCA as seen above, it should be seen as the first and best choice.

Checklist

You’ll be the smartest person in the room if you:

Remember SAS gives a metric of an optimal segmentation solution as the ‘log of the determinant of the covariance matrix’.

Recall a variety of segmentation techniques: business rules, CHAID, hierarchical clustering, K-means, latent class analysis (LCA), etc.

Point out that LCA provides the optimal number of segments, diagnosis of which variables are significant and calculates a probability score for every member belonging to every segment – nothing is arbitrary!

Use the behavioural segmentation process: strategize, collect behavioural data, create/use additional data, run the chosen algorithm and profile segment output.

Prove RFM is from the firm’s point of view and not the consumer’s.

Preach RFM incites no strategy except migration.

Part four

Other

09

Marketing research

Introduction

How is survey data different than database data?

Missing value imputation

Combating respondent fatigue

A far too brief account of conjoint analysis

Structural equation modelling (SEM)

Checklist: You’ll be the smartest person in the room if you…

Introduction

Why stick in a chapter on marketing research? Most of the analytic techniques (discussed so far) apply to both marketing research and database marketing. It’s because, while there is overlap, the function and goal of marketing research is different than that of database marketing.

Database marketing exists in order to drive purchases from customers. Marketing research exists in order to understand consumer behaviour.

Database marketing is populated with programmers, econometricians and marketers. Marketing research is populated with psychologists, statisticians and marketers. Database marketing is applied analytics. Marketing research is exploratory analytics. Database marketing is tactical and fast. Marketing research is strategic and thorough.

Merlin Stone’s book Consumer Insight (Stone, 2004) details well database marketing and marketing research. This overview includes CRM, marketing systems/operations, loyalty, etc.

How is survey data different than database data?

This is a good question, and more involved than it may seem at first glance. Of course, survey data comes from a survey and database data comes from a database. But the key thing is that survey data has a source that is (typically) the consumer and it is self-reported and may even include opinions, etc. Database data has a source that (typically) is a system (transactional or otherwise) and it is real data, real behaviour, real responses; that is, NOT self-reported.

Marketing research as a discipline tends to focus on survey data, whereas direct marketing, of course, tends to focus on database data. You’ve seen how many marketing science techniques are applicable to both. This chapter scrapes off those techniques that are mostly used in marketing research. You cannot really do, for example, a conjoint on database data; it is not designed that way.

This is one area of contention alluded to earlier, especially in terms of pricing. Marketing research would suggest a survey and ask customers/potential customers about pricing policies. These responses are subjective/self-reported and tend to have the same conclusion: ‘Your prices are too high!’ Conjoint is designed to get around that in some manner but it is still artificial in terms of a real buying/choice decision. That’s why I recommend using database data which is real reactions from real transactions facing real choices in terms of real prices. Real cool, right? But there is a place for surveys and conjoint, etc. Just see below.

Missing value imputation

A common issue in survey data (as well as database data, but less so) is what to do about missing values. It is a typical practice – but, as is the case with most typical practices, not a good idea – to just replace the missing value with the mean value. That is, say we have survey data around demographics, including age. Say that in this case age is important to what we’re studying. If a very small percent of age is missing, maybe replacing the missing values with the overall mean is not so bad. But it’s still stupid.

A better possibility is to do segmentation (even K-means is a decent choice) and, based on, say, income or size of household, replace the missing age values with the mean of each segment. This assumes that age is correlated with income or size of household, and that’s probably not a bad assumption.

The best idea would be to model, using ordinary regression, the predicted age based on the above demographics by each segment. This would add variation, rather than only the (segment) mean value.

This is all based on a subjective idea that depends on the percent of whatever value is missing. If, say, <5% is missing, replacing with the overall mean value might be acceptable. If, say, between 5% and 25% is missing, replacing with the mean value by segment is better. If between 25% and 50% is missing, modelling the missing value with regression by segment is the best. If >50% is missing no imputation should be attempted.
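These rules of thumb can be written down directly. The sketch below encodes the percentage bands above and shows the segment-mean option; the regression option is left out to keep the example short. The function names, the handling of the exact 25% and 50% boundaries and the sample ages are all assumptions for illustration.

```python
from statistics import mean


def choose_method(values):
    """Pick an imputation approach from the share of missing (None) values.

    Boundary handling (for example, exactly 25% missing) is an assumption.
    """
    share_missing = sum(v is None for v in values) / len(values)
    if share_missing < 0.05:
        return "overall mean"
    if share_missing <= 0.25:
        return "segment mean"
    if share_missing <= 0.50:
        return "regression by segment"
    return "no imputation"


def impute_segment_mean(values, segments):
    """Replace each missing value with the mean of its own segment.

    Assumes every segment has at least one observed value.
    """
    observed = {}
    for value, seg in zip(values, segments):
        if value is not None:
            observed.setdefault(seg, []).append(value)
    seg_mean = {seg: mean(vals) for seg, vals in observed.items()}
    return [seg_mean[seg] if value is None else value
            for value, seg in zip(values, segments)]


# Made-up survey ages with two missing answers, plus a segment label per respondent.
ages = [25, None, 31, 60, None, 64, 62, 28]
segments = ["young", "young", "young", "older", "older", "older", "older", "young"]

method = choose_method(ages)
filled = impute_segment_mean(ages, segments)
print(method, filled)
```

With 25% of ages missing the rule points to the segment mean, so each gap is filled from its own segment rather than from the overall average.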

Combating respondent fatigue

Marketing surveys should be short (I don’t know what I mean by short, but they should require little effort, thinking or time). If they are too long (whatever too long means) fatigue will set in (or worse, irritation) and responses will begin to be erroneous/nonsensical.

The first suggestion to combat this problem is to design surveys that are short. It’s better to have two or three surveys instead of one long survey. Otherwise the answers are meaningless.

An analytic suggestion is to rotate and model questions. This requires some thinking and design but the results are usually very good.

The general idea is to use some questions to model the answers to other questions. Obviously these modelled questions would not be asked. That is, say the survey is in three (well designed for modelling) sections, A, B and C. Only one fourth of the respondents (randomly chosen) would get the entire survey. One fourth would get one half of A and one half of B, another fourth would get one half of A and one half of C, and the last fourth would get one half of B and one half of C. The survey is half as long for these last three fourths of the respondents.

Now the idea is to model the other half of those sections that were not given. That is, use answers from A and B to model missing C, B and C to model A and A and C to model B. See? From my experience the errors from the model in the rotate-and-model scenario are far less than the errors from fatigue. That means that the models are at 95% confidence and those answers are better than giving the entire long survey to 100% of the respondents, which will introduce fatigue-induced errors into them.
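The random split into four survey versions can be sketched as below. The version names, the seed and the helper functions are hypothetical; the sketch covers only the assignment step, not the modelling of the unasked questions.

```python
import random

# What each survey version contains, following the A/B/C example in the text.
VERSIONS = {
    "full": {"A": "all", "B": "all", "C": "all"},
    "A+B": {"A": "half", "B": "half"},
    "A+C": {"A": "half", "C": "half"},
    "B+C": {"B": "half", "C": "half"},
}

def assign_versions(respondent_ids, seed=0):
    """Shuffle respondents, then deal them evenly across the four versions."""
    ids = list(respondent_ids)
    random.Random(seed).shuffle(ids)
    names = list(VERSIONS)
    return {rid: names[i % len(names)] for i, rid in enumerate(ids)}

def sections_to_model(version):
    """Sections a respondent never saw, to be modelled from the ones they did see."""
    return sorted(set("ABC") - set(VERSIONS[version]))

assignment = assign_versions(range(1000))
counts = {name: 0 for name in VERSIONS}
for version in assignment.values():
    counts[version] += 1
```

Shuffling before dealing keeps the four groups the same size while leaving membership random, which is what the design needs for the modelled answers to generalize.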

A far too brief account of conjoint analysis

To be fair, if you’re reading this book in order to know all about conjoint analysis, you are reading the wrong book. There are dozens of (entire) books detailing all the cool types and techniques of conjoint. I will barely mention this here because conjoint is a vast subject and I am not much of a conjoint guy.

To elaborate the last point, I think conjoint serves an important purpose, especially in marketing research, especially in product design (before the product is introduced). My main problem (as mentioned above) with surveys overall is that they are self-reported and artificial. Conjoint sets up a contrived situation for each respondent (customer) and asks them to make choices. The customer makes choices and these choices are typically in terms of purchasing a product. You know I’m an econ guy and these customers are not really purchasing. They are not weighing real choices. They are not using their own money. They are not buying products in a real economic arena. The artificialness is why I do not advocate conjoint for much else other than new product design. That is, if you have real data use it. If you need (potential) customers’ input in designing a new product use conjoint for that. Also, please recognize that conjoint analysis is not actually an ‘analysis’ (like regression, etc.) but a framework for parsing out simultaneous choices. Conjoint means ‘considered jointly’.


The general process of conjoint is to design choices, depending on what is being studied. Marketing researchers are trying to understand what attributes (independent variables) are more/less important in terms of (typically) customers purchasing a product. So a collection of experiments is designed to ask customers how they’d rate a product (how likely they would be to purchase) given varying product attributes.

In terms of, say, PC manufacturing, choice 1 might be: an 800 cost of PC, 17 inch monitor, 1 Gig hard drive, 1 Gig RAM, etc. Choice 2 might be: an 850 cost of PC, 19 inch monitor, 1 Gig hard drive, 1 Gig RAM, etc. There are enough choices designed to show each customer in order to calculate ‘part-worths’ that show how much they value different product attributes. This is supposed to give marketers and product designers an indication of market size and optimal design for the new product.

Note that it is important to design the types and number of levels of each attribute so that the independent variables are orthogonal (not correlated) to each other. These choice design characteristics are critical to the process. At the end an ordinary regression is used to optimally calculate the value of part-worths. It is this estimated value that makes conjoint strategically useful.

Now let’s take a slightly deeper dive into the analytics of conjoint. Note that the idea is to present to responders choices (in such a way that they are random and orthogonal) and the responders rank these choices. The choice rankings are a responder’s judgment about the ‘value’ (economists call it utility) of the product or service evaluated. It is assumed that this total value is broken down into the attributes that make up the choices. These attributes are the independent variables and these are the part-worths of the model. That is:

$$U_i = X_{11} + X_{12} + X_{21} + X_{22} + \dots + X_{mn}$$

where $U_i$ = total worth for product/service and

$X_{11}$ = part-worth estimate for level 1 of attribute 1

$X_{12}$ = part-worth estimate for level 1 of attribute 2

$X_{21}$ = part-worth estimate for level 2 of attribute 1

$X_{22}$ = part-worth estimate for level 2 of attribute 2

$X_{mn}$ = part-worth estimate for level m of attribute n.
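As a small illustration of how part-worths add up to a total worth, the sketch below scores the two PC choices from the example. The part-worth numbers are made up for illustration, only two attributes are included, and in practice the values would come out of the regression step described above.

```python
# Hypothetical part-worths (illustrative values, not estimates from any study).
part_worths = {
    "price": {"800": 1.25, "850": 0.75},
    "monitor": {"17 inch": 0.25, "19 inch": 1.0},
}

def total_worth(profile, part_worths):
    """Sum the part-worth of the chosen level of every attribute in a profile."""
    return sum(part_worths[attribute][level] for attribute, level in profile.items())

choice_1 = {"price": "800", "monitor": "17 inch"}
choice_2 = {"price": "850", "monitor": "19 inch"}

u1 = total_worth(choice_1, part_worths)
u2 = total_worth(choice_2, part_worths)
preferred = "choice 2" if u2 > u1 else "choice 1"
```

With these made-up values the larger monitor more than offsets the higher price, which is the kind of trade-off the part-worths are meant to expose.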

As mentioned above, my view (and many will violently disagree) is that conjoint is appropriate for new product/service evaluations, and that’s about all. It is not appropriate in the typical way usually used, especially in terms of pricing, except, as mentioned, in a new product – a product where there is no real data. (I even prefer, say, van Westendorp pricing schemes over conjoint. These are where the survey asks respondents what price is so high you would not consider purchase and what price is so low you would suspect a quality issue. The intersection of where ‘too expensive’ and ‘too cheap’ cross is hypothesized as optimal price.)
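The crossing point described in the parenthesis can be found with a simple search over candidate prices. This is a minimal sketch using only the two questions mentioned in the text; the respondent answers, the price grid and the function name are hypothetical.

```python
def crossing_price(too_cheap, too_expensive, candidate_prices):
    """Return the candidate price where the two cumulative shares are closest.

    too_cheap[i]     : price below which respondent i would suspect quality
    too_expensive[i] : price above which respondent i would not consider buying
    """
    n_cheap, n_expensive = len(too_cheap), len(too_expensive)

    def gap(price):
        share_too_cheap = sum(t >= price for t in too_cheap) / n_cheap
        share_too_expensive = sum(t <= price for t in too_expensive) / n_expensive
        return abs(share_too_cheap - share_too_expensive)

    # min() keeps the first candidate with the smallest gap
    return min(candidate_prices, key=gap)

# Hypothetical survey answers from four respondents.
too_cheap = [10, 15, 20, 25]
too_expensive = [18, 22, 26, 30]

price = crossing_price(too_cheap, too_expensive, range(10, 31))
```

The share saying ‘too cheap’ falls as price rises and the share saying ‘too expensive’ climbs, so the first price where the two shares meet is taken as the hypothesized optimum.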

Anyway, for an existing product, it is possible to design a conjoint analysis and put price levels in as choice variables. I have had marketing researchers tell me that this price variable derives an elasticity function. You should know by now how I feel about that. I disagree for the following reasons. 1) Those estimates are NOT real economic data. They are contrived and artificial. 2) The size of the sample it is derived from is too small to make real corporate strategic choices. 3) The data is self-reported. Those respondents are not responding with their own money in a real economic arena purchasing real products. 4) Using real data is far superior to using conjoint data. Have I said this enough yet? Ok, the rant will now stop.

Structural equation modelling (SEM)

This will unfortunately (also) be a far-too-brief account of SEM. SEM is in the domain of marketing research, rather than direct/database marketing (where we’ve spent most of our time) but it is so powerful and so fun that a quick tour has to be done.

There are some similarities between SEM and simultaneous equations (covered earlier). They each are about systems of equations and thus several similarities follow. They each deal with endogenous and exogenous variables. They each require the algebraic solution of fixed variables and enough observations to calculate variance. Of course they each require the analyst to think through cause and effect. This is because both techniques are about cause and effect and can be conceptualized as regressions.

As mentioned, SEM is a marketing research tool while simultaneous equations are an econometric tool. This is the first difference. Another (major) difference is that simultaneous equations are (only) about blatant variables while SEM can contain both blatant as well as latent variables. This is in fact, in my view, the most important (and exciting) difference. Another difference is that simultaneous equations operate on each (raw) observation (say, each row is a customer) but SEM operates on an observation being an element of a covariance matrix. Whew. So, with that, let’s go on to a few definitions of SEM as a different kind of animal.

Figure 9.1 Units and price cause revenue

In the contrived example above, note that both units and price CAUSE revenue. Revenue is a dependent variable. That’s equation 1. Note also that both price and marcom CAUSE units. Units are a dependent variable in equation 2. Obviously units are both an independent and a dependent variable. There are two equations. All of these are blatant (manifest) variables. They can be measured for what they are.

Revenue = f(units, price)

Units = f(price, marcom)

It is true in this case that while price and marcom statistically impact units (with stochastic error), revenue is NOT statistically driven by units and price with a random error. Revenue is algebraically caused by units * price. This would be a straight line with no error. It’s just an example. It also shows that SEM is often diagrammed using paths. We will do the same. Examples will revolve around path analysis. In SAS it will be with proc calis.

Let’s go over some terminology, as SEM has its own language, jargon, etc. As noted, there are two kinds of variables: manifest and latent. Manifest variables are blatant, directly measured, directly observed. These are things like responses, sales, units, price or days between purchases. The second kind of variable is latent. These are (indirectly) estimated through observable data. These are things like satisfaction, loyalty and intelligence. That is, while there is no quantitative observable metric of, say, satisfaction, it can be inferred by observable behaviour.

Now let’s mention again exogenous and endogenous variables. Exogenous variables are outside the system; they are independent variables (not caused) but can be either latent or manifest. Endogenous variables are typically (at least) dependent variables and are caused by something else. They also can be either latent or manifest. Okay? Now we’re ready to do SEM.

Comparing regression to SEM

For a simple example let’s use proc reg revenue = f(units, price) and then proc calis revenue = f(units, price).

This is far too simple a use of SEM but it will illustrate some important things. Note that all variables are manifest and we have only one equation. Let’s say we run proc reg and get the following:

Table 9.1 Proc reg

Variable     Parm estimate   Standard error   T value
Intercept    -8862
Units        73.24           7.4              9.98
Price        111.25          19.03            5.84

Now if we run proc calis:

proc calis data=xx.xx meanstr;
   path
      rev <--- units n_price;
run;

Table 9.2 Proc calis

Path       Variable     Parm estimate   Standard error   T value
Revenue    Intercept    -8863
           Units        73.24           1.48             49.39
           Price        111.25          2.07             53.81

Proc calis gives a lot more (but not shown here) results. The only point here is that SEM and OLS show the same (single equation, manifest) output, in terms of parameter estimates. The difference in t-value calculation is that regression uses a different denominator for standard error than SEM.

Calculating impacts

Now let’s see what happens when we include more complexity and more realism. Most marketers want to know the impact of their marcom (and price) on revenue. Say we did a regression model revenue = f(units, price, e-mail, direct mail). (We will ignore the algebraic issue of having both price and units as independent variables.) The interest here is marcom impacts.

Table 9.3 Regression model revenue

Variable      Parm estimate   Standard error   T value
Intercept     -9368
Units         77.08           7.569            9.79
Price         115.24          20.112           5.73
Email         9.089           2.969            3.06
Direct mail   3.99            1.88             2.12

This indicates that every e-mail sent drives 9.089 in revenue and for every direct mail sent we get 3.99 in revenue. Looks like marcom is really rockin’! This means that sending 100 each drives 909 and 399 or 1,308 in total revenue. This model implicitly assumes the impact of marcom is directly on revenue and not on units. The R² here is 57%.

Now let’s go a step further, and the results will be more interesting. We will use the above path of two equations:

Revenue = f(units, price)

Units = f(price, email, direct mail)

where marcom will be number of e-mails and direct mails sent. The hypothesis here is that units and price directly (algebraically in this case) impact revenue. The other hypothesis is that price and marcom (EM and DM) directly impact units which then indirectly impact revenue. That is, units are both a dependent and an independent variable. That means that revenue comes from both price and units and that units come from price and EM and DM.

This means the total impact on revenue is:

Table 9.4 Total impact on revenue

Path       Variable      Parm estimate   Standard error   T value
Revenue    Intercept     -8863
           Units         73.24           1.48             49.39
           Price         111.25          2.07             53.81
Units      Intercept     259
           Price         -2.53           0.082            -30.88
           Email         1.266           0.299            4.23
           Direct mail   1.141           0.089            12.82

Most importantly note the impact of marcom is through units, and not to revenue. The impact of one e-mail is now 1.266 of revenue and every direct mail is now 1.141. Now sending 100 each only totals 241 in revenue. This is far more realistic than the above model. The R² here is 78%. While this is a contrived, overly simplistic model it has complexity that more closely matches reality.
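The two ‘send 100 of each’ totals can be reproduced directly from the reported coefficients. The sketch below follows the text in reading the marcom coefficients of each model as per-piece impacts; the function and variable names are hypothetical.

```python
# Marcom coefficients from Table 9.3 (single regression on revenue).
regression_marcom = {"email": 9.089, "direct_mail": 3.99}

# Marcom coefficients from the units path in Table 9.4.
path_marcom = {"email": 1.266, "direct_mail": 1.141}

def impact(coefficients, pieces_sent):
    """Total impact of sending the same number of pieces through every channel."""
    return sum(coef * pieces_sent for coef in coefficients.values())

regression_total = impact(regression_marcom, 100)   # roughly 1,308
path_total = impact(path_marcom, 100)               # roughly 241
```

Laying the two totals side by side makes the gap between the single-equation and two-equation readings easy to see.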

Use of latent variables

Now let’s talk about where the real power of SEM comes in: the use of latent variables. In this case let’s put together a framework for loyalty. Note that there is actually no such thing as a blatant entity called/quantified as ‘loyalty’. It is a latent variable. The idea is that it is like intelligence, which is also unquantifiable as itself; it can only be indirectly measured as something like a score on an IQ test, which in turn measures dimensions of intelligence: spatial ability, logic, mathematics, verbal skills, etc. Same is true for loyalty. It can be seen and surmised by other actions.

Let’s say we have a behavioural segmentation in place based on customer transactions and responses to marcom. We are interested in how loyal each segment is, which is not necessarily the same thing as how much they spend or how many transactions they have. So we do primary marketing research and ask questions about opinions/attitudes around price, value, quality and satisfaction. These metrics will show a range of loyalty. We also ask about share of voice, competitive density and the convenience of our stores compared to our competitors.

The model in Figure 9.2 tries to put a framework together that says consumer behaviour (transactions, responses, etc.) is caused by a spectrum of loyalty (from none to transactional to emotional) which is in turn caused by attitudes around price, value, satisfaction and quality as well as opinions/metrics of operational logistics like convenience, share of voice and competitive density.

Figure 9.2 Marcom responses transactions

So the general analytic idea is that there are no such metrics/quantities as emotional or transactional loyalty. These are latent variables. But adding these variables helps explain the behaviour of customers purchasing and customers responding. This latent variable is discovered by a factor analysis-type technique used in SEM. That is, the manifest variables indirectly show the influence of the latent variable and that latent variable is ‘teased out’ and labelled.

A quick note about the difference between transactional and emotional loyalty should clarify this important point. It is possible for a customer to appear very loyal in terms of buying a lot of products, having a short time between purchases, responding to marcom, etc., but not be in fact actually loyal. These are heavy purchasers because there might not be any competitors around, or our stores are very convenient or our share of voice is comparatively large. Thus it’s important to know how ‘loyal’ customers are. That is, a transactionally loyal customer may jump ship if competitors move in near their location, or change their share of voice.

The results below are from applying the loyalty model to two different segments, say X and Y. The segments were defined by (transactions and marcom responses) behaviour. The question is how loyal (what kind of loyalty) they are and what can be done about it. Let’s say that each segment has generally the same metrics on transactions and responses. Segment X scores as a transactional loyalty customer. Note the parameter estimates of convenience and competitive density are very high and significant while share of voice is strong and negative. These are traditional indications of the transactional loyalty segment. Note also high and positive impacts of attitudes around price and quality, and recognize that most of the variables on the emotional path are insignificant.

Now, a segment that scores as a strong transactional loyalty-only segment is a bit of a red flag. This is especially true if they LOOK like they are loyal based on their number and amount of purchases.

How can we use the above model to move the segment from mere transactional loyalty to emotional loyalty? The answer is in the emotional loyalty path. The single largest impact is share of voice and that is a metric we can (somewhat) control. There is a business case around what is the cost to spend and increase our relative share of voice applied against the added security (and perhaps increased purchasing) of a segment that evolves into emotional loyalty. See that share of voice is negative in the transactional path? As SOV increases a customer is less transactional and more emotional.

Table 9.5 Segment X, transactional loyalty

Path            Variable         Parm est   St error   T value
Transactional   Price            5.65       3.23       1.75
                Quality          6.21       1.65       3.75
                Value            3.03       2.07       1.47
                Satisfaction     1.35       0.66       2.05
                Convenience      5.22       0.75       6.96
                Competition      2.66       0.99       2.68
                Share of voice   -1.55      1.03       -1.51
Emotional       Price            0.03       2.66       0.01
                Quality          0.56       1.07       0.53
                Value            1.04       2.36       0.44
                Satisfaction     1.66       1.03       1.62
                Convenience      1.99       1.66       1.2
                Competition      0.66       2.04       0.32
                Share of voice   2.55       1.69       1.51


Now let’s look at the opposite kind of loyalty, the brand/emotional kind. These are customers that love our brand, no matter what. View the output below for segment Y, which scores mostly as an emotionally loyal group. Note on the emotional path convenience and competitive density are negative. This segment is so connected to the brand that even if it is inconvenient to go to our store they go anyway and even if more competition moves in these customers come to our store anyway. This is emotional loyalty. You see also that on the emotional path, while price is positive it’s insignificant and quality is very small. It should be no surprise that both value and satisfaction are high. On the transactional path none of those metrics are significant.

Table 9.6 Segment Y, emotional loyalty

Path            Variable         Parm est   St error   T value
Transactional   Price            -1.27      5.65       -0.22
                Quality          2.07       6.24       0.33
                Value            2.07       1.65       1.25
                Satisfaction     0.03       5.07       0.01
                Convenience      0.23       0.2        1.17
                Competition      0.04       0.02       1.8
                Share of voice   -2.65      1.54       -1.72
Emotional       Price            3.25       3.04       1.07
                Quality          0.24       0.12       2.06
                Value            1.26       0.76       1.67
                Satisfaction     3.23       1.23       2.63
                Convenience      -3.65      1.26       -2.91
                Competition      -2.07      0.56       -3.66
                Share of voice   1.27       0.87       1.45

This is the power of SEM, hypothesizing and testing a latent variable. This latent variable accounts for movement in the customer transactions and customer responses. If only a blatant/manifest model was used the fit would not have been so good and the insights (differentiating between the two kinds of loyalty) would not be realized. So is that cool, or what?

Checklist

You’ll be the smartest person in the room if you:

Point out that marketing research and database marketing use many similar marketing science/analytic techniques.

Remember that survey data and database data are different in many ways:
• survey data is typically a few hundred or thousand responses, whereas perhaps millions of consumers have transactions on a database;
• survey data is self-reported/opinions whereas database data is real events;
• survey data is a sample of some kind whereas database data can be the whole relevant population (eg all of a firm’s customers).

Take great care in imputing missing values. Under some circumstances replacing a missing value with the mean is appropriate, other times maybe a model is called for.

Recall that conjoint analysis is best suited for new products, because of the artificial nature of the simulated purchase.

Differentiate between structural equation models (SEM) and simultaneous equations. SEM and simultaneous equations are both systems of equations, but SEM does not require only blatant variables.

Argue that the power of SEM is in uncovering latent variables.


10

Statistical testing
How do I know what works?

Everyone wants to test

Sample size equation: use the lift measure

A/B testing and full factorial differences

Business case

Checklist: You’ll be the smartest person in the room if you…

Everyone wants to test

Statistical testing (design of experiments, DOE) seems to decrease the risk of making a mistake.

Design of experiments: an inductive way of creating a statistical test using a stimulus taking into account variance, confidence, etc., by randomization and comparison to a control group.

I’ll tell you right now, I myself am not really a testing guy. I see its worth, but the times that the test is actually ‘clean’, can be measured and is measuring what it was designed to measure, are very few. This is because of a couple of things. First, companies do not want to design for test vs. control – why would they want to take potential buyers out of the treatment (ie the control group does not get the stimulus – the test)? The marketing science answer is that ‘you must invest in the test!’ So firms usually fight to make the control group so small, actually too small, that a statistical test (t-test, z-test, etc.) cannot (reliably) be performed.

Another reason is that most of the time the test is ‘dirty’. We never seem to get customers that were to get only a certain kind (or no kind) of treatment (stimulus). Say a customer is supposed to get treatment X so they can be measured against treatment Y (that is the test). However, accidentally, that customer also gets stimuli from other parts of the company and the number one rule of testing is: only one thing can be different in measuring test vs. control. If a customer was supposed to get only treatment X and they (or some of them) also got stimulus A and treatment B, promotion C, etc., the test cannot be done; you cannot measure (in a DOE framework) multiple differences (without designing for that). That is why the design is critical.

Very few companies are disciplined enough to actually carry out a test. Most of the time, at the end of the test, everyone shrugs their shoulders and also acknowledges seasonality or competition or changing tastes and preferences or hypothesizes that something systematic affected the test results. So they want to test again. And again: never really learning in order to act, just testing. More about that later.

Sample size equation: use the lift measure

Testing questions always begin with sample size. The idea is to have a sample large enough – and with enough variation – in order to be confident about generalizing to the population. Remember statistics uses inductive reasoning. That is the point of testing: take a small sample (so as not to (publicly) ruin anything) and simulate the population. That’s important. What you’re trying to do is design a laboratory that looks (and acts) just like the population. You experiment on the (sampled) laboratory and find what seems to work and then you have to thrust these onto the population, which you hope will act as the sample did. That’s inductive reasoning.

So we have to revisit the normal distribution, z-scores and the confidence interval. That was a long time ago, so go back if you need to. I did.

Remember that the normal distribution (although kind of theoretic) is the model that we use (mostly) for testing. We assume a normal distribution. The normal distribution is characterized by two things: 1) the mean and median and mode are all the same number and 2) the distribution is symmetrical about that number. Now, by definition, within the first standard deviation of a normal distribution are contained 68% of all the observations; with the second standard deviation add about 14% to each side, aggregating roughly 27% more, for a total of about 95% of observations within two standard deviations. See Figure 10.1. Now let’s think about z-scores. Remember the formula is

z = (observation – mean) / standard deviation.

Figure 10.1 Z-scores

In terms of IQ, where the mean is 100 and the standard deviation is 15, 68% of all observations are between 85 and 115. Said another way, an IQ of +1 standard deviation is a z-score of 1.00, which is greater than (34 + 34 + 14 + 1.9) nearly 84% of the population. A z-score of +2.0 is greater than nearly 98% of the population. See? This is actually the key to sample size needed and overall testing.

By sample I mean a subset of the population. Even if you do not really have the whole, entire population, we’ll pretend. What else can we do? So we generally take a simple random sample (SRS) of the population. But how large a sample do we need in order to simulate the population?

Sample size needs to take into account (in terms of DOE) variation which affects confidence. We are trying to be pretty confident that the size of our sample will mirror the population when the testing is done and then generalized to the population. That is, if you took the mean of the population and found it to be 50.0 and then took an SRS and found that mean to be 40.0, would you be confident that your sample mirrored the population? The answer is, ‘Maybe, depending on the variation’. Say you knew the population had a mean of 50.0 but a standard deviation of 25.50. It’s possible your SRS is representative of the population. The z-score is –0.392, which might not be THAT unusual.
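The z-score arithmetic in the last few paragraphs can be checked with Python’s standard library. This is a small sketch, assuming a normal distribution as the text does; the variable names are only illustrative.

```python
from statistics import NormalDist

def z_score(observation, mean, std_dev):
    """(observation - mean) / standard deviation."""
    return (observation - mean) / std_dev

standard_normal = NormalDist()  # mean 0, standard deviation 1

# IQ example: mean 100, standard deviation 15
iq_z = z_score(115, 100, 15)
share_below_iq_115 = standard_normal.cdf(iq_z)                    # about 0.84
share_below_iq_130 = standard_normal.cdf(z_score(130, 100, 15))   # about 0.98
share_within_1_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)  # about 0.68

# Sample mean of 40.0 against a population mean of 50.0, standard deviation 25.50
sample_z = z_score(40.0, 50.0, 25.5)
```

`NormalDist.cdf` returns the share of the population below a given z-score, which is the ‘greater than nearly 84% of the population’ reading used above.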

So, the formula I’d advocate for sample size needs to take into account the standard deviation of the population, how confident you want to be of generalizing your results to the population after the test, what sensitivity you want to measure (ie lift detection) and expected response. That is:

[Equation not reproduced in this transcript.]

where n is sample size, Z is confidence level, r is response rate and l = lift detection. As an example, say we have an expected response rate of 28%, a confidence wanted of 90% (z-score = 1.645) and a minimal lift detection of 5%, the sample size needed in each cell is 5,566. That is, to be 90% confident your results will generalize to the population (9 out of 10 times it will, theoretically), with a usual response rate of 28%, and not wanting to detect a difference unless it is at least 5% (that is, outside 26.6%–29.4% response), you need a total sample of 11,131. That is, for A/B testing you need 5,566 in each (test and control) cell. See?
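The equation itself did not survive in this copy of the text. One form that uses the same four symbols and reproduces the quoted figures is n = 2·Z²·r(1 − r) / (r·l)² per cell. Treat it as a reconstruction fitted to the worked numbers, not as the author’s confirmed formula; the function name is hypothetical.

```python
from statistics import NormalDist

def cell_size_for_lift(response_rate, lift, confidence=0.90):
    """Per-cell sample size under the ASSUMED form n = 2 * Z^2 * r(1-r) / (r*l)^2.

    Z is the two-sided z-score for the confidence level (about 1.645 at 90%).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    detectable_difference = response_rate * lift   # 5% lift on 28% is 1.4 points
    return 2 * z**2 * response_rate * (1 - response_rate) / detectable_difference**2

per_cell = cell_size_for_lift(0.28, 0.05)       # about 5,566
total_sample = 2 * per_cell                     # about 11,131
low_rate_cell = cell_size_for_lift(0.01, 0.05)  # over 200,000 at a 1% response rate
```

With a 28% response rate and 5% lift this gives about 5,566 per cell and 11,131 in total, and at a 1% response rate it gives a cell of roughly 214,000, consistent with the ‘over 200,000’ quoted later in the chapter.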

I have to mention a silly thing that is still going on, I hear it all the time. The answer to the question ‘How large a sample size do I need?’ is often ‘380’. (If not exactly 380 it is very close to 380.) Let me show you where this comes from and why it is wrong. Even stupid.

The formula this uses is:

[Equation not reproduced in this transcript.]

Often marketers test at 95% confidence (a z-score of 1.96) and a 1% response rate is assumed and they only want to accept a 1% error, which translates this formula into a sample size of 380. Now think about this. A 1% assumed response rate means that of the 380 cell only 3.8 will respond. I guarantee that 3.8 (okay, round it up to 4 people) is NOT enough to be confident about. At all. Or if they say 380 are responses, then that cell actually had 38,000 in it, right? See the folly?
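The ‘380’ figure is what the standard margin-of-error formula for a proportion, n = Z²·p(1 − p) / e², gives with these inputs. That this is the formula the text had in mind is an assumption, since the equation is missing here, but it reproduces the number. The function name is hypothetical.

```python
def margin_of_error_sample_size(z, response_rate, error):
    """Standard sample size for estimating a proportion within +/- error."""
    return z**2 * response_rate * (1 - response_rate) / error**2

n = margin_of_error_sample_size(z=1.96, response_rate=0.01, error=0.01)  # about 380
expected_responders = n * 0.01   # fewer than 4 people
```

The second line is the point of the passage: a cell of 380 at a 1% response rate yields fewer than four responders, far too few to compare anything.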

Isn’t this the same problem with the formula I recommend above? No, it is not. Of the 5,566 cell size and a response rate of 28% that means there will be 1,558 responders and I can be confident with that. Or even at a 1% response rate (still 90% confidence and 5% lift) the cell size is over 200,000. And 2,000 responses are enough to test and be confident about. So, do not let them tell you 380 is an adequate sample size. Is it any wonder corporations are in a nosedive?

A/B testing and full factorial differences

A couple of quick notes on very common testing will follow. Did I mention I am not really a testing guy?

We always talk about A/B testing (sometimes called ‘champion/challenger’) and this simply means comparing (even as test vs. control) two cells against each other. The idea is that we randomly chose the participants in each cell and (this is important) the only difference (get that? The only difference) between them is that the test cell has the test treatment and the control cell does not.

Then we measure the average responses of cell A vs. cell B and if they are different enough we say they are statistically/significantly different. That means we have confidence (typically 95%) that when we generalize this to the population the same results happen, on a larger scale. The formula I usually use for response testing is the z-score:

[Equation not reproduced in this transcript.]

At 95% confidence if this formula is > 1.96 then the A response rate is statistically, significantly (and positively – yes this is very important!) different than the B response rate.

As an example, let’s say for the A test we have responses of 1,200 and we sent 10,000. For B we have responses of 950 and we sent 5,000. rA means responses from A, nA means population of A. (rA = 1,200, nA = 10,000, rB = 950 and nB = 5,000.) This calculates to a z-score of –11.53 which is statistically and significantly different: with B outperforming A at 95% confidence.
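The z-score of –11.53 in this example is what the pooled two-proportion z-test produces. That the missing equation is this test is an assumption, but the unpooled version gives about –10.9, so the pooled form is the one that matches the text. The function name is hypothetical.

```python
from math import sqrt

def two_proportion_z(r_a, n_a, r_b, n_b):
    """Pooled two-proportion z-score for response rates r_a/n_a versus r_b/n_b."""
    p_a = r_a / n_a
    p_b = r_b / n_b
    pooled = (r_a + r_b) / (n_a + n_b)   # overall response rate across both cells
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / standard_error

z = two_proportion_z(r_a=1200, n_a=10_000, r_b=950, n_b=5_000)
significant_at_95 = abs(z) > 1.96
b_outperforms_a = z < 0
```

The sign carries the direction: a negative z here means B’s 19% response rate beat A’s 12%.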


Let me make another point that marketers (especially retailers) have a hard time with. In order to effectively calculate and monitor incremental marcom, there needs to be a universal control group (UCG). This means a group of customers that never (ever) get promoted to. This can be a small group, but still statistically significant in order to test. If you do not have a UCG you can only test one treatment compared to another, and never know if it’s incremental (or detrimental for that matter). I realize I’m asking you to set aside a group of customers that will never get a promotion, never get a brand message, etc. This is called investing in the test. If knowledge (or proof) that your marcom is driving incremental revenue to your business is important (and no one would disagree that it is) then you need to invest in the test. Every campaign needs to be designed at least as a test vs. control and the control is the UCG. If you do a business case on the potential revenue you’ll lose from the UCG and compare that to the insight you’ll have about which campaigns are actually increasing the bottom line, investing in a UCG wins every time. Remember the point of analytics is to decrease the chance of making a mistake and UCG is all about that.

BUSINESS CASE

Scott walked into the little conference room, knowing he would again have to explain and struggle with Becky, the director of consumer marketing. Every month she had many ideas about test-and-learn plans and what she wanted to learn from a series of mailings. Every month Scott had to explain to her the concepts of testing, especially the idea of only changing one dimension at a time in order to test. He had thought if maybe he recorded last month’s conversation he would just send the recording and have her press play to re-hear it.

He arrived first. He always arrived first. He estimated in a year he wasted 53 hours waiting for a meeting/phone call to start while everybody else eventually wandered in. Becky and her team joined him about six minutes past the hour.

‘So Scott, we’d like to test our messages again. Really get some learning.’

‘Great, all for it’, Scott said. He always said this.

‘I’ve thought about what you’ve been saying and have put a table together. We’d like to test discounts against different audiences.’ She showed him the table. Note that discount level is applied only once. (See Table 10.1.)

Table 10.1 Testing discounts against different audiences

Cell A   5% discount    Desktop purchase
Cell B   10% discount   Online exclusive
Cell C   15% discount   Purchased > $2,500
Cell D   20% discount   Adding a printer

Scott sighed. ‘Becky, this is the same idea we’ve had before. Compare two customers; one in cell A and another in cell B. If cell B has a higher response/more revenue, is it because of the 10% discounts or because of the online exclusive?’

‘I would say both’, she smiled.

‘But the point of a test is to isolate just one treatment, in order to quantify that stimulus.’ He looked at them. They all smiled, all nodded. ‘What is needed to test this is not a 4 cell but a 16 cell matrix. Like this.’ (He drew Table 10.2.)

Table 10.2 Testing discounts against different audiences in a 16 cell matrix

                     5% discount   10% discount   15% discount   20% discount
Desktop purchase     Cell A        Cell E         Cell I         Cell M
Online exclusive     Cell B        Cell F         Cell J         Cell N
Purchased > $2,500   Cell C        Cell G         Cell K         Cell O
Adding a printer     Cell D        Cell H         Cell L         Cell P

‘Wow’, Becky said. ‘That makes sense. We will need a far greater sample size though, right?’

‘That’s right. This is called full factorial and will detect all interactions. The benefit is in the confidence of the learnings and the cost is in the sample size, which means both time and money. It’s a trade-off, as always.’

‘Okay, we’ll redesign. Let’s also talk about the results of last month’s test.’

‘Great.’

‘Well, in this case the control cell out-performed the test cell. So the test did not work.’

‘What were we testing?’

‘This was to past desktop purchasers. The control was a 10% discount and the test was a 20% discount. In the past the 10% discount is pretty standard so we wanted to see how many more sales happen with a 20% discount.’

‘Makes sense’, Scott said. ‘It seems so weird that the 10% would out-perform the 20%. By how much?’

‘By almost 50% more response, that is, number of purchases.’

‘These were randomly chosen?’


‘Yep’, Becky said. ‘I guess it means our target audience does not need a deeper discount, which is a good thing. They are very loyal and will act without a deeper stimulus. But somehow I doubt it.’

‘So do I. It does not make economic sense. We should investigate the list, make sure both sides got the single treatment, try to see if something was amiss. Each cell was about the same size?’

‘Yeah, very close.’

‘But’, Kristina said, ‘how did we make sure both cells only got this treatment?’

‘What do you mean?’ Scott asked.

‘Nothing happened that I know of to pull these customers out and only get this month’s deal.’

‘And last month the “Get a Free Printer” went out.’

‘And the desktop bundle went out.’

‘And since far more of our customers get the 10% discount than anything else, those that got the 10% discount in this test cell may also have received one or both of the other stimuli. Right?’

‘Yeah, I think so.’

‘Well, if true, that could explain it’, Scott said. ‘Our 10% test cell may have got at least three stimuli, not one.’

Becky sighed. ‘So the test has to be done again?’

‘Probably. If it was important to know what that treatment drove then the answer is yes.’

‘Well, yeah it was. And we’ve had such difficulty with testing anyway – I mean the design of it – to go back and re-test will be a hard sell.’

Scott looked at her. ‘I don’t know how helpful it might be, but we possibly could do a multivariate exercise to try to isolate this test.’

‘What do you mean?’

‘I’m not sure. We might be able to do a model that accounts for all the treatments and still, ceteris paribus, measures just this campaign.’

Kristina looked up. ‘You mean an ANOVA of some kind?’ (Analysis of variance is a general statistical technique to analyse the differences within and between group means.)

‘Yeah, although I’m an econ guy so I’m more comfortable with regression. But some technique that accounts for multiple simultaneous sources of stimuli on revenue.’


Scott went to the whiteboard and drew Table 10.3.

Table 10.3 Multiple sources model

Cust ID   60 day review   Printer promo   DT bundle promo   20% disc promo   # opens   # clicks   # web visits   # calls   Past rev
X         0               1               0                 1                7         3          9              0         1800
Y         900             0               1                 1                8         1          5              2         490
Z         0               0               0                 0                11        4          4              1         800

‘Now’, Scott said, ‘we can include any and all promotions, etc., that we can track and put in this model. The idea is to measure the dollar value of all stimuli.’

‘What if we don’t or can’t get all the information?’

‘We will always miss something. It’s important to include all we know, all we can know, from both a theoretical as well as actual causality assumption. There is a fine line between including too much and missing something important.’

‘Can you explain a bit about that? I’m not sure what you mean’, Kristina asked. She had always had an interest in the modelling process, especially on the more technical side of things.

‘From an econometric point of view, to exclude a relevant variable will bias the remaining parameter estimates, so we need to ensure we have all important theoretically sound independent variables. To include an irrelevant variable increases the standard error of the parameter estimates, meaning that while they are unbiased the variation is larger than it should be so the t-ratios (beta/standard error of beta) will appear smaller than they should be. Thus, it behooves modellers to design a theoretically sound model and collect relevant data.’
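Scott’s two claims can be written down compactly. The sketch below assumes the simplest possible case, a linear model with two explanatory variables; the symbols ($x_1$, $x_2$, $\delta_1$) are illustrative notation and do not come from the whiteboard.

```latex
% True model, where both variables matter:
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon

% Suppose x_2 is left out, and x_2 = \delta_0 + \delta_1 x_1 + u.
% The "short" regression of y on x_1 alone then gives, on average:
E\big[\hat{\beta}_1^{\text{short}}\big] = \beta_1 + \beta_2\,\delta_1

% The t-ratio Scott mentions:
t = \frac{\hat{\beta}}{\operatorname{se}(\hat{\beta})}
```

The bias term $\beta_2\,\delta_1$ disappears only if the omitted variable has no effect ($\beta_2 = 0$) or is unrelated to the included one ($\delta_1 = 0$). An irrelevant variable has $\beta_2 = 0$, so it adds no bias, but if it is correlated with $x_1$ it inflates $\operatorname{se}(\hat{\beta}_1)$ and so shrinks the t-ratio.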

They all looked at him. ‘Sounds good’, Becky said. ‘Let’s talk with IT and collect the data you need and you can put this together for us?’

So Scott got the data together and ran the model and they found the various campaigns’ contribution to revenue that accounted for most other important factors. This type of analysis allowed Scott’s team to offer campaign valuation outside of a strictly testing environment. While each point of view has pluses and minuses, Scott’s valuation method could specifically take into account other (dirty) data issues. Also, his results directly tied to sales, something A/B testing did not do. As mentioned, a background in economics is valuable for a marketing science function.
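To make the idea concrete, here is a minimal sketch of the kind of regression Scott describes, run on synthetic data. Everything in it is an assumption for illustration: the sample size, the dollar effects, the overlap probabilities and the variable names are invented, and it is not Scott’s actual model. It shows only how a ‘dirty’ test can make the 10% cell look better in a naive comparison while a regression with one column per stimulus recovers the discount’s own effect.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 4000

# Hypothetical test design: about half the customers get the 20% offer,
# the rest get the 10% offer (disc20 = 0).
disc20 = rng.integers(0, 2, size=n)

# "Dirty" test: customers in the 10% cell were far more likely to have also
# received the printer and desktop-bundle promotions (assumed probabilities).
p_other = np.where(disc20 == 1, 0.1, 0.7)
printer = (rng.random(n) < p_other).astype(int)
bundle = (rng.random(n) < p_other).astype(int)

# Assumed true dollar effects, used only to generate the synthetic revenue.
revenue = (
    200.0
    + 60.0 * disc20
    + 120.0 * printer
    + 90.0 * bundle
    + rng.normal(0.0, 50.0, size=n)
)

# Naive A/B read: difference in mean revenue between the two cells.
naive_diff = revenue[disc20 == 1].mean() - revenue[disc20 == 0].mean()

# Regression read: an intercept plus one column per stimulus.
X = np.column_stack([np.ones(n), disc20, printer, bundle])
coefs, *_ = np.linalg.lstsq(X, revenue, rcond=None)
disc20_effect = coefs[1]

print(f"naive 20% vs 10% difference: {naive_diff:8.1f}")
print(f"regression estimate for 20%: {disc20_effect:8.1f}")
```

In this made-up setting the naive comparison comes out negative (the 20% cell appears to lose), while the regression coefficient lands close to the assumed +60, because the other promotions are held constant.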

Checklist

You’ll be the smartest person in the room if you:

Remind everyone that they must ‘Invest in the test!’ This typically means using a large enough sample for a control group that will allow a meaningful test.

Point out that it’s difficult to actually control for everything. Simple random selection is only a blunt instrument.

Remember that experiment design, A/B testing (champion vs. challenger) will not give the impact of individual dimensions (what impact price has, or message, or competition changes, etc.).

Demand that the sample size equation incorporates lift.

Make fun of the silly answer (‘N = 380’) to the question ‘How large a sample do we need?’

Shout loud that in all testing each cell can only differ by one thing (one dimension).

Recommend using ordinary regression to account for ‘dirty’ testing.

Part five

Capstone

11

Capstone: focusing on digital analytics

Introduction

Modelling engagement

Business case

Model conception

How do I model multiple channels?

Conclusion

Introduction

This chapter is a capstone of most of what we’ve done before. It’s meant to be a practical application of traditional techniques applied to different kinds of (non-traditional) data.

In the mid-1990s, when the World Wide Web became available, many marketing scientists and others panicked because of the new kind of data. Clickstreams/weblogs were becoming available and many people thought that the new data would need new techniques. They forgot it is still marketing. They forgot it is still consumer behaviour.

You’ve probably surmised, as I mentioned elsewhere, I am not in favour of unsupervised techniques and it was these that many data analysts began to run to. Unsupervised techniques include things like neural networks, various machine learning methods, chaos/catastrophe theory, etc. (If you HAVE to learn these things you will easily find a bucket load of new-fangled algorithms online.) But why would new data require new techniques? When direct mail became available did we invent new techniques? When e-mail became available did we invent new techniques? Regression is still worthwhile regardless of the kinds of data used.

The above is not to say that digital data IS NOT very different than traditional data. I LOVE clickstream data (such as Omniture’s page views) that shows just what page a consumer views, for how long and in what order. That is an amazing tracking of consumer behaviour. And the new social media is bringing about a paradigm shift from outbound marketing to inbound marketing. It’s different kinds of data but why would it require new statistical techniques? Consumers are still behaving, shopping, buying. Right?

New data (BIG DATA!) is bringing about panic because it is MORE data (both in terms of size (including increased variety) and additional behavioural dimensions). New data still tracks a consumer’s awareness, familiarity, consideration, shopping and purchase. So I’d suggest NOT using neural networks and Taguchi methods as a reaction to new data. There might be a place for these things, but it is NOT just because the data is new.

I’m not against new algorithms when needed. I typically do not think they are needed. I am also philosophically opposed to many of the conceptions that seem to be behind these new techniques, in that they try to remove the analyst from the analysis. Many of them are virtually marketed as a voodoo/black box and advocate not really needing analytic expertise to run the operations. That seems to me a formula for massive failure. Not to mention that when these things have been put into the field, I have never seen them do better than traditional econometric techniques. Never. I have had many debates and bets on this very issue over the years. (You know who you are!)

Modelling engagement

When it comes down to it, a firm can only really be successful if it can engage consumers. This is why RFM (recency, frequency, monetary) works, to a certain extent: it (simplistically) finds those customers that tend to be most engaged. The real issue is quantifying engagement: what behaviour is most valuable?

Why quantify engagement?

Because engagement is by definition psychological (its impact is seen in overt behaviour) the metric ‘engagement’ has to be derived indirectly. That is, engagement is a motivator, a stimulus that shows itself in certain overt behaviours. Because engagement is an indicator of interest, depending on the problem solving for the product needed, interest (in the shopping phase) is key to moving the consumer to the purchasing phase. Quantifying engagement can lead to specific marketing actions.

What are the hypothesized factors to drive purchases?

There are several things that cause purchases. Some of these are pricing, seasonality, competition, consumer confidence, campaigns and engagement. These are both blatant as well as latent. These are both internal and external. These are both marketing levers and consumers’ need arousal. But, engagement (interest in the product) is certainly a precursor before any purchasing can be made, regardless of the level of decision making.

What are the issues around designing an engagement model?

Figure 11.1 shows an ‘issue tree’, a technique sometimes used in designing a project. The idea is that the key issues/requirements are stated and solutions or other issues are detailed. This way, focus is on the big picture, and all ‘trouble spots’ as well as necessities are planned for. Yes, this comes from McKinsey.

Figure 11.1 Issue tree

What should an engagement model look like?

Because engagement is latent, there needs to be a technique that accounts for the interactions and discovery of this hidden motivator. But the model must ultimately quantify engagement. It should show what explanatory power engagement has (given seasonality, competition, pricing, marcom, etc.) and how much engagement is worth to the firm. That is, the model must both give a structural analysis in shared variance as well as impact to revenue. Remember Peter Drucker’s admonition: if your project is not increasing satisfaction, decreasing expense or increasing revenue, you should consider NOT doing it.

Since engagement is about both hidden motivations and outright behaviours, what does this mean analytically? It means factor analysis will be used to find the latent motivations. Factor analysis is an inter-relationship technique stolen from psychologists. The idea is that it extracts variance from variables that ‘load’ (correlate together) and then makes a new factor. That is, variables load high or low, depending on the underlying (hidden) factor.

Recall that we used factor analysis to combine independent variables into other variables (factors) that were by definition non-correlated. That is, the resultant factors are uncorrelated with each other but the collection of factors maintains the (distinct, non-overlapping) variance of the independent variables. This is why it tends to work as a correction for collinearity.

Another (and more typical) use of factor analysis is to divine underlying motivations. Conceptually this means that if blatant variables load high onto a factor, it is because they are each motivated by a latent dimension. Then another latent dimension comes into play to motivate the other variables. For example, if we have variables like GPA, income, education, job title, etc. that load high onto one factor we might call that factor ‘intelligence’. There is no variable called ‘intelligence’; we label the factor as such based on which variables correlate together. Thus the same analytic strategy can be applied to engagement. This is the technique that structural equation models (SEM) use.

BUSINESS CASE

Scott was ‘loaned’ to the online software sales team at the end of the year. This team was new and primarily marketed software for small businesses. The software would keep track of the firm’s network, ensuring security and connectivity was updated. It also recommended certain hardware products to upgrade performance, etc.

Scott reported to the GM of the software group.

‘Hi Scott, good to see you’, he said and stood up and shook Scott’s hand. ‘I’ve heard good things about you and frankly we need your help.’

‘Any way I can’, Scott said.

‘Good. We need to understand what online actions indicate interest. When our potential customers come to our website they can browse for the software, click on product demos, download a trial version, download a webinar, chat with a sales engineer, etc. We are trying to quantify those actions that are most indicative of purchase, and then exploit those actions.’

Scott nodded.

‘So’, the GM continued, ‘when a potential customer opts in to receive e-mails, or to join a community, we know that behaviour is obviously one of engagement. We want to know what that engagement is worth. Does only opt-in behaviour provide the path to purchase, or are there other things?’

‘So you want to quantify those clicks – those behaviours – that lead to purchase.’

‘That’s right. Not all behaviours are equally important in indicating engagement. We want to know where in the purchasing chain are number of opens, number of page views, and time on site, etc.’

‘Sure, I see. Which behaviours are bigger drivers of purchasing than others? Which are shopping and latent, which are precursors to purchasing and are blatant? Sounds fun.’

Scott already had an idea as he left the GM’s office. He called his team together and they organized access to data. The main dimensions would be clickstream/page views, primarily white paper downloads, webinars, trial software downloads, number of opens, number of clicks, number of page views, time on site and width and depth of product pages. Opens and clicks refer to e-mail engagement, width of product pages indicates the various software options available and depth of product pages indicates an investigation of all of the specifics for a particular software product. Width and depth are important and different views of customer behaviour. Think of width as if shopping for jeans and tops and shoes and coats. Think of depth as if shopping for jeans, white washed jeans, different sized jeans, return policy, store location, product review of jeans, etc.

Most of the internal clients believed that only gated/registered items (white paper download, trial software download, webinars, etc.) had any real engagement to quantify. This is an obviously deeper behaviour than, say, number of opens and number of clicks. Scott wondered if there were any other behaviours (particularly non-gated) that would quantify as engaged as the opt-in required behaviours.

So he collected the data and ran factor analysis. Two factors accounted for 86% of all the variation of the independent variables. Given the below loadings (Table 11.1), Scott called factor one ‘Window Shopping’ and factor two he called ‘Try it On’. That is, the behaviours of opens, clicks and number of page views, for example, are hypothesized to be motivated by ‘Window Shopping’. Likewise the behaviours of depth of product pages, white paper download and webinars are motivated by a desire to ‘Try it On’. While this seems ultimately intuitive, the way the analysis puts these two latent factors together to explain the blatant behaviours is compelling.

Table 11.1 Factor analysis

Variable               Factor 1          Factor 2
                       Window Shopping   Try it On
Opens                  0.76              0.26
Clicks                 0.84              0.12
Webinar                0.10              0.88
White paper download   0.12              0.82
Software download      0.29              0.86
Page views             0.90              0.11
Time on site           0.77              0.14
Width product pages    0.03              0.09
Depth product pages    0.16              0.77

It’s important to note (for business insights) that the factor ‘Try it On’ is not only gated items, but includes depth of product pages at 0.77. This means there is high engagement in depth of product pages, almost as high as the opt-in behaviours.
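Reading a loadings table like this one is mechanical enough to sketch in a few lines. The snippet below assigns each behaviour to the factor it loads highest on, using the values from Table 11.1. The 0.5 cut-off for a ‘high’ loading is an assumed rule of thumb, not something stated in the text, and the function name is purely illustrative.

```python
# Loadings copied from Table 11.1: variable -> (Window Shopping, Try it On)
loadings = {
    "Opens": (0.76, 0.26),
    "Clicks": (0.84, 0.12),
    "Webinar": (0.10, 0.88),
    "White paper download": (0.12, 0.82),
    "Software download": (0.29, 0.86),
    "Page views": (0.90, 0.11),
    "Time on site": (0.77, 0.14),
    "Width product pages": (0.03, 0.09),
    "Depth product pages": (0.16, 0.77),
}

FACTOR_NAMES = ("Window Shopping", "Try it On")
THRESHOLD = 0.5  # assumed cut-off for a "high" loading


def assign_factors(table, threshold=THRESHOLD):
    """Group variables under the factor they load highest on, if high enough."""
    groups = {name: [] for name in FACTOR_NAMES}
    unassigned = []
    for variable, values in table.items():
        best = max(range(len(values)), key=lambda i: abs(values[i]))
        if abs(values[best]) >= threshold:
            groups[FACTOR_NAMES[best]].append(variable)
        else:
            unassigned.append(variable)
    return groups, unassigned


groups, unassigned = assign_factors(loadings)
for name, members in groups.items():
    print(f"{name}: {', '.join(members)}")
print(f"Loads on neither factor: {', '.join(unassigned)}")
```

Under that assumed cut-off, depth of product pages lands in ‘Try it On’ alongside the gated items, while width of product pages loads on neither factor.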

Model conception

This gave Scott an obvious functional form of the model:

Purchase = window shopping and try it on.

That is, he would regress purchase spend on the two factors (which in turn account for the variation of all the other independent variables and are themselves orthogonal, that is, uncorrelated with each other). When he did that, using the factors as the two independent variables, he achieved an adjusted R² of over 37% and both factors were significant at the 95% level. This means that in driving revenue, engagement itself accounts for more than one third of the impact. The ‘Try it On’ coefficient was 17,573 and the ‘Window Shopping’ coefficient was 5,448. This means that ‘Try it On’ has roughly three times the impact on revenue that ‘Window Shopping’ does. The intercept was 9,801.

Examples applied to customers

Table 11.2 shows three examples of how it works. Note that contact 1050 has a large amount of webinars, did many white paper downloads, downloaded the trial software and searched the website product pages to a significant depth. They obviously opted in and fall into the ‘try it on’ motivation and have high predicted revenue.

Table 11.2 Examples applied to customers

Contact  Engaged revenue  Window shopping  Try it on  Opens  Clicks  Webinar  White paper dl  Trial sw dl  Page views  Time on site  W_prod pages
1050     90,451           –0.005           4.591      34     22      5        7               1            222         666           8
1061     51,523           4.453            0.988      77     71      1        6               1            620         1860          4
1269     37,145           3.445            0.488      55     8       0        0               0            559         111           5

Let’s calculate contact 1050’s engaged revenue using the model.

Engaged revenue = intercept + (Try it On coeff * try it on independent variable) + (Window shopping coeff * window shopping independent variable).

90,451 = 9,801 + (17,573 * 4.591) + (5,448 * –0.005).

Second, note contact 1061 has a different behaviour. They had many opens and clicks (indeed they clicked on nearly every open), smaller number of download actions, but a high number of page views and time on site. They exhibit the window shopping behaviour and thus have smaller predicted revenue.

Last, note contact 1269. They have the smallest number of clicks, smallest number of downloads, least time on site and no depth of product pages. Therefore their predicted revenue is lowest.
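The same scoring can be applied to all three contacts at once. This is a small sketch using the intercept and coefficients quoted above and the factor scores from Table 11.2; the function and variable names are illustrative only.

```python
INTERCEPT = 9_801
COEF_WINDOW_SHOPPING = 5_448
COEF_TRY_IT_ON = 17_573

# Factor scores from Table 11.2: contact -> (window shopping, try it on)
scores = {
    1050: (-0.005, 4.591),
    1061: (4.453, 0.988),
    1269: (3.445, 0.488),
}


def engaged_revenue(window_shopping, try_it_on):
    """Predicted revenue from the two-factor engagement regression."""
    return (
        INTERCEPT
        + COEF_WINDOW_SHOPPING * window_shopping
        + COEF_TRY_IT_ON * try_it_on
    )


predicted = {contact: round(engaged_revenue(*s)) for contact, s in scores.items()}
for contact, value in predicted.items():
    print(f"{contact}: {value:,}")
```

With the rounded scores shown in the table, contacts 1050 and 1269 reproduce the listed figures (90,451 and 37,145). Contact 1061 comes out near 51,423 rather than the listed 51,523, so the printed score or revenue for that row may have been rounded or transcribed slightly differently. The ranking of the three contacts is unchanged either way.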

Scott got his team together, as well as the stakeholders, for the output presentation. He wanted to talk about marketing actions. They came up with the following list:

Sales/hot leads: given the score, these contacts could be turned over to the sales team, that is, engagement can be used as a ‘qualifier’ of a hot lead.

Operations/strategy: given the vastly more valuable ‘Try it On’ behaviour, everything possible should be done to remove barriers to ‘Try it On’.

Marcom/campaigns: message that ‘Try it On’ is available, let every potential contact know that they can download trial software, read a white paper, etc., to get comfortable with the buying decision.

At the quarterly analytic operations meeting, Scott and his team were called out by the VP for their work on engagement modelling. This was a group of all the marketing analysts in the company.

There had been a test put in place based on that analysis and the results were overwhelming: when campaigns mentioned the availability of ‘Try it On’ before purchase, purchase was ultimately 3.5 times more than with those that did not get the message. This translates to huge increases in software revenue. The audience smiled and nodded their heads.

‘I’m a little surprised’, the VP said. ‘This is extremely meaningful to us; we’ve found a simple way to extract millions in extra revenue, based on an analytic project.’

The crowd looked at him.

The VP huffed. ‘When we have a functional breakfast or an after-work get-together, you guys are laughing and clapping and making all kinds of noise. At sports events you scream and cheer. But when hearing of an analytic result that is very positive, you just nod your head.’

Now the audience squirmed a bit.

‘I just mean’, the VP continued, ‘I would think – given you all work in analytics, and have spent years educating yourself about analytics – that when you see an exciting result proving analytics, there would be a lot more hoopla. It’s okay to be glad that your chosen career field really does add value.’

Let me reiterate what this VP is saying. Analytic folks, overall, tend to be a bit quiet – sure, let’s say it’s the logic/rational-dominated side of their brain.

How do you know if you’re an analytic person? You love the simple joy that comes when seeing a variable that should be significant, proved in the data. The satisfied look of wonder pervades your face when the world makes sense. That replaces the constant, cynical caveat-laden weariness we usually have to carry around. That’s what got us into analytics in the first place, right? People are confusing, full of irrational grey areas, but data is data, truth is truth. When well-understood relationships make sense it’s comforting; when insights are found, it’s exciting. Murder solved! Puzzle completed! And it’s consumer behaviour we are trying to predict – this helps us believe that maybe people are NOT so confusing. Okay, infomercial over, back to the VP’s meeting.

‘It’s okay’, the VP said, ‘to acknowledge that analytics works.’

Scott stood up and clapped. ‘Yeah, analytics rocks!’

Most of the audience looked at their watches, a few clapped or cheered a little, some coughed, one or two rolled their eyes. The VP shrugged his shoulders and they all went back to work. Scott sat back down and sighed.

How do I model multiple channels?

Simultaneous equations are the answer to that question. This includes blogs, positive ratings, direct mail, e-mail, etc.

Social media has become THE THING lately, of course. While everyone seems to jump on the revolutionary bandwagon, and rightfully so, there have been other revolutionary bandwagons. In the mid-1990s the internet/WWW became available and widespread. In the mid-1970s it was personal computers and in the 1960s mainframe computers – each of these had huge data implications. So while social media IS a different kind of data, analytically it merely allows more understanding of consumer behaviour. Of course the most exciting aspect of social media (in terms of marketing science) is that for the first time INBOUND marketing is possible.

As such, the ability to model social media is critical. This does not mean it will require new techniques; it is just a different source of data. It does shed light on shopping channels, that is, what does social media have to do with online purchases as opposed to offline purchases? Since everyone is demanding to know how much advertising budget to assign to social media, the impact of social media on purchasing by channel is critical.

That’s what Scott knew was going to happen when he was called into the office of the newly created VP of digital media.

The VP put down her phone and shook Scott’s hand. Scott smiled.

‘I bet I know what you’re going to say’, Scott said. ‘You’d like to know what impact social media has on sales.’

‘Sure, but one complication: we have two sales channels, online and offline. We’d like to know to what extent social media impacts on sales in both the online and offline channel.’

Scott gulped. ‘Well, that’s a little more complicated.’

She smiled. ‘But not too hard for someone that won the Executive Award last year, right?’

‘We’ll do what we can’, Scott said. ‘I’ll get connected with your data people and we’ll see what we can find out.’

‘The issue is important’, she pointed out. ‘All of us are being asked to cut our advertising budgets. We have a portfolio approach. Do we spend in direct mail, e-mail, online or social? Your analysis can help us optimize our budgets.’

‘I see. No pressure.’

‘And we’ll need it in two weeks, to meet our marcom plans.’ She smiled and picked up her phone, the meeting over. Scott walked to his office and knew that the next two weeks would be difficult.

His team collected weekly sales data, both online and offline. Scott would do a time series model. He would use simultaneous equations to model the impact of the marketing mix (product, price, promotions and place) on sales. He’d do a separate model for desktops, notebooks and workstations.

For example, in the desktop model, he wanted to know what price does to explain the sales of desktops by each (online and offline) channel. What about promotions, like e-mail and direct mail? And what about social media: blogs, positive mentions, share of voice, etc? It would be interesting to find out the differences these independent variables had on moving units differently by channel. E-mail and direct mail could be thought of as outbound marketing, whereas social media could be thought of as inbound marketing. From a strategic point of view, the objective was to optimize the budget, and Scott thought that if this model worked that would be a very real use.

Because Scott had already decided on a time series model, ie each row is a weekly aggregation, he did not have to deal with sparse data on a consumer level. That is, if he took the ‘each row is a consumer’ approach, there would be so few matches (especially in terms of social media) that he would not have a large enough sample. Likewise he was going to model units sold as the dependent variable against the whole marketing mix, NOT just use social media as independent variables. That would place far too much attention on just social media and would fly in the face of all the other things known to move consumer behaviour, such as price, season, marcom vehicles, etc.

So the theoretic conception of the model would be:

ONLINE UNITS = f(# direct mails, # e-mails, online price, offline price, social media, etc.)

OFFLINE UNITS = f(# direct mails, # e-mails, online price, offline price, social media, consumer confidence, etc.)

He would have to consider the identification problem and all the other modelling issues, but the above looked like what he needed.

An added thing Scott had to address: the lag structure. It’s well known that many things (especially marketing communication vehicles) have a lag effect on, say, demand. (By lag is meant a weekly variable is moved down one week, so that instead of its actual occurrence on Jan 7 for example, it is lagged to happen on Jan 14.) The actual shape, amplitude and length of that lag structure is the subject of hundreds of academic papers. So the problem is, to restate: what impact do marketing levers (price, website visits, marcom vehicles (including the lag structure), social media) and other effects (seasonality, consumer confidence) have on moving units in both the online and offline channels? This should be seen as a BIG problem, and very important to quantify.
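The ‘moved down one week’ definition of a lag can be sketched directly. The helper below is a minimal illustration, not the SAS routine Scott used; the function name and the weekly e-mail counts are made up.

```python
def lag(series, periods=1):
    """Shift a weekly series down by `periods` rows.

    The first `periods` rows have no earlier week to draw on, so they are None.
    """
    if periods < 0:
        raise ValueError("periods must be non-negative")
    values = list(series)
    padded = [None] * periods + values
    return padded[: len(values)]


weeks = ["Jan 7", "Jan 14", "Jan 21", "Jan 28"]
emails = [1000, 0, 2500, 500]  # hypothetical e-mails sent each week

emails_lag1 = lag(emails, 1)
emails_lag2 = lag(emails, 2)

for week, now, l1, l2 in zip(weeks, emails, emails_lag1, emails_lag2):
    print(week, now, l1, l2)
```

The 1,000 e-mails sent on Jan 7 appear in the lag-1 column on the Jan 14 row and in the lag-2 column on the Jan 21 row, which is what lets a weekly model attribute later sales to an earlier send.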

So Scott collected the data and began working on the model. He settled on SAS’s 3SLS procedure. For social media their ‘listening group’ came up with several variables: number of blogs about the company as well as competitors, share of voice (percent mentions about the company divided by total mentions of all competitors), forums, positive mentions, etc. For the lag structure Scott used SAS’s macro (%pdl) that allows modelling to include the number of lags and amplitude of lags.

Table 11.3 shows the output of the simultaneous (desktop) models. There are several notes about each. First the offline model has an adjusted fit of 80%; that is, the listed independent variables (significant at the 95% level) account for 80% of the movement in the offline channel.

Table 11.3 Impact on offline units

OFFLINE   R-Square 86%   Adj R-Sq 80%

Variable              Parameter estimate   Units per 1,000
Intercept             52,289
Blogs                 0.055                +55 units
Direct mails          0.046                +46 units
Direct mails_lag1     0.039                +39 units
Direct mails_lag2     0.012                +12 units
Direct mails_lag3     0.009                +9 units
Direct mails_lag4     0.004                +4 units
E-mails               0.025                +25 units
E-mails_lag2          –0.04                –40 units
E-mails_lag3          –0.065               –65 units
E-mails_lag4          –0.012               –12 units
Visits                0.048                +48 units
Offline price         –3.417
Online price          1.801
Consumer confidence   21.158
Q4                    192,668

The marcom (direct mail and e-mail) shows a lag effect. Direct mail lags 0–4 periods in its impact and e-mail also lags 0–4 periods in its impact.

Price is interesting. The offline price (in the offline model) is, as expected, negative. This again is the ‘law of demand’; price goes up and units go down. The online price is positive. This means the online channel is a substitute; that is, if the online price increased by, say, 10%, the OFFline demand would increase by 18%.

Now an interpretation is needed, especially of social media and marcom in terms of units. The grey highlights show how many units are expected, on average, from each, in items. That is, multiplying the coefficient by 1,000, for example, means that if there are 1,000 blogs, on average the offline channel benefits by about 55 units. When direct mail is dropped, for each 1,000 pieces there are 46 units increased to the offline channel. Note the e-mail lags are both positive and negative, meaning the amplitude has a different shape. E-mail only has a positive impact when it is first dropped, but over time it is negative (this might reflect e-mail fatigue). The above seems to indicate that direct mail is more impactful than e-mail in the offline channel. Note also how impactful q4 is in the offline channel. This is part of the insight that only an econometric model gives.

Now take a look at the online model. The adjusted R² is a little better. Now observe prices. The online price is again negative as expected but note that while the offline price is positive (indicating substitutability) it is far less impactful than in the offline model. That is, in the online model a 10% increase in the offline price brings about only a 1.2% change in the online units (compared to an 18% impact in the offline model).

It should be no surprise that web visits are far more impactful to online units but look how much more powerful e-mail is. While this also is probably no surprise please note that this marcom channel can be quantified. Observe likewise that in the online model direct mail now turns negative at the longer lags.

Now let’s interpret the social media. It is much more significant in the online model. Share of voice, forums, how many followers the firm has and positive mentions all contribute to the online units. This would probably indicate the firm should do what they can to invest in achieving positive mentions, followers, increasing share of voice, etc.

The last task is to look at the seasonality. Because q4 is dropped (remember the dummy trap?) all the other quarters are referencing that. Note all three are negative (compared to q4) with q2 being the most negative. This helps planning purposes.

The overall message would seem to be: direct mail and consumer confidence are powerful in impacting offline units, but e-mail and social media are not. In the online channel e-mail, social media and website visits are much more impactful. While again this is intuitively compelling, it had not been quantified before.
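One way to see the direct mail versus e-mail contrast is to add up each vehicle’s lag coefficients and express the total per 1,000 pieces sent. The sketch below uses the coefficients reported in Tables 11.3 and 11.4. Summing across lags as a ‘net’ effect is an assumption made here for illustration (lags missing from the tables are treated as zero), not a calculation given in the text.

```python
# Lag coefficients copied from Table 11.3 (offline) and Table 11.4 (online).
# Keys are the lag in weeks; 0 is the week the piece was dropped.
OFFLINE = {
    "direct mail": {0: 0.046, 1: 0.039, 2: 0.012, 3: 0.009, 4: 0.004},
    "e-mail": {0: 0.025, 2: -0.040, 3: -0.065, 4: -0.012},
}
ONLINE = {
    "direct mail": {0: 0.080, 3: -0.073, 4: -0.043},
    "e-mail": {0: 0.113, 1: 0.013, 4: 0.009},
}


def net_units_per_1000(lag_coefs):
    """Sum the lag coefficients and scale to units per 1,000 pieces sent."""
    return round(sum(lag_coefs.values()) * 1000)


net = {
    channel: {vehicle: net_units_per_1000(c) for vehicle, c in model.items()}
    for channel, model in (("offline", OFFLINE), ("online", ONLINE))
}

for channel, vehicles in net.items():
    for vehicle, units in vehicles.items():
        print(f"{channel:8s}{vehicle:12s}{units:+5d} units per 1,000")
```

Under that assumption, direct mail nets about +110 units per 1,000 pieces offline but about –36 online, while e-mail nets about –92 offline and +135 online, which lines up with the overall message above.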

So, given the above model, what are the strategic implications Scott can give? In terms of price: since the online channel is much more of a substitute for offline purchasers, raise the offline price to drive more buyers online and think about adding online exclusives.

In terms of e-mail: decrease the amount of e-mails sent to those that only/mostly purchase offline. Increase the amount of e-mails sent to those that only/mostly purchase online.

In terms of direct mail: decrease the amount of direct mail sent to those that only/mostly purchase online. Increase the amount of direct mail sent to those that only/mostly purchase offline.

In terms of social media: engage in inbound marketing (find X advocates/champions of the firm, institute a blog strategy of community, etc.). Offer promotions in social space to purchase the firm’s online products.

Note all the strategic implications from this model. It addresses most of the marketing mix (product, price, promotion and place) and offers strategies based on quantifying causality.

Table 11.4 Impact on online units

ONLINE   R-Square 88%   Adj R-Sq 83%

Variable              Parameter estimate   Units per 1,000
Intercept             11,805
SOV                   46.92
Forums                0.0037               +3 units
Followers             0.0592               +59 units
Positive mentions     0.016                +16 units
Direct mails          0.08                 +80 units
Direct mails_lag3     –0.073               –73 units
Direct mails_lag4     –0.043               –43 units
E-mails               0.113                +113 units
E-mails_lag1          0.013                +13 units
E-mails_lag4          0.009                +9 units
Visits                0.165                +165 units
Offline price         0.121
Online price          –5.704
Q1                    –1,947
Q2                    –2,323
Q3                    –170

Conclusion

Simultaneous equations provide a powerful (and sophisticated) way of quantifying important (and well-known) interactions. Oversimplification is the bane of good analytics.

Part six

Conclusion

12

The Finale

What should you take away from this? Any other stories/soapbox rants?

What things have I learned that I’d like to pass on to you?

What other things should you take away from all this?

What things have I learned that I’d like to pass on to you?

Wow, we’re here at the end. I hope it was worthwhile and maybe a little fun. If so, tell your friends.

One thing I’d like the rest of the corporate world to know is what a marketing analyst does. That is, not the technical details but what is their function, what is their purpose, why are they important?

Now, I know that if we take a random sample of people all across a number of corporations and ask them, ‘What are the first two words that come to mind, when you think of marketing analysts?’

Most of them will answer, ‘Smouldering sexuality’.

I know it’s true, we deal with real data, we see campaign effectiveness, we can forecast, it is no doubt the sexiest thing in the building. But that is not what I would want them to think about us, top of mind. I would hope that this book – and many like it – and y’all, will help them to think of us as ‘QUANTIFYING CAUSALITY’.

We are able to think in terms of ‘this causes that’, this variable (price) changes that variable (sales) and then – most importantly – quantify it so marketing strategy can act on it. We quantify causality.

I don’t want to hear, ‘Correlation is not causality’ because who cares; we are not talking about correlation, and we hardly ever talk about correlation. Granger causality (invented by economist Clive Granger) asserts that if an X variable comes before the Y variable, and if the Y variable does not come before the X variable, and if, in removing the X variable, the accuracy of the prediction deteriorates, then X causes Y. And we can state it as causality.
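The ‘remove X and see whether the prediction deteriorates’ step can be sketched as a comparison of two regressions. The example below is a simplified, one-lag illustration on synthetic data in which X is built to lead Y by one week; the series, coefficients and seed are all assumptions, and a full Granger test would also check the reverse direction and more lags.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 300

# Synthetic weekly series (assumed data): x leads y by one week.
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()


def sse(design, target):
    """Sum of squared residuals from an ordinary least squares fit."""
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coefs
    return float(resid @ resid)


target = y[1:]
ones = np.ones(n - 1)
restricted = np.column_stack([ones, y[:-1]])            # y's own past only
unrestricted = np.column_stack([ones, y[:-1], x[:-1]])  # plus x's past

sse_r = sse(restricted, target)
sse_u = sse(unrestricted, target)

# F-statistic for dropping the single lagged-x term.
dof = len(target) - unrestricted.shape[1]
f_stat = (sse_r - sse_u) / (sse_u / dof)

print(f"SSE without lagged x: {sse_r:.1f}")
print(f"SSE with lagged x:    {sse_u:.1f}")
print(f"F-statistic:          {f_stat:.1f}")
```

A large F-statistic says that leaving out X’s past makes the prediction of Y noticeably worse, which is the sense in which X is said to Granger-cause Y.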

So, a couple of things I’ve learned that I’d like to pass on to you. These are anecdotes that helped me focus on important things and I hope these stories will help you.

Anecdote #1

My first job was as a salesman in a shoe store. I was 16 and that at least meant I thought everyone over 30 was out of touch and un-cool (it was the mid-1970s).

One day the boss was out and left Ben and me in charge of the store. Ben was a part-time sales guy, had known the boss and his family for years and was semi-retired, over 60, and Jewish.

A woman came in dragging two toddlers with her. Ben was at the counter and the woman set down a pair of shoes and said the strap broke. Ben said he’d help her get a replacement. I saw right away those were NOT our shoes. That woman was about to get a free pair of shoes because of a befuddled, half-addled, maybe senile and confused salesman. I was not able to get his attention to explain to him the error of his ways. He got her another pair of shoes and she also bought a pair for one of her toddlers. I watched them as she paid and checked out and Ben waved at her and smiled.

I went up to him. ‘Ben, what are you doing!? Those were not our shoes!’

‘Oh, you mean for Mrs. Rasmun?’

‘Yes, you gave her a pair of shoes, for free!’

‘Yes, I know her. She’s a returning customer, has about five kids, comes in here all the time.’

‘But, you GAVE her a pair of shoes.’

He looked at me. ‘Yes. If I told her those were not our shoes she would have disagreed and walked out, unhappy, maybe not to ever come back. Maybe not buy her kids their shoes here. I did give her a pair of shoes. I also sold her another pair of shoes, and ensured she was satisfied and would continue to come back.’

I gulped. ‘Oh …’. So much for my coolness.

What I took away from that, other than my narrow-minded profiling, was that smartness is always about focusing on the customer. It’s not what is ‘right’ financially, but what drives a business is customer-centricity. That’s probably why I ended up in marketing, a discipline that (is supposed to) put customers first.

Now, does this mean the customer is always right? Of course not, see above. The customer CAN be crazy. Remember Gary Becker’s irrational demand curve (Becker, 1962). But, according to Peter Drucker, the purpose of a business is to create and keep a customer – get it? KEEP a customer. This means understanding a customer, and this means using analytics.

What to get out of this: being customer-centric is always right.

Anecdote#2IworkedearlyonasananalystataPCmanufacturingfirm.IwasalsofinishingmyPhD;infact,writingmydissertation.Itinvolvedafairlynovelkindofmathematics,calledtensoranalysis(moreusedinphysics/engineeringthanmarketing/economics)andwasaboutmodellingmulti-dimensionaldemand.Myboss(whilenotveryanalytic,wasverystrategic–includingpromotinghisgroupandhimselftoallofhisbosses)wasimpressedwiththeidea.

Somehow he got an appointment with the head guy, three levels above himself, to show my dissertation. This was not about the differential geometry of manifold tensors, but what could be done for the PC manufacturing company in terms of better estimates of demand. So the big meeting was set, about five weeks in advance. This was to give us time to prepare, because – my god! – this was an audience with the CEO, the BIG BOSS. So we (my boss, call him Bob, and I) worked hard on the PowerPoint presentation, spending days on the words and graphics, trying to focus on the use cases of demand for PCs. HR and the big boss’s secretary even made us rehearse, that is, practise our delivery in front of them, to make sure there were no offending phrases or comments (this was probably directed at me – I was seen as somewhat a loose cannon) and they had to approve it. Finally it was all done and we had our time with the BIG BOSS.

We went in and the office was like a museum, glass and brass and marble – it was a corporate temple.

‘So’, my boss, Bob, began, ‘thanks so much for some of your time. Mike here has a very interesting PC model to show you. Mike?’

I cleared my throat and pointed to the overhead projection. ‘Demand is usually modelled as units being a function of several things, including price. It is always about holding everything else constant.’

‘So Bob’, the BIG BOSS said, ‘how are we going to beat the competition on these server wars?’

I looked at him. What?

‘Oh’, Bob stammered, ‘we have some ideas in mind.’

The next 45 minutes was about Bob and the BIG BOSS talking about the server wars and our competition. At the end we shook hands and left. The BIG BOSS had a limp, damp handshake.

What to get out of this: success comes from focusing on what’s important, especially on what’s important to people several levels above you.

Anecdotes #3 and #4

This anecdote is important, because anyone doing marketing science has faced it. And those not in marketing science wonder about it. I’m talking about altering the data, editing the output file, changing the results to be (more) intuitive.

This is the underbelly of marketing science. I know those in other functions wonder if we change the data. Do we make stuff up?

I was talking with a client recently and they told me about a consultant who was predicting the lift they would get on a particular campaign. The consultant estimated a 16% increase, which was WAY MORE than anything ever achieved before. The consultant was sketchy on what the key drivers of this phenomenal success were. The client frankly did not believe it and said so. The consultant asked what it should be and the client replied that about one-tenth of his estimate would be believable. The next week the consultant came back with a revised estimate of, wait for it, 2%. Honest to God! Roughly one-tenth of what their analytics had predicted earlier. Now I’m here to tell you that there is no way a model would predict 16% and then revise it to realistically be 2%, assuming real analytics were done.

That is one of the only instances I know of where they simply changed the output file. By the way, the client did not believe it either (did not trust their analytics) and fired them. Rightfully so.

So, do we change the output file? The answer is no. We can’t. It’s not just about intellectual integrity, it’s about COA (covering our asses!). Altering the data cannot be hidden; changing the results cannot be buried deep enough to never be found. That is, you will be found out, you will be caught and they will know that you altered the results. You will never have credibility again. Ever. It cannot be hidden. Trust me, it will (eventually) be discovered. This is because all data is interrelated, one metric drives another, and one piece affects another because one variable fits together with another to tell the whole story. Changing one part of it will affect all other parts and it will NOT add up. That does not mean you have to broadcast every result to everyone, though. You can emphasize this or direct the conversation to focus on that.

The biggest mistake I’ve ever made (that I know of) was ridiculously simple but very costly. I was a database marketing analyst and my job was to do a model and produce a list of customers most likely to purchase. We sent out over a million catalogues a month (at a cost of about 0.40 each).

I developed a logistic regression model to score the database with probability to buy and used SAS proc rank. I was supposed to give them the top three deciles. Now, SAS proc rank has decile output labelled from 0 to 9, with 0 the highest (the best). I accidentally sent deciles 7, 8 and 9 – the lowest, the worst. Although these were the highest (numbered) deciles, get it? Easy mistake to make, right? Well, the campaign that month did not do well. So I sent a message to everyone that I was working on a new model that I thought might be better for next month. My message was designed as a preemptive strike, to show that I was engaged and working on the problem. That’s what they saw: I was making it better. When the time arrived the following month I used the same model but this time picked deciles 0, 1 and 2 (the best). That campaign worked well. I was congratulated on improving the model. Of course my team knew it was the same model but the right deciles were chosen. Key takeaway: be careful and be upfront and honest (as need be).
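The decile mix-up above is easy to guard against in code. What follows is only a minimal sketch in plain Python, not the SAS job from the anecdote: the function names and the scores are hypothetical, and the labelling (0 = best, 9 = worst) is assumed to mirror the convention described above. Which end of the scale is labelled 0 depends on the ranking direction you ask for, so the last line checks the selection rather than trusting the labels.

```python
def assign_deciles(scores):
    """Return one decile label per score: 0 = highest-scoring tenth, 9 = lowest."""
    n = len(scores)
    # Customer indices ordered from highest score to lowest.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    deciles = [0] * n
    for position, idx in enumerate(order):
        deciles[idx] = position * 10 // n
    return deciles


def select_top_deciles(scores, how_many=3):
    """Indices of customers in the best `how_many` deciles (labels 0 to how_many - 1)."""
    deciles = assign_deciles(scores)
    return [i for i, d in enumerate(deciles) if d < how_many]


def mean_score(scores, indices):
    """Average score of the customers at the given indices."""
    return sum(scores[i] for i in indices) / len(indices)


# Hypothetical purchase probabilities for 20 customers (illustration only).
scores = [i / 20 for i in range(20)]
mailing = select_top_deciles(scores)
everyone = list(range(len(scores)))

# Sanity check before the list goes out: the selected group must score
# above the database average, otherwise the wrong deciles were picked.
assert mean_score(scores, mailing) > mean_score(scores, everyone)
```

A check like the final assertion costs one line and would have caught deciles 7, 8 and 9 being sent instead of 0, 1 and 2.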

Another anecdote from early in my career was about demand estimation. My job was to forecast call volume and, based on that volume, sites were designed for load balancing (among other things). Well, the company had decided to build another site (in Florida) to handle all the calls. They had bought the land and got a building and were hiring people to staff it. Eventually someone thought maybe they should predict how many calls would go there, that is, estimate demand. It so happened that my boss was a well-respected and long-time econometrician and our job was to put up the demand numbers. Everyone knew the demand was huge; the question was just how huge. So I collected data, macro and micro variables, competition, new products, time series trends, etc. The forecast I got was low – way lower than expected. I gulped and looked at it again. The model was forecasting less than half what was needed for a new site. I met with my boss and we went over everything but could only assume, in the best scenario, 60% of what was needed. We gave the real estate team our estimates and they said thanks and then carried on with the building and the hiring for the new site. A year later that site was closed – there was not enough call volume to support it.

Now it would have been easy and acceptable for us to just double the output, right? It would have been easy to make heroic assumptions that made no sense in order to get the demand forecast way higher, right? In this case we just showed the output and shrugged our shoulders and called it a conservative, worst case scenario.

To have altered it would have been akin to what Einstein called The Biggest Blunder of His Life (not that I’m comparing myself to him!). Einstein’s relativity equations showed that because of gravity the universe should be expanding (or contracting). Since no one believed that, including Einstein himself, he added a ‘cosmological constant’ to his equations, in effect a mathematical way to cancel out the expansion. Just over a decade later Hubble discovered that the universe was indeed expanding. Einstein edited the output file! The key takeaway? If it did not work for Einstein it will not work for you. Do not change the results.

What other things should you take away from all this?

Have an implementation plan!

The best analytics in the world is of no use if it is not implemented. Often I have been accused (often rightly so) of doing analytics that is too advanced, and no one understands what it means, no one understands how to use it. This is after I have done it, shown the results and put together a PowerPoint presentation explaining what it is and how it helps. It was typically the nature of my job to do a project and then, basically, go away. Theodore Levitt (who, it could be argued, basically invented marketing as a discipline with his Marketing Myopia article) said that people do not want a one-inch drill; they want to make a hole, one inch wide. I was often guilty of expounding on the coolness of the drill, the wonderful details and specifications of the drill, how the drill would help make a hole, why this drill is better than that drill, etc. I needed to focus on what was the need, not the tool. Therefore I’d suggest some of the following after analytics has been done.

Set up tactical use cases. Put together scenarios of before and after, with and without the analytics.

Train the staff, maybe even with real data. Design simulations or use past data and show how the analytics will be implemented. This may mean designing a tracking report and focusing on the new metrics. It ought to mean actually showing data, the score on the database and the strategic implications of the new insights. Take away the abstract black box: analytics is not voodoo.

Get stakeholders together and talk about their goals (especially those their bonuses are dependent on). Show how the new analytics directly impacts these metrics, and then decide upon stretch goals. I have typically found the bar is rather low. Most firms, even Fortune 100 firms, have little idea what’s going on, have few insights and do not know their customers or competition. They typically market with a shotgun approach and throw money around hoping for the best. A few well-designed analytic projects can make a drastic difference. That’s how you become a superstar.

You should set up check-ins at 30 days after, 90 days after, and 180 days after, etc., to get back together and see how it’s going, what has been happening. You are a consultant and are there to help answer questions, ensure the models are working and are being used correctly.

It’s common to set up test vs. control groups, so make sure you are part of this. Remember, everyone wants to test, but almost no one knows how to design a statistical test.

Find a way to make analytics central to as many divisions and senior people as possible. Get in front of as many decision makers as feasible. Never talk about the technical aspects of the analytics, always talk about the downstream resultant (typically financial) metrics. Instead of saying the t-ratio is significant and positive, tell them that net profit can increase by 2.5% next quarter. That will make them put their phones down and listen.
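For the test vs. control check-ins suggested above, one common way to read the results is a two-proportion z-test on response rates. The sketch below is an illustration under stated assumptions, not a prescribed method: the function name and the campaign counts are hypothetical, and it assumes customers were randomly assigned to the two groups.

```python
import math


def two_proportion_z_test(resp_test, n_test, resp_ctrl, n_ctrl):
    """Two-sided z-test for a difference in response rates (test vs. control)."""
    p_test = resp_test / n_test
    p_ctrl = resp_ctrl / n_ctrl
    # Pooled response rate under the null hypothesis of no difference.
    p_pool = (resp_test + resp_ctrl) / (n_test + n_ctrl)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_test + 1 / n_ctrl))
    z = (p_test - p_ctrl) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical campaign: 10,000 customers in each group.
z, p_value = two_proportion_z_test(resp_test=260, n_test=10_000,
                                   resp_ctrl=200, n_ctrl=10_000)
lift = (260 / 10_000) / (200 / 10_000) - 1  # relative lift of test over control
print(f"z = {z:.2f}, p = {p_value:.4f}, lift = {lift:.0%}")
```

In front of decision makers, the number to lead with is the lift (and what it means financially); the z statistic is there to confirm the lift is not noise.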

Take a class or read a book (or two) on abnormal psychology

Success in the corporate world depends more on your ability to work with people and get them to do what needs to be done than on your technical skills. This book has been about adding tools but really you need to understand people. Everyone is different, the same things do not work on all people, and people evolve and change over time. Just like kids.

All business emotions come from either fear or greed. Discover the primary motivator of the people above you and the people below you. Generally speaking, lower-level folks are tactic-oriented; they need a list of tasks to complete. As they rise in the corporate ranks they tend to become less tactical and more strategic. This means, generally, lower-level folks are motivated by fear (did they get the job done, was it done correctly, can they be blamed?) and higher-level people are motivated by greed (they run the organization and get a bonus, they get perks, newspaper clippings mention their name). As they reach a very high level they are motivated again by fear because they can be blamed for everything.

So you need to know people enough (especially those under you) so that you understand if they are going through a divorce, having trouble with their kids, having drug problems, or are just plain crazy. Some people would prefer recognition to a raise, a flexible schedule to an increase in title, one-on-one time with you instead of the forced frivolities of department off-sites. (BTW, not everyone loves bowling or paintball!) So, invest and discover.

Consumer behaviour is predictable enough

What marketing science deals with is quantifying causality. That is, measuring how one variable impacts another variable. This means predicting consumer behaviour.

I like to point out that the weatherman, every day, predicts the weather. Every day it’s wrong. (Maybe it’s right enough, but you decide how often you have made fun of the bad predictions.) Meteorologists have decades of data and use mainframe computers to develop models. The data they deal with are dew points, temperature, wind, pressure, precipitation, etc. That is, they deal with inanimate objects. All of this, and they still can’t get it right!

We marketing science folks typically have only a handful of years of data to work with. We do this on a PC or so, maybe a server. And we deal with irrational animate consumers. We have no chance to be ‘right’.

But the techniques you’ve seen here help and they help to get it right often enough. It’s often enough to move the needle on a corporation’s financial performance. And by the way, how good does the model have to be? I’ve had a boss not use a model because it was not 100% accurate. (Yes, he was an idiot.)

I like to use the analogy of the evolution of the human eye. Millions of years ago our ancestors were blind and at high risk among predators. Eventually some mutations formed and we developed an ‘eye bud’ that allowed not perfect vision but could detect light from dark, could sense shadowy movements ahead, etc. I propose that while this eye bud was nowhere near perfect (not 100%) the insight (get it, sight?) was enough to allow them to make smarter decisions. Its visual acuity would grow and develop over time but at least it could now slightly ‘see’ large creatures coming toward it, it could tell day from night, maybe find food easier, etc. I propose this was enough to survive.

So, aim high. We came out of the mud.

The bar is low. We can only go up from here. Go get ’em!

Glossary

Average: the most representative measure of central tendency, NOT necessarily the mean.

Censored observation: an observation whose status we do not know. Typically the event has not occurred yet or the observation was lost in some way.

Collinearity: a measure of how variables are correlated with each other.

Correlation: a measure of both the strength and the direction of the relationship between two variables, calculated as the covariance of X and Y divided by the product of the standard deviation of X and the standard deviation of Y.

Covariance: the joint dispersion or spread of two variables, that is, how they vary together.

Design of experiments: an inductive way of creating a statistical test using a stimulus taking into account variance, confidence, etc., by randomization and comparison to a control group.

Elastic demand: a place on the demand curve where a percent change in an input variable produces more than that percent change in an output variable.

Elasticity: a metric with no scale or dimension, calculated as the percent change in an output variable given a percent change in an input variable.

Inelastic demand: a place on the demand curve where a percent change in an input variable produces less than that percent change in an output variable.

Lift/gains chart: a visual device to aid in interpreting how a model performs. It compares by deciles the model’s predictive power to random.

Maximum likelihood: an estimation technique (as opposed to ordinary least squares) that finds estimators that maximize the likelihood of observing the given sample.

Mean: a descriptive statistic and measure of central tendency, calculated by summing the values of all the observations and dividing by the number of observations.

Median: the middle observation in an odd number of sorted observations, or the mean of the middle two observations in an even number.

Mode: the number that appears most often.

Ordinary regression: a statistical technique whereby a dependent variable depends on the movement of one or more independent variables (plus an error term).

Oversampling: a sampling technique forcing a particular metric to be overrepresented (larger) in the sample than in simple random sampling. This is done because a simple random sample would produce too few of that particular metric.

Range: a measure of dispersion or spread, calculated as the maximum value less the minimum value.

Reduced form equations: in econometrics, models in which each endogenous variable is solved for in terms of the exogenous (predetermined) variables.

Segmentation: a marketing strategy aimed at dividing the market into sub-markets, wherein each member in each segment is very similar by some measure to each other and very dissimilar to members in all other segments.

Simultaneous equations: a system of more than one dependent variable-type equation, often sharing several independent variables.

Standard deviation: the square root of variance.

Standard error: an estimate of the standard deviation of a sample statistic (typically the mean), calculated as the standard deviation divided by the square root of the number of observations.

Stratifying: a sampling technique choosing observations based on the distribution of another metric. This is done to ensure the sample contains adequate observations of that particular metric.

Variance: a measure of spread, calculated as the sum of the squared differences between each observation and the mean, divided by the count of observations less one.

Z-score: a metric describing how many standard deviations an observation is from its mean.
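Several of the glossary entries above are really formulas, so it may help to see them written out. The following is a minimal sketch in plain Python that follows the definitions as worded in the glossary (variance divided by the count less one, and so on). The function names and the price and unit figures are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    # Sum of squared differences from the mean, divided by (count - 1).
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def std_dev(xs):
    return math.sqrt(variance(xs))


def std_error(xs):
    # Standard deviation divided by the square root of the number of observations.
    return std_dev(xs) / math.sqrt(len(xs))


def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    # Covariance divided by the product of the two standard deviations.
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))


def z_score(x, xs):
    # How many standard deviations x lies from the mean of xs.
    return (x - mean(xs)) / std_dev(xs)


def elasticity(q_old, q_new, p_old, p_new):
    # Percent change in the output (units) per percent change in the input (price).
    return ((q_new - q_old) / q_old) / ((p_new - p_old) / p_old)


# Hypothetical data: units sold at five price points.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
units = [100.0, 90.0, 80.0, 70.0, 60.0]

r = correlation(prices, units)              # strength and direction
e = elasticity(100.0, 80.0, 10.0, 11.0)     # a 10% price rise, a 20% unit drop
demand_is_elastic = abs(e) > 1
```

In this made-up example units fall by exactly the same amount at every price step, so the correlation is perfectly negative, and a 10% price increase that cuts units by 20% lands on the elastic side of the definitions above.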

Bibliography and further reading

Ariely, Dan (2008) Predictably Irrational: The hidden forces that shape our decisions, HarperCollins

Bagozzi, Richard (ed) (2002) Advanced Methods of Marketing Research, Blackwell

Baier, Martin, Ruf, Kurtis and Chakraborty, Goutam (2002) Contemporary Database Marketing: Concepts and applications, Racom Communications

Becker, Gary (1962) Irrational behaviour and economic theory, Journal of Political Economy, 70 (1), pp 1–13

Belsley, David, Kuh, Edwin and Welsch, Roy (1980) Regression Diagnostics: Identifying influential data and sources of collinearity, John Wiley and Sons

Binger, Brian and Hoffman, Elizabeth (1998) Microeconomics with Calculus, Addison Wesley

Birn, Robin J (2009) The Effective Use of Market Research: How to drive and focus better business decisions, Kogan Page

Brown, William S (1991) Introducing Econometrics, West Publishing Company

Chiang, Alpha (1984) Fundamental Methods of Mathematical Economics, McGraw Hill

Cox, David (1972) Regression models and life tables, Journal of the Royal Statistical Society, 34 (2), pp 187–220

Deaton, Angus and Muellbauer, John (1980) Economics and Consumer Behavior, Cambridge University Press

Engel, James, Blackwell, Roger and Miniard, Paul (1995) Consumer Behavior, Dryden Press

Greene, William H (1993) Econometric Analysis, Prentice Hall

Grigsby, Mike (2002) Modeling elasticity, Canadian Journal of Marketing Research, 20 (2), p 72

Grigsby, Mike (2014) Rethinking RFM, Marketing Insights

Hair, Joseph, Anderson, Rolph, Tatham, Ronald and Black, William (1998) Multivariate Data Analysis, Prentice Hall

Hamburg, Morris (1987) Statistical Analysis for Decision Making, Harcourt Brace Jovanovich

Hazlitt, Henry (1979) Economics in One Lesson: The shortest and surest way to understand basic economics, Crown Publishers

Hughes, Arthur M (1996) The Complete Database Marketer, McGraw Hill

Intriligator, Michael, Bodkin, Ronald and Hsiao, Cheng (1996) Econometric Models, Techniques and Applications, Prentice Hall

Jackson, Rob and Wang, Paul (1997) Strategic Database Marketing, NTC Business Books

Kachigan, Sam (1991) Multivariate Statistical Analysis: A conceptual introduction, Radius Press

Kennedy, Peter (1998) A Guide to Econometrics, MIT Press

Kmenta, Jan (1986) Elements of Econometrics, Macmillan

Kotler, Philip (1967) Marketing Management: Analysis, planning and control, Prentice Hall

Kotler, Philip (1989) From mass marketing to mass customization, Planning Review, 17 (5), pp 10–47

Lancaster, Kelvin (1971) Consumer Demand, Columbia University Press

Leeflang, Peter S H, Wittink, Dick, Wedel, Michel and Naert, Philippe (2000) Building Models for Marketing Decisions, Kluwer Academic Publishers

Levitt, Theodore (1960) Marketing myopia, Harvard Business Review, 38, pp 24–47

Lilien, Gary, Kotler, Philip and Moorthy, K Sridhar (2002) Marketing Models, Prentice-Hall International editions

Lindsay, Cotton Mather (1982) Applied Price Theory, Dryden Press

MacQueen, J B (1967) Some methods for classification and analysis of multivariate observations, in Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, University of California Press

Magidson, Jay and Vermunt, Jeroen (2002) A nontechnical introduction to latent class models, Statistical Innovations white paper [online] http://statisticalinnovations.com/technicalsupport/lcmodels2.pdf

Magidson, Jay and Vermunt, Jeroen (2002) Latent class models for clustering: a comparison with K-means, Canadian Journal of Marketing Research, 20, pp 37–44

Myers, James (1996) Segmentation and Positioning for Strategic Marketing Decisions, American Marketing Association

Porter, Michael (1979) How competitive forces shape strategy, Harvard Business Review, March/April, pp 137–45

Porter, Michael (1980) Competitive Strategy, The Free Press

Samuelson, Paul (1947) Foundations of Economic Analysis, Harvard University Press

Schnaars, Steven P (1997) Marketing Strategy: Customers & competition, The Free Press

Silberberg, Eugene (1990) The Structure of Economics: A mathematical analysis, McGraw Hill

Sorger, Stephan (2013) Marketing Analytics, Admiral Press

Stone, Merlin, Bond, Alison and Foss, Bryan (2004) Consumer Insight: How to use data and market research to get closer to your customer, Kogan Page

Sudman, Seymour and Blair, Edward (1998) Marketing Research: A problem solving approach, McGraw Hill

Takayama, Akira (1993) Analytical Methods in Economics, University of Michigan Press

Treacy, Michael and Wiersema, Fred (1997) The Discipline of Market Leaders: Choose your customers, narrow your focus, dominate your market, Addison Wesley

Urban, Glen and Star, Steven (1991) Advanced Marketing Strategy: Phenomena, analysis and decisions, Prentice Hall

Varian, Hal (1992) Microeconomic Analysis, W W Norton & Company

Wedel, Michel and Kamakura, Wagner (1998) Market Segmentation: Conceptual and methodological foundations, Kluwer Academic Publishers

Weinstein, Art (1994) Market Segmentation: Using demographics, psychographics and other niche marketing techniques to predict and model customer behavior, Irwin Professional Publishing

Publisher’s note

Every possible effort has been made to ensure that the information contained in this book is accurate at the time of going to press, and the publisher and author cannot accept responsibility for any errors or omissions, however caused. No responsibility for loss or damage occasioned to any person acting, or refraining from action, as a result of the material in this publication can be accepted by the editor, the publisher or the author.

First published in Great Britain and the United States in 2015 by Kogan Page Limited

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licences issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned addresses:

2nd Floor, 45 Gee Street, London EC1V 3RS, United Kingdom, www.koganpage.com

1518 Walnut Street, Suite 1100, Philadelphia PA 19102, USA

4737/23 Ansari Road, Daryaganj, New Delhi 110002, India

© Mike Grigsby, 2015

The right of Mike Grigsby to be identified as the author of this work has been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.

ISBN 9780749474171

E-ISBN 9780749474188

British Library Cataloguing-in-Publication Data

A CIP record for this book is available from the British Library.

Library of Congress Cataloging-in-Publication Data

Grigsby, Mike.

Marketing analytics: a practical guide to real marketing science / Mike Grigsby.

pages cm

ISBN 978-0-7494-7417-1 (paperback) – ISBN 978-0-7494-7418-8 (ebk) 1. Marketing research. 2. Marketing. I. Title.

HF5415.2.G754 2015

658.8’3–dc23

2015016002

Typeset and eBook by Graphicraft Limited, Hong Kong

Print production managed by Jellyfish

Printed and bound by CPI Group (UK) Ltd, Croydon, CR0 4YY