Big Data at Vectren
Tom Vargo, Vectren
Thursday, June 16, 2016
2016 Southeast Utilities Day: Reimagine the Digital Utility
2
Session Agenda
• Introduction to Vectren
• Quick History on Big Data
• Finding Value with Corporate Analytics
• Organizational Design & Next Steps
Thank you for having me today
3 Introduction - Vectren at a glance
• HQ Evansville, IN
• +1MM gas customers
• 157k electric customers
• Operating Centers: 48
• Bargaining Unit Contracts: 5
• Service Technicians: 250
• Distribution Field Technicians: 500
• Engineers & Back Office Support: 150
Non-Regulated businesses:
§ Miller Pipeline
§ Minnesota Limited
§ Energy Systems Group (ESG)
750 Mobile Field Workers
4
2012-2013 – Emerging trend
2014 – Topical in news, utilities (esp. smart grid) – Vectren does a proof of concept
2015 – “Big Data” is cool, shows up in Dilbert, House of Cards (season 4, ep. 5-8)
2016 – Establish Corporate Analytics Function
Quick History: Big Data Timeline
2016 – Set it up. 2017–2018 – Make it pay for itself!
5
Buzzword to describe the ability to analyze massive amounts of data not easily analyzed using traditional software tools. What are the hallmarks of Big Data?
• Processes structured (e.g. Oracle) & unstructured data (social media)
• Transcends data “silos” using massive storage – available real time
• Predictive analytics: correlations -> relationships -> patterns -> causation
• Complex data understandable using advanced visualization tools
• Enables new learning, new theories – data reservoir, available to Data Scientists
“A full 90 percent of all the data in the world has been generated over the last two years.” – ScienceDaily
Big Data Defined
6
This is mathematics, i.e. what a Data Scientist studies:
If I can identify the X’s (causation): $\text{Customer Satisfaction} = Y = f(x_1, x_2, x_3, \ldots, x_n)$
Then I may be able to influence outcomes (prediction)
$Y = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \ldots + \beta_n x_n + \varepsilon$
So: Can we use predictive analytics to better understand our customers and increase satisfaction?
Data leads to conversations, which leads to shared understanding, which drives action.
Correlation > Causation > Prediction > Prescription > Value
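To make the regression form on this slide concrete, here is a minimal sketch, not Vectren's actual model. It fits the coefficients β by ordinary least squares, with no intercept term, to match the slide's equation. The two predictors, their values, and the "true" betas are all hypothetical, and the data is generated without noise so the fit can be checked by eye.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares betas for Y = b1*x1 + ... + bn*xn (no intercept)."""
    n = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(n)]
    return solve_linear_system(xtx, xty)


# Hypothetical predictors: x1 = issue resolved on first call (0/1),
# x2 = minutes on hold. Neither comes from the presentation.
X = [[1, 2.0], [0, 5.0], [1, 8.0], [0, 1.0], [1, 4.0]]
# Synthetic satisfaction scores built from known betas (3.0, -0.25), no noise.
Y = [3.0 * x1 - 0.25 * x2 for x1, x2 in X]

betas = fit_ols(X, Y)
print([round(b, 6) for b in betas])
```

With real survey data the error term ε is not zero, so the recovered betas would only approximate any underlying relationship, and a fitted coefficient by itself does not establish causation.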
7 “Tech Stack” – Current State
Viryanet
Banner
Information Bus (Data Grand Central)
Vectren.com
Oracle
Esri GIS
Power On
Maximo
Disaster Recovery (Data TiVo)
We invested in a message hub, or “middleware”, in 2004 and have grown it to over 500 connection points to our critical systems.
Business Process Management (end to end overlay)
Data is the “milk between the cornflakes”
8 Big Data “Tech Stack” Cost Drivers
Cloudera Hadoop Cluster
Viryanet
Banner
Information Bus (Data Grand Central)
Vectren.com
Business Process Management (end to end overlay)
Oracle
Esri GIS
Power On
Maximo
Transaction Journaling (Data TiVo)
Real time “Predictive Analytics” triggering Value generating “Prescriptive Analytics”
“Social Media” feed
Analytics Laboratory Machine Learning
Big Data is all about 3 things: Value, Value and Value
9
§ 20% of the timeline was installation and skills development
§ 70% of the time was data loading
§ Data prep and cleanup is a large part of the job, and we found data “gaps”
§ Producing the actual deliverables, i.e. using the tools, was done in 3 weeks
§ “Answers” can be found, but it requires discipline and time
POC Outcomes: Key Learning
This was like an Indy race car: we did 3 laps at 40 mph (and we had fun and really wanted to open it up…)
10
Big Data POC Results
Results
• Able to connect survey data to other data tables
• Successfully stored multiple sources in Hadoop
• Master Data Mgmt (MDM) tool to create “gold copy” of customer
Implications
• Platform for deep analytics on customer surveys to identify drivers of satisfaction
• Could expand survey to all contact types, regardless of contact method (currently we do Gas Leaks, Outages & New Customer)
• “Cleanse” data, better total view of the customer
Planned Deliverable: Combined Customer Survey data with…
Volume
Veracity
Visibility into the data we had not had before
11
Results
• Built working prototype CSR screen with new information
• Insight 1 – Found Vectren on Twitter/Facebook, able to identify customer “behind” their screen name
• Insight 2 – “In memory” dramatically reduced time for customer lookups – saves handle time
Implications
• Platform for personalization, i.e. like Opower’s “moments that matter”
• Ability to enhance customer experience across all channels
• Ability to leverage social media
• Can operationalize new learning
• Contact channel containment (serve at lowest cost using method preferred)
Planned Deliverable: “Omni-view” of the Customer, “Fast data” and Social Media
Value
Big Data POC Results
Access to data we had not had before
12 CSR Screen – working prototype
Prediction Window
60 day window of activity
Options for Value we had not had before
13
Results
• Deeper analysis of Customer
• Scratched the surface of identifying drivers of satisfaction
Implications
• Allows “directed curiosity”, ability to test hypotheses
Planned Deliverable: Predictive Analytics
Now let’s get to Prediction, but before we start…
Big Data POC Results
So, visibility, access and options, what about prediction?
14
Aren’t customers who respond to our Contact Survey influenced by the size of their utility bills?
New Insights: First Steps
15 Contact Survey: Not So! Size of bill unrelated to overall satisfaction score
When concentric, no statistical difference
“Hmmm, that’s interesting”
16
Customers who respond to our Perception surveys are influenced by the size of their bills
New Insights: First Steps
17
Perception Survey: True! Size of bill significantly related to satisfaction scores
When not concentric, there is a statistical difference
“Hmmm, that’s interesting”
18
Warning: The following graphs are rated “M”, for Maybe
No specific conclusions or recommended actions at this point.
19 Can we dig deeper…?
§ We want to develop a PREDICTIVE MODEL, that can estimate a customer’s response on overall satisfaction given a set of values for predictor variables.
§ But what are the right predictors?
§ How do we know?
§ Do we have the right data? In sufficient volume?
§ There are a host of techniques for developing predictive models; selecting the right one and properly applying it is a competency unto itself.
Before we try to build a predictive model, let’s continue searching for predictor candidates…
20 “Chi-Square” technique: Significant relationship between overall satisfaction and age bracket: p = .02 < alpha = .05
“Hmmm, that’s interesting”
21 Chi-Squared for overall satisfaction versus income bracket: not significant at p = .91 > alpha = .05, so for now, not a lead to follow…
“Hmmm, that’s interesting”
22 Directed Curiosity: How does overall satisfaction respond to smartphone ownership? More so than income or age! (p = .004 < alpha = .05)
“Hmmm, that’s interesting”
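The chi-square comparisons on the preceding slides all follow the same pattern: compute a test statistic from a table of counts, convert it to a p-value, and compare against alpha = .05. The sketch below illustrates that pattern for the simplest case, a 2x2 table with one degree of freedom and no continuity correction. The counts are hypothetical and are not Vectren survey data; the slides' age and income tests use more brackets and therefore more degrees of freedom than this sketch handles.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows = smartphone owner / non-owner,
# columns = satisfied / not satisfied.
observed = [[120, 80], [90, 110]]
ALPHA = 0.05

stat, p = chi_square_2x2(observed)
significant = p < ALPHA
print(f"chi2={stat:.3f}, p={p:.4f}, significant={significant}")
```

A significant result here says only that the two variables are associated in the sample; as the slides note, that makes it a lead to follow rather than a conclusion.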
23
Just for fun, quick detour
24 What if we normalize age to our demographic?
Sat goes down by: 0.6% 0.5% 2.0%
“Hmmm, that’s interesting”
25 If we normalize and project smartphone ownership what happens?
What could we do about that now?
• Use Hadoop + Fast Data to survey more customers across all channels
• Dig deeper on Smart Phone users, hold focus groups to get “inside” their heads
• Continue to improve the customer experience
• One possibility: develop personalized outbound messages on energy use, at a time that is helpful to the customer
For 2015: N – 0.2% O – 0.3% S – 0.9%
For 2020: N – 1.1% O – 1.3% S – 3.8%
“Hmmm, that’s interesting”
26 So what…… 55%?
1 million ads -> 300 clicks -> 3 applications -> 1 approval
0.0001% of ads result in approved customer
Isn’t that as good as a coin toss?
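As a quick check of the arithmetic on this slide, the sketch below recomputes the stage-by-stage and end-to-end conversion rates from the four counts shown. The counts are the slide's own; the variable names are just for illustration.

```python
# Funnel counts as stated on the slide.
funnel = [
    ("ads", 1_000_000),
    ("clicks", 300),
    ("applications", 3),
    ("approvals", 1),
]

# Conversion rate of each stage relative to the previous one.
for (prev_name, prev_count), (name, count) in zip(funnel, funnel[1:]):
    print(f"{prev_name} -> {name}: {count / prev_count:.4%}")

# End-to-end rate: approvals per ad, expressed as a percentage.
overall_pct = funnel[-1][1] / funnel[0][1] * 100
print(f"overall: {overall_pct:.4f}% of ads result in an approved customer")
```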
27
Is it that interesting?
Strategic Rabbit Holes
28 Big Data Potential – Is it that interesting?
Potential areas to apply Big Data
Benefits:
• Eliminating specific exception processing
• Improvements in exception handling should benefit customer satisfaction
• Improved customer service techniques, outbound messaging, richer info on CSR screens, etc.
• Should receive favorable review by regulators
Value - $500k of sustained O&M savings
Ready to execute
Benefits:
Identified 3 uses to reduce costs:
• Early detection of curbing, reduced truck rolls
• Detecting diversion, lost revenue and investigation expense
• Improved meter tracking
Value TBD – estimated range of $300k-$500k O&M savings
Benefits:
Identified multiple uses:
• Asset risk analysis (leaks, facilities damage)
• Analysis of TOU data from AMI
• Engineering & Constr. cost management
• Work mgmt forecasting
• Route optimization
Value TBD – estimated range of: $200k-$400k O&M and $500k-$2M capital
Benefits:
Identified multiple uses:
• Health outcome analysis
• Audit continuous monitoring
Value TBD – est. range of: $100k-$300k O&M
Big Data is all about 3 things: Value, Value and Value
29
Columns: Monthly frequency | Monthly hrs impact | Annualized | High Level business analysis | Advanced Business Rules and automation opportunity | % Reduction of manual handling | Monthly time savings | Annual time savings

Use case 1
Monthly frequency: 10 | Monthly hrs impact: 25 | Annualized: 300
High Level business analysis: Manual effort is expended researching why a bill appears to be abnormal. Because of the nature of VIPs, they are special handled to research root causes and factors that affected the situation. Anecdotally, 90% of the time a plausible / reasonable explanation is identified, however many factors need to be analyzed.
Advanced Business Rules and automation opportunity: With Spotfire, Complex Business Rules, Fast Data and Hadoop: Analyze low bill for estimate/actual history. Compare estimate to multiple years same month and to peer meters. Analyze weather during billing period against norm. If all tests pass, there is no issue with the meter, so auto-insert a bill message that explains the exception: Bill was (high or low). Read was actual. Weather was cooler than normal. Energy usage was xx. This is first actual, previous were estimates. Explain that customers with similar homes also experienced reduced bills due to weather. If the energy reduction wasn't the same as peer meters, add a conservation message.
% Reduction of manual handling: 90% | Monthly time savings: 22.5 | Annual time savings: 270

Use case 2
Monthly frequency: 20 | Monthly hrs impact: 480 | Annualized: 5760
High Level business analysis: A process will improve estimation to prevent overstated consumption, which causes the negative and cancel-rebill to be performed. This process will be on an account by account basis outside of Banner.
Advanced Business Rules and automation opportunity: With Spotfire, Complex Business Rules, Fast Data and Hadoop: Analyze consumption history and factor in weather data to produce a more accurate proration, to be used in lieu of the current Banner methodology, to avoid cancel-rebill processing and improve the customer experience.
% Reduction of manual handling: 25% | Monthly time savings: 120 | Annual time savings: 1440

Use case 3
Monthly frequency: 21 | Monthly hrs impact: 630 | Annualized: 7560
High Level business analysis: A considerable amount of time is spent researching whether a meter has become non-registering, due to lack of valid appliance information or a good record of premise occupancy. Customer contact must be made to verify if usage is occurring.
Advanced Business Rules and automation opportunity: Analyze historical consumption patterns for multiple years. Analyze weather during billing period against norm. Identify if meter age correlates to other confirmed non-registering meter types for accurate identification.
% Reduction of manual handling: 50% | Monthly time savings: 315 | Annual time savings: 3780

Use case 4
Monthly frequency: 21 | Monthly hrs impact: 336 | Annualized: 4032
High Level business analysis: Current resources and systems do not allow an effective and efficient method of identifying slowing meters. This equates to lost revenue due to accurate usage not registering, and eventually the meter no longer captures usage.
Advanced Business Rules and automation opportunity: Analyze historical consumption patterns for multiple years. Analyze weather during billing period against norm. Identify if meter age correlates to other confirmed non-registering meter types for accurate identification.
% Reduction of manual handling: 50% | Monthly time savings: 168 | Annual time savings: 2016

Use case 5
Monthly frequency: 21 | Monthly hrs impact: 336 | Annualized: 4032
High Level business analysis: Current resources do not allow for trending and analysis of off-cycle reads that may be obtained once AMR is fully implemented.
Advanced Business Rules and automation opportunity: Analyze each off-cycle read that is obtained against past month on-cycle usage and historical usage during the same time frame. Analyze weather during billing period against norm to look for meter issues. Analyze meter based on age and model to look for trends with meter issues.
% Reduction of manual handling: 50% | Monthly time savings: 168 | Annual time savings: 2016

Total FTE Hours: 21684 | Net FTE Hours Savings: 9522
Approximate FTEs: 10.425 | $476,100
Initial Revenue Mgmt Use Cases: $500k O&M Hard Savings identified
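The totals on this slide can be re-derived from the table's own rows. This is a minimal sketch that assumes a 2,080-hour FTE year and a $50 per hour labor rate. Neither figure is stated on the slide; both are inferred because they reproduce the slide's 10.425 FTEs and $476,100.

```python
# (monthly frequency, monthly hours impact, % reduction of manual handling)
# for the five revenue management use cases, as listed in the table.
use_cases = [
    (10, 25, 90),
    (20, 480, 25),
    (21, 630, 50),
    (21, 336, 50),
    (21, 336, 50),
]

HOURS_PER_FTE_YEAR = 2080  # assumption: 52 weeks x 40 hours
RATE_PER_HOUR = 50         # assumption: implied by $476,100 / 9,522 hours

# Annualize the monthly hours, then apply each row's reduction percentage.
annual_hours = sum(hours * 12 for _, hours, _ in use_cases)
annual_savings_hours = sum(hours * 12 * pct / 100 for _, hours, pct in use_cases)
ftes = annual_hours / HOURS_PER_FTE_YEAR
dollars = annual_savings_hours * RATE_PER_HOUR

print(annual_hours)          # total FTE hours
print(annual_savings_hours)  # net FTE hours savings
print(ftes)                  # approximate FTEs
print(dollars)               # dollar value of the hours saved
```

Under these assumptions the $476,100 is the value of the hours saved, not of the total hours, which is consistent with the slide rounding it to "$500k O&M hard savings".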
30 Other Potential Uses - $1.7M O&M identified, page 1 of 2:

Columns: Use Case Opportunity | High Level business analysis | Impact Description | Benefit Description | Benefit Value ($ thousands)

Customer / Billing

Curbing Detection
High Level business analysis: Combine Meter Reader data with Billing usage data and identify irregularities. Evaluate current usage against historical usage, weather and peer meters. Identify any out of range conditions. Compare meter reader history to identify patterns of abnormality.
Impact Description: Causes high/low investigations. Time spent with vendor to prove out the issue.
Benefit Description: Saved time and effort from high/low investigations. Reduced re-read truck rolls. Improved customer sat.
Benefit Value: $100

Meter Tracking
High Level business analysis: Link Banner to Oracle Financials and Meter Inventory System (MITS). Real time tracking and reconciliation of meters being purchased, installed, and set up in Banner. Talk to Jon on this: look at all business and reconcile to all meters to ensure every one is billing.
Benefit Description: Minimize revenue loss from meter not in Banner.
Benefit Value: $100

Diversion Identification
High Level business analysis: Link the LUG data and Gas Operations with billing and ESRI; then we could potentially identify diversion quickly and pinpoint it within a certain geographic radius.
Benefit Description: Minimize lost revenue.
Benefit Value: $100

Personalized Messages like OPower
High Level business analysis: Specific messages on: Actions to reduce energy consumption. Products/services to reduce energy. Early warning on high bills. Peak days warning. Outages.
Benefit Description: Increase customer satisfaction. Reduced subscription costs for Opower Messaging.
Benefit Value: $100

Analyze Mass Accepted Low Bills
High Level business analysis: Currently low usage bills are mass accepted (confirm with Rob). Analyze low bills for patterns and explanations and operationalize personalized messages to customers as bill inserts or text messages explaining why, for ones that follow a standard pattern (weather, estimates/actual reconciliation, etc.).
Impact Description: Causes extensive manual review and response.
Benefit Description: Save FTE manual labor review time. Reduce bill complaint calls to contact center. Reduce Regulatory complaints and negative social media events. Increase customer satisfaction.
Benefit Value: $476

Optimize Marketing Campaign Effectiveness
High Level business analysis: Use predictive analytics to optimize adoption rates for customer targeted programs such as conservation, budget bill, etc.
Benefit Value: TBD

Energy Delivery

Engineering
High Level business analysis: Analyze estimates across engineers and against actuals to determine if there is a standard approach and whether what is estimated matches the final as-built.
Benefit Description: Improved design $
Benefit Value: $200

Facilities Damage
High Level business analysis: Analyze historical damage cases to identify any patterns that are preventable.
Benefit Description: Reduced damage claims $
Benefit Value: $200
31
Columns: Use Case Opportunity | High Level business analysis | Impact Description | Benefit Description | Benefit Value ($ thousands)

Energy Delivery

Leaks
High Level business analysis: Analyze leak history and look for patterns that may provide early warning or shift priorities for replacement.
Benefit Description: Safety
Benefit Value: TBD

Truck rolls
High Level business analysis: Analyze various truck rolls and look for patterns to validate cause and avoid expense of the roll.
Impact Description: $ truck roll = $60
Benefit Value: $100

Work assignment / Routes
High Level business analysis: Analyze assigned work routes, break in work and re-routing for efficiency improvements.
Impact Description: mom budget increased over 6 years 14m - 16.5m
Benefit Description: $2.5M labor if we can get route efficiency back to 2009 baseline
Benefit Value: $100

Work Forecasting & Planning
High Level business analysis: Load years of work history and labor type/costs and create simulation models that allow you to manage timing, labor pool, etc. to reduce costs.
Benefit Description: $ reduced labor costs by optimizing internal/contractor crew mix, shifting work to take advantage of time of year, optimizing use of large assets (e.g. backhoe)
Benefit Value: $200

Safety accident prevention: Personal injuries, MVAs
High Level business analysis: Combine Maximo, PowerOn, Viryanet, Personnel records, and safety records to identify patterns of injuries. Is it years of service? Time of year?
Benefit Description: Property and insurance costs, health care, days away with restricted duty or lost time
Benefit Value: $50

Electric reliability
High Level business analysis: Study where outages have been, asset performance (lightning strikes, arrestor failure, etc.)
Benefit Value: TBD

DIMP
High Level business analysis: Risk and safety mgmt system

ADMS

Human Resources

Health, Wellness and Retirement programs
High Level business analysis: Refer to Sep 2015 HR Magazine article on Big Data. Suggests using health and wellness data and campaigns to study outcomes and adjustments to programs. Some areas they were researching: Where can employees get top-level medical care at the best prices? Which workers are at risk for becoming seriously ill? They also studied retirement planning and employee investments.
Benefit Value: $50

TOTAL POTENTIAL SAVINGS $1,776
Other Potential Uses - $1.7M O&M identified, page 2 of 2:
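The $1,776 total can be cross-checked by summing the benefit values across both pages of the table. The sketch below does that, treating the values as thousands of dollars; that unit is an assumption based on the "$1.7M" in the slide title. The pairing of each value with a use case is a reading of the flattened table, and rows marked TBD or left blank are stored as None and skipped.

```python
# Benefit values in $ thousands (assumed unit); None = TBD or not stated.
benefits = {
    "Curbing Detection": 100,
    "Meter Tracking": 100,
    "Diversion Identification": 100,
    "Personalized Messages like OPower": 100,
    "Analyze Mass Accepted Low Bills": 476,
    "Optimize Marketing Campaign Effectiveness": None,
    "Engineering": 200,
    "Facilities Damage": 200,
    "Leaks": None,
    "Truck rolls": 100,
    "Work assignment / Routes": 100,
    "Work Forecasting & Planning": 200,
    "Safety accident prevention": 50,
    "Electric reliability": None,
    "Health, Wellness and Retirement programs": 50,
}

# Sum only the rows that carry a stated value.
total = sum(value for value in benefits.values() if value is not None)
print(f"Total potential savings: ${total:,}k")
```

The $476k line is the same revenue management figure derived on the previous slide, so the remaining use cases account for the other $1.3M.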
32
Big Data – Organization – 2016
CIO, VP of IT
Director, Strategic Technology
Sr. Data Scientist, Corporate Analytics Laboratory
Lead Systems Analyst, Corp Analytics - Tech
Systems Analyst, Corp Analytics - Tech
Organization
• Centralized in IT but available to all business units
• Laboratory is research center
• Technology group works with Laboratory to enable research, and to implement “operationalized” changes
Manager, Corporate Analytics Technology
33
Big Data – Mature Organization - 2018
CIO, VP of IT
Director, Strategic Technology
Manager, Corporate Analytics Technology
Manager, Corporate Analytics Data Science
Data Scientist, Customer Operations
Data Scientist, Energy Operations
Data Analyst, Big Data Lab
Data Scientist, Financial, Audit, Risk
Data Analyst, Big Data Lab
Lead Systems Analyst, Corp Analytics - Tech
Systems Analyst, Corp Analytics - Tech
Systems Analyst, Corp Analytics - Tech
Systems Analyst, Corp Analytics - Tech
34
APPENDIX
35
Big Data Applied – Non-Utility – Early Adopters
• CME Group: Aggregation and analysis of real-time market data to automate processes, modify algorithms, and keep up with business (trader) demand.
• Procter & Gamble: Leverages large volumes of data to analyze performance down to the country, territory, product line, and store level. This provides insight into the levers P&G can pull, such as pricing, advertising, and product mix, to optimize sales.
• Macy’s: Customer big data to understand behavior and attitudes, product placement, cross-selling and upsell for OMNI Channel marketing.
• GM: Large volumes of materials costs, sales volume and growth trends to determine brand profitability by country.
• AT&T: Increased free cash flow by 1/3 by reconciling various systems’ accounting data and methods.
• Planet Fitness: Equipment usage and rotation schedule based on floor location.
• Healthcare: Clinical pathways, patient case management.
• CDC: Contagion impact analysis.
36
Big Data Applied – Utility Peers
• Centerpoint: Total view of the customer, master data management.
• St. Louis Sewage: Infrastructure modernization and regulatory rate case.
• BC Hydro: Analyzed data from over 300 Energy systems to improve outage management, restoration and customer notification.
• ERCOT: Studied wholesale market data to improve operating efficiencies by using more rapid and detailed pricing and scheduling (i.e., better price signals for locating generation and transmission).
• PJM: Analyzes big data from weather models and energy distribution across the 13 Northeastern States to predict and control necessary energy distribution and routing.
37
Thanks for Having Me!