Big Data Analytics Summit - April, 2014

Big Data Analytics Industry Perspective Shankar Radhakrishnan

description

Presentation on Big Data Analytics at Vellore Institute of Technology, Chennai

Transcript of Big Data Analytics Summit - April, 2014

Page 1: Big Data Analytics Summit - April, 2014

Big Data Analytics !

Industry Perspective

Shankar Radhakrishnan

Page 2: Big Data Analytics Summit - April, 2014

Topics

• Market Research
• Market Trends
• Big Data Analytics in
  • Banking and Financial Services
  • Insurance
  • Travel and Hospitality
  • Retail
  • Life Sciences
  • Manufacturing
  • Telecommunications
• Challenges vs. Opportunities
• Q & A

Page 3: Big Data Analytics Summit - April, 2014


Page 4: Big Data Analytics Summit - April, 2014

Market Research


Page 5: Big Data Analytics Summit - April, 2014

Key Trends Driving Big Data Analytics, by Industry

Financial Services
▪ Customer Insights – Integrating transactional data (CRM/Payments) and unstructured social feeds
▪ Regulatory Compliance – Risk exposures across asset classes, LOBs and firms
▪ Fraud Detection in credit cards & Financial Crimes (AML) in banks

Travel, Hospitality & Retail
▪ Customer Centricity – Customer behavior analysis from omni-channel retailing & social feeds
▪ Markdown Optimization – Improve markdowns based on actual customer buying patterns
▪ Market Basket Analysis – Narrow down market basket analysis by demographics

Life Sciences
▪ Improve Targeting & Predictions – Automatic detection of Adverse Drug Effects (ADEs)
▪ Patient Data Analysis – Longitudinal Patient Data (LPD) analysis
▪ Predictive Sciences – Analyze preclinical side effect profiles of marketed drugs

Healthcare (Payers & Providers)
▪ Cost of Care – Drug effectiveness & cost of care analysis based on Electronic Medical Records (EMR)
▪ Self-Service Healthcare – Increase in mHealth & eHealth to allow consumer access to health information
▪ Claims Analytics – Analyze insurance claims data for fraud detection & preferred treatment plans

Communication, Media & Entertainment
▪ Discover churn patterns based on Call Data Records (CDRs) and activity in subscribers’ networks
▪ Digital Asset Management (DAM) – Analyze & capitalize on digital data assets

Manufacturing
▪ Proactive Maintenance & Recommendation – Sensor monitoring for automobiles, buildings & machinery
▪ Energy Efficiency – Leveraging smart meters for utility energy consumption
▪ Location or Proximity Tracking – Location-based analytics using GPS data

Hi-Tech
▪ Extend and complement the conventional information supply chain with a big data path
▪ Predictive analysis and real-time decision support

Page 6: Big Data Analytics Summit - April, 2014

John calls a customer care executive at the bank. He is irritated with the services offered to him and is showing signs of making a switch.

The executive validates the customer’s identity and pulls up an application powered by Big Data that presents all the relevant information needed to make a decision. The Big Data application converts his speech to text in real time and identifies his propensity to churn.

Based on John’s tonal sentiment, the application immediately pulls up the top 5 offers or decisions to take, based on the Customer Persona information, which contains likes/dislikes, past experiences, preferred channels, CLV (Customer Lifetime Value), etc.

Well Informed Customer Service Executive


Page 7: Big Data Analytics Summit - April, 2014

Social media

Depositions

Complaints

Voice Data

Unstructured Data Speech to Text Conversion

Decision Engine

Analytical System

Customer Persona

• Customer Persona
• Demographics
• Top interactions
• Channel Preferences
• Dis-satisfiers
• Customer Lifetime Value
• Recent Contact History
• Customer Sentiment
• Trend during the call

Customer’s state of mind

Sentiment Analysis

Other Channel Information (ATM, Branch)

Big Data Warehouse

Traditional Warehouse

• Customer Executive Dashboard presents all the intelligence required to make a decision
• The decision engine also presents important decisions to be taken for the particular customer issue

Well Informed Customer Service Executive


Page 8: Big Data Analytics Summit - April, 2014

Fraud Pattern Analysis & Detection

Envisaged Benefits
▪ New fraud patterns can be identified by building ‘analytical models’ to run against X years of history data
▪ ‘Web crawling’, ‘contextual text analysis’ and ‘Natural Language Processing’ allow fraud behavior to be identified from social media, which may increase the fraud detection success rate
▪ ‘Real time’ models capture behavioral patterns and run pattern analysis against history data to evaluate fraud case validity. The model is self-learning and updates the ‘fraud pattern master sets’, bringing ‘artificially intelligent’ fraud pattern detection and analysis
▪ ‘Real time’ alerts (refresh rate on the order of 0.5 to 1 minute) to fraud analysts about ‘self-learned’ fraud patterns based on new customer behavior patterns

Process
▪ Formation of key value groups on the order of XcY, i.e. "X choose Y" combinations (where X is the number of attributes relevant to fraud and Y is the number of attributes that should be combined to identify patterns)
▪ High-speed history data loading from source systems
▪ Efficient real-time fraud detection by identifying patterns in customer behavioral events and processing them against X years of history data

Scenario
▪ Formation of fraud patterns using
  • Real-time data coming from different departments such as IVR, Web, Customer Profile, Transactions, etc.
  • Real-time mining and analysis of history data to form prior patterns
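The key value groups described under Process can be pictured as every Y-sized combination of the X fraud-relevant attributes. The sketch below is only an illustration: the attribute names, the event dictionary and the function `key_value_groups` are hypothetical, since the slide does not name the attributes or the storage format.

```python
from itertools import combinations
from math import comb

# Hypothetical fraud-relevant attributes (X = 4); the deck does not name them.
ATTRIBUTES = ["channel", "merchant_category", "country", "hour_bucket"]


def key_value_groups(event, y):
    """Return one composite key per Y-sized combination of attributes."""
    keys = []
    for attrs in combinations(ATTRIBUTES, y):
        # Each key pairs the chosen attributes with the values seen in this event.
        keys.append(tuple((name, event[name]) for name in attrs))
    return keys


event = {
    "channel": "WEB",
    "merchant_category": "electronics",
    "country": "IN",
    "hour_bucket": "night",
}

groups = key_value_groups(event, 2)
# The number of groups equals "X choose Y".
print(len(groups), comb(len(ATTRIBUTES), 2))
```

Each composite key could then be counted over history data, so that rare or previously fraudulent combinations stand out. The group count grows quickly with X and Y, which is one reason the slide stresses high-speed loading.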


Page 9: Big Data Analytics Summit - April, 2014

Legacy Fraud Data

Customer Profile Data

Social Media Data

Card Transaction Data

Decision Engine

Approval/Denial Decision

History Data Processing to find Fraud Patterns over years

Real-time Customer Behavior Analysis for Fraud Detection

Real time Analysis of behavior patterns

Real time update to Decision Engine

Self Learning Fraud Detection


Page 10: Big Data Analytics Summit - April, 2014

Cross Channel Analytics

John exhibits a specific pattern when he avails services. He always visits the bank when he wants to deposit a check. He prefers to do most other operations online. He has recently started paying his utility bills through mobile.

• Analytical Solution integrates Customer transactions through different channels and reveals insights on customer’s channel preferences and activities.

• It also integrates data from call centers, surveys and complaints and measures Customer Experience.

• It reveals customer activities across channels, which are normally not available at a single customer touch-point, to help deliver superior service

• It reveals opportunities to consolidate channels and optimize cost of operations by incentivizing customers to choose one medium over another

Analytical Solution produces:

• Dominant Path Analysis specifying which channel is used by John for which events
• Service Behavior Segmentation
• Customer Journey analysis
• Root Cause & Repeat Issue analysis
• Longitudinal analysis on customer preference changes

Helps the bank deliver superior service and also optimize cost on specific channels
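As a rough illustration of the Dominant Path Analysis output listed above, the sketch below counts which channel a customer uses most often for each event type. It is a simplification under stated assumptions: the interaction log, its field layout and the function `dominant_channels` are hypothetical and are not taken from the presentation.

```python
from collections import Counter, defaultdict

# Hypothetical cross-channel interaction log: (customer, event, channel).
interactions = [
    ("John", "check_deposit", "branch"),
    ("John", "check_deposit", "branch"),
    ("John", "balance_inquiry", "online"),
    ("John", "funds_transfer", "online"),
    ("John", "utility_bill_payment", "online"),
    ("John", "utility_bill_payment", "mobile"),
    ("John", "utility_bill_payment", "mobile"),
]


def dominant_channels(rows, customer):
    """Map each event type to the channel the customer uses most for it."""
    counts = defaultdict(Counter)
    for who, event, channel in rows:
        if who == customer:
            counts[event][channel] += 1
    # most_common(1) returns the single (channel, count) pair with the top count.
    return {event: c.most_common(1)[0][0] for event, c in counts.items()}


paths = dominant_channels(interactions, "John")
for event, channel in sorted(paths.items()):
    print(event, "->", channel)
```

Running the same count over successive time windows would give a simple version of the longitudinal view of preference changes, such as John's recent move to mobile for utility bills.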


Page 11: Big Data Analytics Summit - April, 2014

Analytics

Cross Channel Analytics

Big Data Warehouse

Dominant Path Analysis- channel usage info

Service Behavior Segmentation

Repeat Issue analysis

Query Drill-down

Ad hoc Reports

Predictive Modeling

Statistical Analysis & Text Mining

Optimization

Root cause analysis

Call Reasons analysis

Customer Journey analysis

Structured data

Web & Mobile

ATM / Branch

IVR, Call Records, Notes

CRM Data

ACH / Wire Transfer / Other channels

Unstructured Content / Logs

DW

Transactions

Mailings, Offers, Lists

Other Channels

Survey

Complaints


Page 12: Big Data Analytics Summit - April, 2014

Analytics Data Mart

Member profiling based on profile, demographic, social media and history data. Identification of key predictor variables for customer churn

Member profiling and variable identification

Termination prediction modeling

Termination prediction modeling engine to determine “probability of termination” at each member level

Member Prioritization Matrix

List of members with high likelihood of termination

Retention Target list generation

Alternate product recommendation engine

List of suitable products for each customer

Analyze profitability of each of the recommended products

Create most optimal and effective Retention campaigns

Personalized Retention Plan

Churn and Retention Analytics


Page 13: Big Data Analytics Summit - April, 2014

Analyze the customer’s search pattern through weblog analysis using big data, e.g. the rate or amenity the customer prefers.

Step 1: Customer starts the search on the website.

Drill down into specific search patterns and analyze the customer’s rate preference or amenity expectation at a particular rate, e.g.:

Step 2: Customer selects a destination. Search displays all the hotels; the customer refines the search by selecting a price range or sorting by price, and then leaves without booking. The inference is that the customer didn’t find hotels at his/her expected rate.

Step 3: Customer selects a destination. Search displays all the hotels; the customer refines the search by selecting a preferred amenity, e.g. swimming pool, Wi-Fi, etc., and then leaves without booking. The inference is that the customer didn’t find hotels with the expected amenities.

Pop up the right offers to customers when they search, which in turn increases customer attraction and sales. The revenue management team can use this data to come up with the ideal rate. The search pattern can also be used to improve amenities at individual properties.

Step 4: This data can be forwarded to the revenue management team to set up the right/competitive rates in the right geography.

Step 5: This data can be forwarded to properties as well for amenity improvement.
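The inference in Steps 2 and 3 can be sketched as a small rule-based classifier over weblog sessions. This is a minimal sketch: the event names (`search`, `filter_price`, `sort_price`, `filter_amenity`, `book`) are hypothetical, and the look-to-book ratio is assumed here to mean search sessions per booking, a definition the slide does not spell out.

```python
def classify_session(events):
    """Label one search session from its ordered list of event names."""
    if "book" in events:
        return "booked"
    # Price refinement is checked first, so a session with both kinds of
    # refinement is attributed to price in this simplified sketch.
    if "filter_price" in events or "sort_price" in events:
        return "abandoned_after_price_refinement"
    if "filter_amenity" in events:
        return "abandoned_after_amenity_refinement"
    return "abandoned_other"


def look_to_book_ratio(sessions):
    """Search sessions ("looks") divided by sessions that end in a booking."""
    looks = sum(1 for s in sessions if "search" in s)
    books = sum(1 for s in sessions if "book" in s)
    return looks / books if books else float("inf")


# Hypothetical sessions reconstructed from weblogs.
sessions = [
    ["search", "select_destination", "filter_price"],
    ["search", "select_destination", "filter_amenity"],
    ["search", "select_destination", "sort_price"],
    ["search", "select_destination", "book"],
]

labels = [classify_session(s) for s in sessions]
ratio = look_to_book_ratio(sessions)
print(labels, ratio)
```

Aggregating the labels by destination would give the revenue management team (Step 4) and individual properties (Step 5) the counts they need.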

Look to Book Ratio Analytics


Page 14: Big Data Analytics Summit - April, 2014

Planogram – created by planners and buyers

Actual view of the shelf arranged by store associates

Compliance dashboard as well as compliance score by Dept./Category/Subclass

The Need

▪ Planogram compliance is the process of verifying whether the product arrangement and the manner in which products are displayed on the shelf in each store match the planogram that is strategically created and collaboratively developed between planners and trading partners
▪ Usually this verification and compliance check is a time-consuming process and is done on a sample basis. When the execution of the planogram is compromised, or if there are assortment voids, it is a lost opportunity
▪ To accelerate this compliance check, take a picture of the actual shelf and systematically compare it with the planogram by product facing and position

Big Data Analytics

▪ Storing the planograms created at the corporate location for each store/dept./category combination
▪ Storing the actual photo of the shelf
▪ Comparing this unstructured data for matching
▪ Integrating this matching score with planogram planning data in the data warehouse to produce various dashboards and metrics that will influence sell-through, profitability and customer satisfaction

Planogram Compliance
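Once the shelf photo has been turned into a list of detected products per position, the matching score mentioned above reduces to a simple comparison. The sketch below assumes that image recognition has already happened upstream. The `(shelf, slot)` position keys, the SKU labels and both function names are hypothetical.

```python
def compliance_score(planogram, detected):
    """Fraction of planned shelf positions that hold the planned product."""
    if not planogram:
        return 0.0
    matches = sum(
        1 for position, sku in planogram.items() if detected.get(position) == sku
    )
    return matches / len(planogram)


def assortment_voids(planogram, detected):
    """Planned positions where nothing was detected on the shelf."""
    return sorted(position for position in planogram if position not in detected)


# Positions are hypothetical (shelf, slot) pairs.
planogram = {(1, 1): "SKU-A", (1, 2): "SKU-B", (2, 1): "SKU-C", (2, 2): "SKU-D"}
detected = {(1, 1): "SKU-A", (1, 2): "SKU-C", (2, 1): "SKU-C"}

score = compliance_score(planogram, detected)
voids = assortment_voids(planogram, detected)
print(score, voids)
```

Scores computed this way per store could be rolled up by Dept./Category/Subclass to feed the compliance dashboard the slide describes.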


Page 15: Big Data Analytics Summit - April, 2014

The Need

▪ An eCommerce retailer needs to analyze graphic images depicting items for sale over the Internet
▪ When a consumer wants to buy a red dress, their search may not match the tags used to identify each item’s search terms
▪ Manufacturers do not always label their goods clearly for distributors or identify the keywords with which users are likely to search

Big Data Analytics

▪ Analyze thousands of dress images, detecting the red prominence of the primary object in the graphic (JPGs, GIFs and PNGs)
▪ This requires enormously complex logic for the computer to “see” the dress and its primary colors as humans do
▪ Millions of images are tagged with additional information to assist consumers with their search
▪ This increases the chances that they find the item they were looking for and make a purchase

Intelligent Item Search
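A heavily simplified sketch of the "red prominence" idea: given the RGB pixels of the primary object, measure what share of them are predominantly red and tag the item accordingly. Decoding the JPG/GIF/PNG files and isolating the dress from the background are assumed to have happened already. The thresholds and function names are illustrative assumptions, not the method used in the case described.

```python
def is_red(pixel, min_red=150, margin=60):
    """True when the red channel is strong and clearly dominates green and blue."""
    r, g, b = pixel
    return r >= min_red and r - max(g, b) >= margin


def red_prominence(pixels):
    """Share of the object's pixels that count as red (0.0 to 1.0)."""
    if not pixels:
        return 0.0
    return sum(1 for p in pixels if is_red(p)) / len(pixels)


def colour_tags(pixels, threshold=0.4):
    """Return extra search tags for the item; only 'red' is handled here."""
    return ["red"] if red_prominence(pixels) >= threshold else []


# Hypothetical pixel samples of the primary object in two product images.
red_dress = [(200, 30, 40)] * 6 + [(240, 240, 240)] * 4
blue_dress = [(20, 40, 200)] * 10

print(colour_tags(red_dress), colour_tags(blue_dress))
```

The resulting tags would be stored alongside the manufacturer's labels so that a search for "red dress" can match items that were never labeled as red.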


Page 16: Big Data Analytics Summit - April, 2014

Predictive Sciences - Predictive Biology & Predictive Pre-Clinical Safety

Predictive Biology
▪ Business Goal: Rapid extraction of key information from literature sources to collect evidence on biological processes
▪ Input: External and internal literature sources
▪ Process: Text mining is used to link a molecule with metabolic processes such as glucose uptake, fatty acid synthesis, metabolic stress, etc. Manual curation can be done to extract assertions and relationships with respect to effect, drug treatment, experiment type, etc.
▪ Benefits: Vital evidence collected on the effect and relationships on species, tissues and linkages to canonical pathways and RNA expression data

Predictive Pre-Clinical Safety
▪ Business Goal: Assessing the incidence of nausea in development compounds by analyzing the preclinical side effect profiles of marketed drugs
▪ Input: Gastrointestinal preclinical findings of marketed drugs
▪ Process: Statistical models are created to find the relation between various preclinical observations and the occurrence of nausea. The model shows clustering of compounds associated with nausea having higher gastrointestinal preclinical observations
▪ Benefits: The model helps in identifying the risk of nausea early in development. Running this model during compound selection can minimize the risk of seeing nausea in follow-up compounds

Page 17: Big Data Analytics Summit - April, 2014

Improving Diagnosis by EHR/EMR Data Analysis

▪ Business Goal: Linkage of EMR/Prescriber/Promotion data in order to understand the relationship between prescriber promotions and treatment patterns
▪ Input: EMR data; prescription data; promotion data (email, sales calls, etc.)
▪ Data Processing Steps: Identify key themes in the EMR data for a particular disease type. Use the prescription data to validate the patterns/themes. Merge the findings with the promotion data to uncover any relationships between promotion and treatment
▪ Benefits: Refining targeting/promotional strategies; cost reductions; uncovering potential reasons for choosing a particular therapy

Page 18: Big Data Analytics Summit - April, 2014

Green Energy, Smart Energy Management

▪ Challenge: Wastage of energy and resources; under-utilized rooms’ temperature and lighting settings; huge energy bills
▪ Analytics: Historic sensor data; blueprints of the building and room layouts; real-time sensor data; temperature settings in the room and building; usage patterns of the room
▪ Benefits: Optimize consumption of energy in business environments. Networked sensors and a new generation of analytics tools play a huge role in gaining insights and implementing the most efficient and sustained energy strategies

Page 19: Big Data Analytics Summit - April, 2014

Case

Customer Call-Center Text Mining

• Typically, contact center channel data is analyzed from an SLA perspective: TAT, average wait time.
• However, the actual transcript of the conversation can yield powerful insights regarding telecom infrastructure usage

Collocation Analysis from Cell Phone Towers

• Collocation analysis by an investigation team finds out whether multiple phones were with the same person.
• The examination involves terabytes of CDR/tower records from the switch, from which one can triangulate on a few collocation events
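The collocation analysis above can be sketched as counting how often two phones show up at the same tower in the same time window. This is a toy version under stated assumptions: the record layout `(phone, tower, time_bucket)`, the `min_events` threshold and the function name are hypothetical. At terabyte scale the same grouping would run on a distributed platform rather than in memory.

```python
from collections import defaultdict
from itertools import combinations


def collocated_pairs(records, min_events=3):
    """Return phone pairs seen together at the same tower and time bucket
    at least min_events times, with their co-occurrence counts."""
    seen = defaultdict(set)
    for phone, tower, bucket in records:
        seen[(tower, bucket)].add(phone)

    counts = defaultdict(int)
    for phones in seen.values():
        # Sorting gives each pair one canonical ordering.
        for pair in combinations(sorted(phones), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_events}


# Hypothetical CDR/tower records: (phone, tower, time_bucket).
records = [
    ("A", "T1", "09:00"), ("B", "T1", "09:00"), ("C", "T1", "09:00"),
    ("A", "T2", "12:00"), ("B", "T2", "12:00"),
    ("A", "T3", "18:00"), ("B", "T3", "18:00"),
    ("C", "T9", "18:00"),
]

pairs = collocated_pairs(records)
print(pairs)
```

Phones A and B travel together across three towers, so they surface as a candidate for "multiple phones with the same person", while C shares only one event with them and is filtered out.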

Multi-device Event Stream Analysis: Correlating Firewall, IDS & Switch Activity

• In most telecom infrastructures, IDS (intrusion detection systems) sit at the periphery, with network monitoring, firewall and application logs being captured in silos
• Deploy a central log file repository, with events streaming from multiple devices ingested and collated centrally
• Channels this into intelligence on the network infrastructure and security of the telecom assets
• Significantly improves detection of everything from malware to spear-phishing attempts to breach security

Optimizing Cost of Telecom Tower Maintenance

• Big Data platform manages fuel consumption data in the telecom tower business
• Each telecom tower has a generator, and one of the biggest components of cost is diesel
• Sensors/energy meters constantly emit large streams of operational data
• Machine learning algorithms crawl through operational data stored over years to predict and optimize cost and revenue

User Behavior Analysis

• Operational systems at each telecom service provider generate huge data volumes in the form of Call Data Records (CDRs) for each call/SMS handled
• Signaling data flows between various switches, nodes and terminals within the network
• Mining this data leads to insights for improving marketing operations and for network and service optimization

Planning Sales Approach

• Large-scale data analysis boosts the ability to pinpoint exactly where the ongoing sales approach could make further gains
• Study the behavior of customers to see what factors motivated them to choose one brand or product over another
• This involves analyzing online search data and real-time information about the company’s products and services, shared by consumers across social networks and other Web-based channels
• Brand affinity and customer sentiment are measured using Sentiment Analysis algorithms

Big Data Analytics In Telecom

Big Data Analytics


Page 20: Big Data Analytics Summit - April, 2014

✓ Hype, Buzz & Myth

✓ “How?” vs. “Why?”

✓ Big Data Analytics for Business, rather than just for IT

✓ Business Case Justification

✓ Right Partners For Your Big Data Analytics Journey

✓ Evangelization and Alignment

✓ Business Onboarding

✓ Execution Plan And Course Corrections

✓ Talent and Knowledge Management

✓ Right math for ROI


Big Data Analytics : Challenges vs. Opportunities

Page 21: Big Data Analytics Summit - April, 2014

Source: The Evolving Role of the Enterprise Data Warehouse in the Era of Big Data Analytics, by Ralph Kimball

Columns: Vector, matrix, or complex structure | Free text | Image or binary data | Data “bags” | Iterative logic or complex branching | Advanced analytic routines | Rapidly repeated measurements | Extreme low latency | Access to all data required

Search Ranking X X X X X X

Ad Tracking X X X X X X X X  

Location or Proximity Tracking X   X X     X X  

Social CRM X X X X X X      X

Document Similarity Testing X X X X X X   X X

Genomic Analysis X X X X X

Customer Cohort groups X X   X X X     X

Fraud Detection X X X X X X X X X

Smart Utility Metering X X X X X X

Churn Analysis X X X X X X   X  

Satellite Image Analysis X X X X

Game Gesture Analysis X X X X X X X X

Data Bag Exploration X X X X X X

• Ad Tracking / Click stream analytics
• Location or Proximity Tracking
• Social Media Analytics / Social CRM
• Document Similarity Testing / Match Making
• Customer Cohort Groups
• Sensor Monitoring (Flights / Building)
• Smart Utility Metering
• Call Center Voice Analytics
• Log Analytics
• Satellite / CAT Image Comparisons
• Fraud Detection
• Game Online Gesture Analysis
• Big Science (Astronomy, weather, atom smashers, Genome decoding)
• Search Ranking
• Risk Management
• Churn Analysis
• Data “Bag” Exploration / Causal Factor Analysis

Design Challenge


Page 22: Big Data Analytics Summit - April, 2014

Thanks Much !