Big Data: Overhyped or Underexploited?



www.fico.com Make every decision count™


INSIGHTS WHITE PAPER

Six analytic practices that extract more value and amplify performance

Number 81

The honeymoon of business and Big Data is over. Lately, Big Data has been the target of a bit of backlash, including from the New York Times, Harvard Business Review, Wired and the Financial Times. That’s probably because we’ve reached a moment of truth: a point where, early vision aside, companies must figure out conclusively how to use Big Data analytics in profitable ways.

That can be easier said than done. Given the enormous volume of data, “signal” (useful information for your purposes) must be separated from a whole lot of “noise” (useless information). Across the wide variety of data, extracting more value can require adopting a diverse set of analytic techniques. And the velocity with which Big Data accumulates and ages means analytic insights impact business performance only to the extent they can be quickly brought into operations to drive actions.

So how are the leaders using Big Data analytics to achieve better, faster customer decisions? Based on FICO’s work with organizations of different sizes, in different industries around the world, this paper examines six fundamental practices for driving more value.

Find out how a leader boosted customer acquisition by 300% and cross-sell penetration by 15%—without raising risk


October 2014

While the effort to extract business value from Big Data is not without challenges, it’s clearly underway. Last year, two surveys of large companies found that a significant percentage of respondents (NewVantage Partners (NVP): 90%; IDG Enterprise: 49%) already had one or more Big Data applications in place or in process. Meanwhile, cloud-based solutions are leveling the playing field by expanding access to Big Data analytics, infrastructure and services for organizations of all sizes.

The value of most of these investments will be measured in terms of the ability to make better operational decisions. In the NVP survey, 70% of companies investing in Big Data projects said accelerating time-to-answer (the speed with which they can gain insights for answering critical business questions to enable better fact-based decisions) was their top goal. Decision areas ranking high as investment drivers included sales, marketing, risk, fraud and customer management. In the IDG survey, 59% of respondents said improving the quality of decision making was the top goal, and 53% cited making quicker decisions as of primary importance. According to 20% of respondents, Big Data projects had already improved the quality and speed of their decision making.

This paper covers six best practices for obtaining more business value from the ever-growing volume, variety and velocity of available data. We also share stories of companies in various industries (including financial services, retail and telecom) at the forefront of implementing these practices to drive performance gains.

1. Start with a business problem in mind

Exploring huge amounts of data with Hadoop and other advanced analytic tools can be lots of fun for data scientists. But it can be a huge waste of time and resources if the results do not translate into something that solves real-world business problems.

To identify projects that are both promising and practical, work with business experts to understand their challenges and opportunities. Also, understand the types of problems the various types of Big Data and analytic techniques can solve.

For instance, while much of the Big Data buzz has been around analyzing unstructured data such as text and speech (discussed in best practice #3), the most important source of Big Data for most businesses is consumer transactions. Payment card, DDA/current account and loyalty program transactions produce abundant, timely streams of data. These are replete with granular details on the what, when, how much and how often of individual spending.

Getting Down to Business with Big Data

“Behind the big data backlash is the classic hype cycle, in which a technology’s early proponents make overly grandiose claims, people sling arrows when those promises fall flat, but the technology eventually transforms the world...” (The Economist, April 2014)

“The digital economy is all about capturing, analyzing, and using information to serve customers.” (Harvard Business Review, December 2013)

“Gartner released last week its latest Hype Cycle for Emerging Technologies. Last year, big data reigned supreme, at what Gartner calls the ‘peak of inflated expectations.’ But now big data has moved down the ‘trough of disillusionment’…” (Forbes, August 2014)


Until recently, most analysis of such data was done by banks and other creditors for fraud detection. Today, however, with standards-based technologies greatly reducing the cost of processing huge amounts of streaming data, transactional analysis is being adopted for a much wider range of purposes. Throughout this paper, we showcase examples of FICO clients tapping into transactional data for greater insight into both the risk and reward side of customer relationships.

[Figure 1 diagram. Left panel: customer shopping transactions feed automatically generated customer archetypes (#1 to #200); Nina’s current transaction is allocated in real time across archetypes (example allocations: 35.2%, 2.4%, 50.5%, 0.1%, 11.8%), and the allocation values are available as variables for time-to-event predictive models. Right panel: propensity plotted against time, starting today, for offers A, B, C and D; the best offer timing is six weeks from now, whereas a simple probability model generates a constant estimate over an extended time horizon.]

PROBLEM: Target and interact with customers earlier in the purchase cycle

SOLUTION: Increase depth and speed of analytic insights

A leading North American supermarket chain has already seen the predictive power of time-to-event (TTE) models that pinpoint when a customer is likely to purchase particular products. Results include targeted offers that produced a 150% lift in average visits for redeemers over nonredeemers.

As a next step, FICO is exploring whether a combination of descriptive and predictive analytics will lift performance further. The descriptive technique used, FICO’s streaming Collaborative Profiles, analyzed stock keeping unit (SKU) data and their hierarchical groupings to automatically discover 200 similarity-based customer groupings (called “archetypes”). As shown in Figure 1, this technique allocates individual customers (by percentage of similarity) across any number of archetypes, and updates allocations in real time with every purchase.

The resulting insights tell the retailer, for example, that a customer Nina purchases Thai foods at a certain frequency, prefers organic products and is likely to buy nail polish while shopping for groceries. The company can reach out to Nina at the right time in her purchase cycle to offer recipes she’s likely to enjoy, a personalized shopping list and discounts on ingredients. Knowing when to tuck a nail polish e-coupon into Nina’s package is especially valuable to the retailer, since this is a low-volume but high-margin product category.

Preliminary results indicate manifold improvements in model performance. The biggest improvements are being seen in low-volume SKUs, where the limited amount of data on observed customer behavior makes it difficult to predict time-based propensities with the plain-vanilla TTE model alone. For these items, including archetypes in TTE models significantly improves targeting capability.

FIGURE 1: ANALYZING TRANSACTIONAL DATA IN MORE WAYS TO DEEPEN UNDERSTANDING OF CUSTOMERS

CASE STUDY
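The real-time allocation across archetypes described in this case study can be illustrated with a minimal sketch. This is not FICO’s Collaborative Profiles algorithm, which the paper does not specify; it assumes, purely for illustration, that each archetype is a probability distribution over product categories and that a customer’s allocation is reweighted after every purchase by how likely each archetype makes that purchase. The archetype names, categories and probabilities are hypothetical.

```python
# Hypothetical archetypes: each maps a product category to the probability
# that a purchase by a "pure" member of that archetype falls in the category.
ARCHETYPES = {
    "thai_cook": {"noodles": 0.5, "organic_produce": 0.3, "nail_polish": 0.1, "snacks": 0.1},
    "organic_shopper": {"noodles": 0.1, "organic_produce": 0.6, "nail_polish": 0.1, "snacks": 0.2},
    "convenience": {"noodles": 0.2, "organic_produce": 0.1, "nail_polish": 0.1, "snacks": 0.6},
}


def initial_allocation():
    """Start every customer with an equal share in each archetype."""
    share = 1.0 / len(ARCHETYPES)
    return {name: share for name in ARCHETYPES}


def update_allocation(allocation, category):
    """Reweight the allocation by each archetype's probability for this purchase.

    This is a simple Bayes-style update: shares grow for archetypes that make
    the observed category likely and are renormalized to sum to 1.
    """
    weighted = {
        name: share * ARCHETYPES[name].get(category, 1e-6)
        for name, share in allocation.items()
    }
    total = sum(weighted.values())
    return {name: value / total for name, value in weighted.items()}


if __name__ == "__main__":
    allocation = initial_allocation()
    for purchase in ["organic_produce", "noodles", "noodles"]:
        allocation = update_allocation(allocation, purchase)
        print(purchase, {k: round(100 * v, 1) for k, v in allocation.items()})
```

Because each update only needs the current allocation and the new purchase, the allocation can be kept current with every transaction without reprocessing purchase history. The resulting shares could then serve as candidate input variables for a time-to-event model.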


2. Look ahead to how you’ll deploy insights in operations

To achieve real business value, you have to be able to operationalize the results of your analysis. Although this seems obvious, far too many projects are left gathering dust or encounter delays because it is too hard to leverage findings where they could provide value. The opportunity cost to the company, from all the suboptimal decisions made in the interim, can be immense.

Wise selection of data is critical. What looks wonderful in the lab may not be available or may be too expensive to obtain at the time needed for use in day-to-day business operations. Industry regulations may affect where and how data can be used. Moreover, most analytics require extensive calculations to be made from the raw data to turn it into useful variables and engineered predictive features. All that has to happen in efficient, automated ways that make insights available fast enough to drive operational decisions and actions.

Analytic development teams must carefully consider how their models will be published and used by operations teams. Models that rely on manually intensive data processing steps, for instance, can cause problems at implementation. The quality of scripts and how well they’ve been documented may determine whether recoding is required for deployment. And such issues can have far-reaching effects, especially in regulated areas like lending and insurance underwriting, where they make it difficult to explain and defend data-driven decisions to auditors and customers.

Technology advances are helping organizations avoid these problems and speed up analytic lifecycle processes. In the past, for instance, it wasn’t just models that might have to be recoded for target applications. Each customer characteristic usually also had to be implemented separately, sometimes requiring several hours for coding and testing. The current state of the art is to use the same characteristic coding in both development and production, enabling deployment of thousands of characteristics at once in about the same amount of time it used to take for just one.

A key reason for this improvement is widespread use of business rules management, enabling applications to execute or access analytic models as part of making decisions. As a result, implementations increasingly revolve around deployable analytic libraries: sets of models, characteristics and business rules codifying additional logic needed for production. In some cases, everything needed for a scorecard model (good/bad schema, log odds slope and intercept, performance window, scoring exclusions, and each characteristic’s associated bins with multiple ranges, unexpected flags and reason codes, etc.) can be imported as a single package, ready for consumption by the application. These streamlined methods not only reduce time to operational value, but also make analytic work easier to share and reuse for multiple purposes.
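To make the idea of a self-contained scorecard package concrete, here is a minimal sketch of how an application might consume one. It is an illustration under stated assumptions, not the format of any FICO product: the characteristic names, bin ranges, points, reason codes, and the slope and intercept values are all hypothetical, and the score is assumed to map linearly to the log odds of a “good” outcome.

```python
import math
from dataclasses import dataclass


@dataclass
class Bin:
    low: float        # inclusive lower bound of the range
    high: float       # exclusive upper bound of the range
    points: int       # score points awarded when the value falls in this bin
    reason_code: str  # code reported when this characteristic costs points


# Hypothetical package contents: characteristics with their binned ranges.
SCORECARD = {
    "utilization_pct": [
        Bin(0, 30, 60, "R01"),
        Bin(30, 70, 35, "R01"),
        Bin(70, float("inf"), 10, "R01"),
    ],
    "months_on_book": [
        Bin(0, 12, 15, "R02"),
        Bin(12, 60, 40, "R02"),
        Bin(60, float("inf"), 55, "R02"),
    ],
}
UNEXPECTED_POINTS = 0       # points for missing or out-of-range ("unexpected") values
LOG_ODDS_INTERCEPT = -3.0   # assumed: ln(odds of good) = intercept + slope * score
LOG_ODDS_SLOPE = 0.05


def score_applicant(values):
    """Return (score, probability of good, reason codes ordered by points lost)."""
    total = 0
    shortfalls = []
    for name, bins in SCORECARD.items():
        value = values.get(name)
        match = next(
            (b for b in bins if value is not None and b.low <= value < b.high),
            None,
        )
        points = match.points if match else UNEXPECTED_POINTS
        total += points
        best = max(b.points for b in bins)
        if points < best:
            shortfalls.append((best - points, bins[0].reason_code))
    log_odds = LOG_ODDS_INTERCEPT + LOG_ODDS_SLOPE * total
    prob_good = 1.0 / (1.0 + math.exp(-log_odds))
    reasons = [code for _, code in sorted(shortfalls, reverse=True)]
    return total, prob_good, reasons


if __name__ == "__main__":
    print(score_applicant({"utilization_pct": 80, "months_on_book": 24}))
```

Keeping bins, points, reason codes and the odds mapping in one data structure means the same definition can be used in development and production, which is the property the paragraph above emphasizes.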

In addition to being able to deploy analytics quickly, having a mechanism in place to enforce best practices in model lifecycle management helps avoid development and implementation delays. It also reduces the time and cost of regulatory compliance.

“Our survey suggests most companies aren’t yet able to quickly introduce change to their operational systems, but that they are working to do better.” (The Era of Intimate Customer Decision is at Hand, Forrester Consulting, June 2013)


Centralized model management, depicted in Figure 2, automates manual tasks and accelerates lifecycle processes. It captures granular detail about the tasks performed and decisions made at each step: data quality analysis, new variable/predictor generation, segmentation analysis, model engineering, pre-deployment testing, periodic validation and updates. State-of-the-art solutions make it possible to enforce consistent, approved methods across the lifecycle of all models, while giving business units and analytic teams the flexibility to create workflows that fit their needs.

Companies can also keep a detailed inventory of every model in their operational environment. With centralized management, they can even monitor which predictive characteristics are in use by which models, evaluate performance and stability over time and manage the downstream effects of characteristics changes. A top-five US bank we work with considers the ability to comprehensively manage not only models, but also characteristics as essential for maximizing analytic value.

FIGURE 2: SHARED INFRASTRUCTURE FOR ENTERPRISE-WIDE MODEL LIFECYCLE GOVERNANCE

[Diagram labels: Model Data Mart; Tracking; Monitoring; Ongoing Validation; Management Reporting; Alerts; Decision Simulation; Decision Execution; Scoring Services; Decision Optimization; Development & Calibration; Deployment & Verification. Tiers: Foundation, Development, Decisioning, Professional, Advanced.]

Source: FICO® Model Central™

With consumer credit behavior changing and competition for good customers intensifying, a super-regional bank needed a breakthrough in cross-selling, while also reaching out to additional population segments for new customers. Both depended on eliminating the inefficiencies that occurred when individuals pre-approved for credit did not qualify in underwriting for those offers.

Today the bank targets customers for acquisition and cross-selling who are likely to be approved by underwriting. A shared analytic learning hub, shown in Figure 3, enables marketing and originations to base decisions on the same criteria. It also continuously captures operational data and outcomes. Analyzing current customer responses to credit decisions, business experts edit business rules weekly to improve strategies. Marketing and originations learn from each other’s results and work toward common profit and loss (P&L) objectives for driving portfolio growth.

In the first year after deployment, acquisition volume rose 300%, cross-sell penetration 15% and average balance per account 8%. And by getting the right offers to the right individuals, the bank has been able to maintain a steady risk profile.

PROBLEM: Acquire and cultivate more profitable customers

SOLUTION: Eliminate silos via a shared analytic learning hub

CASE STUDY


3. Leverage analytic innovation

Innovations in Big Data processing and analytics are transforming how businesses get value from their customer data. We’re seeing a shift from approaches that supply periodic snapshots in the form of descriptive reports and dashboards (what happened) to systems that continuously analyze incoming data to produce predictions (what is likely to happen) and prescriptions (what to do about it) that are actionable in real time.

Many types of analytics will increasingly operate inside production streams. Relying less on persistent historical data, they’ll respond more to changes in the current environment. Analytic outputs will be combined with complex event processing to enable very rapid responses to customer behavior.

Big Data tools and infrastructure are also making it easier to apply machine learning techniques to explore huge datasets that include a wide variety of structured and unstructured data. The right balance of these techniques with human analytic and domain expertise not only lifts business performance but also improves the ability of companies to learn at a fast pace from data-driven experiments.

Here are a few highlights of advanced techniques delivering tangible business value:

Fraud detection is at the vanguard of Big Data analytics for business. Fraud management systems have analyzed huge amounts of streaming transaction data for decades, and have continued to incorporate leading-edge innovations. FICO® Falcon® Fraud Manager models, for instance, rely on transaction profiles that summarize data in the stream as it passes by in order to compute the pertinent fraud feature variables without relying on the persistence of data in production. Initially applied to customer accounts, the technique is now extensible to other entities, such as merchants, ATMs and point-of-sale terminals, providing a more complete picture of payment card transactions. A “bolt-on” adaptive model layer automatically adjusts its model feature weights based on production data, improving sensitivity to emerging fraud patterns. Self-calibrating technologies for both profiles and models increase detection accuracy where service/channel usage and other customer behaviors are changing.

FIGURE 3: ANALYTIC LEARNING HUB ENABLES MARKETING AND ORIGINATIONS TO SHARE DATA, STRATEGIES AND RESULTS

[Diagram labels: Existing Customers; Bank Footprint Prospects; Marketing Prospects; Credit Offers Taken to Market; Direct Mail Responders; Prescreen of One Responders; Test and Learn Responders; Originations Management System; Booked Accounts; Acquisition Performance Tracking; Originations Performance Tracking. Analytic Learning Hub components: Analytics (1. Risk Score, 2. Economic Impact Model, 3. Segmentation Model, 4. Action-Effect Model); Analytic Data Mart (1. Tracking, 2. Simulation, 3. Learning); Decision Strategies for Acquisitions and Originations (1. Direct Mail, 2. Prescreen of One, 3. Test and Learn, 4. Accept/Reject, 5. Initial Credit Line, 6. Test and Learn).]
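The transaction-profile idea in the fraud detection paragraph above (summarizing the stream as it passes, without storing raw history) can be sketched as follows. This is an illustrative assumption, not the Falcon implementation, which the paper does not describe: the profile fields, the 24-hour half-life and the smoothing weight are hypothetical choices.

```python
from dataclasses import dataclass

HALF_LIFE_HOURS = 24.0  # hypothetical: how fast past activity fades from the profile
ALPHA = 0.1             # hypothetical: weight of the newest amount in the running average


@dataclass
class TransactionProfile:
    """Compact per-entity summary that replaces stored transaction history."""
    last_time: float = 0.0   # time of the most recent transaction, in hours
    avg_amount: float = 0.0  # exponentially weighted average amount
    velocity: float = 0.0    # time-decayed count of recent transactions
    count: int = 0           # number of transactions seen so far

    def update(self, time_hours, amount):
        """Fold one transaction into the profile and return its feature values."""
        elapsed = max(time_hours - self.last_time, 0.0)
        decay = 0.5 ** (elapsed / HALF_LIFE_HOURS)

        # Feature: how large is this amount relative to the entity's usual amount?
        amount_ratio = amount / self.avg_amount if self.avg_amount > 0 else 1.0

        # Update the summaries; the raw transaction is not retained.
        self.velocity = self.velocity * decay + 1.0
        if self.count == 0:
            self.avg_amount = amount
        else:
            self.avg_amount = (1 - ALPHA) * self.avg_amount + ALPHA * amount
        self.count += 1
        self.last_time = time_hours
        return {"amount_ratio": amount_ratio, "velocity": self.velocity}


if __name__ == "__main__":
    profile = TransactionProfile()
    for t, amt in [(0, 20.0), (24, 20.0), (24, 200.0)]:
        print(t, amt, profile.update(t, amt))
```

The same class could be keyed by merchant, ATM or terminal instead of by account, which mirrors the extension to other entities mentioned above. The returned features would feed a separate fraud model that is not shown here.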

Unstructured data analytics can increase model predictiveness. Up to 80% of the Big Data available to businesses is text, speech, video and other unstructured data. A growing number of automated techniques for transforming these inputs into numerical representations can be used with statistical analysis to discover predictive features. Other techniques find patterns without such transformation, including from a messy, mixed bag of different types of data.

Either way, features and patterns from unstructured data can be combined with those from traditional structured data into predictive models. In one project (see Figure 4), FICO demonstrated that a risk scorecard imbued with text-extracted insights lifted predictiveness by 8% over a traditional scorecard. In another project, analyzing notes from sales inquiries, the addition of text insights enabled a scorecard to identify 3% more leads resulting in sales. Named entity extraction, a complementary text analytic technique, identified individuals likely to have authority to make a purchase decision, a strong predictor for improving intelligent automated lead generation.

Machine learning can speed improvement cycles. Champion-challenger contests (pitting the current best-performing strategy against proposed alternatives) are a widely used method of improving data-driven decisions. But to accelerate learning and provide even more momentum for performance improvement, they should incorporate some amount of deliberate experimentation. That’s the only way to introduce enough diversity into the resulting outcome data to analyze causal relationships (this change in action A causes outcome Y to change in this specific way). Machine learning algorithms can help by automatically generating challenger strategies that maximize learning speed within company-specified constraints on testing cost and risk.
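A minimal sketch of the deliberate experimentation described above: a capped share of decisions is routed at random to challenger strategies, and outcomes are then compared by strategy. It does not show how challengers might be generated by machine learning. The strategy names, the 10% test fraction and the simulated response rates are hypothetical, and the random assignment is what makes a causal comparison of strategies reasonable.

```python
import random
from collections import defaultdict


def assign_strategy(rng, champion, challengers, test_fraction):
    """Route at most `test_fraction` of decisions to a randomly chosen challenger."""
    if challengers and rng.random() < test_fraction:
        return rng.choice(challengers)
    return champion


def summarize(outcomes):
    """Average observed outcome (for example, response rate) per strategy."""
    return {name: sum(values) / len(values) for name, values in outcomes.items()}


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical true response rates, used only to simulate outcomes.
    true_rate = {"champion": 0.10, "challenger_a": 0.12, "challenger_b": 0.08}
    outcomes = defaultdict(list)
    for _ in range(20000):
        strategy = assign_strategy(
            rng, "champion", ["challenger_a", "challenger_b"], test_fraction=0.10
        )
        responded = 1 if rng.random() < true_rate[strategy] else 0
        outcomes[strategy].append(responded)
    for name, rate in summarize(outcomes).items():
        print(f"{name}: n={len(outcomes[name])}, observed rate={rate:.3f}")
```

The `test_fraction` parameter plays the role of the company-specified constraint on testing cost and risk: it bounds how many customers receive an unproven strategy.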

4. Embrace analytic diversity

R, Python, Hive, Groovy, Scala, MATLAB, SQL, SAS. One of the side effects of the exploding world of analytic innovation is that taking advantage of the latest techniques often requires learning a new set of tools. Analytic teams will inevitably need to use multiple development methods to deliver the insights the business needs.

It’s also clear that combining different types of analytic techniques often delivers superior results. In the retail case study discussed earlier (see Figure 1), for instance, we described the benefits of using Collaborative Profiles with time-to-event (TTE) predictive models. In this implementation, individual customer allocation values, updated in real time with every transaction, become a pool of potential variables for the tens of thousands of TTE models. During model generation and refresh, these variables are automatically selected and combined with other variables based on how strongly they predict the target outcome of that model (say, the purchase of noodles within the next 10 days). For some product categories, this combination of techniques improved predictiveness over the TTE model alone by as much as a factor of four.

FIGURE 4: INSIGHTS FROM UNSTRUCTURED DATA ANALYTICS CAN RAISE PREDICTIVE ACCURACY

Analyzing hidden signals in text lifts predictive performance.

[Chart: percentage of “bads” (vertical axis, 0% to 100%) plotted against percentage of “goods” (horizontal axis, 0% to 100%) for two curves, Traditional Scorecard and Semantic Scorecard.]

At any percentage of “goods” (current, fully paid accounts), the scorecard with text insights predicts a higher percentage of “bads” (charged-off and defaulted accounts) than the traditional scorecard.
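A curve of the kind Figure 4 plots can be computed from scored accounts with known outcomes. The sketch below is a generic illustration, not the analysis behind the figure: it assumes lower scores mean higher risk, that the data contain at least one “good” and one “bad” account, and the sample scores are made up.

```python
def bads_vs_goods_curve(scores, is_bad):
    """Cumulative (% goods, % bads) points, taking the riskiest accounts first.

    Assumes a lower score means higher risk and that both outcome classes
    are present; a better scorecard captures more bads for the same goods.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    total_bads = sum(is_bad)
    total_goods = len(is_bad) - total_bads
    goods = bads = 0
    points = [(0.0, 0.0)]
    for i in order:
        if is_bad[i]:
            bads += 1
        else:
            goods += 1
        points.append((100.0 * goods / total_goods, 100.0 * bads / total_bads))
    return points


def area_under_curve(points):
    """Trapezoidal area under the curve, scaled to the range 0 to 1."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area / 10000.0


if __name__ == "__main__":
    scores = [300, 400, 500, 600]
    is_bad = [True, False, True, False]
    curve = bads_vs_goods_curve(scores, is_bad)
    print(curve)
    print(area_under_curve(curve))
```

Comparing the area for two scorecards on the same accounts is one common way to summarize which curve sits higher overall; the paper does not say which measure underlies its reported 8% lift.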

In addition, Collaborative Profiles substantially improve the ability to predict new behaviors that are probable but never before observed for a particular customer. They also pick up early signs of behavioral change. A significant shift in archetype allocation could indicate, for example, that the customer is transitioning to a vegan diet or a new baby has joined the household. Both are opportunities for the retailer to adjust its offers for greater relevance and value to the customer.

To get multiple types of analytic models to work together like that in an efficient development environment and robust production environment, you need a flexible infrastructure that embraces diversity. Fundamental requirements include the ability to operationalize models authored by a wide range of tools by supporting extensible libraries, web services and standards such as the Predictive Modeling Markup Language (PMML). Centralized lifecycle management should extend across models, business rules and analytic assets from any source.

5. Leverage cloud services and productivity platforms

Creating Big Data analytics no longer requires making a huge investment in expensive infrastructure and specialized skills. By leveraging cloud services, companies can let a dedicated third party securely handle the underlying systems and services, paying just for the capacity and services they need.

In the bank case study discussed earlier (see Figure 3), the shared analytic learning hub for marketing and originations was rapidly deployed using FICO hosting (similar infrastructure-as-a-service is also available today through the FICO® Analytic Cloud). In addition, the open, hub-based architecture is a quicker, less costly way to improve cross-functional visibility and coordination than traditional one-to-one systems integration.

A global telecom company is leading the way for an industry transitioning from years of go-go growth to a new era of deliberate, precise management of risk and reward. Intense competitive pressure in saturated markets had led to marketing campaigns netting too many customers who ended up in collections and/or cost the company more than their accounts were worth.

To make more profitable originations decisions, the company is moving beyond traditional credit classes to more granular analytic segmentation that separates populations by credit risk and customer lifetime value. To do so, it is tapping an increasingly wide range of data, including customer transactions and service interactions. And it’s using analytics not only to predict customer behavior, but also to balance all the key elements of risk and reward in an originations decision to prescribe the best action for maximizing discounted cash flow over time.

These deeper analytic insights are also helping the company reduce attrition of valuable customers. Now the company knows when allowing some leniency (such as to an otherwise current account that falls behind during the holidays) is likely to encourage long-term customer loyalty and profitability.

PROBLEM: Improve customer lifetime value and cash flow

SOLUTION: Balance risk and reward by optimizing customer acquisition decisions

CASE STUDY


For more information: www.fico.com

North America: +1 888 342 6336
Latin America & Caribbean: +55 11 5189 8222
Europe, Middle East & Africa: +44 (0) 207 940 8718
Asia Pacific: +65 6422 7700

FICO, Falcon, Model Central and “Make every decision count” are trademarks or registered trademarks of Fair Isaac Corporation in the United States and in other countries. Other product and company names herein may be trademarks of their respective owners. © 2014 Fair Isaac Corporation. All rights reserved.

4057WP 10/14 PDF

The Insights white paper series provides briefings on research findings and product development directions from FICO. To subscribe, go to www.fico.com/insights.

Unless analytics are to interact only with applications, you also need tools for packaging analytic services for business users. Today’s application development productivity platforms (available for on-site installation or via cloud services) provide everything needed to create complete applications, including user forms and workflows powered by the analytic models.

6. Give control to the business experts

Everything we’ve discussed so far produces substantial value only when companies nail this final best practice. The whole point of Big Data analytics is to give business experts new insights they can quickly turn into decision strategies that ultimately improve results with customers.

For instance, visual tools for building strategies (decision trees) enable business experts to quickly segment customer populations using any mix of policies and data-driven insights. A direct marketing company that deployed FICO® Analytic Modeler Decision Tree Professional was able to re-segment its customer database, extending credit to pockets of customers previously misidentified as high-risk and generating nearly $12 million in incremental sales in just four months.

A European bank currently automating its originations process provides another example. Its leap from manual methods to industry best practices, including use of visual strategy development tools, is a major transformation. But this bank will avoid the “silos” of information that still challenge many institutions that automated earlier. It will support its ambitious growth plans with systems that make it easy for product marketing and risk management to collaboratively develop, test and improve operational decision strategies.

Conclusion

The value of Big Data to business is easy to understand. But it’s not as easy to extract customer insights from immense stores and incoming streams of data in an actionable form, and in time to make a difference. Fortunately a reliable set of best practices for Big Data analytics is proving itself in industries and markets around the world. There’s no need to “reinvent the wheel”: just take advantage of its momentum.

To learn more about best practices for Big Data analytics, visit the FICO Blog and read these other Insights white papers:

• Harnessing the Speech Analytics Advantage (No. 76)

• Cloud Democratizes Access to Big Data Analytics (No. 74)

• Extracting Value from Unstructured Data (No. 71)

• When Is Big Data the Way to Customer Centricity? (No. 67)