How Big Is Big Data Adoption?

Survey Results


Adopting Big Data through Open Source


Talend How Big Is Big Data Adoption? – Survey Results


Table of Contents

Survey Results
Big Data Company Strategy
Big Data Business Drivers and Benefits Received
Big Data Integration
Big Data Implementation Challenges
Big Data Implementation Technologies
About Talend


Big data represents a significant paradigm shift in enterprise technology and stands to transform much of what the modern enterprise is today. Digital data is everywhere and global data is growing at 40% per year. Companies capture trillions of bytes of information about their customers, suppliers, and operations, and millions of networked sensors are being embedded in the physical world in devices such as mobile phones, energy meters and automobiles, sensing, creating, and communicating data.[1] By collecting and analyzing all this information, companies gain insight into new business opportunities and threats.

But what exactly is big data? Big data encompasses a complex and large set of diverse structured and unstructured datasets that are difficult to process using traditional data management practices and tools. There is an increasing desire to collect call detail records, web logs, data from sensor networks, financial transactions, social media and Internet text, and then analyze them together with existing data sources. Conventional data management tools fail when trying to integrate, search and analyze big datasets, which (for now) range from terabytes to multiple petabytes of information. As an example, Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data, the equivalent of 167 times the information contained in all the books in the US Library of Congress.[2] New technologies based on the Apache Hadoop Big Data Platform have emerged as a way to analyze large data sets through a technique called massively parallel processing (MPP) of information.
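To make the parallel-processing idea concrete, the following is a minimal, single-process sketch of the map, shuffle and reduce pattern that Hadoop's MapReduce model is built around. It is an illustration only, not Hadoop's API: the function names and the sample records are hypothetical, and a real cluster would run the map and reduce steps on many machines at once.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (token, 1) pair for every token in one input record."""
    return [(token.lower(), 1) for token in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def token_count(records):
    """Run map, shuffle and reduce in sequence over an iterable of records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical sample records standing in for web log lines.
sample_records = ["GET /home", "GET /cart", "POST /cart"]
counts = token_count(sample_records)
```

Because each record is mapped independently and each key is reduced independently, both steps can in principle be spread across many nodes, which is what allows this pattern to scale to very large datasets.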

As with any new successful paradigm, there is a technology adoption curve from innovators and early adopters, to the early majority, to the late majority, to laggards. Early adopters are driven by competitive advantage and innovation, take the biggest risks for success, typically use primitive tools, and build solutions themselves. Conversely, the late majority and laggards strive for the productivity gains others have received and take less risk by adopting proven technologies backed by robust products and services. In the information arms race, companies that can collect and analyze more information should be able to make faster, better-informed decisions than their competitors, e.g. by maximizing customer wallet share, by knowing when and why customers may leave, by efficiently creating and targeting new markets, or by deterring fraud. To date, most of the big data discussion has been about big data technology. The goal of this survey and whitepaper is to highlight big data adoption challenges, business objectives and benefits, as well as the big data technologies being used.

[1] “Big data: The next frontier for innovation, competition, and productivity,” McKinsey & Company, May 2011.
[2] http://en.wikipedia.org/wiki/Big_data

Survey Results

In the summer of 2012, Talend conducted a big data adoption survey of 231 professionals involved in the delivery of data solutions for their company. Survey respondents were closely split between North America (49%) and EMEA (51%), with 60% of respondents in IT and 36% having business titles. The 95 respondents who did have a big data strategy were then asked a series of questions about their experience.

Figure 1: Survey Demographics

Region: North America 49%, EMEA 51%
Role: IT 60%, Business 36%, Other 4%

Key findings from the survey are:

• 41% of companies have a strategy for dealing with big data, indicating the growing adoption of big data.
• 48% of big data initiatives are driven by the business, 39% by IT, and 13% cross-functionally.
• For those without a big data strategy, the main reason (76%) is that they do not distinguish big data from existing corporate data.
• Increasing the depth and accuracy of predictive analytics was the number one driver for big data, reported by 68% of those who have a big data strategy.
• Using today’s definition of big data (> 10 terabytes), 71% of respondents have big data to manage.
• 62% indicated that they have achieved big data business benefits, with the primary benefits being business process optimization (28%) and improvements in marketing and sales (24%).
• However, 24 companies reported not receiving a business benefit, which may indicate the need for improved big data skillsets, governance and management.
• The types (inputs) of big data being used today are led by web and social media (57% of respondents), followed by sales data (54% of respondents).
• 61% replied that their primary big data challenge was allocating sufficient time, budget and resources, with just over half (52%) reporting a lack of in-house big data expertise.
• Open source Apache Hadoop and Hadoop-based distributions represented over 60% of big data implementation technologies in use or considered for use.

 

 

   


Big Data Company Strategy

It was just over 10 years ago that Doug Laney of the Meta Group (now Gartner) published a report [3] on the growing volume, velocity and variety of data and on the need for organizations to look beyond traditional approaches. The business model of big data early adopters such as Google and Facebook required that they create a strategy to collect and analyze large volumes of data to scale their business. Some companies have big data collection and analysis in their DNA and have built a separate big data strategy. Others recognize that big data is part of a larger total data management function and incorporate big data tools and practices throughout the business to manage big data as well as enterprise data and discrete data. In 2011 there were 9 companies that offered products based on big data (Apache Hadoop) technologies and now there are over 120 vendors, a clear sign of momentum. The survey results (Figure 2) showed that 41% of organizations do have a big data strategy and 59% do not. Furthermore, 76% of those without a specific strategy replied that they do not distinguish big data from existing corporate data management practices (Figure 3).

For companies that do have a big data strategy (n=95), it is being driven by several company functions (Figure 4), which indicates that big data as a core strategy has moved past the early adopter stage. 39% indicated that big data initiatives are being driven by IT, a bottom-up approach aimed at collecting and analyzing large data sets more efficiently. However, 48% of big data strategy is being driven by lines of business or executives, which indicates there are compelling business reasons for big data adoption, such as increased revenue, improved customer satisfaction, or faster time to market.

[3] http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf

Figure 2: Does your organization have a strategy for dealing with big data? (n=231)

Yes: 41%
No: 59%

Figure 3: If no, why? (n=136)

We don't distinguish big data from existing corporate data: 76%
Other reasons: 24%


Figure 4: The big data initiative is primarily being driven by:

Early results show that big data is part of larger corporate data initiatives, and is being driven by business more than IT.

IT: 39%
Business and Consumers of Data: 26%
Executive Management: 20%
Cross-functional Team: 13%
Board of Directors: 2%
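The headline figures in this section can be cross-checked with a little arithmetic. The sketch below is one reading of the numbers, not the survey's published method: it assumes that the 48% "business-driven" figure is the sum of the Business and Consumers of Data, Executive Management and Board of Directors categories in Figure 4, and that respondent counts are obtained by rounding percentages of the 231 respondents.

```python
TOTAL_RESPONDENTS = 231

# Figure 2: 41% of organizations report having a big data strategy.
with_strategy = round(TOTAL_RESPONDENTS * 0.41)        # matches the n=95 used later
without_strategy = TOTAL_RESPONDENTS - with_strategy   # matches the n=136 of Figure 3

# Figure 4: who primarily drives the big data initiative (percent of n=95).
drivers = {
    "IT": 39,
    "Business and Consumers of Data": 26,
    "Executive Management": 20,
    "Cross-functional Team": 13,
    "Board of Directors": 2,
}

# Assumption: these three categories together make up the "business" share.
business_categories = (
    "Business and Consumers of Data",
    "Executive Management",
    "Board of Directors",
)
business_led = sum(drivers[name] for name in business_categories)

print(with_strategy, without_strategy, business_led)
```

Under these assumptions the counts and the 48% business-driven share quoted in the text are all reproduced.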


Big Data Business Drivers and Benefits Received

Data requirements and benefits vary by industry. For example, communications providers, government, healthcare and retail firms all have large amounts of unstructured data such as text, audio and video files that can benefit from big data collection and analysis. The number one business driver for big data (68%) is increasing the accuracy and depth of predictive analytics, or the ability to analyze current and historical data to make future predictions. Revenue optimization (51%) and new revenue generation (48%) were the second and third highest responses as companies seek to do more in-depth analysis to maximize market and wallet share, e.g. to improve cross-selling capabilities.

Figure 5: What are the business drivers for big data in your organization? (multiple responses, n=95)

[Bar chart; category labels were not recoverable. Bar values as shown: 48%, 51%, 68%, 31%, 20%, 19%, 16%, 1%.]


For those that have implemented big data projects, 62% indicated that they have achieved business benefits (Figure 6), with the primary benefits being business process optimization (28%) and improvements in marketing and sales (24%). It is a concern that 38% responded “No” or “Unknown”. This may be due to a lack of project governance, data quality, big data skillsets and/or available tooling, which is typical for new paradigms.

Figure 6: To date, have you realized any business benefit to big data? (n=95)

Big data business benefits include business process optimization and marketing/sales improvement; however, some projects have failed to deliver a benefit, and for others it is too early to tell.

Yes, particularly in Business Process Optimization: 28%
Yes, particularly in Marketing and Sales: 24%
Yes, particularly in Crime Prevention and Fraud: 5%
Yes, particularly in Customer Retention: 5%
No: 25%
Unknown: 13%


Big Data Integration

Common big data use cases include marketing campaign analysis, recommendation engines, predictive analytics, sentiment analysis, risk management and fraud detection. IT is integrating existing data warehouses and business intelligence systems with diverse sets of structured and unstructured data for more in-depth analysis. The survey revealed that the most common applications being integrated were financial transactions (48.4%) and social media, clickstream and Internet text (48.4%), followed by web logs (35.8%) and call detail records (28.4%). By looking at social media and Internet text, firms can understand who the “super users” are in any social network or community, i.e. those who have the most influence over others inside social networks. Also, by correlating financial transactions and call detail records with clickstreams, one can generate a more complete view of customer buying patterns and behavior.

Figure 7: Which applications are driving big data needs at your organization? (multiple responses)

Call detail records: 28%
Financial transactions: 48%
Science, research data or medical data: 25%
Sensor data: 25%
Social media, click-stream or internet search analytics: 48%
Video, imaging data: 16%
Web logs: 36%
Other: 8%

Furthermore, the types of big data being used today or considered for the future reinforced the previous response. Web and social media are being used by 57% of respondents today, with 23% considering them for the future. Sales data was the second highest for use today (54%), with 32% considering it for the future, for analyzing buying patterns and sales incentives. Biometric data was the lowest on the list.

Figure 8: What type of big data are you involved in or considering making part of your Business Intelligence?

Web and social media are big data’s primary inputs with sales data a close second.

Values below are listed as Actively Involved / Considering for the Future / Not Considering. Each row sums to 95, so the values appear to be respondent counts (n=95), although the chart labels them with a percent sign.

Web and social media (web logs, Twitter feeds, JSON): 54 / 22 / 19
Machine generated data (RFID, GPS, phone apps and other machine generated data): 42 / 30 / 23
Sales data (buying patterns and sales incentives): 51 / 30 / 14
Biometric (fingerprint, voice/face recognition, DNA): 18 / 8 / 69
Human interactions (e-mails, voice mails, call centers): 34 / 36 / 25
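The Figure 8 values for each data type sum to 95, the number of respondents with a big data strategy, which suggests they are respondent counts rather than percentages. Under that assumption, a short sketch shows how they convert into the percentages quoted in the text. The dictionary layout and the helper name `to_percent` are illustrative choices, not part of the survey.

```python
N_RESPONDENTS = 95  # respondents with a big data strategy

# (actively involved, considering for the future, not considering)
figure_8_counts = {
    "Web and social media": (54, 22, 19),
    "Machine generated data": (42, 30, 23),
    "Sales data": (51, 30, 14),
    "Biometric": (18, 8, 69),
    "Human interactions": (34, 36, 25),
}


def to_percent(count, total=N_RESPONDENTS):
    """Convert a respondent count into a whole-number percentage."""
    return round(100 * count / total)


figure_8_percent = {
    name: tuple(to_percent(c) for c in counts)
    for name, counts in figure_8_counts.items()
}

for name, (active, considering, not_considering) in figure_8_percent.items():
    print(f"{name}: {active}% active, {considering}% considering, "
          f"{not_considering}% not considering")
```

Read this way, web and social media come out at 57% actively involved and 23% considering, and sales data at 54% and 32%, which agrees with the figures given in the text.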


Big Data Implementation Challenges

The technical challenges in processing big data involve integrating, searching and analyzing large data sets. However, as with any new paradigm, companies must also find the right skillsets, get budget approval, navigate company politics, and manage the unknowns. The United States alone faces a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts to analyze big data and make decisions based on their findings.[4] Many early big data projects lack an explicit project management structure, and over time companies will incorporate standards and procedures just as they have with data management projects. In the survey, 61% replied that their primary big data challenge was allocating sufficient time, budget and resources, with just over half (52%) reporting a lack of in-house big data expertise. Also, 48% reported data quality challenges, while only 11% reported a challenge getting C-level buy-in for big data projects.

Figure 9: What challenges to implementing big data are most hindering your success? (multiple responses)

[Bar chart; category labels were not recoverable. Bar values as shown: 52%, 18%, 36%, 61%, 37%, 48%, 11%, 5%, 4%.]

[4] “Big data: The next frontier for innovation, competition, and productivity,” McKinsey & Company, May 2011.


Big data processing varies by organization, industry and the types of tools available to process the data. It may involve collecting and analyzing terabytes of information or many petabytes of data, and the threshold that defines “big data” is expected to rise over time. Using today’s definition of big data (> 10 terabytes), 71% of respondents (Figure 10) have big data to manage. 46% of respondents have over 100 terabytes of data to manage and 12% have 2 petabytes or more.

Figure 10: What is the total amount of data that exists within your organization?

A large majority of companies have over 10 terabytes of data to manage, but the biggest barriers to big data adoption are a shortage of time, budget, expertise and resources.

   

< 10 Terabytes: 29%
10 to 99 Terabytes: 25%
100 to 499 Terabytes: 23%
500 Terabytes to 1 Petabyte: 11%
2 to 5 Petabytes: 6%
> 5 Petabytes: 6%
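The cumulative shares quoted in the text follow directly from the Figure 10 buckets. The sketch below sums the buckets from a given size upward; the bucket labels are abbreviated and the helper name `share_at_or_above` is an illustrative choice.

```python
# Figure 10 buckets in ascending order of data volume (percent of respondents).
buckets = [
    ("< 10 TB", 29),
    ("10 to 99 TB", 25),
    ("100 to 499 TB", 23),
    ("500 TB to 1 PB", 11),
    ("2 to 5 PB", 6),
    ("> 5 PB", 6),
]


def share_at_or_above(first_bucket_index):
    """Sum the percentages of this bucket and every larger one."""
    return sum(percent for _, percent in buckets[first_bucket_index:])


over_10_tb = share_at_or_above(1)    # everything except "< 10 TB"
over_100_tb = share_at_or_above(2)   # "100 to 499 TB" and larger
over_2_pb = share_at_or_above(4)     # "2 to 5 PB" and "> 5 PB"

print(over_10_tb, over_100_tb, over_2_pb)
```

These sums give the 71%, 46% and 12% figures cited for the 10 terabyte, 100 terabyte and 2 petabyte thresholds.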


Big Data Implementation Technologies

Many technologies have been developed to integrate, manipulate, manage and analyze big data. Survey respondents were asked which big data technologies they are using or considering using. Apache Hadoop (28%), an open source framework with basic big data constructs (file system, language, and distributed system for managing large datasets) that incorporates MapReduce, was the most popular response. Showing strong support for open source technology, Hadoop and Hadoop-based solutions represented 62% of the responses. The large share of Other (38%) responses suggests an early adopter, fragmented market, and included selections for big data appliances (e.g. Teradata, Netezza) and NoSQL databases (e.g. Couchbase, MongoDB).

Figure 11: Which implementation of big data technology are you considering or are you using already? (n=95)

Open source Apache Hadoop and Hadoop-based distributions represented over 60% of big data implementations in use or considered for use.

Amazon Web Services: 13%
Apache Hadoop (own installation): 28%
Cloudera (CDH): 12%
Greenplum: 2%
Hortonworks Data Platform: 3%
MapR: 4%
Other: 38%
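The 62% figure for Hadoop and Hadoop-based solutions can be recovered from Figure 11 by adding up every named option and leaving out Other. Treating all six named options as Hadoop-based is an assumption made here because it is the reading that reproduces the number in the text; the survey does not spell out the grouping.

```python
# Figure 11: technology in use or under consideration (percent of n=95).
technology_share = {
    "Amazon Web Services": 13,
    "Apache Hadoop (own installation)": 28,
    "Cloudera (CDH)": 12,
    "Greenplum": 2,
    "Hortonworks Data Platform": 3,
    "MapR": 4,
    "Other": 38,
}

# Assumption: every named option except "Other" counts as Hadoop-based.
hadoop_based = sum(
    share for name, share in technology_share.items() if name != "Other"
)
most_popular = max(technology_share, key=technology_share.get)

print(hadoop_based, most_popular)
```

Note that Other is the single largest slice at 38%, while Apache Hadoop at 28% is the largest named technology.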


About Talend

Talend is the recognized leader in open source integration solutions. The company’s holistic integration platform helps organizations minimize costs and maximize the value of data integration, ETL, data quality, master data management, application integration and business process management, while supporting their shift toward big data. More than 3,500 paying customers worldwide, including eBay, ING, The Weather Channel, Deutsche Post and Allianz, subscribe to Talend’s solutions and services. With over 20 million downloads, Talend’s products are the most trusted integration solutions in the world. The company has major offices in North America, Europe and Asia, and a global network of technical and services partners.

Talend’s open source approach and flexible integration platform for big data enable users to easily connect and analyze data from disparate systems to help drive and improve business performance. Talend’s big data capabilities integrate with today’s big data market leaders such as Cloudera, Hortonworks, Google, Greenplum, MapR, Teradata and Vertica, positioning Talend as a leader in the management of big data. Talend’s goal is to democratize the big data market just as it has with data integration, data quality, master data management, enterprise service bus and business process management. Visit www.talend.com to learn more and download your free copy of Talend Open Studio for Big Data.