HP Software - Big Data Challenges - Minoc Media...

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. HP Software - Big Data Challenges February 2015


HP Software - Big Data Challenges

February 2015


The world has changed…

[Graphic: a cloud of vendor, application, and app-category names. Names shown range from mainframe-era vendors (IBM, Unisys, Burroughs, Hitachi, NEC, Bull, Fujitsu) and enterprise functions (ERP, CRM, SCM, HCM, PLM, data warehousing) to web, cloud, and mobile services (YouTube, Facebook, Twitter, LinkedIn, Amazon Web Services, salesforce.com, and many more).]

Mainframe: kilobytes

Client/server: megabytes

The Internet: gigabytes

Big Data, Cloud, Mobility: zettabytes

Brontobytes + Geopbytes

Every 60 seconds…

2,000 check-ins on Foursquare

$275,000 spent online shopping

204 million+ emails sent

48 hours of new video on YouTube

38,000 new Tumblr blog posts

100,000+ tweets

2 million+ Google searches

35,000 brand “Likes” on Facebook


We have gone beyond the decimal system

Big Data from the “Internet of Things”

Today, data scientists use yottabytes to describe how much government data the NSA or FBI have on people altogether.

In the near future, a geopbyte will be the measurement to describe the type of data generated from the IoT.

10^30 Geopbyte: This will take us beyond our decimal system
10^27 Brontobyte: This will be our digital universe tomorrow…
10^24 Yottabyte: This is our digital universe today
10^21 Zettabyte: 1.3 ZB of network traffic by 2016
10^18 Exabyte: 1 EB of data is created on the internet each day
10^15 Petabyte: The CERN Large Hadron Collider generates 1 PB per second
10^12 Terabyte: 500 TB of new data per day are ingested in Facebook databases
10^9 Gigabyte
10^6 Megabyte
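The scale on this slide can be turned into a small lookup. The sketch below is illustrative only: the function name `to_largest_unit` is made up for this example, and "brontobyte" and "geopbyte" are used simply because the slide uses them.

```python
# Decimal exponents from the scale on this slide, largest first.
# "Brontobyte" and "geopbyte" are the informal names the slide uses.
UNITS = [
    ("geopbyte", 30),
    ("brontobyte", 27),
    ("yottabyte", 24),
    ("zettabyte", 21),
    ("exabyte", 18),
    ("petabyte", 15),
    ("terabyte", 12),
    ("gigabyte", 9),
    ("megabyte", 6),
]


def to_largest_unit(num_bytes):
    """Return (value, unit name) using the largest unit that fits."""
    for name, exponent in UNITS:
        if num_bytes >= 10 ** exponent:
            return num_bytes / 10 ** exponent, name
    # Anything below a megabyte is reported in plain bytes.
    return float(num_bytes), "byte"


# The Facebook figure from the slide: 500 TB per day, expressed in bytes.
value, unit = to_largest_unit(500 * 10 ** 12)
print(f"{value:g} {unit}")
```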


Enterprise data growth

Costs of managing data

Every 60 seconds: 1,820 TB of data created

[Graphic: the cloud of vendor, application, and app-category names from the “The world has changed…” slide, repeated]

Mainframe: kilobytes

Client/server: megabytes

The Internet: gigabytes

Mobile, social, Big Data & the cloud

Zettabytes

TCO for unstructured data varies from $4/GB to $100/GB annually, but $25/GB is a good rule of thumb*

*Source: ESG White Paper – The Cost of Managing Unstructured Data, May 2014
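As a rough illustration of what those per-gigabyte rates mean for a given volume, here is a minimal sketch. The function name `annual_tco` and the use of decimal terabytes (1 TB = 1,000 GB) are assumptions made for this example, not part of the ESG paper.

```python
# Rates in US dollars per GB per year, from the ESG figures quoted on this slide.
LOW_RATE = 4.0
TYPICAL_RATE = 25.0   # the rule-of-thumb figure
HIGH_RATE = 100.0

GB_PER_TB = 1000  # decimal terabytes; an assumption for this sketch


def annual_tco(terabytes):
    """Return (low, typical, high) annual cost estimates in dollars."""
    gigabytes = terabytes * GB_PER_TB
    return (
        gigabytes * LOW_RATE,
        gigabytes * TYPICAL_RATE,
        gigabytes * HIGH_RATE,
    )


low, typical, high = annual_tco(100)
print(f"100 TB: ${low:,.0f} to ${high:,.0f} per year, "
      f"about ${typical:,.0f} at the rule-of-thumb rate")
```

At the rule-of-thumb rate, 100 TB of unstructured data works out to about $2.5 million a year, which is why the soft costs of unmanaged data add up quickly.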

The volume, velocity and breadth of channels often overwhelm information management strategies, leading to dark data

Storage costs are visible; soft costs such as opportunity and risk costs are less so, but no less real…


What is legacy data and dark data?

Redundant, obsolete, trivial, and the unknown…

Legacy data resides in:

• Legacy applications and repositories

• Unmanaged SharePoint sites, file shares and mail systems

Legacy data can contain or be:

• Redundant: duplicates and unauthorized copies

• Obsolete: no longer in use or out of date; determined through creation, last modified or accessed date and retention policy

• Trivial: file type with no content value
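The three categories above can be sketched as a simple classifier. This is an illustration under stated assumptions, not a description of any HP product: the `FileRecord` shape, the seven-year retention period, and the list of "trivial" file suffixes are all hypothetical, and only the last-modified date is checked even though the slide also mentions creation and access dates.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta
from pathlib import PurePath

# Assumed values for illustration; a real retention policy would supply these.
RETENTION = timedelta(days=7 * 365)
TRIVIAL_SUFFIXES = {".tmp", ".bak", ".log"}


@dataclass
class FileRecord:
    path: str
    content: bytes
    last_modified: datetime


def classify_rot(records, now):
    """Return {path: set of labels} marking redundant, obsolete and trivial files."""
    labels = {record.path: set() for record in records}
    first_seen = {}  # content hash -> path of the first file with that content
    for record in records:
        # Redundant: same content as a file seen earlier in the scan.
        digest = hashlib.sha256(record.content).hexdigest()
        if digest in first_seen:
            labels[record.path].add("redundant")
        else:
            first_seen[digest] = record.path
        # Obsolete: not modified within the assumed retention period.
        if now - record.last_modified > RETENTION:
            labels[record.path].add("obsolete")
        # Trivial: a file type assumed to carry no content value.
        if PurePath(record.path).suffix.lower() in TRIVIAL_SUFFIXES:
            labels[record.path].add("trivial")
    return labels
```

Hashing content rather than comparing file names catches copies that have been renamed or moved, which is the usual case for unauthorized duplicates on file shares.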


What is dark data?

What lies hidden in your enterprise data… the unknown

Beyond legacy data…

Dark data tends to be:

• Human readable

• Unstructured

• Unindexed

• Unmanaged

• Inactive

• Orphaned

Dark data resides in:

• File servers

• SharePoint

• Email servers


The risk of ignoring legacy and dark data

Legacy and dark data sitting outside the information governance strategy expose the organization to risk:

• Spiralling costs

– Expanding information footprint and storage costs

– Litigation and eDiscovery costs (“smoking gun” or inability to deliver)

• Security breaches and reputational damage

– Sensitive information unprotected (personally identifiable information, privacy regulations)

– Data leakage and misuse

• Poor business execution and performance

– Incorrect context

– Decisions based on outdated information

– Duplicate effort spent re-creating information


Today’s reality

86% of corporations cannot deliver the right information at the right time*

23%: % of data that would be potentially useful if effectively engaged

3%: % of data actually being tagged for Big Data value

0.5%: % of the digital universe that is actually being tagged, analyzed and leveraged

*Source: IDC Predictions 2012: Competing for 2020

¹Source: IDC The Digital Universe in 2020, December 2012


Insight from 100% of the data

Data is exploding but traditional data technologies impose limits - We need connected intelligence

Structured data

Human information

Machine data

Connected Intelligence

…leaving a trail of digital footprints.

Engage 100% of data to gain competitive advantage

[Chart: accuracy and insight plotted against data volumes]

Data sources shown: CRM, ERP, data warehouse, web, social, log files, machine data, semi-structured, unstructured

Regions shown: traditional enterprise data, Big Data, dark data


It takes a Big Data platform to “cash in” on all your data assets

Siloed data challenge

• Only 0.5% of data in the average organization is tagged and analyzed

• Information silos everywhere

• Tools for finding and understanding information are tied to the original application and format

• Queries take too long and are too rigid, making it difficult to uncover opportunities, emerging patterns and unexpected threats

Big Data platform needs

• Ad hoc discovery: find what’s in the data without pre-structuring it

• Ubiquitous but secure data access

• Real-time data collection and analysis, any format, any data source

• An extensible platform to harness 100% of data, on-premise and in the cloud


HP Haven

Big Data platform

Gain insight from 100% of your data

• Analyze machine, business, human data

• Connect to any existing data source system

• Scale 50-1000x faster than legacy systems

• Develop modern data-driven applications & web services

HP applications

Customer applications

Developer applications

Haven

Defined programming interfaces

Analytics, context and categorization

Data connectors

Social media, IT/OT, images, audio, video, mobile, search engine, email, texts, documents, transactional data, records, compliance archives

Scalable data stores

On-premise

In the Cloud


Use case # 1: Smart / Safe City

• Deployment Environment -

• Ingest data from 2,000+ CCTV cameras in Auckland

• View network of road and environmental sensors

• Social media trending, broadcast monitoring, and real time web news

• Phase 1 – scene analysis and license plate recognition

• Future Phase - Integrate HP Vertica to uncover breaking trends and facilitate incident responses

• HP IDOL eduction sends interesting data to Vertica for statistical analysis and slice/dice

• Combine HP Vertica’s pattern-matching and graph-analysis at scale with HP IDOL’s ability to model concepts and enrich data

This is a rolling (up to 3 year) roadmap and is subject to change without notice

Improving public safety by detecting high-risk activities and investigating threats


Use case # 2: Catch Insider Traders

• Multiple data sources:

• HP Digital Safe data

• Transactional trading data

• Financial news feeds

• Social media

• Email, voicemail recordings, instant messaging

• Phase 1 – complex policies such as highlighting suspect trades where no communication can be found between related Bank A and Bank B contacts

• Future Phase - Integrate HP Vertica for trend and anomaly detection

• HP IDOL eduction sends interesting data to HP Vertica for statistical analysis and slice/dice

• Combine HP Vertica’s pattern-matching and graph-analysis at scale with HP IDOL’s ability to model concepts and enrich data

This is a rolling (up to 3 year) roadmap and is subject to change without notice

Financial Services - Information Surveillance & Digital Forensics Solution
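The Phase 1 policy on this slide (flag trades between related contacts when no communication between them can be found) can be sketched as follows. This is a hypothetical illustration, not the HP solution: the `Trade` and `Message` record shapes, the function name `suspect_trades`, and the 24-hour look-back window are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Trade:
    trade_id: str
    trader: str          # contact at Bank A (hypothetical field)
    counterparty: str    # contact at Bank B (hypothetical field)
    executed_at: datetime


@dataclass(frozen=True)
class Message:
    sender: str
    recipient: str
    sent_at: datetime


def suspect_trades(trades, messages, window=timedelta(hours=24)):
    """Return IDs of trades with no archived message between the two
    parties in the look-back window before execution."""
    flagged = []
    for trade in trades:
        parties = {trade.trader, trade.counterparty}
        has_contact = any(
            {m.sender, m.recipient} == parties
            and timedelta(0) <= trade.executed_at - m.sent_at <= window
            for m in messages
        )
        if not has_contact:
            flagged.append(trade.trade_id)
    return flagged
```

A trade with no matching message is not proof of wrongdoing; in a sketch like this it only marks the trade for review, on the reasoning that the parties may have communicated through a channel that is not archived.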


Use case # 3: Smart Retail / Voice of Customer

• Multiple data sources:

• Enterprise – documents, email, ticketing systems, CRM cases, videos

• Customer – social media, blogs, forums, user-generated content, surveys

• Public – Websites, News

• Phase 1:

• Sentiment detection, clustering

• Eduction – people, places, credit card #s

• Link expansion, Gender detection

• Curation, tagging, alerts

• Future Phase - Integrate HP Vertica for demographic profiling

• HP IDOL eduction sends interesting data to HP Vertica for statistical analysis and slice/dice

• Combine HP Vertica’s pattern-matching and graph-analysis at scale with HP IDOL’s ability to model concepts and enrich data

This is a rolling (up to 3 year) roadmap and is subject to change without notice

Prevent churn, analyze NPS surveys, react to product/warranty issues
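To give a feel for what extracting credit card numbers from free text involves, here is a minimal stand-in using a regular expression plus the Luhn checksum. It does not use or describe HP IDOL; the function names and the 13 to 19 digit range are assumptions for this sketch.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits):
    """Check a digit string against the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:       # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text):
    """Return digit strings in text that look like card numbers and pass Luhn."""
    found = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found


sample = "Refund to card 4111-1111-1111-1111, order ref 1234 5678 9012 3456."
print(find_card_numbers(sample))
```

The checksum step matters because long digit runs such as order references are common in customer text; filtering on Luhn removes most of those false positives before any alerting or redaction.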


Real world: claims integrity

Leading health insurance company

Business need

• Identify duplicate or inaccurate health insurance claims and transactions (i.e. overpayment)

• Multiple legacy systems containing claims data, with little integration

Solution

• Connect legacy systems and create a common index of claims data regardless of location, type or source

• Identify unusual patterns in transactions to detect fraud or error

Business benefits

• Massive ROI through reduction in duplicate claims paid

• Improved operational efficiency
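The idea of a common index across legacy systems can be sketched as grouping claims by a normalized key and reporting any key that appears more than once. This is a simplified illustration, not the insurer's actual solution: the field names (`member_id`, `provider_id`, `service_date`, `procedure_code`, `amount`, `source_system`, `claim_id`) and the choice of key are assumptions.

```python
from collections import defaultdict


def find_duplicate_claims(claims):
    """Group claims from any source system by a normalized key and
    return the groups that contain more than one claim."""
    index = defaultdict(list)
    for claim in claims:
        # Normalize case, whitespace and amount so the same claim keyed
        # slightly differently in two systems still lands in one group.
        key = (
            claim["member_id"].strip().upper(),
            claim["provider_id"].strip().upper(),
            claim["service_date"],
            claim["procedure_code"].strip().upper(),
            round(float(claim["amount"]), 2),
        )
        index[key].append((claim["source_system"], claim["claim_id"]))
    return [refs for refs in index.values() if len(refs) > 1]
```

Each returned group lists the source system and claim ID of every matching record, so a reviewer can see which systems paid the same claim.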


Real world: expertise networks

Aircraft manufacturing

Business need

• Employees waste 30 min/day finding information and duplicating the work of others

• Identify expertise across a global community of 35,000 engineers

• Avoid manual approaches such as describing areas of interest and expertise in a contacts directory using predefined keywords

Solution

• Generate user profiles automatically and in real time based on the pages visited and documents read

• Alert employees when documents or other employees match the work they are doing

Business benefits

• Reduced time spent retrieving information by over 90%

• Identified teams working on similar projects across the globe

• ROI within 7 months


Summary


HP Haven

Big Data platform

Gain insight from 100% of your data

• Connect to all of your machine, business, & human data sources

• Analyze at volume and velocity of data

• Develop modern data-driven applications

HP applications

Customer applications

Developer applications

Haven

Defined programming interfaces

Analytics, context and categorization

Data connectors

Social media, IT/OT, images, audio, video, mobile, search engine, email, texts, documents, transactional data, records, compliance archives

Scalable data stores

On-premise

In the Cloud


Check out the websites: www.autonomy.com, www.vertica.com, www.hp.com


Thank you