Making EDW More Flexible with Hadoop - SNIA Analytics and Big Data Summit. © Pentaho...

Making EDW More Flexible with Hadoop Rob Rosen Big Data GTM Lead Pentaho Corporation


Page 1:

2013 SNIA Analytics and Big Data Summit. © Pentaho Corporation. All Rights Reserved.

Making EDW More Flexible with Hadoop

Rob Rosen, Big Data GTM Lead, Pentaho Corporation

Page 2:

The State of Data Warehouses


Source: Gartner Research, "Predicts 2011: Data Management Disciplines Elevate Business Criticality," published 1 December 2010, ID Number G00208101.

Page 3:

TDWI Hadoop Survey: Business Intelligence and Data Warehouse


Page 4:

Competitive Advantage vs. Operational Efficiency

[Figure with two labels: Operational Efficiency, Competitive Advantage]

Page 5:

Hadoop = Infrastructure Software

Costs

Time

Flexibility

Page 6:

Barriers to Implementing Hadoop Technologies

“It’s complex & difficult, plus our executives don’t understand it. Where should I start?”

Page 7:

Use Case Scenario – Call Volume Analysis

A VoIP service provider with a B2B customer base wants to sub-lease excess capacity on the weekends.

COO: What are the top 10 states for outbound calls on Fridays, Saturdays and Sundays?

Detailed information is available, but not in the EDW:

• Call records: date/timestamp & source phone #

• Reference data: area code by country, state & time zone (North American Numbering Plan)

Page 8:

Traditional EDW Architecture

[Diagram; labels recovered from the slide:]

• Structured Data

• Extract, Transform, Load (“SQL or ETL Tool”)

• Data Integration: data acquisition & ingestion, parsing, cleansing, enrichment

• Data Integration: dimension management, bulk loading, DB management

• Data Mart(s) / Warehouse; Metadata

• Dashboard, Report, Analysis

Page 9:

Challenges with the Traditional EDW

The EDW can’t handle increasing data volumes and workloads, so companies must:

• Reduce the volume of data

• Restrict end-user access (# of users or access windows) to accommodate longer batch processing windows

• Purchase additional capacity (hardware / licenses), which can be as much as $100K / TB

Then, companies are faced with the following challenges:

• The trade-off: more data versus user experience

• The incremental outlay of capital required to expand the EDW or purchase more proprietary ETL tool capacity

• The inability of the incumbent ETL vendor to work with Hadoop


Page 10:

EDW Architecture – Hadoop Front-End

[Diagram; labels recovered from the slide:]

• Structured Data, Unstructured Data

• Data Integration: data acquisition & ingestion, parsing, cleansing, enrichment

• ETL (label appears twice)

• Data Integration: dimension management, bulk loading, DB management

• Data Mart(s) / Warehouse; Metadata

• Dashboard, Report, Analysis

Page 11:

Data Pipeline


Raw call records (date/timestamp, source phone #):

2012/03/06 00:00:00.000,12054290060
2012/02/21 00:00:00.000,18774230140
2012/03/08 00:00:00.000,12152900580
2012/02/18 00:00:00.000,17732350700
2012/03/08 00:00:00.000,17242490750

Records after the pipeline:

3,2012,6,201,NJ,UNITED STATES,E,Friday,1
3,2012,6,513,OH,UNITED STATES,E,Friday,1
3,2012,6,850,FL,UNITED STATES,EC,Friday,1
3,2012,7,631,NY,UNITED STATES,E,Saturday,1
3,2012,6,650,CA,UNITED STATES,P,Friday,1
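The step from a raw call record to a row in the second format can be sketched in a few lines of Python. This is only an illustration under several assumptions, not the Pentaho transformation from the talk. The slide does not label the output fields, so the reading here (month, year, day-of-week number with Sunday=1, area code, state, country, time zone, day name, call count) is inferred from the sample rows. The names `enrich`, `AREA_CODES`, `DAY_NAMES` and `WEEKEND_DAYS` are hypothetical, and the weekend filter is assumed from the COO's question.

```python
from datetime import datetime

# Hypothetical slice of the area-code reference data: (state, country, time zone).
# The five entries mirror the sample output rows on the slide.
AREA_CODES = {
    "201": ("NJ", "UNITED STATES", "E"),
    "513": ("OH", "UNITED STATES", "E"),
    "850": ("FL", "UNITED STATES", "EC"),
    "631": ("NY", "UNITED STATES", "E"),
    "650": ("CA", "UNITED STATES", "P"),
}

# Index 0 is Monday, matching datetime.weekday().
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
WEEKEND_DAYS = {"Friday", "Saturday", "Sunday"}  # assumed filter


def enrich(raw_line):
    """Turn one raw call record into an enriched CSV row, or None if dropped."""
    stamp, phone = raw_line.strip().split(",")
    when = datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S.%f")
    day_name = DAY_NAMES[when.weekday()]
    if day_name not in WEEKEND_DAYS:
        return None
    # Assumes an 11-digit NANP number: leading "1", then the 3-digit area code.
    area_code = phone[1:4]
    ref = AREA_CODES.get(area_code)
    if ref is None:
        return None  # no reference data for this area code
    state, country, zone = ref
    day_num = when.isoweekday() % 7 + 1  # Sunday=1 ... Saturday=7 (assumed)
    fields = [when.month, when.year, day_num, area_code,
              state, country, zone, day_name, 1]
    return ",".join(str(f) for f in fields)
```

Under these assumptions, the slide's first raw record (2012/03/06, a Tuesday) would be filtered out, while a Friday call from a 201 number would produce a row shaped like the first output sample.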

Page 12:

Analysis & Visualization

3,2012,6,201,NJ,UNITED STATES,E,Friday,1
3,2012,6,513,OH,UNITED STATES,E,Friday,1
3,2012,6,850,FL,UNITED STATES,EC,Friday,1
3,2012,7,631,NY,UNITED STATES,E,Saturday,1
3,2012,6,650,CA,UNITED STATES,P,Friday,1
. . .

SQL over Hadoop tool
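To make the COO's question concrete, here is a rough sketch of the kind of query a SQL-over-Hadoop tool might run against the enriched records. Python's built-in `sqlite3` module stands in for that tool purely so the example is runnable; the table name `call_facts`, the column names, and the function `top_states` are all hypothetical, since the slide does not label the fields.

```python
import csv
import io
import sqlite3

# Sample enriched records copied from the slide.
SAMPLE = """\
3,2012,6,201,NJ,UNITED STATES,E,Friday,1
3,2012,6,513,OH,UNITED STATES,E,Friday,1
3,2012,6,850,FL,UNITED STATES,EC,Friday,1
3,2012,7,631,NY,UNITED STATES,E,Saturday,1
3,2012,6,650,CA,UNITED STATES,P,Friday,1
"""

# Top 10 states for outbound calls on Fridays, Saturdays and Sundays.
# Column names are assumptions about what each field means.
TOP_STATES_SQL = """
SELECT state, SUM(call_count) AS total_calls
FROM call_facts
WHERE day_name IN ('Friday', 'Saturday', 'Sunday')
GROUP BY state
ORDER BY total_calls DESC, state ASC
LIMIT 10
"""


def top_states(rows_text):
    """Load enriched CSV rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE call_facts (month INTEGER, year INTEGER, "
        "day_num INTEGER, area_code TEXT, state TEXT, country TEXT, "
        "time_zone TEXT, day_name TEXT, call_count INTEGER)"
    )
    # Skip blank lines so every inserted row has exactly nine fields.
    rows = [r for r in csv.reader(io.StringIO(rows_text)) if r]
    conn.executemany(
        "INSERT INTO call_facts VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows
    )
    result = conn.execute(TOP_STATES_SQL).fetchall()
    conn.close()
    return result
```

Calling `top_states(SAMPLE)` returns a list of `(state, total_calls)` pairs. With only the five sample rows every state has one call, so the alphabetical tie-break in the `ORDER BY` decides the order.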

Page 13:

Summary

EDW Optimization is a logical first use case for Hadoop:

• Tremendous cost savings

• Revenue enhancement potential

• Deliver value and gain experience with Hadoop

Benefits:

• Increased revenue…AND lowered costs

• Archive onto lower-cost storage platform…recover EDW operational headroom…lower storage costs

• Understand transactional context to gain deeper insight into customer behavior

Leverage the Ecosystem for Assistance

Costs

Time

Flexibility

Page 14:

Questions and Answers

Thank You! @robrosen3, 415-525-5555 (office), 925-998-4422 (mobile)
