Apache Hadoop and the Emergence of the Enterprise Data Hub

Eli Collins, Chief Technologist

©2014 Cloudera, Inc. All rights reserved.

Page 1:

Apache Hadoop and the Emergence of the Enterprise Data Hub

Eli Collins, Chief Technologist

Page 2:

The Enterprise Data Warehouse

[Diagram. Labels: Data Sources, Flat Files, Operational Store, Staging, EDW, Metadata, Summary, Facts & Dimensions, Archive, Data marts, Operational Store, Reporting, Analysis, Mining]

Page 3:

The Enterprise Data Hub

images, logs, binary, DB dumps

1. Inexpensive storage

2. Flexible storage

3. Co-located compute

4. Multiple compute engines

MR, Pig/Hive, SQL, Spark, SAS, R, Search, Graph...
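A minimal sketch of points 2 and 4 above, flexible storage read by more than one compute engine. Everything in it is an illustrative assumption rather than Hadoop code: `RAW_LOGS` stands in for raw files kept in the hub, the hypothetical `parse` helper applies a schema only at read time, a plain Python counting pass stands in for a MapReduce-style engine, and in-memory SQLite stands in for a SQL engine.

```python
import sqlite3
from collections import Counter

# Hypothetical raw log lines, stored as-is with no schema imposed at write time.
RAW_LOGS = [
    "2014-01-01 GET /index.html 200",
    "2014-01-01 GET /missing 404",
    "2014-01-02 POST /login 200",
    "2014-01-02 GET /index.html 200",
]


def parse(line):
    """Apply a schema at read time: (date, method, path, status)."""
    date, method, path, status = line.split()
    return date, method, path, int(status)


def count_by_status_mapreduce(lines):
    """Engine 1 (stand-in): map each line to its status, then reduce by counting."""
    mapped = (parse(line)[3] for line in lines)  # map: emit the status code
    return dict(Counter(mapped))                 # reduce: count per status code


def count_by_status_sql(lines):
    """Engine 2 (stand-in): run SQL over the same raw lines via in-memory SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hits (date TEXT, method TEXT, path TEXT, status INTEGER)"
    )
    conn.executemany(
        "INSERT INTO hits VALUES (?, ?, ?, ?)", (parse(line) for line in lines)
    )
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM hits GROUP BY status"
    ).fetchall()
    conn.close()
    return dict(rows)


if __name__ == "__main__":
    print(count_by_status_mapreduce(RAW_LOGS))
    print(count_by_status_sql(RAW_LOGS))
```

Both functions read the same unmodified lines and produce the same counts. That is the point the slide makes about running several engines against one copy of the data.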

Page 4:

So it’s Like a Data Warehouse?

...but it can store more data and more kinds of data, and it supports more flexible analysis. It’s open source and runs on industry-standard hardware, so it’s more economical.

Page 5:

An Analogy

Page 6:

What changed?

• The need?

• Convenience? Cost?

Page 7:

Take and share good photos

Page 8:

Data Warehouse vs. Data Hub


Enterprise Data Warehouse

Enterprise Data Hub

Page 9:

An Operating System

[Diagram. Labels: APP, APP, APP, 3rd PARTY APP, LIB, SERVICES, SCHEDULER, FILE SYSTEM, MGT]

Page 10:

An Enterprise Data Hub

BATCH PROCESSING, ANALYTIC SQL, SEARCH ENGINE, MACHINE LEARNING, STREAM PROCESSING, 3RD PARTY APPS

WORKLOAD MANAGEMENT

STORAGE FOR ANY TYPE OF DATA: UNIFIED, ELASTIC, RESILIENT, SECURE

Filesystem, Online NoSQL

DATA MANAGEMENT

SYSTEM MANAGEMENT

Page 11:

Data Warehousing with an EDH

[Diagram. Labels: Data Sources, Flat Files, Operational Store, EDH, EDW, Operational Store, Reporting, Analysis, Mining]

1. Stage, transform, archive

2. Reporting, Mining, Analysis

3. Exploratory, Discovery, Search, ML...
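A rough sketch of steps 1 and 2 above, assuming that staging, transformation, and archiving happen in the hub and that the cleaned rows are then handed to the warehouse for reporting. The transcript does not spell out that split, so treat it as an assumption. The sample extract, column names, and function names are all hypothetical, and plain Python stands in for whatever engines would do this work at scale.

```python
import csv
import io

# Hypothetical flat-file extract from an operational store.
RAW_EXTRACT = "order_id,amount,region\n1,10.50,east\n2,,west\n3,7.25,EAST\n"


def stage(raw_text):
    """Stage: parse the raw extract as-is into a list of string-valued rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(staged_rows):
    """Transform: drop rows missing an amount, cast types, normalize region."""
    clean = []
    for row in staged_rows:
        if not row["amount"]:
            continue  # skip incomplete records
        clean.append(
            {
                "order_id": int(row["order_id"]),
                "amount": float(row["amount"]),
                "region": row["region"].lower(),
            }
        )
    return clean


def run_pipeline(raw_text):
    """Step 1: stage, transform, archive. Returns (archive copy, rows for the EDW)."""
    archive = raw_text  # archive: keep the untouched raw copy
    edw_rows = transform(stage(raw_text))
    return archive, edw_rows


def total_by_region(edw_rows):
    """Step 2 (stand-in): a simple reporting aggregate over the cleaned rows."""
    totals = {}
    for row in edw_rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


if __name__ == "__main__":
    archive, edw_rows = run_pipeline(RAW_EXTRACT)
    print(total_by_region(edw_rows))
```

Keeping the raw extract alongside the cleaned rows leaves room for step 3: exploratory work can go back to the original data instead of only the transformed view.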

Page 12:

What about data warehousing on the enterprise data hub?

Page 14:

Data Warehousing in Cloudera’s EDH
