Building a Data Platform - Sas Institute · A new approach for building a Data Platform. What we...
Copyright © 2015, SAS Institute Inc. All rights reserved. #SASanz
Building a Data Platform
Kunal Taneja, Systems Engineer
Building A Data Platform with Hadoop and an Enterprise Data Hub
Kunal Taneja – Lead Solution Engineer ANZ
© Cloudera, Inc. All rights reserved.
Data Changes How We Work
Everything that can be measured will be measured.
Employees and customers expect more personal interactions, but not at the cost of their privacy.
The most innovative companies embrace experimentation and agility.
Instrumentation Consumerization Experimentation
Traditional Enterprise Data Platform
Source Systems
Enterprise Data Warehouse
BI Abstraction & Reporting Layer
Data Acquisition Layer
• Extraction & Staging
• Cleansing
ATOMIC Layer
• Normalisation & Storage
Performance & Access
• Transformation & Calculation
• Performance & Access
Dashboard & Reports Ad-hoc Analysis Mobile
ETL
Data Modelling
Traditional Architectures Under Pressure
Data Sources
Data Systems
Data Access
Business Analytics
Custom Applications
Existing Data
Databases
Operational Applications
New Data
Limited Data
• Not efficient to keep existing data, let alone handle new data sources.
• Time consuming to transform data for analysis in existing systems.
Limited Insights
• Power users struggle with data.
• Many users have no data.
Compliance and Privacy
• More data, more users, and more tools create complexity.
• Need to balance business agility with security and governance.
Time consuming transforms ..
Basel III Value at Risk (VaR): How far behind?
[Diagram: the traditional enterprise data platform shown earlier, annotated with the labels CDC, 3NF, STAR and LDM and with delays of 4 hours, 8 hours, 8 hours and 4 hours.]
Limited Data
Business: “Do we need to consider BlackBerry users part of our digital strategy?”
<xml?><device= iPhone>
…</xml>
[Diagram: the traditional enterprise data platform shown earlier, annotated with the labels CDC, 3NF, STAR and LDM and with lead times of 4 weeks, 8 weeks, 2 weeks and 4 weeks.]
Limited Data
Limited Insights (Not Only SQL)
A new approach for building a Data Platform
What we do: Copy Data to Applications
Process-centric businesses use:
• Structured data mainly
• Internal data only
• “Important” data only
• Multiple copies of data
What we should do: Bring Applications to Data
Information-centric businesses use all data:
• Multi-structured, internal & external data of all types
©2014 Cloudera, Inc. All rights reserved
Cloudera Enterprise Data Hub
Powered by Apache Hadoop
A new kind of data platform.
• One place for unlimited data
• Unified, multi-framework data access
Only with Cloudera:
• Leading performance
• Enterprise system and data management
• Fundamentally secure
• Open source, open standards
Security and Administration
Unlimited Storage
Process Discover Model Serve
Deployment Flexibility
• On-Premises, Appliances, Engineered Systems
• Public Cloud, Private Cloud, Hybrid Cloud
Hadoop and The Enterprise Data Hub
Open Source, Scalable, Flexible, and Cost-Effective ✔
Unified and Managed ✖ ✔
Open Architecture ✖ ✔
Secure and Governed ✖ ✔
CLOUDERA’S ENTERPRISE DATA HUB
• 3RD PARTY APPS (Many)
• BATCH PROCESSING (MR, Hive, Pig)
• INTERACTIVE SQL (Impala)
• SEARCH ENGINE (SOLR)
• MACHINE LEARNING (SPARK)
• STREAM PROCESSING (SPARK)
• WORKLOAD MANAGEMENT (YARN)
• STORAGE FOR ANY TYPE OF DATA: UNIFIED, ELASTIC, RESILIENT, SECURE (Sentry, Gazzang, Rhino)
• FILESYSTEM (HDFS)
• ONLINE NOSQL (HBASE)
• DATA MANAGEMENT (Navigator)
• SYSTEM MANAGEMENT (Cloudera Manager)
• DATA COLLECTION (Flume, Kafka, Sqoop, NFS)
The Most Complete Ecosystem
Data Systems
Enterprise Data Hub
Security and Administration
Unlimited Storage
Process Discover Model Serve
Applications
System Integration
Infrastructure
Operational Tools
More than 1,400 partners ensure compatibility with existing investments, lower skill barriers, and help maximize value from your data.
Basel II CVA across limited asset classes in batch?
Basel III CVA across 90% of instruments, end-of-day, intra-day, pre-deal?
What trade delays did I experience yesterday?
What trade delays did I experience in the last 60 seconds, and what can I expect in the future?
A policyholder has reported a car accident: provide a claim form?
Vet with black box whether the claimant meets all conditions?
What has the customer bought from us?
Are we about to lose this customer, and what can we do to prevent it?
A customer has phoned in an internet failure: how do we troubleshoot?
A network fault has been detected for an at-risk customer: please call?
Why Cloudera?
Enterprise-Grade Hadoop: Differentiated performance, security, management, and governance.
Expertise: No one knows Hadoop better than Cloudera.
Enablement: Support, Training, and Professional Services enable and deliver success.
Ecosystem: Cloudera ensures that Hadoop works with the platforms, tools, and integrators you rely on.
Sustainable Innovation: Our hybrid open source model delivers the benefits of open source and what the enterprise requires, while enabling us to invest in the future for our customers.