BUSINESS DATA LAKE - Dell EMC · BUSINESS DATA LAKE EMC BIG DATA STORAGE ... EMC ISILON SCALE-OUT...
© Copyright 2016 EMC Corporation. All rights reserved.
BUSINESS DATA LAKE
FADI FAKHOURI, SR. SYSTEMS ENGINEER, ISILON SPECIALIST
UNSTRUCTURED DATA GROWTH
Total Capacity Shipped, Worldwide, and % of Unstructured Data (Source: IDC)
2015: 71 EB, 75% unstructured
2016: 106 EB, 78% unstructured
2017: 133 EB, 80% unstructured
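As a rough worked example, the IDC figures above can be turned into year-over-year growth rates and approximate unstructured volumes. This is a minimal sketch only: it assumes the three percentages (75%, 78%, 80%) pair with 2015, 2016, and 2017 in that order, and the function names are illustrative rather than taken from the slides.

```python
# Figures from the slide (Source: IDC). The pairing of percentages with years
# is an assumption based on the order in which they appear.
SHIPPED_EB = {2015: 71, 2016: 106, 2017: 133}
UNSTRUCTURED_SHARE = {2015: 0.75, 2016: 0.78, 2017: 0.80}


def yoy_growth(series):
    """Return {year: fractional growth versus the previous year}."""
    years = sorted(series)
    return {
        year: (series[year] - series[prev]) / series[prev]
        for prev, year in zip(years, years[1:])
    }


def unstructured_eb(shipped, share):
    """Return {year: exabytes of unstructured data}, i.e. shipped capacity times share."""
    return {year: shipped[year] * share[year] for year in shipped}


if __name__ == "__main__":
    for year, growth in yoy_growth(SHIPPED_EB).items():
        print(f"{year}: {growth:.1%} growth in shipped capacity")
    for year, eb in unstructured_eb(SHIPPED_EB, UNSTRUCTURED_SHARE).items():
        print(f"{year}: about {eb:.1f} EB unstructured")
```

Under these assumptions, shipped capacity grows faster from 2015 to 2016 than from 2016 to 2017, while the unstructured share keeps rising.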
SCALE-OUT DATA LAKES MAKE ANALYTICS EFFICIENT
A DATA LAKE LETS YOU BRING YOUR ANALYTICS TOOLS TO YOUR DATA. DATA IS SHARED BETWEEN PROJECTS WITH CENTRALIZED CONTROL & STANDARDIZED ANALYTICS TOOLING.
Ingest: Capture data from a wide range of sources, traditional and new.
Store: Store everything in one environment for cross data-set analysis.
Analyze: Use advanced algorithms to discover new, predictive patterns.
Surface: Share insight with business domain experts.
Act: Build data-driven applications that meet business needs.
THE BUSINESS DATA LAKE JOURNEY
START: DISPARATE DATA SILOS
STEP 1: CONSOLIDATE DATA
STEP 2: ADD HADOOP
STEP 3: IMPLEMENT ANALYTICS
STEP 4: INTEGRATE APP DEV
STEP 5: BUSINESS DATA LAKE
EMC BIG DATA STORAGE
Install Hadoop (OR multiple distros)
Implement Analytics (OR ALT. TOOLS)
PIVOTAL CLOUD FOUNDRY
BIG DATA VISION WORKSHOP
PROOF OF VALUE
CONSULTING FAST TRACK
TECHNOLOGY
APPS
ANALYTICS
EMC ENGINEERED SOFTWARE
SERVICE PROVIDER
ENTERPRISE DATA CENTER
A UNIQUE FEDERATION OF COMPANIES
DELIVERING THE SOFTWARE-DEFINED ENTERPRISE. SOLUTIONS & CHOICE
BIG DATA SOLUTIONS
PLATFORM AS A SERVICE
AGILE APPLICATION DEVELOPMENT
ENTERPRISE MOBILITY
SOFTWARE-DEFINED DATA CENTER
INFORMATION INFRASTRUCTURE
CONVERGED INFRASTRUCTURE
PLATFORM AS A SERVICE
VIRTUAL WORKSPACE
BUSINESS DATA LAKE
SECURITY ANALYTICS
SOFTWARE DEFINED DATA CENTER
Partners
vCloud Hybrid Service
ADVANCED SECURITY
Unstructured Data/Content
Data Lake Use Cases
Archive/Compliance
VMware / Info Archive
application retirement
File Shares and Home directories
Cloud/Object
Video/Surveillance
Hadoop - Big Data
Mobile Apps
Call Centre CVR
Splunk/M2M Log Files
SQL/DB Dumps
Broadcast/Content Streaming
Backup
VDI
BLOBS
Social Media Feeds
EDW – ETL Offload
HPC / Genomic Sequencing
NEXT-GEN ACCESS METHODS
FILE
FILE
HPC
Backup/Archive
Analytics
Mobile
File Shares
Cloud Apps
ISILON DATA LAKE – ENTERPRISE GRADE FEATURES
EMC ISILON SCALE-OUT NAS
DATA PROTECTION
DATA SECURITY
PERFORMANCE MANAGEMENT
DATA MANAGEMENT
Isilon Data Lake
S-Series X-Series
NL-Series HD-Series
S-Series X-Series
NL-Series HD-Series
Isilon CloudPools
3RD PLATFORM CLOUD INNOVATION
S-Series: Performance
X-Series: Throughput
NL-Series: Archive
HD-Series: Deep archive
Cloud: Cold archive
[Chart axes: Capacity; $/TB, High to Low]
EXPANDED DATA LAKE
FROM EDGE TO CORE TO CLOUD
EDGE CORE CLOUD
EXPAND DATA LAKE TO THE EDGE… AND TO THE CLOUD
>7000 Customers
>20% YoY Growth
#1 Data Lake
#1 Scale-Out NAS Market Leader
>2000 Big Data Analytics Customers
#1 Hadoop Shared Storage
EMC ISILON BUSINESS MOMENTUM
ISILON ONEFS OPERATING SYSTEM
Single Volume/File System
Unmatched Efficiency
Simplicity & Ease of Use
Linear Scalability
Easy Growth
High Performance
In-place analytics
• Native integration speeds time to insight
Enterprise data protection
• Fast snapshots, backup, and data recovery
• Simple, efficient data replication for disaster recovery
Lower costs
• Eliminates the need for dedicated Hadoop infrastructure
• Much more efficient than a DAS-based approach
Increase flexibility
• Simultaneous support for any Apache-compliant Hadoop distribution
• Ambari integration for management, monitoring, and provisioning
THE ISILON ADVANTAGE FOR HADOOP
SCALE-OUT STORAGE WITH NATIVE HADOOP INTEGRATION
HADOOP ARCHITECTURE – DAS VS ISILON
[Diagram. DAS: a NameNode and six combined Data Node + Compute Node servers on Ethernet. Isilon: six Compute Nodes on Ethernet, with storage nodes labeled name node and data node.]
Traditional “Share-Nothing” Hadoop
Existing Virtualized Data Center
SHARE-NOTHING Hadoop Infrastructure
Unstructured Data
Existing Primary Storage
[Diagram labels: 1 on existing primary storage; 2 3 4 repeated across the Hadoop nodes]
• Hadoop on a Stick (R=3) means 5 data copies ($$$$)
• Data has to copy to the Hadoop cluster before analysis can begin (Time to Results)
How will you maintain data consistency when a file changes on your primary storage?
Existing Virtualized Data Center
Existing Primary Storage
Isilon “Share-Everything” Hadoop
• Start using Hadoop NOW with unused processing and RAM available in your VMware environment
• No replication required (use your existing data)
• Access to same data via NAS and HDFS protocols
• Time to results is extremely fast using already existing data, with NO COPIES or wasted $$$$
• Analysis can begin with the 1st VM
[Diagram labels: New Hadoop Compute Nodes; Unstructured Data; Use Native HDFS Protocol]
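The point about reaching the same data over NAS and HDFS protocols can be illustrated with a small sketch that maps a file path seen through an NFS mount to an HDFS URI for the same file. The mount point, host name, and port below are hypothetical, and the sketch assumes the NFS export and the HDFS root point at the same directory; the actual cluster configuration would need to be checked before relying on such a mapping.

```python
from pathlib import PurePosixPath

# Hypothetical values for illustration only; they do not come from the slides.
NFS_MOUNT = PurePosixPath("/mnt/isilon")            # where a client mounts the NFS export
HDFS_URI_PREFIX = "hdfs://isilon.example.com:8020"  # assumed HDFS endpoint of the same cluster


def nfs_to_hdfs_uri(nfs_path: str) -> str:
    """Translate a path under the NFS mount into the matching HDFS URI.

    Assumes the NFS export and the HDFS root refer to the same directory,
    so only the mount prefix has to be swapped. Raises ValueError when the
    path is not under the mount point.
    """
    relative = PurePosixPath(nfs_path).relative_to(NFS_MOUNT)
    return f"{HDFS_URI_PREFIX}/{relative.as_posix()}"


if __name__ == "__main__":
    print(nfs_to_hdfs_uri("/mnt/isilon/logs/2016/app.log"))
```

Because both names refer to one stored file, a job could read through HDFS what an application wrote through NFS, with no copy step in between.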
HADOOP JOB CYCLE TIMES: TIME TO RESULTS
Traditional Hadoop + DAS Workflow
Original Data -> Stage Data -> Map Data -> Reduce Data -> Write Results -> Copy Results -> View Results
Additional data handling: Ingest Data into HDFS (3x Mirror), Delete data from HDFS
VERSUS
Isilon Enabled Hadoop Workflow
Original Data -> Acquire -> Map Data -> Reduce Data -> Write Results -> View Results
Data handling not required on Isilon
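The difference between the two workflows can be sketched by listing their steps and checking which ones drop out. The step names come from the slide, but the position of the ingest and delete steps within the traditional sequence is an assumption, since the slide layout does not make the exact order recoverable.

```python
# Step names are taken from the slide; the exact ordering of the traditional
# workflow (where ingest and delete happen) is assumed for illustration.
TRADITIONAL_DAS = [
    "Original Data",
    "Stage Data",
    "Ingest Data into HDFS (3x Mirror)",
    "Map Data",
    "Reduce Data",
    "Write Results",
    "Copy Results",
    "Delete Data from HDFS",
    "View Results",
]

ISILON_ENABLED = [
    "Original Data",
    "Acquire",
    "Map Data",
    "Reduce Data",
    "Write Results",
    "View Results",
]


def steps_removed(before, after):
    """Return the steps of `before` that do not appear in `after`, in order."""
    kept = set(after)
    return [step for step in before if step not in kept]


if __name__ == "__main__":
    for step in steps_removed(TRADITIONAL_DAS, ISILON_ENABLED):
        print("Not needed with shared storage:", step)
```

Every step that disappears is a data-movement step (staging, ingesting, copying, deleting), which matches the slide's note that data handling is not required on Isilon.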
Reusable and extensible to:
[Diagram: the same data is reached over SMB, NFS, HTTP, FTP, and HDFS; storage nodes are labeled name node and data node and reply to requests; six MapReduce applications run against the shared data]
SUPPORT FOR MULTIPLE ANALYTICS APPLICATIONS
HADOOP WITH ISILON SCALE-OUT NAS STORAGE
1. Multi-Protocol Scale-Out Storage Platform
– NFS, CIFS, FTP, HTTP, HDFS
2. Highly Resilient, Predictable Scalability
– Distributed NameNode & DataNode
3. Enterprise Data Protection & Governance
– SnapshotIQ, SyncIQ, SmartLock, ACLs...
4. Industry-Leading Storage Efficiency
– >80% Storage Utilization
5. Independent Scalability with Optimized QoS
– Optimally Scale Storage & Compute
6. Consolidate Data Silos
– Industry Standard Protocols
– Bring Applications to Shared Data
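To make the storage-efficiency claim concrete, the sketch below compares the raw capacity needed to hold a given amount of data under 3x HDFS mirroring (as on a DAS-based cluster) with the raw capacity needed at 80% storage utilization (the lower bound quoted above). It is a back-of-the-envelope calculation under those two stated figures; the function names are illustrative, and it ignores temporary job space, snapshots, and other overheads.

```python
def raw_tb_with_replication(data_tb: float, copies: int = 3) -> float:
    """Raw capacity when every block is stored `copies` times (HDFS-style mirroring)."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return data_tb * copies


def raw_tb_with_utilization(data_tb: float, utilization: float = 0.80) -> float:
    """Raw capacity when `utilization` of the raw space holds user data."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return data_tb / utilization


if __name__ == "__main__":
    data_tb = 100.0  # example data set size, chosen arbitrarily
    das = raw_tb_with_replication(data_tb)
    shared = raw_tb_with_utilization(data_tb)
    print(f"3x mirroring:    {das:.0f} TB raw")
    print(f"80% utilization: {shared:.0f} TB raw")
    print(f"Ratio:           {das / shared:.1f}x")
```

The ratio does not depend on the data set size: under these assumptions 3x mirroring needs 2.4 times the raw capacity of a system running at 80% utilization.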
Simple to manage: Single file system, single volume, global namespace
Massively scalable: From 16 TB to over 50 PB in a single cluster, or to Cloud-scale
Unmatched efficiency: Over 80% storage utilization, automated tiering and SmartDedupe
Enterprise data protection: Efficient backup and disaster recovery, and N+1 through N+4 redundancy
Robust security and compliance options: RBAC, WORM, SEDs, auditing, STIG, FIPS, CAC/PIV
Operational flexibility: Multi-protocol support as well as Object and OpenStack Swift
Deployment flexibility: Edge to Core to Cloud
ISILON SCALE-OUT NAS