20150704 benchmark and user experience in sahara weiting
![Page 1: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/1.jpg)
Benchmarking and User Experience in Sahara
Weiting Chen
July 04 2015
![Page 2: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/2.jpg)
LEGAL DISCLAIMERS
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.
This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.
The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request.
© 2015 Intel Corporation.
![Page 3: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/3.jpg)
AGENDA
o Our Background
o Why Sahara
o Deployment Consideration
o Customer Experience
o The Future of Sahara
![Page 4: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/4.jpg)
BACKGROUND
![Page 5: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/5.jpg)
WHO WE ARE…
![Page 6: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/6.jpg)
ABOUT OUR TEAM
Exploring new opportunities in Big Data-as-a-Service (BDaaS)
o Researching possible BDaaS solutions
o Making BDaaS work better within IT infrastructure
o Moving the future of BDaaS forward
Focusing on Sahara in OpenStack
o Bring CDH into Sahara
o Create more features in Sahara
o Rank #1 in LOC, #3 in commits for Sahara contributions
![Page 7: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/7.jpg)
WHY SAHARA?
![Page 8: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/8.jpg)
FROM THE CUSTOMER NEEDS
o You or someone at your company is using a public Big Data application service such as AWS EMR. You need Sahara to migrate Big Data applications to your private cloud.
o You have multiple Hadoop clusters in your environment and would like to integrate them for better infrastructure utilization. You need Sahara to virtualize Hadoop into your cloud infrastructure.
o You have been using OpenStack as your IT cloud infrastructure for many years, and a Hadoop cluster is also running in your IT environment. You need Sahara to bring them together as a unified IT environment that is easier to maintain.
Source: OpenStack Vancouver Design Summit, "Benchmarking Sahara-based as a Service solution" by RedHat & Intel
![Page 9: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/9.jpg)
BETTER USER EXPERIENCE MEANS…
Data Scientists/Analysts
o An elastic way to run big data applications
Developers
o A custom big data infrastructure for different needs
Administrators/Operators
o A better way to maintain not only the hardware platform but also the software packages
Company
o Cost, cost, cost
![Page 10: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/10.jpg)
A COMPLEX BIG DATA SOLUTION
[Diagram: structured and unstructured data from different types of data sources feed a Big Data solution (components shown include Pig and ZooKeeper), with complexity in organizing data (ETL); the output is a diverse set of BI reports.]
![Page 11: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/11.jpg)
Deployment Consideration
![Page 12: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/12.jpg)
SAHARA ARCHITECTURE
![Page 13: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/13.jpg)
SAHARA DATA PROCESSING PATTERN
Pattern 1: Internal HDFS
[Diagram: a collect application collects data into a single OpenStack instance that runs both the Node Manager and the Data Node.]
OpenStack supports creating HDFS on Cinder volumes or on ephemeral disks. Ephemeral disks give better data processing performance; Cinder persists the data, but with lower performance.
Pros: Performance can be extremely fast (depending on the storage backend).
Cons: Data persistence may be a problem, because the data follows the life cycle of the virtual cluster.
![Page 14: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/14.jpg)
SAHARA DATA PROCESSING PATTERN
Pattern 2: External HDFS
[Diagram: a collect application collects data into OpenStack, where Instance 1 runs the Node Manager and Instance 2 runs the Data Node.]
You can also deploy the Node Manager and the HDFS Data Node on two different instances. This gives you more elasticity in managing your instances, for example when you want to save compute power by turning off the Node Manager instance.
Pros: Performance may be the same as Pattern 1, but you gain more flexibility to control your instances, save power, and persist your data in the Data Node.
Cons: A long-running cluster may still need another way to persist data.
![Page 15: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/15.jpg)
SAHARA DATA PROCESSING PATTERN
Pattern 3: Swift
[Diagram: a collect application collects data into Swift; an OpenStack instance running the Node Manager streams data from Swift.]
With Swift, data can be streamed from storage to Hadoop directly. Swift provides a way to store your data externally and solves the data persistence problem. Swift currently also supports a data locality feature.
Pros: Data is streamed directly, and the solution integrates with your existing Swift infrastructure.
Cons: Performance could be an issue compared with the HDFS-based patterns.
![Page 16: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/16.jpg)
Deployment Consideration
Cluster Deployment
o Service Deployment
Compute Engine Choice
o Baremetal, KVM, Docker, Hyper-V, vSphere, Xen
Storage Architecture
o Ephemeral Disk
o Persistent Volume
o Performance
o Cost
o Current IT Infrastructure
[Diagram: three layers. Cluster Deployment: a host running several instances, each with a Node Manager and data. Compute Engine: bare metal, KVM, container. Storage Infrastructure: ephemeral, block, and object storage.]
![Page 17: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/17.jpg)
Customer Experience
![Page 18: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/18.jpg)
Issue1 - Provisioning a Cluster Takes a Long Time
Problem Description:
o 10000+ jobs per day across several different workloads (some jobs run in seconds, some run for hours)
o It is hard to tell whether a job is small or large; that depends not only on data size but also on the job's logic
o Provisioning a cluster takes longer than running a small job that finishes in seconds, for example: launching a 4-node cluster takes 10+ minutes
Customer's Feedback:
o Finish jobs on time, with no need to worry about provisioning a cluster
Possible Solutions/Alternatives:
o Run jobs in an existing cluster (depends on the case)
o Run jobs in a public cluster using Resource ACL (will be supported in Liberty)
o Reduce the time needed to provision a cluster -> plugin specific
o Using Docker saves time when launching an instance, but time is still needed to launch services
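The cost of provisioning relative to job length can be made concrete with a little arithmetic. The sketch below is illustrative only: the 10-minute provisioning time comes from the slide, while the two job durations and the function name `provisioning_overhead` are hypothetical.

```python
def provisioning_overhead(provision_secs: float, job_secs: float) -> float:
    """Fraction of total wall-clock time spent provisioning rather than running."""
    total = provision_secs + job_secs
    if total <= 0:
        raise ValueError("total time must be positive")
    return provision_secs / total


PROVISION_SECS = 10 * 60  # "10+ mins" for a 4-node cluster, per the slide

# Hypothetical job durations: a 30-second job and a 2-hour job.
for job_secs in (30, 2 * 60 * 60):
    share = provisioning_overhead(PROVISION_SECS, job_secs)
    print(f"job={job_secs}s -> {share:.0%} of wall-clock time is provisioning")
```

For a job that runs in seconds, nearly all of the elapsed time is provisioning, which is why reusing an existing cluster is listed first among the alternatives.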
![Page 19: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/19.jpg)
Docker brings better boot time
10X boot time difference between Docker and KVM
![Page 20: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/20.jpg)
Docker also has the advantage when the instance is idle
[Chart: "Docker: Compute Node CPU (full test duration)". X axis: Time. Y axis: CPU Usage in Percent (0 to 80). Series: usr, sys. Averages: 0.54, 0.17.]
[Chart: "KVM: Compute Node CPU (full test duration)". X axis: Time. Y axis: CPU Usage in Percent (0 to 80). Series: usr, sys. Averages: 7.64, 1.4.]
Source: IBM, Boden Russell (Performance Characteristics of Traditional VMs vs Docker Containers)
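To put the two sets of averages side by side, the sketch below sums them per platform and takes the ratio. It assumes the two averages listed for each chart are the usr and sys series and that adding them gives a meaningful total CPU figure; that reading is an assumption, not something the slide states.

```python
# Average compute-node CPU usage (percent) as listed on the slide.
# Assumption: the two values per platform are the usr and sys series.
docker_avg = [0.54, 0.17]
kvm_avg = [7.64, 1.4]


def total_cpu(averages):
    """Sum the per-series averages into one CPU-usage figure (percent)."""
    return round(sum(averages), 2)


docker_total = total_cpu(docker_avg)
kvm_total = total_cpu(kvm_avg)
ratio = kvm_total / docker_total
print(f"Docker: {docker_total}%  KVM: {kvm_total}%  ratio: {ratio:.1f}x")
```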
![Page 21: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/21.jpg)
Issue2 - Complex Data Processing
Problem Description:
o A job usually runs multiple sub-jobs in a row, e.g. Job A -> Job B -> Job C, and job scheduling also needs to be supported
Customer's Feedback:
o Run a complex job to fulfill their use case
o Schedule a job using Sahara EDP
o Run a recurring job
Possible Solutions/Alternatives:
o Currently Sahara EDP only supports running a simple job
o Schedule a job -> BP: https://review.openstack.org/#/c/175719/
o Running a complex job -> Under discussion
o Running a recurring job -> Under discussion
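The "Job A -> Job B -> Job C" requirement amounts to running sub-jobs in order and stopping when one fails. The following is a minimal, generic sketch of that idea; it is not the Sahara EDP API, and `run_chain` and the job callables are hypothetical names used only for illustration.

```python
from typing import Callable, List, Tuple

# A job is a (name, callable) pair; the callable returns True on success.
Job = Tuple[str, Callable[[], bool]]


def run_chain(jobs: List[Job]) -> List[str]:
    """Run jobs in order and stop at the first failure.

    Returns the names of the jobs that completed successfully.
    """
    completed = []
    for name, run in jobs:
        if not run():
            break
        completed.append(name)
    return completed


chain = [("Job A", lambda: True), ("Job B", lambda: True), ("Job C", lambda: True)]
print(run_chain(chain))
```

A scheduler or a recurring trigger would wrap a call like `run_chain(chain)`; how that maps onto EDP is exactly what the slide lists as under discussion.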
![Page 22: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/22.jpg)
Issue3 - Storage Architecture
Problem Description:
o Our customers currently run a separate compute cluster (using Nova) and storage cluster (using Swift as the object store for data). This causes a performance issue: when compute and data sit on different nodes, every data transfer must pass through the network.
Customer's Expectation:
o Find a better solution that fulfills their requirements and integrates with their current storage architecture
Possible Solutions/Alternatives:
o Use internal HDFS -> Needs a way to copy data from Swift to internal HDFS
o Use the Swift data locality feature -> Must change their storage architecture
![Page 23: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/23.jpg)
Sort Workload Profile
Two phases of disk writes during a Sort run:
o Shuffle Map-Reduce data -> Uses the temp folder to store intermediate data (40% of total throughput)
o Write output -> HDFS write (60% of total throughput)
[Chart: disk I/O over time, with peaks while shuffling data using the temp folder and while writing output to HDFS/external storage.]
![Page 24: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/24.jpg)
Storage Consideration
1. Hadoop temp folder location
2. HDFS location
3. Data persistence
4. Integration with the current storage architecture, which usually uses shared storage in the cloud
5. Optimize storage for your workload
![Page 25: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/25.jpg)
Redundancy Issue when HDFS Runs over Ceph/GlusterFS
[Diagram: a compute cluster of instances (Instance1, Instance2, Instance3, ...), each running HDFS on Cinder volumes backed by a Ceph cluster. HDFS stores each data block (A, B, C) on three instances, and Ceph replicates each of those copies again.]
3 (in HDFS) x 3 (in Ceph) = 9 replicas in the Ceph cluster
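The multiplication on this slide generalizes to any pair of replication factors. The sketch below only restates that arithmetic; the function names and the 10 TB dataset size are hypothetical.

```python
def physical_copies(hdfs_replication: int, backend_replication: int) -> int:
    """Copies of each block stored when HDFS sits on a replicated backend."""
    return hdfs_replication * backend_replication


def raw_capacity_tb(data_tb: float, hdfs_replication: int, backend_replication: int) -> float:
    """Raw backend capacity consumed by a dataset of `data_tb` terabytes."""
    return data_tb * physical_copies(hdfs_replication, backend_replication)


# The slide's case: 3 replicas in HDFS on top of 3 replicas in Ceph.
print(physical_copies(3, 3))        # 9 copies of every block
# Hypothetical 10 TB dataset under the same settings.
print(raw_capacity_tb(10, 3, 3))    # raw terabytes consumed in the backend
```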
![Page 26: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/26.jpg)
Cinder Volume Instance Locality Support in Sahara
[Diagram: two compute hosts managed by Nova. Compute1 runs Instance1, Instance2, Instance3, ..., each with HDFS, and its local cinder-volume service holds their data on Volume1, Volume2 and Volume3. Compute2 runs Instance4, Instance5, Instance6, ..., each with HDFS, and its local cinder-volume service holds their data on Volume4, Volume5 and Volume6.]
![Page 27: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/27.jpg)
Swift Performance Issue
Performance impact from Swift:
o Swift overhead comes from the "Rename" method in Hadoop
o The "List Endpoint" feature has a huge impact
o Larger data sizes may widen the performance gap
[Diagram: VMs on one host reading from Swift on another host, compared with VMs running HDFS on the Nova instance store of their own host. Relative results shown: 1X, 1.25x overhead, 1.67x overhead.]
![Page 28: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/28.jpg)
Rename in Reduce Task
The output of the reduce function is written to a temporary location in HDFS. After the task completes, the output is automatically renamed from its temporary location to its final location.
ANALYSIS
o Object storage cannot support rename, so swiftfs uses "copy and delete" to implement the rename function.
o HDFS rename -> Changes metadata in the Name Node
o Swift rename -> Copies to a new object and deletes the old one in Swift
[Chart: comparison of local to swift, swift to swift, and local to hdfs, showing 1.5x overhead.]
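The difference between the two rename behaviours can be shown with a toy model. This is a simplified sketch, not HDFS or Swift code: the classes `NamespaceStore` and `ObjectStore`, the paths, and the 1 KiB payload are all hypothetical, and the model only counts how many bytes a rename has to copy.

```python
class NamespaceStore:
    """HDFS-like model: rename only updates the name -> data mapping."""

    def __init__(self):
        self.files = {}
        self.bytes_copied = 0

    def put(self, path, data):
        self.files[path] = data

    def rename(self, src, dst):
        # Metadata-only change; the data itself is not copied.
        self.files[dst] = self.files.pop(src)


class ObjectStore:
    """Swift-like model: no native rename, so rename is copy then delete."""

    def __init__(self):
        self.objects = {}
        self.bytes_copied = 0

    def put(self, name, data):
        self.objects[name] = data

    def rename(self, src, dst):
        data = self.objects[src]
        self.objects[dst] = bytes(data)   # copy to the new object
        self.bytes_copied += len(data)
        del self.objects[src]             # delete the old object


payload = b"x" * 1024
hdfs_like, swift_like = NamespaceStore(), ObjectStore()
for store in (hdfs_like, swift_like):
    store.put("_temporary/part-00000", payload)
    store.rename("_temporary/part-00000", "output/part-00000")
    print(type(store).__name__, "bytes copied on rename:", store.bytes_copied)
```

In the object-store model the cost of the rename grows with the size of the reduce output, which is consistent with the slide's remark that larger data sizes may widen the gap.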
![Page 29: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/29.jpg)
Issue4 - Scaling a Cluster
Problem Description:
o Customers have found several issues when scaling a cluster and would like to ask the community to improve the experience
Customer's Expectation:
o Rebalance HDFS after scaling
o Auto-scale a cluster on request (e.g. by job size)
Possible Solutions/Alternatives:
o Rebalance HDFS -> BP: https://blueprints.launchpad.net/sahara/+spec/hdfs-rebalance
o Auto-scaling -> Needs to be discussed
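Why rebalancing matters after scaling can be sketched with a simple utilization check: a freshly added data node starts empty while the old ones stay full. The code below is an illustration only, loosely modeled on the idea of a utilization threshold around the cluster average; the function name, node names, capacities, and the 10 percent threshold are all assumptions.

```python
def nodes_needing_rebalance(used_gb, capacity_gb, threshold_pct=10.0):
    """Return nodes whose utilization differs from the cluster average
    by more than `threshold_pct` percentage points.

    `used_gb` and `capacity_gb` map node name -> gigabytes.
    """
    cluster_util = 100.0 * sum(used_gb.values()) / sum(capacity_gb.values())
    flagged = {}
    for node, capacity in capacity_gb.items():
        util = 100.0 * used_gb[node] / capacity
        if abs(util - cluster_util) > threshold_pct:
            flagged[node] = util
    return flagged


# Hypothetical cluster: three data nodes at 80% plus one newly added, empty node.
capacity = {"dn1": 1000, "dn2": 1000, "dn3": 1000, "dn4-new": 1000}
used = {"dn1": 800, "dn2": 800, "dn3": 800, "dn4-new": 0}
print(nodes_needing_rebalance(used, capacity))
```

Here the cluster average drops to 60% after scaling, so every node is outside the threshold until data is moved onto the new node.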
![Page 30: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/30.jpg)
Issue5 - OpenStack Version Support
Problem Description:
o New features usually land only in a new release, but customers would like to use them in an older environment
o Some new features are not accepted for backport to an older release
Customer's Expectation:
o Customers would like to use new features available in Kilo or later versions of OpenStack
Possible Solutions/Alternatives:
o Rolling upgrade from Juno to Kilo
o Use only Sahara and Horizon from Kilo and keep the other OpenStack projects on Juno -> We have not tried this
o In the future, plugins will be backward compatible, so that plugins can be separated from Sahara
![Page 31: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/31.jpg)
The Future of Sahara
![Page 32: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/32.jpg)
What's New in Kilo
o Vanilla plugin supports Hadoop 1.2.1 and Hadoop 2.6
o Spark Plugin
o Cloudera CDH Plugin
o MapR Plugin
o Storm Plugin
o New Horizon UI with a Guide Panel
o Default Template Support
![Page 33: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/33.jpg)
The Future of Sahara
o Sahara EDP is the focus for processing data flows
o Support more data sources and storage architectures
o Support more Big Data projects
o Integrate with other OpenStack projects
  o Baremetal -> Ironic
  o Docker -> Magnum
  o Application Catalog -> Murano
![Page 34: 20150704 benchmark and user experience in sahara weiting](https://reader035.fdocuments.us/reader035/viewer/2022062515/55cd82eabb61eb04768b47a9/html5/thumbnails/34.jpg)