EC2 Performance, Spot Instance ROI and EMR Scalability
![Page 1: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/1.jpg)
EC2 PERFORMANCE, SPOT INSTANCE ROI AND EMR SCALABILITY
Jesse Anderson
![Page 2: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/2.jpg)
AMAZON WEB SERVICES (AWS)
- Elastic Compute Cloud (EC2): virtual machines in the cloud
- Simple Storage Service (S3): network share in the cloud
- Elastic MapReduce (EMR): cluster of EC2 instances running Hadoop
![Page 3: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/3.jpg)
EC2 PRICE TYPES
- Spot Instances
  - System for bidding on unused instances
  - Same performance
  - Go away (abruptly) if outbid
- On Demand
  - Ad hoc starting
- Reserved
  - Not covered here
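Because a Spot Instance can disappear abruptly when outbid, long-running work benefits from periodic checkpoints. The following is a minimal sketch of that idea, not the method used in the talk: `run_with_checkpoints`, `save_checkpoint`, the JSON checkpoint format, and the `every` interval are all hypothetical names and choices made for illustration.

```python
import json
import os
import tempfile


def save_checkpoint(path, next_item):
    """Atomically record the index of the next item to process."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file, then rename, so a kill mid-write
    # cannot leave a half-written checkpoint behind.
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"next_item": next_item}, f)
    os.replace(tmp, path)


def run_with_checkpoints(total_items, checkpoint_path, process_item, every=100):
    """Process items 0..total_items-1, resuming from the last checkpoint.

    Returns the index the run started from (0 on a fresh start).
    """
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["next_item"]
    for i in range(start, total_items):
        process_item(i)
        # Checkpoint every `every` items and once more at the very end.
        if (i + 1) % every == 0 or i + 1 == total_items:
            save_checkpoint(checkpoint_path, i + 1)
    return start
```

Items processed after the last checkpoint are repeated on restart, so `process_item` should be safe to run twice on the same item. On a real Spot Instance the checkpoint would also need to live somewhere that survives the instance, such as S3.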
![Page 4: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/4.jpg)
SPOT INSTANCE SAVINGS
![Page 5: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/5.jpg)
MILLION MONKEYS PROJECT
- Randomly recreated Shakespeare
- Open source
- Good metric for CPU and memory
![Page 6: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/6.jpg)
EC2 SPECIFICATIONS
| Instance Name | Memory | EC2 Compute Units (ECU) / Cores | Platform | I/O Performance |
|---|---|---|---|---|
| Small | 1.7 GB | 1 ECU on 1 core | 32-bit | Moderate |
| Large | 7.5 GB | 4 ECU on 2 cores | 64-bit | High |
| Extra Large | 15 GB | 8 ECU on 8 cores | 64-bit | High |
| High-CPU Medium | 1.7 GB | 5 ECU on 2 cores | 32-bit | Moderate |
| High-CPU Large | 7 GB | 20 ECU on 8 cores | 64-bit | High |
| Quad XL | 23 GB | 33.5 ECU on 8 cores | 64-bit | Very High |

EC2 Compute Unit (ECU) – One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
![Page 7: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/7.jpg)
EC2 PERFORMANCE
For comparison, my 2.66 GHz Core 2 Duo did 50,000,000,000 character groups
![Page 8: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/8.jpg)
EC2 COST PER HOUR ON DEMAND/SPOT
![Page 9: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/9.jpg)
PRICE PER UNIT
![Page 10: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/10.jpg)
EMR (HADOOP) CLUSTERING
- Tests of 1-, 2-, 3-, 4-, 5-, 10- and 20-node clusters
- Price
- Scalability
![Page 11: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/11.jpg)
EMR COST
![Page 12: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/12.jpg)
PRICE PER UNIT IN A CLUSTER
![Page 13: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/13.jpg)
CLUSTERED CHARACTER GROUPS
![Page 14: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/14.jpg)
EMR/HADOOP SCALABILITY PERCENTAGE
![Page 15: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/15.jpg)
EMR/HADOOP SCALABILITY ABSOLUTE
![Page 16: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/16.jpg)
BREAKDOWNS
- Original project would have run in 3 days 9 hours
  - Took 1.5 months before
- 20-node cluster costs $45.44 per day
  - 5-day run cost $317
  - 11-day run cost $528
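As a rough check on these figures, the arithmetic can be sketched in a few lines of Python. The 30-day month is an assumption made here to convert "1.5 months" into hours. The sketch uses only the per-day figure and does not attempt to reproduce the 5-day and 11-day totals.

```python
HOURS_PER_DAY = 24

# Run time on the 20-node cluster: 3 days 9 hours.
cluster_hours = 3 * HOURS_PER_DAY + 9

# Original run time: 1.5 months (assumption: 30-day month).
original_hours = 1.5 * 30 * HOURS_PER_DAY

speedup = original_hours / cluster_hours

# Per-node hourly cost implied by $45.44 per day for 20 nodes.
nodes = 20
cost_per_day = 45.44
cost_per_node_hour = cost_per_day / (nodes * HOURS_PER_DAY)

print(f"speedup: {speedup:.1f}x")
print(f"cost per node-hour: ${cost_per_node_hour:.4f}")
```

Under that assumption the cluster run is roughly 13 times faster than the original, at a little under ten cents per node-hour.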
![Page 17: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/17.jpg)
ENGINEERING FOR THE CLOUD
- Establish if a good fit
- Test the EC2 performance
- Figure out a unit or widget
- Find the most cost-efficient EC2 performer with price per unit/widget
- Engineer with Spot Instances in mind
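The "price per unit/widget" step above can be sketched as a small comparison: divide each instance type's hourly price by the units of work it completes per hour, then pick the minimum. The `Candidate` class, the instance names, and every number below are placeholders for illustration, not measurements from the talk.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    price_per_hour: float   # USD per hour (placeholder values below)
    units_per_hour: float   # measured units of work completed per hour


def price_per_unit(candidate):
    """Cost in USD of completing one unit of work on this instance type."""
    return candidate.price_per_hour / candidate.units_per_hour


def cheapest(candidates):
    """Return the candidate with the lowest price per unit."""
    return min(candidates, key=price_per_unit)


# Placeholder numbers only; substitute your own benchmark results.
candidates = [
    Candidate("instance-a", 0.10, 1_000),
    Candidate("instance-b", 0.40, 5_000),
    Candidate("instance-c", 1.60, 12_000),
]

best = cheapest(candidates)
print(best.name, f"${price_per_unit(best):.6f} per unit")
```

With these placeholder values the mid-priced instance wins, which illustrates why the cheapest hourly rate and the fastest machine are not necessarily the most cost-efficient choice.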
![Page 18: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/18.jpg)
CONCLUSIONS
- Spot Instance savings
  - From $2.20 to $1.30 per hour
  - Saved $1,000 in one run
- Hadoop/EMR scalability
  - 95% efficiency at 2-5 nodes
  - 87% efficiency at 10 nodes
  - 84% efficiency at 20 nodes
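To put these numbers in context, here is a short sketch of the underlying arithmetic. The slides do not define "efficiency"; the definition used below (cluster throughput divided by node count times single-node throughput) is an assumption, and the function names are hypothetical.

```python
def scaling_efficiency(nodes, cluster_throughput, single_node_throughput):
    """Assumed definition: fraction of ideal linear scaling achieved."""
    return cluster_throughput / (nodes * single_node_throughput)


def effective_nodes(nodes, efficiency):
    """Number of ideal single nodes the cluster is equivalent to."""
    return nodes * efficiency


# Spot savings from the hourly prices quoted above.
on_demand_price = 2.20
spot_price = 1.30
savings_fraction = (on_demand_price - spot_price) / on_demand_price

print(f"spot savings: {savings_fraction:.1%}")
for n, eff in [(10, 0.87), (20, 0.84)]:
    print(f"{n} nodes at {eff:.0%} efficiency ~ {effective_nodes(n, eff):.1f} ideal nodes")
```

Under that definition, the drop from $2.20 to $1.30 is a saving of about 41% per hour, and a 20-node cluster at 84% efficiency does the work of about 16.8 ideal nodes.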
![Page 19: EC2 Performance, Spot Instance ROI and EMR Scalability](https://reader033.fdocuments.us/reader033/viewer/2022061206/5482b457b4af9f7d148b4583/html5/thumbnails/19.jpg)
MORE INFORMATION
http://www.jesse-anderson.com/2012/02/ec2-performance-spot-instance-roi-and-emr-scalability/
@jessetanderson