Big Data Use Cases and Solutions in the AWS Cloud
Transcript of Big Data Use Cases and Solutions in the AWS Cloud
![Page 1: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/1.jpg)
© 2014 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
Big Data Use Cases and
Solutions in the AWS Cloud
Ben Butler, @bensbutler, Sr. Mgr., Big Data & HPC
July 10, 2014
![Page 2: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/2.jpg)
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
![Page 3: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/3.jpg)
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
![Page 4: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/4.jpg)
Big Data: Unconstrained data growth
95% of the 1.2 zettabytes
of data in the digital
universe is unstructured
70% of this is user-generated content
Unstructured data growth is explosive, with an estimated
compound annual growth rate (CAGR) of 62%
Source: IDC
[Chart scale: GB, TB, PB, EB, ZB]
![Page 5: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/5.jpg)
The amount of information generated during the first day of
a baby’s life today is equivalent to 70 times the information
contained in the Library of Congress
![Page 6: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/6.jpg)
Lower cost, higher throughput
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
![Page 7: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/7.jpg)
Highly constrained
Lower cost, higher throughput
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
![Page 8: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/8.jpg)
Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011
IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
[Chart: data volume, 1990 to 2020, showing the gap between generated data and data available for analysis]
![Page 9: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/9.jpg)
Elastic and highly scalable
+ No upfront capital expense
+ Only pay for what you use
+ Available on-demand
= Remove constraints
![Page 10: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/10.jpg)
Accelerated
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
![Page 11: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/11.jpg)
Big Data
Technologies and techniques for working
productively with data, at any scale.
![Page 12: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/12.jpg)
Big data and AWS Cloud computing
Big data: variety, volume, and velocity requiring new tools
Cloud computing: variety of compute, storage, and networking options
![Page 13: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/13.jpg)
Big data and AWS Cloud computing
Big data: potentially massive datasets
Cloud computing: massive, virtually unlimited capacity
![Page 14: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/14.jpg)
Big data and AWS Cloud computing
Big data: iterative, experimental style of data manipulation and analysis
Cloud computing: iterative, experimental style of infrastructure deployment/usage
![Page 15: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/15.jpg)
Big data and AWS Cloud computing
Big data: frequently not a steady-state workload; peaks and valleys
Cloud computing: at its most efficient with highly variable workloads
![Page 16: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/16.jpg)
Big data and AWS Cloud computing
Big data: absolute performance is not as critical as “time to results”; shared resources are a bottleneck
Cloud computing: parallel compute projects allow each workgroup to have more autonomy and get faster results
![Page 17: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/17.jpg)
One tool to
rule them all
![Page 18: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/18.jpg)
Use the right tools
Amazon S3
Amazon Kinesis
Amazon DynamoDB
Amazon Redshift
Amazon Elastic MapReduce
![Page 19: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/19.jpg)
Amazon S3
Store anything
Object storage
Scalable
99.999999999% durability
![Page 20: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/20.jpg)
Amazon Kinesis
Real-time processing
High throughput; elastic
Easy to use
Integrations: EMR, S3, Redshift, DynamoDB
![Page 21: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/21.jpg)
Amazon DynamoDB
NoSQL database
Seamless scalability
Zero admin
Single-digit millisecond latency
![Page 22: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/22.jpg)
Amazon Redshift
Relational data warehouse
Massively parallel
Petabyte scale
Fully managed
$1,000/TB/Year
![Page 23: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/23.jpg)
Try Amazon Redshift with BI & ETL for Free!
aws.amazon.com/redshift/free-trial
2 months | 750 hours/month | dw2.large SSD instance
160GB of compressed storage per node
Try BI & ETL for free from nine partners at
aws.amazon.com/redshift/partners
![Page 24: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/24.jpg)
Amazon Elastic MapReduce
Hadoop/HDFS clusters
Hive, Pig, Impala, HBase
Easy to use; fully managed
On-demand and spot pricing
Tight integration with S3, DynamoDB, and Kinesis
![Page 25: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/25.jpg)
ODBC and JDBC drivers now for Amazon EMR
Amazon EMR now ships with ODBC and JDBC drivers for Hive, Impala, and HBase
Easier to use popular BI tools like Microsoft Excel, Tableau, MicroStrategy, and QlikView
![Page 26: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/26.jpg)
The right tools.
At the right scale.
At the right time.
![Page 27: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/27.jpg)
HDFS
Amazon EMR
![Page 28: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/28.jpg)
HDFS
Amazon S3
Amazon DynamoDB
Amazon EMR
AWS Data Pipeline
![Page 29: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/29.jpg)
HDFS
Amazon S3
Amazon DynamoDB
Amazon EMR
Amazon Kinesis
AWS Data Pipeline
Data sources
![Page 30: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/30.jpg)
HDFS
Amazon S3
Amazon DynamoDB
Amazon EMR
Amazon Kinesis
AWS Data Pipeline
Data sources
Data management
Hadoop ecosystem analytical tools
![Page 31: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/31.jpg)
HDFS
Amazon Redshift
Amazon RDS
Amazon S3
Amazon DynamoDB
Amazon EMR
Amazon Kinesis
AWS Data Pipeline
Data management
Hadoop ecosystem analytical tools
Data sources
![Page 32: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/32.jpg)
HDFS
Amazon Redshift
Amazon RDS
Amazon S3
Amazon DynamoDB
Amazon EMR
Amazon Kinesis
AWS Data Pipeline
Data management
Hadoop ecosystem analytical tools
Data sources
AWS Data Pipeline
![Page 33: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/33.jpg)
Free steak campaign
Disaster recovery
Web site & media sharing
Facebook app
Ground campaign
SAP & SharePoint
Marketing web site
Business line of sight
Consumer social app
IT operations
Mars exploration ops
Interactive TV apps
Media streaming
Consumer social app
Facebook page
Securities Trading Data Archiving
Financial markets analytics
Web and mobile apps
Big data analytics
Digital media
Ticket pricing optimization
Streaming webcasts
Mobile analytics
Consumer social app
Core IT and media
![Page 34: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/34.jpg)
Customer Use Cases of Big Data
![Page 35: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/35.jpg)
![Page 36: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/36.jpg)
Dropcam is the biggest inbound video service
on the Web
More data uploaded per
minute than YouTube
Petabytes of data
processed every month
Billions of motion events
detected
![Page 37: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/37.jpg)
![Page 38: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/38.jpg)
4 months to production
300% speed gain
$500k - $1M in CAPEX saved
![Page 39: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/39.jpg)
![Page 40: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/40.jpg)
![Page 41: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/41.jpg)
![Page 42: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/42.jpg)
![Page 43: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/43.jpg)
![Page 44: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/44.jpg)
![Page 45: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/45.jpg)
Cost & Scale
500MM tweets/day = ~20.8MM tweets/hr
2k/tweet is ~12MB/sec, need 6 shards, ~1TB/day
$0.015/hour per shard, $0.028/million PUTS
Kinesis cost is $0.765/hour
Redshift cost is $0.850/hour (for a 2TB dw1.xlarge)
Total: $1.615/hour
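The sizing arithmetic on this slide can be reproduced with a short sketch. The prices are the 2014 figures quoted above; reading "2k/tweet" as 2,000 bytes and charging one PUT per tweet are assumptions, and the function names are illustrative only. The shard count is left as a parameter so its effect on the hourly cost is visible.

```python
# Figures quoted on the slide (2014 prices); all names are illustrative.
TWEETS_PER_DAY = 500_000_000
BYTES_PER_TWEET = 2_000            # assumption: "2k/tweet" means 2,000 bytes
SHARD_PRICE_PER_HOUR = 0.015       # USD per shard-hour
PUT_PRICE_PER_MILLION = 0.028      # USD per million PUT records
REDSHIFT_PRICE_PER_HOUR = 0.850    # USD, 2TB dw1.xlarge


def ingest_profile(tweets_per_day, bytes_per_tweet):
    """Return (tweets/hour, MB/sec, TB/day) for a steady stream."""
    tweets_per_hour = tweets_per_day / 24
    mb_per_sec = tweets_per_hour / 3600 * bytes_per_tweet / 1e6
    tb_per_day = tweets_per_day * bytes_per_tweet / 1e12
    return tweets_per_hour, mb_per_sec, tb_per_day


def kinesis_cost_per_hour(shards, tweets_per_hour):
    """Shard-hours plus PUT charges, assuming one PUT per tweet."""
    put_cost = tweets_per_hour / 1e6 * PUT_PRICE_PER_MILLION
    return shards * SHARD_PRICE_PER_HOUR + put_cost


if __name__ == "__main__":
    per_hour, mb_s, tb_day = ingest_profile(TWEETS_PER_DAY, BYTES_PER_TWEET)
    print(f"{per_hour / 1e6:.1f}MM tweets/hr, {mb_s:.1f} MB/sec, {tb_day:.1f} TB/day")
    for shards in (6, 12):
        kinesis = kinesis_cost_per_hour(shards, per_hour)
        total = kinesis + REDSHIFT_PRICE_PER_HOUR
        print(f"{shards} shards: Kinesis ${kinesis:.3f}/hr, total ${total:.3f}/hr")
```

With these inputs, 6 shards works out to roughly $0.67/hour for Kinesis and 12 shards to roughly $0.76/hour. The slide's $0.765/hour is closer to the 12-shard case, which is what a write limit of 1 MB/sec per shard would imply for a ~12MB/sec stream.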
![Page 47: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/47.jpg)
“THANKS TO AMAZON WEB SERVICES, WE CAN DELIGHT OUR PLAYERS WORLDWIDE.”
Sami Yliharju | Services Lead
![Page 48: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/48.jpg)
![Page 49: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/49.jpg)
The Climate Corporation - Weather Insurance for Farms
Challenge: Volatile weather is deadly to crops like grapes
Solution: Built a predictive model based on freely available data:
• 60 years of crop data,
• 14 TB of soil data, and
• 1M government Doppler radar points
50 EMR clusters process new data as it comes into S3 each day, continuously updating the model.
![Page 50: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/50.jpg)
150B Soil Observations
3M Daily Weather Measurements
850K Precision Rainfall Grids Tracked
200 TB in Amazon S3
![Page 51: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/51.jpg)
Foursquare…
33 million users
1.3 million businesses
…generates a lot of data
3.5 billion check-ins
15M+ venues
Terabytes of log data
![Page 52: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/52.jpg)
Uses EMR for
Evaluation of new features
Machine learning
Exploratory analysis
Daily customer usage reporting
Long-term trend analysis
![Page 53: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/53.jpg)
Benefits of Amazon EMR
Ease-of-Use: “We have decreased the processing time for urgent data-analysis”
Flexibility: To deal with changing requirements & dynamically expand reporting clusters
Costs: “We have reduced our analytics costs by over 50%”
![Page 54: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/54.jpg)
Who is checking in?
[Charts: check-ins by gender (female, male) and by age]
![Page 55: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/55.jpg)
When do people go to a place?
[Chart: Thursday to Sunday, for Gorilla Coffee, Gray's Papaya, and Amorino]
![Page 56: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/56.jpg)
User Sign-ups
![Page 57: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/57.jpg)
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
![Page 58: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/58.jpg)
Amazon DynamoDB
Amazon RDS
Amazon Redshift
AWS Direct Connect
AWS Storage Gateway
AWS Import/Export
Amazon Glacier
S3
Amazon Kinesis
Amazon EMR
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
![Page 59: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/59.jpg)
Amazon EC2
Amazon EMR
Amazon Kinesis
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
![Page 60: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/60.jpg)
Amazon Redshift
Amazon DynamoDB
Amazon RDS
S3
Amazon EC2
Amazon EMR
Amazon CloudFront
AWS CloudFormation
AWS Data Pipeline
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
![Page 61: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/61.jpg)
DataXu in the Cloud
Yekesa Kosuru, V.P Technology
July 10th 2014
![Page 62: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/62.jpg)
What is DataXu?
• Digital Marketing Platform, Ad Tech Platform
• Real-time Multivariate Decision System
• 5th Fastest Growing Private Company in U.S. (Inc. 500)
• Optimize Digital Marketing Campaigns
– …put the right ad campaign in front of the right customer
– …find customers who left their site without converting
– …find more customers who are likely to convert
– …offer insight into the who, why, when, and where of respondents
• 950,000 times per second
![Page 63: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/63.jpg)
Big Data, Little Decisions
The Evolution of Real-Time Decision Systems
[Chart: decision impact (also proportional to risk) versus decision rate; volume ~ value]
1. 1990’s – “Should we advertise on the Superbowl? Should we run direct mail this qtr.?” Batch mode
2. 2000’s – “How often can we run a permission-based email mktg. campaign?” Rules-based alerts
3. 2010’s – Millions of decisions and actions taken, all in less than a blink of an eye
![Page 64: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/64.jpg)
Real Time Bidding
User opens browser
Goes to sports site
Site auctions ads, e.g
DataXu bids (others bid too)
DataXu wins bid
Ad shown, page loads
![Page 65: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/65.jpg)
Quick Statistics
• 950K bid requests per second
• Billions of impressions per month, Petabyte of data
• 100 ms round trip response time
• 100+TB of warehouse data
• 3000+ Servers powering the platform
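As a rough illustration of what these figures imply, Little's law (requests in flight = arrival rate × time in system) can be applied to the quoted bid rate and round-trip time. This is a back-of-the-envelope sketch, not a description of DataXu's system; treating 100 ms as the time every request spends in flight, and the rate as sustained around the clock, are assumptions, and the function names are illustrative.

```python
BID_REQUESTS_PER_SEC = 950_000   # from the slide
ROUND_TRIP_SEC = 0.100           # from the slide; assumed to apply to every request


def requests_in_flight(arrival_rate_per_sec, time_in_system_sec):
    """Little's law: average number of requests being handled at once."""
    return arrival_rate_per_sec * time_in_system_sec


def requests_per_day(arrival_rate_per_sec):
    """Daily volume if the quoted rate were sustained around the clock."""
    return arrival_rate_per_sec * 24 * 3600


if __name__ == "__main__":
    in_flight = requests_in_flight(BID_REQUESTS_PER_SEC, ROUND_TRIP_SEC)
    per_day = requests_per_day(BID_REQUESTS_PER_SEC)
    print(f"about {in_flight:,.0f} requests in flight at any moment")
    print(f"about {per_day / 1e9:.1f} billion requests per day at a sustained rate")
```

Under these assumptions the platform would be handling on the order of 95,000 bid requests concurrently, which gives a sense of why elasticity and automation appear at the top of the "Why AWS" list.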
![Page 66: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/66.jpg)
Why AWS
• Automation, API
• Costs, Pay As You Go
• Auto Scaling (elasticity – up and down)
• All Data in One Place (S3 foundational store)
• Improved Testability
• Security, Privacy
• Disaster Recovery and Business Continuity
![Page 67: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/67.jpg)
DataXu Stack
Campaign Management
Business Intelligence
Data Mart
Interactive Queries
Batch Queries
Real Time Bidding System
Activity Logs
1st Party
3rd Party
Distributed Log Ingestion
S3/HDFS Warehouse
CDN
User Profiles
Campaign Metadata
ETL
Attribution
Machine Learning
Spend
Decision System
Audience Calculation
Uniques/Segment
Big Velocity: 950K TPS
Big Volume: Petabyte of Data
Big Variety: Data Providers
![Page 68: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/68.jpg)
High Level Deployment
[Diagram labels]
On premise
SSL
Meta
Amazon S3
RTB System
Elastic Load Balancing
Availability Zone
Route 53
EC2
Auto Scaling group
Volumes
AMI
Availability Zone
Log Ingestion System
Machine Learning System
Auto Scaling group
EMR
CloudWatch
![Page 69: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/69.jpg)
Traditional Hadoop vs EMR
• Traditional Hadoop
– Anticipate and provision for peaks
– Can't decouple storage and compute
– 75% of the cluster is idle
– Data duplication/multiple clusters
• EMR to the rescue
• Monthly savings of 72% using EMR
![Page 70: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/70.jpg)
S3 Provides Linearly Scalable Bandwidth
• Big volume workloads involve several datasets together and terabytes of data
• Aggregate bandwidth matters
• S3 scales pretty linearly
S3 Streaming Performance (m1.xlarge @ $0.34/hr)
100 VMs; 9.6GB/s; $34/hr
350 VMs; 28.7GB/s; $119/hr
34 secs per terabyte
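The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below derives per-VM throughput, hourly cost, and time per terabyte from the quoted measurements; the function name is illustrative, and taking 1 TB as 1000 GB is an assumption.

```python
PRICE_PER_VM_HOUR = 0.34  # m1.xlarge, USD, from the slide

# (number of VMs, aggregate GB/s) as quoted on the slide
MEASUREMENTS = [(100, 9.6), (350, 28.7)]


def summarize(vms, aggregate_gb_per_sec):
    """Derive per-VM throughput, hourly cost, and seconds per terabyte."""
    return {
        "mb_per_sec_per_vm": aggregate_gb_per_sec * 1000 / vms,
        "cost_per_hour": vms * PRICE_PER_VM_HOUR,
        "secs_per_tb": 1000 / aggregate_gb_per_sec,  # assumption: 1 TB = 1000 GB
    }


if __name__ == "__main__":
    for vms, gbps in MEASUREMENTS:
        s = summarize(vms, gbps)
        print(
            f"{vms} VMs: {s['mb_per_sec_per_vm']:.0f} MB/s per VM, "
            f"${s['cost_per_hour']:.0f}/hr, {s['secs_per_tb']:.1f} s per TB"
        )
```

Per-VM throughput goes from about 96 MB/s at 100 VMs to about 82 MB/s at 350 VMs, so aggregate bandwidth grows close to, though slightly below, linearly. The 350-VM run works out to roughly 35 seconds per terabyte, in line with the slide's figure.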
![Page 72: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/72.jpg)
Getting Started with
Big Data on AWS
![Page 73: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/73.jpg)
AWS is here to help
Solution Architects
Professional Services
Premium Support
AWS Partner Network (APN)
![Page 74: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/74.jpg)
aws.amazon.com/partners/competencies/big-data
Partner with an AWS Big Data expert
![Page 75: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/75.jpg)
https://aws.amazon.com/architecture/
Processing large amounts of parallel
data using a scalable cluster
AWS Architecture Diagrams
![Page 76: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/76.jpg)
http://aws.amazon.com/marketplace
Big Data Case Studies
Learn from other AWS customers
aws.amazon.com/solutions/case-studies/big-data
![Page 77: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/77.jpg)
AWS Marketplace
AWS Online Software Store
aws.amazon.com/marketplace
Shop the big data category
![Page 78: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/78.jpg)
http://aws.amazon.com/marketplace
AWS Public Data Sets
Free access to big data sets
aws.amazon.com/publicdatasets
![Page 79: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/79.jpg)
AWS Grants Program
AWS in Education
aws.amazon.com/grants
![Page 80: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/80.jpg)
AWS Big Data Test Drives
APN Partner-provided labs
aws.amazon.com/testdrive/bigdata
![Page 81: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/81.jpg)
https://aws.amazon.com/training
AWS Training & Events
Webinars, Bootcamps,
and Self-Paced Labs
aws.amazon.com/events
![Page 82: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/82.jpg)
Big Data on AWS
Course on Big Data
aws.amazon.com/training/course-descriptions/bigdata
![Page 83: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/83.jpg)
reinvent.awsevents.com
![Page 84: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/84.jpg)
aws.amazon.com/big-data
![Page 85: Big Data Use Cases and Solutions in the AWS Cloud](https://reader034.fdocuments.us/reader034/viewer/2022051608/540d66a38d7f728d7e8b48f5/html5/thumbnails/85.jpg)
Thank you!
Ben Butler, @bensbutler, Sr. Mgr., Big Data
July 10, 2014 – http://aws.amazon.com/big-data