PowerPoint Presentation - Amazon S3... · Ad Tech Live Voting Social Media Connected Devices...


AWS Rapid Pace of Innovation

2009

Amazon RDS

Amazon VPC

Auto Scaling

Elastic Load

Balancing

+48

2010

Amazon SNS

AWS Identity

& Access

Management

Amazon Route 53

+61

2011

Amazon

ElastiCache

Amazon SES

AWS

CloudFormation

AWS Direct

Connect

AWS Elastic

Beanstalk

GovCloud

+82

Amazon

CloudTrail

Amazon

CloudHSM

Amazon

WorkSpaces

Amazon Kinesis

Amazon Elastic

Transcoder

Amazon

AppStream

AWS OpsWorks

+280

2013

Amazon SWF

Amazon Redshift

Amazon Glacier

Amazon

DynamoDB

Amazon

CloudSearch

AWS Storage

Gateway

AWS Data

Pipeline

+159

2012

Since inception AWS has:

• Released 1111 new services and features

• Introduced more than 40 major new services

• Announced 45 price reductions

2008

+24

Amazon EBS

Amazon

CloudFront

+454

2014

Amazon Cognito

Amazon Zocalo

Amazon Mobile

Analytics

*as of Nov 13, 2014

AWS Directory

Service

Amazon RDS for Aurora

AWS CodeDeploy

AWS Lambda

AWS Config

AWS Key Management

Service

AWS Service Catalog

Amazon EC2

Container Service

AWS CodePipeline

AWS CodeCommit

Every day, AWS adds enough new server capacity to support

Amazon.com when it was a $7 billion global enterprise.

James Hamilton: Innovation at Scale Presentation re:Invent 2014


http://mvdirona.com/jrh/work/

https://www.youtube.com/watch?v=JIQETrFC_SQ

We are driven to remove any and all causes of failure.

Our goal is to make our operational performance indistinguishable from perfect.

“Based on our experience, I believe that we can be even more secure in the

AWS cloud than in our own data centers.” – Tom Soderstrom, CTO, NASA JPL

AWS provides the same, familiar approaches to security that companies have

been using for decades with increased visibility, control, and auditability.

Visibility

View your entire infrastructure with a click

Deep insight with

AWS CloudTrail

Control

You have sole

authority on where

data is stored

Shared

responsibility model

Auditability

3rd party validation

SOC 1 / SOC 2 / SOC 3

SSAE 16 / ISAE 3402

PCI DSS Level 1

DIACAP & FISMA

ISO 27001 / 9001 / 13485

ISO/TS 16949

FedRAMP (SM)

FISMA

HIPAA

ITAR

MPAA

CSA

FIPS 140-2

High volume / low margin businesses are in our core DNA

Trade CapEX for

variable expense

Our economies of

scale provide us

with lower costs

45 price

reductions

since 2006

Pricing model

choice to support

variable and

stable workloads

On-demand

Reserved

Spot

Save more money

as you grow bigger

Tiered pricing

Volume discounts

Custom pricing

5,000+ SIs & Consultants

3,000+ ISVs

22 Global Premier Tier partners

6 Enterprise-focused competencies

2,000+ products available for 1-click

deployment across 23 distinct product

categories

Customers run over 70M hours of

software per month

SDL Reduces Time to Market for New Customer Web Environments to Less

than an Hour by Using AWS

• Needed flexibility in provisioning and managing

customer web environments to support rapid growth

• Created a fully automated resource provisioning and

management platform running on AWS

• SDL reports accelerated time to market to less than

an hour

• By using AWS, SDL delivers a reliable, scalable, and

secure web offering to its customers

4Synergy is an AWS Consulting Partner at the Advanced Tier that helps

define cloud strategy and design and implement cloud applications.

AWS provides us with flexibility and

scalability, both technically and

financially. It allows us to provide a

superior service without hefty

upfront infrastructure investments

Dennis van der Veeke

CTO, SDL

SDL Global Customer Experience Management provides

solutions for managing global customer experiences.

CustomerGauge Uses AWS to Reduce Time to Market by 50%

CustomerGauge automatically collects actionable

insights and improves customer experience based on

Net Promoter® metrics.

By using AWS, we scaled our

services to three regions from

100,000 to 3 million transactions

per month.

Alessio Nobile

Head of Technology

• CustomerGauge needed to scale its survey engine

and API by 300% to support its growing customer

base in Australia, Europe and the United States

• By using AWS, the company replicated its

environment in the United States and Australia in less

than 6 hours, meeting its scalability target

• CustomerGauge now reports that it experiences near

99.9% availability and can update its service without

disrupting server instances and customer traffic.

Zeeman Scales to Support Traffic Spikes of 10x by Using AWS

Zeeman is a leading Dutch retail organization

with 1,200 physical shops and a turnover of

€ 500 million in the retail market.

Using AWS helps us prepare for the

highly variable workloads that

come with spikes in visitor counts

around our marketing efforts.

• Needed a flexible solution to meet traffic spikes

immediately following commercial campaigns

• Built an e-commerce infrastructure on AWS that

scales with traffic fluctuations

• Designed to scale up by 10x seamlessly and

automatically following advertising campaigns

• By using AWS deployment and management

services, Zeeman continuously integrates updates

and patches several times a day

Jacques van der Bom

Manager, E-commerce

Unitt is an AWS Advanced Consulting Partner that develops,

builds, and manages cloud and hosting services.

FINRA handles approximately 30 billion market

events every day to build a holistic picture of

trading in the U.S.

Deter misconduct by

enforcing the rules

Detect and prevent wrongdoing

in the U.S. markets

Discipline those who

break the rules

That Kind of Volume Comes with Challenges

Market volumes are volatile and steadily increasing

Exchanges are dynamically evolving

Regulatory rules are created and enhanced

New securities products are introduced

Market manipulators innovate

AWS Offered The Right Services For Our

Platform

Cloud Platform

APIs at the right layer

Automated infrastructure deployment

Open source commitment

Operations Security

A Platform That Adapts to Market Dynamics

Data Integration

Hbase

Hadoop

MapReduce

Flexible Interactive

Queries

Hadoop

EMR

SQL/Hive

Fast Predefined Queries

Hbase/NoSQL

Hadoop

Predefined Datamarts

Surveillance Analytics

EMR

Hive

Web Applications

Analysts

Regulators

Data Management Services

Data Movement

Data Registration

Notification

Version Management

Job Management

Cluster Management

S3

Firms

Delivering Agility, Speed, and Cost Savings To

FINRA

Cost Savings

Speed

Agility

Efficient scale

Pay for what we use

Projected to save $10-$20M annually

Reduce query times

from hours to seconds

Respond quickly to market challenges

What Tools Should I Use?

Glacier

S3 DynamoDB

RDS

EMR

Redshift

Data Pipeline Kinesis

Cassandra CloudSearch

Kinesis-

enabled

app

Ingest Store Process Visualize

Glacier S3

DynamoDB

RDS

Kinesis

Spark

Streaming

EMR Data Pipeline

Storm

Kafka

Redshift

Cassandra

CloudSearch

Kinesis

Connector

Kinesis

enabled app

Ingest

Database

Cloud

Storage

Stream

Storage

Stream

Storage

Database

Cloud

Storage

Real-time processing of streaming data

High throughput

Elastic

Easy to use

Connectors for EMR, S3, Redshift, DynamoDB

Amazon

Kinesis

Amazon Web Services

AZ AZ AZ

Durable, highly consistent storage replicates data across three data centers (availability zones)

Aggregate and archive to S3

Millions of sources producing 100s of terabytes per hour

Front End

Authentication

Authorization

Ordered stream of events supports multiple readers

Real-time dashboards and alarms

Machine learning algorithms or sliding window analytics

Aggregate analysis in Hadoop or a data warehouse
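The "ordered stream of events supports multiple readers" idea can be illustrated with a toy in-memory log. This is only a sketch: `ToyStream` and its methods are hypothetical names invented for illustration, not the Amazon Kinesis API. The point it shows is that each reader keeps its own position, so a dashboard and an archiver can consume the same ordered records independently.

```python
class ToyStream:
    """Hypothetical in-memory stand-in for an ordered event stream."""

    def __init__(self):
        self._records = []

    def put(self, data):
        """Append a record and return its sequence number."""
        sequence_number = len(self._records)
        self._records.append(data)
        return sequence_number

    def read(self, from_sequence, limit=10):
        """Return up to `limit` records in order, plus the next position."""
        batch = self._records[from_sequence:from_sequence + limit]
        return batch, from_sequence + len(batch)


stream = ToyStream()
for event in ["click-1", "click-2", "click-3"]:
    stream.put(event)

# Two independent readers, each tracking its own position in the stream.
dashboard_batch, dashboard_pos = stream.read(0, limit=2)
archiver_batch, archiver_pos = stream.read(0, limit=10)
```

Because reading never removes records, the dashboard reader can lag behind the archiver without either one affecting the other.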

Inexpensive: $0.028 per million puts

Amazon Kinesis Architecture
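To make the quoted put price concrete, the arithmetic can be sketched as follows. The only figure taken from the slide is $0.028 per million puts; the record rate, the 30-day month, and the one-put-per-record assumption are illustrative, and any other charges are deliberately left out.

```python
PRICE_PER_MILLION_PUTS = 0.028  # USD, the figure quoted on the slide


def monthly_put_cost(records_per_second, days=30):
    """Estimate put charges only, assuming one put per record."""
    puts = records_per_second * 60 * 60 * 24 * days
    return puts / 1_000_000 * PRICE_PER_MILLION_PUTS


# Assumed workload: a steady 1,000 records per second for 30 days.
cost = monthly_put_cost(1_000)
print(f"{cost:.2f} USD per month for puts")
```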

Cloud Database &

Storage

Store anything

Object storage

Scalable

Designed for 99.999999999%

durability

Amazon

S3
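One way to read the 99.999999999% durability design target is as a per-object figure over a year. Under that assumption, and assuming losses are independent (both are simplifications made here, not statements from the slide), the expected loss rate for a given number of objects works out as below.

```python
from decimal import Decimal

DURABILITY = Decimal("0.99999999999")  # 11 nines, from the slide


def expected_annual_losses(object_count):
    """Expected objects lost per year, assuming independent per-object loss."""
    return (Decimal(1) - DURABILITY) * object_count


# Assumed example: 10 million stored objects.
losses_per_year = expected_annual_losses(10_000_000)
years_per_lost_object = Decimal(1) / losses_per_year
```

`Decimal` is used instead of floats so that subtracting a number this close to 1 does not pick up rounding error.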

Big Data

Aggregate All Data in S3 Surrounded by a collection of the right tools

EMR Kinesis

Redshift DynamoDB RDS

Data Pipeline

Spark Streaming Cassandra Storm

Amazon

S3

Amazon S3

App/Web Tier

Client Tier

Database & Storage Tier

App/Web Tier

Client Tier

Data Tier

Database & Storage Tier

Search

Hadoop/HDFS

Cache

Blob Store

SQL NoSQL

Database & Storage Tier

Amazon RDS

Amazon DynamoDB

Amazon ElastiCache

Amazon S3

Amazon

Glacier

Amazon CloudSearch

Amazon EMR

Fully-managed NoSQL database service

Built on solid-state drives (SSDs)

Consistent low latency performance

Any throughput rate

No storage limits

Amazon

DynamoDB

Consistent Performance at Scale

WRITES

continuously replicated and persisted (SSD)

READS

strongly or eventually consistent (option)

better than

99.999% worldwide uptime

Provably Highly

Durable Available
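The difference between the strongly and eventually consistent read options can be shown with a toy replicated table. This is a deliberately simplified model with hypothetical names (`ToyReplicatedTable`, `propagate`); it does not describe how DynamoDB is implemented. It only illustrates why an eventually consistent read may briefly return an older value while a strongly consistent read does not.

```python
class ToyReplicatedTable:
    """Hypothetical model: replica 0 is the leader, the rest are followers."""

    def __init__(self, replica_count=3):
        self.replicas = [dict() for _ in range(replica_count)]
        self.pending = []  # writes not yet copied to the followers

    def put(self, key, value):
        """Apply the write to the leader and queue it for the followers."""
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def propagate(self):
        """Copy all queued writes to every follower, in order."""
        for key, value in self.pending:
            for follower in self.replicas[1:]:
                follower[key] = value
        self.pending.clear()

    def get(self, key, consistent_read=False, replica=1):
        """Read from the leader if consistent_read, else from a follower."""
        source = self.replicas[0] if consistent_read else self.replicas[replica]
        return source.get(key)


table = ToyReplicatedTable()
table.put("votes", 1)
table.propagate()
table.put("votes", 2)  # not yet propagated to the followers

strong = table.get("votes", consistent_read=True)  # sees the latest write
eventual = table.get("votes")                      # may still see the old value
```

In this model the trade-off is visible directly: the follower read avoids the leader but can lag until `propagate()` runs.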

File & Photo Sharing

Online Gaming

Ad Tech Live Voting

Social Media

Connected Devices

State Management

Video Streaming

Mobile Messaging

Backup & Restore

Publishing Mapping

(Some) DynamoDB Use Cases

(Some) DynamoDB Customers

design for scale = optimize for cost

50B+ database

operations/day

Relational Databases

Fully managed = low admin

Trillions of I/O requests/month

Aurora, MySQL, Oracle, SQL Server,

Postgres

Amazon

RDS

• Manageability

– Rapid deployment with pre-configured parameters

– Patch Management

– Monitoring and Metrics

• Availability and Data Durability

– Automated Backups and Point-In-Time-Recovery

– DB Snapshots

– Automatic Host Replacement (Single-AZ)

– Multi-AZ deployments

• Scalability

– Push-Button Scaling (Storage, Memory and Compute)

– Read Replicas

Key Features

Process

Columnar data warehouse

ANSI SQL compatible

Massively parallel

Petabyte scale

Fully-managed

Very cost-effective

Amazon

Redshift

Hadoop/HDFS clusters

Hive, Pig, Impala, HBase

Easy to use; fully managed

On-demand and spot pricing

Tight integration with S3,

DynamoDB, and Kinesis

Amazon

Elastic

MapReduce

Autocomplete Search

Recommendations

Automatic spelling corrections

A look at how it works

Months of user history Common misspellings

Data Analyzed Using EMR:

Westen

Wistin

Westan

Whestin

Automatic spelling corrections

Months of user search data

Search terms

Misspellings

Final click throughs

Yelp web site log data goes into Amazon S3

Amazon S3

Amazon Elastic MapReduce spins up a 200 node Hadoop cluster

Hadoop Cluster

Amazon EMR

Amazon S3

All 200 nodes of the cluster simultaneously look for common misspellings

Westen

Wistin

Westan

Hadoop Cluster

Amazon EMR

Amazon S3

A map of common misspellings and suggested corrections is loaded back into Amazon S3.

Westen

Wistin

Westan

Then the cluster is shut down. Yelp pays only for the time it used.

Hadoop Cluster

Amazon EMR

Amazon S3
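The spelling-correction flow above (search terms and final click-throughs in, a map of misspellings to corrections out) can be sketched as a small map and reduce pair in plain Python. This is not Yelp's actual job: the record layout, the function names, the `min_count` threshold, and the assumption that "Westin" is the intended term are all illustrative.

```python
from collections import Counter, defaultdict


def map_phase(log_records):
    """Emit (typed term, clicked name) pairs where the two differ."""
    for typed, clicked in log_records:
        typed_norm = typed.strip().lower()
        clicked_norm = clicked.strip().lower()
        if typed_norm != clicked_norm:
            yield typed_norm, clicked_norm


def reduce_phase(pairs, min_count=2):
    """Keep the most common click-through for each frequently mistyped term."""
    counts = defaultdict(Counter)
    for typed, clicked in pairs:
        counts[typed][clicked] += 1
    corrections = {}
    for typed, clicked_counts in counts.items():
        best, seen = clicked_counts.most_common(1)[0]
        if seen >= min_count:
            corrections[typed] = best
    return corrections


# Assumed log format: (what the user typed, what they finally clicked).
logs = [
    ("Westen", "Westin"), ("Westen", "Westin"),
    ("Wistin", "Westin"), ("Wistin", "Westin"),
    ("Westan", "Westin"),   # seen only once, so below the threshold
    ("Westin", "Westin"),   # correct spelling, ignored by the map phase
]
corrections = reduce_phase(map_phase(logs))
```

On a real cluster the map phase would run in parallel across nodes over log files in S3; here both phases run in one process to keep the sketch self-contained.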

Each of Yelp’s 80 Engineers Can Do This Whenever They Have a Big Data Problem

spins up over

250 Hadoop

clusters per week in EMR.

Amazon EMR

Amazon S3

Data Innovation Meets Action at Scale

at NASDAQ OMX

• NASDAQ’s technology powers more than 70 marketplaces in 50 countries

• NASDAQ’s global platform can handle more than 1 million messages/second at

a median speed of sub-55 microseconds

• NASDAQ owns and operates 26 markets, including 3 clearinghouses and 5 central securities repositories

• More than 5,500 structured products are tied to NASDAQ’s global indexes with the notional value of at least $1 trillion

• NASDAQ powers 1 in 10 of the world’s securities transactions

NASDAQ’s Big Data Challenge

• Archiving Market Data

– A classic “Big Data” problem

• Power Surveillance and Business Intelligence/Analytics

• Minimize Cost

– Not only infrastructure, but development/IT labor costs too

• Empower the business for self-service

NASDAQ’s Legacy Solution

• On-premises MPP DB

– Relatively expensive, finite storage

– Required periodic additional expenses to add more storage

– Ongoing IT (administrative) human costs

• Legacy BI tool

– Requires developer involvement for new data sources, reports,

dashboards, etc.

New Solution: Amazon Redshift

• Cost Effective

– Redshift is 43% of the cost of legacy

• Assuming equal storage capacities

– Doesn’t include IT ongoing costs!

• Performance

– Outperforms NASDAQ’s legacy BI/DB solution

– Insert 550K rows/second on a 2 node 8XL cluster

• Elastic

– NASDAQ can add additional capacity on demand, easy to grow their cluster
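The two figures quoted above (43% of legacy cost, 550K rows/second inserted) lend themselves to a quick back-of-the-envelope calculation. The legacy cost and row count below are made-up inputs for illustration; only the ratio and the insert rate come from the slide, and sustained load speed on a real cluster will vary.

```python
REDSHIFT_COST_RATIO = 0.43        # "43% of the cost of legacy", from the slide
INSERT_ROWS_PER_SECOND = 550_000  # quoted rate on a 2 node 8XL cluster


def redshift_cost(legacy_cost):
    """Estimated cost at equal storage capacity, excluding ongoing IT costs."""
    return legacy_cost * REDSHIFT_COST_RATIO


def load_time_minutes(row_count):
    """Minutes to insert row_count rows at the quoted rate."""
    return row_count / INSERT_ROWS_PER_SECOND / 60


# Hypothetical inputs: a 1,000,000 USD legacy system and 1 billion rows.
estimated_cost = redshift_cost(1_000_000)
minutes = load_time_minutes(1_000_000_000)
print(f"{estimated_cost:,.0f} USD, about {minutes:.1f} minutes to load")
```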

• Amazon Redshift partner

– http://aws.amazon.com/redshift/partners/pentaho/

• Self Service

– Tools empower BI users to integrate

new data sources, create their own

analytics, dashboards, and reports

without requiring development

involvement

• Cost effective

New Solution: Pentaho BI/ETL

Net Result

• New solution is cheaper, faster, and offers capabilities that NASDAQ

didn’t have before

– Empowers NASDAQ’s business users to explore data like they never

could before

– Reduces IT and development as bottlenecks

– Margin improvement (expense reduction and supports business

decisions to grow revenue)

Putting All The AWS Data Tools Together & Common Design Patterns

One tool to

rule them all

Kinesis EMR

DynamoDB Redshift

S3

Data Ingestion / Creation

Transactions

Files

Streams

Kinesis EMR

DynamoDB Redshift

S3

Data Visualization / Reporting

Transactions

Files

Streams

Kinesis EMR

DynamoDB Redshift

S3

Optimal Path?

Transactions

Files

Streams

Kinesis EMR

DynamoDB Redshift

S3

The Cloud Enables Specialization

workload = ( access patterns )

fit the infrastructure

to the workload

Spark

Streaming,

Storm

Amazon

Redshift Spark,

Impala,

Presto

Hive

Amazon

Redshift

Hive

Spark,

Presto

Amazon

Kinesis/

Kafka

Amazon DynamoDB

Amazon S3

Data

Answers

[Figure axes: Data Temperature (Hot to Cold) vs. Query Latency (Low to High)]

HDFS

Hive

Native

Client

Spark

Streaming

Hive

Amazon Kinesis / Kafka

Data

Answers

Apache Storm Native Client

Amazon

DynamoDB

Native

Client

Amazon

Redshift

Hive

Spark,

Presto

Amazon

Kinesis/

Kafka

Amazon S3

Data

Answers

Spark,

Impala,

Presto

Redshift

Spark,

Presto

Kinesis/

Kafka

DynamoDB S3

Data

Answers

HDFS

Solution

Architects

Professional

Services

Premium

Support

AWS Partner

Network (APN)

aws.amazon.com/big-data

http://aws.amazon.com/marketplace

Big Data Case Studies

Learn from other AWS customers

aws.amazon.com/solutions/case-studies/big-data

AWS Big Data Test Drives

APN Partner-provided labs

aws.amazon.com/testdrive/bigdata

Thank You!