Amazon Aws Presentation Drupal

Amazon Web Services

An overview of Amazon's Web Services, and how one can put them to use

June 28, 2007

Alex Harford

[email protected]

Overview of Amazon Web Services

MTurk: Mechanical Turk

SQS: Simple Queue Service

S3: Simple Storage Service

EC2: Elastic Compute Cloud

Overview of Amazon Web Services

Why do I like it?

Cross platform support: Libraries in Python, Java, etc

Or roll your own through REST and SOAP

Enthusiastic developer community: Message boards are very active

Pay as you go, for as much or as little as you need

Mechanical Turk

There is a story of the Mechanical Turk, a chess-playing automaton built in 1770 that toured Europe in the late 18th and early 19th centuries. It beat many famous people, including Napoleon Bonaparte and Benjamin Franklin.

Mechanical Turk

It was a hoax: a person hidden inside controlled the automaton!

Mechanical Turk

Amazon's offering is similar: it uses 'Artificial Artificial Intelligence'.

Submitters enter Human Intelligence Tasks into the queue and Workers solve them.

Workers can be tested for capabilities before they are trusted.

Submitters can refuse results if they find them unsatisfactory.

There is a feedback system in place so Workers know if the Submitters will pay.

Mechanical Turk

Some example uses are:

podcast transcription services

image recognition for a mapping service

Amazon has used it themselves for internal projects

SQS: Simple Queue Service

From their website: Amazon Simple Queue Service (Amazon SQS) offers a reliable, highly scalable hosted queue for storing messages as they travel between computers. By using Amazon SQS, developers can simply move data between distributed application components performing different tasks, without losing messages or requiring each component to be always available.

SQS: Simple Queue Service

What can you do?

CreateQueue

ListQueues

DeleteQueue

SendMessage

ReceiveMessage

DeleteMessage

PeekMessage

SetVisibilityTimeout

AddGrant

SQS: Simple Queue Service

Made for passing messages between your various web services.

Accessible through SOAP and REST APIs

How much does it cost?

Pay only for what you use. There is no minimum fee, and no start-up cost.

$0.10 per 1,000 messages sent ($0.0001 per message sent)

$0.20 per GB of data transferred
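To make the pricing concrete, here is a minimal sketch of the arithmetic. It uses only the two prices quoted on this slide; the function name `sqs_cost` and its parameters are illustrative, not part of any Amazon library.

```python
# Prices as quoted on the slide (June 2007); they are historical figures.
PRICE_PER_MESSAGE = 0.10 / 1000  # $0.10 per 1,000 messages sent
PRICE_PER_GB = 0.20              # $0.20 per GB of data transferred


def sqs_cost(messages_sent: int, gb_transferred: float) -> float:
    """Estimate the SQS charge in dollars for a billing period."""
    return messages_sent * PRICE_PER_MESSAGE + gb_transferred * PRICE_PER_GB


if __name__ == "__main__":
    # One million messages and 2 GB of transfer.
    print(f"${sqs_cost(1_000_000, 2):.2f}")
```

With these numbers, a million messages cost about $100 and the 2 GB of transfer adds about $0.40.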

SQS: Simple Queue Service

CreateQueue: Create queues for your own use, or to share with others.

ListQueues: List your existing queues.

DeleteQueue: Delete one of your queues.

SendMessage: Add any data entry to a specified queue.

ReceiveMessage: Return one or more messages from a specified queue, in roughly the same order they were added to the queue.

SQS: Simple Queue Service

DeleteMessage: Remove a message from a specified queue.

PeekMessage: Return a specific entry from the queue without locking it.

SetVisibilityTimeout: Control how long a message stays locked (hidden from other readers) after it has been read.

AddGrant: Allow other users to send messages to or receive messages from your queue.
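The interaction between ReceiveMessage, PeekMessage, DeleteMessage and the visibility timeout is easier to see in a toy model. The sketch below is an in-memory stand-in, not the SQS API: the class `ToyQueue`, its method names and the injectable `clock` are all assumptions made for illustration, and it hands messages out in strict insertion order, whereas the real service only promises roughly that order.

```python
import itertools
import time


class ToyQueue:
    """In-memory model of a queue with a visibility timeout (not the SQS API)."""

    def __init__(self, visibility_timeout=30.0, clock=time.monotonic):
        self.visibility_timeout = visibility_timeout
        self._clock = clock
        self._ids = itertools.count(1)
        # message id -> [body, time at which the message becomes visible again]
        self._messages = {}

    def send_message(self, body):
        """Add a message and return its id."""
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, float("-inf")]
        return msg_id

    def receive_message(self):
        """Return (id, body) of the oldest visible message and lock it, or None."""
        now = self._clock()
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def peek_message(self, msg_id):
        """Return a message body without locking it."""
        return self._messages[msg_id][0]

    def delete_message(self, msg_id):
        """Remove a message for good, typically once it has been processed."""
        self._messages.pop(msg_id, None)
```

A consumer would call `receive_message`, do its work, then call `delete_message`. If it crashes before deleting, the lock expires and another consumer receives the same message, which is why a component does not have to be always available.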

S3: Simple Storage Service

Simple Storage Service lets you store data in 'buckets'

Cheap!

$0.15 per GB-Month of storage used.

$0.10 per GB of data uploaded.

$0.18 per GB of data downloaded (price breaks for volume over 10TB)

$0.01 per 1000 PUT or LIST requests

$0.01 per 10,000 GET requests
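As a rough illustration of how these rates add up, the sketch below computes a monthly bill from the five prices on this slide. The function name and parameters are made up for the example, and it ignores the volume price breaks above 10TB.

```python
def s3_monthly_cost(stored_gb, uploaded_gb, downloaded_gb,
                    put_list_requests, get_requests):
    """Estimate a monthly S3 bill in dollars from the 2007 prices on the slide."""
    return (
        stored_gb * 0.15                    # $0.15 per GB-month stored
        + uploaded_gb * 0.10                # $0.10 per GB uploaded
        + downloaded_gb * 0.18              # $0.18 per GB downloaded (no volume discount)
        + put_list_requests / 1000 * 0.01   # $0.01 per 1,000 PUT or LIST requests
        + get_requests / 10000 * 0.01       # $0.01 per 10,000 GET requests
    )


if __name__ == "__main__":
    # 100 GB stored, 10 GB uploaded, 50 GB downloaded,
    # 10,000 PUT/LIST requests and 1,000,000 GET requests.
    print(f"${s3_monthly_cost(100, 10, 50, 10_000, 1_000_000):.2f}")
```

For that workload the storage ($15) and download ($9) charges dominate; the request charges come to about a dollar.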

S3: Simple Storage Service

S3 is NOT a filesystem

You store your data in buckets

Bucket names are globally unique!

Individual chunks of data have a 'key' associated with them

Keys can be up to 1,024 bytes long

Chunks of data are a max of 5GB

S3: Simple Storage Service

Permissions can be set on buckets and objects to be public, private, or for individual users.

Files can be accessed over HTTP or BitTorrent!

Data is replicated across multiple datacenters.

The more it is accessed, the more it is replicated across different datacenters.

S3: Simple Storage Service

Accessing S3:

HTTP

BitTorrent

Python API

Java API

Linux FUSE (ok, so it can look like a filesystem)

s3sync, an rsync-like program

JungleDisk

S3 Organizer for Firefox

MySQL storage backend (very preliminary!)

S3: Simple Storage Service

How do you run a website from S3?

#1 caveat: No index.html support

So it is only useful to offload traffic and storage from your main site.

S3: Simple Storage Service

Set up a bucket named 'static.vanlug.bc.ca' (for the CNAME to work, the bucket name must match the hostname)

Store a key in it 'tux.png'

Make it publicly accessible.

Set up a CNAME 'static.vanlug.bc.ca' and point it to 'static.vanlug.bc.ca.s3.amazonaws.com'

Load up http://static.vanlug.bc.ca/tux.png

That's it!

S3: Simple Storage Service

BitTorrent:

remember our Tux image?

http://static.vanlug.bc.ca/tux.png

Turn it into a torrent:

http://static.vanlug.bc.ca/tux.png?torrent

Done. Amazon creates a tracker for you!

To save bandwidth, after you have served the file a few times, hide the original file. Amazon will keep hosting it for you, and the people sharing the file will seed it for you.
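The torrent trick above is just a query string added to the object's URL, which a small helper can do for any public object. This is a sketch using only the Python standard library; the helper name `torrent_url` is an assumption, and it does not contact S3 at all.

```python
from urllib.parse import urlsplit, urlunsplit


def torrent_url(object_url: str) -> str:
    """Return the URL that asks S3 for a .torrent of a publicly readable object."""
    parts = urlsplit(object_url)
    if parts.query:
        # Keep the sketch simple: refuse URLs that already carry a query string.
        raise ValueError("expected a plain object URL without a query string")
    return urlunsplit(parts._replace(query="torrent"))


if __name__ == "__main__":
    print(torrent_url("http://static.vanlug.bc.ca/tux.png"))
```

Run on the Tux image from the slides, it prints `http://static.vanlug.bc.ca/tux.png?torrent`.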

S3: Simple Storage Service

JungleDisk

This application lets you mount S3 as if it were a local drive.

Cross platform, supports Linux, Mac, Windows

Encryption can be pre or post upload

S3: Simple Storage Service

Who is using it?

Amazon (duh)

SmugMug

Linden Lab (Second Life)

S3: Simple Storage Service

Drawbacks:

S3 cannot have data POSTed from clients, so if users are to upload data, they need to send it to another webserver first. This means that the data is transmitted twice to get into S3.

Difficult to track per-key bandwidth.

Bandwidth cannot be throttled.

EC2: Elastic Compute Cloud

This is what I'm most excited about!

What is it?

An on-demand Linux cluster

Uses Xen technology

Machine images are stored on S3. You request an instance of an image, and an IP address / hostname is returned.

Each instance is roughly equivalent to a 1.7GHz x86 processor, 1.75GB of RAM, 160GB of local disk, and 250Mb/s of network bandwidth.

Traffic between different EC2 instances and S3 is free! Even if they are not in the same datacenter!

EC2: Elastic Compute Cloud

How much does it cost?

$0.10 per instance-hour consumed (or part of an hour consumed).

Same prices for traffic and storage as S3
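Because a partial hour is billed as a full hour, the compute charge rounds up. The sketch below shows that rounding with the $0.10 rate from this slide; the function name and the choice to take running time in seconds are assumptions, and traffic and storage charges are left out.

```python
import math

PRICE_PER_INSTANCE_HOUR = 0.10  # as quoted on the slide (2007)


def ec2_instance_cost(seconds_running: float, instances: int = 1) -> float:
    """Estimate the compute charge in dollars; each started hour counts in full."""
    billed_hours = math.ceil(seconds_running / 3600)
    return billed_hours * PRICE_PER_INSTANCE_HOUR * instances


if __name__ == "__main__":
    print(ec2_instance_cost(3601))            # one second into the second hour
    print(ec2_instance_cost(30 * 24 * 3600))  # one instance for a 30-day month
```

An instance that runs for one hour and one second is billed for two hours, and one instance left running for 30 days is billed for 720 hours.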

EC2: Elastic Compute Cloud

What can you do with it?

You are root, do anything!

Install software:

Apache

VPN

UT2004 / CounterStrike / Call of Duty

Jabber

Asterisk

Load kernel modules

EC2: Elastic Compute Cloud

Creating a custom Instance

Boot a current instance

Customize it

Bundle it

Save it to S3

Register it

Boot it

EC2: Elastic Compute Cloud

One Step Further: Creating custom instances without repackaging

Pass in an environment variable that indicates:

filename of a tarball stored on S3

what SQS queue to grab startup data from

what Subversion repository to grab data from

what startup file to use
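One way a generic image could interpret such a value at boot is sketched below. The slides do not specify a format, so the `key=value; key=value` syntax, the field names and the `StartupConfig` class are all hypothetical; a boot script would read the string from wherever it was passed in and then fetch the tarball, poll the queue, or check out the repository accordingly.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class StartupConfig:
    """Hypothetical startup settings, one field per item on the slide."""
    tarball: Optional[str] = None       # filename of a tarball stored on S3
    queue: Optional[str] = None         # SQS queue to grab startup data from
    repository: Optional[str] = None    # Subversion repository to grab data from
    startup_file: Optional[str] = None  # startup file to use


def parse_startup_config(raw: str) -> StartupConfig:
    """Parse 'key=value; key=value' (an assumed format) into a StartupConfig."""
    allowed = {f.name for f in fields(StartupConfig)}
    values = {}
    for item in raw.split(";"):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        key = key.strip()
        if not sep or key not in allowed:
            raise ValueError(f"unrecognised startup setting: {item!r}")
        values[key] = value.strip()
    return StartupConfig(**values)


if __name__ == "__main__":
    print(parse_startup_config("tarball=site.tar.gz; queue=startup-jobs; startup_file=run.sh"))
```

Keeping the image generic and pushing the differences into a small string like this means a new role needs a new value at launch, not a new bundle.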

EC2: Elastic Compute Cloud

The Ephemeral Nature

Your data is not stored anywhere permanent

Instances can be rebooted, preserving data

When they are terminated, they are gone forever

Hardware failures are rare but they can occur (just like on any server)

What to do?

S3 FUSE

MySQL replication

other failover systems

EC2: Elastic Compute Cloud

How To Access EC2

command line tools

EC2UI for Firefox

http://www.awszone.com

Questions?

More Info?

Presentation and links are available on my blog: http://alexharford.com

Contact me: [email protected]