Cloud Computing at Amazon’s EC2 Joe Steele jrsteele@unomaha.edu.

Date posted: 19-Dec-2015

Page 1: Cloud Computing at Amazon’s EC2 Joe Steele jrsteele@unomaha.edu.

Cloud Computing at Amazon’s EC2

Joe Steele, jrsteele@unomaha.edu

Page 2:

Grid Computing

• Shared resources: many computer clusters transferring data and running jobs.
• Geographically distributed.
• Cross-grid collaboration.
• The idea is analogous to an electric power network (grid), where power generators are distributed, but users access electric power without bothering about the source of energy or its location.

Page 3:

LHC Computing Grid (LCG)

Page 4:

Cloud Computing

• What if I don’t have my own cluster?
• Cloud computing refers to a cluster that invites users to send jobs (SaaS: Software as a Service).
• Computation, software, data access, and storage services that do not require user knowledge of the location or configuration of the system.
• The term comes from the cloud drawing used in the past to represent the telephone network, and later the internet.

Page 5:

Cloud Computing

Page 6:

Cloud Computing

• Private companies operate large data centers.
• When considering operational costs, 50k servers are cheaper per CPU than 1k servers (5 to 7 times cheaper).
• Amazon:
    • $0.085/cpu-hour
    • No minimum or maximum
    • No contract
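As a rough illustration of what pay-per-use pricing means in practice, the shell sketch below estimates a compute bill from the $0.085/cpu-hour rate quoted above. The function name `ec2_cost` and the example job sizes are invented for illustration, the rate is the one on the slide rather than a current price, and storage and data-transfer charges are ignored.

```shell
#!/bin/sh
# Hypothetical helper (not an AWS tool): estimate compute cost in dollars.
# usage: ec2_cost <cpus> <hours> [rate_per_cpu_hour]
ec2_cost() {
    # awk does the floating-point arithmetic; the rate defaults to the
    # slide's $0.085 per cpu-hour and can be overridden as a third argument
    awk -v n="$1" -v h="$2" -v r="${3:-0.085}" \
        'BEGIN { printf "%.2f\n", n * h * r }'
}

ec2_cost 10 24     # ten CPUs for a full day
ec2_cost 100 10    # one hundred CPUs for ten hours
```

Because there is no minimum and no contract, the estimate scales linearly: a short burst on many CPUs costs the same as a long run on a few.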

Page 7:

Amazon EC2

• aws.amazon.com
• Computing cluster: create an account and provide a credit card.
• Let Amazon take care of the hardware.

Page 8:

Cloud BioLinux

JCVI (J. Craig Venter Institute) created a cloud version of the NERC BioLinux VM.

An Ubuntu machine with over 100 NEBC software packages. The image is stored at EC2 and is available to be copied at no charge by EC2 users.

Page 9:

http://aws.amazon.com

Page 10:

Create a new account

Page 11:

Enter your information

Page 12:

Sign up for an EC2 account

Page 13:

Click on “Sign up for Amazon EC2”

Page 14:

EC2 Account

• Signing up for EC2 automatically signs you up for Amazon Simple Storage Service and Amazon Virtual Private Cloud.
• Requires credit card information.
• No charges until you start using the services.
• Amazon will email you Access Identifiers and instructions for your first login.

Page 15:

Click on “AWS Management Console”

Page 16:

Click the EC2 Tab

Page 17:

Launch an Instance

Page 18:

I recommend biolinux

Page 19:

Click “Select”

Page 20:

Pricing

• Amazon has a variety of VM sizes available – pricing is at: http://aws.amazon.com/ec2/pricing/

• You are charged for CPU usage, for data storage, and for data transferred to or from Amazon. Charges continue until a VM is “Terminated”.

• You can set up a small test VM for free – select “Micro” for the size.

Page 21:

Kernel defaults are fine

Page 22:

Create a Key Pair

Page 23:

Create security group

Page 24:

Launch

Page 25:

Machine info

Page 26:

“Terminate” to end charges

Page 27:

ssh to the machine

A window opens, telling you how to connect to your new VM, e.g.:

“ssh -i key_pair_name.pem [email protected]”

However, for biolinux, do:
ssh -i key_pair_name.pem ubuntu@ec2-76-202-01-919.compute-1.amazonaws.com
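The BioLinux login above can be wrapped in a small helper so that only the key file and the instance's public DNS name change between sessions. This is a minimal sketch: `biolinux_ssh_cmd` is a made-up name, and the host shown is just the example address from the slide, not a real machine.

```shell
#!/bin/sh
# Hypothetical helper: print the ssh command for a BioLinux instance.
# usage: biolinux_ssh_cmd <key_file.pem> <public_dns_name>
biolinux_ssh_cmd() {
    # BioLinux images log in as "ubuntu", as noted above
    printf 'ssh -i %s ubuntu@%s\n' "$1" "$2"
}

# ssh refuses private key files that other users can read, so restrict
# the downloaded key first:
#   chmod 400 key_pair_name.pem
biolinux_ssh_cmd key_pair_name.pem ec2-76-202-01-919.compute-1.amazonaws.com
```

Note that the option is a plain ASCII hyphen (`-i`); a typographic dash copied from a slide will make ssh treat it as a hostname.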

Page 28:

NX

Use NX for the graphical display (built in to biolinux already). Open source, can be found at http://www.nomachine.com/

Must ssh into the VM FIRST, using the key pair.
>adduser <username>
>groups
>usermod -G <grp1>,<grp2>,ssh <username>
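The user-setup commands above can be sketched as a small helper that prints them for review before they are run on the VM. Several things here are assumptions: the name `nx_user_cmds`, the use of `sudo` (the slide shows the bare commands), and the example user and group names. One difference from the slide is deliberate: `usermod -G` replaces the user's whole supplementary group list, so the sketch uses `-a -G` to append instead.

```shell
#!/bin/sh
# Hypothetical helper: print the commands that create an NX login user.
# usage: nx_user_cmds <username> [extra,groups]
nx_user_cmds() {
    user=$1
    grps=${2:+$2,}ssh     # always include the ssh group, as on the slide
    echo "sudo adduser $user"
    # -a appends to the user's existing groups; plain -G replaces the list
    echo "sudo usermod -a -G $grps $user"
}

nx_user_cmds alice             # only the ssh group
nx_user_cmds alice admin,nx    # extra groups (example names)
```

Running `groups` as the default user first, as the slide suggests, shows which group names are worth passing as the second argument.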

Page 29:

Start NX

Page 30:

“Configure”

Page 31:

BioLinux over NX

Page 32:

Data Stored at Amazon

There are large datasets stored at Amazon, available for use – free of charge (mostly). You are charged for any data you copy.

See http://aws.amazon.com/datasets to search through them.

Page 33:

http://aws.amazon.com/datasets

Page 34:

Datasets

Human DNA sequences:
• 1000 Genomes Project (7,300 GB)
• Ensembl Annotated Human Genome - FASTA (115 GB)
• Ensembl Annotated Human Genome - MySQL (200 GB)
• GenBank (200 GB)
• Human Liver Cohort (Sage Bionetworks) (0.6 GB)
• Illumina - Jay Flatley's Human Genome Data Set (350 GB)
• YRI Trio Data - complete genome sequence for three individuals (700 GB)

Other (might include some human data):
• Ensembl - FASTA DB (100 GB)
• Influenza Virus (including Swine Flu) - from NCBI (1 GB)
• UniGene - from NCBI (10 GB)
• PubChem Library - from NCBI (230 GB)

Page 35:

Public Snapshots

Page 36:
Page 37:

Select “Volumes”

Page 38:

Create a Volume

Page 39:

Instance Information

Page 40:

Attach it to your Instance

Page 41:

Mount the Volume

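From your VM:
(Run mkfs only on a new, empty volume. It erases existing data, so skip it for a volume created from a dataset snapshot.)
>sudo mkfs -t ext3 /dev/sdf
>sudo mkdir /mnt/datasets
>sudo mount -t ext3 /dev/sdf /mnt/datasets

200 GB of GenBank data are now in /mnt/datasets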
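One caution about the mount commands above: `mkfs` creates a new, empty filesystem and erases whatever the volume already holds, so it is only appropriate for a blank volume. The sketch below guards against formatting a volume that already carries data, by asking `blkid` whether it can find a filesystem on the device. The function names and the `BLKID` override variable are hypothetical; the device and mount point are the ones from the slide.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) only when no filesystem is found on
# the device, i.e. when mkfs would not destroy existing data.
# BLKID may be overridden for testing; by default the real tool runs as root.
needs_format() {
    if ${BLKID:-sudo blkid} "$1" >/dev/null 2>&1; then
        return 1    # blkid recognised a filesystem: do not format
    fi
    return 0        # nothing recognised: treat as a new, blank volume
}

# Format only if blank, then mount.
# usage: mount_volume [device] [mount_point]
mount_volume() {
    dev=${1:-/dev/sdf}
    dir=${2:-/mnt/datasets}
    [ -b "$dev" ] || { echo "no such block device: $dev" >&2; return 1; }
    if needs_format "$dev"; then
        sudo mkfs -t ext3 "$dev" || return 1
    fi
    sudo mkdir -p "$dir"
    # no -t option: let mount detect the filesystem type already on the volume
    sudo mount "$dev" "$dir"
}
```

On the VM this would be called as `mount_volume /dev/sdf /mnt/datasets`. On some instances the attached volume may show up under a different name (for example `/dev/xvdf`), in which case that name is passed instead.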