Post on 25-Jun-2020

Practical Cloud Computing

Stefan Tilkov | @stilkov | JBoss One Day Talk 2011

http://www.innoq.com

© 2011 innoQ Deutschland GmbH

Web & Internet

Utility Computing

Grid Computing

Service Orientation

Virtualization

Cloud Computing

“Cloud Computing” is an approach to IT architecture where resources (such as virtualized hardware, storage capacity, CPU time or higher-level services) can be dynamically reserved, used, and released over a network, usually the Internet, as needed.

“Cloudonomics”

Joe Weinman http://gigaom.com/2008/09/07/the-10-laws-of-cloudonomics/

1. Utility services cost less

2. On-demand trumps forecasting

3. Peak of the sum <= sum of the peaks

4. Reduced average unit costs

5. Superiority in numbers
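Law 3 (peak of the sum <= sum of the peaks) can be checked with a little arithmetic. The sketch below uses made-up hourly demand figures for three hypothetical workloads; the workload names and numbers are assumptions for illustration only, not data from the talk.

```python
# Hypothetical hourly demand (in servers) for three independent workloads.
demand = {
    "shop":      [2, 3, 9, 4],
    "reporting": [8, 2, 1, 1],
    "batch":     [1, 1, 2, 7],
}


def sum_of_peaks(workloads):
    """Capacity needed if each workload gets dedicated hardware sized for its own peak."""
    return sum(max(series) for series in workloads.values())


def peak_of_sum(workloads):
    """Capacity needed if all workloads share one pool sized for the combined peak."""
    return max(sum(hour) for hour in zip(*workloads.values()))


dedicated = sum_of_peaks(demand)  # 9 + 8 + 7
shared = peak_of_sum(demand)      # max(11, 6, 12, 12)
print(f"dedicated: {dedicated} servers, shared pool: {shared} servers")
```

Because the individual peaks fall in different hours, the shared pool needs fewer servers than the dedicated setups combined. If all peaks coincided, the two numbers would be equal, which is why the law is stated with <=.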

Taxonomy

Application

Service

Platform

Virtualization Layer

Hardware

You: Somebody else:

Whatever you call what you do today

Application

Service

Platform

Virtualization Layer

Hardware

You: Somebody else:

Infrastructure as a Service

Application

Service

Platform

Virtualization Layer

Hardware

IaaS

You: Somebody else:

Platform as a Service

Virtualization Layer

Hardware

Application

Service

Platform

PaaS

You: Somebody else:

Software as a Service

Virtualization Layer

Hardware

Application

Service

Platform

SaaS

Cloud Types

Public Cloud

Private Cloud

Hybrid Cloud

Community Cloud

Virtual Private Cloud

Let’s build our own!

How hard can it be?

Step 1: Manual

Physical System

User

IT Ops

Users ask for resources

Manual installation by ops team

“Real” hardware

Step 2: Virtualized

Users ask for resources

Manual installation by ops team

“Virtual” hardware

Pre-built images

Physical System(s)

User

IT Ops

VM VM VM

Image Image Image

Step 3: IT-Supported

Application for user requests

Manual installation by ops team

“Virtual” hardware

Pre-built images

Physical System(s)

User

IT Ops

VM VM

IT Support App

VM

User DB

Image Image Image

Step 4: Automated

Provisioning application for user self-service

Automated installation

“Virtual” hardware

User-defined images

Physical System(s)

User

IT Ops

VM VM VM

User DB

Persistence Store

Image Image Image

Provisioning App

Step 5: Autoscaling

Provisioning application for user self-service

Automated installation by the application

“Virtual” hardware

User-defined images

Physical System(s)

VM VM

Provisioning App

VM

User DB

Persistence Store

Image Image Image

Application
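In the autoscaling step the application itself decides when to ask the provisioning app for more or fewer VMs. A minimal sketch of such a decision rule follows; the function name, the CPU-based metric, and the thresholds are all assumptions for illustration and do not come from the talk or from any particular provider's API.

```python
def desired_instances(current, avg_cpu_pct, target_cpu_pct=60, minimum=1, maximum=10):
    """Return how many VMs to run so average CPU load moves toward the target.

    Hypothetical proportional rule: if 4 VMs run at 90% and the target is 60%,
    the same work spread over 6 VMs would sit at roughly 60%.
    """
    if current < 1:
        raise ValueError("need at least one running instance to measure load")
    # Integer ceiling division avoids floating-point rounding surprises.
    wanted = -(-(current * avg_cpu_pct) // target_cpu_pct)
    # Never drop below the minimum or exceed the budgeted maximum.
    return max(minimum, min(maximum, wanted))


print(desired_instances(4, 90))   # overloaded: scale out
print(desired_instances(4, 15))   # mostly idle: scale in
print(desired_instances(8, 95))   # would exceed the cap: clamp to maximum
```

A real setup would feed the result to whatever provisioning interface is available and add a cool-down period so the fleet does not oscillate.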

Step 6: High-level Services

Physical System(s)

User

IT Ops

VM VM VM

User DB

Persistence Store

Image Image Image

Provisioning App

Management

Monitoring

Billing

Multi-tenancy

Shared Images

Licensing

Backup

Load Balancing

Autoscaling

Usage Models

1. Dynamic, virtualized deployment

2. Do-It-Yourself scaling

3. Higher-level services

4. Parallel processing

5. Someone else’s platform

First of all …

0. Don’t do anything

Usage Models

1. Dynamic, virtualized deployment

2. Do-It-Yourself scaling

3. Higher-level services

4. Parallel processing

5. Someone else’s platform

Deploy application without modification

Same tools, same tasks

Programmatic virtualization

“Soft Hardware”

Most popular: Amazon EC2

1. Dynamic, virtualized deployment

Characteristics

Amazon EC2

Elastic Compute Cloud (EC2)

Simple Storage Service (S3)

Elastic Block Store (EBS)

1. Dynamic, virtualized deployment

Advantages

Fast deployment

Easy backup

Simple packaging

Utilization-based licensing models

(Limited) scaling

Re-use of pre-packaged instances

Easy volume snapshots

Increased reliability via availability zones

1. Dynamic, virtualized deployment

Disadvantages

Expensive when used un-elastically

Transient instances

No guarantees for latency between server instances

(Limited) scaling

(Slight) Vendor lock-in

Security concerns

1. Dynamic, virtualized deployment

Use Cases

Reference installations

Performance and load testing

Development services (e.g. build)

Time-limited hosting

Use of pre-packaged images

Rarely used (“exotic”) apps

Usage Models

1. Dynamic, virtualized deployment

2. Do-It-Yourself scaling

3. Higher-level services

4. Parallel processing

5. Someone else’s platform

Distribution across independent instances

No single point of failure

Nothing shared

Everything partitioned

Growable/shrinkable dynamically

2. Do-it-yourself scaling

Characteristics

Simple, distributed persistence

Key/value, document, or column-based

No (distributed) transactions

Eventual consistency

Examples: HBase, Cassandra, Riak, …

2. Do-it-yourself scaling

NoSQL Datastores

Partitioning into “Shards”

Each shard handled by independent component

Distribute across multiple boxes, possibly at different locations

2. Do-it-yourself scaling

Sharding

Simple approach:

target server = hash(key) mod n

What happens when server dies?

Solution: HashRing

Client-side (re-)partitioning

2. Do-it-yourself scaling

Client-controlled Partitioning
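The question "What happens when server dies?" can be made concrete by counting how many keys land on a different server when n changes. The sketch below assumes an MD5-based hash and a made-up key naming scheme; both are illustrative choices, not something prescribed by the talk.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic hash; Python's built-in hash() is salted per process for strings."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def target_server(key: str, n: int) -> int:
    """The simple approach: target server = hash(key) mod n."""
    return stable_hash(key) % n


keys = [f"user:{i}" for i in range(10_000)]  # hypothetical keys
before = {k: target_server(k, 4) for k in keys}
after = {k: target_server(k, 3) for k in keys}  # one of four servers is gone

moved = sum(1 for k in keys if before[k] != after[k])
moved_fraction = moved / len(keys)
print(f"{moved_fraction:.0%} of keys changed servers")
```

Going from four servers to three, a key stays put only if its hash has the same remainder modulo 4 and modulo 3, which holds for about a quarter of all hashes. Roughly three quarters of the data would have to move, which is the problem the hash ring addresses.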

2. Do-it-yourself scaling

Consistent Hashing

Diagrams by Tom White, http://tinyurl.com/cons-hash

A, B, C: nodes (e.g. caches)

1, 2, …: Hash values

Move clockwise to find cache

Introduce virtual replicas to ensure distribution
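The ring described above can be sketched in a few lines. This is a minimal illustration, assuming MD5 as the hash function and 100 virtual replicas per node; the class name, method names, and the replica count are assumptions, and production libraries differ in detail.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic position on the ring (MD5 chosen only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual replicas per node
        self._points = []         # sorted hash positions on the ring
        self._owner = {}          # position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Place several virtual replicas of the node around the ring.
        for i in range(self.replicas):
            point = stable_hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # Move clockwise: first position after the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, stable_hash(key))
        return self._owner[self._points[idx % len(self._points)]]


ring = HashRing(["A", "B", "C"])
keys = [f"user:{i}" for i in range(10_000)]  # hypothetical keys
before = {k: ring.lookup(k) for k in keys}
ring.remove("C")                             # node C dies
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved) / len(keys):.0%} of keys moved")
```

Only keys that lived on the removed node change owner; everything stored on A and B stays where it was. The virtual replicas spread C's former keys over both survivors instead of dumping them all on one neighbour.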

Full control

Choice of technology & products

Optimized solution

Vendor independence

2. Do-it-yourself scaling

Advantages

Challenging technologies

Many low-level tasks

Emerging practices

Significant effort required

2. Do-it-yourself scaling

Disadvantages

Strong elasticity/scaling requirements

Building higher-level platform

Virtualization of existing scalable solution

Specific technology requirements

2. Do-it-yourself scaling

Use Cases

Usage Models

1. Dynamic, virtualized deployment

2. Do-It-Yourself scaling

3. Higher-level services

4. Parallel processing

5. Someone else’s platform

Use high-level service APIs

Let someone else handle operations

Scale (more or less) seamlessly

3. Higher-level services

Characteristics

API Directory - Google Data Protocol - Google Code (http://code.google.com/apis/gdata/docs/directory.html, 11/11/09)

The following Google services provide APIs that implement the Google Data Protocol.

Each API has its own set of guides and resources, including information about using client libraries. If you're trying to accomplish a certain task with an API, the Developer's Guide for that API should point you in the right direction. Most APIs also include code samples and other easy ways to get started.

API Home: Guides; Client Libraries

Google Analytics Data Export API: Developer's Guide, Reference Guide; Client Libraries and Sample Code (JS, Java, PHP, Python, Ruby)

Google Apps APIs: List of All Apps APIs

Google Base Data API: Developer's Guide, Reference Guide

Blogger Data API: Developer's Guide, Reference Guide; Client Libraries and Sample Code (Java, .NET, PHP, Python, JS, Obj-C)

Google Booksearch Data API: Developer's Guide, Reference Guide; Client Libraries and Sample Code (Java, PHP)

Google Calendar Data API: Developer's Guide, Reference Guide; Client Libraries and Sample Code (Java, .NET, PHP, Python, JS, Obj-C)

Google Code Search Data API: Developer's Guide, Reference Guide

Google Contacts Data API: Developer's Guide, Reference Guide; Client Libraries and Sample Code (Java, .NET, Python, JS, Obj-C)

Google Documents List Data API: Developer's Guide, Reference Guide; Client Libraries and Sample Code (Java, .NET, PHP, Python, Obj-C)

Google Finance Portfolio Data API: Developer's Guide, Reference Guide

Google Health Data API: Developer's Guide, Reference Guide; Client Libraries and Sample Code (Java, .NET, PHP, Python, Ruby)

Google Maps Data API: Developer's Guide, Reference Guide

Picasa Web Albums Data API: Developer's Guide, Reference Guide; Client Libraries and Sample Code (Java, .NET, PHP, Python, Obj-C)

Google Sidewiki Data API: Developer's Guide, Reference Guide; Client Libraries and Sample Code (Java, JavaScript)

Google Sites Data API: Developer's Guide, Reference Guide; Client Libraries and Sample Code (Java, .NET, PHP, Python, Obj-C)

Google Spreadsheets Data API: Developer's Guide, Reference Guide; Client Libraries and Sample Code (Java, .NET, PHP, Python, Obj-C)

Google Webmaster Tools Data API: Developer's Guide, Reference Guide

YouTube Data API: Developer's Guide; Client Libraries and Sample Code (Java, .NET, PHP, Python, Obj-C)

3. Higher-level services

Google Service APIs

Elastic Compute Cloud (EC2)

Simple Storage Service (S3)

Simple Queue Service (SQS)

SimpleDB

CloudFront

Elastic Block Store (EBS)

Elastic MapReduce

DevPay

Flexible Payments Service (FPS)

Relational Database Service (RDS)

3. Higher-level services

Amazon Service APIs

Ease of use

Independence from implementation details

Maximum reach

High availability

No installation, operations, maintenance

3. Higher-level services

Advantages

High latency

Proprietary APIs

Dubious SLAs

Little to no portability/vendor lock-in

3. Higher-level services

Disadvantages

Storage of publicly accessible data

Global Collaboration

Coexistence with models 1 & 2

3. Higher-level services

Use Cases

Usage Models

1. Dynamic, virtualized deployment

2. Do-It-Yourself scaling

3. Higher-level services

4. Parallel processing

5. Someone else’s platform

Often based on MapReduce concept

Amazon Elastic MapReduce (model 3)

Essentially Grid computing

Dynamic due to scalable platform

4. Parallel processing

Characteristics

Used internally by Google

Described in a research paper

Massive parallelization of large data set processing

4. Parallel processing

MapReduce

[Diagram: Input → Split Input → Map, Map, Map → Shuffle/Sort → Reduce, Reduce, Reduce → Collect Results → Output]
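The stages in the diagram (split input, map, shuffle/sort, reduce, collect results) can be imitated on a single machine with a word count, the customary introductory example. The sketch below is a toy, in-process illustration with made-up input; it is not Hadoop's or Google's API, and the function names are assumptions.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle/sort: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def map_reduce(chunks):
    # In a real cluster each map_phase call would run on a different machine.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    grouped = shuffle(mapped)
    # Collect results: one reduce call per distinct key, in sorted key order.
    return dict(reduce_phase(k, v) for k, v in sorted(grouped.items()))


chunks = ["the cloud is real", "the cloud scales", "is it real"]  # already-split input
counts = map_reduce(chunks)
print(counts)
```

Since each map call only sees its own split and each reduce call only sees one key, both phases can be spread across as many machines as are available, which is where the elasticity of a cloud platform pays off.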

4. Parallel processing

Apache Hadoop

Hadoop Common

HBase

HDFS

Hive

MapReduce

Pig

ZooKeeper

Simple yet highly scalable model

Unusual for developers

Makes certain impossible tasks possible

Highly cost-effective with elasticity

4. Parallel processing

Advantages

Requires re-implementation

Hard for some (most?) developers

Only rarely applicable

4. Parallel processing

Disadvantages

Very complex calculations

Computation over very large data sets

Large data migration tasks

4. Parallel processing

Use Cases

Usage Models

1. Dynamic, virtualized deployment

2. Do-It-Yourself scaling

3. Higher-level services

4. Parallel processing

5. Someone else’s platform

Deploy application into vendor cloud

No machines, instances, OS or infrastructure software to maintain

Automatic scalability

New container model

5. Someone else’s platform

Characteristics

Web app request/response programming model

Queueing/Async processing

Scalable Persistence

Caching

Monitoring

Much more: Identity/SSO, Billing, …

5. Someone else’s platform

Programming model

Restricted, sandboxed environment

no threads

no file system

For Java:

no java.lang.System

restricted reflection

5. Someone else’s platform

Google App Engine

Mail

URL Fetch

XMPP

Image Manipulation

Memcache

TaskQueue

User API

5. Someone else’s platform

Google App Engine

Simplified Python/Java Web App Container

low level: key value store

higher-level Java persistence APIs: JDO, JPA

Live Services

App Fabric (formerly .NET Services)

SQL Azure (formerly SQL Services)

SharePoint Services

Dynamics CRM Services

5. Someone else’s platform

Microsoft Azure

Transparent hosting and scaling

Initially Rails, now also Node.js, Clojure, Java, Python, Scala

Synchronization via git

High-level APIs for Web requests, caching, etc.

Runs on AWS/EC2

Acquired by Salesforce.com (2010)

5. Someone else’s platform

Heroku

Platform based on Apache Tomcat 6

(Limited) Autoscaling

Administration console

Configurable

Customizable

5. Someone else’s platform

Amazon Elastic Beanstalk

CloudBees.com/org

OpenShift.com (Red Hat/JBoss)

Oracle Public Cloud

5. Someone else’s platform

Emerging Cloud platforms

Complete container model

No infrastructure, no middleware to maintain

Pre-defined, scalable architecture

Integration with higher-level services (e.g. identity, billing, …)

Unlimited scaling (in theory)

5. Someone else’s platform

Advantages

Very little control

Vendor lock-in

New and different APIs

5. Someone else’s platform

Disadvantages

Public Web applications with unknown scalability requirements

Non-mission-critical, internal applications

Platform extensions (e.g. Facebook)

Programming model of the future (?)

5. Someone else’s platform

Use Cases

Conclusion

1. Cloud Computing is real

2. You have use cases today

3. What are you waiting for?

Stefan Tilkov

stefan.tilkov@innoq.com

http://www.innoq.com/blog/st/

@stilkov

Phone: +49 170 471 2625

www.innoq.com

info@innoq.com

innoQ Deutschland GmbH

Halskestr. 17

D-40880 Ratingen

Phone: +49 21 02 77 172-100

innoQ Schweiz GmbH

Gewerbestr. 11

CH-6630 Cham

Phone: +41 41 02 743 01 11

Thank you!

Q&A

Somebody always asks:

“But what about Security?”

Public Cloud = insecure

Private Cloud = secure?

Counter-arguments

Private data centers are insecure

“Offline” is insecure

Attacks come from inside

Publicity/Visibility leads to scrutiny

“Compliance as a Service”

NSA sees everything, anyway

Counter-counter-arguments

Interesting business data

Really sensitive data

Who cares about arguments?