AppScale Talk at SBonRails

The AppScale Project Presented by Chris Bunch (on behalf of the AppScale team) March 7, 2011 @ sbonrails meetup Thursday, March 10, 2011

description

These are the slides from my talk about the AppScale project at the SBonRails meetup. It covers AppScale as well as Google App Engine and the research projects that have come out of it, including Neptune, a Ruby DSL focused on computation-heavy workloads.

Transcript of AppScale Talk at SBonRails

Page 1: AppScale Talk at SBonRails

The AppScale Project

Presented by Chris Bunch (on behalf of the AppScale team)

March 7, 2011 @ sbonrails meetup

Thursday, March 10, 2011

Page 2: AppScale Talk at SBonRails


Page 3: AppScale Talk at SBonRails

Overview

• Google App Engine

• AppScale - now with 50% Ruby!

• Research Directions

• Neptune - A Ruby DSL for the cloud


Page 4: AppScale Talk at SBonRails

Google App Engine

• A web framework introduced in 2008

• Python and Java supported

• Offers a Platform-as-a-Service: Use Google’s APIs to achieve scale

• Upload your app to Google


Page 5: AppScale Talk at SBonRails

Quotas


Page 6: AppScale Talk at SBonRails

Data Model

• Not relational - semi-structured schema

• Compare to models in Rails

• Exposes a get / put / delete / query interface
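To make the get / put / delete / query interface concrete, here is a minimal Ruby sketch of a schemaless store. The class name `SketchDatastore` and its in-memory hash are assumptions for illustration only; this is not the App Engine Datastore API, just the shape of the interface.

```ruby
# Hypothetical in-memory stand-in for a semi-structured datastore.
class SketchDatastore
  def initialize
    @entities = {}
  end

  # Store (or overwrite) an entity; properties are a free-form hash,
  # so two entities need not share the same set of fields.
  def put(key, properties)
    @entities[key] = properties.dup
    key
  end

  # Fetch an entity by key, or nil if it does not exist.
  def get(key)
    @entities[key]
  end

  # Remove an entity by key.
  def delete(key)
    @entities.delete(key)
  end

  # Return the keys of entities whose properties match every filter.
  def query(filters)
    @entities.select { |_key, props|
      filters.all? { |name, value| props[name] == value }
    }.keys
  end
end

store = SketchDatastore.new
store.put("post:1", :title => "Hello", :author => "chris")
store.put("post:2", :title => "Neptune", :author => "chris", :tags => ["ruby"])
store.put("post:3", :title => "Other", :author => "raj")
matching = store.query(:author => "chris")
```

Note that "post:2" carries a `:tags` property the other entities lack, which is what "semi-structured schema" means in practice.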


Page 7: AppScale Talk at SBonRails

Storing Data

• Datastore API - Persistent storage

• Memcache API - Transient storage

• User can set expiration times

• Blobstore API - Store large files

• need to enable billing to use it
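The "transient storage with expiration times" idea behind the Memcache API can be sketched in a few lines of Ruby. `SketchCache` and its injectable clock are hypothetical names used only for this illustration; the real Memcache API differs.

```ruby
# Hypothetical transient cache: entries may carry an expiration time.
class SketchCache
  # The clock is injectable so expiry can be demonstrated without sleeping.
  def initialize(clock = -> { Time.now.to_f })
    @clock = clock
    @entries = {}
  end

  # Store a value; with ttl_seconds it expires, without it stays until evicted.
  def set(key, value, ttl_seconds = nil)
    expires_at = ttl_seconds ? @clock.call + ttl_seconds : nil
    @entries[key] = [value, expires_at]
  end

  # Return the value, or nil if it is missing or has expired.
  def get(key)
    entry = @entries[key]
    return nil unless entry
    value, expires_at = entry
    if expires_at && @clock.call >= expires_at
      @entries.delete(key)
      return nil
    end
    value
  end
end

now = 0.0
cache = SketchCache.new(-> { now })
cache.set("greeting", "hi", 60)
fresh = cache.get("greeting")   # read before the 60 second limit
now = 61.0
stale = cache.get("greeting")   # read after the limit has passed
```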


Page 8: AppScale Talk at SBonRails

Be Social!

• Mail API - Send and receive e-mail

• XMPP API - Send and receive IMs

• Channel API - Create persistent connections via XMPP

• Use for chat rooms, games, etc.


Page 9: AppScale Talk at SBonRails

Background Tasks

• Cron API - Access a URL periodically

• Descriptive language: “every 5 minutes”, “every 1st Sun of Jan, Mar, Dec”, etc.

• Uses a separate cron.yaml file

• Taskqueue API - Within your app, fire off tasks to be done later
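For a feel of what a cron.yaml entry looks like, the sketch below builds one as a Ruby hash and dumps it to YAML. The `cron`, `description`, `url`, and `schedule` keys follow the App Engine cron.yaml layout; the URL and description values are made up for illustration.

```ruby
require "yaml"

# One scheduled job: App Engine requests the URL on the given schedule.
cron_config = {
  "cron" => [
    { "description" => "refresh cached summaries",  # hypothetical job
      "url"         => "/tasks/refresh",            # hypothetical handler URL
      "schedule"    => "every 5 minutes" }
  ]
}

cron_yaml = cron_config.to_yaml
puts cron_yaml
```

The schedule string is the "descriptive language" from the slide, so no crontab-style field syntax is needed.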


Page 10: AppScale Talk at SBonRails

Dealing with Users

• Users API: Uses Google Accounts

• Don’t write that ‘forgot password’ page ever again!

• Authorization: via app.yaml:

• anyone, must login, or admin only
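The three authorization levels can be expressed as a small decision function. This is a sketch only: the symbols mirror the `login` settings in app.yaml (`optional`, `required`, `admin`), while the `allowed?` helper and the user hash are hypothetical.

```ruby
# Decide whether a request may reach a handler.
# user is nil when nobody is logged in, else a hash such as
# { :email => "...", :admin => true }.
def allowed?(login_setting, user)
  case login_setting
  when :optional then true                               # anyone
  when :required then !user.nil?                         # must log in
  when :admin    then !user.nil? && user[:admin] == true # admin only
  else raise ArgumentError, "unknown login setting: #{login_setting}"
  end
end

visitor = nil
member  = { :email => "member@example.com", :admin => false }
admin   = { :email => "admin@example.com",  :admin => true }

checks = [
  allowed?(:optional, visitor),
  allowed?(:required, visitor),
  allowed?(:required, member),
  allowed?(:admin, member),
  allowed?(:admin, admin)
]
```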


Page 11: AppScale Talk at SBonRails

When Services Fail

• Originally: failures throw exceptions

• Just catch them all!

• Capabilities API: Check if a service is available

• Datastore, Memcache, and so on
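The two styles of handling a failed service, catching the exception versus asking first, can be contrasted in a short sketch. `SketchServices` and `ServiceUnavailable` are hypothetical names, not the Capabilities API itself.

```ruby
class ServiceUnavailable < StandardError; end

# Hypothetical registry of which backing services are currently up.
class SketchServices
  def initialize(available)
    @available = available
  end

  # Capability check: is the named service usable right now?
  def enabled?(name)
    @available.fetch(name, false)
  end

  # Run the block against the service, raising if the service is down.
  def call(name)
    raise ServiceUnavailable, "#{name} is down" unless enabled?(name)
    yield
  end
end

services = SketchServices.new(:datastore => true, :memcache => false)

# Style 1: attempt the call and catch the failure.
caught = begin
  services.call(:memcache) { "cached value" }
rescue ServiceUnavailable
  "fallback after exception"
end

# Style 2: check availability first and branch without an exception.
checked = if services.enabled?(:memcache)
  services.call(:memcache) { "cached value" }
else
  "fallback without exception"
end
```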


Page 12: AppScale Talk at SBonRails

Deploying Your App

• Develop locally on SDK

• Stub implementations of most APIs

• Then deploy to Google


Page 13: AppScale Talk at SBonRails

How to Scale

• Limitations on the programming model:

• No filesystem interaction

• 30 second limit per web request

• Language libraries must be on whitelist

• Sandboxed execution


Page 14: AppScale Talk at SBonRails

Enter AppScale

• App Engine is easy to use

• but we really want to tinker with the internals!

• Need an open platform to experiment on

• test API implementations

• add new APIs


Page 15: AppScale Talk at SBonRails

Enter AppScale

• Lots of NoSQL DBs out there

• Hard to compare DBs

• Configuration and deployment can be complex

• Need one-button deployment


Page 16: AppScale Talk at SBonRails

Storing Data

• Datastore API - AppServers use a database agnostic layer - sends requests to PBServer

• Named for data format: Protocol Buffers

• Memcache API - memcached

• Blobstore API - Custom server
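A database-agnostic layer can be pictured as one front-end interface with interchangeable backends behind it. The sketch below is an assumption-laden illustration in Ruby: both backends are in-memory stand-ins with made-up names, and it does not reproduce PBServer or its Protocol Buffers wire format.

```ruby
# Hypothetical backend 1: nested hashes, one per table.
class HashBackend
  def initialize; @rows = {}; end
  def put(table, key, value); (@rows[table] ||= {})[key] = value; end
  def get(table, key); (@rows[table] || {})[key]; end
  def delete(table, key); (@rows[table] || {}).delete(key); end
end

# Hypothetical backend 2: a flat key space using "table/key" strings.
class PrefixedKeyBackend
  def initialize; @pairs = {}; end
  def put(table, key, value); @pairs["#{table}/#{key}"] = value; end
  def get(table, key); @pairs["#{table}/#{key}"]; end
  def delete(table, key); @pairs.delete("#{table}/#{key}"); end
end

# The layer callers talk to; the backend is chosen by name at startup.
class SketchDatabaseLayer
  BACKENDS = { "hash" => HashBackend, "prefixed" => PrefixedKeyBackend }

  def initialize(backend_name)
    @backend = BACKENDS.fetch(backend_name).new
  end

  def put(table, key, value); @backend.put(table, key, value); end
  def get(table, key); @backend.get(table, key); end
  def delete(table, key); @backend.delete(table, key); end
end

# The same calls behave identically whichever backend is selected.
results = SketchDatabaseLayer::BACKENDS.keys.map do |name|
  layer = SketchDatabaseLayer.new(name)
  layer.put("guestbook", "entry1", "hello")
  before = layer.get("guestbook", "entry1")
  layer.delete("guestbook", "entry1")
  [before, layer.get("guestbook", "entry1")]
end
```

Because application code only sees the layer, swapping the datastore is a configuration change rather than a code change, which is what makes side-by-side database comparisons practical.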


Page 17: AppScale Talk at SBonRails

Be Social!

• Mail API - sendmail (disabled by default)

• XMPP API - ejabberd

• Channel API - strophejs


Page 18: AppScale Talk at SBonRails

Background Tasks

• Cron API - Uses Vixie Cron

• Taskqueue - Separate thread fetches web page

• Both make a single attempt

• Will replace with distributed, fault-tolerant versions


Page 19: AppScale Talk at SBonRails

Dealing with Users

• Users API: Defers users to AppLoadBalancer

• Password reset via command-line tools

• Authorization: no major changes here


Page 20: AppScale Talk at SBonRails

Deploying Your App

• Develop locally on SDK

• Stub implementations of most APIs

• Then deploy to AppScale!

• Use your own cluster or via Amazon

• Command-line tools mirror Amazon’s


Page 21: AppScale Talk at SBonRails

Deploying Your App

• run-instances: Start AppScale

• describe-instances: View cloud metadata

• upload-app: Deploy an App Engine app

• remove-app: Un-deploy an App Engine app

• terminate-instances: Stop AppScale


Page 22: AppScale Talk at SBonRails

Deployment Models

• Cloud deployment: Amazon EC2 or Eucalyptus (the open source implementation of the EC2 APIs)

• Just specify how many machines you need

• Non-cloud deployment via Xen or KVM


Page 23: AppScale Talk at SBonRails


Page 24: AppScale Talk at SBonRails

AppController

• The brains of the outfit

• Runs on every node

• Handles configuration and deployment of all services (including other AppControllers)

• Written in Ruby


Page 25: AppScale Talk at SBonRails

Load balancer

• Routes users to their app via nginx

• haproxy makes sure app servers are live

• Can’t assume the user has DNS:

• Thus we wrote the AppLoadBalancer

• Rails app that routes users to apps

• Performs authentication as well


Page 26: AppScale Talk at SBonRails

AppLoadBalancer


Page 27: AppScale Talk at SBonRails

App Server

• We modified the App Engine SDK

• Easier for Python (source included)

• Harder for Java (had to decompile)

• Removed non-scalable API implementations

• Goal: Use open source whenever possible


Page 28: AppScale Talk at SBonRails

A Common Feature Request


Page 29: AppScale Talk at SBonRails

Database Options

• Open source / open APIs / proprietary

• Master / slave vs. peer-to-peer

• Differences in query languages

• Data model (key/val, semi-structured)

• In-memory or persistent

• Data consistency model

• Interfaces - REST / Thrift / libraries


Page 30: AppScale Talk at SBonRails

In AppScale:

• BigTable clones:

• Master / slave relationship

• Master stores metadata

• Slaves store data

• Fault-tolerant to slave failure

• Partially tolerant to master failure


Page 31: AppScale Talk at SBonRails

In AppScale:

• Variably consistent DBs

• Voldemort and Cassandra

• Both are peer-to-peer: no SPOF

• Voldemort: Specify consistency per table

• Cassandra: Specify consistency per request


Page 32: AppScale Talk at SBonRails

In AppScale:

• Relational:

• Not NoSQL but used like NoSQL

• Document-oriented:

• Targets append-heavy workloads


Page 33: AppScale Talk at SBonRails

In AppScale:

• Key-value datastores:

• MemcacheDB: like memcached but persistent and replicated

• Scalaris: in-memory, no persistence

• SimpleDB: semi-structured but used as key-value (will update this in the future)


Page 34: AppScale Talk at SBonRails

Research Ideas

• Placement support

• Monitoring

• Shared memory

• Cost modeling

• Hybrid cloud

• Active Cloud DB

• Disaster Recovery

• Neptune


Page 35: AppScale Talk at SBonRails

Placement Support


Page 36: AppScale Talk at SBonRails

Monitr


Page 37: AppScale Talk at SBonRails

Shared memory

• Since AppServer + DB are co-located, reduce message overhead

• no serialization

• Leverage CoLoRs to do so across languages

• AppServer (AS) is in Python or Java, DB server (DBS) is in Python

• Can be orders-of-magnitude faster


Page 38: AppScale Talk at SBonRails

Cost modeling

• Can we reproduce Google’s cost model?

• We can reproduce memory, network bandwidth in / out, size and types of data

• Can’t reproduce CPU - it’s based on Google’s load, which we can’t capture

• varies based on placement and time of day
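As a rough illustration of the reproducible part of such a cost model, the sketch below sums usage multiplied by per-unit rates. All rates and resource names are hypothetical placeholders (in cents), not Google's prices, and CPU is left out on purpose since the slide notes it cannot be reproduced.

```ruby
# Hypothetical per-unit rates in cents; NOT Google's actual pricing.
RATES_IN_CENTS = {
  :bandwidth_in_gb  => 10,
  :bandwidth_out_gb => 12,
  :stored_gb_month  => 15
}

# Sum usage * rate over every metered resource.
# CPU is omitted: it depends on load that cannot be captured outside Google.
def estimated_cost_cents(usage, rates = RATES_IN_CENTS)
  usage.sum { |resource, amount| rates.fetch(resource) * amount }
end

usage = { :bandwidth_in_gb => 2, :bandwidth_out_gb => 5, :stored_gb_month => 10 }
cost = estimated_cost_cents(usage)   # 2*10 + 5*12 + 10*15 = 230 cents
```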


Page 39: AppScale Talk at SBonRails

Hybrid Cloud


Page 40: AppScale Talk at SBonRails

Database Agnostic Transactions

• Want to support disparate DBs with ACID

• Leverage ZooKeeper for versioning

• And PBServer as the DB agnostic layer

• Needs strong consistency from DB itself

• And row-level atomicity on updates
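One way to picture versioning-based transactions is the sketch below: every write lands in a new versioned row, and a separate registry records which version readers may see. This is an assumption about the general technique, not the AppScale implementation: `SketchVersionRegistry` merely stands in for a coordination service such as ZooKeeper, the class names are invented, and locking and concurrency are ignored entirely.

```ruby
# Stand-in for the coordination service: tracks, per key, the newest
# version handed out and the version that has been committed.
class SketchVersionRegistry
  def initialize
    @latest = Hash.new(0)
    @committed = Hash.new(0)
  end

  def next_version(key); @latest[key] += 1; end
  def commit(key, version); @committed[key] = version; end
  def committed_version(key); @committed[key]; end
end

class SketchTransactionalStore
  def initialize(registry)
    @registry = registry
    @rows = {}   # stand-in for any DB offering atomic single-row writes
  end

  # Yields a writer; its writes become visible only if the block
  # finishes without raising. Returns true on commit, false on abort.
  def transaction
    written = {}
    writer = lambda do |key, value|
      version = @registry.next_version(key)
      @rows[[key, version]] = value   # new versioned row, not yet visible
      written[key] = version
    end
    yield writer
    written.each { |key, version| @registry.commit(key, version) }
    true
  rescue StandardError
    false   # registry untouched, so readers keep seeing the old versions
  end

  # Readers only ever see the committed version of a key.
  def get(key)
    @rows[[key, @registry.committed_version(key)]]
  end
end

store = SketchTransactionalStore.new(SketchVersionRegistry.new)
store.transaction do |write|
  write.call("balance:a", 100)
  write.call("balance:b", 0)
end

ok = store.transaction do |write|
  write.call("balance:a", 50)
  raise "simulated failure before the second write"
end
```

After the failed transaction, `store.get("balance:a")` still returns the committed value, because the half-finished write was stored under a version that was never committed.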


Page 41: AppScale Talk at SBonRails

Active Cloud DB

• Need a common interface to DBs

• But not just for Java / Python

• Named after Rails’ ActiveRecord

• Exposes REST interface for DB

• Included in AppScale 1.3


Page 42: AppScale Talk at SBonRails

Disaster Recovery

• People are using App Engine as a production level environment

• Need a way to automatically back up data

• Can leverage this data for data analytics

• Need to also seamlessly switch to AppScale version if App Engine version goes down


Page 43: AppScale Talk at SBonRails

Neptune

• Need a simple way to run compute-intensive jobs

• We have the code from the ‘net

• We have the resources - the cloud

• But the average user does not have the know-how

• Our solution: create a domain specific language for configuring cloud apps

• Based on Ruby


Page 44: AppScale Talk at SBonRails

Syntax

• It’s as easy as:

neptune :type => "mpi",
        :code => "MpiNQueens",
        :nodes_to_use => 8,
        :output => "/mpi/output-1.txt"
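The call above is plain Ruby: a method invoked without parentheses and handed a single hash of options. The sketch below shows how such a method could receive and validate that hash. `sketch_neptune` and the table of required parameters are hypothetical; the real `neptune` method from the gem contacts an AppScale deployment and may require different keys.

```ruby
# Hypothetical table of which options each job type needs.
REQUIRED_PARAMS = {
  "mpi"     => [:code, :nodes_to_use, :output],
  "compile" => [:code]
}

# Stand-in for the DSL entry point: validates the options hash and
# returns a summary instead of submitting a real job.
def sketch_neptune(params)
  type = params.fetch(:type) { raise ArgumentError, "missing :type" }
  required = REQUIRED_PARAMS.fetch(type) do
    raise ArgumentError, "unsupported job type: #{type}"
  end
  missing = required.reject { |name| params.key?(name) }
  raise ArgumentError, "missing #{missing.join(', ')}" unless missing.empty?
  { :accepted => true, :type => type }
end

job = sketch_neptune :type => "mpi",
                     :code => "MpiNQueens",
                     :nodes_to_use => 8,
                     :output => "/mpi/output-1.txt"
```

Keeping the whole job description in one hash is what lets a single entry point serve very different job types.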


Page 45: AppScale Talk at SBonRails

Neptune Supports:

• Message Passing Interface (MPI)

• MapReduce

• Unified Parallel C (UPC)

• X10

• Erlang


Page 46: AppScale Talk at SBonRails

Extensibility

• Experts can add support for other computational jobs

• Biochemists can run simulations via DFSP and dwSSA

• Embarrassingly parallel Monte Carlo simulations


Page 47: AppScale Talk at SBonRails

Compiling Code

• You may not have the binaries, so compile from source!

• Auto-generates makefiles for beginners

neptune :type => "compile",
        :code => "/home/appscale/mpi_nqueens"


Page 48: AppScale Talk at SBonRails

Installing Neptune

• Just use good old ‘gem’:

• gem install neptune

• Current version is 0.0.4, fully compatible with AppScale 1.5

• More info at our web page:

• http://neptune-lang.org


Page 49: AppScale Talk at SBonRails

Wrapping It Up

• Thanks to the AppScale team, especially:

• Co-lead Navraj Chohan and advisor Professor Chandra Krintz

• Check us out on the web:

• http://appscale.cs.ucsb.edu

• http://code.google.com/p/appscale
