AppScale Talk at SBonRails
The AppScale Project
Presented by Chris Bunch (on behalf of the AppScale team)
March 7, 2011 @ sbonrails meetup
Thursday, March 10, 2011
Overview
• Google App Engine
• AppScale - now with 50% Ruby!
• Research Directions
• Neptune - A Ruby DSL for the cloud
Google App Engine
• A web framework introduced in 2008
• Python and Java supported
• Offers a Platform-as-a-Service: Use Google’s APIs to achieve scale
• Upload your app to Google
Quotas
Data Model
• Not relational - semi-structured schema
• Compare to models in Rails
• Exposes a get / put / delete / query interface
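To make the get / put / delete / query interface concrete, here is a minimal in-memory sketch. The class name TinyDatastore and its method signatures are assumptions made for illustration; they are not the App Engine Datastore API. Note that entities under different keys can carry different properties, which is what "semi-structured" means here.

```ruby
# Hypothetical in-memory stand-in for a semi-structured datastore.
# Class and method names are illustrative, not the App Engine API.
class TinyDatastore
  def initialize
    @entities = {} # key => hash of properties (no fixed schema)
  end

  # Store (or overwrite) an entity under the given key.
  def put(key, properties)
    @entities[key] = properties.dup
    key
  end

  # Fetch an entity by key; returns nil when it does not exist.
  def get(key)
    @entities[key]
  end

  # Remove an entity; returns the removed properties or nil.
  def delete(key)
    @entities.delete(key)
  end

  # Return every entity whose properties match all of the given filters.
  def query(filters)
    @entities.values.select do |props|
      filters.all? { |name, value| props[name] == value }
    end
  end
end

store = TinyDatastore.new
store.put("post:1", { :title => "Hello", :author => "chris" })
store.put("post:2", { :title => "Scaling", :author => "chris", :tags => ["cloud"] })
store.put("user:1", { :name => "raj" })

by_chris = store.query(:author => "chris") # two matching entities
store.delete("post:1")
```

Unlike a Rails model, nothing here declares columns up front; the query simply filters on whatever properties each entity happens to have.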
Storing Data
• Datastore API - Persistent storage
• Memcache API - Transient storage
• User can set expiration times
• Blobstore API - Store large files
• need to enable billing to use it
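The difference between persistent and transient storage is easiest to see with expiration times. The sketch below is a hypothetical cache (TinyCache, set and get are names chosen for this example, not the Memcache API); the current time is passed in explicitly so the expiry behaviour is easy to follow.

```ruby
# Hypothetical transient cache with per-entry expiration.
# Names and signatures are illustrative, not the Memcache API.
class TinyCache
  def initialize
    @entries = {} # key => [value, expires_at or nil]
  end

  # Store a value; ttl is in seconds, nil means "no expiration".
  def set(key, value, ttl = nil, now = Time.now)
    expires_at = ttl ? now + ttl : nil
    @entries[key] = [value, expires_at]
    true
  end

  # Return the value, or nil if it is missing or has expired.
  def get(key, now = Time.now)
    value, expires_at = @entries[key]
    return nil if value.nil?
    if expires_at && now >= expires_at
      @entries.delete(key) # evict lazily on read
      return nil
    end
    value
  end
end

cache = TinyCache.new
start = Time.at(0)
cache.set("greeting", "hello", 60, start)
fresh = cache.get("greeting", start + 30) # still within the 60 second window
stale = cache.get("greeting", start + 61) # past the expiration time
```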
Be Social!
• Mail API - Send and receive e-mail
• XMPP API - Send and receive IMs
• Channel API - Creating persistent connections via XMPP
• Use for chat rooms, games, etc.
Background Tasks
• Cron API - Access a URL periodically
• Descriptive language: “every 5 minutes”, “every 1st Sun of Jan, Mar, Dec”, etc.
• Uses a separate cron.yaml file
• Taskqueue API - Within your app, fire off tasks to be done later
Dealing with Users
• Users API: Uses Google Accounts
• Don’t write that ‘forgot password’ page ever again!
• Authorization: via app.yaml:
• anyone, must login, or admin only
When Services Fail
• Originally: failures throw exceptions
• Just catch them all!
• Capabilities API: Check if a service is available
• Datastore, Memcache, and so on
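The two styles of failure handling above can be compared side by side. This is only a sketch: ServiceUnavailable, FlakyDatastore and enabled? are hypothetical names, not the App Engine Capabilities API.

```ruby
# Hypothetical names throughout: ServiceUnavailable, FlakyDatastore and
# enabled? are illustrative, not the App Engine Capabilities API.
class ServiceUnavailable < StandardError; end

class FlakyDatastore
  def initialize(available)
    @available = available
    @data = {}
  end

  # Capability-style check: ask before calling.
  def enabled?
    @available
  end

  def put(key, value)
    raise ServiceUnavailable, "datastore is down" unless @available
    @data[key] = value
  end
end

# Style 1: call the service and catch the failure.
def save_with_rescue(store, key, value)
  store.put(key, value)
  :saved
rescue ServiceUnavailable
  :failed
end

# Style 2: check the capability first and degrade gracefully.
def save_with_check(store, key, value)
  return :read_only_mode unless store.enabled?
  store.put(key, value)
  :saved
end
```

The second style lets the app decide up front to show, say, a read-only page instead of discovering the outage halfway through a request.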
Deploying Your App
• Develop locally on SDK
• Stub implementations of most APIs
• Then deploy to Google
How to Scale
• Limitations on the programming model:
• No filesystem interaction
• 30 second limit per web request
• Language libraries must be on whitelist
• Sandboxed execution
Enter AppScale
• App Engine is easy to use
• but we really want to tinker with the internals!
• Need an open platform to experiment on
• test API implementations
• add new APIs
Enter AppScale
• Lots of NoSQL DBs out there
• Hard to compare DBs
• Configuration and deployment can be complex
• Need one-button deployment
Storing Data
• Datastore API - AppServers send requests to PBServer, a database-agnostic layer
• Named for data format: Protocol Buffers
• Memcache API - memcached
• Blobstore API - Custom server
Be Social!
• Mail API - sendmail (disabled by default)
• XMPP API - ejabberd
• Channel API - strophejs
Background Tasks
• Cron API - Uses Vixie Cron
• Taskqueue - Separate thread fetches web page
• Both make a single attempt
• Will replace with distributed, fault-tolerant versions
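A rough picture of "a separate thread makes a single attempt" follows. It is a sketch under stated assumptions, not AppScale's code: TinyTaskQueue is a hypothetical class, and each task is a plain Ruby block standing in for the web page fetch.

```ruby
# Hypothetical in-process task queue. Each task is a block standing in
# for "fetch this URL"; this is not AppScale's implementation.
class TinyTaskQueue
  def initialize
    @tasks = Queue.new
  end

  # Enqueue work to be done later; returns immediately.
  def add(&task)
    @tasks << task
    self
  end

  # Run everything queued so far on a separate thread.
  # Each task gets exactly one attempt: failures are recorded, not retried.
  def drain
    Thread.new do
      results = []
      until @tasks.empty?
        task = @tasks.pop
        begin
          results << [:ok, task.call]
        rescue StandardError => e
          results << [:failed, e.message]
        end
      end
      results
    end
  end
end

queue = TinyTaskQueue.new
queue.add { 1 + 1 }.add { raise "boom" }
results = queue.drain.value # waits for the worker thread to finish
```

The failed task simply stays failed, which is the gap a distributed, fault-tolerant replacement would need to close.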
Dealing with Users
• Users API: Defers users to AppLoadBalancer
• Password reset via command-line tools
• Authorization: no major changes here
Deploying Your App
• Develop locally on SDK
• Stub implementations of most APIs
• Then deploy to AppScale!
• Use your own cluster or via Amazon
• Command-line tools mirror Amazon’s
Deploying Your App
• run-instances: Start AppScale
• describe-instances: View cloud metadata
• upload-app: Deploy an App Engine app
• remove-app: Un-deploy an App Engine app
• terminate-instances: Stop AppScale
Deployment Models
• Cloud deployment: Amazon EC2 or Eucalyptus (the open source implementation of the EC2 APIs)
• Just specify how many machines you need
• Non-cloud deployment via Xen or KVM
AppController
• The brains of the outfit
• Runs on every node
• Handles configuration and deployment of all services (including other AppControllers)
• Written in Ruby
Load balancer
• Routes users to their app via nginx
• haproxy makes sure app servers are live
• Can’t assume the user has DNS:
• Thus we wrote the AppLoadBalancer
• Rails app that routes users to apps
• Performs authentication as well
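The routing idea (send each user to a live app server for their app) can be sketched in a few lines. Everything here is assumed for illustration: TinyBalancer, the app name and the addresses are made up, and the real work is done by nginx, haproxy and the AppLoadBalancer Rails app rather than code like this.

```ruby
# Hypothetical sketch: app names, addresses and the health map are made up.
class TinyBalancer
  def initialize(servers_by_app)
    @servers_by_app = servers_by_app # app name => list of "host:port"
    @live = Hash.new(true)           # servers are assumed live until marked down
    @next_index = Hash.new(0)        # app name => round-robin cursor
  end

  # Record the result of a health check for one server.
  def mark(server, live)
    @live[server] = live
  end

  # Pick the next live server for an app, or nil if none are live.
  def route(app)
    candidates = @servers_by_app.fetch(app, []).select { |s| @live[s] }
    return nil if candidates.empty?
    server = candidates[@next_index[app] % candidates.length]
    @next_index[app] += 1
    server
  end
end

lb = TinyBalancer.new("guestbook" => ["10.0.0.1:8080", "10.0.0.2:8080"])
first = lb.route("guestbook")
second = lb.route("guestbook")
lb.mark("10.0.0.1:8080", false) # a health check failed
third = lb.route("guestbook")   # only the remaining live server is eligible
```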
AppLoadBalancer
App Server
• We modified the App Engine SDK
• Easier for Python (source included)
• Harder for Java (had to decompile)
• Removed non-scalable API implementations
• Goal: Use open source whenever possible
A Common Feature Request
Database Options
• Open source / open APIs / proprietary
• Master / slave vs. peer-to-peer
• Differences in query languages
• Data model (key/val, semi-structured)
• In-memory or persistent
• Data consistency model
• Interfaces - REST / Thrift / libraries
In AppScale:
• BigTable clones:
• Master / slave relationship
• Master stores metadata
• Slaves store data
• Fault-tolerant to slave failure
• Partially tolerant to master failure
In AppScale:
• Variably consistent DBs
• Voldemort and Cassandra
• Both are peer-to-peer: no SPOF
• Voldemort: Specify consistency per table
• Cassandra: Specify consistency per request
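What "variably consistent" buys you can be shown with the usual replica-overlap rule: with n replicas, reading r of them and writing w of them is guaranteed to overlap when r + w > n. The sketch below contrasts fixing those numbers per table with choosing them per request. The helper names, the table settings and the replica counts are assumptions for illustration, not either database's client API.

```ruby
# Hypothetical helper: with n replicas, a read of r replicas is guaranteed
# to overlap a write of w replicas when r + w > n.
def strongly_consistent?(n, r, w)
  r + w > n
end

# Per-table style: the levels are fixed when the table is defined.
# (Illustrative settings, not a real configuration file.)
TABLE_SETTINGS = { "users" => { :n => 3, :r => 2, :w => 2 } }

# Per-request style: each call names the level it wants.
LEVELS = { :one => 1, :quorum => 2, :all => 3 } # replica counts for n = 3

def request_consistent?(read_level, write_level, n = 3)
  strongly_consistent?(n, LEVELS[read_level], LEVELS[write_level])
end

users = TABLE_SETTINGS["users"]
table_ok = strongly_consistent?(users[:n], users[:r], users[:w])
fast_but_weak = request_consistent?(:one, :one)
quorum_both = request_consistent?(:quorum, :quorum)
```

Per-request control lets a single app mix cheap, possibly stale reads with stricter ones where it matters.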
In AppScale:
• Relational:
• Not NoSQL but used like NoSQL
• Document-oriented:
• Targets append-heavy workloads
In AppScale:
• Key-value datastores:
• MemcacheDB: like memcached but persistent and replicated
• Scalaris: in-memory, no persistence
• SimpleDB: semi-structured but used as key-value (will update this in the future)
Research Ideas
• Placement support
• Monitoring
• Shared memory
• Cost modeling
• Hybrid cloud
• Active Cloud DB
• Disaster Recovery
• Neptune
Placement Support
Monitr
Shared memory
• Since AppServer + DB are co-located, reduce message overhead
• no serialization
• Leverage CoLoRs to do so across languages
• AppServer (AS) is in Python or Java, DB server (DBS) is in Python
• Can be orders-of-magnitude faster
Cost modeling
• Can we reproduce Google’s cost model?
• We can reproduce memory, network bandwidth in / out, size and types of data
• Can’t reproduce CPU - it’s based on Google’s load, which we can’t capture
• varies based on placement and time of day
Hybrid Cloud
Database Agnostic Transactions
• Want to support disparate DBs with ACID
• Leverage ZooKeeper for versioning
• And PBServer as the DB agnostic layer
• Needs strong consistency from DB itself
• And row-level atomicity on updates
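One way to see why versioning plus row-level atomicity is enough to detect conflicting writes is a version-checked update. This is a generic sketch of that idea, not AppScale's actual transaction protocol: VersionedRows is a hypothetical class, and a Mutex stands in for both the coordination service and the atomic row update in the underlying database.

```ruby
# Hypothetical sketch of a versioned, compare-and-set style row update.
class VersionedRows
  def initialize
    @rows = {}        # key => [version, value]
    @lock = Mutex.new # stands in for row-level atomicity in the DB
  end

  # Return [version, value]; unknown rows start at version 0.
  def read(key)
    @lock.synchronize { @rows.fetch(key, [0, nil]) }
  end

  # Apply the write only if nobody changed the row since it was read.
  def write(key, expected_version, value)
    @lock.synchronize do
      current_version, _ = @rows.fetch(key, [0, nil])
      return false unless current_version == expected_version
      @rows[key] = [current_version + 1, value]
      true
    end
  end
end

rows = VersionedRows.new
version, _ = rows.read("counter")
first = rows.write("counter", version, 1)  # succeeds, row moves to version 1
second = rows.write("counter", version, 2) # rejected: based on a stale version
```

Without strong consistency from the database, the version read in the first step could itself be stale, which is why the slide lists it as a requirement.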
Active Cloud DB
• Need a common interface to DBs
• But not just for Java / Python
• Named after Rails’ ActiveRecord
• Exposes REST interface for DB
• Included in AppScale 1.3
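A REST interface in front of a database boils down to mapping HTTP verbs onto get / put / delete. The sketch below shows that mapping without any networking. The /resources/<key> path, the status codes and the TinyRestFront class are assumptions for illustration, not the Active Cloud DB wire format.

```ruby
# Hypothetical sketch: the /resources/<key> path and the status codes chosen
# here are assumptions, not the Active Cloud DB wire format.
class TinyRestFront
  def initialize
    @db = {} # stands in for whichever datastore sits behind the interface
  end

  # Handle one request and return [status, body].
  def call(verb, path, body = nil)
    key = path[%r{\A/resources/(.+)\z}, 1]
    return [404, nil] if key.nil?

    case verb
    when "GET"
      @db.key?(key) ? [200, @db[key]] : [404, nil]
    when "PUT"
      @db[key] = body
      [200, body]
    when "DELETE"
      @db.key?(key) ? [200, @db.delete(key)] : [404, nil]
    else
      [405, nil]
    end
  end
end

api = TinyRestFront.new
created = api.call("PUT", "/resources/color", "blue")
fetched = api.call("GET", "/resources/color")
removed = api.call("DELETE", "/resources/color")
missing = api.call("GET", "/resources/color")
```

Because the interface is plain HTTP, any language with an HTTP client can use it, not just Java or Python.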
Disaster Recovery
• People are using App Engine as a production level environment
• Need a way to automatically back up data
• Can leverage this data for data analytics
• Need to also seamlessly switch to AppScale version if App Engine version goes down
Neptune
• Need a simple way to run compute-intensive jobs
• We have the code from the ‘net
• We have the resources - the cloud
• But the average user does not have the know-how
• Our solution: create a domain specific language for configuring cloud apps
• Based on Ruby
Syntax
• It’s as easy as:
neptune :type => "mpi",
        :code => "MpiNQueens",
        :nodes_to_use => 8,
        :output => "/mpi/output-1.txt"
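For anyone wondering why that call reads like configuration rather than code: in Ruby, a method that takes a single hash of options can be called without parentheses or braces. The sketch below shows the mechanism with a hypothetical describe_job method. It is not the Neptune gem's implementation, and the list of required keys is an assumption that fits this one example.

```ruby
# Hypothetical sketch, not the Neptune gem's implementation: a single
# method that takes a hash of options is enough to read like a small DSL.
REQUIRED_KEYS = [:type, :code, :nodes_to_use, :output] # assumed for this example

def describe_job(options)
  missing = REQUIRED_KEYS - options.keys
  raise ArgumentError, "missing: #{missing.join(', ')}" unless missing.empty?
  unless options[:nodes_to_use].is_a?(Integer) && options[:nodes_to_use] > 0
    raise ArgumentError, ":nodes_to_use must be a positive integer"
  end
  "run #{options[:code]} as #{options[:type]} on " \
    "#{options[:nodes_to_use]} nodes, output to #{options[:output]}"
end

summary = describe_job :type => "mpi",
                       :code => "MpiNQueens",
                       :nodes_to_use => 8,
                       :output => "/mpi/output-1.txt"
```

Validating the options up front means a typo in a job description fails immediately instead of after machines have been started.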
Neptune Supports:
• Message Passing Interface (MPI)
• MapReduce
• Unified Parallel C (UPC)
• X10
• Erlang
Extensibility
• Experts can add support for other computational jobs
• Biochemists can run simulations via DFSP and dwSSA
• Embarrassingly parallel Monte Carlo simulations
Compiling Code
• You may not have the binaries, so compile from source!
• Auto-generates makefiles for beginners
neptune :type => "compile",
        :code => "/home/appscale/mpi_nqueens"
Installing Neptune
• Just use good old ‘gem’:
• gem install neptune
• Current version is 0.0.4, fully compatible with AppScale 1.5
• More info at our web page:
• http://neptune-lang.org
Wrapping It Up
• Thanks to the AppScale team, especially:
• Co-lead Navraj Chohan and advisor Professor Chandra Krintz
• Check us out on the web:
• http://appscale.cs.ucsb.edu
• http://code.google.com/p/appscale