appscale: open-source platform-level cloud computing
I2 Joint Techs February 2nd, 2010
Chandra Krintz Computer Science Dept.
Univ. of California, Santa Barbara
appscale
cloud computing
• Remote access to distributed and shared cluster resources
  Potentially owned by someone else (e.g. Amazon, Google, …)
  Users rent access to vast resources
  Advertised service-level agreements (SLAs)
  Resources are opaque and isolated
  Highly scalable, fault tolerant
  Service-oriented, utility computing
  Relies on OS, network, and storage virtualization
[Diagram labels: SLAs, Web Services, Virtualization]
cloud computing
• 3 types: as-a-Service (aaS)
  Infrastructure: Amazon Web Services (EC2, S3, EBS)
    Virtualized, isolated (CPU, Network, Storage) systems on which users execute entire runtime stacks
    Fully customer self-service
    Open APIs (IaaS standard), scalable services
  Platform: Google App Engine, Microsoft Azure
    Scalable program-level abstractions via well-defined interfaces
    Enable construction of network-accessible applications
    Process-level (sandbox) isolation, complete software stack
  Software: Salesforce.com
    Applications provided to thin clients over a network
    Customizable
an opening in the clouds
• Open-source cloud computing systems from the UCSB Computer Science Department
  Goal: Bring popular cloud fabrics to “on-premise” clusters that are easy to use and are transparent
  To facilitate investigation of
    Energy-efficient cloud computing
    Services, underlying device technology, support technologies
    Customization (availability, performance, application behavior)
    Hybrid cloud solutions (public and on-premise)
  By emulating key cloud layers from the commercial sector
    Engender user community, access to real applications/users
    Leverage extant software technologies
  Not a replacement technology for any Public Cloud service
cloud computing from UCSB
• IaaS: Open-source implementation of all AWS APIs
  Robust, highly-available, scalable emulation
  Cluster/data center support over Xen, KVM, VMware
• PaaS: Open-source implementation of Google App Engine APIs
  Pluggable (services), scalable, fault tolerant
  Runs over virtualization or IaaS layer: AWS, Eucalyptus
google app engine
[Diagram: GAE Application (Python, Java) on Google App Engine (GAE), served at MyApp.appspot.com. Services: Images, IM, Memcache, Mail, Users, URL Fetch, Cron, Tasks, Data Store, Blob Store. Other labels: Administrator Console, Protobuf Data APIs, SDC, private enterprise data.]
google app engine: the sdk
Open-source Google App Engine Software Development Kit (SDK)
  python2.5 dev_appserver.py --port=8181 MyApp
[Diagram: GAE Application (Python, Java) on Google App Engine (GAE). Services: Images, IM, Memcache, Mail, Users, URL Fetch, Cron, Tasks, Data Store, Blob Store.]
google app engine: run/test locally
Open-source Google App Engine Software Development Kit (SDK)
  python2.5 dev_appserver.py --port=8181 MyApp
  Simulation of actual API functionality using localhost (flat file, in-memory hash (Memcache))
[Diagram: GAE Application (Python, Java) on Google App Engine (GAE). Services: Images, IM, Memcache, Mail, Users, URL Fetch, Cron, Tasks, Data Store, Blob Store. Annotations on individual services: sendmail, curl/wget, framework lib, no auth, on console.]
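The "in-memory hash" style of local simulation can be pictured with a small sketch. This is not the SDK's actual stub: the class name `LocalMemcacheStub`, its method names, and the `ttl` parameter are assumptions made only to illustrate how a cache service could be simulated on localhost with a plain dictionary.

```python
import time


class LocalMemcacheStub:
    """Hypothetical in-memory stand-in for a Memcache-style service."""

    def __init__(self, clock=time.time):
        self._clock = clock   # injectable clock, which makes expiry testable
        self._items = {}      # key -> (value, expiry timestamp or None)

    def set(self, key, value, ttl=None):
        """Store a value; ttl is an optional lifetime in seconds."""
        expires = self._clock() + ttl if ttl is not None else None
        self._items[key] = (value, expires)
        return True

    def get(self, key):
        """Return the cached value, or None if it is missing or expired."""
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires = entry
        if expires is not None and self._clock() >= expires:
            del self._items[key]   # lazily evict the expired entry
            return None
        return value

    def delete(self, key):
        """Remove a key and report whether it was present."""
        return self._items.pop(key, None) is not None
```

Because everything lives in one process, the contents vanish when the local server stops, which is acceptable for run/test use but not for a deployed application.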
google app engine: upload to google
Google App Engine (GAE)
  appcfg.py update MyApp/
  Free w/ quotas
  Pay for additional scale: CPU, BW, emails, data
  BigTable
  Automatic scaling
  High availability
[Diagram: GAE Application (Python, Java) on Google App Engine (GAE), served at MyApp.appspot.com; GAE app users … via the Internet. Services: Images, IM, Memcache, Mail, Users, URL Fetch, Cron, Tasks, Data Store, Blob Store. Other labels: Administrator Console, SDC, private enterprise data.]
sandbox restrictions
GAE Application (Python, Java) on Google App Engine (GAE): MyApp.appspot.com
• Pure Python or Java, white list of library calls to framework
• No thread/subprocess spawning, system calls
• No writes to file system, reads only to static files uploaded w/app
• Storage using key-value, schema-free datastore (Bigtable-based)
• HTTP/S communication only, CGI to handle page requests
• Limit on number of datastore elements accessed per request
• Limit on response duration, task frequency, request rate
• Enforced quotas (BW, CPU, requests/s, files, app size, …)
• Other things to consider
  • Your code and data on Google resources
  • APIs customized for MVC applications
  • Other application domains not supported
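To make two of these restrictions concrete (a schema-free key-value datastore, and a cap on datastore elements accessed per request), here is a minimal sketch. It is not Google's implementation: `RequestScopedStore`, `DatastoreLimitExceeded`, and the cap value passed in are all hypothetical names and numbers chosen for illustration.

```python
class DatastoreLimitExceeded(Exception):
    """Raised when one request touches more entities than allowed."""


class RequestScopedStore:
    """Hypothetical schema-free key-value store with a per-request access cap."""

    def __init__(self, max_accesses_per_request):
        self._data = {}                       # key -> dict of properties
        self._max = max_accesses_per_request  # assumed cap, not a real quota
        self._used = 0

    def begin_request(self):
        """Reset the counter at the start of each page request."""
        self._used = 0

    def _charge(self):
        # Refuse the access once the request has used up its allowance.
        if self._used >= self._max:
            raise DatastoreLimitExceeded(
                "request exceeded %d datastore accesses" % self._max)
        self._used += 1

    def put(self, key, entity):
        """Store any dict of properties; no schema is enforced."""
        self._charge()
        self._data[key] = dict(entity)

    def get(self, key):
        """Return a copy of the stored entity, or None if absent."""
        self._charge()
        entity = self._data.get(key)
        return dict(entity) if entity is not None else None
```

An application written against limits like these has to batch or paginate its work across requests rather than scan large data sets in a single handler.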
from gae to appscale
• GAE SDK extensions
  Pluggable using open-source distributed database technologies
    HBase, Hypertable, Cassandra, Voldemort, MongoDB, MemcacheDB, MySQL
  MemcacheD library (Python and Java)
  From console or as background thread (automatically)
  Interface to Hadoop (MapReduce)
    Multi-language support: Python, Java, Ruby, Perl, soon: X10
  Translator to Linux Cron job, similar to Tasks
  Pluggable: built-in cloud-wide authentication via Rails, support for Eucalyptus and EC2 credentials
[Diagram labels for the extended services: Memcache, Users, Cron, Tasks, Data Store]
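A pluggable datastore of the kind listed above is commonly built as one narrow interface with interchangeable backends. The sketch below shows that pattern only; `DatastoreBackend`, `InMemoryBackend`, `BACKENDS`, and `make_datastore` are hypothetical names, and the in-memory backend merely stands in for a real database such as HBase or Cassandra. It does not reproduce AppScale's actual interface.

```python
from abc import ABC, abstractmethod


class DatastoreBackend(ABC):
    """Hypothetical interface each pluggable database would implement."""

    @abstractmethod
    def put(self, table, key, row):
        """Store a row (a dict of properties) under table/key."""

    @abstractmethod
    def get(self, table, key):
        """Return the row for table/key, or None if absent."""

    @abstractmethod
    def delete(self, table, key):
        """Remove table/key and report whether it existed."""


class InMemoryBackend(DatastoreBackend):
    """Stand-in backend; a real one would talk to a database service."""

    def __init__(self):
        self._tables = {}

    def put(self, table, key, row):
        self._tables.setdefault(table, {})[key] = dict(row)

    def get(self, table, key):
        row = self._tables.get(table, {}).get(key)
        return dict(row) if row is not None else None

    def delete(self, table, key):
        return self._tables.get(table, {}).pop(key, None) is not None


# Registry mapping a configuration name to a backend class (assumed names).
BACKENDS = {"memory": InMemoryBackend}


def make_datastore(name):
    """Instantiate the backend selected by name."""
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError("unknown datastore backend: %r" % name) from None
```

With this shape, application code calls only `put`, `get`, and `delete`, so swapping databases is a configuration change rather than a code change.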
appscale
• Distributed system with four key components
  AppLoadBalancer (ALB)
  AppServer (AS)
  Database Master/Peer (DB M/P)
  Database Slave/Peer (DB S/P)
• Services
  Automatic deployment, database replication, node & front-end scaling
  Over Eucalyptus, EC2, and virtualization (Xen, KVM)
  System-wide performance/availability monitoring, user/admin console
[Diagram labels: GAE App Developer (AppScale Admin), AppScale tools, HTTPS, GAE App Users, AppScale Cloud, AppController, ALB, AS, DB M/P, DB S/P, Tasks (e.g. MapReduce)]
appscale
[Diagram labels: GAE App Developer (AppScale Admin), AppScale tools, HTTPS, GAE App Users, AppScale Cloud, AppController, ALB, AS, DB M/P, DB S/P, Tasks (e.g. MapReduce)]
• Implements every AppScale component
  Can instantiate as a particular role (ALB, AS, DB)
  Can change functionality and instantiate itself as another
• AppScale tools deploy/control cloud
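The idea of a node that carries every component and takes on one role at a time can be sketched as follows. How AppScale actually performs role assignment is not shown in the slides, so the `Node` class, its `start_as` method, and the `deploy` helper (standing in for the tools) are all hypothetical; only the role names ALB, AS, and DB come from the slide.

```python
ROLES = ("ALB", "AS", "DB")  # role names taken from the slide


class Node:
    """Hypothetical node that can run any one role at a time."""

    def __init__(self, name):
        self.name = name
        self.role = None

    def start_as(self, role):
        """Switch to the given role and return the previous one."""
        if role not in ROLES:
            raise ValueError("unknown role: %r" % role)
        previous = self.role
        self.role = role
        return previous


def deploy(layout):
    """Stand-in for the deployment tools: start each named node in its role.

    layout maps node names to role names, e.g. {"n1": "ALB"}.
    """
    nodes = {}
    for name, role in layout.items():
        node = Node(name)
        node.start_as(role)
        nodes[name] = node
    return nodes
```

For example, `deploy({"n1": "ALB", "n2": "AS", "n3": "DB"})` yields three nodes, and a later `nodes["n2"].start_as("DB")` re-purposes one of them without a separate image or install.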
appscale performance
• 2 VCPUs 2.83GHz, 4GB RAM, 16GB disk
[Figure: Average Time to Query a Database of Size 1000. x-axis: Number of Nodes (1 to 4); y-axis: Query Time [s] (0 to 7). Series: HBase, MongoDB, MemcacheDB, and Google, each with 1 accessor and with 3 accessors.]
appscale projects: http://appscale.cs.ucsb.edu
• Open-source community management
  Bug fixes, feature additions, releases, user support
• Research (currently only internally available)
  Automatic scaling of load, demand, other metrics
  Scheduling and load balancing of apps, tasks, components
  Hybrid cloud solutions (public/private, multi-zone)
  Tunable fault-tolerance and availability
  Efficient communication across isolation boundaries
  Alternative application domains (streaming, HPC)
  Distributed profiling/sampling, feedback-driven optimization
  PaaS/IaaS integration and co-operation
  Customized, dynamic/adaptive SLAs
  Platform-aware resource scheduling, isolation, provisioning