Solving the Scalability Dilemma with Clouds, Crowds, and Algorithms
Michael Franklin, UC Berkeley
Joint work with: Michael Armbrust, Peter Bodik, Kristal Curtis, Armando Fox, Randy Katz, Mike Jordan, Nick Lanham, David Patterson, Scott Shenker,
Ion Stoica, Beth Trushkowsky, Stephen Tu and Matei Zaharia
Image: John Curley http://www.flickr.com/photos/jay_que/1834540/
Save the Date(s): CIDR 2011 Conference
5th Biennial Conference on Innovative Data Systems Research
CIDR 2011, Jan 9-12, Asilomar, CA
• Abstracts Due: Sept 24, 2010
• Papers Due: October 1, 2010
• Focus: innovative and visionary approaches to data systems architecture and use
• Regular CIDR track plus CCC-sponsored "outrageous ideas" track
• Website coming soon!
Continuous Improvement of Client Devices
Computing as a Commodity
Ubiquitous Connectivity
AMP: Algorithms, Machines, People
[Diagram: Adaptive/Active Machine Learning and Analytics, Cloud Computing, and Crowdsourcing surrounding Massive and Diverse Data]
The Scalability Dilemma
• State-of-the-art Machine Learning techniques do not scale to large data sets.
• Data Analytics frameworks can’t handle lots of incomplete, heterogeneous, dirty data.
• Processing architectures struggle with increasing diversity of programming models and job types.
• Adding people to a late project makes it later.
Exactly the opposite of what we expect and need.
RAD Lab 5-year Mission
Enable 1 person to develop, deploy, and operate a next-generation Internet application at scale.
Initial technical bet:
• Machine Learning to make large-scale systems self-managing
Multi-area faculty, postdocs, & students:
• Systems, Networks, Databases, Security, Statistical Machine Learning all in a single, open, collaborative space
Corporate sponsorship and intensive industry interaction:
• Bi-annual 2.5-day offsite research retreats with sponsors
PIQL + SCADS
[Diagram: components — SCADS: Distributed Key-Value Store; PIQL: Query Interface & Executor; Flexible Consistency Management; "Active PIQL" (don't ask)]
SCADS: Scale Independent Storage
Scale Independence
• As a site's user base grows and workload volatility increases:
  – No changes to application required
  – Cost per user remains constant
  – Request latency SLA is unchanged
• Key techniques:
  – Model-Driven Scale Up and Scale Down
  – Performance Insightful Query Language
  – Declarative Performance/Consistency Tradeoffs
Over-provisioning a stateless system: Wikipedia example
[Chart: request rate over time; overprovision by 25% to handle the spike when Michael Jackson dies]

Over-provisioning a stateful system: Wikipedia example
[Chart: the same spike; overprovision by 300% (assuming data stored on ten servers), since newly added servers cannot serve requests until data has been copied to them]
Data storage configuration
• Shared-nothing storage cluster
  – (key,value) pairs in a namespace, e.g. (user,email)
  – Each node stores a set of data ranges
  – Data ranges can be split down to some minimum size promised by PIQL, to ensure range queries don't touch more than one node
[Diagram: key ranges such as A-C, D-E, and F-G replicated and partitioned across storage nodes]
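To make the range-partitioning idea concrete, here is a minimal sketch in Scala (hypothetical names; not SCADS's actual API) of how a namespace's key ranges map to nodes and why a sufficiently small range query touches only one node:

  // Hypothetical sketch of range partitioning in the spirit of SCADS.
  // A namespace is split into contiguous key ranges, each assigned to
  // one or more storage nodes (replicas).
  case class RangePartition(startKey: String, endKey: String, nodes: Seq[String])

  class Namespace(partitions: Seq[RangePartition]) {
    // Route a single-key get to the replicas responsible for that key.
    def nodesFor(key: String): Seq[String] =
      partitions
        .find(p => p.startKey <= key && key < p.endKey)
        .map(_.nodes)
        .getOrElse(Seq.empty)

    // A range query is guaranteed to hit a single node only if the
    // requested range fits inside one partition -- hence the minimum
    // partition size promised by PIQL.
    def nodesForRange(start: String, end: String): Option[Seq[String]] =
      partitions
        .find(p => p.startKey <= start && end <= p.endKey)
        .map(_.nodes)
  }

In this sketch, splitting a hot range or adding a replica only changes the partition table, not the application code, which is what makes scale up and scale down transparent to the application.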
Workload-based policy stages
[Charts: per-bin workload across storage nodes, plotted against a workload threshold]
• Stage 1: Replicate
• Stage 2: Data Movement (hot bins moved to a destination node)
• Stage 3: Server Allocation
Workload-based policy
Policy input:
– Workload per histogram bin
– Cluster configuration
Policy output:
– Short actions (per bin)
[Diagram: control loop — the SCADS namespace's sampled workload becomes a histogram, is smoothed, and, together with the current config and a performance model, drives the policy's actions, which the Action Executor applies back to the SCADS namespace]
Considerations:
– Performance model
– Overprovision buffer
Action Executor:
– Limit actions to X kb/s
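A minimal sketch of such a policy loop in Scala (hypothetical names and deliberately simplified logic; the actual policy from Bodik et al. SOCC 2010 is more involved):

  // Hypothetical, simplified workload-based policy.
  // binWorkload(i) = smoothed requests/sec for histogram bin i.
  sealed trait Action
  case class Replicate(bin: Int) extends Action
  case class AddServers(count: Int) extends Action
  case class RemoveServers(count: Int) extends Action

  def policy(binWorkload: Vector[Double],
             perServerCapacity: Double,    // from the performance model
             overprovisionBuffer: Double,  // e.g. 0.3 = keep 30% headroom
             currentServers: Int): Seq[Action] = {
    val threshold = perServerCapacity * (1.0 - overprovisionBuffer)

    // Stage 1: replicate bins that are individually too hot for one server.
    val replicas =
      for ((w, bin) <- binWorkload.zipWithIndex if w > threshold)
        yield Replicate(bin)

    // Stage 3: resize the cluster to fit total load plus headroom.
    // (Stage 2, moving bins between servers, is omitted for brevity.)
    val needed = math.ceil(binWorkload.sum / threshold).toInt
    val resize: Seq[Action] =
      if (needed > currentServers) Seq(AddServers(needed - currentServers))
      else if (needed < currentServers) Seq(RemoveServers(currentServers - needed))
      else Seq.empty

    replicas ++ resize
  }

The executor applying these actions is rate-limited (the "X kb/s" above) so that data movement itself does not violate the SLA.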
Example Experiment
Workload:
• Ebates.com + Wikipedia's MJ spike (see Bodik et al. SOCC 2010 for workload generation)
• One million (key,value) pairs, each ~256 bytes
Model: max sustainable workload per server
Cost:
• Machine cost: 1 unit / 10 minutes
• SLA: 99th percentile of get/put latency
Deployment:
• m1.small instances on EC2, 1 GB of RAM
• Server boot-up time: 48 seconds
• Delay server removal until 2 minutes left
Goal: selectively absorb hotspot
[Chart: per-server workload, in thousands of req/sec, during the spike]
Actions during the spike
[Chart: data movement (KB/s) and actions from 10:00 to 10:14 — add replica; move data, partition; move data, coalesce]
Configuration at end of spike
[Chart: per-server workload and # keys after added replicas]
Cost comparison to fixed and optimal allocation
• Fixed allocation policy: 648 server units
• Optimal policy: 310 server units

Overprovision factor | get/put SLA (ms) | # server units | % savings (vs fixed alloc)
0.5                  | 180/250          | 358            | 48
0.6                  | 140/225          | 389            | 40
0.7                  | 120/200          | 422            | 35
PIQL [Armbrust et al. SIGMOD 2010 (demo) and SOCC 2010 (design paper)]
• "Performance Insightful" language subset
• Compiler reasons about operation bounds:
  – Unbounded queries are disallowed
  – Queries above a specified threshold generate a warning
  – Predeclared query templates: the optimizer decides what indexes (i.e., materialized views) are needed
• Provides: bounded number of operations
• + Strong SLAs = Predictable performance?
[Diagram: spectrum from RDBMS to NoSQL, with PIQL in between]
PIQL DDL
ENTITY User {
  string username,
  string password,
  PRIMARY KEY(username)
}

ENTITY Subscription {
  boolean approved,
  string owner,
  string target,
  FOREIGN KEY owner REF User,
  FOREIGN KEY target REF User MAX 5000,
  PRIMARY KEY(owner, target)
}

ENTITY Thought {
  int timestamp,
  string owner,
  string text,
  FOREIGN KEY owner REFERENCES User,
  PRIMARY KEY(owner, timestamp)
}

Foreign keys are required for joins; cardinality limits (MAX) are required for un-paginated joins.
More Queries
"Return the most recent thoughts from all of my 'approved' subscriptions."
Operations are bounded via the schema's cardinality limit (MAX) and the query's limit.
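As an illustration, the query might be declared roughly like this in PIQL's SQL-like syntax (a reconstruction from the entities above, not verbatim from the paper; [me] stands for a bound parameter):

  SELECT t.*
  FROM Subscription s JOIN Thought t ON t.owner = s.target
  WHERE s.owner = [me] AND s.approved = true
  ORDER BY t.timestamp DESC
  LIMIT 10

Because Subscription carries MAX 5000 on its foreign key and the query declares LIMIT 10, the compiler can bound the total number of SCADS operations the query can ever perform, no matter how large the site grows.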
PIQL: Help Fix "Bad" Queries
• Interactive Query Visualizer
  – Shows record counts and # ops
  – Highlights unbounded parts of a query
  – SIGMOD'10 demo: piql.knowsql.org
PIQL + SCADS
• Goals are “Scale Independence” and “Performance Insightfulness”
• SCADS provides scalable foundation with SLA adherence
• PIQL uses language restrictions, schema limits, and precomputed views to bound # of SCADS operations per query.
• These work together to bridge the gap between “SQL” and “NoSQL” worlds.
Spark: Support for Iterative Data-Intensive Computing
M. Zaharia et al., HotCloud Workshop 2010
Analytics: Logistic Regression
Goal: find the best line separating two datasets
[Figure: scatter of + and – points, with a random initial line converging to the target separating line]
Serial Version

val data = readData(...)
var w = Vector.random(D)

for (i <- 1 to ITERATIONS) {
  var gradient = Vector.zeros(D)
  for (p <- data) {
    // gradient contribution of point p under the logistic loss
    val scale = (1 / (1 + exp(-p.y * (w dot p.x))) - 1) * p.y
    gradient += scale * p.x
  }
  w -= gradient
}

println("Final w: " + w)
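Each pass of the outer loop is the batch gradient step for logistic regression; writing p.x and p.y as $x_p$ and $y_p$, the update the code performs is

  $$w \leftarrow w - \sum_{p} \left( \frac{1}{1 + e^{-y_p (w \cdot x_p)}} - 1 \right) y_p \, x_p$$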
Spark Version

val data = spark.hdfsTextFile(...).map(readPoint).cache()  // cached in memory across iterations

var w = Vector.random(D)

for (i <- 1 to ITERATIONS) {
  var gradient = spark.accumulator(Vector.zeros(D))  // shared, add-only accumulator
  data.foreach(p => {  // runs in parallel over the cached dataset
    val scale = (1 / (1 + exp(-p.y * (w dot p.x))) - 1) * p.y
    gradient += scale * p.x
  })
  w -= gradient.value
}

println("Final w: " + w)
Iterative Processing Dataflow
[Diagram: in Hadoop/Dryad, each iteration re-reads the input x and the current w to compute f(x,w); in Spark, x is loaded once and kept in memory while only w changes between iterations]
Performance
• Hadoop: 40 s / iteration
• Spark: first iteration 60 s, further iterations 2 s
What about the People?
Participatory Culture – "Indirect"
John Murrell, GM SV, 9/17/09: "…every time we use a Google app or service, we are working on behalf of the search sovereign, creating more content for it to index and monetize or teaching it something potentially useful about our desires, intentions and behavior."
Participatory Culture – "Direct"
Crowdsourcing Example
From: Yan, Kumar, Ganesan, "CrowdSearch: Exploiting Crowds for Accurate Real-time Image Search on Mobile Phones," MobiSys 2010.
Mechanical Turk vs. Cluster Computing
• What challenges are similar?
• What challenges are new?
• Allocation, cost, reliability, quality, bias, making jobs appealing, …
AMP: Algorithms, Machines, People
[Diagram repeated: Adaptive/Active Machine Learning and Analytics, Cloud Computing, and Crowdsourcing surrounding Massive and Diverse Data]
Clouds and Crowds

                 | Interactive Cloud                 | Analytic Cloud                              | People Cloud
Data Acquisition | Transactional systems, data entry | … + Sensors (physical & software)           | … + Web 2.0
Computation      | Get and Put                       | MapReduce, Parallel DBMS, Stream Processing | … + Collaborative structures (e.g., Mechanical Turk, intelligence markets)
Data Model       | Records                           | Numbers, Media                              | … + Text, Media, Natural Language
Response Time    | Seconds                           | Hours/Days                                  | … + Continuous
The Future Cloud will be a Hybrid of These.
AMPLab Technical Plan
• Machine Learning & Analytics (Jordan, Fox, Franklin)
  – Error bars on all answers
  – Active learning, continuous/adaptive improvement
• Data Management (Franklin, Joseph)
  – Pay-as-you-go integration and structure
  – Privacy
• Infrastructure (Stoica, Shenker, Patterson, Katz)
  – Nexus cloud OS and analytics languages
• Hybrid Crowd/Cloud Systems (Bayen, Waddell)
  – Incentive structures, systems aspects
Guiding Use Cases
• Crowdsourced sensing, work, policy, journalism
• Urban micro-simulation
Algorithms, Machines & People
• A holistic view of the entire stack
• Highly interdisciplinary faculty & students
• Developing a five-year plan; will dovetail with RAD Lab completion
For more information: [email protected]
Enable many people to collaborate to collect, generate, clean, make sense of, and utilize lots of data.
[Diagram: technology stack — Data Visualization, Collaboration, HCI, Policies; Text Analytics; Machine Learning and Stats; Database, OLAP, MapReduce; Security and Privacy; MPP, Data Centers, Networks; Multi-Core Parallelism]