[Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin,...
description
Transcript of [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin,...
![Page 1: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/1.jpg)
Machine Learning on Big DataLessons Learned from Google Projects
Max LinSoftware Engineer | Google Research
Massively Parallel Computing | Harvard CS 264Guest Lecture | March 29th, 2011
![Page 2: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/2.jpg)
Outline
• Machine Learning intro
• Scaling machine learning algorithms up
• Design choices of large scale ML systems
![Page 3: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/3.jpg)
Outline
• Machine Learning intro
• Scaling machine learning algorithms up
• Design choices of large scale ML systems
![Page 4: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/4.jpg)
“Machine Learning is a study of computer algorithms that
improve automatically through experience.”
![Page 5: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/5.jpg)
![Page 6: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/6.jpg)
![Page 7: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/7.jpg)
![Page 8: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/8.jpg)
![Page 9: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/9.jpg)
![Page 10: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/10.jpg)
Training
Testing
The quick brown fox jumped over the lazy dog. English
To err is human, but to really foul things up you need a computer.
English
No hay mal que por bien no venga. Spanish
La tercera es la vencida. Spanish
To be or not to be -- that is the question
?
La fe mueve montañas. ?
Input X
f(x’)
Output Y
Model f(x)
= y’
![Page 11: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/11.jpg)
The quick brown fox jumped over the lazy dog.
Linear Classifier
0,‘a’ ‘aardvark’...
[ 0,...
... ...‘dog’
1,... ‘the’
1,...... ‘montañas’... 0, ...
...]x
0.1,[ 132,... ... 150, 200,... ... -153, ... ]w
f(x) = w · x =P�
p=1
wp ∗ xp
![Page 12: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/12.jpg)
Training Data
...
...
...
... ... ... ... ... ...
...
NP
Input X Ouput Y
![Page 13: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/13.jpg)
http://www.flickr.com/photos/mr_t_in_dc/5469563053/
Typical machine learning data at Google
N: 100 billions / 1 billionP: 1 billion / 10 million(mean / median)
![Page 14: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/14.jpg)
Classifier Training
• Training: Given {(x, y)} and f, minimize the following objective function
argminw
N�
n=1
L(yi, f(xi;w)) +R(w)
![Page 15: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/15.jpg)
http://www.flickr.com/photos/visitfinland/5424369765/
Use Newton’s method?wt+1 ← wt −H(wt)−1∇J(wt)
![Page 16: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/16.jpg)
Outline
• Machine Learning intro
• Scaling machine learning algorithms up
• Design choices of large scale ML systems
![Page 17: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/17.jpg)
Scaling Up
• Why big data?
• Parallelize machine learning algorithms
• Embarrassingly parallel
• Parallelize sub-routines
• Distributed learning
![Page 18: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/18.jpg)
Machine
SubsamplingBig Data
Shard 1 Shard 2 Shard MShard 3...
Model
Reduce N
![Page 19: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/19.jpg)
Why not Small Data?
[Banko and Brill, 2001]
![Page 20: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/20.jpg)
• Why big data?
• Parallelize machine learning algorithms
• Embarrassingly parallel
• Parallelize sub-routines
• Distributed learning
Scaling Up
![Page 21: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/21.jpg)
Parallelize Estimates
• Naive Bayes Classifier
• Maximum Likelihood Estimates
wthe|EN =
�Ni=1 1EN,the(xi)�N
i=1 1EN (xi)
argminw
−N�
i=1
P�
p=1
P (xip|yi;w)P (yi;w)
![Page 22: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/22.jpg)
Word Counting
MapX: “The quick brown fox ...”Y: EN
(‘the|EN’, 1)(‘quick|EN’, 1)(‘brown|EN’, 1)
Reduce [ (‘the|EN’, 1), (‘the|EN’, 1), (‘the|EN’, 1) ]
C(‘the’|EN) = SUM of values = 3
w�the�|EN =C(�the�|EN)
C(EN)
![Page 23: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/23.jpg)
Map
Reduce
Big Data
Mapper 1
Shard 1
Mapper 2
Shard 2
Mapper 3
Shard 3
Mapper M
Shard M
(‘the’ | EN, 1)
Reducer
Tally counts and update w
...
Word Counting
(‘fox’ | EN, 1) ... (‘montañas’ | ES, 1)
Model
![Page 24: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/24.jpg)
Parallelize Optimization
• Maximum Entropy Classifiers
• Good: J(w) is concave
• Bad: no closed-form solution like NB
• Ugly: Large N
argminw
N�
i=1
exp(�P
p=1 wp ∗ xip)
yi
1 + exp(�P
p=1 wp ∗ xip)
![Page 25: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/25.jpg)
Gradient Descent
http://www.cs.cmu.edu/~epxing/Class/10701/Lecture/lecture7.pdf
![Page 26: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/26.jpg)
Gradient Descent
• w is initialized as zero
• for t in 1 to T
• Calculate gradients
•
∇J(w)
wt+1 ← wt − η∇J(w)
∇J(w) =N�
i=1
P (w, xi, yi)
![Page 27: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/27.jpg)
Distribute Gradient
• w is initialized as zero
• for t in 1 to T
• Calculate gradients in parallel
• Training CPU: O(TPN) to O(TPN / M)
wt+1 ← wt − η∇J(w)
![Page 28: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/28.jpg)
Distribute Gradient
Map
Reduce
Big Data
Machine 1
Shard 1
Machine 2
Shard 2
Machine 3
Shard 3
Machine M
Shard M
(dummy key, partial gradient sum)
Sum and Update w
...
ModelRepeat M/R
until converge
![Page 29: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/29.jpg)
• Why big data?
• Parallelize machine learning algorithms
• Embarrassingly parallel
• Parallelize sub-routines
• Distributed learning
Scaling Up
![Page 30: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/30.jpg)
Parallelize Subroutines
• Support Vector Machines
• Solve the dual problems.t. 1− yi(w · φ(xi) + b) ≤ ζi, ζi ≥ 0
arg minw,b,ζ
1
2||w||22 + C
n�
i=1
ζi
argminα
1
2αTQα− αT1
s.t. 0 ≤ α ≤ C,yTα = 0
![Page 31: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/31.jpg)
http://www.flickr.com/photos/sea-turtle/198445204/
The computational cost for the Primal-Dual Interior Point
Method is O(n^3) in time and O(n^2) in
memory
![Page 32: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/32.jpg)
Parallel SVM• Parallel, row-wise incomplete Cholesky
Factorization for Q
• Parallel interior point method
• Time O(n^3) becomes O(n^2 / M)
• Memory O(n^2) becomes O(n / M)
• Parallel Support Vector Machines (psvm) http://code.google.com/p/psvm/
• Implement in MPI
√N
[Chang et al, 2007]
![Page 33: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/33.jpg)
• Distribute Q by row into M machines
• For each dimension n <
• Send local pivots to master
• Master selects largest local pivots and broadcast the global pivot to workers
Machine 1
row 1
√N
Parallel ICF
Machine 2
...row 2
row 3
row 4
Machine 3
row 5
row 6
![Page 34: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/34.jpg)
![Page 35: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/35.jpg)
• Why big data?
• Parallelize machine learning algorithms
• Embarrassingly parallel
• Parallelize sub-routines
• Distributed learning
Scaling Up
![Page 36: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/36.jpg)
Majority Vote
Map
Big Data
Machine 1
Shard 1
Machine 2
Shard 2
Machine 3
Shard 3
Machine M
Shard M...
Model 1 Model 2 Model 3 Model 4
![Page 37: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/37.jpg)
Majority Vote
• Train individual classifiers independently
• Predict by taking majority votes
• Training CPU: O(TPN) to O(TPN / M)
![Page 38: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/38.jpg)
Parameter Mixture
Map
Reduce
Big Data
Machine 1
Shard 1
Machine 2
Shard 2
Machine 3
Shard 3
Machine M
Shard M
(dummy key, w1)
Average w
...
(dummy key, w2) ...
[Mann et al, 2009]
Model
![Page 39: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/39.jpg)
http://www.flickr.com/photos/annamatic3000/127945652/
Much Less network usage than distributed gradient descentO(MN) vs. O(MNT)
![Page 40: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/40.jpg)
![Page 41: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/41.jpg)
Iterative Param Mixture
Map
Reduce after each epoch
Big Data
Machine 1
Shard 1
Machine 2
Shard 2
Machine 3
Shard 3
Machine M
Shard M
(dummy key, w1)
Average w
...
(dummy key, w2) ...
Model
[McDonald et al., 2010]
![Page 42: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/42.jpg)
![Page 43: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/43.jpg)
Outline
• Machine Learning intro
• Scaling machine learning algorithms up
• Design choices of large scale ML systems
![Page 44: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/44.jpg)
http://www.flickr.com/photos/mr_t_in_dc/5469563053/
Scalable
![Page 45: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/45.jpg)
http://www.flickr.com/photos/aloshbennett/3209564747/
Parallel
![Page 46: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/46.jpg)
http://www.flickr.com/photos/wanderlinse/4367261825/
Accuracy
![Page 47: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/47.jpg)
http://www.flickr.com/photos/imagelink/4006753760/
![Page 48: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/48.jpg)
http://www.flickr.com/photos/brenderous/4532934181/
Binary Classification
![Page 49: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/49.jpg)
http://www.flickr.com/photos/mararie/2340572508/
Automatic Feature
Discovery
![Page 50: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/50.jpg)
http://www.flickr.com/photos/prunejuice/3687192643/
Fast Response
![Page 51: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/51.jpg)
http://www.flickr.com/photos/jepoirrier/840415676/
Memory is new hard disk.
![Page 52: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/52.jpg)
http://www.flickr.com/photos/neubie/854242030/
Algorithm + Infrastructure
![Page 53: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/53.jpg)
Design for Multicores
http://www.flickr.com/photos/geektechnique/2344029370/
![Page 54: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/54.jpg)
Combiner
![Page 55: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/55.jpg)
![Page 56: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/56.jpg)
Multi-shard Combiner
[Chandra et al., 2010]
![Page 57: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/57.jpg)
Machine Learning on
Big Data
![Page 58: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/58.jpg)
Parallelize ML Algorithms
• Embarrassingly parallel
• Parallelize sub-routines
• Distributed learning
![Page 59: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/59.jpg)
Parallel Accuracy
Fast Response
![Page 60: [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)](https://reader034.fdocuments.us/reader034/viewer/2022051513/547bc2aeb479597c098b4eb2/html5/thumbnails/60.jpg)
Google APIs
• Prediction API
• machine learning service on the cloud
• http://code.google.com/apis/predict
• BigQuery
• interactive analysis of massive data on the cloud
• http://code.google.com/apis/bigquery