Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache...

76
Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319

Transcript of Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache...

Page 1: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Apache MahoutFeb 13, 2012

Shannon Quinn

Cloud ComputingCS 15-319

Page 2: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

MapReduce Review

C0 C1 C2 C3

M0 M1 M2 M3

IO0 IO1 IO2 IO3

R0 R1

FO0 FO1

chunks

mappers

Reducers

Map

Pha

seR

educ

e Ph

ase Shuffling Data

Scalable programming modelMap phaseShuffleReduce phaseMapReduceImplementations

GoogleHadoop Figure from lecture 6: MapReduce

Page 3: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

MapReduce Review

C0 C1 C2 C3

M0 M1 M2 M3

IO0 IO1 IO2 IO3

R0 R1

FO0 FO1

chunks

mappers

Reducers

Map

Pha

seR

educ

e Ph

ase Shuffling Data

Scalable programming modelMap phaseShuffleReduce phaseMapReduceImplementations

GoogleHadoop Figure from lecture 6: MapReduce

This is our focus!

Page 4: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Apache Mahout

A scalable machine learning library

Page 5: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Apache Mahout

A scalable machine learning libraryBuilt on Hadoop

Page 6: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Apache Mahout

A scalable machine learning libraryBuilt on HadoopPhilosophy of Mahout (and Hadoop by proxy)

Page 7: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

What does Mahout do?

Page 8: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Recommendation

Page 9: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Classification

Page 10: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Clustering

Page 11: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Other Mahout Algorithms

Dimensionality Reduction

Regression

Evolutionary Algorithms

Page 12: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Mahout

1. Recommendation

2. Classification

3. Clustering

Page 13: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Recommendation Overview

Help users find items they might like based on historical preferences

Page 14: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Recommendation Overview

Mathematically

Page 15: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Recommendation Overview

Alice

Bob

Peter

5 1 4

? 2 5

4 3 2 *based on example bySebastian Schelter

Page 16: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Recommendation Overview

5 1 4

- 2 5

4 3 2 *based on example bySebastian Schelter

Page 17: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Recommendation Overview

5 1 4

? 2 5

4 3 2

Bob

*based on example bySebastian Schelter

Page 18: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Recommendation Overview

5 1 4

1.5 2 5

4 3 2

Bob

*based on example bySebastian Schelter

Page 19: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Recommendation in Mahout

1st Map phase: process input

*based on example bySebastian Schelter

Page 20: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Recommendation in Mahout

1st Map phase: process input

1st Reduce phase: list by user

*based on example bySebastian Schelter

Page 21: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Recommendation in Mahout

2nd Map phase: Emit co-occurred ratings

*based on example bySebastian Schelter

Page 22: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Recommendation in Mahout

2nd Map phase: Emit co-occurred ratings

2nd Reduce phase: Compute similarities

*based on example bySebastian Schelter

Page 23: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Mahout

1. Recommendation

2. Classification

3. Clustering

Page 24: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Classification Overview

Assigning data to discrete categories

Page 25: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Classification Overview

Assigning data to discrete categoriesTrain a model on labeled data

Spam Not spam

Page 26: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Classification Overview

Assigning data to discrete categoriesTrain a model on labeled dataRun the model on new, unlabeled dataSpam Not spam

?

Page 27: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Naïve Bayes Example

Page 28: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Naïve Bayes Example

Prob (token | label) =

Page 29: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Naïve Bayes ExampleNot spam

Page 30: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Naïve Bayes Example

President Obama’s Nobel Prize Speech

Not spam

Page 31: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Naïve Bayes ExampleSpam

Page 32: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Naïve Bayes ExampleSpam

Spam email content

Page 33: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Naïve Bayes Example

Page 34: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Naïve Bayes Example

“Order a trial Adobe chicken dailyEAB-List new summer savings, welcome!”

Page 35: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Naïve Bayes in Mahout

Complex!

Page 36: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Naïve Bayes in Mahout

Complex!Training1. Read the features

Page 37: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Naïve Bayes in Mahout

Complex!Training1. Read the features2. Calculate per-document statistics

Page 38: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Naïve Bayes in Mahout

Complex!Training1. Read the features2. Calculate per-document statistics3. Normalize across categories

Page 39: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Naïve Bayes in Mahout

Complex!Training1. Read the features2. Calculate per-document statistics3. Normalize across categories4. Calculate normalizing factor of each label

Page 40: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Naïve Bayes in Mahout

Complex!Training1. Read the features2. Calculate per-document statistics3. Normalize across categories4. Calculate normalizing factor of each label

TestingClassification

Page 41: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Other Classification Algorithms

Stochastic Gradient Descent

Page 42: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Other Classification Algorithms

Stochastic Gradient DescentSupport Vector Machines

Page 43: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Other Classification Algorithms

Stochastic Gradient DescentSupport Vector MachinesRandom Forests

Page 44: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Mahout

1. Recommendation

2. Classification

3. Clustering

Page 45: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Clustering Overview

Grouping unstructured data

Page 46: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Clustering Overview

Grouping unstructured dataSmall intra-cluster distance

Page 47: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Clustering Overview

Grouping unstructured dataSmall intra-cluster distanceLarge inter-cluster distance

Page 48: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering Example

Page 49: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering Example

Page 50: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering Example

Page 51: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering Example

Page 52: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering Example

Page 53: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering Example

Page 54: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering Example

Page 55: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering Example

Page 56: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering Example

Page 57: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering Example

Cats

Dogs

Page 58: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering in Mahout

+

C0 C1 C2 C3

M0 M1 M2 M3

IO0 IO1 IO2 IO3

R0 R1

FO0 FO1

chunks

mappers

Reducers

Map

Pha

seR

educ

e Ph

ase Shuffling Data

Figure from lecture 6: MapReduce

Page 59: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering in Mahout

Assume: # clusters <<< # points

Page 60: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering in Mahout

Assume: # clusters <<< # points

M0 M1 M2 M3

Page 61: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering in Mahout

Assume: # clusters <<< # points

M0 M1 M2 M3

<clusterID, observation>

R0 R1

Page 62: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering in Mahout

Map phase: assign cluster IDs

Page 63: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering in Mahout

Map phase: assign cluster IDs

Reduce phase: reset centroids

Page 64: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

K-Means Clustering in Mahout

Important notes--maxIter--convergenceDeltamethod

Page 65: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Other Clustering Algorithms

Latent Dirichlet AllocationTopic models

Page 66: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Other Clustering Algorithms

Latent Dirichlet AllocationTopic models

Fuzzy K-MeansPoints are assigned multiple clusters

Page 67: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Other Clustering Algorithms

Latent Dirichlet AllocationTopic models

Fuzzy K-MeansPoints are assigned multiple clusters

Canopy clusteringFast approximations of clusters

Page 68: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Other Clustering Algorithms

Latent Dirichlet AllocationTopic models

Fuzzy K-MeansPoints are assigned multiple clusters

Canopy clusteringFast approximations of clusters

Spectral clusteringTreat points as a graph

Page 69: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Other Clustering Algorithms

Latent Dirichlet AllocationTopic models

Fuzzy K-MeansPoints are assigned multiple clusters

Canopy clusteringFast approximations of clusters

Spectral clusteringTreat points as a graph

K-Means & Eigencuts

Page 70: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Mahout in Summary

Page 71: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Mahout in Summary

Scalable library

Page 72: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Mahout in Summary

Scalable libraryThree primary areas of focus

Page 73: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Mahout in Summary

Scalable libraryThree primary areas of focusOther algorithms

Page 74: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Mahout in Summary

Scalable libraryThree primary areas of focusOther algorithmsAll in your friendly neighborhood MapReduce

Page 75: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Mahout in Summary

Scalable libraryThree primary areas of focusOther algorithmsAll in your friendly neighborhood MapReducehttp://mahout.apache.org/

Page 76: Cloud Computing - Carnegie Mellon Universitymsakr/15319-s12/lectures/Lecture09_1… · Apache Mahout Feb 13, 2012 Shannon Quinn Cloud Computing CS 15-319. MapReduce Review C0 C1 C2

Thank you!