Machine Learning with Spark - Meetupfiles.meetup.com/19103884/3_ML_intro.pdf · Mercedes-Benz ML...


Page 1: Machine Learning with Spark - Meetupfiles.meetup.com/19103884/3_ML_intro.pdf · Mercedes-Benz ML 270 turbodiesel cat CD' € 04/2000 251,000 km 120 kW hp 9 private, 1-00038 vatmontone

Machine Learning with Spark

What is Machine Learning?

Page 2:
Page 3:

Gain wisdom around 500 BC

Input/Output system: Question -> ? -> Wisdom

Page 4:

Gain wisdom around 1990

Input/Output system: Question -> ? -> Wisdom

Page 5:

Gain wisdom around 2016

Input/Output system: Question -> ? -> Wisdom

Page 6:

Work 1 month for 30 data points

Wait 1 second for 30,000 data points

Page 7:

“More data usually beats better algorithms”
Anand Rajaraman (when teaching at Stanford)

http://anand.typepad.com/datawocky/2008/03/more-data-usual.html

Old Skool Statistics

Big Data!

Nice read!

Page 8:
Page 9:

Use Cases with Spark

Page 10:

MLlib demo recommendation engine

Page 11:

MLlib demo car market model

RDD -> ML Model -> Result

1. Create an RDD, Map/Reduce to clean the data

2. Use a pre-built Spark ML technique to calculate a model

3. Buy my next car
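
The three steps of the car market demo can be sketched in plain Python, without Spark, just to show the shape of the pipeline. Everything specific here is an assumption for illustration: the `model;mileage_km;price_eur` line format, the sample values, the helper names `parse`, `fit_line` and `predict`, and the choice of a one-feature least-squares line as the model. The slides do not say which MLlib technique the demo used. In Spark, the map and filter calls would be RDD transformations and the fit would come from a pre-built MLlib model instead of the hand-written formula.

```python
from functools import reduce

# Hypothetical raw listings in the form "model;mileage_km;price_eur".
# These values are made up for the sketch; they are not from the slides.
RAW_LISTINGS = [
    "ML 270;50000;20000",
    "ML 270;100000;15000",
    "ML 270;150000;10000",
    "ML 270;n/a;9000",  # dirty row: mileage is missing
    "ML 270;200000;5000",
]


def parse(line):
    """Map step: turn one raw line into (mileage_km, price_eur), or None."""
    try:
        _, km, price = line.split(";")
        return float(km), float(price)
    except ValueError:
        # Wrong number of fields or a non-numeric value.
        return None


def fit_line(points):
    """Least-squares fit of price = slope * km + intercept."""
    n = len(points)
    # Reduce step: accumulate the sums used by the closed-form solution.
    sx, sy, sxx, sxy = reduce(
        lambda acc, p: (
            acc[0] + p[0],
            acc[1] + p[1],
            acc[2] + p[0] * p[0],
            acc[3] + p[0] * p[1],
        ),
        points,
        (0.0, 0.0, 0.0, 0.0),
    )
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# 1. Clean the data: parse every line and drop the rows that failed.
clean = [p for p in map(parse, RAW_LISTINGS) if p is not None]

# 2. Calculate a model from the cleaned points.
slope, intercept = fit_line(clean)


# 3. Use the model to estimate a price before buying.
def predict(km):
    return slope * km + intercept


print(predict(120000))
```

Comparing `predict(km)` with the asking price of a real listing is one way to read step 3: a car priced well below the fitted line is a candidate for a bargain, under the strong assumption that mileage alone explains price.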

Page 12:

MLlib overview

Next Meetup?

Play around with Spark, MLlib?

Kaggle Competition?