Machine Learning: je m'y mets demain! (Machine Learning: I'm getting started tomorrow!)

Machine Learning: je m’y mets demain! @louisdorard #TTFX - March 31, 2016

AI is everywhere

ChurnSpotter.io

AI Startup Battle at PAPIs.io

• Startups pitch

• AI asks questions live to each startup

• AI assigns score

• Startup with highest score wins 100,000 €

Preseries

How does it work?

Data + Machine Learning

Bedrooms  Bathrooms  Surface (foot²)  Year built  Type       Price ($)
3         1          860              1950        house      565,000
3         1          1012             1951        house
2         1.5        968              1976        townhouse  447,000
4                    1315             1950        house      648,000
3         2          1599             1964        house
3         2          987              1951        townhouse  790,000
1         1          530              2007        condo      122,000
4         2          1574             1964        house      835,000
4                                     2001        house      855,000
3         2.5        1472             2005        house
4         3.5        1714             2005        townhouse
2         2          1113             1999        condo
1                    769              1999        condo      315,000

ML is a set of AI techniques where “intelligence” is built from examples
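To make "built from examples" concrete, here is a minimal sketch that learns a price estimate from the rows of the housing table that list both a surface and a price, then estimates the price of the 1951 house listed without one. Using only the surface and fitting a straight line by least squares are simplifying assumptions made for illustration; the function names `train` and `predict` are hypothetical.

```python
# (surface in ft², price in $) for the table rows that list both values
examples = [
    (860, 565_000), (968, 447_000), (1315, 648_000), (987, 790_000),
    (530, 122_000), (1574, 835_000), (769, 315_000),
]


def train(examples):
    """Fit price ≈ slope * surface + intercept by ordinary least squares."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    sxx = sum((x - mean_x) ** 2 for x, _ in examples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, surface):
    """Apply the fitted line to a new surface."""
    slope, intercept = model
    return slope * surface + intercept


model = train(examples)
estimate = predict(model, 1012)  # the 1951 house listed without a price
print(f"Estimated price: ${estimate:,.0f}")
```

No rule about prices was written by hand: the slope and intercept come entirely from the examples, and more examples or more features would change the estimate.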

Use cases

Use case             Input → output                   Example
Real-estate          property → price                 Zillow
Spam filtering       email → spam indicator           Gmail
City bikes           location, context → #bikes       V3 predict
Startup competition  startup → success indicator      Preseries
Reduce churn         customer → churn indicator       ChurnSpotter
Optimize pricing     product, price → #sales          Amazon
Anticipate demand    product, store, date → #sales    Blue Yonder

RULES

Making Machine Learning accessible with cloud platforms

HTML / CSS / JavaScript

squarespace.com

Machine Learning APIs

The two phases of ML

• TRAIN a model

• PREDICT with a model

The two methods of ML Application Programming Interfaces (here in Python)

• model = create_model('training.csv')

• predicted_output, confidence = create_prediction(model, new_input)
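The two calls on the slide are generic placeholders rather than a specific library's functions. As a rough sketch of what could sit behind them, the code below implements both with a nearest-neighbour classifier in plain Python. The CSV layout (numeric features, label in the last column), the `k` parameter and the toy data are all assumptions made for illustration.

```python
import csv
import math
import os
import tempfile
from collections import Counter


def create_model(csv_path):
    """TRAIN: read labelled examples; the last column is the output to predict."""
    with open(csv_path, newline="") as f:
        rows = list(csv.reader(f))
    _header, examples = rows[0], rows[1:]
    # A nearest-neighbour "model" simply memorises the training examples.
    return [([float(v) for v in row[:-1]], row[-1]) for row in examples]


def create_prediction(model, new_input, k=3):
    """PREDICT: majority vote among the k closest training examples."""
    nearest = sorted(model, key=lambda ex: math.dist(ex[0], new_input))[:k]
    votes = Counter(label for _, label in nearest)
    predicted_output, count = votes.most_common(1)[0]
    confidence = count / len(nearest)  # share of neighbours that agree
    return predicted_output, confidence


# Toy training file (hypothetical data) written to a temporary location.
training_rows = [
    ["x1", "x2", "label"],
    [1, 1, "a"], [1, 2, "a"], [2, 1, "a"],
    [8, 8, "b"], [8, 9, "b"], [9, 8, "b"],
]
with tempfile.NamedTemporaryFile("w", suffix=".csv", newline="", delete=False) as f:
    csv.writer(f).writerows(training_rows)
    path = f.name

model = create_model(path)
os.remove(path)
predicted_output, confidence = create_prediction(model, [1.5, 1.5])
print(predicted_output, confidence)  # a 1.0
```

A hosted ML API hides the same two steps behind HTTP calls: one request to train from uploaded data, another to get a prediction and a confidence value.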

Example request to BigML API

$ curl "https://bigml.io/dev/model?$BIGML_AUTH" \
       -X POST \
       -H "content-type: application/json" \
       -d '{"dataset": "dataset/50ca447b3b56356ae0000029"}'
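For readers who prefer to stay in Python, the curl request above can be sketched with the standard library alone. The URL, header and dataset id are copied from the slide; the assumption is that `BIGML_AUTH` holds a string of the form `username=...;api_key=...`, and the fallback value below is only a placeholder. The `send` helper is hypothetical and is not called here, since it needs network access and valid credentials.

```python
import json
import os
import urllib.request

# Assumption: BIGML_AUTH is set as for the curl example ("username=...;api_key=...").
auth = os.environ.get("BIGML_AUTH", "username=USER;api_key=KEY")  # placeholder fallback

payload = {"dataset": "dataset/50ca447b3b56356ae0000029"}
request = urllib.request.Request(
    url=f"https://bigml.io/dev/model?{auth}",
    data=json.dumps(payload).encode("utf-8"),
    headers={"content-type": "application/json"},
    method="POST",
)


def send(req):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(req) as response:
        return json.load(response)


print(request.get_method(), request.full_url.split("?")[0])
```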

Priority detection

• Classification problem

• Features:

  • Text of email

  • Sender in address book?

  • How often do I reply?

  • How quickly do I reply?

• Demo
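The features listed for priority detection can be pictured as one row per email. The sketch below shows one possible way to build such a row. The `Email` class, the `reply_history` structure and the feature names are hypothetical choices for illustration, not the ones used in the demo.

```python
from dataclasses import dataclass


@dataclass
class Email:
    sender: str
    text: str


def extract_features(email, address_book, reply_history):
    """Turn one email into a feature row for a classifier.

    reply_history (hypothetical structure) maps a sender to
    (emails_received, replies_sent, mean_reply_delay_hours).
    """
    received, replied, mean_delay_hours = reply_history.get(email.sender, (0, 0, None))
    return {
        "text": email.text,
        "sender_in_address_book": email.sender in address_book,
        "reply_rate": replied / received if received else 0.0,
        "mean_reply_delay_hours": mean_delay_hours,
    }


address_book = {"alice@example.com"}
reply_history = {"alice@example.com": (10, 8, 1.5)}
row = extract_features(
    Email("alice@example.com", "Lunch tomorrow?"), address_book, reply_history
)
print(row)
```

Each row, paired with an "important / not important" label, becomes one training example for the classification problem.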

Getting started

• VM with Jupyter notebooks (Python & Bash)

• API wrappers preinstalled: BigML & Google Pred

• Notebook for easy setup of credentials

• Scikit-learn and Pandas preinstalled

• Open source VM provisioning script & notebooks

• Search public Snaps on terminal.com: “machine learning”

Making Machine Learning easier

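How was it before?

from sklearn import svm
model = svm.SVC(gamma=0.001, C=100.)  # parameters chosen by hand

from sklearn import datasets
digits = datasets.load_digits()
model.fit(digits.data[:-1], digits.target[:-1])  # train on all digits but the last

model.predict(digits.data[-1:])  # predict() expects a 2D array, hence the slice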

WAT?

http://oscar.sensout.com

Open Source AutoML libraries

• Spearmint: “Bayesian optimization” for tuning parameters → Whetlab → Twitter

• Auto-sklearn: “automated machine learning toolkit and drop-in replacement for a scikit-learn estimator”
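As a rough illustration of what "tuning parameters" means, the sketch below picks the number of neighbours `k` for a tiny nearest-neighbour classifier by trying each candidate and scoring it with leave-one-out validation. This is plain exhaustive search, swapped in for simplicity; it is not the Bayesian optimization that Spearmint performs. The data and function names are hypothetical.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Majority label among the k training points closest to x."""
    nearest = sorted(train, key=lambda ex: abs(ex[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def leave_one_out_accuracy(data, k):
    """Predict each point from all the others and count the hits."""
    hits = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += knn_predict(train, x, k) == label
    return hits / len(data)


# Toy one-dimensional data set with a little label noise.
data = [
    (0.0, "a"), (0.5, "a"), (1.0, "a"), (1.5, "b"), (2.0, "a"),
    (5.0, "b"), (5.5, "b"), (6.0, "b"), (6.5, "a"), (7.0, "b"),
]
candidates = [1, 3, 5]
scores = {k: leave_one_out_accuracy(data, k) for k in candidates}
best_k = max(scores, key=scores.get)
print(scores, "best k:", best_k)
```

AutoML tools explore much larger spaces (algorithms, preprocessing, many parameters) and do so more cleverly, but the loop of "try a configuration, measure it on held-out data, keep the best" is the same idea.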

Scikit

from sklearn import svm
model = svm.SVC(gamma=0.001, C=100.)  # algorithm and parameters chosen by hand

from sklearn import datasets
digits = datasets.load_digits()
model.fit(digits.data[:-1], digits.target[:-1])

model.predict(digits.data[-1:])  # predict() expects a 2D array, hence the slice

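AutoML Scikit

import autosklearn.classification
model = autosklearn.classification.AutoSklearnClassifier()  # algorithm and parameters are searched automatically

from sklearn import datasets
digits = datasets.load_digits()
model.fit(digits.data[:-1], digits.target[:-1])

model.predict(digits.data[-1:])  # predict() expects a 2D array, hence the slice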

Automatization

• Algorithm selection… AutoML

• Scaling… Azure ML or Yhat (Greg at PAPIs Connect)

• “Automating ML workflows: a report from the trenches”, Jose A. Ortega Ruiz

Making Deep Learning accessible

Image categorization

• Classification problem

• Input is an image = pixel values

pixel1  pixel2  pixel3  animal?
102     0       255     Yes
35      41      209     No
…       …       …       …

Deep Learning

• Neural network:

  • Layers

  • Neurons of one layer connected to neurons of next layer

  • Each neuron receives signals from previous layer and sends new signal to next layer

  • New signal based on a linear combination of the signals received (passed through a non-linear activation function)

• “Deep” -> more than 3 layers
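The layer-by-layer description above can be sketched in a few lines of plain Python. The weights below are hand-picked, hypothetical values (a real network learns them during training), and the sigmoid activation is an assumption: the slide only mentions the linear combination. The network here has just two layers to keep the example short, so it is not "deep" by the slide's definition.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Each neuron: linear combination of incoming signals, then a non-linearity."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Pass the signal through each layer in turn."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_forward(signal, weights, biases)
    return signal


# Hypothetical weights: 3 inputs -> 3 hidden neurons -> 2 outputs.
layers = [
    ([[0.5, -0.3, 0.8], [-0.6, 0.9, 0.1], [0.2, 0.4, -0.7]], [0.0, 0.1, -0.1]),
    ([[1.0, -1.0, 0.5], [-0.5, 0.8, 0.3]], [0.0, 0.0]),
]
pixels = [102 / 255, 0 / 255, 255 / 255]  # pixel values scaled to [0, 1]
outputs = network_forward(pixels, layers)
print(outputs)
```

Adding more `(weights, biases)` pairs to `layers` is all it takes to make the same code run a deeper network.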

Deep Learning for animal detection

[Figure: neural network with inputs pixel1, pixel2, pixel3 and outputs cat, dog]

• 1st layer value=(102, 0, 255), last layer value=(0.1, 0.7, 0.4), output value=(0.8, 0.3) => there’s probably a cat!

• 1st layer value=(4, 166, 23), last layer value=(0.1, 0.7, 0.4), output value=(0.1, 0.2) => probably no animal here

pixel1  pixel2  pixel3  animal?
102     0       255     Yes
35      41      209     No
…       …       …       …

• Replace images with “smart” representation given by last layer

neuron1  neuron2  neuron3  animal?
0.1      0.2      0.5      Yes
0.8      0.3      0.8      No
…        …        …        …
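The idea of swapping raw pixels for the last layer's values can be sketched as follows. In practice the representation would come from a deep network already trained on many images; here a single layer with hypothetical fixed weights stands in for it, and a nearest-neighbour rule stands in for whatever classifier is trained on the new table. Both are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights standing in for the last layer of a pre-trained network.
W = [[0.5, -0.3, 0.8], [-0.6, 0.9, 0.1], [0.2, 0.4, -0.7]]


def last_layer(pixels):
    """Map raw pixel values to the 'smart' representation (neuron1..neuron3)."""
    scaled = [p / 255 for p in pixels]
    return [sigmoid(sum(w * x for w, x in zip(row, scaled))) for row in W]


# Original table: pixels + label. New table: last-layer values + label.
labelled_images = [((102, 0, 255), "Yes"), ((35, 41, 209), "No")]
features = [(last_layer(pixels), label) for pixels, label in labelled_images]


def predict(pixels):
    """Classify a new image by its closest neighbour in representation space."""
    representation = last_layer(pixels)
    _, label = min(features, key=lambda ex: math.dist(ex[0], representation))
    return label


print(predict((100, 5, 250)))
```

The classifier never sees pixels directly: it only works with the three neuron values, which is what makes a pre-trained network reusable for a new task.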

Upcoming ML events in Bordeaux

• Next meetup:

  • Developing a predictive application (special edition for beginners)

  • Tuesday, April 12 at 7 pm - Le Node

• Workshop:

  • Operational Machine Learning with open source and cloud platforms

  • Saturday, April 23 - to be announced on the Meetup!

Machine Learning: I'm getting started on April 12 and 23!

meetup.com/Bordeaux-Machine-Learning-Meetup/

@louisdorard