Cloud Academy & AWS: how we use Amazon Web Services for machine learning and data collection

Cloud Academy & AWS: how we use Amazon Web Services for machine learning and data collection, cloudacademy.com, 4/27/2016


Cloud Academy & AWS: how we use Amazon Web Services for machine learning and data collection

cloudacademy.com, 4/27/2016

About us

Alex Casalboni, Sr. Software Engineer, @alex_casalboni

Roberto Turrin, Sr. Data Scientist (PhD), @robytur

Luca Baroffio, Data Scientist (PhD), @lucabaroffio

clda.co/webinar-ML

What is Machine Learning (ML)?

Back to 1959 (A. Samuel)

Decision problems that can be modeled from data

Machine Learning pipeline

Feature extraction (batch): data -> features

Training (batch): features -> ML models

Prediction (real-time): ML models -> information
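A minimal sketch of the three pipeline stages in plain Python. The features (digit ratio, uppercase ratio), the nearest-centroid model and the toy messages are all assumptions chosen for illustration; the slides do not specify a model or a dataset.

```python
from statistics import mean

def extract_features(text):
    """Feature extraction: raw data -> numeric features in [0, 1]."""
    n = max(len(text), 1)
    digits = sum(ch.isdigit() for ch in text)
    uppers = sum(ch.isupper() for ch in text)
    return (digits / n, uppers / n)

def train(samples):
    """Training (batch): labeled data -> model (one centroid per label)."""
    by_label = {}
    for text, label in samples:
        by_label.setdefault(label, []).append(extract_features(text))
    return {
        label: tuple(mean(col) for col in zip(*rows))
        for label, rows in by_label.items()
    }

def predict(model, text):
    """Prediction (real-time): one new record -> label of the closest centroid."""
    x = extract_features(text)

    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))

    return min(model, key=lambda label: dist2(model[label]))

# Toy training set (hypothetical data).
training_data = [
    ("CALL 0800 123 456 NOW", "spam"),
    ("WIN 1000000 today 4 free", "spam"),
    ("see you at lunch", "ham"),
    ("Meeting moved to Friday afternoon", "ham"),
]
model = train(training_data)
label = predict(model, "lunch at 1?")
```

Training runs once over the whole batch, while `predict` handles a single record at a time, which mirrors the batch versus real-time split on the slide.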

Machine Learning taxonomy

Supervised Learning

classification (predict an unknown label: "?")

regression (e.g. 170cm)

Unsupervised Learning

clustering (e.g. group A, group B)

rule extraction (e.g. A, B -> C)

What problems can ML solve for you?

Supervised Learning

classification: fraud detection

regression: price of a stock over time

Unsupervised Learning

clustering: user segmentation

rule extraction: purchase likelihood
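Rule extraction for purchase likelihood can be read as mining association rules of the form "A, B -> C": customers who bought A and B are likely to buy C. The sketch below only scores one candidate rule by its support and confidence; the basket data and the function name are hypothetical, and real rule mining would also search over candidate rules.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    Assumes `transactions` is a non-empty list of item sets.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    with_both = [t for t in with_antecedent if consequent <= set(t)]
    support = len(with_both) / len(transactions)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence

# Hypothetical purchase history: each set is one customer's basket.
baskets = [
    {"A", "B", "C"},
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B"},
]
support, confidence = rule_stats(baskets, {"A", "B"}, {"C"})
```

Here two of the five baskets contain A, B and C, and two of the three baskets that contain A and B also contain C, so the confidence of the rule can be used as a rough purchase likelihood for C.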

[Word cloud: Machine, Learning, Data, Cloud, Big, Science, Information, Internet, Statistics, Technology, Python, Future, Mining, Social, Deep, IOT, Algorithms, Management, Storage, Petabytes, Parallel, Network, Privacy, Million, NoSQL, PaaS, SQL, Database, Exabytes, Billion, Dataset, Hadoop, R]

Machine learning and Big data

“90% of the data in the world today has been created in the last two years alone” - IBM

“300+ hours worth of video content is being uploaded to the site every minute” - YouTube

Big data challenges

This much data can't be manually inspected

Data-driven decisions

Distributed/parallel computing

The curse of dimensionality
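One common way to illustrate the curse of dimensionality, assuming data spread uniformly over a unit hypercube (an assumption made here, not something stated on the slide): to capture a fixed fraction of the points, a sub-cube needs an edge length of fraction^(1/d), which approaches the full range of every feature as the number of dimensions d grows.

```python
def edge_length(fraction, dims):
    """Edge of the sub-cube that holds `fraction` of uniformly spread points."""
    return fraction ** (1.0 / dims)

# Edge length needed to capture 10% of the data as dimensionality grows.
edges = {dims: edge_length(0.10, dims) for dims in (1, 2, 10, 100)}
```

In one dimension 10% of the range is enough, but with 100 features the "local" neighborhood has to span almost the whole range of each feature, so neighborhoods stop being local.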

Why is deploying ML models a challenge?

1. Prototyping != Production-ready

2. We need Elasticity

3. Too many nice-to-have features

4. Avoid lack of ownership

Where is the lack of ownership?

Data Scientist != DevOps

Data Scientist: Machine Learning, Data mining, Statistical analysis

DevOps: System administration, (Cloud) Operations, Software engineering

Many options and tools offered by AWS

ELB, Auto Scaling, Elastic Beanstalk, Amazon ML, ECS, EMR, Lambda, EC2, API Gateway

Serverless computing to the rescue!

AWS Lambda

Transparent scalability, elasticity and availability

Developer-friendly maintenance (versioning + aliases)

Event-driven approach & never pay for idle

1 function = 1 model

A/B testing via composition
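A rough sketch of how "1 function = 1 model" and "A/B testing via composition" might fit together. The `handler(event, context)` signature is the standard Python Lambda entry point; everything else (the two toy models, the hash-based split, the `share_b` field and the `invoke` hook) is hypothetical. In a real deployment the composite function would call the other Lambda functions remotely rather than as local Python functions.

```python
import hashlib

# Hypothetical per-model handlers: in a "1 function = 1 model" setup each of
# these would be deployed as its own Lambda function.
def model_a_handler(event, context):
    return {"model": "A", "score": sum(event["features"])}

def model_b_handler(event, context):
    return {"model": "B", "score": 2 * sum(event["features"])}

VARIANTS = {"A": model_a_handler, "B": model_b_handler}

def choose_variant(user_id, share_b=0.5):
    """Deterministically map a user to a variant so repeated calls agree."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return "B" if bucket < share_b else "A"

def ab_handler(event, context, invoke=None):
    """Composite function: route the request to one of the model functions.

    `invoke` stands in for a remote call to another function; by default it
    is a direct Python call so the sketch runs locally.
    """
    variant = choose_variant(event["user_id"], event.get("share_b", 0.5))
    call = invoke or (lambda name, payload: VARIANTS[name](payload, context))
    return {"variant": variant, "prediction": call(variant, event)}

response = ab_handler({"user_id": "user-42", "features": [1, 2, 3]}, None)
```

Hashing the user id keeps each user on the same variant across requests, which is usually what an A/B test needs; changing `share_b` shifts traffic between the two models without redeploying either of them.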

How is “Serverless” possible?

There is always a server somewhere, you just don't have to worry about it :)

AWS Lambda + Amazon API Gateway

RESTful & auth layer

Global CDN and caching (CloudFront)

Staging & versioning & mocking

API Decoupling
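A sketch of what a model function behind the API layer could look like. It assumes the request reaches the function as an event with a JSON string under `body` and that the function answers with `statusCode`, `headers` and `body` (the shape used by Lambda proxy integration); a setup based on custom mapping templates would deliver a different event. The linear `predict` function and its weights are placeholders, not the model from the talk.

```python
import json

def predict(features):
    """Placeholder model (an assumption): a fixed linear score."""
    weights = [0.5, -0.25]
    return sum(w * x for w, x in zip(weights, features))

def _response(status, payload):
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }

def lambda_handler(event, context):
    """Handle one HTTP request forwarded by the API layer."""
    try:
        payload = json.loads(event.get("body") or "{}")
        features = [float(x) for x in payload["features"]]
    except (KeyError, TypeError, ValueError):
        return _response(400, {"error": "expected a JSON body with a 'features' list"})
    return _response(200, {"prediction": predict(features)})

ok = lambda_handler({"body": json.dumps({"features": [4, 4]})}, None)
```

Because clients only see the HTTP contract, the model behind the endpoint can be swapped or versioned without changing them, which is the decoupling the slide refers to.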

Quick Example

clda.co/webinar-ML-example

clda.co/webinar-ML-lambda

AWS Lambda limitations

No real-time models (only pseudo real-time)

Deployment package management: size limit and OS libraries

Not suitable for model training yet (5 min max execution time)
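One way to live with the execution time limit for batch scoring (not for training) is to check the remaining time and hand unfinished work back to the caller. The sketch below relies on `context.get_remaining_time_in_millis()`, which the Python Lambda context object provides; the safety margin, the `score_batch` helper and the `FakeContext` test double are assumptions for illustration.

```python
def score_batch(records, context, score, safety_margin_ms=10000):
    """Score records until the remaining execution time gets too short.

    Returns (results, leftover) so the caller can re-queue the leftover
    records for another invocation.
    """
    results = []
    for i, record in enumerate(records):
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            return results, records[i:]
        results.append(score(record))
    return results, []

class FakeContext:
    """Local stand-in for the Lambda context object (testing only)."""

    def __init__(self, readings):
        self._readings = iter(readings)

    def get_remaining_time_in_millis(self):
        return next(self._readings)

# Simulated run: the third check reports only 5 seconds left.
context = FakeContext([300000, 200000, 5000])
results, leftover = score_batch([1, 2, 3, 4], context, lambda r: r * 2)
```

In this simulated run the first two records are scored and the last two are returned as leftover work.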

What about Amazon Machine Learning?

Amazon ML

One of the first MLaaS solutions (1 year old)

Great service for classification and regression

Only linear models (linear & logistic regression + SGD)

No support for advanced scenarios yet (collaborative recommendation, multimedia, online learning, etc.)
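To make "logistic regression + SGD" concrete, here is a small from-scratch sketch of a linear classifier trained with stochastic gradient descent. It only illustrates the family of models mentioned on the slide and is not Amazon ML's implementation; the learning rate, epoch count and toy data are arbitrary assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_sgd(rows, labels, epochs=200, lr=0.1, seed=0):
    """Fit weights and bias by updating on one example at a time."""
    rng = random.Random(seed)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in a new random order
        for i in order:
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, rows[i])))
            error = p - labels[i]  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * error * x for w, x in zip(weights, rows[i])]
            bias -= lr * error
    return weights, bias

def predict_proba(weights, bias, row):
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))

# Toy, linearly separable data: label 1 when the single feature is large.
rows = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
labels = [0, 0, 0, 1, 1, 1]
weights, bias = train_logistic_sgd(rows, labels)
```

The model is a single weighted sum passed through a sigmoid, which is why such services handle classification and regression well but do not cover scenarios like collaborative recommendation.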

Key Takeaways

Data-driven decisions and user-centered ML will make your product smarter

Maximize ownership by removing obstacles between prototype and production

Eliminate tradeoffs between high scalability and nice-to-have features

Go Serverless and stop worrying about Ops

MLaaS makes your life even simpler, unless you need more control

Thank you for attending :)

cloudacademy.com

Q & A