Getting Started With Dato - August 2015

Post on 28-Jan-2018


Dato Confidential

Creating an Intelligent World at Dato

Shawn Scully, scully@dato.com


Hello, my name is…

Shawn Scully, scully@dato.com

Director of Product (Physicist, Cleantech Geek, Data Scientist, Urban Farmer)

I

Intelligent Applications


Who is Dato?

45+ and growing fast!


Dato’s mission is to accelerate the creation of intelligent applications by making sophisticated machine learning as easy as “Hello world!”


Business must be intelligent

Machine learning applications

• Recommenders

• Fraud detection

• Ad targeting

• Financial models

• Personalized medicine

• Churn prediction

• Smart UX (video & text)

• Personal assistants

• IoT

• Social nets

• …

Last decade: Data management

Last 5 years: Traditional analytics

Now: Intelligent apps

Dato Confidential

Example Intelligent Applications


Systems: Elastic, scalable

People: Data scientist

Challenge today: Path from inspiration to production

Scale

Prototyping

Data engineering is painful

• Limited by system memory

• Data munging & feature eng.

• Manipulate complex data types

Data intelligence is hard

• Models don’t scale

• No task-oriented ML

• Algos trapped in papers

Production is fragile

• Build custom services & API

• Write new code to scale

• Model management

Inspiration → Data Engineering → Data Intelligence → Production


Our customers


We make small teams extremely productive.


A developer (a former DBA) built and deployed their first recommender to increase community engagement, and therefore ad revenue.

A small team of developers built and deployed a recommender in one fifth the time of previous efforts, with higher performance, to increase sales.

A small team of data scientists iterates on models more rapidly to improve a state-of-the-art music experience for users.

A small team iterates quickly to improve personalization, and increase revenue, in their daily deals.

A two-person team iterated on and deployed a better job search ranking using text, increasing clicks and therefore revenue.


Demo: Recommender


• Out-of-core computation

• Tools for feature engineering

• Rich data type support

• Models built for scale

• App-oriented toolkits

• Advanced ML & Extensible

• Deploy models as low-latency REST services

• Same code for distributed computation

• Elastically scale up or out with one command

• Job monitoring & model management

• Deploy existing Python code & models

• Run on AWS EC2 or Hadoop YARN
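As a rough illustration of the first bullet: out-of-core computation means working through data in pieces that fit in memory instead of loading a whole file at once. The sketch below streams a CSV one row at a time and keeps only running totals. It is plain Python and not how SFrame is implemented; the file contents and the `streaming_mean` helper are assumptions made up for the example.

```python
import csv
import os
import tempfile


def streaming_mean(path, column):
    """Mean of a numeric CSV column, reading one row at a time."""
    total = 0.0
    count = 0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):  # rows are read lazily, never all at once
            total += float(row[column])
            count += 1
    return total / count if count else float("nan")


# Build a small hypothetical ratings file to run the sketch against.
tmp = tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="")
with tmp:
    writer = csv.writer(tmp)
    writer.writerow(["user", "movie", "rating"])
    writer.writerows([["u1", "m1", 4], ["u1", "m2", 2],
                      ["u2", "m1", 5], ["u2", "m3", 1]])

mean_rating = streaming_mean(tmp.name, "rating")
os.remove(tmp.name)
print(mean_rating)
```

Memory use here depends on the size of one row, not on the size of the file, which is the property the bullet is pointing at.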

The Dato Machine Learning Platform

GraphLab Create: Create Engine, SFrame, SGraph, Canvas, Machine Learning Toolkits, SDK

Dato Distributed: Distributed Engine, Job Client, Direct, Job Mgmt

Dato Predictive Services: Predictive Engine, REST Client, Direct, Model Mgmt


Sophisticated ML made easy - Toolkits

Recommender

Image search

Sentiment analysis

Data matching

Auto tagging

Churn predictor

Object detector

Product sentiment

Click prediction

Fraud detection

User segmentation

Data completion

Anomaly detection

Document clustering

Forecasting

Search ranking

Summarization

…

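import graphlab as gl

# Load (user, movie, rating) rows from a CSV file into an SFrame.
data = gl.SFrame.read_csv('my_data.csv')

# Train a recommender on the user, item, and rating columns.
model = gl.recommender.create(data,
                              user_id='user',
                              item_id='movie',
                              target='rating')

# Top 5 recommended items for each user.
recommendations = model.recommend(k=5)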

Principles:

• Get started fast

• Rapidly iterate

• Combine for new apps
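To give a feel for what a recommender toolkit computes behind a call like `recommend(k=5)`, here is a minimal item-similarity sketch in plain Python. It is not GraphLab Create's implementation: the ratings, the cosine scoring, and the `recommend` helper are all assumptions chosen to keep the example small.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical (user, movie, rating) rows, standing in for my_data.csv.
ratings = [
    ("ann", "Alien", 5), ("ann", "Blade Runner", 4),
    ("bob", "Alien", 5), ("bob", "Blade Runner", 5), ("bob", "Brazil", 4),
    ("cat", "Amelie", 5), ("cat", "Chocolat", 4),
    ("dan", "Amelie", 4), ("dan", "Chocolat", 5), ("dan", "Brazil", 1),
]

by_item = defaultdict(dict)  # item -> {user: rating}
by_user = defaultdict(dict)  # user -> {item: rating}
for user, item, rating in ratings:
    by_item[item][user] = rating
    by_user[user][item] = rating


def cosine(a, b):
    """Cosine similarity between two sparse {user: rating} vectors."""
    dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(user, k=5):
    """Top-k unseen items, scored by similarity to what the user already rated."""
    seen = by_user.get(user, {})
    scores = {}
    for item, vec in by_item.items():
        if item in seen:
            continue
        scores[item] = sum(cosine(vec, by_item[j]) * r for j, r in seen.items())
    # Highest score first; ties broken alphabetically so results are stable.
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]


print(recommend("ann", k=1))
```

Here "ann" shares tastes with "bob", so the only item with a positive score for her is the one "bob" rated that she has not seen.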

Sophisticated ML made easy - Transfer learning

• Train a model on one task, use it for another task

• Examples

- Learn to walk, use that knowledge to run

- Train image tagger to recognize cars, use that knowledge to recognize trucks.
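The cars-to-trucks example can be sketched in a few lines. This is a toy under stated assumptions, not the deck's method: the two-number "image features" are invented, the learned representation is just the direction between class means, and the only thing fit on the new task is a threshold.

```python
def mean(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Source task (plenty of data): cars vs. not-cars.
# The two numbers per example are made-up image features.
cars = [(0.9, 0.8), (0.8, 0.9), (1.0, 0.7)]
not_cars = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.3)]

# "Learned" representation: the direction that separates the class means.
direction = [p - n for p, n in zip(mean(cars), mean(not_cars))]

# Target task (very little data): trucks vs. not-trucks.
# The direction is frozen and reused; only a threshold is fit.
truck, not_truck = (1.0, 1.0), (0.2, 0.2)
threshold = (dot(direction, truck) + dot(direction, not_truck)) / 2


def is_truck(features):
    return dot(direction, features) > threshold


print(is_truck((0.9, 1.1)), is_truck((0.1, 0.3)))
```

The point of the design is that one labeled truck and one counterexample are enough, because the expensive part (the direction) was learned on the car task.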


Create an intelligent world!

Data Engineering

• Fast & scalable

• Rich data types

• Built for ML

Sophisticated ML

• App-oriented ML

• Supporting utils

• Extensibility

Deployment

• Batch & always-on

• RESTful interface

• Elastic & robust
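For readers who have not served a model before, the "RESTful interface" bullet amounts to putting a prediction function behind an HTTP endpoint. Below is a minimal sketch using only the Python standard library. It is not the Dato Predictive Services API: the `MODEL` lookup table, the JSON payload shape, and the `handle_query` and `make_server` names are assumptions for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical precomputed model output: user -> recommended items.
MODEL = {"ann": ["Brazil"], "cat": ["Brazil", "Alien"]}


def handle_query(payload):
    """Turn a decoded JSON request into a JSON-serializable response."""
    user = payload["user"]  # KeyError if the field is missing
    return {"user": user, "recommendations": MODEL.get(user, [])}


class QueryHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            status, body = 200, handle_query(payload)
        except (ValueError, KeyError, TypeError) as err:
            status, body = 400, {"error": repr(err)}
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(port=8000):
    """Build (but do not start) a server; call .serve_forever() to run it."""
    return HTTPServer(("127.0.0.1", port), QueryHandler)
```

Running `make_server().serve_forever()` and posting `{"user": "ann"}` to `http://127.0.0.1:8000/` would return the stored recommendations as JSON. Keeping `handle_query` separate from the HTTP handler makes the scoring logic easy to test without a running server.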



Get the software: dato.com/download

Start learning: dato.com/learn

Bug me: scully@dato.com