Design for X: Exploring Product Design with Apache Spark and GraphLab Create -...
Data-driven, predictive data apps are making our world amazing…
Historical data
Sensor & interaction data
Real-time predictions & decisions
Recommenders
Industrial Apps
Forecasters
Social
Human Sensing
Fraud & Anomaly Detection
Sentiment Analysis & other Text Apps
Personalized Medicine
Building a predictive app
Was using 217 business rules, hoping the world doesn’t change
Have an inspiring idea to reinvent their business
Key pains:
Hiring Talent
35% shortfall in data-savvy workers needed to make sense out of big data by 2018 [McKinsey 2011]
Noisy Space of Tools
“Data scientists use a variety of tools, across different programming languages… require a lot of context-switching… affects productivity and impedes reproducibility.” (Ben Lorica)
Data Analysis: Just one component of the Data Science workflow
Crossing the Big Data Chasm
speed of iteration
scale of data
Get a Hadoop cluster!?!?
single machine memory
production data
GraphLab Create: Unleashing data science from inspiration to production
big data chasm
Data scientist: inspiration to production
Analyze big data on one machine, in Python: graphs, tables, text, images; doesn’t have to fit in memory
Distribute in production with same code on EC2, YARN,…
Use my laptop
Variety of data
Not toy data scales
Language I love
Iterate quickly
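As a rough illustration of the "doesn’t have to fit in memory" point above, here is a minimal sketch of out-of-core aggregation using only the Python standard library. It is not GraphLab Create code and says nothing about how GraphLab Create is implemented; the function name `streaming_mean_by_key`, the column names, and the sample data are all hypothetical.

```python
import csv
import io
from collections import defaultdict


def streaming_mean_by_key(rows, key, value):
    """Compute the mean of `value` per `key`, consuming one row at a time.

    Only running sums and counts are kept in memory, so the input can be
    much larger than RAM as long as the number of distinct keys is modest.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row[key]] += float(row[value])
        counts[row[key]] += 1
    return {k: sums[k] / counts[k] for k in sums}


# Hypothetical stand-in for a large CSV file; a real run would pass an
# open file handle, which csv.DictReader also reads lazily, row by row.
sample = io.StringIO("user,rating\nalice,4\nbob,2\nalice,5\nbob,3\n")
means = streaming_mean_by_key(csv.DictReader(sample), "user", "rating")
print(means)
```

The same function works unchanged on a file of any size, because it never materializes the full table.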
Prototype, Monitor, Production
Clean, Learn, Deploy
data pipeline, predictive service
GraphLab Canvas: Monitor & visualize from prototype to production
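The Clean, Learn, Deploy stages named above can be sketched as three small functions. This is a toy example in plain Python under stated assumptions, not the GraphLab Create API: the function names, the record layout, and the "mean rating per item" model are all hypothetical choices made for illustration.

```python
def clean(records):
    """Clean: keep only records that have an item and a numeric rating."""
    cleaned = []
    for record in records:
        try:
            cleaned.append((record["item"], float(record["rating"])))
        except (KeyError, TypeError, ValueError):
            continue  # skip records with a missing or malformed field
    return cleaned


def learn(pairs):
    """Learn: fit a trivial model (mean rating per item, plus a global mean)."""
    if not pairs:
        raise ValueError("no usable training data")
    totals = {}
    for item, rating in pairs:
        total, count = totals.get(item, (0.0, 0))
        totals[item] = (total + rating, count + 1)
    item_means = {item: t / c for item, (t, c) in totals.items()}
    global_mean = sum(r for _, r in pairs) / len(pairs)
    return item_means, global_mean


def deploy(model):
    """Deploy: wrap the model in a callable that a service could expose."""
    item_means, global_mean = model

    def predict(item):
        # Fall back to the global mean for items never seen in training.
        return item_means.get(item, global_mean)

    return predict


# Hypothetical raw data with two unusable records.
raw = [
    {"item": "a", "rating": "4"},
    {"item": "a", "rating": "2"},
    {"item": "b", "rating": None},
    {"item": "b", "rating": "5"},
    {"item": "c"},
]
predict = deploy(learn(clean(raw)))
print(predict("a"), predict("b"), predict("unseen"))
```

Keeping each stage a plain function is one way to let the prototype and the production service share the same code path.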
GraphLab Create
Same code, many environments
Local
HDFS
S3
SQL/NoSQL
GraphDB
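One way to picture "same code, many environments" for the data sources listed above is to select a storage backend from the URI scheme, so the analysis code itself never names a backend. The sketch below uses only the standard library; `pick_backend`, `count_rows`, the reader registry, and the example URIs are hypothetical and are not taken from GraphLab Create.

```python
from urllib.parse import urlparse


def pick_backend(uri):
    """Map a data location to a backend name using its URI scheme."""
    scheme = urlparse(uri).scheme or "local"  # bare paths have no scheme
    known = {"local": "local disk", "hdfs": "HDFS", "s3": "S3"}
    if scheme not in known:
        raise ValueError("unsupported data source: %s" % uri)
    return known[scheme]


def count_rows(uri, readers):
    """Analysis code: identical no matter where the data lives."""
    reader = readers[pick_backend(uri)]
    return sum(1 for _ in reader(uri))


def fake_reader(uri):
    """Hypothetical reader; a real one would stream rows from the backend."""
    return iter([{"user": "alice"}, {"user": "bob"}])


readers = {"local disk": fake_reader, "HDFS": fake_reader, "S3": fake_reader}
uris = ["ratings.csv", "hdfs://namenode/ratings.csv", "s3://bucket/ratings.csv"]
counts = [count_rows(uri, readers) for uri in uris]
print(counts)
```

Only the reader registry changes between a laptop and a cluster; `count_rows` stays the same.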
GraphLab Canvas
Clean, Learn, Deploy
data pipelines, predictive services <Python>
SGraph: Fastest graph analytics
SFrame: Scales out-of-core
Machine Learning: Robust, scalable, auto-tuning, task-oriented
GraphLab Engine
Graphs, tables, text, images
End-to-end visualization, monitoring, management
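To make "graph analytics" concrete, here is a minimal PageRank sketch in plain Python. It is a textbook power-iteration version chosen for illustration, not SGraph code and not a claim about how GraphLab Create computes anything; the `pagerank` function, its parameters, and the edge list are hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical toy graph: a, b, and d all link to c; c links back to a.
edges = [("a", "c"), ("b", "c"), ("c", "a"), ("d", "c")]
ranks = pagerank(edges)
print(max(ranks, key=ranks.get))
```

The ranks always sum to 1, which makes a quick sanity check when experimenting with other graphs.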