Big Data and Machine Learning at Zalando
Kshitij Kumar
Vice President, Data Infrastructure
10-03-2019
WE LOVE FASHION
WHAT STARTED AS A SIMPLE ONLINE SHOP…
…HAS BECOME THE LEADING EUROPEAN ONLINE PLATFORM FOR FASHION
WE OFFER A SUCCESSFUL AND CURATED ASSORTMENT
• HIGHLY EXPERIENCED category management
• CURATED SHOPPING with Zalon
• > 500 designers & stylists
• > 300,000 articles from ~2,000 international brands
• 11 private labels
• LOCALIZATION of the assortment
PLATFORM STRATEGY
BRANDS CONSUMERS
ENABLER
WE DRESS CODE
WE ARE CONSTANTLY INNOVATING
• CLOUD-BASED, CUTTING-EDGE & SCALABLE technology solutions
• > 2,000 employees at 8 international tech locations
• HQs in Berlin
• help our brand to WIN ONLINE
BIG DATA AT ZALANDO
• Business Intelligence
• Machine Learning
• Data Governance
Data at the core of everything we do
A TYPICAL BIG DATA INFRASTRUCTURE
ML Platform
• Explore
• Train
• Serve
• Observe
Data Platform
• Ingestion
• Metadata
• Store
• Process
Business Intelligence
• Data Warehousing
• Visual KPIs
• Trusted datasets
Data Governance
• Data Catalog
• Privacy
• GDPR
SOME ML USE CASES AT AN ONLINE RETAILER
AN ML-DRIVEN CUSTOMER EXPERIENCE
ML-driven real-time recommendation engine
People who browsed this style also browsed these other styles…
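The "also browsed" idea can be illustrated with plain co-occurrence counting. The sketch below is not Zalando's engine, which the slides describe only as ML-driven and real-time; it is a minimal batch example under the assumption that browsing sessions are available as lists of style IDs. All style IDs and function names are made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_co_browse_counts(sessions):
    """Count, for each style, how often every other style shares a session with it."""
    co_counts = defaultdict(Counter)
    for session in sessions:
        # A set makes repeated views of a style within one session count once.
        for a, b in permutations(set(session), 2):
            co_counts[a][b] += 1
    return co_counts


def also_browsed(co_counts, style, top_n=3):
    """Return up to top_n styles most often browsed together with `style`."""
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(co_counts.get(style, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:top_n]]


# Hypothetical browsing sessions (made-up style IDs).
sessions = [
    ["sneaker-a", "sneaker-b", "jacket-x"],
    ["sneaker-a", "sneaker-b"],
    ["sneaker-a", "jacket-x", "scarf-y"],
    ["sneaker-b", "scarf-y"],
]

co_counts = build_co_browse_counts(sessions)
print(also_browsed(co_counts, "sneaker-a"))
```

A production system would typically replace the raw counts with a learned similarity and update it as new sessions arrive; the counting step only shows where the "people who browsed this also browsed" signal comes from.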
COMPLETE THE LOOK
• Multi-dimensional ML-driven product placement
• Search
• Recommended products
• Complementary items
• Size (fit)
• Delivery promise
THE ML JOURNEY
Explore
Fetch
Prepare
Train Model
Evaluate Model
Deploy to production
Monitor / Evaluate
Ready the data
Serve the models
ACHIEVING THE BALANCE TO RUN ML AT SCALE
Exploding new With the needs
THE ML PIPELINE – FOR A SINGLE USE CASE
ML Use Case
Notebook/UI creates workflows
Fetch Data
Extract Features
Prepare Data
Train Model
Deploy Model
Serve
Monitor
Evaluate and Feedback
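The stages of a single-use-case pipeline can be sketched as an ordered list of functions that pass a shared context along. This is a toy illustration, not the workflow tooling used at Zalando: the `Pipeline` class, the stage functions, the data, and the retraining threshold are all assumptions made up for this example, and the "model" is a single fitted ratio.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Stage = Callable[[Dict], Dict]


@dataclass
class Pipeline:
    """Hypothetical workflow: named stages run in order over one context dict."""
    name: str
    stages: List[Tuple[str, Stage]] = field(default_factory=list)

    def add(self, stage_name: str, fn: Stage) -> "Pipeline":
        self.stages.append((stage_name, fn))
        return self

    def run(self) -> Dict:
        ctx: Dict = {"log": []}
        for stage_name, fn in self.stages:
            ctx = fn(ctx)
            ctx["log"].append(stage_name)  # record which stages completed
        return ctx


def fetch_data(ctx):
    ctx["rows"] = [(1, 0), (2, 1), (3, 1), (4, 2)]  # made-up (x, y) pairs
    return ctx


def extract_features(ctx):
    ctx["x"] = [row[0] for row in ctx["rows"]]
    ctx["y"] = [row[1] for row in ctx["rows"]]
    return ctx


def prepare_data(ctx):
    pairs = list(zip(ctx["x"], ctx["y"]))
    ctx["train"], ctx["holdout"] = pairs[:-1], pairs[-1:]  # hold out the last row
    return ctx


def train_model(ctx):
    sum_x = sum(x for x, _ in ctx["train"])
    sum_y = sum(y for _, y in ctx["train"])
    ctx["model"] = {"slope": sum_y / sum_x}  # toy model: y is roughly slope * x
    return ctx


def deploy_model(ctx):
    slope = ctx["model"]["slope"]
    ctx["endpoint"] = lambda x: slope * x  # stands in for a served endpoint
    return ctx


def serve(ctx):
    ctx["predictions"] = [ctx["endpoint"](x) for x, _ in ctx["holdout"]]
    return ctx


def monitor(ctx):
    errors = [abs(p - y) for p, (_, y) in zip(ctx["predictions"], ctx["holdout"])]
    ctx["mae"] = sum(errors) / len(errors)
    return ctx


def evaluate_and_feedback(ctx):
    ctx["retrain"] = ctx["mae"] > 0.5  # arbitrary threshold for the example
    return ctx


pipeline = (
    Pipeline("toy-use-case")
    .add("Fetch Data", fetch_data)
    .add("Extract Features", extract_features)
    .add("Prepare Data", prepare_data)
    .add("Train Model", train_model)
    .add("Deploy Model", deploy_model)
    .add("Serve", serve)
    .add("Monitor", monitor)
    .add("Evaluate and Feedback", evaluate_and_feedback)
)
result = pipeline.run()
print(result["log"], result["retrain"])
```

Keeping every stage behind the same function signature is one way to let several use cases share stages instead of each building its own copy.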
WHAT HAPPENS – WITH A COUPLE OF USE CASES
AND THE MESS THAT COMES WITH MANY USE CASES
TACKLING THE ML SCALING CHALLENGE
1 With speed
The ability to compose training jobs, tuning jobs, and endpoints with ease, at the call of an API, with algorithms available out of the box.
2 With safety
The ability to understand metadata at every stage of the ML journey, just by describing a training job at the call of an API.
3 With cost efficiency
The ability to run hundreds of “serverless” training jobs: each training produces a model, and the infrastructure is then shut down automatically.
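The three points above (jobs described through an API, metadata attached to every job, and infrastructure that disappears after training) can be sketched in a few lines. The slides do not name the platform or its API, so everything here is hypothetical: `TrainingJobSpec`, `ephemeral_cluster`, `run_training_job`, the algorithm name, and the example data path are illustrative stand-ins, not a real service.

```python
from contextlib import contextmanager
from dataclasses import asdict, dataclass

# Stand-in for the set of compute clusters that are currently running.
RUNNING_CLUSTERS = set()


@dataclass(frozen=True)
class TrainingJobSpec:
    """Hypothetical declarative description of a training job."""
    job_name: str
    algorithm: str
    input_data: str
    instance_count: int = 1


@contextmanager
def ephemeral_cluster(spec: TrainingJobSpec):
    """Bring up compute for one job and always tear it down afterwards."""
    cluster_id = f"cluster-{spec.job_name}"
    RUNNING_CLUSTERS.add(cluster_id)
    try:
        yield cluster_id
    finally:
        # Runs even if training raises, so no idle cluster keeps costing money.
        RUNNING_CLUSTERS.discard(cluster_id)


def run_training_job(spec: TrainingJobSpec) -> dict:
    """Train inside an ephemeral cluster and return a model with its metadata."""
    with ephemeral_cluster(spec) as cluster_id:
        # Real training would happen here; the sketch only records lineage.
        model = {
            "trained_by": spec.job_name,
            "trained_on": cluster_id,
            "metadata": asdict(spec),
        }
    return model


# Made-up job descriptions; the path and names are placeholders.
specs = [
    TrainingJobSpec(f"reco-job-{i}", "example-algorithm", "s3://example-bucket/sessions/")
    for i in range(3)
]
models = [run_training_job(spec) for spec in specs]
print(len(models), "models trained;", len(RUNNING_CLUSTERS), "clusters still running")
```

Because the job is a plain data record, the same description that launches the training also serves as the metadata that later explains where a model came from.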
END TO END ML PIPELINE (real-life use case)
Productionizing ML: Speed, with simplicity
SAFE AND MONITORABLE ML
How is the model endpoint performing?
WHERE WOULD WE LIKE TO BE IN 2020?
A Scalable, Cost-efficient, Flexible Data Infrastructure
Shared Data, Models, Features
Safe, secure data usability, with privacy
Open source, Inner source, best-of-breed vendor tools
We’re hiring!
Kshitij Kumar
kshitij.kumar@zalando.de
ML pipelines should be safe and understandable
Which training job resulted in the deployment?
Which model(s) were deployed?
On which instances are the model(s) deployed?
How much traffic is routed to which model?
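One way to make these four questions answerable is to keep a small lineage record per deployment. The sketch below is an assumption about what such a record could contain, not the metadata model used at Zalando; the class names, field names, instance type, and traffic weights are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ModelVariant:
    """Hypothetical record tying one deployed model to its origin and hosting."""
    model_name: str
    training_job: str
    instance_type: str
    instance_count: int
    traffic_weight: float


@dataclass
class Deployment:
    endpoint: str
    variants: List[ModelVariant]

    def traffic_share(self) -> Dict[str, float]:
        """Normalize the weights so the shares add up to 1."""
        total = sum(v.traffic_weight for v in self.variants)
        return {v.model_name: v.traffic_weight / total for v in self.variants}

    def describe(self) -> List[str]:
        """Answer the four lineage questions for every deployed model."""
        shares = self.traffic_share()
        return [
            f"{v.model_name}: trained by {v.training_job}, "
            f"on {v.instance_count} x {v.instance_type}, "
            f"{shares[v.model_name]:.0%} of traffic"
            for v in self.variants
        ]


# Made-up deployment with two model versions sharing one endpoint.
deployment = Deployment(
    endpoint="reco-endpoint",
    variants=[
        ModelVariant("reco-v1", "train-job-041", "example.large", 4, 9.0),
        ModelVariant("reco-v2", "train-job-057", "example.large", 1, 1.0),
    ],
)
for line in deployment.describe():
    print(line)
```

Recording the training job next to each deployed model is what lets an operator trace a misbehaving endpoint back to the run that produced it.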