Build Your Own Recommendation Engine

(during a Weekend)

Michal Malohlava @mmalohlava && @h2oai

presents

Music Service

Activities: clicks/swipes/likes

Clients: iOS/Android/…

Next N recommendations

? REBB* ?

*REBB = Recommendation Engine Black Box

Requirements

Activities can be >100/s

REBB should be accessible via REST API

Recommendations need to be served in <500 ms and should keep users exploring

AWS infrastructure

Need to be ready in 2 days!

Requirements

• Recommendations should be served in <500 ms

• ML part should allow quick prototyping & experimentation

• Storage (online/offline) - user stats, histories, recommendations

• Scalable

• frontend receiving requests

• backend solving ML

• storage

Need to be ready in 2 days!

Engine Architecture

Variation of λ-architecture…

… with pluggable ML backend

Engine Architecture

Regular EC2 nodes

API Router

REST API via Spray

Akka Actor accepting and filtering:

• user activities

• recommendation requests

Scalable via HAProxy

API Router

Akka Actor handles

• POST of user activity

• publish activity to Redis

• update stats in Redis (quick updates)

• trigger recommendation computation
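The three steps above can be sketched in a few lines. This is only an illustration in Python, not the Akka/Spray code from the talk: `InMemoryStore` is a hypothetical stand-in for Redis, and the channel names, the `VALID_KINDS` set, and the activity fields are assumptions made for the example.

```python
from collections import defaultdict


class InMemoryStore:
    """Hypothetical stand-in for Redis: a publish log plus per-user counters."""

    def __init__(self):
        self.published = []  # (channel, message) pairs, in publish order
        self.stats = defaultdict(lambda: defaultdict(int))

    def publish(self, channel, message):
        self.published.append((channel, message))

    def incr_stat(self, user_id, kind):
        self.stats[user_id][kind] += 1
        return self.stats[user_id][kind]


# Assumed activity kinds, taken from "clicks/swipes/likes".
VALID_KINDS = {"click", "swipe", "like"}


def handle_activity(store, activity):
    """Process one POSTed activity; return True if it was accepted."""
    user_id = activity.get("user")
    kind = activity.get("kind")
    if user_id is None or kind not in VALID_KINDS:
        return False  # filtered out before anything is stored
    store.publish("activities", activity)    # 1. publish the activity
    store.incr_stat(user_id, kind)           # 2. quick stats update
    store.publish("recommendation-requests", {"user": user_id})  # 3. trigger
    return True


store = InMemoryStore()
accepted = handle_activity(store, {"user": "u1", "kind": "like", "song": "s42"})
rejected = handle_activity(store, {"user": "u1", "kind": "unknown"})
```

Keeping the handler to two publishes and one counter update is what lets the router stay fast; the expensive work happens elsewhere, in whatever listens on the channels.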

API Router

Akka Actor handles

• GET recommendation request

• fetch pre-computed recommendation from Redis if exists

• OR make a best-effort “cold-start” recommendation based on the user's activity history
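A minimal sketch of this lookup-then-fallback flow, written in Python for illustration. The talk does not say how the cold-start recommendation is built; the rule used here (unseen songs by the user's most frequent artists) is an assumption, as are the `catalog` structure and all names in the example. Plain dicts stand in for Redis.

```python
from collections import Counter


def get_recommendations(user_id, precomputed, history, catalog, n=3):
    """Return (source, songs) for a user.

    precomputed: user_id -> list of song ids (stand-in for Redis)
    history:     user_id -> list of (artist, song) activities
    catalog:     artist  -> list of song ids (hypothetical)
    """
    cached = precomputed.get(user_id)
    if cached:
        return "precomputed", cached[:n]

    # Cold start (assumed strategy): unseen songs by the user's top artists.
    events = history.get(user_id, [])
    seen = {song for _, song in events}
    artist_counts = Counter(artist for artist, _ in events)
    picks = []
    for artist, _count in artist_counts.most_common():
        for song in catalog.get(artist, []):
            if song not in seen and song not in picks:
                picks.append(song)
    return "coldstart", picks[:n]


precomputed = {"u1": ["s9", "s8", "s7", "s6"]}
history = {"u2": [("a1", "s1"), ("a1", "s2"), ("a2", "s5")]}
catalog = {"a1": ["s1", "s2", "s3"], "a2": ["s5", "s6"]}

warm = get_recommendations("u1", precomputed, history, catalog)
cold = get_recommendations("u2", precomputed, history, catalog)
empty = get_recommendations("u3", precomputed, history, catalog)
```

A user with no history at all gets an empty list here; a real service would likely fall back further, for example to globally popular songs.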

Redis Store

Redis is used as

• event bus:

  • inform subscribers about user activities

  • requests to provide a new recommendation for a user

• data storage

  • old/new recommendations

  • statistics (likes/swipes per user)

  • simple persistence model

• computation engine

  • keep top-N artists, top-N songs per user
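The "top-N artists, top-N songs per user" bookkeeping can be illustrated with an in-memory counter. The talk does not say which Redis structure was used; a sorted set per user (incremented with `ZINCRBY`, read with `ZREVRANGE`) would be one natural fit, but that is an assumption. The `TopNTracker` class below is a hypothetical Python stand-in, not Redis code.

```python
from collections import Counter, defaultdict


class TopNTracker:
    """Keeps per-user counts and returns the N most frequent items."""

    def __init__(self):
        # (user_id, dimension) -> Counter of item -> count
        self._counts = defaultdict(Counter)

    def record(self, user_id, dimension, item, weight=1):
        self._counts[(user_id, dimension)][item] += weight

    def top(self, user_id, dimension, n):
        ranked = self._counts[(user_id, dimension)].most_common(n)
        return [item for item, _count in ranked]


tracker = TopNTracker()
events = [("a1", "s1"), ("a1", "s1"), ("a1", "s1"),
          ("a2", "s2"), ("a2", "s2"), ("a3", "s3")]
for artist, song in events:
    tracker.record("u1", "artists", artist)
    tracker.record("u1", "songs", song)

top_artists = tracker.top("u1", "artists", 2)
top_songs = tracker.top("u1", "songs", 2)
```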

ML Backend

Language/technology agnostic

• Needs to be flexible enough to prototype different strategies

“Runners” for

• generating recommendations with H2O and Python

• collecting/generating statistics

• clustering users with H2O JVM

“Runners” subscribe to Redis and process Redis data
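The runner pattern can be sketched as independent subscribers on an event bus. In the sketch below, `EventBus` is a hypothetical in-process stand-in for Redis publish/subscribe, and the runner classes, channel names, and the `strategy` callable are all assumptions made for illustration; the real runners used H2O and Python.

```python
from collections import defaultdict


class EventBus:
    """Hypothetical in-process stand-in for Redis pub/sub."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        callbacks = self._subscribers[channel]
        for callback in callbacks:
            callback(message)
        return len(callbacks)  # number of subscribers that got the message


class StatsRunner:
    """Collects simple statistics from the activity stream."""

    def __init__(self, bus):
        self.likes_per_user = defaultdict(int)
        bus.subscribe("activities", self.on_activity)

    def on_activity(self, activity):
        if activity["kind"] == "like":
            self.likes_per_user[activity["user"]] += 1


class RecommendationRunner:
    """Computes recommendations on request using a pluggable strategy."""

    def __init__(self, bus, store, strategy):
        self.store = store        # dict standing in for Redis storage
        self.strategy = strategy  # callable: user_id -> list of song ids
        bus.subscribe("recommendation-requests", self.on_request)

    def on_request(self, request):
        user_id = request["user"]
        self.store[user_id] = self.strategy(user_id)


bus = EventBus()
store = {}
stats = StatsRunner(bus)
recs = RecommendationRunner(bus, store, strategy=lambda user: [f"{user}-pick"])

bus.publish("activities", {"user": "u1", "kind": "like"})
bus.publish("activities", {"user": "u1", "kind": "swipe"})
delivered = bus.publish("recommendation-requests", {"user": "u1"})
```

Because runners only share the bus and the store, a new strategy can be tried by starting another runner, without touching the API router.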

ML Backend

Final strategy

• identify user clusters based on user activities (aka music styles)

• apply different recommendation strategies inside each cluster

• identify “weird” users (~outliers)

• adapt recommendation for them

• needs manual intervention/algorithm tuning
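As a rough illustration of this strategy, the sketch below assigns a user to the nearest cluster centroid, treats users far from every centroid as outliers, and picks a recommendation strategy accordingly. It does not use H2O; the centroids are assumed to come from a clustering step done elsewhere, and the feature vectors, the distance threshold, and the strategy functions are hypothetical.

```python
import math


def nearest_cluster(vector, centroids):
    """Return (cluster_name, distance) for the closest centroid."""
    name = min(centroids, key=lambda c: math.dist(vector, centroids[c]))
    return name, math.dist(vector, centroids[name])


def choose_strategy(vector, centroids, strategies, outlier_strategy, max_distance):
    """Pick the cluster strategy, or the outlier one for "weird" users."""
    cluster, distance = nearest_cluster(vector, centroids)
    if distance > max_distance:
        return "outlier", outlier_strategy
    return cluster, strategies[cluster]


def rock_strategy():
    return ["rock-top"]


def jazz_strategy():
    return ["jazz-top"]


def hand_tuned_strategy():
    # Outliers need manual intervention, so this is a placeholder.
    return ["editor-picks"]


# Assumed features: share of a user's activity per style (rock, jazz).
centroids = {"rock": (0.9, 0.1), "jazz": (0.1, 0.9)}
strategies = {"rock": rock_strategy, "jazz": jazz_strategy}

label_a, strategy_a = choose_strategy(
    (0.8, 0.2), centroids, strategies, hand_tuned_strategy, max_distance=0.3)
label_b, strategy_b = choose_strategy(
    (0.5, 0.5), centroids, strategies, hand_tuned_strategy, max_distance=0.3)
```

The threshold is the knob that needs tuning: too small and most users become outliers, too large and unusual users get a cluster strategy that does not fit them.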

Results

• Single machine for API Router and Redis

• peaks of 50 activities/sec, average 10 activities/sec

• small memory footprint

• ML Runners spread over EC2 machines

• even simple but different strategies for each user sector and for selected individual users provide surprisingly good results

Learn more at h2o.ai

Follow us at @h2oai

Thank you!
