Build Your Own Recommendation Engine
(during Weekend)
Michal Malohlava @mmalohlava && @h2oai
presents
Music Service
Activities clicks/swipes/likes
Clients iOS/Android/…
Next N-recommendations
? REBB*?
*REBB = Recommendation Engine Black Box
Requirements
Activities can be >100/s
REBB should be accessible via REST API
Recommendations need to be served in <500ms and should keep users exploring
AWS infrastructure
Need to be ready in 2 days!
Requirements
• Recommendations should be served <500ms
• ML part should allow quick prototyping & experimentation
• Storage (online/offline) - user stats, histories, recommendations
• Scalable
  • frontend receiving requests
  • backend solving ML
  • storage
Need to be ready in 2 days!
Engine Architecture
Variation of λ-architecture…
… with pluggable ML backend
Engine Architecture
Regular EC2 nodes
API Router
REST API via Spray
Akka Actor accepting and filtering:
• user activities
• recommendation requests
Scalable via HAProxy
API Router
Akka Actor handles
• POST of user activity
• publish activity to Redis
• update stats in Redis (quick updates)
• trigger recommendation computation
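The POST flow above can be sketched in a few lines. This is only an illustration in Python, not the Spray/Akka code from the talk: `FakeRedis` is an in-memory stand-in whose two methods mirror the `publish` and `hincrby` calls of a Redis client, and the channel names, key layout, and `handle_activity` helper are all assumptions made for the example.

```python
import json
from collections import defaultdict


class FakeRedis:
    """In-memory stand-in for the two Redis operations used below (assumption)."""

    def __init__(self):
        self.published = []              # (channel, message) pairs, in order
        self.hashes = defaultdict(dict)  # key -> field -> counter

    def publish(self, channel, message):
        self.published.append((channel, message))

    def hincrby(self, key, field, amount=1):
        self.hashes[key][field] = self.hashes[key].get(field, 0) + amount
        return self.hashes[key][field]


def handle_activity(store, user_id, kind, song_id):
    """Handle one POSTed activity: publish it, update stats, request a recommendation."""
    event = json.dumps({"user": user_id, "kind": kind, "song": song_id})
    store.publish("activities", event)                 # hypothetical channel name
    store.hincrby(f"stats:{user_id}", kind)            # quick per-user counter update
    store.publish("recommendation-requests", user_id)  # trigger recomputation


store = FakeRedis()
handle_activity(store, "u1", "like", "s9")
handle_activity(store, "u1", "like", "s3")
print(store.hashes["stats:u1"])
```

Keeping the handler to three cheap store calls is what would let a single router keep up with bursts of activities; the expensive work happens in whoever listens on the request channel.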
API Router
Akka Actor handles
• GET recommendation request
• fetch pre-computed recommendation from Redis if exists
• OR make a best-effort “cold-start” recommendation based on the history of user activities
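A minimal sketch of that lookup-with-fallback logic follows. The slides do not say how the cold-start recommendation was computed, so the rule used here (suggest unseen songs by the user's most frequent artists) is purely an assumption, as are the plain dictionaries standing in for Redis and the function names.

```python
from collections import Counter


def cold_start(history, catalog, n):
    """Best-effort picks: unseen songs by the artists the user interacted with most."""
    artist_counts = Counter(artist for artist, _ in history)
    seen = {song for _, song in history}
    picks = []
    for artist, _ in artist_counts.most_common():
        for song in catalog.get(artist, []):
            if song not in seen and song not in picks:
                picks.append(song)
                if len(picks) == n:
                    return picks
    return picks


def get_recommendations(precomputed, histories, catalog, user_id, n=3):
    """Serve a pre-computed list if one exists, otherwise fall back to cold start."""
    cached = precomputed.get(user_id)
    if cached:
        return cached[:n]
    return cold_start(histories.get(user_id, []), catalog, n)


catalog = {"A": ["a1", "a2", "a3"], "B": ["b1", "b2"]}
histories = {"u2": [("A", "a1"), ("A", "a2"), ("B", "b1")]}
precomputed = {"u1": ["x", "y", "z", "w"]}
print(get_recommendations(precomputed, histories, catalog, "u1"))
print(get_recommendations(precomputed, histories, catalog, "u2", n=2))
```

The fallback never blocks on the ML backend, which is one way to stay inside a fixed response-time budget even for users who have no pre-computed list yet.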
Redis Store
Redis is used as
• events bus:
  • inform subscribers about user activities
  • requests to provide new recommendation for user
• data storage
  • old/new recommendations
  • statistics (likes/swipes per user)
  • simple persistence model
• computation engine
  • keep top-N artists, top-N songs per user
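The "computation engine" role can be illustrated with a per-user score table. The slides do not name the Redis structure that was used; a sorted set per user would be one plausible fit, and the in-memory `TopNTracker` below (a hypothetical name) only imitates that increment-and-rank behaviour.

```python
from collections import Counter, defaultdict


class TopNTracker:
    """Keeps a score per (user, item) and returns the highest-scored items."""

    def __init__(self):
        self._scores = defaultdict(Counter)  # user_id -> item -> score

    def record(self, user_id, item, weight=1):
        """Add weight to an item's score, e.g. once per like."""
        self._scores[user_id][item] += weight

    def top(self, user_id, n):
        """Return up to n items for the user, highest score first."""
        return [item for item, _ in self._scores[user_id].most_common(n)]


artists = TopNTracker()
artists.record("u1", "Muse")
artists.record("u1", "Muse")
artists.record("u1", "Beck")
artists.record("u1", "Air", weight=3)
print(artists.top("u1", 2))
```

Because the ranking is maintained incrementally on every activity, reading the top-N list at request time stays cheap.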
ML Backend
Language/technology agnostic
• Needs to be flexible enough to prototype different strategies
“Runners” for
• generating recommendations with H2O and Python
• collecting/generating statistics
• clustering users with H2O JVM
“Runners” subscribe to Redis and process Redis data
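To show what a subscribed runner could look like, here is a small sketch. `EventBus` is an in-process stand-in for Redis publish/subscribe, and `StatsRunner`, the channel name, and the event fields are hypothetical; the real runners in the talk used H2O and are not reproduced here.

```python
from collections import defaultdict


class EventBus:
    """In-process stand-in for a publish/subscribe channel (assumption)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # channel -> callbacks

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        """Deliver the message and return how many subscribers received it."""
        callbacks = self._subscribers[channel]
        for callback in callbacks:
            callback(message)
        return len(callbacks)


class StatsRunner:
    """A runner that counts likes per user from the activity stream."""

    def __init__(self, bus):
        self.likes = defaultdict(int)
        bus.subscribe("activities", self.on_activity)

    def on_activity(self, event):
        if event["kind"] == "like":
            self.likes[event["user"]] += 1


bus = EventBus()
runner = StatsRunner(bus)
bus.publish("activities", {"user": "u1", "kind": "like"})
bus.publish("activities", {"user": "u1", "kind": "swipe"})
print(dict(runner.likes))
```

Since runners only depend on the messages, a new strategy can be prototyped by attaching another subscriber without touching the API router.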
ML Backend
Final strategy
• identify user cluster based on user activities (aka music styles)
• apply different recommendation strategies inside each cluster
• identify “weird” users (~outliers)
• adapt recommendation for them
• needs manual intervention/algorithm tuning
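The strategy above can be sketched as a nearest-centroid lookup with a distance cut-off for outliers. Everything concrete here is an assumption for illustration: users are represented as small vectors of music-style shares, the centroids are taken as given (for example from an earlier clustering run), and the strategy names and threshold are made up.

```python
import math


def nearest_cluster(vector, centroids):
    """Return (cluster name, distance) for the centroid closest to the user vector."""
    best = min(centroids, key=lambda name: math.dist(vector, centroids[name]))
    return best, math.dist(vector, centroids[best])


def pick_strategy(vector, centroids, strategies, fallback, outlier_distance):
    """Choose the cluster's strategy, or the fallback for users far from every cluster."""
    cluster, distance = nearest_cluster(vector, centroids)
    if distance > outlier_distance:
        return fallback  # "weird" user: handled separately
    return strategies[cluster]


# Hypothetical two-style space: (share of rock activity, share of jazz activity)
centroids = {"rock": (1.0, 0.0), "jazz": (0.0, 1.0)}
strategies = {"rock": "popular-in-cluster", "jazz": "similar-artists"}

print(pick_strategy((0.9, 0.1), centroids, strategies, "hand-tuned", 0.5))
print(pick_strategy((0.5, 0.5), centroids, strategies, "hand-tuned", 0.5))
```

The threshold is the part that would need the manual tuning mentioned above: set too low, ordinary users are treated as outliers; set too high, unusual users get a strategy that does not suit them.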
Results
• Single machine for API Router and Redis
• peaks of 50 activities/sec, average of 10 activities/sec
• small memory footprint
• ML Runners spread over EC2 machines
• even simple but different strategies for each user sector and for selected individual users provide surprisingly good results
Learn more at h2o.ai Follow us at @h2oai
Thank you!