Content recommendations
Content Recommendations with Redis
Torben Brodt, plista GmbH
28 February 2013
Recommender Systems Stammtisch
http://recommenders.de
Introduction
● plista GmbH
  ○ recommendations & advertising
  ○ founded in 2008, Berlin [DE]
  ○ ~3k recommendations/second
● never batch = never Hadoop
● stream computing with an in-memory database
● we love
How to build recommendations?
welt.de/football/berlin_wins.html
We only have the URL?
To show recommendations, we are integrated into the website,
so "at least" we can count the hits.
Most popular
welt.de/football/berlin_wins.html
● ZINCRBY "p:welt.de" 1 berlin_wins
● ZREVRANGEBYSCORE
p:welt.de
  berlin_wins 689 +1
  summer_is_coming 420
  plista_company 135
Live Read + Live Write = Real-Time Recommendations
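The counting step above can be sketched without a Redis server. This is a minimal in-memory stand-in, not plista's code: the dictionary plays the role of the sorted set "p:welt.de", `zincrby` mimics the ZINCRBY command, and `top_n` and `record_hit` are hypothetical helper names introduced for illustration. The seeded counts are the ones shown on the slide.

```python
from collections import defaultdict

# In-memory stand-in for Redis sorted sets: key -> {member: score}.
sorted_sets = defaultdict(lambda: defaultdict(float))


def zincrby(key, increment, member):
    """Mimic ZINCRBY: add `increment` to the member's score and return the new score."""
    sorted_sets[key][member] += increment
    return sorted_sets[key][member]


def top_n(key, n):
    """Hypothetical helper: return the n highest-scored members, best first."""
    items = sorted_sets[key].items()
    return sorted(items, key=lambda kv: kv[1], reverse=True)[:n]


def record_hit(url):
    """Hypothetical helper: count one hit for a URL like 'welt.de/football/berlin_wins.html'."""
    publisher, _, path = url.partition("/")
    item = path.rsplit("/", 1)[-1]
    if item.endswith(".html"):
        item = item[: -len(".html")]
    return zincrby("p:" + publisher, 1, item)


# Seed with the counts from the slide, then register one more page view.
zincrby("p:welt.de", 689, "berlin_wins")
zincrby("p:welt.de", 420, "summer_is_coming")
zincrby("p:welt.de", 135, "plista_company")
new_score = record_hit("welt.de/football/berlin_wins.html")
most_popular = top_n("p:welt.de", 2)
```

Every page view is one write, and every recommendation request is one read of the same structure, which is the "live read + live write" idea.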
Most popular with timeseries
welt.de/football/berlin_wins.html
● ZINCRBY "p:welt.de:1360007000" 1 berlin_wins
● ZUNION
  ○ "p:welt.de:1360007000"
  ○ "p:welt.de:1360006000"
  ○ "p:welt.de:1360005000"
● ZREVRANGEBYSCORE
p:welt.de:1360005000
  berlin_wins 420
  summer_is_coming 135
  plista_best_company 689
p:welt.de:1360006000
  berlin_wins 420
  summer_is_coming 135
  plista_best_company 689
p:welt.de:1360007000
  berlin_wins 689
  summer_is_coming 420
  plista_best_company 135
Most popular with timeseries
welt.de/football/berlin_wins.html
● ZINCRBY "p:welt.de:1360007000" 1 berlin_wins
● ZUNION ... WEIGHTS
  ○ "p:welt.de:1360007000" .. 4
  ○ "p:welt.de:1360006000" .. 2
  ○ "p:welt.de:1360005000" .. 1
● ZREVRANGEBYSCORE
p:welt.de:1360005000
  berlin_wins 420
  summer_is_coming 135
  plista_best_company 689
p:welt.de:1360006000
  berlin_wins 420
  summer_is_coming 135
  plista_best_company 689
p:welt.de:1360007000
  berlin_wins 689
  summer_is_coming 420
  plista_best_company 135
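To see what the weights do to the three buckets above, here is a small pure-Python sketch of a weighted union. It assumes the default SUM aggregation of Redis' ZUNIONSTORE ... WEIGHTS; the function name `zunion` and the variable names are illustrative only, and the bucket contents are copied from the slide.

```python
def zunion(buckets, weights=None):
    """Mimic a sorted-set union with WEIGHTS and SUM aggregation.

    `buckets` is a list of {member: score} dicts, `weights` one factor per bucket
    (defaulting to 1 for every bucket, i.e. a plain union).
    """
    if weights is None:
        weights = [1] * len(buckets)
    result = {}
    for bucket, weight in zip(buckets, weights):
        for member, score in bucket.items():
            result[member] = result.get(member, 0) + weight * score
    return result


# Buckets from the slide, newest first.
b_7000 = {"berlin_wins": 689, "summer_is_coming": 420, "plista_best_company": 135}
b_6000 = {"berlin_wins": 420, "summer_is_coming": 135, "plista_best_company": 689}
b_5000 = {"berlin_wins": 420, "summer_is_coming": 135, "plista_best_company": 689}

plain = zunion([b_7000, b_6000, b_5000])
weighted = zunion([b_7000, b_6000, b_5000], weights=[4, 2, 1])
```

Without weights, berlin_wins (1529) is only just ahead of plista_best_company (1513). With weights 4/2/1 the newest bucket dominates and the gap grows to 4016 versus 2607.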
Most popular with timeseries
[Figure: time buckets relative to :1360007000, from -1h to -8h]
Most popular to any context
● it's not only publisher, we use ~50 context attributes
context attributes:
● publisher
● weekday
● geolocation
● demographics
● ...
publisher = welt.de
  berlin_wins 689 +1
  summer_is_coming 420
  plista_company 135
weekday = sunday
  berlin_wins 400 +1
  dortmund_wins 200
  ... 100
geolocation = dortmund
  dortmund_wins 200
  berlin_wins 10 +1
  ... 5
Most popular to any context
ZUNION ... WEIGHTS
  p:welt.de:1360007 4     p:welt.de:1360006 2     p:welt.de:1360005 1
  w:sunday:1360007 4      w:sunday:1360006 2      w:sunday:1360005 1
  g:dortmund:1360007 4    g:dortmund:1360006 2    g:dortmund:1360005 1
● what it looks like in Redis
publisher = welt.de
  berlin_wins 689 +1
  summer_is_coming 420
  plista_company 135
weekday = sunday
  berlin_wins 400
  dortmund_wins 200
  ... 100
geolocation = dortmund
  dortmund_wins 200
  berlin_wins 10
  ... 5
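The key list for such a union follows mechanically from the request context and the time buckets. The sketch below builds the arguments for one ZUNIONSTORE call (destination, numkeys, keys, then WEIGHTS). It only assembles the command as a list of strings and does not talk to a server; the helper names and the destination key "tmp:reco" are assumptions made for illustration.

```python
def union_args(context, buckets, time_weights):
    """Build one key and one weight per (context attribute, time bucket) pair.

    `context` is a list of (prefix, value) pairs, e.g. ("p", "welt.de").
    """
    keys, weights = [], []
    for prefix, value in context:
        for bucket, weight in zip(buckets, time_weights):
            keys.append(f"{prefix}:{value}:{bucket}")
            weights.append(weight)
    return keys, weights


def zunionstore_command(dest, keys, weights):
    """Assemble the argument list: ZUNIONSTORE dest numkeys key... WEIGHTS weight..."""
    return ["ZUNIONSTORE", dest, str(len(keys)), *keys, "WEIGHTS", *map(str, weights)]


# Context and buckets as on the slide; "tmp:reco" is a hypothetical destination key.
context = [("p", "welt.de"), ("w", "sunday"), ("g", "dortmund")]
keys, weights = union_args(context, ["1360007", "1360006", "1360005"], [4, 2, 1])
command = zunionstore_command("tmp:reco", keys, weights)
```

With ~50 context attributes the key list simply gets longer; the shape of the command stays the same.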
Most popular with Effect size
ZUNION ... WEIGHTS
  p:welt.de:1360007 4 * 70%     p:welt.de:1360006 2 * 70%     p:welt.de:1360005 1 * 70%
  w:sunday:1360007 4 * 10%      w:sunday:1360006 2 * 10%      w:sunday:1360005 1 * 10%
  g:dortmund:1360007 4 * 30%    g:dortmund:1360006 2 * 30%    g:dortmund:1360005 1 * 30%
Effect Size
Examples: small effect: weather; big effect: publisher
Data with a small effect should not be taken into account; otherwise we get average results
● which context has an influence?
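One way to read the effect-size slide is that each time weight is multiplied by a per-context factor before the union. The sketch below does exactly that. The mapping of 70% / 10% / 30% to publisher / weekday / geolocation is inferred from the row order on the slide, and the `min_effect` cutoff is a hypothetical parameter illustrating "small effects should not be taken into account", not something the talk specifies.

```python
TIME_WEIGHTS = [4, 2, 1]

# Assumed mapping: p = publisher, w = weekday, g = geolocation.
EFFECT_SIZE = {"p": 0.7, "w": 0.1, "g": 0.3}


def effect_weights(time_weights, effect_size, min_effect=0.0):
    """Scale the time weights by each context's effect size.

    Contexts whose effect size is below `min_effect` (hypothetical cutoff)
    are dropped entirely so they cannot dilute the result.
    """
    weights = {}
    for prefix, effect in effect_size.items():
        if effect < min_effect:
            continue
        weights[prefix] = [tw * effect for tw in time_weights]
    return weights


all_weights = effect_weights(TIME_WEIGHTS, EFFECT_SIZE)
strong_only = effect_weights(TIME_WEIGHTS, EFFECT_SIZE, min_effect=0.2)
```

The resulting factors would then be passed as the WEIGHTS of the union in place of the plain 4/2/1.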
Most popular with Significance
● some data has more significance/trust
● so we add a significance matrix
● significance might depend on a common limit, like 200 (in the example)
sig:publisher = welt.de
  berlin_wins 1
  summer_is_coming 1
  plista_company 0.5
×
publisher = welt.de
  berlin_wins 689
  summer_is_coming 420
  plista_company 135
Most popular with Significance
● some data has more significance/trust
● so we add a significance matrix
Numerator: Σ ( sig × hits ), SUM over all context
  sig:publisher = welt.de
    berlin_wins 1
    summer_is_coming 1
    plista_company 0.5
  ×
  publisher = welt.de
    berlin_wins 689
    summer_is_coming 420
    plista_company 135
Denominator: Σ ( sig ), SUM over all context
  sig:publisher = welt.de
    berlin_wins 1
    summer_is_coming 1
    plista_company 0.5
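Read as a formula, the numerator/denominator slide suggests a significance-weighted average per item: sum of (significance × hits) over all contexts, divided by the sum of significance over all contexts. The sketch below implements that reading, which is an interpretation of the slide rather than a confirmed formula. The publisher values come from the slide; the weekday significance values (0.5 each) are invented for the example.

```python
def significance_score(hits_by_context, sig_by_context):
    """Per item: sum(sig * hits) / sum(sig), summed over all contexts."""
    numerator, denominator = {}, {}
    for context, hits in hits_by_context.items():
        sig = sig_by_context[context]
        for item, count in hits.items():
            s = sig.get(item, 0.0)
            numerator[item] = numerator.get(item, 0.0) + s * count
            denominator[item] = denominator.get(item, 0.0) + s
    # Items with zero total significance carry no trustworthy signal; skip them.
    return {item: numerator[item] / denominator[item]
            for item in numerator if denominator[item] > 0}


hits = {
    "publisher=welt.de": {"berlin_wins": 689, "summer_is_coming": 420, "plista_company": 135},
    "weekday=sunday": {"berlin_wins": 400, "dortmund_wins": 200},
}
sig = {
    "publisher=welt.de": {"berlin_wins": 1, "summer_is_coming": 1, "plista_company": 0.5},
    # Assumed values, not from the slides.
    "weekday=sunday": {"berlin_wins": 0.5, "dortmund_wins": 0.5},
}
scores = significance_score(hits, sig)
```

Dividing by the summed significance keeps items comparable when they appear in a different number of contexts.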
SUM over..
● timeseries
● different context
● previous hits of the user
● similar publisher
knowledge
publisher = welt.de
  berlin_wins 689
  summer_is_coming 420
  plista_company 135
Σ
ZUNION ... WEIGHTS
  p:welt.de:1360007 4     p:welt.de:1360006 2     p:welt.de:1360005 1
  w:sunday:1360007 4      w:sunday:1360006 2      w:sunday:1360005 1
  g:dortmund:1360007 4    g:dortmund:1360006 2    g:dortmund:1360005 1
... redis can do it ;)
Even more Matrix Operations ;)
● Similarity Matrix
● Human Control Matrix
● Meta-learning Matrix
  ○ might be covered in next talk
○ cooperation with
○ aided from
∏Σ
Conclusions
● Redis is a perfect fit for simple operations
  ○ SUM + AGGREGATE + MIN + MAX
● In-Memory operations are pretty fast
● Real-time features feel better in a real-time database (e.g. time series)
● We don't need batch
What else?
In Redis
● Incremental Collaborative Filtering
● More Recommender
● Live Statistics
At plista
● Semantics with Lucene
● Cloud Technologies
  ○ Scalability
  ○ Enterprise Service Bus
● Contest for Recommenders
Questions?
www.plista.com
@torbenbrodt
xing.com/profile/Torben_Brodt
http://goo.gl/pvXm5
http://lnkd.in/MUXXuv