Post on 25-Jun-2015
Sixth Moment Computing
recommendation systems for every business
online retail
digital media
mobile apps
online gaming
cloud infrastructure
affordability, availability, scalability
accelerated computing
less time
less energy
less dollars
accelerating big data
Shazam: 10M queries per day against a 27M content library
eBay / Cortexica: 500+ keypoint fingerprint search of like things
Twitter / Salesforce.com: 500M tweets against 1M expressions daily
Jen-Hsun Huang, Nvidia CEO & co-founder, Annual Investor Day 2013
big data platforms
MapReduce, e.g. Hadoop
traditional database
cluster, e.g. MPI
multicore + accelerators
ease of programming simple analytics
ease of programming complex analytics
performance
energy
data placement: external disk, internal memory
concept borrowed from slides of David A. Bader
Steve McConnell, Code Complete
idea borrowed from slides of Michael A. Heroux
efficiency vs other quality metrics
How focusing on the factor below affects the factor to the right
factors (rows and columns): correctness, usability, efficiency, reliability, integrity, adaptability, accuracy, robustness
legend: helps it, hurts it
Efficiency is the hard part. Improving efficiency hurts all the other quality metrics.
case studies: hardware specs
2x Intel Sandy Bridge, 8 cores (16 threads) / CPU, 2.6 GHz
2x Nvidia Tesla K20, 2688 CUDA cores / GPU, 705 MHz
case study: small
1 million users, 20 thousand items, 50 million records (50 items per user on average)
20 most similar items for each item (400 thousand total similarities), 20 recommendations for each user (20 million total recommendations)
nearest neighbor algorithm (item-based), Tanimoto (a.k.a. Jaccard) similarity, 20 nearest neighbors per item
2 minutes (120 seconds)
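The item-based nearest neighbor step named on this slide can be sketched in a few lines of NumPy. This is a minimal dense illustration under stated assumptions, not the accelerated implementation behind the timing above: the function name `tanimoto_top_k`, the binary users x items input, and the toy matrix are all hypothetical. Tanimoto similarity between two items is the number of users who have both, divided by the number of users who have either.

```python
import numpy as np

def tanimoto_top_k(interactions, k):
    """Return (neighbor_ids, similarities) of the k most similar items per item.

    interactions: binary users x items matrix (1 = the user has the item).
    Hypothetical helper for illustration; dense, so only suitable for toy sizes.
    """
    x = interactions.astype(np.float64)
    both = x.T @ x                                  # both[i, j] = users with items i and j
    counts = np.diag(both)                          # counts[i] = users with item i
    either = counts[:, None] + counts[None, :] - both
    sim = np.divide(both, either, out=np.zeros_like(both), where=either > 0)
    np.fill_diagonal(sim, -1.0)                     # an item is not its own neighbor
    ids = np.argsort(-sim, axis=1, kind="stable")[:, :k]
    return ids, np.take_along_axis(sim, ids, axis=1)

# Toy data (assumed): 4 users, 3 items.
toy = np.array([[1, 1, 0],
                [1, 1, 0],
                [1, 0, 1],
                [0, 0, 1]])
neighbor_ids, neighbor_sims = tanimoto_top_k(toy, k=1)
print(neighbor_ids.ravel(), neighbor_sims.ravel())
```

At the scale of the case study the interaction matrix would be stored sparsely and the item pairs partitioned across cores or GPUs; the dense product here only shows the arithmetic.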
case study: small
1 million users, 20 thousand items, 50 million records (50 items per user on average)
20 most similar items for each item (400 thousand total similarities), 20 recommendations for each user (20 million total recommendations)
latent variable model, 100 features per user / item, alternating least squares algorithm (10 iterations)
2 minutes 20 seconds (140 seconds)
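For readers unfamiliar with alternating least squares, the following is a minimal sketch of the idea: hold the item factors fixed and solve a small ridge regression per user, then swap roles, and repeat. It is an assumption-laden illustration, not the implementation timed above. The function name `als`, the regularization value `reg`, the random initialization, and the tiny fully observed test matrix are all hypothetical; the slide only specifies 100 features and 10 iterations.

```python
import numpy as np

def als(ratings, observed, n_features, n_iters=10, reg=1e-3, seed=0):
    """Factor ratings ~ users @ items.T using only entries where observed is True.

    Hypothetical helper for illustration; reg and the init scale are assumptions.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    users = rng.normal(scale=0.1, size=(n_users, n_features))
    items = rng.normal(scale=0.1, size=(n_items, n_features))
    ridge = reg * np.eye(n_features)
    for _ in range(n_iters):
        # Fix item factors, solve one regularized least-squares problem per user.
        for u in range(n_users):
            seen = observed[u]
            f = items[seen]
            users[u] = np.linalg.solve(f.T @ f + ridge, f.T @ ratings[u, seen])
        # Fix user factors, solve one regularized least-squares problem per item.
        for i in range(n_items):
            seen = observed[:, i]
            f = users[seen]
            items[i] = np.linalg.solve(f.T @ f + ridge, f.T @ ratings[seen, i])
    return users, items

# Toy data (assumed): an exactly rank-2 matrix with every entry observed.
rng = np.random.default_rng(1)
truth = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 5))
observed = np.ones(truth.shape, dtype=bool)
users, items = als(truth, observed, n_features=2)
rmse = float(np.sqrt(np.mean((users @ items.T - truth) ** 2)))
baseline = float(np.sqrt(np.mean(truth ** 2)))
print(f"rmse {rmse:.4f} vs all-zero baseline {baseline:.4f}")
```

Each per-user and per-item solve is independent of the others within a half-step, which is what makes the method a natural fit for multicore CPUs and accelerators.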
case study: medium
10 million users, 20 thousand items, 500 million records (50 items per user on average)
20 most similar items for each item (400 thousand total similarities), 20 recommendations for each user (200 million total recommendations)
nearest neighbor algorithm (item-based), Tanimoto (a.k.a. Jaccard) similarity, 20 nearest neighbors per item
28 minutes
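The slide lists two outputs: neighbor lists per item and recommendations per user. One common way to get from the first to the second, sketched here as an assumption rather than as the method actually used, is to score each candidate item by summing the similarities from the items a user already has, then keep the top N unseen items. The function name `recommend` and the toy arrays are hypothetical.

```python
import numpy as np

def recommend(interactions, neighbor_ids, neighbor_sims, n):
    """Return the n highest-scoring unseen items for every user.

    interactions: binary users x items matrix.
    neighbor_ids / neighbor_sims: items x k arrays of each item's nearest neighbors.
    Hypothetical helper for illustration; dense, so only suitable for toy sizes.
    """
    n_items, k = neighbor_ids.shape
    # Scatter the neighbor lists into an item x item weight matrix.
    weights = np.zeros((n_items, n_items))
    rows = np.repeat(np.arange(n_items), k)
    weights[rows, neighbor_ids.ravel()] = neighbor_sims.ravel()
    # scores[u, j] = sum of weights[i, j] over the items i that user u has.
    scores = interactions.astype(np.float64) @ weights
    scores[interactions > 0] = -np.inf      # do not recommend items already held
    return np.argsort(-scores, axis=1, kind="stable")[:, :n]

# Toy data (assumed): 4 users, 3 items, 1 neighbor per item.
toy = np.array([[1, 1, 0],
                [1, 1, 0],
                [1, 0, 1],
                [0, 0, 1]])
toy_neighbor_ids = np.array([[1], [0], [0]])
toy_neighbor_sims = np.array([[2.0 / 3.0], [2.0 / 3.0], [0.25]])
top_items = recommend(toy, toy_neighbor_ids, toy_neighbor_sims, n=1)
print(top_items.ravel())
```

With 20 neighbors per item and 20 recommendations per user, the same scoring applied to 10 million users yields the 200 million recommendations quoted on the slide.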
case study: medium
10 million users, 20 thousand items, 500 million records (50 items per user on average)
20 most similar items for each item (400 thousand total similarities), 20 recommendations for each user (200 million total recommendations)
latent variable model, 100 features per user / item, alternating least squares algorithm (10 iterations)
32 minutes
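The four timings can be put on a common footing with some simple arithmetic. The sketch below uses only the record counts and times from the case study slides; the "records per second" and "slowdown" figures are derived here for comparison and are not reported in the deck.

```python
# (records, seconds) for each case study, taken from the slides.
runs = {
    ("small", "nearest neighbor"): (50_000_000, 120),
    ("small", "latent variable"): (50_000_000, 140),
    ("medium", "nearest neighbor"): (500_000_000, 28 * 60),
    ("medium", "latent variable"): (500_000_000, 32 * 60),
}

# Derived figure: input records processed per second of wall-clock time.
throughput = {key: records / seconds for key, (records, seconds) in runs.items()}
for (size, model), rate in throughput.items():
    print(f"{size:6s} {model:17s} {rate:,.0f} records/second")

# Derived figure: how much longer the medium run takes for 10x the records.
slowdown = {
    model: runs[("medium", model)][1] / runs[("small", model)][1]
    for model in ("nearest neighbor", "latent variable")
}
print(slowdown)
```

Going from 50 million to 500 million records multiplies the run time by about 14 for both methods, so throughput falls somewhat at the larger size rather than staying constant.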
Sixth Moment Computing
https://www.sixthmoment.com/contactus
http://www.slideshare.net/SixthMoment
https://twitter.com/SixthMoment
https://plus.google.com/+Sixthmoment
http://www.linkedin.com/company/sixth-moment-computing-corporation