RecSys 2015 - Unifying the Problem of Search and Recommendations at OpenTable
Unifying the Problem of Search and Recommendations at OpenTable
Jeremy Schiff, Ph.D. | RecSys 2015 | 09/20/2015
(Diagram: DINERS and RESTAURANTS across the BEFORE / DURING / AFTER stages: Attracting & Planning, Understanding & Evolving)
OpenTable: Deliver great experiences at every step, based on who you are
OpenTable in Numbers
• Our network connects diners with more than 32,000 restaurants worldwide.
• Our diners have spent approximately $35 billion at our partner restaurants.
• OpenTable seats more than 17 million diners each month.
• Every month, OpenTable diners write more than 475,000 restaurant reviews.
OpenTable Data Ecosystem
• Search (Context & Intent): User's Location, Search Location, Date, Time, Query
• Restaurant Profile (Decision Confidence): Photos, Reviews, Ratings, Menus
• Reservation History (Verifying the Loop): Seating Logs
• Reviews (Verifying the Loop): Reviews, Ratings (Overall, Food, Noise Level, etc.)
• User Interaction Logs
So what are recommendations?
What's the Goal?
Minimizing Engineering Time to Improve the Metric that Matters
• Make it Easy to Measure
• Make it Easy to Iterate
• Reduce Iteration Cycle Times
Pick Your Business Metric
Revenue, Conversions:
• OpenTable
• Amazon
Retention, Engagement:
• Netflix
• Pandora
• Spotify
Importance of A/B Testing
• If you don't measure it, you can't improve it
• Metrics Drive Behavior
• Continued Forward Progress
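The "if you don't measure it, you can't improve it" point rests on a standard significance check. The sketch below is a generic two-proportion z-test for a conversion A/B test, not OpenTable's tooling; the traffic numbers are invented.

```python
import math

def ab_significance(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test for a conversion-rate A/B test.

    Returns the z statistic; |z| > 1.96 means the difference is
    significant at the 95% level (two-sided).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled conversion rate
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Hypothetical test: variant B converts 5.5% vs. control's 5.0%
# over 20,000 users in each bucket.
z = ab_significance(1000, 20000, 1100, 20000)
print(f"z = {z:.2f}")   # |z| > 1.96 -> statistically significant
```

With smaller buckets the same 0.5-point lift would not reach significance, which is the "time to statistical significance" tradeoff mentioned later in the talk.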
The Optimization Loops
• Introspect (Hours)
• Offline Learning (Days)
• Online Learning (Weeks)
The ingredients of a spectacular dining experience…
… and a spectacularly bad one
Examples of Topics (using MF)
(Plots: Lead Time vs. Distance for New York and Dallas, with 95% intervals)
Query Logs
• Effective mechanism for understanding what users are trying to do
• Reducing 0-result queries
  - Anecdote: should we support zip codes next?
Search to Recommendations Continuum
• Common Themes
  - Ranking always tries to move a key metric (like conversion)
  - Always leverage implicit signals (time of day, day of week, location, etc.)
  - User Control vs. Paradox of Choice

Mode      | Advantage               | Example                        | Stage     | Item Count
Search    | User Control            | $$, French, Takes Credit Card  | Retrieval | Many
Browse    | Use Case Control        | Great View / Romantic          | Ranking   | Many
Recommend | Data-Driven Flexibility | Best around me                 | Ranking   | Few
Differences in Recommender Usage
• Right now vs. Planning
• Cost of Being Wrong
Search vs. Recommendations
Collaborative Filtering Models
• Personalized
• Without Context
Search
• Leverage Context
• Using CF as One of Many Inputs
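One way to read "CF as one of many inputs" is a ranking function in which the collaborative-filtering score is just one feature alongside contextual ones. Everything in this sketch (feature names, weights, restaurants) is illustrative; in practice the weights would be learned, not hand-set.

```python
# Hypothetical linear blend: the CF score is one feature among context
# features (distance, availability, query match). Weights are made up.
def rank_score(cf_score, distance_km, open_now, query_match):
    w = {"cf": 0.5, "distance": -0.1, "open": 0.3, "match": 0.6}
    return (w["cf"] * cf_score
            + w["distance"] * distance_km
            + w["open"] * (1.0 if open_now else 0.0)
            + w["match"] * query_match)

candidates = [
    ("Bistro A", rank_score(cf_score=0.9, distance_km=1.2,
                            open_now=True, query_match=0.8)),
    ("Sushi B",  rank_score(cf_score=0.4, distance_km=0.3,
                            open_now=True, query_match=0.9)),
]
ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
print(ranked)
```

A strong personal CF signal can outrank a closer, better-matching result, which is exactly the tension between personalization and context the slide describes.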
Search & Recommendation Stack
• Online path: Query Interpretation -> Retrieval -> Ranking (Item & Explanation) -> Visualization
• Supporting components: Context for Query & User, Index Building, Model Building, Explanation Content
• Inputs: Collaborative Filters, Item / User Metadata
Using Context, Frequency & Sentiment
• Context
  - Implicit: Location, Time, Mobile/Web
  - Explicit: Query
• High-End Restaurant for Dinner
  - Low Frequency, High Sentiment
• Fast, Mediocre Sushi for Lunch
  - High Frequency, Moderate Sentiment
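The frequency/sentiment split above can be sketched as a context-dependent weighting: for lunch, repeat visits matter more than glowing reviews; for dinner, the reverse. The context labels and weights below are assumptions for illustration, not OpenTable's actual model.

```python
# Illustrative sketch: weigh visit frequency vs. review sentiment
# differently per dining context (weights are made up).
def affinity(context, frequency, sentiment):
    if context == "lunch":     # fast, repeatable spots: frequency dominates
        return 0.7 * frequency + 0.3 * sentiment
    if context == "dinner":    # special occasions: sentiment dominates
        return 0.3 * frequency + 0.7 * sentiment
    return 0.5 * (frequency + sentiment)

# High-end dinner spot: visited rarely but loved.
print(affinity("dinner", frequency=0.1, sentiment=0.95))
# Mediocre sushi place: visited often for lunch, moderate sentiment.
print(affinity("lunch", frequency=0.9, sentiment=0.5))
```

Both restaurants score well, but only in the context where their signal pattern makes sense.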
Offline Models with Limited Data
• Minimize Confusing User Experience
• Little to No Data
  - Heuristics: Encoding Product Expectations
  - E.g.: Romantic dates are not $. Sushi is not good for breakfast.
• Limited Data
  - Data-Informed: e.g. analyze what cuisines users click on when they query for lunch
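"Encoding product expectations" with little data often comes down to hard heuristic rules. This is a minimal sketch of that idea using the slide's two examples; the rule format and restaurant fields are assumptions.

```python
# Rule-based heuristics for the no-data regime: filter out combinations
# a product expert knows are wrong. Rules mirror the slide's examples.
RULES = [
    (lambda intent, r: intent == "romantic" and r["price_tier"] == 1,
     "romantic dates are not $"),
    (lambda intent, r: intent == "breakfast" and r["cuisine"] == "sushi",
     "sushi is not good for breakfast"),
]

def passes_heuristics(intent, restaurant):
    # A candidate survives only if no rule flags the (intent, item) pair.
    return not any(rule(intent, restaurant) for rule, _ in RULES)

cheap_spot = {"price_tier": 1, "cuisine": "diner"}
sushi_bar = {"price_tier": 3, "cuisine": "sushi"}
assert not passes_heuristics("romantic", cheap_spot)
assert not passes_heuristics("breakfast", sushi_bar)
assert passes_heuristics("dinner", sushi_bar)
```

As data accumulates, rules like these can be replaced by the data-informed analysis the slide mentions (e.g. observed lunch-query click patterns).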
Offline Models with Significant Data
• Compensate for Sparseness
• As Signals Improve, Popular -> Personalized
• OpenTable Example
  - Context: User Location, Searched Location, Query, etc.
• Learning to Rank
  - E[ Revenue | Query, Position, Item, User ]
  - E[ Engagement | Query, Position, Item, User ]
  - Regression, RankSVM, LambdaMART…
The Metric Gap

Stage     | Training Error  | Generalization Error | A/B Metric
Example   | RMSE            | Precision @ K        | Conversion
Data      | Training        | Test                 | Online
Timescale | Offline (Hours) | Offline (Hours)      | Online (Weeks)

• Generalization Gap: between training error and test error
• Offline -> Online Gap: between offline test metrics and the A/B metric
• Learning to Rank operates on the offline side of this gap
Online Learning – Overview
• Naïve Online Learning is A/B testing
  - Try different sets of parameters, pick the winner
• Multi-Armed Bandit
  - Exploiting the parameter sets that do well
  - Exploring parameters that we don't understand well yet (high variance)
Online Learning – Implementation
• Iteration Loop
  - Add Sets of Parameters
  - Explore vs. Exploit Current Parameters
• Validate Online Learning with A/B testing
• Note: Tradeoff in Time to Statistical Significance
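The explore/exploit loop can be sketched with the simplest bandit strategy, epsilon-greedy, on simulated traffic. (The talk's variance-aware framing suggests something closer to UCB or Thompson sampling; epsilon-greedy is used here only because it is the shortest to write. All rates are simulated.)

```python
import random

# Epsilon-greedy bandit over three "arms" (parameter sets) with
# simulated hidden conversion rates.
random.seed(0)
true_rates = [0.02, 0.05, 0.10]
n_arms = len(true_rates)
counts = [0] * n_arms
rewards = [0.0] * n_arms
epsilon = 0.1                    # fraction of traffic spent exploring

def estimate(a):
    # Untried arms get +inf so every arm is sampled at least once.
    return rewards[a] / counts[a] if counts[a] else float("inf")

for _ in range(50_000):
    if random.random() < epsilon:
        arm = random.randrange(n_arms)          # explore
    else:
        arm = max(range(n_arms), key=estimate)  # exploit best estimate
    counts[arm] += 1
    if random.random() < true_rates[arm]:
        rewards[arm] += 1.0

best = max(range(n_arms), key=estimate)
print("best arm:", best, "pulls per arm:", counts)
```

Unlike a fixed A/B split, most traffic ends up on the winning arm while it is still being measured, which is the bandit's advantage and also why its results still deserve a confirming A/B test.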
Example – Start with 1 arm
Example – Resample arm
Example – Determine 2nd Arm
Example – Select Arm
Example – Improve Arm's Estimation
Example – Select Arm
Example – Learn from Arm
Example – Determine New Arm
(Each example slide plots the per-arm estimate of the metric against the parameter value.)
Training DataFlow
• Frontends & Backend Services -> User Interaction Logs (Kafka)
• Collaborative Filter: Training (Batch with Spark) -> HyperParameter Tuning (Batch with Spark) -> Service (Realtime)
• Search: Training (Batch with Spark) -> HyperParameter Tuning (Batch with Spark) -> Service (Realtime)
• Online Learning
• A/B Validation
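The wiring of that dataflow can be sketched with plain-Python stand-ins: logs feed batch training and tuning, whose outputs serve realtime requests. The function bodies below are placeholders, not OpenTable code; in production the log source is Kafka and the batch stages run on Spark.

```python
# Structural sketch of the pipeline above with placeholder stages.
def consume_interaction_logs():            # stand-in for a Kafka consumer
    return [{"user": 1, "item": "r42", "clicked": True}]

def train_cf(logs):                        # stand-in for a Spark batch job
    return {"type": "cf", "trained_on": len(logs)}

def tune_hyperparameters(logs):            # e.g. a grid search on Spark
    return {"rank": 20, "reg": 0.1}        # hypothetical winning params

def serve(model, params, user):            # stand-in for the realtime service
    return {"user": user, "model": model["type"], "params": params}

logs = consume_interaction_logs()
model = train_cf(logs)
params = tune_hyperparameters(logs)
response = serve(model, params, user=1)
print(response)
```

The point of the shape is the loop: the service's responses generate the next round of interaction logs, which online learning and A/B validation then feed back into training.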
Compelling Recommendations
Recommendation Explanations
• Amazon
• Ness
• Netflix
• Ness – Social
Summarizing Content
• Essential for Mobile
• Balance Utility With Trust
  - Summarize, but surface raw data
• Example:
  - Initially, read every review
  - Later, use average star rating
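The "summarize, but surface raw data" pattern is small enough to show directly: compute the headline summary while keeping the raw reviews one tap away. The review data is invented for illustration.

```python
# Summary shown up front; raw reviews remain available underneath.
reviews = [
    {"stars": 5, "text": "Perfect anniversary dinner."},
    {"stars": 4, "text": "Great food, a bit loud."},
    {"stars": 3, "text": "Slow service."},
]
summary = {
    "avg_stars": round(sum(r["stars"] for r in reviews) / len(reviews), 2),
    "review_count": len(reviews),
}
print(summary)        # the trusted, glanceable summary...
print(reviews[:2])    # ...with raw data still surfaced on demand
```

Showing the count alongside the average is one cheap way to keep the summary trustworthy, since a 5.0 from two reviews means less than a 4.3 from two thousand.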
Summarizing Restaurant Attributes
Dish Recommendation
• What to try once I have arrived?
Thanks!
Jeremy Schiff, [email protected]
Other OpenTable Members @ RecSys: Sudeep Das & Pablo Delgado