RecSys 2015 - Unifying the Problem of Search and Recommendations at OpenTable
Unifying the Problem of Search and Recommendations at OpenTable
Jeremy Schiff, Ph.D. | RecSys 2015 | 09/20/2015
(Diagram: DINERS and RESTAURANTS across the BEFORE / DURING / AFTER stages: Attracting & Planning, Understanding & Evolving)
OpenTable: Deliver great experiences at every step, based on who you are
OpenTable in Numbers
• Our network connects diners with more than 32,000 restaurants worldwide.
• Our diners have spent approximately $35 billion at our partner restaurants.
• OpenTable seats more than 17 million diners each month.
• Every month, OpenTable diners write more than 475,000 restaurant reviews.
OpenTable Data Ecosystem
• Search (Context & Intent): User's Location, Search Location, Date, Time, Query
• Restaurant Profile (Decision Confidence): Photos, Reviews, Ratings, Menus
• Reservation History (Verifying the Loop): Seating Logs
• Reviews (Verifying the Loop): Reviews, Ratings (Overall, Food, Noise Level, etc.)
• User Interaction Logs
So what are recommendations?
What's the Goal?
Minimizing Engineering Time to Improve the Metric that Matters
• Make it Easy to Measure
• Make it Easy to Iterate
• Reduce Iteration Cycle Times
Pick Your Business Metric
Revenue, Conversions:
• OpenTable
• Amazon
Retention, Engagement:
• Netflix
• Pandora
• Spotify
Importance of A/B Testing
• If you don't measure it, you can't improve it
• Metrics Drive Behavior
• Continued Forward Progress
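The "if you don't measure it, you can't improve it" point rests on a standard significance check. The sketch below is a generic two-proportion z-test for a conversion A/B test, not OpenTable's tooling; the traffic numbers are invented.

```python
import math

def ab_significance(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test for a conversion-rate A/B test.

    Returns the z statistic; |z| > 1.96 means the difference is
    significant at the 95% level (two-sided).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled conversion rate
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Hypothetical test: variant B converts 5.5% vs. control's 5.0%
# over 20,000 users in each bucket.
z = ab_significance(1000, 20000, 1100, 20000)
print(f"z = {z:.2f}")   # |z| > 1.96 -> statistically significant
```

With smaller buckets the same 0.5-point lift would not reach significance, which is the "time to statistical significance" tradeoff mentioned later in the talk.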
The Optimization Loops
• Introspect (Hours)
• Offline Learning (Days)
• Online Learning (Weeks)
The ingredients of a spectacular dining experience…
… and a spectacularly bad one
Examples of Topics (using MF)
(Plots: Lead Time vs. Distance for New York and Dallas, with 95% intervals)
Query Logs
• Effective mechanism for understanding what users are trying to do
• Reducing 0-result queries
  - Anecdote: should we support zip codes next?
Search to Recommendations Continuum
• Common Themes
  - Ranking always tries to move a key metric (like conversion)
  - Always leverage implicit signals (time of day, day of week, location, etc.)
  - User Control vs. Paradox of Choice

Mode      | Advantage               | Example                        | Stage     | Item Count
Search    | User Control            | $$, French, Takes Credit Card  | Retrieval | Many
Browse    | Use Case Control        | Great View / Romantic          | Ranking   | Many
Recommend | Data-Driven Flexibility | Best around me                 | Ranking   | Few
Differences in Recommender Usage
• Right now vs. Planning
• Cost of Being Wrong
Search vs. Recommendations
Collaborative Filtering Models
• Personalized
• Without Context
Search
• Leverage Context
• Using CF as One of Many Inputs
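One way to read "CF as one of many inputs" is a ranking function in which the collaborative-filtering score is just one feature alongside contextual ones. Everything in this sketch (feature names, weights, restaurants) is illustrative; in practice the weights would be learned, not hand-set.

```python
# Hypothetical linear blend: the CF score is one feature among context
# features (distance, availability, query match). Weights are made up.
def rank_score(cf_score, distance_km, open_now, query_match):
    w = {"cf": 0.5, "distance": -0.1, "open": 0.3, "match": 0.6}
    return (w["cf"] * cf_score
            + w["distance"] * distance_km
            + w["open"] * (1.0 if open_now else 0.0)
            + w["match"] * query_match)

candidates = [
    ("Bistro A", rank_score(cf_score=0.9, distance_km=1.2,
                            open_now=True, query_match=0.8)),
    ("Sushi B",  rank_score(cf_score=0.4, distance_km=0.3,
                            open_now=True, query_match=0.9)),
]
ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
print(ranked)
```

A strong personal CF signal can outrank a closer, better-matching result, which is exactly the tension between personalization and context the slide describes.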
Search & Recommendation Stack
• Online path: Query Interpretation -> Retrieval -> Ranking (Item & Explanation) -> Visualization
• Supporting components: Context for Query & User, Index Building, Model Building, Explanation Content
• Inputs: Collaborative Filters, Item / User Metadata
Using Context, Frequency & Sentiment
• Context
  - Implicit: Location, Time, Mobile/Web
  - Explicit: Query
• High-End Restaurant for Dinner
  - Low Frequency, High Sentiment
• Fast, Mediocre Sushi for Lunch
  - High Frequency, Moderate Sentiment
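The frequency/sentiment split above can be sketched as a context-dependent weighting: for lunch, repeat visits matter more than glowing reviews; for dinner, the reverse. The context labels and weights below are assumptions for illustration, not OpenTable's actual model.

```python
# Illustrative sketch: weigh visit frequency vs. review sentiment
# differently per dining context (weights are made up).
def affinity(context, frequency, sentiment):
    if context == "lunch":     # fast, repeatable spots: frequency dominates
        return 0.7 * frequency + 0.3 * sentiment
    if context == "dinner":    # special occasions: sentiment dominates
        return 0.3 * frequency + 0.7 * sentiment
    return 0.5 * (frequency + sentiment)

# High-end dinner spot: visited rarely but loved.
print(affinity("dinner", frequency=0.1, sentiment=0.95))
# Mediocre sushi place: visited often for lunch, moderate sentiment.
print(affinity("lunch", frequency=0.9, sentiment=0.5))
```

Both restaurants score well, but only in the context where their signal pattern makes sense.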
Offline Models with Limited Data
• Minimize Confusing User Experience
• Little to No Data
  - Heuristics: Encoding Product Expectations
  - E.g.: Romantic dates are not $. Sushi is not good for breakfast.
• Limited Data
  - Data-Informed: e.g. analyze what cuisines users click on when they query for lunch
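"Encoding product expectations" with little data often comes down to hard heuristic rules. This is a minimal sketch of that idea using the slide's two examples; the rule format and restaurant fields are assumptions.

```python
# Rule-based heuristics for the no-data regime: filter out combinations
# a product expert knows are wrong. Rules mirror the slide's examples.
RULES = [
    (lambda intent, r: intent == "romantic" and r["price_tier"] == 1,
     "romantic dates are not $"),
    (lambda intent, r: intent == "breakfast" and r["cuisine"] == "sushi",
     "sushi is not good for breakfast"),
]

def passes_heuristics(intent, restaurant):
    # A candidate survives only if no rule flags the (intent, item) pair.
    return not any(rule(intent, restaurant) for rule, _ in RULES)

cheap_spot = {"price_tier": 1, "cuisine": "diner"}
sushi_bar = {"price_tier": 3, "cuisine": "sushi"}
assert not passes_heuristics("romantic", cheap_spot)
assert not passes_heuristics("breakfast", sushi_bar)
assert passes_heuristics("dinner", sushi_bar)
```

As data accumulates, rules like these can be replaced by the data-informed analysis the slide mentions (e.g. observed lunch-query click patterns).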
Offline Models with Significant Data
• Compensate for Sparseness
• As Signals Improve, Popular -> Personalized
• OpenTable Example
  - Context: User Location, Searched Location, Query, etc.
• Learning to Rank
  - E[ Revenue | Query, Position, Item, User ]
  - E[ Engagement | Query, Position, Item, User ]
  - Regression, RankSVM, LambdaMART…
The Metric Gap

Stage     | Training Error  | Generalization Error | A/B Metric
Example   | RMSE            | Precision @ K        | Conversion
Data      | Training        | Test                 | Online
Timescale | Offline (Hours) | Offline (Hours)      | Online (Weeks)

• Generalization Gap: between training error and test error
• Offline -> Online Gap: between offline test metrics and the A/B metric
• Learning to Rank operates on the offline side of this gap
Online Learning – Overview
• Naïve Online Learning is A/B testing
  - Try different sets of parameters, pick the winner
• Multi-Armed Bandit
  - Exploiting the parameter sets that do well
  - Exploring parameters that we don't understand well yet (high variance)
Online Learning – Implementation
• Iteration Loop
  - Add Sets of Parameters
  - Explore vs. Exploit Current Parameters
• Validate Online Learning with A/B testing
• Note: Tradeoff in Time to Statistical Significance
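The explore/exploit loop can be sketched with the simplest bandit strategy, epsilon-greedy, on simulated traffic. (The talk's variance-aware framing suggests something closer to UCB or Thompson sampling; epsilon-greedy is used here only because it is the shortest to write. All rates are simulated.)

```python
import random

# Epsilon-greedy bandit over three "arms" (parameter sets) with
# simulated hidden conversion rates.
random.seed(0)
true_rates = [0.02, 0.05, 0.10]
n_arms = len(true_rates)
counts = [0] * n_arms
rewards = [0.0] * n_arms
epsilon = 0.1                    # fraction of traffic spent exploring

def estimate(a):
    # Untried arms get +inf so every arm is sampled at least once.
    return rewards[a] / counts[a] if counts[a] else float("inf")

for _ in range(50_000):
    if random.random() < epsilon:
        arm = random.randrange(n_arms)          # explore
    else:
        arm = max(range(n_arms), key=estimate)  # exploit best estimate
    counts[arm] += 1
    if random.random() < true_rates[arm]:
        rewards[arm] += 1.0

best = max(range(n_arms), key=estimate)
print("best arm:", best, "pulls per arm:", counts)
```

Unlike a fixed A/B split, most traffic ends up on the winning arm while it is still being measured, which is the bandit's advantage and also why its results still deserve a confirming A/B test.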
Example – Start with 1 arm
Example – Resample arm
Example – Determine 2nd Arm
Example – Select Arm
Example – Improve Arm's Estimation
Example – Select Arm
Example – Learn from Arm
Example – Determine New Arm
(Each example slide plots the per-arm estimate of the metric against the parameter value.)
Training DataFlow
• Frontends & Backend Services -> User Interaction Logs (Kafka)
• Collaborative Filter: Training (Batch with Spark) -> HyperParameter Tuning (Batch with Spark) -> Service (Realtime)
• Search: Training (Batch with Spark) -> HyperParameter Tuning (Batch with Spark) -> Service (Realtime)
• Online Learning
• A/B Validation
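The wiring of that dataflow can be sketched with plain-Python stand-ins: logs feed batch training and tuning, whose outputs serve realtime requests. The function bodies below are placeholders, not OpenTable code; in production the log source is Kafka and the batch stages run on Spark.

```python
# Structural sketch of the pipeline above with placeholder stages.
def consume_interaction_logs():            # stand-in for a Kafka consumer
    return [{"user": 1, "item": "r42", "clicked": True}]

def train_cf(logs):                        # stand-in for a Spark batch job
    return {"type": "cf", "trained_on": len(logs)}

def tune_hyperparameters(logs):            # e.g. a grid search on Spark
    return {"rank": 20, "reg": 0.1}        # hypothetical winning params

def serve(model, params, user):            # stand-in for the realtime service
    return {"user": user, "model": model["type"], "params": params}

logs = consume_interaction_logs()
model = train_cf(logs)
params = tune_hyperparameters(logs)
response = serve(model, params, user=1)
print(response)
```

The point of the shape is the loop: the service's responses generate the next round of interaction logs, which online learning and A/B validation then feed back into training.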
Compelling Recommendations
Recommendation Explanations
• Amazon
• Ness
• Netflix
• Ness – Social
Summarizing Content
• Essential for Mobile
• Balance Utility With Trust
  - Summarize, but surface raw data
• Example:
  - Initially, read every review
  - Later, use average star rating
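The "summarize, but surface raw data" pattern is small enough to show directly: compute the headline summary while keeping the raw reviews one tap away. The review data is invented for illustration.

```python
# Summary shown up front; raw reviews remain available underneath.
reviews = [
    {"stars": 5, "text": "Perfect anniversary dinner."},
    {"stars": 4, "text": "Great food, a bit loud."},
    {"stars": 3, "text": "Slow service."},
]
summary = {
    "avg_stars": round(sum(r["stars"] for r in reviews) / len(reviews), 2),
    "review_count": len(reviews),
}
print(summary)        # the trusted, glanceable summary...
print(reviews[:2])    # ...with raw data still surfaced on demand
```

Showing the count alongside the average is one cheap way to keep the summary trustworthy, since a 5.0 from two reviews means less than a 4.3 from two thousand.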
Summarizing Restaurant Attributes
Dish Recommendation
• What to try once I have arrived?
Thanks!
Jeremy Schiff, [email protected]
Other OpenTable Members @ RecSys: Sudeep Das & Pablo Delgado