RecSys 2015 - Unifying the Problem of Search and Recommendations at OpenTable


Unifying the Problem of Search and Recommendations at OpenTable

Jeremy Schiff, Ph.D. | RecSys 2015 | 09/20/2015

BEFORE / DURING / AFTER

DINERS: Attracting & Planning
RESTAURANTS: Understanding & Evolving

OpenTable: Deliver great experiences at every step, based on who you are


OpenTable in Numbers
• Our network connects diners with more than 32,000 restaurants worldwide.
• Our diners have spent approximately $35 billion at our partner restaurants.
• OpenTable seats more than 17 million diners each month.
• Every month, OpenTable diners write more than 475,000 restaurant reviews.


OpenTable Data Ecosystem

• Search (Context & Intent): User’s Location, Search Location, Date, Time, Query
• Restaurant Profile (Decision Confidence): Photos, Reviews, Ratings, Menus
• Reservation History (Verifying the Loop): Seating Logs
• Reviews (Verifying the Loop): Reviews, Ratings (Overall, Food, Noise Level, etc.)
• User Interaction Logs

So what are recommendations?


What’s the Goal?

Minimizing Engineering Time to Improve the Metric that Matters
• Make it Easy to Measure
• Make it Easy to Iterate
• Reduce Iteration Cycle Times


Pick Your Business Metric

Revenue, Conversions:
• OpenTable
• Amazon

Retention, Engagement:
• Netflix
• Pandora
• Spotify


Importance of A/B Testing
• If you don’t measure it, you can’t improve it
• Metrics Drive Behavior
• Continued Forward Progress


The Optimization Loops

Introspect (Hours) -> Offline Learning (Days) -> Online Learning (Weeks)


The ingredients of a spectacular dining experience…


… and a spectacularly bad one

Examples of Topics (using Matrix Factorization)


[Charts: distributions of Lead Time and Distance for New York vs. Dallas, with 95th-percentile markers]

Query Logs
• Effective mechanism for understanding what users are trying to do
• Reducing 0-result queries
  - Anecdote: should we support zip codes next?

Search to Recommendations Continuum

• Common Themes
  - Ranking always tries to move a key metric (like conversion)
  - Always leverage implicit signals (time of day, day of week, location, etc.)
  - User Control vs. Paradox of Choice

            Advantage                 Example                         Stage      Item Count
Search      User Control              $$, French, Takes Credit Card   Retrieval  Many
Browse      Use Case Control          Great View / Romantic           Ranking    Many
Recommend   Data-Driven Flexibility   Best around me                  Ranking    Few

Differences in Recommender Usage
• Right now vs. Planning
• Cost of Being Wrong

Search vs. Recommendations


Search vs. Recommendations

Collaborative Filtering Models:
• Personalized
• Without Context

Search:
• Leverage Context
• Using CF as One of Many Inputs

Search & Recommendation Stack
• Query Interpretation
• Retrieval
• Ranking – Item & Explanation
• Visualization
• Index Building
• Model Building
• Context for Query & User
• Explanation Content
• Collaborative Filters
• Item / User Metadata


Using Context, Frequency & Sentiment
• Context
  - Implicit: Location, Time, Mobile/Web
  - Explicit: Query
• High-End Restaurant for Dinner
  - Low Frequency, High Sentiment
• Fast, Mediocre Sushi for Lunch
  - High Frequency, Moderate Sentiment
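A minimal sketch of how frequency and sentiment might be blended per context. The weights and parameter names here are illustrative assumptions, not OpenTable’s actual formula:

```python
def restaurant_score(visit_frequency, avg_sentiment, context):
    """Blend repeat-visit frequency with review sentiment, weighted
    by dining context. Weights are illustrative assumptions: quick
    lunches reward habit, dinners reward quality."""
    if context == "lunch":
        w_freq, w_sent = 0.7, 0.3
    else:  # dinner and other higher-stakes occasions
        w_freq, w_sent = 0.2, 0.8
    return w_freq * visit_frequency + w_sent * avg_sentiment

# The fast, mediocre sushi spot (high frequency, moderate sentiment)
# wins at lunch; the high-end restaurant (low frequency, high
# sentiment) wins at dinner.
sushi_lunch = restaurant_score(0.9, 0.5, "lunch")        # ≈ 0.78
high_end_lunch = restaurant_score(0.1, 0.95, "lunch")    # ≈ 0.36
sushi_dinner = restaurant_score(0.9, 0.5, "dinner")      # ≈ 0.58
high_end_dinner = restaurant_score(0.1, 0.95, "dinner")  # ≈ 0.78
```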


Offline Models with Limited Data
• Minimize Confusing User Experience
• Little to No Data
  - Heuristics Encoding Product Expectations
  - E.g.: Romantic dates are not $. Sushi is not good for breakfast.
• Limited Data
  - Data-Informed: e.g., analyze which cuisines users click on when they query for lunch
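The heuristic encoding above can be sketched as simple hard-coded rules. All field names here ("price_tier", "occasion", etc.) are hypothetical; the rules mirror the slide’s examples:

```python
def passes_product_expectations(restaurant, query):
    """Hard-coded product heuristics for the little-to-no-data case.
    Field names are hypothetical stand-ins, not OpenTable's schema."""
    # Romantic dates are not $ (cheapest price tier).
    if query.get("occasion") == "romantic" and restaurant["price_tier"] == 1:
        return False
    # Sushi is not good for breakfast.
    if query.get("meal") == "breakfast" and restaurant["cuisine"] == "sushi":
        return False
    return True
```

Such rules can act as a filter on the retrieval set until enough interaction data accumulates to replace them with learned signals.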

Offline Models with Significant Data
• Compensate for Sparseness
• As Signals Improve, Popular -> Personalized
• OpenTable Example
  - Context: User Location, Searched Location, Query, etc.
• Learning to Rank
  - E[ Revenue | Query, Position, Item, User ]
  - E[ Engagement | Query, Position, Item, User ]
  - Regression, RankSVM, LambdaMART, …
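A pointwise learning-to-rank baseline along these lines can be sketched as a tiny logistic model of E[ Conversion | features ], trained by SGD and used to sort candidates. The features and data below are toy stand-ins, not the talk’s actual model:

```python
import math
import random

def train_conversion_model(logs, epochs=200, lr=0.5):
    """Pointwise learning to rank: fit a logistic model of
    E[conversion | features] with stochastic gradient descent."""
    dim = len(logs[0][0])
    w, b = [0.0] * dim, 0.0
    rng = random.Random(0)
    data = list(logs)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log loss w.r.t. z
            b -= lr * g
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
    return w, b

def score(model, x):
    """Predicted conversion probability for one candidate."""
    w, b = model
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

# Toy interaction logs: features = [distance_km, cuisine_match],
# label = did the diner convert (book)?
logs = [([0.2, 1.0], 1), ([0.3, 1.0], 1), ([0.4, 0.0], 1),
        ([2.0, 1.0], 0), ([2.5, 0.0], 0), ([3.0, 0.0], 0)]
model = train_conversion_model(logs)

# Rank candidate restaurants by predicted conversion probability.
candidates = {"near_match": [0.2, 1.0], "far_miss": [3.0, 0.0]}
ranking = sorted(candidates, key=lambda c: -score(model, candidates[c]))
```

In practice this pointwise regression is the simplest of the listed options; RankSVM and LambdaMART optimize pairwise or listwise objectives instead.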

The Metric Gap

Stage                  Example         Timescale
Training Error         RMSE            Offline (Hours)
Generalization Error   Precision @ K   Offline (Hours)
A/B Metric             Conversion      Online (Weeks)

• Training -> Test: the Generalization Gap (the target of Learning to Rank)
• Offline -> Online Gap: Generalization Error vs. the A/B Metric
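Precision @ K, named above as the example offline generalization metric, is straightforward to compute:

```python
def precision_at_k(ranked_items, relevant_items, k):
    """Fraction of the top-k ranked items that are relevant."""
    top_k = ranked_items[:k]
    return sum(1 for item in top_k if item in relevant_items) / k
```

For example, `precision_at_k(["a", "b", "c", "d"], {"a", "c"}, 2)` returns 0.5: only "a" among the top 2 is relevant. The gap the slide describes is that improving this offline number does not guarantee a lift in online conversion.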

Online Learning – Overview
• Naïve Online Learning is A/B testing
  - Try different sets of parameters, pick the winner
• Multi-Armed Bandit
  - Exploiting the parameter sets that do well
  - Exploring parameter sets that we don’t understand well yet (high variance)

Online Learning – Implementation
• Iteration Loop
  - Add Sets of Parameters
  - Explore vs. Exploit Current Parameters
• Validate Online Learning with A/B testing
• Note: Tradeoff in Time to Statistical Significance

Example – Bandit over a Single Parameter (charts plot Metric vs. Parameter)
1. Start with 1 arm
2. Resample arm
3. Determine 2nd arm
4. Select arm
5. Improve arm’s estimation
6. Select arm
7. Learn from arm
8. Determine new arm
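One standard way to implement the explore/exploit loop above is the UCB1 algorithm, sketched below; the talk does not specify which bandit strategy OpenTable actually used:

```python
import math

class UCB1:
    """UCB1 multi-armed bandit: exploit arms that do well, while
    exploring arms whose metric estimates are still uncertain."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms    # plays per arm
        self.totals = [0.0] * n_arms  # summed rewards per arm

    def select_arm(self):
        # Play each arm once before applying the UCB formula.
        for arm, c in enumerate(self.counts):
            if c == 0:
                return arm
        t = sum(self.counts)

        def ucb(a):
            mean = self.totals[a] / self.counts[a]
            # Confidence bonus shrinks as an arm is played more.
            return mean + math.sqrt(2.0 * math.log(t) / self.counts[a])

        return max(range(len(self.counts)), key=ucb)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.totals[arm] += reward

# Simulate: arm 1 always converts, arm 0 never does. The bandit
# concentrates plays on arm 1 while still occasionally exploring.
bandit = UCB1(2)
for _ in range(200):
    arm = bandit.select_arm()
    bandit.update(arm, 1.0 if arm == 1 else 0.0)
```

Each "arm" here stands for one parameter set; validating the winning arm with a classic A/B test, as the slide suggests, guards against bandit-specific biases.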

Training DataFlow

• Frontends & Backend Services
• User Interaction Logs (Kafka)
• Collaborative Filter Training (Batch with Spark)
• Collaborative Filter HyperParameter Tuning (Batch with Spark)
• Collaborative Filter Service (Realtime)
• Search Training (Batch with Spark)
• Search HyperParameter Tuning (Batch with Spark)
• Search Service (Realtime)
• Online Learning
• A/B Validation

Compelling Recommendations


Recommendation Explanations
• Amazon
• Ness
• Netflix
• Ness – Social


Summarizing Content
• Essential for Mobile
• Balance Utility with Trust
  - Summarize, but surface the raw data
• Example:
  - Initially, read every review
  - Later, use the average star rating


Summarizing Restaurant Attributes


Dish Recommendation
• What to try once I have arrived?


Thanks!

Jeremy Schiff, Ph.D.
jschiff@opentable.com

Other OpenTable Members @ RecSys: Sudeep Das & Pablo Delgado