Large-scale Recommendations in a Dynamic Marketplace Jay Katukuri Rajyashree Mukherjee Tolga Konik...


Large-scale Recommendations in a Dynamic Marketplace

Jay Katukuri, Rajyashree Mukherjee

Tolga Konik, Chu-Cheng Hsieh

LSRS 2013


Meet John Doe

John is interested in an item, “iPhone 5 64gb white”. Should we recommend
– “iPhone 5 case”
(or)
– “iPhone 5s gold”?


Recommendation on e-marketplace

• Recommendation “before” purchase: Similar Item Recommendation (SIR)
– iPhone 5S gold

• Recommendation “after” purchase: Related Item Recommendation (RIR)
– iPhone 5 case


SIR Example 1


SIR Example 2


Related Item Recommendation


Recommendations for Xbox 360 4GB on Checkout page


Main Idea

• Similar Item Clustering (SIC)
– Titles
– Attributes (Price, etc.)
– Images

• Recommendation
– SIR: (same cluster)
– RIR: (neighbor clusters)
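The main idea can be sketched as two lookups over a cluster model. This is a minimal illustration, not the deployed system: the item ids, the `ITEM_TO_CLUSTER` and `NEIGHBORS` tables, and all function names are hypothetical and chosen only to show "same cluster" versus "neighbor clusters".

```python
from collections import defaultdict

# Hypothetical toy data: item id -> cluster label (not taken from the slides).
ITEM_TO_CLUSTER = {
    "iphone5-64gb-white": "iphone 5",
    "iphone5-16gb-black": "iphone 5",
    "iphone5-case-red": "iphone 5 case",
}
# Hypothetical cluster-to-cluster relations, i.e. the "neighbor clusters".
NEIGHBORS = {"iphone 5": ["iphone 5 case"]}


def build_cluster_index(item_to_cluster):
    """Invert item -> cluster into cluster -> list of items."""
    index = defaultdict(list)
    for item, cluster in item_to_cluster.items():
        index[cluster].append(item)
    return index


def similar_items(item, item_to_cluster, index):
    """SIR: other items that share the seed item's cluster."""
    cluster = item_to_cluster[item]
    return [other for other in index[cluster] if other != item]


def related_items(item, item_to_cluster, index, neighbors):
    """RIR: items drawn from clusters that neighbor the seed item's cluster."""
    cluster = item_to_cluster[item]
    result = []
    for neighbor in neighbors.get(cluster, []):
        result.extend(index.get(neighbor, []))
    return result


INDEX = build_cluster_index(ITEM_TO_CLUSTER)
```

With this toy data, `similar_items("iphone5-64gb-white", ITEM_TO_CLUSTER, INDEX)` yields the other iPhone 5 listing, while `related_items(...)` yields the case.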


Models

• Item clusters: each cluster is represented by meaningful keywords
– “clarks women shoe pumps classics”
– “authentic handmade amish quilt”

• Cluster-Cluster Relations
– “samsung galaxy s4” – “samsung galaxy s4 screen protector”
– “wolfgang puck electric pressure cooker” – “kitchenaid food processor”


System Architecture - Overview

[Figure: architecture diagram with three parts]

• Offline Model Generation
– Clusters Model Generation
– Related Clusters Model Generation

• The Data Store
– Inventory, Clickstream, Transactions, Conceptual Knowledgebase
– Clusters, Cluster-Cluster Relations

• Real-time Performance System
– Similar Items Recommender (SIR): Lost Item → ?similarTo(item) → Similar Items
– Related Items Recommender (RIR): Bought Item → ?relatedTo(item) → Related Items


Cluster Generation (offline)


Data on eBay

• Item-item co-occurrences on transaction logs

• Large Data
– Much bigger data set in both users and inventory than other ecommerce sites.

• Scale
– More than 300M listings.
– More than 10M new items every day


Challenges

• Global clustering not feasible
• Size bias on different categories
• Performance


Model Generation - Clusters

1. Select a few keywords to represent “big notions”, e.g. iPhone, Handbags, etc.
– How to select?

2. Clustering by K-means
– How to set K?


Model Generation - Clusters

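[Figure: Clusters Model Generation pipeline. The Data Store (Inventory, Clickstream, Conceptual Knowledgebase) supplies items, user queries, concepts and categories. Query-Recall Generation produces query-to-items data, and Cluster Generation produces new clusters that are written back to the Data Store as Clusters.]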

• Problem: Global clustering not feasible
• Solution: Partition input data by user queries
• Parallel distributed K-Means in Hadoop MapReduce
• Dedupe and merge overlapping clusters (100X reduction in size over inventory with over 90% coverage)
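The partition-then-cluster idea can be sketched in a single process as follows. This is only an illustration under stated assumptions: the slides run K-Means as a Hadoop MapReduce job, whereas this sketch uses a plain loop over partitions, a textbook Lloyd's K-Means seeded with the first k points, and hypothetical names (`kmeans`, `cluster_by_query`, `item_features`). The dedupe-and-merge step is not shown.

```python
import math


def kmeans(points, k, iterations=10):
    """Plain Lloyd's k-means; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        for i, point in enumerate(points):
            assignment[i] = min(
                range(len(centroids)),
                key=lambda c: math.dist(point, centroids[c]),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [points[i] for i, a in enumerate(assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def cluster_by_query(query_to_items, item_features, k):
    """Cluster each query's recall set independently of all the others."""
    clusters = {}
    for query, items in query_to_items.items():
        points = [item_features[item] for item in items]
        labels = kmeans(points, min(k, len(points)))
        for item, label in zip(items, labels):
            clusters.setdefault((query, label), []).append(item)
    return clusters
```

Because each query's partition is clustered on its own, the outer loop has no shared state, which is what makes the work easy to distribute.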


Base Cluster Generation

• Base Cluster ≡ Query

• Find merge candidates based on query term overlap
– E.g.: “nike airmax tennis shoes” -> “nike airmax”

• Score candidates using cosine similarity
– Term weight: TF-IDF in the query space (document = query)
  • TF: Query Demand
  • IDF: Number of Queries
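One possible reading of this scoring step is sketched below. The slides do not give the exact formulas, so the following are assumptions: TF of a term is the summed demand of all queries containing it, IDF is log(N / number of queries containing the term), and a merge candidate is any shorter query whose terms all occur in the longer one. The function names and the demand figures in the usage note are hypothetical.

```python
import math
from collections import Counter


def term_weights(query_demand):
    """TF * IDF per term, treating every query as one document."""
    n_queries = len(query_demand)
    tf = Counter()  # assumed: demand-weighted term frequency
    df = Counter()  # number of distinct queries containing the term
    for query, demand in query_demand.items():
        for term in set(query.split()):
            tf[term] += demand
            df[term] += 1
    return {term: tf[term] * math.log(n_queries / df[term]) for term in tf}


def cosine(query_a, query_b, weights):
    """Cosine similarity between two queries in the weighted term space."""
    terms_a, terms_b = set(query_a.split()), set(query_b.split())
    dot = sum(weights.get(t, 0.0) ** 2 for t in terms_a & terms_b)
    norm_a = math.sqrt(sum(weights.get(t, 0.0) ** 2 for t in terms_a))
    norm_b = math.sqrt(sum(weights.get(t, 0.0) ** 2 for t in terms_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def merge_candidates(query, query_demand):
    """Shorter queries contained in `query`, highest cosine score first."""
    weights = term_weights(query_demand)
    terms = set(query.split())
    candidates = [
        other for other in query_demand
        if other != query and set(other.split()) < terms
    ]
    scored = [(other, cosine(query, other, weights)) for other in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For example, with made-up demand counts `{"nike airmax tennis shoes": 10, "nike airmax": 50, "tennis shoes": 40, "adidas shoes": 30}`, calling `merge_candidates("nike airmax tennis shoes", ...)` ranks “nike airmax” ahead of “tennis shoes”.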


Step 1: base cluster candidates

• Method for choosing the “base clusters” (initial states):
– Minimum frequency
– Supply threshold (Enough Inventory)
– Min and max token constraint (Length of queries)
– Heuristic constraints
  • Queries that have only numbers are not allowed: “10 5”
  • …
– Merge similar clusters into one
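The selection rules above could be expressed as a simple filter. This is a sketch only: the slides give no threshold values, so every default below (`min_frequency`, `min_supply`, `min_tokens`, `max_tokens`) is a hypothetical placeholder, as is the function name. The elided heuristics and the merge step are not covered.

```python
def is_base_cluster_candidate(query, frequency, supply,
                              min_frequency=100, min_supply=50,
                              min_tokens=1, max_tokens=6):
    """Return True if a user query may seed a base cluster.

    All thresholds are illustrative placeholders, not values from the slides.
    """
    tokens = query.split()
    # Minimum frequency: the query must be issued often enough.
    if frequency < min_frequency:
        return False
    # Supply threshold: enough inventory must match the query.
    if supply < min_supply:
        return False
    # Min and max token constraint on the query length.
    if not (min_tokens <= len(tokens) <= max_tokens):
        return False
    # Heuristic constraint: queries made only of numbers are rejected.
    if all(token.isdigit() for token in tokens):
        return False
    return True
```

With these placeholder thresholds, “10 5” is rejected regardless of its frequency or supply, while a frequent, well-supplied query such as “handmade quilt” passes.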


candidates merge

• 4.34M base clusters merged into 1.95M
• Example: the following base clusters

phrase(hand,made) phrase(king,s) queen quilt
phrase(hand,made) phrase(pink,s) quilt
phrase(hand,made) phrase(prae,owned) queen quilt
phrase(hand,made) queen quilt
phrase(hand,made) phrase(prae,owned) quilt
phrase(hand,made) quilt size twin
phrase(hand,made) quilt silk
phrase(hand,made) quilt twin
phrase(hand,made) phrase(patch,work) quilt
phrase(hand,made) quilt white
phrase(hand,made) phrase(king,size) quilt
phrase(hand,made) phrase(yo,yo,s) quilt
phrase(hand,made) quilt sale
phrase(hand,made) quilt red

are merged into

phrase(hand,made) quilt


Step 2: K-Means Clustering

[Figure: pipeline diagram with the following labels: Query to Items Data, Transaction Logs, Inventory Logs, Scoring Models, Base Cluster Generation, Generate Item Features, K-Means Clustering of Base Clusters, Split Clusters]


Clusters on Item Signature

[Figure: example clusters, each labeled with its keywords]
– apple ipod touch 4g clear film protector screen
– clarks women shoe pumps classics


Recommendation (online)


Performance System

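[Figure: two real-time pipelines that read from the Data Store; the stage order below is inferred from the diagram labels]

• SIR, ?similarTo(item)
– Input: Lost Item
– Stages: Cluster Assignment → SIR Query Formation (query) → Item Search (items) → Item Selection → SIR Ranking
– Output: Similar Items recommendations
– Data Store: Clusters, Inventory, Conceptual Knowledgebase

• RIR, ?relatedTo(item)
– Input: Bought Item
– Stages: Cluster Assignment (clusters) → related clusters → RIR Query Formation (queries) → Item Search (items) → Item Selection → RIR Ranking
– Output: Related Items recommendations
– Data Store: Clusters, Cluster-Cluster Relations, Inventory, Conceptual Knowledgebase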
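The online flow can be sketched roughly as below. Everything here is an assumption made for illustration: clusters are identified by their keyword strings (reusing examples from the Models slide), cluster assignment uses Jaccard overlap between title tokens and cluster keywords, the cluster keywords double as the search query, and item selection keeps only hits assigned to the target cluster. Ranking is omitted, and none of the function names come from the slides.

```python
# Cluster keywords and one relation taken from the Models slide examples.
CLUSTERS = [
    "samsung galaxy s4",
    "samsung galaxy s4 screen protector",
    "kitchenaid food processor",
]
RELATIONS = {"samsung galaxy s4": ["samsung galaxy s4 screen protector"]}


def assign_cluster(title, clusters):
    """Assumed rule: pick the cluster with the highest Jaccard token overlap."""
    title_tokens = set(title.lower().split())

    def jaccard(keywords):
        keyword_tokens = set(keywords.split())
        return len(title_tokens & keyword_tokens) / len(title_tokens | keyword_tokens)

    return max(clusters, key=jaccard)


def search(query, inventory):
    """Toy item search: titles that contain every query token."""
    wanted = set(query.split())
    return [title for title in inventory if wanted <= set(title.lower().split())]


def recommend(title, mode, clusters, relations, inventory):
    """mode="similar" targets the seed's own cluster, otherwise its related clusters."""
    seed_cluster = assign_cluster(title, clusters)
    targets = [seed_cluster] if mode == "similar" else relations.get(seed_cluster, [])
    results = []
    for target in targets:
        # Query formation: the target cluster's keywords serve as the query.
        for candidate in search(target, inventory):
            # Item selection: keep hits that belong to the target cluster.
            in_target = assign_cluster(candidate, clusters) == target
            if in_target and candidate != title and candidate not in results:
                results.append(candidate)
    return results
```

In this toy setup a seed titled “Samsung Galaxy S4 16GB black” yields other S4 listings in "similar" mode and screen protectors in "related" mode.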


Items in the same cluster


Similar Item Recommendations


Experimental Results

• A/B Tests comparing against legacy systems
– SIR legacy system
  • Completely online
  • Naïve approach of using seed item title as a search query
– RIR legacy system
  • Chen, Y. and J.F. Canny, Recommending ephemeral items at web scale, ACM SIGIR 2011
  • Collaborative Filtering on stable representations of items
– Significant improvements at the 90% confidence level
  • SIR resulted in 38.18% higher user engagement (CTR)
  • RIR resulted in 10.5% higher CTR
  • Statistically significant improvement in site-wide business metrics from both SIR & RIR
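For readers who want to see how figures of this kind are typically computed, here is a small sketch of relative CTR lift and a significance check. The slides do not say which statistical test was used; the pooled two-proportion z-test and the two-sided 90% critical value of 1.645 are assumptions, and the function names are hypothetical.

```python
import math


def relative_lift(ctr_control, ctr_treatment):
    """Relative improvement of the treatment CTR over the control CTR."""
    return (ctr_treatment - ctr_control) / ctr_control


def two_proportion_z(clicks_a, views_a, clicks_b, views_b):
    """Pooled two-proportion z statistic for CTR(b) versus CTR(a)."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    return (p_b - p_a) / std_err


def is_significant(z, critical=1.645):
    """Two-sided check; 1.645 corresponds to a 90% confidence level."""
    return abs(z) > critical
```

A reported "38.18% higher CTR" corresponds to `relative_lift(...) == 0.3818` in this notation; the underlying click and view counts are not given in the slides.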


Conclusion

• Balance between similarity and quality crucial in driving user engagement and conversion

• Clusters of similar items in the inventory
– Local clustering in the coverage set of user queries

• Offline models built using Map-Reduce
– Huge input datasets including inventory, clickstream and transactional data

• Efficient real-time performance system

• Currently deployed on ebay.com


Acknowledgments

• Current & Past team members
– Kranthi Chalasani
– Santanu Kolay
– Riyaaz Shaik
– Venkat Sundaranatha


WE’RE HIRING

Chu-Cheng Hsieh [email protected]