Flexible recommender systems based on graphs
|
Kernix: digital factory + data lab
Flexible recommender systems based on graphs
|
KERNIX
45 co-workers
500 projects
2 co-founders
€3.5M revenue
15 years of experience
10 books published
Digital factory + Data lab
CO-FOUNDERS
Fabrice Métayer and François-Xavier Bois, two EPITA engineers, combined their complementary profiles to found Kernix in 2001.
ABOUT KERNIX
Kernix’s core business consists of a digital factory and a data lab.
This dual expertise lets us support our clients from upstream phases (consulting, studies, proofs of concept) through to downstream phases (industrialization by production teams).
|
DATA LAB
Expertise, with clients and collaborations:
• Data pipelines: Cop21, TerraRush
• Predictive maintenance: ERDF
• Data visualization: SolarImpulse
• Recommender systems: PriceMinister, WikiDistrict, Clickalto, HobbyStreet
• Marketing automation: Performics, RadiumOne
• Open data: Accessible.net
|
GRAPH-ORIENTED RECOMMENDER SYSTEM
• Graph database
  – data is stored as nodes
    • label: the "type" of data stored in the node
    • properties: a collection of information describing the node
  – nodes are linked together by edges
    • type: describes the nature of the relation
  – query language: used to perform graph traversals
• Why graph-oriented recommender systems?
  – gather heterogeneous data in the same structure
  – explicitly take advantage of relationships
  – "meaningful" for humans
  – easy implementation
  – fast execution (no training)
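The graph model on this slide (labelled nodes with properties, typed edges, traversals) can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a graph database. The `PropertyGraph` class, its method names, and the sample nodes are all assumptions made for the example.

```python
class PropertyGraph:
    """Tiny in-memory property graph (illustrative only, hypothetical API)."""

    def __init__(self):
        self.nodes = {}   # node id -> {"label": ..., "properties": {...}}
        self.edges = {}   # source id -> [(edge type, target id), ...]

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = {"label": label, "properties": properties}

    def add_edge(self, source, edge_type, target):
        self.edges.setdefault(source, []).append((edge_type, target))

    def neighbors(self, node_id, edge_type):
        """Forward step: follow edges of the given type out of node_id."""
        return [target for (etype, target) in self.edges.get(node_id, [])
                if etype == edge_type]

    def sources(self, node_id, edge_type):
        """Reverse step: nodes with an edge of the given type into node_id."""
        return [source for source, out in self.edges.items()
                for (etype, target) in out
                if etype == edge_type and target == node_id]


# Sample data (made up for the example).
graph = PropertyGraph()
graph.add_node("u1", "User", name="Alice", city="Paris")
graph.add_node("c1", "Category", name="Pottery")
graph.add_node("w1", "Workshop", name="Wheel throwing")
graph.add_edge("u1", "follows", "c1")
graph.add_edge("w1", "related_to", "c1")

# Two-step traversal: categories the user follows, then workshops
# related to those categories.
followed = graph.neighbors("u1", "follows")
suggested = [w for c in followed for w in graph.sources(c, "related_to")]
print(followed, suggested)
```

No model is trained here: a recommendation is just the result of a traversal, which is what makes this approach quick to implement and run.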
|
USE CASE 1 : HOBBYSTREET
|
CONTEXT
Facilitate connections between craftsmen and private individuals
• Craftsmen: offer workshops (different categories, dates, prices)
• Individuals: follow workshops/categories, sign up for workshops
• Hobbystreet: handles registrations, scheduling, and payments, and proposes customized suggestions
|
DATA STRUCTURE
Nodes (label: properties)
• User: name, city
• Craftsman: name, activity
• Workshop: name, description, GPS coordinates
• Session: date, time, price, status, stock
• Category: name, activity
Edge types: follows, proposes, related to, instance of, participates
|
SUGGESTIONS : OVERALL STRATEGY
[Figure: three suggestion strategies, each drawn as a path through the graph]
• Category: User, Category 1 and 2, Workshops 1 to 4
• Similar descriptions (from LSA): User, Workshops 1 to 6
• Similar users (Usim): Users 1 to 3, Workshops 1 to 6
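Two of the strategies above, suggestions by category and by similar users, can be sketched as plain traversals. The sample data, the function names, and the similarity rule (two users are "similar" when they share at least one workshop) are assumptions for the sake of the example. The slides do not say how Usim is computed, and the LSA-based strategy is left out here.

```python
from collections import Counter

# Hypothetical sample data: followed categories, workshop categories,
# and workshop participation.
follows = {"alice": {"pottery", "woodwork"}}
workshop_category = {
    "w1": "pottery", "w2": "pottery", "w3": "woodwork", "w4": "sewing",
}
participates = {
    "alice": {"w1"},
    "bob": {"w1", "w2", "w3"},
    "carol": {"w1", "w3"},
    "dave": {"w4"},
}


def suggest_by_category(user):
    """Workshops in followed categories that the user has not attended."""
    seen = participates.get(user, set())
    followed = follows.get(user, set())
    return sorted(w for w, category in workshop_category.items()
                  if category in followed and w not in seen)


def suggest_by_similar_users(user):
    """Workshops attended by users sharing a workshop with `user`,
    ranked by how many such users attended them."""
    seen = participates.get(user, set())
    counts = Counter()
    for other, workshops in participates.items():
        if other != user and seen & workshops:   # assumed similarity rule
            counts.update(workshops - seen)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [workshop for workshop, _ in ranked]


print(suggest_by_category("alice"))
print(suggest_by_similar_users("alice"))
```

In a graph database each function would be a single traversal query. The dictionaries stand in for the `follows`, `related to`, and `participates` edges.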
|
USE CASE 2 : KONBINI
|
Context
“... multi format media company
producing its own mix of culture, art
and news content. It promotes
online journalism, advocating an
emphasis on pop culture and a
commitment to develop local
emerging talents.”
“... became one of the first
websites to put Social Media
platforms at the heart of their
strategy.”
Issue: ~90% bounce rate (users leaving after viewing a single page)
Solution: recommending interesting articles on the visited pages should improve the user experience.
|
Entities
Multiple websites [US, England, Mexico, France]
• French posts [693]
• English posts [417]
• US posts [364]
• Mexican posts [149]
• Authors [56]
• Categories [534]
Examples of node properties
blog_id: 9
post_id: 217628
post_date: 20151007
slug: rihanna-thinks-rachel...
boost: 0
viewed_count: 0
facebook_count: 148
twitter_count: 0
|
Recommendation principles
For each post, we recommend a list of other posts
based on the relations they share with the initial post:
- semantic similarity of the contents [LSA]
- number of common categories
- number of common authors
And also on their own properties:
- freshness
- social counts
- manual boost
Once the graph is constructed, these recommendations
can be obtained with a single Cypher query.
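One way to combine these signals is a weighted sum, sketched below in Python. This is not the Cypher query used in the project. The weights, the freshness formula, and the `score` and `recommend` functions are assumptions, since the slides do not say how the signals are combined. The property names follow the node properties listed earlier, except that `post_date` is held as a `date` rather than an integer, and the LSA similarities are assumed to be precomputed.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Post:
    post_id: int
    post_date: date
    categories: set = field(default_factory=set)
    authors: set = field(default_factory=set)
    facebook_count: int = 0
    twitter_count: int = 0
    boost: int = 0


# Assumed weights, chosen only to make the example readable.
WEIGHTS = {"semantic": 5.0, "category": 1.0, "author": 1.0,
           "freshness": 2.0, "social": 0.01, "boost": 1.0}


def score(source, candidate, semantic_similarity, today):
    """Weighted sum of shared-relation signals and the candidate's own properties."""
    age_days = max((today - candidate.post_date).days, 0)
    freshness = 1.0 / (1.0 + age_days)   # newer posts score higher
    social = candidate.facebook_count + candidate.twitter_count
    return (WEIGHTS["semantic"] * semantic_similarity
            + WEIGHTS["category"] * len(source.categories & candidate.categories)
            + WEIGHTS["author"] * len(source.authors & candidate.authors)
            + WEIGHTS["freshness"] * freshness
            + WEIGHTS["social"] * social
            + WEIGHTS["boost"] * candidate.boost)


def recommend(source, candidates, similarities, today, k=3):
    """Top-k post ids; `similarities` maps post_id -> precomputed LSA similarity."""
    ranked = sorted(
        (c for c in candidates if c.post_id != source.post_id),
        key=lambda c: score(source, c, similarities.get(c.post_id, 0.0), today),
        reverse=True,
    )
    return [c.post_id for c in ranked[:k]]


# Made-up posts for illustration.
today = date(2015, 10, 8)
a = Post(1, date(2015, 10, 7), {"music"}, {"ann"})
b = Post(2, date(2015, 10, 7), {"music"}, {"ann"}, facebook_count=148)
c = Post(3, date(2015, 1, 1), {"sport"}, {"bob"})
top = recommend(a, [a, b, c], {2: 0.8, 3: 0.1}, today)
print(top)
```

Post 2 ranks first because it shares a category and an author with the source post, is recent, and has social counts. Post 3 shares nothing and is old.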
|
Conclusion and outlook
|
Stacks and Workflows
[Figure: recommendation workflows for the Konbini and Hobbystreet websites]
• Live recommendation for dynamic interactions: POST content, GET recommendations
• Cached recommendation for high availability needs: POST content, daily cached recommendations, GET recommendations
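The difference between the live and the cached workflow can be illustrated with a small wrapper that serves stored recommendations until they are a day old. This sketch uses a lazy time-to-live cache, which is an assumption: the slide only mentions a daily cache, which could equally be a batch job. `CachedRecommender` and `live_recommendations` are hypothetical names.

```python
import time


class CachedRecommender:
    """Serve stored recommendations, recomputing after `ttl_seconds`."""

    def __init__(self, compute, ttl_seconds=24 * 3600, clock=time.time):
        self._compute = compute   # function: item id -> list of recommended ids
        self._ttl = ttl_seconds
        self._clock = clock       # injectable clock, handy for testing
        self._cache = {}          # item id -> (timestamp, recommendations)

    def get(self, item_id):
        now = self._clock()
        entry = self._cache.get(item_id)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]                  # fresh entry: no recomputation
        recommendations = self._compute(item_id)   # missing or stale entry
        self._cache[item_id] = (now, recommendations)
        return recommendations


calls = []


def live_recommendations(item_id):
    """Stand-in for a live graph query (made up for the example)."""
    calls.append(item_id)
    return [item_id + 1, item_id + 2]


fake_now = [0.0]
cached = CachedRecommender(live_recommendations, clock=lambda: fake_now[0])
first = cached.get(10)
second = cached.get(10)        # served from the cache
fake_now[0] = 24 * 3600 + 1    # a day later, the entry is stale
third = cached.get(10)         # recomputed
print(first, len(calls))
```

Calling `live_recommendations` directly corresponds to the live workflow. Wrapping it trades freshness for availability.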
|
Outlook
Improve semantic analysis:
• exploit the similarity of short descriptions (tweets, comments, …). PhD thesis on the subject.
Assess recommendation quality:
• A/B testing, but this needs a production deployment.
• Offline testing? It gives no real assessment of the impact of the recommendations made.
• Ratings from a pool of testers?