Near real-time recommendations in enterprise social networks

#AICRECSYS

description

- How to compute recommendations over a graph with 40M edges and 11M nodes in 0.2 s (200 ms)
- A new perspective on near real-time social recommendations in enterprise social platforms using Linked Data
- A recommender system that is easy to integrate with social networks and legacy data
- An application of data analytics in an enterprise context

Transcript of Near real-time recommendations in enterprise social networks

Page 1: Near real-time recommendations in enterprise social networks

#AICRECSYS

Page 2: Near real-time recommendations in enterprise social networks

ADVANSSE: Advances in Social Semantic Enterprise

HTTP://ADVANSSE.DERI.IE/

MACIEJ DABROWSKI BENJAMIN HEITMANN CONOR HAYES KEITH GRIFFIN

10TH JULY 2013

Page 3: Near real-time recommendations in enterprise social networks

About me: MACIEJ DABROWSKI

[email protected]

lecturerAt

co-PI

contact

co-PI

worksWith

researcherAt

graduated

name

Page 4: Near real-time recommendations in enterprise social networks

Overview

THIS TALK

RESEARCH

INDUSTRY

1.  WHY?

2.  WHAT?

3.  HOW?

4.  TECHNICAL DECISIONS

5.  LESSONS LEARNED

Page 5: Near real-time recommendations in enterprise social networks

Why? What? How?

technical considerations

lessons learned

Page 6: Near real-time recommendations in enterprise social networks

Various information domains

preferences

recommendations

implicit connections

Page 7: Near real-time recommendations in enterprise social networks

User profile

TRAVEL

FOOD

SPORTS

POLITICS ??

Page 8: Near real-time recommendations in enterprise social networks

Use Case: Enterprise Social Web

Page 9: Near real-time recommendations in enterprise social networks

Enterprise social web ENTERPRISE INFORMATION SPACE

MARKETING

DEVELOPMENT

R & D

ANDREW

BOB

CECILIA

DANNY

Page 10: Near real-time recommendations in enterprise social networks

Limited information flow

MARKETING

DEVELOPMENT

R & D

GREAT TOOL!

MEETING IBM

TALK BY DERI

ANDREW

BOB

CECILIA

DANNY

ENTERPRISE INFORMATION SPACE

Page 11: Near real-time recommendations in enterprise social networks

Disconnected Social Networks

?  

ANDREW

BOB

CECILIA

DANNY

MARKETING

DEVELOPMENT

R & D

Page 12: Near real-time recommendations in enterprise social networks

Distributed Social Platforms

?  

MARKETING

DEVELOPMENT

R & D

Page 13: Near real-time recommendations in enterprise social networks
Page 14: Near real-time recommendations in enterprise social networks

Problem 1: information overload and discovery

Page 15: Near real-time recommendations in enterprise social networks

Problem 2: data level issues

DISTRIBUTION

MULTIPLE DOMAINS AND TYPES OF ENTITIES

PEOPLE

INTERESTS

CONTENT

Page 16: Near real-time recommendations in enterprise social networks

Requirements - personalization

USE BACKGROUND KNOWLEDGE

ALLOW CROSS-DOMAIN MULTI-SOURCE PERSONALIZATION

EXPLOIT SOCIAL GRAPH

ALLOW REAL-TIME APPLICATIONS

Page 17: Near real-time recommendations in enterprise social networks

Requirements - data

DATA LEVEL:
•  FLEXIBLE
•  COMPACT
•  ENABLE CRUD
•  GRAPH?

TRANSPORT PROTOCOL:
•  RELIABLE
•  EFFICIENT
•  PUBSUB?

Page 18: Near real-time recommendations in enterprise social networks

What?

A PLATFORM BASED ON OPEN STANDARDS THAT IS EASILY PLUGGABLE TO EXISTING INFRASTRUCTURES AND THAT EXPLOITS LEGACY INFORMATION, SOCIAL GRAPH AND INTEREST GRAPH TO PROVIDE A PERSONALIZED INFORMATION “DASHBOARD” IN NEAR REAL-TIME.

Page 19: Near real-time recommendations in enterprise social networks

use cases

Page 20: Near real-time recommendations in enterprise social networks

HOW? A look inside

Page 21: Near real-time recommendations in enterprise social networks

Step 1: Exploit distributed (social) graphs

http://www.insidefacebook.com/wp-content/uploads/2013/06/shutterstock_107108318.jpg

Page 22: Near real-time recommendations in enterprise social networks

Step 2: Exploit interest graphs

BENEFITS OF USING INTEREST GRAPHS:

1.  FLEXIBLE SOURCE OF BACKGROUND KNOWLEDGE

2.  ANY DATASET CAN BE “PLUGGED-IN” IF NEEDED

3.  CROSS-DOMAIN RECOMMENDATIONS

4.  VERY GOOD IN DISCOVERING INTERESTING RECOMMENDATIONS

OUR APPROACH: SPREADING ACTIVATION

Page 23: Near real-time recommendations in enterprise social networks

Interest graphs

DERI

Maciej

BlogPost2

Maurice

"Emerging Technology"

http://dbpedia.org/resource/Data_analytics

http://dbpedia.org/resource/Emerging_technologies

sioc:creator_of

sioc:topic

worksat

interest (recommended)

interest

owl:sameAs

Expanded User Profile (EUP): includes both original and recommended interests

Social Software Entities

Additional Profile Knowledge

External Background Knowledge

(DBPedia + domain datasets)
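
A minimal sketch of how the labels on this slide could fit together as triples, and how an Expanded User Profile might be derived from them. The edge layout is one plausible reading of the diagram, not the authors' data, and the helper names `objects` and `expanded_user_profile` are hypothetical. The "recommended" step here simply follows `sioc:creator_of`, `sioc:topic` and `owl:sameAs`; it is a stand-in for the spreading activation used in the talk.

```python
# Hypothetical triples: one plausible reading of the slide's diagram.
DBP = "http://dbpedia.org/resource/"
TRIPLES = [
    ("Maciej", "worksat", "DERI"),
    ("Maciej", "sioc:creator_of", "BlogPost2"),
    ("BlogPost2", "sioc:topic", "Emerging Technology"),
    ("Emerging Technology", "owl:sameAs", DBP + "Emerging_technologies"),
    ("Maciej", "interest", DBP + "Data_analytics"),
]


def objects(subject, predicate):
    """Return all objects of triples matching the subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def expanded_user_profile(user):
    """Collect stated interests plus topics of the user's own posts."""
    original = set(objects(user, "interest"))
    recommended = set()
    for post in objects(user, "sioc:creator_of"):
        for topic in objects(post, "sioc:topic"):
            # Prefer the linked background-knowledge resource when one exists.
            linked = objects(topic, "owl:sameAs")
            recommended.update(linked or [topic])
    return {"original": original, "recommended": recommended - original}


profile = expanded_user_profile("Maciej")
print(profile)
```

The profile keeps original and recommended interests apart, so a dashboard could show why an item was suggested.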

Page 24: Near real-time recommendations in enterprise social networks

Our Approach

A PLATFORM FOR SOCIAL NETWORKS:

§  ENTERPRISE FOCUS: PEOPLE, COMMUNITIES, INFORMATION

§  EFFICIENCY USING XMPP PUBSUB AND SPARQL 1.1 UPDATE

§  EXPLOIT INTEREST GRAPH AND VARIOUS DATA SOURCES TO PROVIDE PERSONALIZATION THROUGH SOPHISTICATED NEAR REAL-TIME RECOMMENDATIONS

Page 25: Near real-time recommendations in enterprise social networks

Demonstrator

EASY TO INTEGRATE WITH CISCO INFRASTRUCTURE

OPEN STANDARDS (XMPP, SPARQL 1.1 UPDATE)

SCALABLE RECOMMENDATIONS BASED ON A SOCIAL GRAPH WITH OVER 10M ENTITIES AND 40M EDGES, COMPUTED IN UNDER 1 SECOND (0.2 S ON AVERAGE).

MORE DETAILS: HTTP://ADVANSSE.DERI.IE/
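
To make the "SPARQL 1.1 Update" part concrete, here is a small sketch of the kind of request a connected platform could publish when a user creates a post. It only builds the update string; how that string is wrapped in an XMPP PubSub item is not shown in the slides and is left out. The `http://example.org/` IRIs and the function name `insert_data_update` are assumptions made for illustration.

```python
SIOC = "http://rdfs.org/sioc/ns#"


def insert_data_update(triples):
    """Render (subject, predicate, object) IRI triples as an INSERT DATA request."""
    body = "\n".join(f"  <{s}> <{p}> <{o}> ." for s, p, o in triples)
    return "INSERT DATA {\n" + body + "\n}"


# Hypothetical event: a user publishes a post tagged with a DBpedia topic.
update = insert_data_update([
    ("http://example.org/user/maciej", SIOC + "creator_of",
     "http://example.org/post/2"),
    ("http://example.org/post/2", SIOC + "topic",
     "http://dbpedia.org/resource/Data_analytics"),
])
print(update)
```

Because the payload is a standard update request, the receiving side could apply it to any SPARQL 1.1 compliant store without platform-specific parsing.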

Page 26: Near real-time recommendations in enterprise social networks

demonstrator

Page 27: Near real-time recommendations in enterprise social networks

Prototype stats

SOCIAL NETWORK GRAPH:
•  100S USERS
•  100S POSTS
•  500+ TAGS
•  2000+ ENTITIES
•  15000+ EDGES

Saffron.deri.ie

BACKGROUND KNOWLEDGE GRAPH:
•  11M ENTITIES
•  40M EDGES

CROSS-DOMAIN GRAPH:
•  3956 RESEARCH ARTICLES
•  LANGUAGE CONFERENCES

Page 28: Near real-time recommendations in enterprise social networks

Why? What? How?

technical considerations

lessons learned

Page 29: Near real-time recommendations in enterprise social networks

Technical considerations

ALGORITHM:
•  SEMANTIC NETWORK
•  LARGE DATASET
•  ITERATIVE GRAPH ALGORITHM
•  STATEFUL NODES
•  EMBEDDING OF DOMAIN LOGIC

Page 30: Near real-time recommendations in enterprise social networks

Technical considerations

NON-NATIVE IMPORT OF RDF

STARTUP TIME WITH DBPEDIA
•  12 MIN TO LOAD ON 24 CORES, 96 GB RAM

PARALLEL PROCESSING OF ACTIVATIONS
•  STATE FOR EACH USER AT EACH NODE

SCALABILITY ISSUES

LACK OF GLOBAL ALGORITHM CONTROL

IMMATURE CODE BASE, LACK OF DOCUMENTATION

Page 31: Near real-time recommendations in enterprise social networks

Technical considerations

NATIVE SUPPORT FOR RDF

DBPEDIA (5.46 GB) COMPRESSED TO 436 MB

LOW MEMORY REQUIREMENTS

LOW STARTUP TIME (90 S)

FAST QUERY ACCESS (< 1 MS)

Page 32: Near real-time recommendations in enterprise social networks

Server design

XMPP

SPREADING ACTIVATION

HDT

ADVANSSE connected social platform

XMPP client: Ignite Smack

Web application: Tomcat + Servlet

RDF store: Jena Fuseki

ADVANSSE server

Personalisation component

Recommendation algorithm

XMPP

R/W RDF store: Jena Fuseki

XMPP

Java API

XMPP server: Ignite OpenFire

XMPP client: Ignite Smack

Fast, R/O RDF store: HDT

SPARQL

SPARQL + Java API

Java API + SPARQL

Java API

SPARQL

Java API

File import

Link resolver RDF store: Jena Fuseki

Page 33: Near real-time recommendations in enterprise social networks

configuration

•  DISTANCE CONSTRAINT DISABLED
•  FANOUT CONSTRAINT ENABLED
•  10 TARGET ACTIVATIONS
•  ACTIVATION THRESHOLD 0.5
•  INITIAL ACTIVATION 4.0
•  MAXIMUM OUT EDGES 500
•  MAXIMUM OF 10 WAVES AND 1 PHASE
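
The settings above can be read against a simplified spreading activation loop. The sketch below is not the ADVANSSE implementation: it assumes that the fanout constraint divides a node's activation evenly among its out edges, that nodes with more out edges than the maximum do not spread, that each node fires once, and that the run stops once enough target nodes are active or the wave limit is reached. Phases and the (disabled) distance constraint are not modelled. The toy graph and the names `spread` and `is_target` are hypothetical.

```python
from collections import defaultdict

CONFIG = {
    "initial_activation": 4.0,
    "activation_threshold": 0.5,
    "target_activations": 10,
    "max_out_edges": 500,
    "max_waves": 10,
}


def spread(graph, seeds, is_target, cfg=CONFIG):
    """Return {target node: activation} reached from the seed nodes."""
    seeds = set(seeds)
    activation = defaultdict(float)
    for seed in seeds:
        activation[seed] = cfg["initial_activation"]

    def active_targets():
        return {n: a for n, a in activation.items()
                if is_target(n) and n not in seeds
                and a >= cfg["activation_threshold"]}

    frontier, fired = set(seeds), set()
    for _ in range(cfg["max_waves"]):
        incoming = defaultdict(float)
        for node in frontier:
            neighbours = graph.get(node, [])
            # Fire once, only above threshold; skip hubs over the edge limit.
            if (node in fired
                    or activation[node] < cfg["activation_threshold"]
                    or not neighbours
                    or len(neighbours) > cfg["max_out_edges"]):
                continue
            fired.add(node)
            share = activation[node] / len(neighbours)  # assumed fanout rule
            for neighbour in neighbours:
                incoming[neighbour] += share
        if not incoming:
            break
        for node, value in incoming.items():
            activation[node] += value
        frontier = set(incoming)
        if len(active_targets()) >= cfg["target_activations"]:
            break
    return active_targets()


# Hypothetical toy graph: a user, two interests, two posts.
graph = {
    "user:maciej": ["topic:data_analytics", "topic:semantic_web"],
    "topic:data_analytics": ["post:1", "post:2"],
    "topic:semantic_web": ["post:2"],
}
result = spread(graph, {"user:maciej"}, lambda n: n.startswith("post:"))
print(sorted(result.items(), key=lambda kv: -kv[1]))
```

In this toy run `post:2` is reachable through both interests and so accumulates more activation than `post:1`, which is the ranking signal a recommender would use.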

Page 34: Near real-time recommendations in enterprise social networks

stats

DATASET:
•  371 USERS
•  6 INTERESTS ON AVERAGE
•  DEGREE 2-5, UP TO 51

200 MS AVERAGE EXECUTION

85% COVERAGE

Page 35: Near real-time recommendations in enterprise social networks

The value

SOCIAL CAPITAL IN ENTERPRISE SOCIAL NETWORKS IS NOT FULLY EXPLOITED.
ENTERPRISE SOCIAL PLATFORMS ARE DISTRIBUTED AND INCLUDE VARIOUS SOURCES OF INFORMATION.
VALUABLE INFORMATION IN AN ORGANIZATION IS NOT DISCOVERED BY THE RELEVANT EMPLOYEES.

DISCOVER AND CONNECT WITH RELEVANT PEOPLE IN THE ORGANIZATION.
AGGREGATE INFORMATION FROM VARIOUS DISTRIBUTED SOCIAL PLATFORMS USING OPEN STANDARDS.
PROVIDE NEAR REAL-TIME PERSONALIZATION BASED ON LARGE, DYNAMIC GRAPH DATA.

Page 36: Near real-time recommendations in enterprise social networks

Why? What? How?

technical considerations

lessons learned

Page 37: Near real-time recommendations in enterprise social networks

Lessons learned

•  GREATER RELEVANCE TO REAL PROBLEMS
•  CLEARER REQUIREMENTS (AND MORE)
•  ACCESS TO ACTUAL USAGE DATA (REAL USERS)

•  PATENTS VS. PUBLISHING

•  PROTOTYPE INTEGRATION CONSUMES RESOURCES
•  MORE FOCUS ON FEATURE DEVELOPMENT
•  LESS EXPLORATION AND HYPOTHESIS TESTING

Page 38: Near real-time recommendations in enterprise social networks

major considerations

ACCESS TO INDUSTRY DATA

INTEGRATION WITH THE PRODUCT?

https://www.keytrac.net/assets/industry-social-networks.jpg

http://www.autointhenews.com/wp-content/uploads/2010/05/volvo-s60-crash-video-image.jpg

Page 39: Near real-time recommendations in enterprise social networks

Summary

PROBLEM
§  INFORMATION OVERLOAD AND INEFFICIENT INFORMATION DISCOVERY IN DISTRIBUTED ENTERPRISE SOCIAL NETWORKS

SOLUTION
§  RECOMMENDER SYSTEM THAT EXPLOITS SOCIAL GRAPH
§  UTILIZE INTEREST GRAPH AND LEGACY INFORMATION
§  NEAR REAL-TIME PERSONALIZATION

TECHNOLOGY
§  OPEN SOURCE COMPONENT FOR RDF DATA AGGREGATION USING XMPP AND SPARQL 1.1 UPDATE
§  PERSONALIZATION COMPONENT BASED ON SPREADING ACTIVATION APPLICABLE TO MULTI-SOURCE, CROSS-DOMAIN DATA

Page 40: Near real-time recommendations in enterprise social networks

ENORMOUS VALUE IN INDUSTRY-ACADEMIA COLLABORATIONS

CONTACT: [email protected]

@MACDAB