The 5 Graphs of Love

Post on 15-Jan-2015


Recorded webinar: neotechnology.com/webinar-five-graphs-love

The iDating industry cares about interactions and connections, and the two concepts are closely linked: if someone has a connection to another person, through a shared friend or a shared interest, they are much more likely to interact. Graph databases are optimized for querying connections between people, things, interests, or really anything that can be connected. Dating sites and apps worldwide have begun to use graph databases to gain a competitive advantage. Neo4j provides thousand-fold performance improvements on highly connected data and massive agility benefits over relational databases, enabling new levels of performance and insight. Amanda Laucher discusses the five graphs of love and how companies like eHarmony, Hinge and AreYouInterested.com are now using graph algorithms to create more interactions and connections.

Transcript of The 5 Graphs of Love

Amanda Laucher Neo Technology @pandamonial

(Neo4j)-[:POWERS]->(Love)

Most of your favorite dating sites

The 5 Graphs of Love

• The Friends-of-Friends Graph

• The Passion Graph

• The Location Graph

• The Safety Graph

• The Poser Graph

Meet Jeremy...

๏ from: California

๏ appearance: very handsome

๏ personality: super friendly nerd

๏ interests: piano, coding

Jeremy has some friends

๏ Kerstin: his sister

๏ Peter: his buddy

๏ Andreas: his coworker

[Diagram: Jeremy, Kerstin, Peter, Andreas]

His friends introduced more friends

๏ Michael: master hacker, divorced, 2 kids

๏ Johan: technology sage, likes fast cars

๏ Madelene: polyglot journalist, loves dogs

๏ Allison: marketing maven, likes long walks on the beach

[Diagram: Johan, Kerstin, Allison, Andreas, Michael, Madelene, Jeremy, Peter]

So, we have a bunch of people

๏ how do we know they are friends?

๏ either ask each pair: are you friends?

๏ or, we can add explicit connections

๏ Twitter, Facebook, LinkedIn, etc.

[Diagram: Johan, Kerstin, Allison, Andreas, Michael, Madelene, Jeremy, Peter]

This is really just data

๏ it's just a graph

[Diagram: Johan, Kerstin, Allison, Anna, Adam, Andreas, Michael, Madelene, Jeremy, Peter]
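To make "it's just a graph" concrete, here is a minimal Python sketch that stores the people above as an undirected adjacency map. The three edges from Jeremy come from the slides; the deck does not say which friend introduced which newcomer, so those pairings, and the helper name `add_friendship`, are assumptions made purely for illustration.

```python
from collections import defaultdict

# Undirected friendship graph: person -> set of direct friends.
graph = defaultdict(set)

def add_friendship(a, b):
    """Record a mutual friendship (hypothetical helper, not a Neo4j API)."""
    graph[a].add(b)
    graph[b].add(a)

# Stated in the slides: Jeremy's sister, buddy and coworker.
for friend in ("Kerstin", "Peter", "Andreas"):
    add_friendship("Jeremy", friend)

# Assumed pairings: the deck does not say who introduced whom.
add_friendship("Kerstin", "Madelene")
add_friendship("Peter", "Johan")
add_friendship("Andreas", "Michael")
add_friendship("Andreas", "Allison")

print(sorted(graph["Jeremy"]))  # Jeremy's direct friends
print(len(graph))               # number of people in the graph
```

Nothing here is specific to dating: nodes are keys, relationships are set members, and any connected data fits the same shape.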

A graph?

Yes, a graph...

๏ you know the common data structures

• linked lists, trees, object "graphs"

๏ a graph is the general purpose data structure

• suitable for any connected data

๏ well-understood patterns and algorithms

• studied since Leonhard Euler's Seven Bridges of Königsberg (1736)

• Codd's Relational Model (1970)

• not a new idea, just an idea whose time is now

How can you use this? With a Graph Database

A graph database...

๏ optimized for the connections between records

๏ really, really fast at querying across records

๏ a database: transactional with the usual operations

๏ “A relational database may tell you the average age of everyone here, but a graph database will tell you who is most likely to buy you a beer later.”

What’s love got to do with it?

Friends of Friends Graph

According to SNAP Interactive, if you are a female user, you have a:

๏ 4% likelihood of interacting with a stranger

๏ 10% likelihood of interacting with a friend of a friend

๏ 7% likelihood of interacting with a 3rd-degree connection (friend of friend of friend)

๏ Connections mean a much larger number of interactions!

[Diagram: Jeremy, Peter, Johan, Jennifer, Allison, Anna, Adam, Andreas, Michael, Madelene]
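The size of that effect is easier to see as a ratio against the stranger baseline. The short Python sketch below only redoes the arithmetic on the three percentages quoted on the slide; the dictionary keys are illustrative labels, not terms from SNAP Interactive.

```python
# Interaction likelihoods quoted on the slide (attributed to SNAP Interactive).
likelihood = {
    "stranger": 0.04,
    "friend of friend": 0.10,
    "3rd degree connection": 0.07,
}

baseline = likelihood["stranger"]

# Relative lift of each connection type over contacting a stranger.
lift = {kind: p / baseline for kind, p in likelihood.items()}

for kind, factor in lift.items():
    print(f"{kind}: {factor:.2f}x the stranger baseline")
```

On these figures a friend of a friend is 2.5 times as likely to interact as a stranger, and a 3rd-degree connection 1.75 times.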

Friends of friends = larger dating pool

Friends

[Diagram: Peter, Jennifer, Andreas, Jeremy]

Friends of friends

[Diagram: Peter, Johan, Jennifer, Allison, Andreas, Jeremy]

[Diagram: Madelene, Frank, Amanda, Jeremy]

Friends of friends of friends

Find Jeremy’s FoFs

Demo: Find who Jeremy shares the most friends with
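The demo queries themselves are not captured in this transcript. As a rough stand-in, the Python sketch below finds Jeremy's friends of friends and ranks them by how many friends they share with him. The friendship list is assumed sample data, and `shared_friend_counts` is a hypothetical helper rather than anything shown in the webinar.

```python
from collections import Counter

# Assumed sample friendships; the actual demo data is not in the transcript.
friendships = [
    ("Jeremy", "Peter"), ("Jeremy", "Andreas"), ("Jeremy", "Jennifer"),
    ("Peter", "Johan"), ("Andreas", "Johan"), ("Jennifer", "Johan"),
    ("Peter", "Allison"), ("Andreas", "Allison"),
    ("Jennifer", "Madelene"),
]

# Build an undirected adjacency map: person -> set of direct friends.
friends = {}
for a, b in friendships:
    friends.setdefault(a, set()).add(b)
    friends.setdefault(b, set()).add(a)

def shared_friend_counts(person):
    """For each friend-of-friend, count the friends they share with `person`."""
    counts = Counter()
    for friend in friends[person]:
        for fof in friends[friend]:
            # Skip the person themselves and anyone who is already a direct friend.
            if fof != person and fof not in friends[person]:
                counts[fof] += 1
    return counts

ranking = shared_friend_counts("Jeremy").most_common()
print(ranking)
```

In a graph database the same question is a two-hop traversal along friendship relationships followed by a count, so it does not need one self-join per hop.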

Complicated Relationships

[Diagram: Jake, Peter, Jennifer and Andreas, connected by :WORKS_FOR, :FRIENDS and :FRIENDS relationships]

Friends

[Diagram: the same four people with :WANTS_TO_DATE relationships added]

Awkward!!

Friends of Friends

[Diagram: the same four people with :WANTS_TO_DATE and :NO_DATE relationships]

Too complex!

Friends of Friends of Friends

[Diagram: the same four people with :WANTS_TO_DATE and :NO_DATE relationships]

Meet Jon...

๏ from: UK

๏ seeking: Females

๏ appearance: Hot, hot, hot!

๏ personality: Fun loving, easy going

๏ interests: cooking, chemistry

Location Graph

Jon wants to find a date and refuses to have a long-distance relationship.

Location Graph (*Neo4j Spatial)
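The slide credits Neo4j Spatial for the location query, but the query itself is not in the transcript. The sketch below shows the underlying idea in plain Python, without using the Neo4j Spatial API: compute great-circle distances with the haversine formula and keep only candidates within a cut-off. The coordinates, the candidate placements and the 300 km threshold are all assumptions; the deck only says that Jon is from the UK.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical locations; the deck only says Jon is from the UK.
jon = (51.5074, -0.1278)  # London
candidates = {
    "Jennifer": (53.4808, -2.2426),  # Manchester
    "Katie": (55.9533, -3.1883),     # Edinburgh
    "Greta": (40.7128, -74.0060),    # New York
}

MAX_DISTANCE_KM = 300  # assumed cut-off for "not long distance"

nearby = sorted(
    name for name, (lat, lon) in candidates.items()
    if haversine_km(jon[0], jon[1], lat, lon) <= MAX_DISTANCE_KM
)
print(nearby)
```

A spatial index answers the same "within N km" question without scanning every candidate, which is what matters once the pool is large.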

Passion Graph

Jon wants to find someone he can share his passions with.

Match Specific Interests

[Diagram: Jon -[:REPORTED_INTEREST]-> Cooking; other people shown: Jennifer, Anne, Julia]
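Matching on a specific reported interest amounts to finding everyone else connected to the same interest node. Here is a minimal Python sketch of that lookup. Jon's interests come from the "Meet Jon" slide; the other people's interests are assumed sample data, and `matches_for` is a hypothetical helper.

```python
# Reported interests per person. Jon's come from the "Meet Jon" slide;
# everyone else's are assumed sample data.
reported_interest = {
    "Jon": {"cooking", "chemistry"},
    "Jennifer": {"cooking", "football"},
    "Anne": {"cooking", "piano"},
    "Julia": {"cooking"},
    "Maria": {"long walks on the beach"},
}

def matches_for(person, interest):
    """People other than `person` who reported the given interest."""
    return sorted(
        other for other, interests in reported_interest.items()
        if other != person and interest in interests
    )

print(matches_for("Jon", "cooking"))
```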

Safety Graph

Jon uses social networks

Let’s dig into his Twitter

He follows some strange people

…and tweets about strange things!

Some basic word analysis

Let’s update based on behavior

[Diagram: Jon with a :DEMONSTRATED_INTEREST relationship]

Any ladies OK with this?

[Diagram: Jennifer, Jane, Maria]

Passion Graph

Jon loves the New England Patriots

[Diagram: Jon with a :HAS_INTEREST relationship]

[Diagram: a Sports hierarchy built up from :IS_A and :HAS_TEAM relationships, with Jon attached]

Find ladies who like football

[Diagram: Jennifer, Katie, Greta]
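The point of the hierarchy is that Jon's interest in one team can be generalized to the sport, so that anyone interested in football or in any football team becomes a match. The Python sketch below illustrates that roll-up. The :IS_A and :HAS_TEAM relationship types and the Patriots appear in the slides; the direction of those relationships, the other teams and everyone's interests are assumptions.

```python
# sport -[:IS_A]-> category (direction assumed; only the type names are in the slides)
IS_A = {"Football": "Sports", "Baseball": "Sports"}

# sport -[:HAS_TEAM]-> team (the teams other than the Patriots are assumed)
HAS_TEAM = {
    "Football": ["New England Patriots", "Green Bay Packers"],
    "Baseball": ["Boston Red Sox"],
}

# person -[:HAS_INTEREST]-> sport or team (assumed sample data apart from Jon)
HAS_INTEREST = {
    "Jon": {"New England Patriots"},
    "Jennifer": {"Football"},
    "Katie": {"Green Bay Packers"},
    "Greta": {"New England Patriots"},
    "Maria": {"Boston Red Sox"},
}

def sport_of(interest):
    """Resolve an interest (a sport or one of its teams) to its sport, or None."""
    if IS_A.get(interest) == "Sports":
        return interest
    for sport, teams in HAS_TEAM.items():
        if interest in teams:
            return sport
    return None

def fans_of(sport, exclude=()):
    """People with at least one interest that rolls up to `sport`."""
    return sorted(
        person for person, interests in HAS_INTEREST.items()
        if person not in exclude and any(sport_of(i) == sport for i in interests)
    )

jons_sport = sport_of("New England Patriots")
football_fans = fans_of(jons_sport, exclude={"Jon"})
print(jons_sport, football_fans)
```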

Poser Graph

Jon has no luck with online dating. All of his interactions are with spam profiles.

Find real people with at least 1 social network & minimum 2 posts

Find ladies who aren’t spam bots
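The "real person" rule on the slide can be read as a simple filter. The sketch below is one possible interpretation in Python: it assumes the two-post minimum applies to the total across all linked networks, which the slide does not specify. The `Profile` class, its field names and the sample profiles are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Profile:
    name: str
    # social network name -> number of posts found there (assumed shape)
    posts_by_network: dict = field(default_factory=dict)

def looks_real(profile, min_networks=1, min_posts=2):
    """Slide rule: at least 1 linked social network and at least 2 posts in total."""
    linked_networks = len(profile.posts_by_network)
    total_posts = sum(profile.posts_by_network.values())
    return linked_networks >= min_networks and total_posts >= min_posts

# Hypothetical profiles for illustration.
profiles = [
    Profile("Jennifer", {"Twitter": 40, "Facebook": 12}),
    Profile("Hot Singles 4U"),               # no linked network at all
    Profile("Lonely Bot", {"Twitter": 1}),   # a network, but only one post
]

real_people = [p.name for p in profiles if looks_real(p)]
print(real_people)
```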

Put it all together

Find Jon’s perfect date
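The combined query is not in the transcript, but its effect can be sketched as a set intersection over the shortlists that the earlier slides display. Reading each group of three names as the result of the corresponding query is an inference from the transcript, the shortlist labels are paraphrases, and the location and spam-filter results are omitted because the deck does not list them.

```python
# Names as they appear on the earlier slides; treating each group as a
# query result is an assumption, and the labels are paraphrased.
shortlists = {
    "shares a reported interest": {"Jennifer", "Anne", "Julia"},
    "OK with Jon's demonstrated interests": {"Jennifer", "Jane", "Maria"},
    "likes football": {"Jennifer", "Katie", "Greta"},
}

# A perfect date has to appear on every shortlist.
perfect_dates = set.intersection(*shortlists.values())
print(perfect_dates)
```

In a graph database the same result would come from one query combining all the patterns, rather than from intersecting separate result sets in application code.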

[Diagram: Jennifer and Jon linked by :PERFECT_FOR]

[Diagram: Jennifer and Jon linked by :HAS_DATE_WITH]

Jon & Jennifer delete their profiles and go off into the sunset!

[Diagram: Jon, Jennifer and Love, with relationships labeled [:FOUND], [:AIDS] (five times) and [:POWERS]]

Amanda Laucher Neo Technology

(Neo4j)-[:POWERS]->(Love)

RDBMS/Other vs. Native Graph Database

Performance Challenges with Connected Data

[Chart: Response Time vs. Connectedness of Data Set]

RDBMS / Other NoSQL: # Hops: 0-2; Degree: < 3; Size: Thousands

Neo4j: # Hops: Tens to Hundreds; Degree: Thousands+; Size: Billions+

1000x faster

Neo Technology, Inc Confidential

Core Industries & Use Cases:

Industries: Web / ISV; Financial Services; Telecommunications

Use cases: Network & Data Center Management; Master Data Management; Social; Geo

Core Industries & Use Cases:

Industries: Software; Financial Services; Telecommunications; Health Care & Life Sciences; Web Social, HR & Recruiting; Media & Publishing; Energy, Services, Automotive, Gov’t, Logistics, Education, Gaming, Other

Use cases: Network & Data Center Management; MDM / System of Record; Social; Geo; Recommendations; Identity & Access Mgmt; Content Management; BI, CRM, Impact Analysis, Fraud Detection, Resource Optimization, etc.

Neo4j Adoption Snapshot

Select Commercial Customers* (some NDA)

[Customer logo grid; recoverable labels: Accenture, Aviation]

*Community Users Not Included


Graph Database Deployment

[Architecture diagram; recoverable components:]

Application

Other Databases (connected via ETL)

Graph Database Cluster: Data Storage & Business Rules Execution

Reporting: Graph Dashboards & Ad-hoc Analysis

Graph Visualization: End User Ad-hoc visual navigation & discovery

Bulk Analytic Infrastructure (e.g. Graph Compute Engine, connected via ETL): Graph Mining & Aggregation

Data Scientist: Ad-Hoc Analysis

Experiencing Query Pain

Actual HR Query* (in SQL)

(SELECT T.directReportees AS directReportees, sum(T.count) AS count
 FROM (
   SELECT manager.pid AS directReportees, 0 AS count
   FROM person_reportee manager
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   UNION
   SELECT manager.pid AS directReportees, count(manager.directly_manages) AS count
   FROM person_reportee manager
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT manager.pid AS directReportees, count(reportee.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee reportee ON manager.directly_manages = reportee.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT manager.pid AS directReportees, count(L2Reportees.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
   JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
 ) AS T
 GROUP BY directReportees)
UNION
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
 FROM (
   SELECT manager.directly_manages AS directReportees, 0 AS count
   FROM person_reportee manager
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   UNION
   SELECT reportee.pid AS directReportees, count(reportee.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee reportee ON manager.directly_manages = reportee.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT L1Reportees.pid AS directReportees, count(L2Reportees.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
   JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
 ) AS T
 GROUP BY directReportees)
UNION
(SELECT T.directReportees AS directReportees, sum(T.count) AS count
 FROM (
   SELECT reportee.directly_manages AS directReportees, 0 AS count
   FROM person_reportee manager
   JOIN person_reportee reportee ON manager.directly_manages = reportee.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
   UNION
   SELECT L2Reportees.pid AS directReportees, count(L2Reportees.directly_manages) AS count
   FROM person_reportee manager
   JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
   JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
   WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName")
   GROUP BY directReportees
 ) AS T
 GROUP BY directReportees)
UNION
(SELECT L2Reportees.directly_manages AS directReportees, 0 AS count
 FROM person_reportee manager
 JOIN person_reportee L1Reportees ON manager.directly_manages = L1Reportees.pid
 JOIN person_reportee L2Reportees ON L1Reportees.directly_manages = L2Reportees.pid
 WHERE manager.pid = (SELECT id FROM person WHERE name = "fName lName"))

*“Find all direct reports and how many they manage, up to 3 levels down”

Experiencing Query Pain

Same Query*, using Cypher

MATCH (boss)-[:MANAGES*0..3]->(sub),
      (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate, count(report) AS Total

*“Find all direct reports and how many they manage, up to 3 levels down”
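To show what the two variable-length patterns in the Cypher query are doing, here is a rough Python analogue using depth-limited breadth-first traversal. The org chart is assumed sample data shaped as a tree, so each person is reached by exactly one path; `within` and `report_totals` are hypothetical helper names. It is a sketch of the traversal logic, not of how Neo4j executes the query.

```python
from collections import deque

# Assumed org chart: manager -> direct reports (a tree, so paths are unique).
MANAGES = {
    "John Doe": ["Alice", "Bob"],
    "Alice": ["Carol", "Dave"],
    "Carol": ["Erin"],
    "Erin": ["Frank"],
    "Frank": ["Grace"],
}

def within(start, min_depth, max_depth):
    """People reachable from `start` via MANAGES in min_depth..max_depth hops."""
    found, queue = [], deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth >= min_depth:
            found.append(person)
        if depth < max_depth:
            queue.extend((r, depth + 1) for r in MANAGES.get(person, []))
    return found

def report_totals(boss):
    """Subordinate -> number of reports within 3 levels, for subs within 3 levels."""
    totals = {}
    for sub in within(boss, 0, 3):        # (boss)-[:MANAGES*0..3]->(sub)
        reports = within(sub, 1, 3)       # (sub)-[:MANAGES*1..3]->(report)
        if reports:                       # MATCH drops subs that have no reports
            totals[sub] = len(reports)
    return totals

totals = report_totals("John Doe")
print(totals)
```

The depth bounds play the role of `*0..3` and `*1..3`: changing how far down the hierarchy to look means changing a number, whereas the SQL version needs another block of joins and unions per level.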