The Football Graph - Neo4j and the Premier League

In a League of their Own: Neo4j and Premiership Football Mark Needham @markhneedham
  • Date posted: 14-Sep-2014
  • Category: Sports


Transcript of The Football Graph - Neo4j and the Premier League

Page 1: The Football Graph - Neo4j and the Premier League

In a League of their Own: Neo4j and Premiership Football

Mark Needham @markhneedham

Page 2: The Football Graph - Neo4j and the Premier League

Outline

• Intro to graphs
• When do we need a graph?
• Property graph model
• Neo4j’s query language
• The football graph
• Using Neo4j from .NET

Page 3: The Football Graph - Neo4j and the Premier League

Let’s talk graphs

Page 4: The Football Graph - Neo4j and the Premier League

You mean these?

Eating Brains

Dancing With Michael Jackson

Page 5: The Football Graph - Neo4j and the Premier League

Nope!

Eating Brains

Dancing With Michael Jackson

These are Charts! NOT Graphs!

Page 6: The Football Graph - Neo4j and the Premier League

Ok so what’s a graph then?

Node

Relationship

Page 7: The Football Graph - Neo4j and the Premier League

The tube

Page 8: The Football Graph - Neo4j and the Premier League

The social network (graph)

Page 9: The Football Graph - Neo4j and the Premier League

Complexity

What are graphs good for?

Page 10: The Football Graph - Neo4j and the Premier League

complexity = f(size, semi-structure, connectedness)

Data Complexity

Page 11: The Football Graph - Neo4j and the Premier League

Size  

Page 12: The Football Graph - Neo4j and the Premier League

complexity = f(size, semi-structure, connectedness)

The Real Complexity

Page 13: The Football Graph - Neo4j and the Premier League

Semi-Structure

Page 14: The Football Graph - Neo4j and the Premier League

Email: [email protected]
Email: [email protected]
Twitter: @markhneedham
Skype: mk_jnr1984

USER

CONTACT

CONTACT_TYPE

FIRST_NAME | LAST_NAME | USER_ID | EMAIL_1 | EMAIL_2 | TWITTER | FACEBOOK | SKYPE
Mark | Needham | 315 | [email protected] | [email protected] | @markhneedham | NULL | mk_jnr1984

Semi-Structure
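The sparse USER row on this slide can be contrasted with a node that stores only the properties it actually has. The following is a minimal Python sketch, assuming a plain dict stands in for a node's property map; `user_row` and `to_node_properties` are illustrative names, not part of Neo4j, and the email columns are left out because their values are not readable in this transcript.

```python
# Relational-style row: every column exists, and a missing value becomes NULL (None).
user_row = {
    "FIRST_NAME": "Mark",
    "LAST_NAME": "Needham",
    "USER_ID": 315,
    "TWITTER": "@markhneedham",
    "FACEBOOK": None,  # no Facebook account, but the column is still there
    "SKYPE": "mk_jnr1984",
}


def to_node_properties(row):
    """Keep only the columns that hold a value, as a node's property map would."""
    return {key.lower(): value for key, value in row.items() if value is not None}


node_properties = to_node_properties(user_row)
print(sorted(node_properties))
```

The node simply has no `facebook` key, so there is nothing sparse to store or to check for NULL.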

Page 15: The Football Graph - Neo4j and the Premier League

complexity = f(size, semi-structure, connectedness)

The Real Complexity

Page 16: The Football Graph - Neo4j and the Premier League

Connectedness  

Page 17: The Football Graph - Neo4j and the Premier League

Connectedness  

Page 18: The Football Graph - Neo4j and the Premier League

Connectedness  

Page 19: The Football Graph - Neo4j and the Premier League

When do we need a graph?

Densely Connected

Semi Structured

Page 20: The Football Graph - Neo4j and the Premier League

Densely connected?

Lots of join tables

Page 21: The Football Graph - Neo4j and the Premier League

Semi-Structured?

Lots of sparse tables

Page 22: The Football Graph - Neo4j and the Premier League

Properties of graph databases

• Millions of ‘joins’ per second
• Consistent query times as dataset grows
• Join Complexity and Performance
• Easy to evolve data model
• Easy to ‘layer’ different types of data together

Page 23: The Football Graph - Neo4j and the Premier League

Property Graph Data Model

Page 24: The Football Graph - Neo4j and the Premier League

Nodes  

Page 25: The Football Graph - Neo4j and the Premier League

Nodes can have properties

• Used to represent entity attributes and/or metadata (e.g. timestamps, version)
• Key-value pairs
• Java primitives
• Arrays
• null is not a valid value
• Every node can have different properties
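The property rules on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Neo4j's implementation: the `Node` class and `ALLOWED_SCALARS` are hypothetical, Python's built-in scalar types stand in for Java primitives, and the sample player values are made up.

```python
ALLOWED_SCALARS = (bool, int, float, str)  # stand-ins for Java primitives and strings


class Node:
    """Hypothetical in-memory node: a bag of key-value properties."""

    def __init__(self, **properties):
        self.properties = {}
        for key, value in properties.items():
            self.set_property(key, value)

    def set_property(self, key, value):
        # null is not a valid property value: omit the key instead.
        if value is None:
            raise ValueError(f"property {key!r} cannot be null; omit it instead")
        if isinstance(value, list):
            # Arrays are allowed as long as they hold primitive values.
            if not all(isinstance(item, ALLOWED_SCALARS) for item in value):
                raise TypeError(f"array property {key!r} must hold primitive values")
        elif not isinstance(value, ALLOWED_SCALARS):
            raise TypeError(f"property {key!r} must be a primitive or an array")
        self.properties[key] = value


# Every node can carry a different set of properties (illustrative values).
player = Node(name="Michu", goals_per_game=[1, 0, 2])
team = Node(name="Arsenal", founded=1886)
```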

Page 26: The Football Graph - Neo4j and the Premier League

What’s a node?

Page 27: The Football Graph - Neo4j and the Premier League

Relationships

Page 28: The Football Graph - Neo4j and the Premier League

Relationships

• Relationships are first class citizens
• Every relationship has a name and a direction
  – Add structure to the graph
  – Provide semantic context for nodes
• Properties used to represent quality or weight of relationship, or metadata
• Every relationship must have a start node and end node

Page 29: The Football Graph - Neo4j and the Premier League

Relationships

Nodes can have more than one relationship

Self relationships are allowed

Nodes can be connected by more than one relationship
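The relationship rules from the last two slides (a name, a direction, required start and end nodes, optional properties, self relationships, and parallel relationships) can be modelled roughly as follows. The `Relationship` class, the node ids `a` and `b`, and the `since` property are all hypothetical and only serve the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    """Hypothetical relationship: directed from start to end, with a name."""
    start: str                 # id of the start node (required)
    rel_type: str              # the relationship's name
    end: str                   # id of the end node (required)
    properties: dict = field(default_factory=dict)  # quality, weight, or metadata

    def __post_init__(self):
        if not self.start or not self.end:
            raise ValueError("a relationship needs both a start node and an end node")


relationships = [
    Relationship("a", "KNOWS", "b"),
    Relationship("a", "WORKS_WITH", "b", {"since": 2012}),  # second relationship between a and b
    Relationship("a", "KNOWS", "a"),                        # self relationship
]


def outgoing(node_id):
    """All relationships that start at the given node."""
    return [rel for rel in relationships if rel.start == node_id]
```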

Page 30: The Football Graph - Neo4j and the Premier League

Labels  

Page 31: The Football Graph - Neo4j and the Premier League

Think Gmail labels

Page 32: The Football Graph - Neo4j and the Premier League

• Nodes
  – Entities
• Relationships
  – Connect entities and structure domain
• Properties
  – Entity attributes, relationship qualities, and metadata
• Labels
  – Group nodes by role

Four Building Blocks

Page 33: The Football Graph - Neo4j and the Premier League

Purposeful abstraction of a domain designed to satisfy particular application/end-user goals

Models  

Page 34: The Football Graph - Neo4j and the Premier League

Model Query

Design for Queryability

Page 35: The Football Graph - Neo4j and the Premier League

Model Model

Design for Queryability

Page 36: The Football Graph - Neo4j and the Premier League

Model Query

Design for Queryability

Page 37: The Football Graph - Neo4j and the Premier League

Introducing Cypher

• Declarative Pattern-Matching language
• SQL-like syntax
• Designed for graphs

Page 38: The Football Graph - Neo4j and the Premier League

Patterns, patterns, everywhere

A

B C

Page 39: The Football Graph - Neo4j and the Premier League

(a) --> (b)

a b

It’s all about the ASCII art!

Page 40: The Football Graph - Neo4j and the Premier League

a b

The most basic query

MATCH (a)-->(b) RETURN a, b

Page 41: The Football Graph - Neo4j and the Premier League

(a)-[:ACTED_IN]->(m)

a m

Adding in a relationship type

ACTED IN

Page 42: The Football Graph - Neo4j and the Premier League

a m

Adding in a relationship type

MATCH (a)-[:ACTED_IN]->(m) RETURN a.name, m.name

ACTED IN
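To make the two patterns concrete, here is a small Python sketch of what they select. It assumes the graph is just a list of (start, type, end) triples; the `match` helper is hypothetical and says nothing about how Cypher is actually executed.

```python
# Sample data as (start, relationship type, end) triples.
edges = [
    ("Keanu Reeves", "ACTED_IN", "The Matrix"),
    ("Lana Wachowski", "DIRECTED", "The Matrix"),
    ("Carrie-Anne Moss", "ACTED_IN", "The Matrix"),
]


def match(edges, rel_type=None):
    """Return (a, b) pairs for every directed edge a --> b.

    With rel_type set, only edges of that type match,
    mirroring (a)-[:ACTED_IN]->(m).
    """
    return [(a, b) for a, t, b in edges if rel_type is None or t == rel_type]


print(match(edges))                       # like MATCH (a)-->(b) RETURN a, b
print(match(edges, rel_type="ACTED_IN"))  # like MATCH (a)-[:ACTED_IN]->(m) RETURN a.name, m.name
```

Adding the relationship type narrows the pattern: the untyped arrow matches all three edges, while the typed one drops the DIRECTED edge.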

Page 43: The Football Graph - Neo4j and the Premier League

The football graph

Page 44: The Football Graph - Neo4j and the Premier League

The football graph

Page 45: The Football Graph - Neo4j and the Premier League

Find Arsenal’s away matches

Page 46: The Football Graph - Neo4j and the Premier League

Find Arsenal’s away matches

Page 47: The Football Graph - Neo4j and the Premier League

Find Arsenal’s away matches

MATCH (team:Team)<-[:away_team]-(game)

WHERE team.name = "Arsenal"

RETURN game

Page 48: The Football Graph - Neo4j and the Premier League

Graph Pattern

MATCH (team:Team)<-[:away_team]-(game)

WHERE team.name = "Arsenal"

RETURN game.name

Page 49: The Football Graph - Neo4j and the Premier League

Anchor pattern in graph

MATCH (team:Team)<-[:away_team]-(game)

WHERE team.name = "Arsenal"

RETURN game.name

Page 50: The Football Graph - Neo4j and the Premier League

Create projection of results

MATCH (team:Team)<-[:away_team]-(game)

WHERE team.name = "Arsenal"

RETURN game.name
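The three steps highlighted on the last three slides (graph pattern, anchor, projection) can be traced in plain Python. This is a sketch with made-up fixtures; the dictionaries, ids, and the `away_matches` function are assumptions for illustration and do not reflect how Neo4j evaluates the query.

```python
# Made-up data: team and game nodes keyed by id, plus away_team relationships.
teams = {"t1": {"name": "Arsenal"}, "t2": {"name": "Chelsea"}}
games = {
    "g1": {"name": "Chelsea vs Arsenal"},
    "g2": {"name": "Arsenal vs Chelsea"},
    "g3": {"name": "Stoke vs Arsenal"},
}
away_team = [("g1", "t1"), ("g2", "t2"), ("g3", "t1")]  # (game)-[:away_team]->(team)


def away_matches(team_name):
    # 1. Graph pattern: every (team)<-[:away_team]-(game) pair.
    pairs = [(teams[t], games[g]) for g, t in away_team]
    # 2. Anchor: keep only the pairs whose team has the requested name.
    anchored = [(team, game) for team, game in pairs if team["name"] == team_name]
    # 3. Projection: return just the game names.
    return [game["name"] for _, game in anchored]


print(away_matches("Arsenal"))
```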

Page 51: The Football Graph - Neo4j and the Premier League

Find Arsenal’s away matches

Page 52: The Football Graph - Neo4j and the Premier League

Evolving the football graph

Page 53: The Football Graph - Neo4j and the Premier League

Find the top away goal scorers

Page 54: The Football Graph - Neo4j and the Premier League

Find the top away goal scorers

MATCH (team)<-[:away_team]-(game:Game),

(game)<-[:contains_match]-(season:Season),

(team)<-[:for]-(stats)<-[:played]-(player),

(stats)-[:in]->(game)

WHERE season.name = "2012-2013"

RETURN player.name,

COLLECT(DISTINCT team.name),

SUM(stats.goals) as goals

ORDER BY goals DESC

LIMIT 10

Page 55: The Football Graph - Neo4j and the Premier League

Multiple graph patterns

MATCH (team)<-[:away_team]-(game:Game),

(game)<-[:contains_match]-(season:Season),

(team)<-[:for]-(stats)<-[:played]-(player),

(stats)-[:in]->(game)

WHERE season.name = "2012-2013"

RETURN player.name,

COLLECT(DISTINCT team.name),

SUM(stats.goals) as goals

ORDER BY goals DESC

LIMIT 10

Page 56: The Football Graph - Neo4j and the Premier League

Anchor pattern in the graph

MATCH (team)<-[:away_team]-(game:Game),

(game)<-[:contains_match]-(season:Season),

(team)<-[:for]-(stats)<-[:played]-(player),

(stats)-[:in]->(game)

WHERE season.name = "2012-2013"

RETURN player.name,

COLLECT(DISTINCT team.name),

SUM(stats.goals) as goals

ORDER BY goals DESC

LIMIT 10

Page 57: The Football Graph - Neo4j and the Premier League

Group by player

MATCH (team)<-[:away_team]-(game:Game),

(game)<-[:contains_match]-(season:Season),

(team)<-[:for]-(stats)<-[:played]-(player),

(stats)-[:in]->(game)

WHERE season.name = "2012-2013"

RETURN player.name,

COLLECT(DISTINCT team.name),

SUM(stats.goals) as goals

ORDER BY goals DESC

LIMIT 10
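The RETURN clause groups implicitly: `player.name` is the only non-aggregated expression, so `SUM` and `COLLECT(DISTINCT ...)` are computed per player. A rough Python equivalent of that grouping, ordering, and limiting is sketched below, assuming the pattern match has already produced one row per player, away team, and game. The rows and the `top_scorers` function are invented for the example.

```python
from collections import defaultdict

# Made-up rows, one per match of the graph pattern.
rows = [
    {"player": "Player A", "team": "Team X", "goals": 2},
    {"player": "Player B", "team": "Team Y", "goals": 1},
    {"player": "Player A", "team": "Team X", "goals": 1},
    {"player": "Player B", "team": "Team Z", "goals": 1},
    {"player": "Player C", "team": "Team X", "goals": 0},
]


def top_scorers(rows, limit=10):
    goals = defaultdict(int)
    teams = defaultdict(set)
    for row in rows:  # implicit grouping by player.name
        goals[row["player"]] += row["goals"]   # SUM(stats.goals)
        teams[row["player"]].add(row["team"])  # COLLECT(DISTINCT team.name)
    # ORDER BY goals DESC, then LIMIT
    ranked = sorted(goals, key=lambda player: goals[player], reverse=True)
    return [(player, sorted(teams[player]), goals[player]) for player in ranked[:limit]]


print(top_scorers(rows, limit=2))
```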

Page 58: The Football Graph - Neo4j and the Premier League

Find the top away goal scorers

Page 59: The Football Graph - Neo4j and the Premier League

Other football queries

• Goals scored in each month by Michu
• Tottenham results when Gareth Bale scores
• What did Wayne Rooney do in April?
• Which players only score when a game is televised?

Page 60: The Football Graph - Neo4j and the Premier League

Graph Query Design

Page 61: The Football Graph - Neo4j and the Premier League

The relational version

Page 62: The Football Graph - Neo4j and the Premier League

Graph vs Relational

Relational

• Tables
  – assume records all have the same structure
• Foreign keys between tables
  – joins calculated at run time
  – the more tables you join to a query the slower the query gets

Graphs

• Nodes
  – no need to set a property if it doesn’t exist
• Relationships
  – stored as a ‘pre-computed index’ at write time
  – very easy to do lots of ‘hops’ between relationships
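The contrast between run-time joins and relationships stored at write time can be illustrated with a deliberately simplified Python sketch. Real relational engines use indexes rather than full scans, and Neo4j's storage format is more involved than a dict of lists; `played_for`, `teams_via_join`, and `teams_via_hop` are hypothetical names.

```python
# Relational style: a join table that is searched at query time.
played_for = [("michu", "swansea"), ("bale", "tottenham"), ("rooney", "man_utd")]


def teams_via_join(player):
    """Scan the whole join table for rows that match the player."""
    return [team for p, team in played_for if p == player]


# Graph style: each node keeps direct references to its neighbours,
# built once when the relationship is written.
adjacency = {}
for player, team in played_for:
    adjacency.setdefault(player, []).append(team)


def teams_via_hop(player):
    """Follow the stored references; no search over unrelated rows."""
    return adjacency.get(player, [])


print(teams_via_join("bale"), teams_via_hop("bale"))
```

Both functions return the same answer; the difference is that the join's cost depends on the size of the table, while the hop only touches the relationships of the node in question.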

Page 63: The Football Graph - Neo4j and the Premier League

.NET and Neo4j

REST Client

Application

HTTP

Neo4j Server
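Whatever the client language, the application ends up sending Cypher to the server over HTTP. The sketch below builds, but does not send, such a request in Python. It assumes a local server on port 7474 and the transactional Cypher endpoint path used by Neo4j 2.x (`/db/data/transaction/commit`); later versions use different paths and `$name` rather than `{name}` for parameters, so treat the URL and parameter syntax as assumptions to check against your server's documentation. `build_cypher_request` is a hypothetical helper.

```python
import json
import urllib.request

# Assumption: a local Neo4j 2.x server exposing the transactional Cypher endpoint.
NEO4J_URL = "http://localhost:7474/db/data/transaction/commit"


def build_cypher_request(statement, parameters=None, url=NEO4J_URL):
    """Build (but do not send) the HTTP POST a REST client would issue."""
    payload = {
        "statements": [
            {"statement": statement, "parameters": parameters or {}}
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


request = build_cypher_request(
    "MATCH (team:Team)<-[:away_team]-(game) "
    "WHERE team.name = {name} RETURN game.name",  # {name} is 2.x-era parameter syntax
    {"name": "Arsenal"},
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen`, which needs a running server and is therefore left out here.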

Page 64: The Football Graph - Neo4j and the Premier League

Neo4jClient

.NET and Neo4j

Application

HTTP

Neo4j Server

REST Client

Page 65: The Football Graph - Neo4j and the Premier League

.NET and Neo4j

Page 66: The Football Graph - Neo4j and the Premier League

.NET and Neo4j

Page 67: The Football Graph - Neo4j and the Premier League

.NET and Neo4j

Page 68: The Football Graph - Neo4j and the Premier League

.NET and Neo4j

Page 69: The Football Graph - Neo4j and the Premier League

.NET and Neo4j

Page 70: The Football Graph - Neo4j and the Premier League

Thinking in graphs

Page 71: The Football Graph - Neo4j and the Premier League

Graphs should be fun!

Page 72: The Football Graph - Neo4j and the Premier League

Ask for help if you get stuck

Last Wednesday of the month

Page 73: The Football Graph - Neo4j and the Premier League

Come take a copy, it’s free!

Ian Robinson, Jim Webber & Emil Eifrem

Graph Databases

Compliments of Neo Technology

www.graphdatabases.com

Page 74: The Football Graph - Neo4j and the Premier League

Questions?

Mark Needham @markhneedham [email protected]