Introduction to Graph Databases
Josh Adell <[email protected]> 2011-08-06
Who am I?
• Software developer: PHP, Javascript, SQL
• http://www.dunnwell.com
• Fan of using the right tool for the job
The Problem
The Solution?
> -- Given "Keanu Reeves" find a connection to "Kevin Bacon"
> SELECT ??? FROM cast WHERE ???
+--------------------+-----------------+
| actor_name         | movie_title     |
+====================+=================+
| Jennifer Connelly  | Higher Learning |
+--------------------+-----------------+
| Laurence Fishburne | Mystic River    |
+--------------------+-----------------+
| Laurence Fishburne | Higher Learning |
+--------------------+-----------------+
| Kevin Bacon        | Mystic River    |
+--------------------+-----------------+
| Keanu Reeves       | The Matrix      |
+--------------------+-----------------+
| Laurence Fishburne | The Matrix      |
+--------------------+-----------------+
Find Every Actor at Each Degree
> -- First degree
> SELECT actor_name FROM cast WHERE movie_title IN
    (SELECT DISTINCT movie_title FROM cast WHERE actor_name='Kevin Bacon')
> -- Second degree
> SELECT actor_name FROM cast WHERE movie_title IN
    (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN
      (SELECT actor_name FROM cast WHERE movie_title IN
        (SELECT DISTINCT movie_title FROM cast WHERE actor_name='Kevin Bacon')))
> -- Third degree
> SELECT actor_name FROM cast WHERE movie_title IN
    (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN
      (SELECT actor_name FROM cast WHERE movie_title IN
        (SELECT DISTINCT movie_title FROM cast WHERE actor_name IN
          (SELECT actor_name FROM cast WHERE movie_title IN
            (SELECT DISTINCT movie_title FROM cast WHERE actor_name='Kevin Bacon')))))
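The nesting in these queries follows a fixed pattern: each extra degree wraps the previous query in two more subqueries. The sketch below generates that pattern and runs it against the six sample rows using Python's built-in sqlite3 module. It is only an illustration: the helper names `degree_query` and `actors_within` are hypothetical, the table name is quoted because CAST is also an SQL keyword, and the innermost `actor_name='Kevin Bacon'` is written as `actor_name IN (SELECT ?)` so the starting actor can be passed as a parameter.

```python
import sqlite3

# The six sample rows from the cast table.
ROWS = [
    ("Jennifer Connelly", "Higher Learning"),
    ("Laurence Fishburne", "Mystic River"),
    ("Laurence Fishburne", "Higher Learning"),
    ("Kevin Bacon", "Mystic River"),
    ("Keanu Reeves", "The Matrix"),
    ("Laurence Fishburne", "The Matrix"),
]


def degree_query(degree):
    """Build the nested SQL for every actor within `degree` steps (hypothetical helper)."""
    # Innermost set: just the starting actor, bound as a parameter.
    actors = "SELECT ?"
    for _ in range(degree):
        # Movies featuring any actor found so far...
        movies = f'SELECT DISTINCT movie_title FROM "cast" WHERE actor_name IN ({actors})'
        # ...then every actor appearing in those movies.
        actors = f'SELECT actor_name FROM "cast" WHERE movie_title IN ({movies})'
    return actors


def actors_within(conn, start, degree):
    """Run the generated query and return the actor names as a set."""
    rows = conn.execute(degree_query(degree), (start,)).fetchall()
    return {name for (name,) in rows}


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "cast" (actor_name TEXT, movie_title TEXT)')
conn.executemany('INSERT INTO "cast" VALUES (?, ?)', ROWS)

first = actors_within(conn, "Kevin Bacon", 1)
second = actors_within(conn, "Kevin Bacon", 2)
```

With this data, Keanu Reeves first appears in the second-degree result. The query still cannot say which movies connect him to Kevin Bacon, and the caller has to guess how many degrees to ask for.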
The Truth
Relational databases aren't very good with relationships
Data
RDBMS
The Real Problem
Finding relationships across multiple degrees of separation
...and across multiple data types
...and where you don't even know there is a relationship
The Real Solution
Graph Examples
Relational Databases are Graphs!
Some Graph Use Cases
• Social networking
• Manufacturing
• Mapping and Geolocation
• Bioinformatics
• Fraud detection
• Multi-tenancy
Modelling a Domain with Graphs
• Graphs are "whiteboard-friendly"
• Nouns become nodes
• Verbs become relationships
• Properties are adjectives and adverbs
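As a rough illustration of these modelling rules, here is a minimal in-memory sketch in Python. The `Node` and `Relationship` classes are hypothetical and do not belong to any graph database API. They only show where nouns, verbs, and properties end up.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                                        # the noun, e.g. "Actor" or "Movie"
    properties: dict = field(default_factory=dict)   # adjectives describing the noun


@dataclass
class Relationship:
    start: Node
    kind: str                                        # the verb, e.g. "IN"
    end: Node
    properties: dict = field(default_factory=dict)   # adverbs describing the verb


# "Keanu Reeves was in The Matrix, playing Neo."
keanu = Node("Actor", {"name": "Keanu Reeves"})
matrix = Node("Movie", {"title": "The Matrix"})
acted = Relationship(keanu, "IN", matrix, {"role": "Neo"})
```

The whiteboard sentence maps directly onto the three objects: two nodes, one directed relationship, and a property on each.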
Graph Mining
• Paths
• Traversals
• Ad-hoc Queries
New Solution to the Bacon Problem
$keanu = $actorIndex->find('name', 'Keanu Reeves');
$kevin = $actorIndex->find('name', 'Kevin Bacon');
$path = $keanu->findPathTo($kevin);
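To give a feel for what a path query does conceptually, the sketch below runs a breadth-first search over the six sample cast rows in Python. It does not show how Neo4j or Neo4jPHP implement path finding. The helpers `build_graph` and `find_path` are hypothetical, and actors and movies share a single namespace only to keep the example short.

```python
from collections import defaultdict, deque

# The six sample rows from the cast table.
ROWS = [
    ("Jennifer Connelly", "Higher Learning"),
    ("Laurence Fishburne", "Mystic River"),
    ("Laurence Fishburne", "Higher Learning"),
    ("Kevin Bacon", "Mystic River"),
    ("Keanu Reeves", "The Matrix"),
    ("Laurence Fishburne", "The Matrix"),
]


def build_graph(rows):
    """Treat actors and movies as nodes, joined by undirected edges."""
    graph = defaultdict(set)
    for actor, movie in rows:
        graph[actor].add(movie)
        graph[movie].add(actor)
    return graph


def find_path(graph, start, goal):
    """Breadth-first search: return a shortest list of nodes, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sorted only to make the visiting order deterministic.
        for neighbour in sorted(graph[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(ROWS)
path = find_path(graph, "Keanu Reeves", "Kevin Bacon")
```

Unlike the nested SQL, the search needs no degree count up front, and the result names the movies that form the connection.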
Cypher
• "What to find" vs. "How to find"
// Find all the directors who have directed a movie scored by John Williams
// that starred Kevin Bacon
START actor=(actors, 'Kevin Bacon'), composer=(composers, 'John Williams')
MATCH (actor)-[:IN]->(movie)<-[:DIRECTED]-(director), (movie)<-[:SCORED]-(composer)
RETURN director
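For contrast, here is roughly what the "how to find" version of that query looks like when written by hand in Python. The edge list is sample data added for this illustration and does not come from the slides, and the helpers `targets`, `sources`, and `directors_for` are hypothetical.

```python
# (start, relationship type, end) triples; illustrative sample data.
EDGES = [
    ("Kevin Bacon", "IN", "JFK"),
    ("Oliver Stone", "DIRECTED", "JFK"),
    ("John Williams", "SCORED", "JFK"),
    ("Kevin Bacon", "IN", "Apollo 13"),
    ("Ron Howard", "DIRECTED", "Apollo 13"),
    ("James Horner", "SCORED", "Apollo 13"),
]


def targets(edges, start, kind):
    """Nodes reached by following `kind` relationships out of `start`."""
    return {e for s, k, e in edges if s == start and k == kind}


def sources(edges, kind, end):
    """Nodes that have a `kind` relationship pointing at `end`."""
    return {s for s, k, e in edges if k == kind and e == end}


def directors_for(edges, actor, composer):
    """Directors of movies the actor was in and the composer scored."""
    found = set()
    for movie in targets(edges, actor, "IN"):
        if composer in sources(edges, "SCORED", movie):
            found |= sources(edges, "DIRECTED", movie)
    return found


result = directors_for(EDGES, "Kevin Bacon", "John Williams")
```

The hand-written version spells out the loop order and the filtering, whereas the MATCH clause states only the shape of the pattern and leaves the traversal strategy to the database.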
Are RDBs Useful At All?
• Aggregation
• Ordered data
• Truly tabular data
• Few or clearly defined relationships
• Neo Technology
• http://neo4j.org
• Embedded in Java applications
• Standalone server via REST
• Plugins: spatial, lucene, rdf
Others:
• Tinkerpop
• OrientDB
Questions?
Resources
• http://neo4j.org
• http://docs.neo4j.org
• http://www.youtube.com/watch?v=UodTzseLh04
  o Emil Eifrem (Neo Tech. CEO) webinar
  o Check out around the 54 minute mark
• http://github.com/jadell/Neo4jPHP
• http://joshadell.com
• [email protected]
• @josh_adell
• Google+, Facebook, LinkedIn