DevFest Istanbul - a free guided tour of Neo4J
florent-biville
A free guided tour of
(Reality)-[:IS_A]->(Graph)
What we have always known
selection sort (O(n²))
|
heap sort (O(n log n))
same algorithm, different data structure, better execution time!
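A minimal Java sketch of that claim (not from the talk): both methods repeatedly extract the smallest remaining element, and only the structure holding the remaining elements changes. The class and method names are illustrative, and java.util.PriorityQueue stands in for a hand-written binary heap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class SelectMinSorts {
    // Selection sort: the remaining elements live in a plain list,
    // so each "extract minimum" scans the whole list: O(n^2) overall.
    static List<Integer> withList(List<Integer> input) {
        List<Integer> remaining = new ArrayList<>(input);
        List<Integer> sorted = new ArrayList<>();
        while (!remaining.isEmpty()) {
            int minIndex = 0;
            for (int i = 1; i < remaining.size(); i++) {
                if (remaining.get(i) < remaining.get(minIndex)) {
                    minIndex = i;
                }
            }
            sorted.add(remaining.remove(minIndex)); // remove(int index)
        }
        return sorted;
    }

    // Heap sort: same loop, but the remaining elements live in a binary heap,
    // so each "extract minimum" costs O(log n): O(n log n) overall.
    static List<Integer> withHeap(List<Integer> input) {
        PriorityQueue<Integer> remaining = new PriorityQueue<>(input);
        List<Integer> sorted = new ArrayList<>();
        while (!remaining.isEmpty()) {
            sorted.add(remaining.poll());
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(5, 3, 8, 1, 9, 2);
        System.out.println(withList(data));
        System.out.println(withHeap(data));
    }
}
```

Both calls return the same sorted list; only the cost of finding the next minimum differs.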
What we have always known
1 data structure
1 usage
One NOSQL lesson?
1 data STORE
1 usage
polyglot persistence, anyone?
ZOOM on Graph Databases
graph = nodes/vertices + edges/relationships/arcs
Graph DB: a common model
property graph = nodes (labeled, as of Neo4J v2) + labeled relationships + K/V pairs
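The property graph model can be sketched in a few lines of plain Java. This is only an illustration of the model (nodes with labels, typed relationships, key/value pairs on both), not how Neo4J stores anything; every class and field name below is made up for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TinyPropertyGraph {
    static class Node {
        final Set<String> labels = new LinkedHashSet<>();       // e.g. HUMAN
        final Map<String, Object> properties = new HashMap<>(); // K/V pairs
    }

    static class Relationship {
        final Node start;
        final Node end;
        final String type;                                      // e.g. LOVES_EATING
        final Map<String, Object> properties = new HashMap<>(); // K/V pairs

        Relationship(Node start, String type, Node end) {
            this.start = start;
            this.type = type;
            this.end = end;
        }
    }

    final List<Node> nodes = new ArrayList<>();
    final List<Relationship> relationships = new ArrayList<>();

    Node createNode(String label) {
        Node node = new Node();
        node.labels.add(label);
        nodes.add(node);
        return node;
    }

    Relationship relate(Node start, String type, Node end) {
        Relationship relationship = new Relationship(start, type, end);
        relationships.add(relationship);
        return relationship;
    }

    public static void main(String[] args) {
        TinyPropertyGraph graph = new TinyPropertyGraph();
        Node homer = graph.createNode("HUMAN");
        homer.properties.put("name", "Homer Simpson");
        Node doughnut = graph.createNode("FOOD");
        Relationship eating = graph.relate(homer, "LOVES_EATING", doughnut);
        eating.properties.put("quantity", Long.MAX_VALUE);
        System.out.println(graph.nodes.size() + " nodes, "
                + graph.relationships.size() + " relationship");
    }
}
```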
Property Graph DBs
FlockDB
WHY DO THEY KICK ASS?
BECAUSE
Graph-based computing
Intuitive model
Expressive querying
Powerful analyses
Whiteboard-friendliness
Pregel (GOOG), TAO (FB)
Pattern matching, path finding...
A glimpse at use cases: mantra
RELATIONSHIPS ARE AS IMPORTANT AS ENTITIES
A glimpse at use cases
Recommendations
People I may know, e.g. people known by contacts I have worked with in the past
Products I should buy, e.g. people who bought “Twilight” and “Justin Bieber biography” like you also bought “The ultimate emo guide”
Movies I should watch, with whom and where...
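The "people I may know" case boils down to a two-hop walk: contacts of my contacts, minus the people I already know. A minimal sketch in plain Java, assuming contacts are held in an in-memory map (the names and data are invented for the example; a graph database would run the same walk over stored relationships):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class PeopleIMayKnow {
    // contacts: person -> direct contacts
    static Set<String> suggest(Map<String, Set<String>> contacts, String me) {
        Set<String> mine = contacts.getOrDefault(me, Collections.<String>emptySet());
        Set<String> suggestions = new TreeSet<>();
        for (String contact : mine) {
            // second hop: everyone my contact knows
            suggestions.addAll(contacts.getOrDefault(contact, Collections.<String>emptySet()));
        }
        suggestions.remove(me);       // I already know myself
        suggestions.removeAll(mine);  // ... and my direct contacts
        return suggestions;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> contacts = new HashMap<>();
        contacts.put("me", new HashSet<>(Arrays.asList("ann", "bob")));
        contacts.put("ann", new HashSet<>(Arrays.asList("me", "carl")));
        contacts.put("bob", new HashSet<>(Arrays.asList("me", "ann", "dora")));
        System.out.println(suggest(contacts, "me")); // [carl, dora]
    }
}
```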
A glimpse at use cases
Pattern detection
Fraud, e.g. many IPs from Fraudistan have made a purchase of game X in the last hour
Disease detection, e.g. DNA sequencing
Trend detection, e.g. the term Flu has been tweeted 789% more in the Guatemala area in the last 24 hours
A glimpse at use cases
Path finding
Genealogy, e.g. is François Mitterrand related to Elizabeth II? (yes)
Navigation, e.g. what is the cheapest way to go to a sushi place < 15€ for me (Place de Clichy) and my friend (Place d’Italie)?
Impact analysis, e.g. which customers are impacted if network switch XYZ fails?
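Path finding in its simplest form is a breadth-first search that remembers where each node was reached from. The sketch below is plain Java over an adjacency map with made-up node names; it returns the path with the fewest hops. A "cheapest way" query would need weighted edges and something like Dijkstra's algorithm instead.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

class ShortestPath {
    // Returns the path with the fewest hops from start to goal,
    // or an empty list if goal cannot be reached.
    static List<String> find(Map<String, List<String>> graph, String start, String goal) {
        Map<String, String> cameFrom = new HashMap<>();
        cameFrom.put(start, null); // the start node has no predecessor
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (current.equals(goal)) {
                // walk the predecessor links back to the start
                LinkedList<String> path = new LinkedList<>();
                for (String step = goal; step != null; step = cameFrom.get(step)) {
                    path.addFirst(step);
                }
                return path;
            }
            for (String next : graph.getOrDefault(current, Collections.<String>emptyList())) {
                if (!cameFrom.containsKey(next)) {
                    cameFrom.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("A", Arrays.asList("B", "C"));
        graph.put("B", Arrays.asList("D"));
        graph.put("C", Arrays.asList("D"));
        graph.put("D", Arrays.asList("E"));
        System.out.println(find(graph, "A", "E")); // [A, B, D, E]
    }
}
```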
A glimpse at use cases
and more...
Topological ordering, e.g. given a set of dependencies, in which order should I include them?
Community detection, e.g. tag clustering on annotated resources to detect groups of interest (targeted advertising)
and much more...
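The dependency-ordering question can be answered with Kahn's algorithm: repeatedly emit whatever has no unmet dependency left. A minimal plain-Java sketch, with invented artifact names and an in-memory map standing in for the graph:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DependencyOrder {
    // dependsOn: artifact -> artifacts it depends on. Returns an order in which
    // every artifact comes after all of its dependencies (Kahn's algorithm).
    static List<String> sort(Map<String, List<String>> dependsOn) {
        Map<String, Integer> pending = new TreeMap<>();         // unmet dependency count
        Map<String, List<String>> dependents = new TreeMap<>(); // reverse edges
        for (Map.Entry<String, List<String>> entry : dependsOn.entrySet()) {
            pending.putIfAbsent(entry.getKey(), 0);
            for (String dependency : entry.getValue()) {
                pending.putIfAbsent(dependency, 0);
                pending.put(entry.getKey(), pending.get(entry.getKey()) + 1);
                dependents.computeIfAbsent(dependency, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : pending.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String current = ready.poll();
            order.add(current);
            for (String dependent : dependents.getOrDefault(current, new ArrayList<String>())) {
                int left = pending.get(dependent) - 1;
                pending.put(dependent, left);
                if (left == 0) {
                    ready.add(dependent);
                }
            }
        }
        if (order.size() != pending.size()) {
            throw new IllegalArgumentException("dependency cycle detected");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new TreeMap<>();
        deps.put("app", Arrays.asList("web", "db-driver"));
        deps.put("web", Arrays.asList("core"));
        deps.put("db-driver", Arrays.asList("core"));
        deps.put("core", Collections.<String>emptyList());
        System.out.println(sort(deps));
    }
}
```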
A glimpse at use cases
http://www.opentreeoflife.org/
http://bio4j.com/
http://www.reco4j.org/
http://structr.org/
https://github.com/neo4j-contrib/graphgist/wiki
In short
GRAPHS ARE EVERYWHERE!
THAT’S WHY YOU SHOULD TRY Neo4J
Neo4J - the origins
Circa 2000, somewhere in Sweden
2 Swedish guys hacking in a garage
Dialogue
- Man, I cannot stand Informix anymore
- Right, we’ve pushed it to the limit
- All these documents, these workflows…
- Right, it’s way too densely connected.
- Connected… connections? CONNECTIONS??
Flash-forward: Neo Technology!
1 company: Neo Technology
1 main product: Neo4J
~50 employees
All over the world: Sweden, US, Germany, France, Malaysia, NZ...
Neo4J - moaar facts & figures
Versions: 2.0.0.M06 / 1.9.4
Licenses: GPL, AGPL, OEM, Commercial
2^35 nodes (34_359_738_368)
2^35 relationships
> 2^36 properties, at most 2^38
… capacity can be tailored on demand
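Assuming the figures on this slide are powers of two whose exponents were flattened in transcription (the 34_359_738_368 literal matches 2^35 exactly), the limits can be checked with a one-line shift. The class name is illustrative.

```java
class StoreCapacity {
    // Number of distinct record ids addressable with the given number of bits.
    static long addressable(int idBits) {
        return 1L << idBits;
    }

    public static void main(String[] args) {
        System.out.println(addressable(35)); // 34359738368, the node figure on the slide
        System.out.println(addressable(36));
        System.out.println(addressable(38));
    }
}
```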
Neo4J anatomy
GRAPH ON DISK (roughly)
original presentation: http://www.ustream.tv/channel/neo4j
CORE API
Node CRUD (JVM)
import org.neo4j.graphdb.*; // GraphDatabaseService, Transaction, Node, DynamicLabel
import org.neo4j.test.TestGraphDatabaseFactory;

// TestGraphDatabaseFactory: officially distributed test version!
GraphDatabaseService graphDB = new TestGraphDatabaseFactory()
    .newImpermanentDatabase();
// transaction is MANDATORY; try-with-resources: Java 7 required since 2.0
try (Transaction transaction = graphDB.beginTx()) {
    // labels are a way to semi-structure your nodes (since 2.0)
    Node character = graphDB.createNode(DynamicLabel.label("CHARACTER"));
    character.setProperty("name", "Homer Simpson");
    transaction.success();
}
Node CRUD (JVM)
try (Transaction transaction = graphDB.beginTx()) {
    for (Node node : graphDB.findNodesByLabelAndProperty(
            DynamicLabel.label("CHARACTER"),
            "name",
            "Homer Simpson")) {
        /* do something very useful */
    }
    transaction.success();
}
Gotchas
● avoid graphDB.getNodeById !!!
● transaction is MANDATORY for reads as well (new in 2.0)
Node CRUD (JVM)
try (Transaction transaction = graphDB.beginTx()) {
    Node character = /*lookup*/;
    character.delete();
    transaction.success();
}
Gotchas
● the node must have no relationships attached when the transaction commits
● all properties will be automatically removed
Relationship CRUD (JVM)
try (Transaction transaction = graphDB.beginTx()) {
    Node homer = /*lookup*/;
    Node doughnut = /*lookup*/;
    Relationship eating = homer.createRelationshipTo(
        doughnut,
        // relationship types are unrelated to Node labels
        DynamicRelationshipType.withName("LOVES_EATING")
    );
    eating.setProperty("quantity", Long.MAX_VALUE);
    transaction.success();
}
Gotchas
● relationship direction matters at query time
● avoid human-eating doughnuts ;-)
Relationship CRUD (JVM)
try (Transaction transaction = graphDB.beginTx()) {
    Relationship relationship = /*lookup*/;
    relationship.delete();
    transaction.success();
}
Gotcha
● a write lock is set on the relationship, as well as on both start AND end nodes of the relationship
Core API
Low Level
Transactions are ALWAYS required (v2.0)
Technical IDs are dangerous (getNodeById)
have a look at github.com/sarmbruster/neo4j-uuid
SAME capabilities with REST API
QUERYING DATA
Two strategies
IMPERATIVE
● VERY extensive
● Totally customizable
● 100% under your responsibility
DECLARATIVE
● VERY intuitive
● 90% of your needs
● No free lunch (yet)! Cypher PROFILE on its way
Two strategies
TRAVERSALS, CYPHER QL (/GREMLIN)
Traversals
DEPTH FIRST vs. BREADTH FIRST
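The difference between the two orders is easiest to see on a small graph. The following plain-Java sketch is independent of the Neo4J traversal framework: it walks an adjacency map with made-up node names, once level by level (breadth first) and once branch by branch (depth first).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TraversalOrders {
    // Breadth first: visit all neighbours before going one level deeper.
    static List<String> breadthFirst(Map<String, List<String>> graph, String start) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>(Collections.singleton(start));
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.removeFirst();
            visited.add(current);
            for (String next : graph.getOrDefault(current, Collections.<String>emptyList())) {
                if (seen.add(next)) {
                    queue.addLast(next);
                }
            }
        }
        return visited;
    }

    // Depth first: follow one branch to its end before backtracking.
    static List<String> depthFirst(Map<String, List<String>> graph, String start) {
        List<String> visited = new ArrayList<>();
        visit(graph, start, new HashSet<String>(), visited);
        return visited;
    }

    private static void visit(Map<String, List<String>> graph, String current,
                              Set<String> seen, List<String> visited) {
        if (!seen.add(current)) {
            return; // already visited
        }
        visited.add(current);
        for (String next : graph.getOrDefault(current, Collections.<String>emptyList())) {
            visit(graph, next, seen, visited);
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("A", Arrays.asList("B", "C"));
        graph.put("B", Arrays.asList("D"));
        graph.put("C", Arrays.asList("E"));
        System.out.println(breadthFirst(graph, "A")); // [A, B, C, D, E]
        System.out.println(depthFirst(graph, "A"));   // [A, B, D, C, E]
    }
}
```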
Traversal - basic git log
try (Transaction transaction = graphDB.beginTx()) {
    // lazy traversal definition
    for (Path position : Traversal.description()
            .depthFirst()
            .evaluator(Evaluators.toDepth(LOG_DEFAULT_SIZE))
            .relationships(
                DynamicRelationshipType.withName("PARENT_COMMIT"),
                Direction.INCOMING
            )
            // start traversal with node
            .traverse(headCommit)) {
        // position keeps track of current position & visited nodes/rels
        Node currentNode = position.endNode();
        logs.add(currentNode);
    }
    transaction.success();
}
Traversals
Extensive but verbose and error-prone
WE <3 ASCII ART!
Pattern matching and ASCII art
WE <3 CYPHER
Cypher syntax with <3
()-->()
(A)-->(B)
(A)-[:LOVES]->(B)
(C)<--(A)-->(B)-->(C)
A-->B-->C, A-->C
Cypher reads
START <lookup> (optional)
MATCH <pattern>
WHERE <filtering>
RETURN <expression>
Cypher reads
MATCH (homer:HUMAN)-[:LOVES_EATING]->(doughnut:FOOD)
WHERE homer.name = "Homer Simpson"
AND doughnut.brand = "Fattylicious!"
RETURN homer
Cypher reads
MATCH (sugg:CONTACT)-[:IN_CONTACT*2..10]-(me:CONTACT)
WHERE me.name = "Florent Biville"
AND me <> sugg
RETURN me, sugg
Cypher reads
RULES OF THUMB
● MATCH for results
● use WHERE to filter (WHERE a-[:F]->b or NOT(a-[:F]->b))
● favour parameters over literals (exec. plan reuse)
● javacodegeeks.com: “optimizing Neo4j Cypher Queries”
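Why parameters help with execution plan reuse can be shown with a toy cache keyed by query text. This is a deliberately simplified model, not Neo4J's actual implementation: the class, the "plan" strings and the counter are all invented for the illustration. The {name} placeholder is the Cypher parameter syntax of that era.

```java
import java.util.HashMap;
import java.util.Map;

class PlanCacheSketch {
    private final Map<String, String> plans = new HashMap<>();
    int compilations = 0;

    // Returns the cached "plan" for this exact query text, compiling it on first sight.
    String planFor(String queryText) {
        String plan = plans.get(queryText);
        if (plan == null) {
            compilations++;
            plan = "plan#" + compilations; // stand-in for an expensive compilation
            plans.put(queryText, plan);
        }
        return plan;
    }

    public static void main(String[] args) {
        // Literals: every distinct value yields a distinct query text, hence a new plan.
        PlanCacheSketch literals = new PlanCacheSketch();
        for (String name : new String[] {"Homer", "Marge", "Bart"}) {
            literals.planFor("MATCH (c:CHARACTER) WHERE c.name = \"" + name + "\" RETURN c");
        }
        // Parameters: the text never changes, so one plan serves all values.
        PlanCacheSketch parameters = new PlanCacheSketch();
        for (int i = 0; i < 3; i++) {
            parameters.planFor("MATCH (c:CHARACTER) WHERE c.name = {name} RETURN c");
        }
        System.out.println(literals.compilations + " vs " + parameters.compilations); // 3 vs 1
    }
}
```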
Cypher writes
CREATE (UNIQUE) <expression>
MERGE <expression>
Cypher writes
CREATE (homer:HUMAN:DAD {name: "Homer Simpson"})
RETURN homer
Cypher writes
START homer = node:characters("name:Hom*")
MATCH (d:JUNK:FOOD)
WHERE d.brand = "Fattylicious!"
CREATE (homer)-[luv:LOVES_EATING {quantity:∞}]->(d)
RETURN luv
Cypher writes
MERGE (keanu:ACTOR {name:'Keanu Reeves'})
ON CREATE keanu
SET keanu.created = timestamp()
ON MATCH keanu
SET keanu.lastSeen = timestamp()
RETURN keanu
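The MERGE query above behaves like a get-or-create with two hooks. A rough plain-Java analogy, with an in-memory map standing in for the graph (the class and its methods are hypothetical and only mirror the ON CREATE / ON MATCH branches):

```java
import java.util.HashMap;
import java.util.Map;

class MergeSketch {
    // One "node" per actor name; each node is just a property map here.
    private final Map<String, Map<String, Object>> actorsByName = new HashMap<>();

    Map<String, Object> mergeActor(String name, long now) {
        Map<String, Object> actor = actorsByName.get(name);
        if (actor == null) {
            // ON CREATE: no match, so create the node and stamp it
            actor = new HashMap<>();
            actor.put("name", name);
            actor.put("created", now);
            actorsByName.put(name, actor);
        } else {
            // ON MATCH: the node already exists, only update it
            actor.put("lastSeen", now);
        }
        return actor;
    }

    int size() {
        return actorsByName.size();
    }

    public static void main(String[] args) {
        MergeSketch graph = new MergeSketch();
        graph.mergeActor("Keanu Reeves", 1L);
        Map<String, Object> keanu = graph.mergeActor("Keanu Reeves", 2L);
        System.out.println(graph.size() + " node, lastSeen=" + keanu.get("lastSeen"));
    }
}
```

Running the same merge twice leaves a single node, which is what makes MERGE convenient for incremental, repeatable imports.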
Cypher - I want moaaar
Declarative power
Super nice syntax
Evolutionary design with MERGE!
http://console.neo4j.org to try it out!
Cypher will be the #1 way to query data!
OBJECT-GRAPH MAPPING
With...
Spring Data
History ~2010
Rod Johnson, Scala last poet
Emil Eifrem, Neo Tech. founder & CEO
Spring Data
Familiar model for Spring apps
THIN common layer
Embraces diversity: MongoDB, Redis, Neo4J, ElasticSearch…
Current version: 2.3.1.RELEASE
Vanilla Neo4J repositories with Spring
@Repository
public class BranchRepository {
    @Autowired
    private GraphDatabaseService graphDB; // not shown on the slide; assumed to be injected by Spring

    public Relationship createBranch(Node p, Node c, Map<String,?> props) {
        try (Transaction transaction = graphDB.beginTx()) {
            Relationship relationship = p.createRelationshipTo(
                c,
                DynamicRelationshipType.withName("HAS_BRANCH")
            );
            for (Entry<String,?> entry : props.entrySet()) {
                relationship.setProperty(entry.getKey(), entry.getValue());
            }
            transaction.success();
            return relationship;
        }
    }
}
Spring Data Neo4J repositories
public interface BranchRepository extends GraphRepository<Branch> {
    // look ma! no code!
}
Moaaar Spring Data Neo4J repositories
public interface BranchRepository extends GraphRepository<Branch> {
    Iterable<Branch> findByNameLike(String name);

    @Query("MATCH (p:PROJECT)-[b:HAS_BRANCH]->(c:COMMIT) RETURN b")
    Page<Branch> lookMaIveGotPages();

    Branch findByNameAndCommitIdentifierLike(String name, String commit);
}
Cool things
● boilerplate methods already provided
● you declare methods following a naming convention, Spring Data Neo4J generates the right implementation for ya!
● YOU EXPOSE YOUR DOMAIN, no Nodes, no Relationships!
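To give a feel for how a naming convention can drive query generation, here is a toy parser in plain Java. It is not Spring Data's actual implementation: it only understands the findBy prefix, the And separator and a Like suffix, and it breaks on property names that themselves contain "And". All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class DerivedQuerySketch {
    // Turns e.g. "findByNameAndCommitIdentifierLike" into
    // ["name =", "commitIdentifier LIKE"].
    static List<String> conditions(String methodName) {
        if (!methodName.startsWith("findBy")) {
            throw new IllegalArgumentException("not a finder: " + methodName);
        }
        List<String> conditions = new ArrayList<>();
        for (String part : methodName.substring("findBy".length()).split("And")) {
            String operator = "=";
            if (part.endsWith("Like")) {
                operator = "LIKE";
                part = part.substring(0, part.length() - "Like".length());
            }
            // PropertyName -> propertyName
            String property = Character.toLowerCase(part.charAt(0)) + part.substring(1);
            conditions.add(property + " " + operator);
        }
        return conditions;
    }

    public static void main(String[] args) {
        System.out.println(conditions("findByNameLike"));
        System.out.println(conditions("findByNameAndCommitIdentifierLike"));
    }
}
```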
Spring Data Neo4J node entities
@NodeEntity
public class Person {
    @GraphId
    private Long id;
    @Indexed(indexName = "people", indexType = IndexType.FULLTEXT)
    private String name;
    @RelatedTo(type = "OWNS", enforceTargetType = true)
    private Car car;
    @RelatedToVia(type = "FRIEND_OF", direction = Direction.INCOMING)
    private Iterable<Friendship> friendships;
    @GraphTraversal(traversal = PeopleTraversalBuilder.class,
                    elementClass = Person.class, params = "persons")
    private Iterable<Person> people;
}
Spring Data Neo4J relationship entities
@RelationshipEntity(type = "FRIEND_OF")
public class Friendship {
    @StartNode
    private Person person;
    @EndNode
    private Dog humansBestFriend;
    @GraphProperty /* optional here ;-) */
    private Date since;
    /**
     * moaaaaar properties
     */
}
Neo4jTemplate
Geospatial queries
Cross-store support
Dynamic relationships
“Advanced” mapping
And much more
Conclusion
So much to talk about, so little time
● moaaar Cypher
● REST
○ standard API
○ unmanaged extensions
○ streaming
● Tinkerpop abstractions, http://www.tinkerpop.com/
● dataviz
○ auto: http://linkurio.us/, Neoclipse, Gephi
○ custom: d3.js, sigma.js…
● NeoAAS: http://www.graphenedb.com/, Heroku
● misc.: backup, batch-import, JDBC drivers
And one more thing
AssertJ-Neo4J 1.0 is coming soon!
Fluent test assertions for Neo4J
https://github.com/joel-costigliola/assertj-neo4j
ANY QUESTIONS?
www.lateral-thoughts.com
Contact me