Post on 05-Dec-2014
PyCon India 2014• • created by Sonal Raj •
Neo4j and Python: Playing with graph data
Graph Everything
Sonal Raj
The Plan for today
Step One: Graphs and NOSQL
Step Two: Neo4j and Cypher
Step Three: Py2neo and REST
Step Four: Use Cases
Once upon a time..
• Relational databases ruled the earth . .
• Data was stored in Tables, Rows and Columns
• Connections using Primary keys, Foreign keys . .
• That’s all that is relational about them
• No on-the-fly structural (schema) changes
• Horrible for Interconnected data ( joins, really? )
In the NOSQL Space
Elastic scaling – Scale out, not up
Big data, Transaction Friendly
Economical, can run on commodity hardware
End of the DBA rule
Flexible Data models
Graph Trivia
Where are Graphs . . .
Some Graphs we overlook . .
Apart from that
• Fraud Analyses
• Investment securities & debt analysis
• Recommendation Engines
• Impact Analysis in networks
So, Why Graphs ?
• Increasing Connectivity of Data
• Increasing Semi-Structuredness
• Rising Complexity
Seven Bridges of Königsberg
Leonhard Euler in 1735
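Euler's argument about the bridges can be checked in a few lines. What follows is a minimal sketch in plain Python, not something from the slides. It assumes the usual layout of the puzzle, with four land masses labelled A to D (labels chosen here, A being the central island) and seven bridges between them. A walk that crosses every bridge exactly once can exist in a connected graph only if zero or two land masses touch an odd number of bridges.

```python
from collections import Counter

# The seven bridges as (land, land) pairs; A is the central island.
# The labels A-D are illustrative and do not come from the slides.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def odd_degree_nodes(edges):
    """Return the nodes touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(n for n, d in degree.items() if d % 2 == 1)


def has_euler_walk(edges):
    """Euler's condition for a connected multigraph: 0 or 2 odd-degree nodes."""
    return len(odd_degree_nodes(edges)) in (0, 2)


odd = odd_degree_nodes(BRIDGES)
print(odd, has_euler_walk(BRIDGES))
```

All four land masses have odd degree here, so the condition fails and no such walk exists.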
Property Graphs
- Has nodes
- Has properties for each node
- Has Relationships
- Has properties for each relationship
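The four points above can be pictured as two small record types. This is only an in-memory sketch in plain Python to make the model concrete. The class names `Node` and `Relationship` and the `film` property are assumptions made for illustration, and this is not how Neo4j or py2neo represent data internally.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Arbitrary key/value properties attached to the node.
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node   # node the relationship leaves from
    type: str     # relationship type, e.g. "PLAYS"
    end: Node     # node the relationship points to
    # Relationships carry their own properties too.
    properties: dict = field(default_factory=dict)


bruce = Node({"name": "Bruce Willis"})
john = Node({"name": "John McClane"})
# The "film" property is a made-up example value.
plays = Relationship(bruce, "PLAYS", john, {"film": "Die Hard"})

print(plays.start.properties["name"], plays.type, plays.end.properties["name"])
```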
Building Blocks
Nodes
Relationships
Labels
Graph Database
Properties
Data Models
Native Graphs
Inherently store data as nodes and relationships.
The Other ones . . .
Data stored in tables, joins and aggregates to simulate a graph
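The difference between the two models shows up as soon as a query needs more than one hop. The sketch below is an illustration in plain Python using the standard `sqlite3` module. The table names, column names and sample people are assumptions, and the dictionary is only an analogy for native graph storage, not Neo4j's actual storage format. It finds friends of friends of Alice in both styles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE knows (src INTEGER, dst INTEGER);
    INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO knows VALUES (1, 2), (2, 3);
""")

# Tables: every extra hop costs another self-join on the edge table.
rows = conn.execute("""
    SELECT p.name
    FROM knows k1
    JOIN knows k2 ON k2.src = k1.dst
    JOIN person p ON p.id = k2.dst
    WHERE k1.src = 1
""").fetchall()
via_joins = [name for (name,) in rows]

# Adjacency: each node holds direct references to its neighbours,
# so a hop is a lookup rather than a join.
knows = {"Alice": ["Bob"], "Bob": ["Carol"], "Carol": []}
via_adjacency = [fof for friend in knows["Alice"] for fof in knows[friend]]

print(via_joins, via_adjacency)
```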
Why Neo4j ?
Schema-less property graph
Handles complex connected data efficiently
Fully ACID Transactions
Highly Scalable, High Availability Clusters
REST API for servers. Can be embedded in applications on the JVM.
Cypher – a declarative querying solution
Graph DB with good native Python bindings . .
Cypher in action
• Highly expressive query language
• Cares about ‘what’ rather than ‘how’ to retrieve from the graph.
• Uses pattern matching expressions.
• To make life easy for some, it is inspired by SQL.
(1)-[:label]-(2)
START n=node(1), m=node(2)
MATCH n-[r:label]-m
RETURN r
Cypher in action
Create
CREATE (n:Person { name : 'Chuck Norris', title : 'Analyst' })
RETURN n
MATCH (a:Person),(b:Person)
WHERE a.name = 'Chuck' AND b.name = 'Rajani'
CREATE (a)-[r:RELTYPE { name : 'cannot_find' }]->(b)
RETURN r
Read
MATCH (n) RETURN n // everything is returned
MATCH (n:Label) RETURN n // all nodes with a specific label
MATCH (Titanic { title:'Titanic' })<-[:ACTED_IN|:DIRECTED]-(person)
RETURN person
Cypher in action
Update
MATCH (n { name: 'Andres' })
SET n.surname = 'Taylor'
RETURN n
MATCH (peter { name: 'Peter' })
SET peter += { hungry: TRUE , position: 'Entrepreneur' }
Delete
MATCH (n { name: 'Peter' })
REMOVE n.title
REMOVE n:German
RETURN n
SET n.name = NULL // setting a property to NULL also removes it
REST in peace !!
Create
POST http://localhost:7474/db/data/node
{
  "foo" : "bar"
}
POST http://localhost:7474/db/data/node/1/relationships
{
  "to" : "http://localhost:7474/db/data/node/10",
  "type" : "LOVES",
  "data" : {
    "foo" : "bar"
  }
}
POST http://localhost:7474/db/data/schema/index/person
{
  "property_keys" : [ "name" ]
}
REST in peace !!
Read
GET http://localhost:7474/db/data/node/144
GET http://localhost:7474/db/data/relationship/65
GET http://localhost:7474/db/data/relationship/61/properties
GET http://localhost:7474/db/data/schema/index/user
Update
PUT http://localhost:7474/db/data/relationship/66/properties
{
  "happy" : false
}
PUT http://localhost:7474/db/data/relationship/60/properties/cost
"deadly"
Delete
DELETE http://localhost:7474/db/data/node/308
DELETE http://localhost:7474/db/data/relationship/58
DELETE http://localhost:7474/db/data/schema/index/SomeLabel/name
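The REST calls above need nothing beyond an HTTP client. As a rough sketch, here is how a few of them could be built with the Python 3 standard library. The helper name `rest_request` is hypothetical, the URLs and bodies are the ones from the slides, and the requests are only constructed here, not sent, so no server is needed to try it.

```python
import json
from urllib.request import Request

BASE = "http://localhost:7474/db/data/"  # same base URL as on the slides


def rest_request(method, path, body=None):
    """Build (but do not send) a JSON request against the REST API."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    return Request(
        BASE + path,
        data=data,
        headers={"Accept": "application/json",
                 "Content-Type": "application/json"},
        method=method,
    )


create_node = rest_request("POST", "node", {"foo": "bar"})
create_rel = rest_request(
    "POST", "node/1/relationships",
    {"to": BASE + "node/10", "type": "LOVES", "data": {"foo": "bar"}},
)
set_cost = rest_request("PUT", "relationship/60/properties/cost", "deadly")
delete_node = rest_request("DELETE", "node/308")

# urllib.request.urlopen(create_node) would send it to a running server.
for req in (create_node, create_rel, set_cost, delete_node):
    print(req.get_method(), req.full_url)
```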
Beauty of py2neo
For the pythonistas
As simple as that!
from py2neo import neo4j, node, rel
graph_db = neo4j.GraphDatabaseService("http://localhost:7474/db/data/")
# Nodes are numbered 0-4 in argument order; rel() refers to them by position.
die_hard = graph_db.create(
    node(name="Bruce Willis"),
    node(name="John McClane"),
    node(name="Alan Rickman"),
    node(name="Hans Gruber"),
    node(name="Nakatomi Plaza"),
    rel(0, "PLAYS", 1),
    rel(2, "PLAYS", 3),
    rel(1, "VISITS", 4),
    rel(3, "STEALS_FROM", 4),
    rel(1, "KILLS", 3),
)
For the pythonistas
graphdb
• clear()
• create(*abstracts)
• delete(*entities)
• delete_index(content_type, index_name)
• find(label, property_key=None, property_value=None)
• get_index(content_type, index_name)
• get_indexed_node(index_name, key, value)
• get_indexed_relationship(index_name, key, value)
• get_properties(*entities)
• match(start_node=None, rel_type=None, end_node=None, bidirectional=False, limit=None)
• match_one(start_node=None, rel_type=None, end_node=None, bidirectional=False)
• node(id_)
• get_or_create_index(content_type, index_name, config=None)
• get_or_create_indexed_node(index_name, key, value, properties=None)
Complexity Handling
“ A graph database without traversals is just a persistent graph ”
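To make "traversal" concrete, here is a small breadth-first search in plain Python. It is a sketch over an in-memory dictionary with made-up names, not a py2neo or Neo4j API, and the function name `shortest_path` is chosen for illustration. It starts at one node and follows relationships outward until it reaches the goal.

```python
from collections import deque

# Who KNOWS whom; a stand-in for relationships stored in the database.
KNOWS = {
    "Alice": ["Bob"],
    "Bob": ["Carol"],
    "Carol": ["Doctor"],
    "Doctor": ["Easter"],
    "Easter": [],
}


def shortest_path(graph, start, goal):
    """Breadth-first traversal; returns one shortest path as a node list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


print(shortest_path(KNOWS, "Alice", "Easter"))
```

The relationships here are directed, so the search finds a path from Alice to Easter but none in the opposite direction.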
Paths with py2neo
#Create Paths
from py2neo import neo4j, node
a, b, c = node(name="Alice"), node(name="Bob"), node(name="Carol")
abc = neo4j.Path(a, "KNOWS", b, "KNOWS", c)
d, e = node(name="Doctor"), node(name="Easter")
de = neo4j.Path(d, "KNOWS", e)
#Join paths
abcde = neo4j.Path.join(abc, "KNOWS", de)
#commit to the db (graph_db is the GraphDatabaseService created earlier)
abcde.get_or_create(graph_db)
Schema, Indices with py2neo
#The classes
py2neo.neo4j.Schema
py2neo.neo4j.Index
#Schema methods
create_index(label, property_key)
drop_index(label, property_key)
get_indexed_property_keys(label)
#Index methods
add_if_none(key, value, entity)
#Apache Lucene Query
people = graph_db.get_or_create_index(neo4j.Node, "People")
s_people = people.query("family_name:S*")
#Index content types
neo4j.Node
neo4j.Relationship
Cypher with py2neo
#Create transaction object
from py2neo import cypher
session = cypher.Session("http://localhost:7474/")
tx = session.create_transaction()
#Append statements, then execute or commit
tx.append("some cypher query")
tx.append("some cypher query")
tx.execute()
tx.append("some cypher query")
tx.commit()
#The classical way
from py2neo import neo4j
graph_db = neo4j.GraphDatabaseService()
query = neo4j.CypherQuery(graph_db, "your cypher query")
query.execute()
#query.stream()
Command Line neotool
#Syntax of operation
neotool [<option>] <command> <args>
Or python -m py2neo.tool ..
#Some serious examples
neotool clear
neotool cypher "start n=node(1) return n, n.name?"
neotool cypher-csv "start n=node(1) return n.name, n.age?"
neotool cypher-tsv "start n=node(1) return n.name, n.age?"
#Guess what, you can also access the shell
neotool shell
Neo4j level 2
• Batch Inserter
• High Availability
• Built-in online backup tools
• HTTPS support
Neo4j Framework
• GraphUnit, for unit testing neo4j
• Libraries for performance and API testing
• Batch Transaction tools
• Transaction Event tools
• Some other utilities . .
Use Cases
Recommendation Engines: Complex pattern matching
Social Network Data: Many entities, highly interconnected
Map Data: Traversals and routing
Python Family: Relatives for Neo4j
Neo4
Thank You
Neo appreciates your patience.
• Sonal Raj
• http://www.sonalraj.com/
• http://github.com/sonal-raj/
• sonal@enfoss.org