Neo4j_02262015

NoSQL – Neo4j
Presented by: Venkat Sirlam

Copyright © 2013 Accenture All Rights Reserved. Accenture, its logo, and Accenture High Performance Delivered are trademarks of Accenture.
Copyright © 2015 Accenture All Rights Reserved. Confidential — For Company Internal Use Only.


Agenda


NoSQL Solutions

NoSQL Categories – Graph Stores and an example

NoSQL Overview: Graph-based Databases

Neo4j - Graph Database

Cypher Query Language

Installation & Demo


NoSQL Solutions

The current NoSQL world fits into 4 basic categories:

• Key-Value Pair Databases: Store data for retrieval by primary key and corresponding content. This is the simplest format (Dynamo, Redis, Coherence)

• Column-oriented Databases: Store data in extendable columns without pre-defined structured tables. Columns are grouped together by related data (HBase, Accumulo, Cassandra)

• Document-based Databases: Store data in collections of documents or data objects (MongoDB, CouchDB, RavenDB)

• Graph-based Databases: Store data structured in graphs rather than in tables. A “graph” is a linked data structure that allows a more agile and rapid style of modeling / development (Neo4j)

Graph databases are built with nodes, relationships between nodes and the properties of nodes. Instead of tables of rows and columns and the rigid structure of SQL, a flexible graph model is used which can scale across many machines.


NoSQL Solutions Cont.


NoSQL Categories – Graph Stores

• Graph Stores use graph structures with nodes, edges and properties to represent and store data
  – Nodes represent entities such as people, businesses or accounts
  – Properties are pertinent information that relate to nodes
  – Edges are the lines that connect nodes to nodes or nodes to properties and they represent the relationship between the two. Most of the important information is really stored in the edges. Meaningful patterns emerge when one examines the connections and interconnections of nodes, properties, and edges

• This kind of database is designed for data whose relations are well represented as a graph (elements interconnected with an undetermined number of relations between them). The kind of data could be social relations, public transport links, road maps or network topologies

• Every element contains a direct pointer to its adjacent element and no index lookups are necessary
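
The "direct pointer" idea in the last bullet can be illustrated with a small in-memory sketch. This is not how Neo4j stores data internally; the Node and Edge classes and the connect and neighbours helpers are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Hypothetical in-memory node: properties plus direct references to its edges."""
    properties: dict
    edges: list = field(default_factory=list)  # direct pointers, no index involved


@dataclass(eq=False)
class Edge:
    """Hypothetical edge: a label plus direct references to both end nodes."""
    label: str
    start: Node
    end: Node


def connect(start, label, end):
    """Create an edge and register it on both end nodes."""
    edge = Edge(label, start, end)
    start.edges.append(edge)
    end.edges.append(edge)
    return edge


def neighbours(node):
    """Follow the node's own edge pointers; no global lookup table is consulted."""
    return [e.end if e.start is node else e.start for e in node.edges]


alice = Node({"name": "Alice"})
bob = Node({"name": "Bob"})
acme = Node({"name": "Acme"})
connect(alice, "KNOWS", bob)
connect(alice, "WORKS_AT", acme)

neighbour_names = [n.properties["name"] for n in neighbours(alice)]
```

Finding Alice's neighbours only touches the edges hanging off her own node, so the cost depends on how many edges she has rather than on the size of the whole graph.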


NoSQL Categories – Graph Stores (example)


NoSQL Overview: Graph-based Databases

• What is a Graph Database?

A graph database stores data in a graph, the most generic of data structures, capable of elegantly representing any kind of data in a highly accessible way. Let’s follow along some graphs, using them to express graph concepts. We’ll “read” a graph by following arrows around the diagram to form sentences.

Graph-based Model: A Graph contains Nodes and Relationships
“A Graph —records data in→ Nodes —which have→ Properties”


NoSQL Overview: Graph-based Databases Cont.

Relationships organize the Graph
“Nodes —are organized by→ Relationships —which also have→ Properties”

Query a Graph with a Traversal
“A Traversal —navigates→ a Graph; it —identifies→ Paths —which order→ Nodes”

Indexes look up Nodes or Relationships
“An Index —maps from→ Properties —to either→ Nodes or Relationships”
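
The traversal and index sentences on this slide can be made concrete with a toy sketch. The graph data, the name_index dictionary and the traverse function are all hypothetical; a breadth-first walk is assumed here as one possible traversal order, not as the one Neo4j uses.

```python
from collections import deque

# Hypothetical toy graph: node id -> properties, plus directed relationships.
nodes = {
    1: {"name": "Alice"},
    2: {"name": "Bob"},
    3: {"name": "Carol"},
}
relationships = [(1, "KNOWS", 2), (2, "KNOWS", 3)]

# An index maps from a property value to the node ids that carry it.
name_index = {}
for node_id, props in nodes.items():
    name_index.setdefault(props["name"], []).append(node_id)


def traverse(start_id):
    """Breadth-first traversal; yields every path as an ordered list of node ids."""
    queue = deque([[start_id]])
    while queue:
        path = queue.popleft()
        yield path
        for src, _rel_type, dst in relationships:
            # Extend the path along outgoing relationships, never revisiting a node.
            if src == path[-1] and dst not in path:
                queue.append(path + [dst])


# The index finds the starting node; the traversal then identifies paths from it.
start = name_index["Alice"][0]
paths = list(traverse(start))
```

Each yielded path is an ordered list of nodes, matching the "Paths which order Nodes" reading of the diagram.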


Neo4j - Graph Database

• Highlights
As a robust, scalable and high-performance database, Neo4j is suitable for full enterprise deployment, or a subset of the full server can be used in lightweight projects.

It features:
• true ACID transactions
• high availability
• scales to billions of nodes and relationships
• high speed querying through traversals

Proper ACID behavior is the foundation of data reliability. Neo4j enforces that all operations that modify data occur within a transaction, guaranteeing consistent data. This robustness extends from single instance embedded graphs to multi-server high availability installations.


Neo4j - Graph Database Cont.

“A Graph Database —manages a→ Graph and —also manages related→ Indexes”

Neo4j is a commercially supported open-source graph database. It was designed and built from the ground-up to be a reliable database, optimized for graph structures instead of tables. Working with Neo4j, your application gets all the expressiveness of a graph, with all the dependability you expect out of a database.


Neo4j - Graph Database Cont.

• Comparing Database Models - A Graph Database transforms an RDBMS

[Figures: RDBMS; Graph Database as RDBMS]


Neo4j - Graph Database Cont.

• Nodes
The fundamental units that form a graph are nodes and relationships. In Neo4j, both nodes and relationships can contain properties. Nodes are often used to represent entities, but depending on the domain, relationships may be used for that purpose as well.

Let’s start out with a really simple graph, containing only a single node with one property:


Neo4j - Graph Database Cont.

• Relationships
The fundamental units that form a graph are nodes and relationships. In Neo4j, both nodes and relationships can contain properties. Relationships between nodes are a key part of a graph database. They allow for finding related data. Just like nodes, relationships can have properties.

A relationship connects two nodes, and is guaranteed to have valid start and end nodes.

As relationships are always directed, they can be viewed as outgoing or incoming relative to a node, which is useful when traversing the graph. Relationships are equally well traversed in either direction.
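
A minimal sketch of these rules, assuming a toy in-memory model: the Graph and Relationship classes and their method names are hypothetical and are not part of any Neo4j API. It shows a relationship being rejected without valid end nodes, and the same relationship being seen as outgoing from one node and incoming to the other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Hypothetical directed relationship with a type, two end nodes and properties."""
    start: str
    rel_type: str
    end: str
    properties: tuple = ()


class Graph:
    def __init__(self):
        self.node_ids = set()
        self.relationships = []

    def add_node(self, node_id):
        self.node_ids.add(node_id)

    def relate(self, start, rel_type, end, **properties):
        # A relationship is only accepted when both end nodes exist.
        if start not in self.node_ids or end not in self.node_ids:
            raise ValueError("relationship needs valid start and end nodes")
        rel = Relationship(start, rel_type, end, tuple(sorted(properties.items())))
        self.relationships.append(rel)
        return rel

    def outgoing(self, node_id):
        """Relationships that start at the given node."""
        return [r for r in self.relationships if r.start == node_id]

    def incoming(self, node_id):
        """Relationships that end at the given node."""
        return [r for r in self.relationships if r.end == node_id]


graph = Graph()
graph.add_node("alice")
graph.add_node("bob")
knows = graph.relate("alice", "KNOWS", "bob", since=2010)
```

Here graph.outgoing("alice") and graph.incoming("bob") both return the same relationship, which is what lets a traversal follow it from either end.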


Neo4j - Graph Database Cont.


Neo4j - Graph Database Cont.

• Properties
Both nodes and relationships can have properties. Properties are key-value pairs where the key is a string. Property values can be either a primitive or an array of one primitive type. For example, String, int and int[] values are valid for properties.
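
The property rule can be expressed as a small check. This is a sketch only: Python's bool, int, float and str are assumed as stand-ins for the primitive types, a Python list stands in for an array, and is_valid_property is a hypothetical helper name.

```python
PRIMITIVE_TYPES = (bool, int, float, str)  # assumed Python stand-ins for the primitives


def is_valid_property(key, value):
    """Return True if (key, value) fits the rule: string key, primitive or uniform array."""
    if not isinstance(key, str):
        return False
    if isinstance(value, PRIMITIVE_TYPES):
        return True
    if isinstance(value, list):
        # An array must hold values of exactly one primitive type.
        element_types = {type(v) for v in value}
        return len(element_types) <= 1 and all(t in PRIMITIVE_TYPES for t in element_types)
    return False
```

For example, is_valid_property("scores", [1, 2, 3]) is accepted, while a mixed list such as [1, "a"] or a nested dictionary is rejected.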


Neo4j - Graph Database Cont.

• Paths
A path is one or more nodes with connecting relationships, typically retrieved as a query or traversal result.
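
As a rough sketch of that definition, a path can be modelled as an ordered node list plus the relationships between consecutive nodes. The make_path helper and its dictionary result are hypothetical, and relationships are represented as plain (start, type, end) tuples.

```python
def make_path(nodes, relationships):
    """Build a path from ordered nodes and the (start, type, end) relationships between them."""
    if not nodes:
        raise ValueError("a path needs at least one node")
    if len(relationships) != len(nodes) - 1:
        raise ValueError("need exactly one relationship between consecutive nodes")
    for (a, b), (start, _rel_type, end) in zip(zip(nodes, nodes[1:]), relationships):
        # Direction is ignored here: a relationship may be walked either way.
        if {a, b} != {start, end}:
            raise ValueError(f"relationship {start}->{end} does not connect {a} and {b}")
    return {
        "nodes": list(nodes),
        "relationships": list(relationships),
        "length": len(relationships),
    }


single = make_path(["a"], [])
longer = make_path(["a", "b", "c"], [("a", "KNOWS", "b"), ("c", "KNOWS", "b")])
```

A single node is already a path (of length 0 in this sketch), and the second example walks its last relationship against its direction.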


Cypher Query Language

Cypher is a declarative graph query language that allows for expressive and efficient querying and updating of the graph store without having to write traversals through the graph structure in code. Cypher is still growing and maturing, and that means that there probably will be breaking syntax changes.

The query language is comprised of several distinct clauses.

• START: Starting points in the graph, obtained via index lookups or by element IDs.
• MATCH: The graph pattern to match, bound to the starting points in START.
• WHERE: Filtering criteria.
• RETURN: What to return.
• CREATE: Creates nodes and relationships.
• DELETE: Removes nodes, relationships and properties.
• SET: Set values to properties.
• FOREACH: Performs updating actions once per element in a list.
• WITH: Divides a query into multiple, distinct parts.


Cypher Query Language Cont.

• Create
Creating graph elements (nodes and relationships) is done with CREATE. Typical uses: create a single node, create a single node and set properties, return the created node, create a relationship between two nodes, create a relationship and set properties, create a full path, etc.

• Operators - Operators in Cypher are of three different varieties: mathematical, equality and relationships.

• Expressions can be
  • A numeric literal (integer or double): 13, 40000, 3.14
  • A string literal: "Hello", 'World'.
  • A boolean literal: true, false, TRUE, FALSE.
  • An identifier: n, x, rel, myFancyIdentifier, `A name with weird stuff in it[]!`.
  • A property: n.prop, x.prop, rel.thisProperty, myFancyIdentifier.`(weird property name)`.
  • A nullable property: it’s a property, with a question mark or exclamation mark: n.prop?, rel.thisProperty!.
  • A parameter: {param}, {0}
  • A collection of expressions: ["a", "b"], [1,2,3], ["a", 2, n.property, {param}], [ ].
  • A function call: length(p), nodes(p).
  • An aggregate function: avg(x.prop), count(*).
  • Relationship types: :REL_TYPE, :`REL TYPE`, :REL1|REL2.
  • A path-pattern: a-->()<--b.
  • A predicate expression is an expression that returns true or false: a.prop = "Hello", length(p) > 10, has(a.name)

• Parameters - Cypher supports querying with parameters. This means developers do not have to do string building to create a query, and it also makes caching of execution plans much easier for Cypher.
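
Why parameters help plan caching can be sketched with a toy cache keyed by query text. Nothing here talks to a database: plan_cache and run are hypothetical stand-ins, and the query strings (written in the legacy {param} placeholder style shown above) are only used as dictionary keys and never executed.

```python
# Hypothetical stand-in for a query engine's plan cache: query text -> compiled plan.
plan_cache = {}
compilations = 0


def run(query_text, params):
    """Look up (or 'compile') a plan by query text, then pair it with the parameter values."""
    global compilations
    if query_text not in plan_cache:
        compilations += 1
        plan_cache[query_text] = f"plan#{compilations}"
    return plan_cache[query_text], dict(params)


# String building: every distinct value yields a new query text, so a new plan.
for node_id in [1, 2, 3]:
    run("START n=node(" + str(node_id) + ") RETURN n", {})
built_compilations = compilations

# Parameters: the text stays constant, so one plan is compiled and then reused.
for node_id in [1, 2, 3]:
    run("START n=node({id}) RETURN n", {"id": node_id})
parameterised_compilations = compilations - built_compilations
```

In this sketch the string-built variant compiles once per distinct value, while the parameterised variant compiles once in total.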


Cypher Query Language Cont.

• Identifiers
When you reference parts of the pattern, you do so by naming them. The names you give the different parts are called identifiers.
In this example:
START n=node(1) MATCH n-->b RETURN b
The identifiers are n and b.
Identifier names are case sensitive, and can contain underscores and alphanumeric characters (a-z, 0-9), but must start with a letter. If other characters are needed, you can quote the identifier using backquote (`) signs.
The same rules apply to property names.

• Comments
To add comments to your queries, use double slash. Examples:
START n=node(1) RETURN n //This is an end of line comment
START n=node(1)
//This is a whole line comment
RETURN n
START n=node(1) WHERE n.property = "//This is NOT a comment" RETURN n

• Updating the graph
Cypher can be used for both querying and updating your graph.
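
The identifier naming rule on this slide can be sketched as a small helper that decides when backquotes are needed. The regular expression is an assumed reading of the rule (a letter first, then letters, digits or underscores), and quote_identifier is a hypothetical name; how to escape a backquote inside a name is not covered by the slide, so the sketch refuses such names.

```python
import re

# Assumed reading of the rule: a letter first, then letters, digits or underscores.
PLAIN_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def quote_identifier(name):
    """Return the name unchanged if it is a plain identifier, otherwise wrap it in backquotes."""
    if PLAIN_IDENTIFIER.fullmatch(name):
        return name
    if "`" in name:
        # Escaping rules for embedded backquotes are not covered by the slide.
        raise ValueError("name contains a backquote")
    return f"`{name}`"


plain = quote_identifier("myFancyIdentifier")
quoted = quote_identifier("A name with weird stuff in it[]!")
```

The first call returns the name unchanged; the second returns it wrapped in backquotes.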


Cypher Query Language Cont.

• Transactions
Any query that updates the graph will run in a transaction. An updating query will always either fully succeed, or not succeed at all.
Cypher will either create a new transaction and commit it once the query finishes, or, if a transaction already exists in the running context, the query will run inside it, and nothing will be persisted to disk until the transaction is successfully committed.
This can be used to have multiple queries be committed as a single transaction:
1. Open a transaction,
2. run multiple updating Cypher queries,
3. and commit all of them in one go.

• Patterns
Patterns are at the very core of Cypher, and are used in a lot of different places. Using patterns, you describe the shape of the data that you are looking for. Patterns are used in the MATCH clause. Path patterns are expressions. Since these expressions are collections, they can also be used as predicates (a non-empty collection signifies true). They are also used to CREATE/CREATE UNIQUE the graph. So, understanding patterns is important to be able to be effective with Cypher.

• Start
Every query describes a pattern, and in that pattern one can have multiple starting points. A starting point is a relationship or a node where a pattern is anchored. You can either introduce starting points by id, or by index lookups. Note that trying to use an index that doesn’t exist will throw an exception.
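
The all-or-nothing behaviour described under Transactions on this slide can be sketched with a toy in-memory store. ToyStore and its transaction method are hypothetical and do not mirror any Neo4j API; the committed dictionary stands in for what would be persisted to disk.

```python
from contextlib import contextmanager
import copy


class ToyStore:
    """Hypothetical in-memory store; 'committed' stands in for what is persisted to disk."""

    def __init__(self):
        self.committed = {}

    @contextmanager
    def transaction(self):
        # Updates go to a working copy; the committed state is only replaced on success.
        working = copy.deepcopy(self.committed)
        yield working
        self.committed = working


store = ToyStore()

# 1. open a transaction, 2. run several updates, 3. commit them in one go.
with store.transaction() as tx:
    tx["alice"] = {"name": "Alice"}
    tx["bob"] = {"name": "Bob"}

# A failure inside the block means nothing from that block is committed.
try:
    with store.transaction() as tx:
        tx["carol"] = {"name": "Carol"}
        raise RuntimeError("simulated failure before commit")
except RuntimeError:
    pass
```

After both blocks, store.committed holds Alice and Bob but not Carol, since the second block never reached its commit step.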


Cypher Query Language Cont.

• Match
In the MATCH clause, patterns are used a lot: related nodes, outgoing relationships, directed relationships and identifier, etc.

• Where
If you need filtering apart from the pattern of the data that you are looking for, you can add clauses in the WHERE part of the query.

• Return
In the RETURN part of your query, you define which parts of the pattern you are interested in. It can be nodes, relationships, or properties on these.

• Aggregation
To calculate aggregated data, Cypher offers aggregation, much like SQL’s GROUP BY. Aggregate functions take multiple input values and calculate an aggregated value from them. Examples are AVG, which calculates the average of multiple numeric values, and MIN, which finds the smallest numeric value in a set of values.
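
What grouping plus aggregate functions compute can be shown in plain Python. The rows below are made-up sample data, and the city grouping key is an assumption for illustration; the sketch mirrors the idea of AVG and MIN per group rather than any Cypher execution.

```python
from collections import defaultdict

# Hypothetical rows, as a query might produce them before aggregation.
rows = [
    {"city": "Oslo", "age": 30},
    {"city": "Oslo", "age": 40},
    {"city": "Malmo", "age": 25},
]

# Group the input values by key, much like GROUP BY.
groups = defaultdict(list)
for row in rows:
    groups[row["city"]].append(row["age"])

# Each aggregate takes the multiple values of a group and yields a single value.
aggregated = {
    city: {"avg": sum(ages) / len(ages), "min": min(ages), "count": len(ages)}
    for city, ages in groups.items()
}
```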


Cypher Query Language Cont.

• Order by
To sort the output, use the ORDER BY clause. Note that you cannot sort on nodes or relationships, just on properties on these.

• Limit
LIMIT enables the return of only subsets of the total result.

• Skip
SKIP enables the return of only subsets of the total result. By using SKIP, the result set will get trimmed from the top. Please note that no guarantees are made on the order of the result unless the query specifies the ORDER BY clause.

• With
The ability to chain queries together allows for powerful constructs. In Cypher, the WITH clause is used to pipe the result from one query to the next. WITH is also used to separate reading from updating of the graph. Every sub-query of a query must be either read-only or write-only.
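
How ORDER BY, SKIP and LIMIT on this slide combine can be sketched on a plain list of rows. The sample rows and the page helper are hypothetical; the point is the order of operations: sort on a property first, trim from the top, then cap the size.

```python
# Hypothetical result rows in no particular order.
rows = [
    {"name": "Carol", "age": 35},
    {"name": "Alice", "age": 30},
    {"name": "Dave", "age": 40},
    {"name": "Bob", "age": 25},
]


def page(rows, order_key, skip=0, limit=None):
    """Sort on a property, drop `skip` rows from the top, then keep at most `limit` rows."""
    ordered = sorted(rows, key=lambda r: r[order_key])
    trimmed = ordered[skip:]
    return trimmed if limit is None else trimmed[:limit]


names = [r["name"] for r in page(rows, "name", skip=1, limit=2)]
```

Without the sort step the slice would still return two rows, but which two would depend on the incoming order, which is the caveat the Skip bullet makes.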


Cypher Query Language Cont.

• Set
Updating properties on nodes and relationships is done with the SET clause. SET can also be used with maps from parameters.

• Delete
Removing graph elements (nodes, relationships and properties) is done with DELETE.

• Foreach
Collections and paths are key concepts in Cypher. To use them for updating data, you can use the FOREACH construct. It allows you to do updating commands on elements in a collection: a path, or a collection created by aggregation.

• Functions
Most functions in Cypher will return null if the input parameter is null. Here is a list of the functions in Cypher, separated into three different sections: Predicates, Scalar functions and Aggregated functions.

• Compatibility
Cypher is still changing rather rapidly. Parts of the changes are internal: we add new pattern matchers, aggregators and other optimizations, which hopefully makes your queries run faster.
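
The null rule under Functions on this slide can be illustrated with a small decorator. Python's None is assumed as a stand-in for null, and null_propagating and this toy length function are hypothetical; they only mimic the stated behaviour and are not Cypher's implementation.

```python
import functools


def null_propagating(func):
    """Wrap a one-argument function so that a None input yields None instead of an error."""
    @functools.wraps(func)
    def wrapper(arg):
        return None if arg is None else func(arg)
    return wrapper


@null_propagating
def length(collection):
    """Toy stand-in for a scalar function: the number of elements in a collection."""
    return len(collection)


known = length(["a", "b", "c"])
missing = length(None)
```

The second call returns None rather than raising, which is the behaviour the Functions bullet describes for most functions.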


Cypher Query Language Cont.

Demo
