Database Pro Power Days 2010 - Graph data in the cloud using .NET
Database Pro Power Days, 10/20/2010
Achim Friedland <[email protected]>
sones
19-20 October 2010, Nürnberg
www.databasepro-powerdays.de
Graph Data in the cloud using .NET
For 35 years information has been well-defined data within some tables
jailed in closed database silos.
The relational model and SQL have become much too limited for open
linked data (== graph data) and cloud requirements.
Applications cannot easily access, understand and process unknown relational data.
[Diagram: DB 1 with Application 1, DB 2 with Application 2, and two question marks]
1. Graph-Databases
The Property-Graph Model
Vertex: Alice (ID = 1, Age = 21)
Vertex: Bob (ID = 2, Age = 23)
Edge: Friends, between Alice and Bob
Edge-Properties: since = 2009/09/21, reason = classmates
Vertex-Properties: ID, Age
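The model above can be sketched in a few lines of C#. This is a minimal illustration, not the sones GraphDB API: the types Vertex, Edge and PropertyGraphExample are hypothetical, and the direction of the Friends edge (from Alice to Bob) is an assumption, since the slide does not state it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal types for illustration; not the sones GraphDB API.
public sealed class Vertex
{
    // Vertex-Properties: arbitrary key/value pairs.
    public Dictionary<string, object> Properties { get; } = new Dictionary<string, object>();

    // Outgoing edges are stored directly on the vertex.
    public List<Edge> OutEdges { get; } = new List<Edge>();
}

public sealed class Edge
{
    public string Label { get; }
    public Vertex Source { get; }
    public Vertex Target { get; }

    // Edge-Properties: arbitrary key/value pairs.
    public Dictionary<string, object> Properties { get; } = new Dictionary<string, object>();

    public Edge(string label, Vertex source, Vertex target)
    {
        Label = label;
        Source = source;
        Target = target;
    }
}

public static class PropertyGraphExample
{
    // Builds the Alice/Bob example and returns the Alice vertex.
    public static Vertex Build()
    {
        var alice = new Vertex();
        alice.Properties["Name"] = "Alice";
        alice.Properties["ID"] = 1;
        alice.Properties["Age"] = 21;

        var bob = new Vertex();
        bob.Properties["Name"] = "Bob";
        bob.Properties["ID"] = 2;
        bob.Properties["Age"] = 23;

        // Assumption: the edge points from Alice to Bob.
        var friends = new Edge("Friends", alice, bob);
        friends.Properties["since"] = new DateTime(2009, 9, 21);
        friends.Properties["reason"] = "classmates";
        alice.OutEdges.Add(friends);

        return alice;
    }
}
```

Calling PropertyGraphExample.Build() returns Alice; her single outgoing edge carries its own properties and leads directly to Bob.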
The Property-Graph Model

Vertex Alice:
  ID              1
  TYPE            Person
  REVISION        20101014…
  Name            Alice
  Age             21
  Boyfriend       (link to another vertex)
  Friends         (links to other vertices)
  FavColors       Red, Green
  Address.Street  1 Infinite Loop
  Address.Town    Cupertino

Linked vertices: Bob (ID = 2, Age = 23), Carol (ID = 3, Age = 20)

• Close to Object- and Document-Databases
• Direct linking without external indices
The Property-Graph Model using RDF-like semantics

Vertex Alice:
  Attribute       RDF-like name/type   Value
  ID              ID                   http://test.com/vertices/1
  TYPE            rdf:type             http://test.com/#person
  REVISION        sones:revId          20101014…
  Name            foaf:name            Alice
  Age             foaf:age             21
  Boyfriend       person               (link to another vertex)
  Friends         Set<person>          (links to other vertices)
  FavColors       List<string>         Red, Green
  Address         XML
  Address.Street  gn:street            1 Infinite Loop
  Address.Town    gn:town              Cupertino

Linked vertices: Bob (ID = 2, Age = 23), Carol (ID = 3, Age = 20)

+ Unambiguous identifiers
+ Named relations
+ Close to RDF molecules
Graph-Databases in a cloud
GraphDB

REST
• Vertices and edges are resources
• Access via e.g. http://test.com/vertices/[$id]
• Common CRUD operations (GET, POST, PUT…)

Hypermedia
• Representation must be "link-aware", e.g. XML+XLINK, ATOM, RDFa…
• Representation should be self-describing

+ Atomicity
+ Statelessness
+ Idempotence
+ Parallelism
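As a small sketch of the resource idea, the helper below builds the URI of a vertex resource and reports which HTTP methods are idempotent according to RFC 7231. The class VertexResource and its members are hypothetical names chosen for this example; they are not part of the sones REST interface.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of any sones API.
public static class VertexResource
{
    // Methods that RFC 7231 defines as idempotent (the safe methods plus PUT and DELETE).
    private static readonly HashSet<string> IdempotentMethods =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"
        };

    // Builds the resource URI of a vertex, e.g. http://test.com/vertices/1
    public static Uri UriFor(Uri baseUri, long vertexId)
    {
        return new Uri(baseUri, "vertices/" + vertexId);
    }

    // True if repeating the request has the same effect as sending it once.
    public static bool IsIdempotent(string httpMethod)
    {
        return IdempotentMethods.Contains(httpMethod);
    }
}
```

Because PUT is idempotent and POST is not, a client can safely retry a failed PUT on a vertex resource, while a retried POST may create a second vertex.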
2. The Object-relational Impedance Mismatch
Graph
Inflexible relational schemata
• Expensive ALTER TABLE operations
• Entity-Attribute-Value Model ↔ RDF
• No semi-/unstructured data: XML, JSON, …, hierarchies, graphs, …, binary data
• No multi-attribute values: List<String>, Set<Integer>, Set<Person>
• No simple way to handle versioned data
Relational Anti-Patterns:
• Relations via foreign key constraints
  No explicit concept for relations
  No index-free adjacency
• Querying relational data via JOINs is hard
  Just storing a graph was never a challenge ;)
• No recursive JOINs
  Inefficient query processing
  (Except: Oracle's "CONNECT BY")
SQL and Cloud-Readiness?
• No explicit scaling or partitioning within the relational model
• No JOINs between different databases and/or vendors
• Poor interaction with state-of-the-art web technologies, e.g. HTTP/REST, Hypermedia, Semantic Web
3. Benefits of Graph-Databases
The explicit graph data model provides a higher level of abstraction
and a better understanding of the domain model.
Index-free adjacency provides improved scalability, data locality
and superior graph traversal performance.
(The cost of following an edge is independent of the size of the graph.)
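The difference can be sketched by comparing two ways of finding the neighbours of a vertex. The types Node and AdjacencyDemo are hypothetical and only serve this illustration; the edge table stands in for a relational join table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only.
public sealed class Node
{
    public int Id { get; }

    // Index-free adjacency: each node holds direct references to its neighbours.
    public List<Node> Neighbours { get; } = new List<Node>();

    public Node(int id) { Id = id; }
}

public static class AdjacencyDemo
{
    // Relational style: neighbours are found by searching a global edge table.
    // This sketch scans the table; an index would reduce the cost, but the
    // lookup would still depend on the total number of edges.
    public static List<int> NeighboursViaEdgeTable(
        IEnumerable<(int From, int To)> edgeTable, int id)
    {
        return edgeTable.Where(e => e.From == id).Select(e => e.To).ToList();
    }

    // Graph style: follow the references stored on the node itself. The cost
    // depends on the degree of the node, not on the size of the graph.
    public static List<int> NeighboursViaAdjacency(Node node)
    {
        return node.Neighbours.Select(n => n.Id).ToList();
    }
}
```

Both methods return the same neighbours; only the second one stays equally cheap when unrelated vertices and edges are added to the graph.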
Consistency criteria and indices for simple attributes up to complex
subgraph structures.
Traversing linked information, finding shortest paths, creating semantic
partitions.
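One common way to find a shortest path in an unweighted graph is a breadth-first search. The sketch below assumes the graph is given as an adjacency list keyed by integer vertex IDs; the class ShortestPath is a hypothetical name and not part of the sones API.

```csharp
using System;
using System.Collections.Generic;

public static class ShortestPath
{
    // Breadth-first search over an adjacency list. Returns the vertices on one
    // shortest path from start to goal (both inclusive), or null if the goal
    // cannot be reached.
    public static List<int> Find(
        IReadOnlyDictionary<int, List<int>> adjacency, int start, int goal)
    {
        var previous = new Dictionary<int, int>();
        var visited = new HashSet<int> { start };
        var queue = new Queue<int>();
        queue.Enqueue(start);

        while (queue.Count > 0)
        {
            int current = queue.Dequeue();
            if (current == goal)
            {
                // Walk the predecessor links back to the start.
                var path = new List<int> { goal };
                while (path[path.Count - 1] != start)
                    path.Add(previous[path[path.Count - 1]]);
                path.Reverse();
                return path;
            }

            // Vertices without outgoing edges have no entry in the adjacency list.
            if (!adjacency.TryGetValue(current, out var neighbours)) continue;

            foreach (int next in neighbours)
            {
                if (visited.Add(next))
                {
                    previous[next] = current;
                    queue.Enqueue(next);
                }
            }
        }

        return null;
    }
}
```

Because breadth-first search visits vertices in order of their distance from the start, the first time the goal is dequeued the recorded predecessor chain is a shortest path.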
Recommendation and discovery
of potentially interesting linked information:
personal, social, item-related
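A simple social recommendation can be sketched as "friends of friends, ranked by the number of mutual friends". The class FriendRecommender and the ranking rule are assumptions made for this example; the slides do not describe a specific algorithm.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FriendRecommender
{
    // Suggests friends of friends that the user is not yet linked to, ranked
    // by the number of mutual friends (ties are broken by name).
    public static List<string> Recommend(
        IReadOnlyDictionary<string, HashSet<string>> friends, string user)
    {
        if (!friends.TryGetValue(user, out var direct))
            return new List<string>();

        var mutualCounts = new Dictionary<string, int>();
        foreach (string friend in direct)
        {
            if (!friends.TryGetValue(friend, out var theirFriends)) continue;

            foreach (string candidate in theirFriends)
            {
                // Skip the user and people who are already direct friends.
                if (candidate == user || direct.Contains(candidate)) continue;

                mutualCounts.TryGetValue(candidate, out int count);
                mutualCounts[candidate] = count + 1;
            }
        }

        return mutualCounts
            .OrderByDescending(kv => kv.Value)
            .ThenBy(kv => kv.Key, StringComparer.Ordinal)
            .Select(kv => kv.Key)
            .ToList();
    }
}
```

The whole computation only touches vertices at most two hops away from the user, which is the kind of local traversal a graph database is designed for.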
Good integration into state-of-the-art programming concepts and web
technologies.
Graph-Databases, REST and RDF semantics are a solid foundation for
cloud services.
4. Graph data in the cloud using .NET / Mono
sones GraphDB
• URL: http://www.sones.de
• License: AGPLv3
• Language: C# 4.0
• Goals: Management of linked data
• Concurrency: MVCC
• Repl./Scaling: p2p (alpha)
• Persistency: Proprietary file system
• Cloud: Connector for Microsoft Azure
sones Architecture
GraphDS: REST, WebShell, C# API
GraphDB: GQL, Graph Traversals, Indices
GraphFS: Object Management, (De-)Serialization
Host File System / Microsoft Azure
Shared nothing
User
GraphDS 1 / GraphDB 1 / GraphFS 1
GraphDS 2 / GraphDB 2 / GraphFS 2
Azure
it depends…
sones Property-Hypergraph

Vertex: Alice (ID = 1, Age = 21)
Hyperedge: SET<User> Friends
Hyperedge-Properties: SetMaxNumber = 12
Edge: User Friend, to Bob (ID = 2, Age = 23), since = 2009/09/21
Edge: User Friend, to Carol (ID = 3, Age = 20), since = 2010/04/11
Legend: Hyperedge, Edge, Virtual-Edge
• Properties may include code as data
  Think of stored procedures; C#: Func<…>, ExpressionTrees
• Allows hyperedge calculations to be done over the set of their edges (GetMinWeight, SetMaxNumber, …)
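The idea of "code as data" on a hyperedge can be sketched with a Func<…> stored next to ordinary properties. All types below (WeightedEdge, Hyperedge, HyperedgeExample) are hypothetical and not the sones API. Treating SetMaxNumber as an upper bound on the number of edges, and giving each edge a numeric weight, are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration; not the sones API.
public sealed class WeightedEdge
{
    public string Target { get; }
    public double Weight { get; }

    public WeightedEdge(string target, double weight)
    {
        Target = target;
        Weight = weight;
    }
}

public sealed class Hyperedge
{
    public List<WeightedEdge> Edges { get; } = new List<WeightedEdge>();

    // Plain data property. Assumption: it limits the number of edges in the set.
    public int SetMaxNumber { get; set; } = 12;

    // Code stored as property values: named functions evaluated over the edge set.
    public Dictionary<string, Func<IReadOnlyList<WeightedEdge>, double>> Functions { get; }
        = new Dictionary<string, Func<IReadOnlyList<WeightedEdge>, double>>();

    // Adds an edge unless the set has already reached SetMaxNumber.
    public bool TryAdd(WeightedEdge edge)
    {
        if (Edges.Count >= SetMaxNumber) return false;
        Edges.Add(edge);
        return true;
    }

    // Looks up a stored function by name and applies it to the current edges.
    public double Evaluate(string functionName)
    {
        return Functions[functionName](Edges);
    }
}

public static class HyperedgeExample
{
    // Creates a Friends hyperedge with a GetMinWeight function attached.
    // Note: GetMinWeight throws if it is evaluated on an empty edge set.
    public static Hyperedge CreateFriends(int maxNumber)
    {
        var friends = new Hyperedge { SetMaxNumber = maxNumber };
        friends.Functions["GetMinWeight"] = edges => edges.Min(e => e.Weight);
        return friends;
    }
}
```

A caller can attach further functions at run time by adding entries to Functions, which is the in-memory analogue of storing a new procedure with the data.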
sones Graph Query Language
FROM User SELECT User.Friends.Friends.Name
• “SQL for graphs” providing a user-friendly DSL for ad-hoc graph queries and graph discovery
• Functions and aggregates are type-safe and can be extended by your own plug-ins, e.g.
• SELECT COUNT(User.Friends)
• SELECT User.Friends.Random(2)
• SELECT User.Friends.Name.Substring(2,5)
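To show what the two-hop query FROM User SELECT User.Friends.Friends.Name computes, here is a rough in-memory equivalent using LINQ. The User class and TwoHopQuery are hypothetical stand-ins, and the flat result shape (one list of names per user) is an assumption; the result format of GQL itself is not shown on the slides.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the User vertex type.
public sealed class User
{
    public string Name { get; }
    public List<User> Friends { get; } = new List<User>();

    public User(string name) { Name = name; }
}

public static class TwoHopQuery
{
    // Rough equivalent of: FROM User SELECT User.Friends.Friends.Name
    // For every user, follow the Friends edge twice and collect the names.
    public static Dictionary<string, List<string>> FriendsOfFriendsNames(
        IEnumerable<User> users)
    {
        return users.ToDictionary(
            u => u.Name,
            u => u.Friends
                  .SelectMany(f => f.Friends)
                  .Select(ff => ff.Name)
                  .ToList());
    }
}
```

Each dot in the GQL path corresponds to one SelectMany or Select step here, which is why the query language can express multi-hop traversals without explicit JOINs.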
// sones gql example
CREATE VERTEX User
  ADD ATTRIBUTES (String Name, SET<User> Friends)
  INDICES (Name)
  MANDATORY (Name)

INSERT INTO User VALUES (Name = "Alice", Age = 21)
INSERT INTO User VALUES (Name = "Bob", Age = 23)

LINK User(Name = 'Alice') VIA Friends TO User(Name = 'Bob')
LINK User(Name = 'Bob') VIA Friends TO User(Name = 'Alice')
C# API
// C# API type creation
var _Person = _GraphDB.TypeManager
    .CreateVertex("Person")
    .AddString("Name", mandatory: true, indexed: true)
    .AddLoop("Friends", hyperEdge: true)
    .execute();

Type _PersonT = _GraphDB.TypeManager.GenerateType(_Person);
// C# API vertex initialization
Person _Alice = _GraphDB.TypeManager.ActivateVertex(_Person, new VertexUUID(1));
_Alice.Name = "Alice";

// Members not declared on Person are set through a dynamic reference
dynamic _Alice2 = _Alice;
_Alice2.Age = 21;
_Alice2.bdayparty = (Action) (() => { _Alice2.Age++; });
_Alice2.bdayparty();
// sones C# API example
var _Friends = new GraphAttribute("Friends", Type: "foaf:knows");
var _Bob = _GraphDB.TypeManager.ActivateVertex(_Person, new VertexUUID(2));

_Alice.Link(_Friends, _Bob);
_Bob.Link(_Friends, _Alice);
// Graph Traversals
public T TraverseVertex<T>(
    IVertex myStartVertex,
    TraversalOperation TraversalOperation = TraversalOperation.BreathFirst,
    Func<IVertex, IEdge, Boolean> myFollowThisEdge = null,
    Func<IVertex, Boolean> myMatchEvaluator = null,
    Action<IVertex> myMatchAction = null,
    Func<TraversalState, Boolean> myStopEvaluator = null,
    Func<IEnumerable<IVertex>, T> myWhenFinished = null)
{
    // Traverse the graph
}
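How such a callback-driven traversal might work can be sketched with simplified stand-in types. This is not the sones implementation: SketchVertex, SketchEdge and TraversalSketch are hypothetical, only breadth-first order is covered, and the stop evaluator receives the number of matches so far because the real TraversalState type is not shown on the slide.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for IVertex and IEdge.
public sealed class SketchEdge
{
    public string Label { get; }
    public SketchVertex Target { get; }

    public SketchEdge(string label, SketchVertex target)
    {
        Label = label;
        Target = target;
    }
}

public sealed class SketchVertex
{
    public string Name { get; }
    public List<SketchEdge> OutEdges { get; } = new List<SketchEdge>();

    public SketchVertex(string name) { Name = name; }

    public void Link(string label, SketchVertex target)
    {
        OutEdges.Add(new SketchEdge(label, target));
    }
}

public static class TraversalSketch
{
    // Breadth-first traversal driven by optional callbacks, loosely following
    // the parameter list of TraverseVertex<T> above.
    public static T Traverse<T>(
        SketchVertex start,
        Func<SketchVertex, SketchEdge, bool> followThisEdge = null,
        Func<SketchVertex, bool> matchEvaluator = null,
        Action<SketchVertex> matchAction = null,
        Func<int, bool> stopEvaluator = null, // receives the number of matches so far
        Func<IEnumerable<SketchVertex>, T> whenFinished = null)
    {
        var matches = new List<SketchVertex>();
        var visited = new HashSet<SketchVertex> { start };
        var queue = new Queue<SketchVertex>();
        queue.Enqueue(start);

        while (queue.Count > 0)
        {
            var current = queue.Dequeue();

            // Without a match evaluator, every visited vertex counts as a match.
            if (matchEvaluator == null || matchEvaluator(current))
            {
                matches.Add(current);
                matchAction?.Invoke(current);
            }

            if (stopEvaluator != null && stopEvaluator(matches.Count)) break;

            foreach (var edge in current.OutEdges)
            {
                if (followThisEdge != null && !followThisEdge(current, edge)) continue;
                if (visited.Add(edge.Target)) queue.Enqueue(edge.Target);
            }
        }

        // The final callback turns the matched vertices into the result value.
        return whenFinished != null ? whenFinished(matches) : default(T);
    }
}
```

Passing the filters and the result builder as delegates keeps the traversal loop generic: the same method can count friends, collect names or stop at the first match, depending on the callbacks supplied.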
For more information…
[email protected]
http://www.twitter.com/ahzf
http://www.twitter.com/graphdbs
sones