GraphDB™ Clusters for Everyone


Transcript of GraphDB™ Clusters for Everyone

Page 1: GraphDB™ Clusters for Everyone

GraphDB Clusters for Everyone
October 8th 2015

GraphDB Clusters for Everyone #1 8/Oct/15

Page 2: GraphDB™ Clusters for Everyone

Ontotext

• Information management company providing text analysis, data management and state-of-the-art semantic technology
• 70 software developers in Sofia, Bulgaria
• Presence in London and New York
• Clients include BBC, FT, AstraZeneca, DoD, Wiley & Sons
• Over 400 person-years in R&D to create a one-stop shop for:
– Content enrichment
– Data management
– Graph database engine

Page 3: GraphDB™ Clusters for Everyone

Ontotext and BBC

Profile
• Mass media broadcaster founded in 1922
• 23,000 employees and over 5 billion pounds in annual revenue

Goals
• Create a dynamic semantic publishing platform that assembles web pages on the fly from a variety of data sources
• Deliver highly relevant data to web site visitors with sub-second response

Challenges
• BBC journalists author and publish content which is then statically rendered. The cost and time to do this were high.
• Diverse content was difficult to navigate; content re-use was not flexible
• User experience needed to be improved with relevant content

"The goal is to be able to more easily and accurately aggregate content, find it and share it across many sources. From these simple relationships and building blocks you can dynamically build up incredibly rich sites and navigation on any platform."
John O’Donovan, Chief Technical Architect

Page 4: GraphDB™ Clusters for Everyone

Clients

[Slide image not preserved in the transcript]

Page 5: GraphDB™ Clusters for Everyone

Technology Portfolio

[Slide image not preserved in the transcript]

Page 6: GraphDB™ Clusters for Everyone

GraphDB Editions

• GraphDB Lite: in-memory, fast, free to use
• GraphDB Standard: file-based, highly optimized
– Efficient retraction of inferred statements
– Comprehensive query optimization
– Includes GraphDB Workbench: console for configuration, monitoring and exploration of datasets
– Plug-in architecture, RDF Rank, geo-spatial index
• GraphDB Enterprise: HA cluster
– Resilience and high query-answering bandwidth
– Includes GraphDB Connectors to FTS and NoSQL engines

Page 7: GraphDB™ Clusters for Everyone

Cluster Nodes

• Master nodes
– Coordinate all read and write operations
– Synchronize worker nodes
– Require limited resources
• Worker nodes
– Store all information
– Execute all read operations
– Require capacity planning

Page 8: GraphDB™ Clusters for Everyone

Cluster Structure #1

• Shared-nothing architecture
• Very fast graph queries
• Data failover/redundancy
• Horizontal read scalability
• Vertical write scalability

[Diagram: Master #1 with Worker #1, Worker #2 and Worker #3]

Page 9: GraphDB™ Clusters for Everyone

Cluster Structure #2

[Diagram: Master #1 with Worker #1, Worker #2 and Worker #3; label: "write"]

Page 10: GraphDB™ Clusters for Everyone

Cluster Structure #3

[Diagram: Master #1 with Worker #1, Worker #2 and Worker #3]

Page 11: GraphDB™ Clusters for Everyone

Cluster Structure #4

[Diagram: Master #1 with Worker #1, Worker #2 and Worker #3; callouts: "I keep only a log of the recent transactions" and "I keep a copy of the full repository data"]

Page 12: GraphDB™ Clusters for Everyone

Cluster Structure #5

[Diagram: Master #1 with Worker #1, Worker #2 and Worker #3; label: "read"; callout: "I choose a worker"]

Page 13: GraphDB™ Clusters for Everyone

Cluster Structure #6

[Diagram: Master #1 with Worker #1, Worker #2 and Worker #3; label: "read"; callout: "I execute the query"]

Page 14: GraphDB™ Clusters for Everyone

Strict vs Eventual Consistency

• Eventual consistency
• Strict consistency
• Balance between strong and eventual consistency

[Diagram: Master #1 with Worker #1, Worker #2 and Worker #3; label: "read"; callout: "I execute the query"]

Page 15: GraphDB™ Clusters for Everyone

Strict vs Eventual Consistency

[Diagram: Master #1 with Worker #1, Worker #2 and Worker #3; label: "read"; callout: "I execute the query"]

Page 16: GraphDB™ Clusters for Everyone

Strict vs Eventual Consistency

• Eventual consistency
• Strict consistency
• Balance between strong and eventual consistency

[Diagram: Master #1 with Worker #1, Worker #2 and Worker #3; label: "read"; callouts: "I execute the query" and "Execute the query on the latest worker data"]

All consistency models are supported
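The read path on these slides (the master picks a worker, the worker executes the query) can be sketched as follows. This is only an illustration under stated assumptions: `Worker`, `last_applied` and `choose_worker` are hypothetical names, not GraphDB APIs, and "strict" is modelled simply as "the worker has applied the latest transaction".

```python
import random
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    last_applied: int  # id of the newest transaction applied on this worker


def choose_worker(workers, latest_tx, strict, rng=random):
    """Pick a worker for a read request.

    strict=True  -> only workers that already applied the latest transaction
    strict=False -> any worker may answer (eventual consistency)
    """
    if strict:
        candidates = [w for w in workers if w.last_applied >= latest_tx]
    else:
        candidates = list(workers)
    if not candidates:
        raise RuntimeError("no worker satisfies the requested consistency")
    return rng.choice(candidates)
```

A "balance" between the two modes could be modelled by allowing a bounded lag (for example `latest_tx - w.last_applied <= k`); that variation is likewise an assumption, not something the slides specify.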

Page 17: GraphDB™ Clusters for Everyone

Common Cluster Operations

• Master transaction log
• Worker replication
• Cluster backup
• Master peering
• Split brain

Page 18: GraphDB™ Clusters for Everyone

Master transaction log
Write operation

[Diagram: Master #1 with Transaction Log, Worker #1 and Worker #2; label: "write" (transaction 1)]

Page 19: GraphDB™ Clusters for Everyone

Master transaction log
Write operation

[Diagram: Master #1 with Transaction Log, Worker #1 and Worker #2; label: "write" (transaction 1); callout: "I will test the write first"]

Page 20: GraphDB™ Clusters for Everyone

Master transaction log
Write operation

[Diagram: Master #1 with Transaction Log, Worker #1 and Worker #2; labels: "write" (transaction 1), "ok"; callout: "I confirm the update"]

Page 21: GraphDB™ Clusters for Everyone

Master transaction log
Write operation

[Diagram: Master #1 with Transaction Log, Worker #1 and Worker #2; labels: "write" (transaction 1), "ok"; callout: "The write is confirmed"]

Page 22: GraphDB™ Clusters for Everyone

Master transaction log
Write operation

[Diagram: Master #1 with Transaction Log, Worker #1 and Worker #2; label: "synchronize" (transaction 1); callout: "Propagate the write to all other workers"]

Page 23: GraphDB™ Clusters for Everyone

Master transaction log
Rollback write operation

[Diagram: Master #1 with Transaction Log, Worker #1 and Worker #2, all holding transaction 1; label: "write" (transaction 2)]

Page 24: GraphDB™ Clusters for Everyone

Master transaction log
Rollback write operation

[Diagram: Master #1 with Transaction Log, Worker #1 and Worker #2; label: "write" (transaction 2), now in the log and sent to a worker]

Page 25: GraphDB™ Clusters for Everyone

Master transaction log
Rollback write operation

[Diagram: Master #1 with Transaction Log, Worker #1 and Worker #2; labels: "write" (transaction 2), "error"; callout: "The update is invalid"]

Page 26: GraphDB™ Clusters for Everyone

Master transaction log
Rollback write operation

[Diagram: Master #1 with Transaction Log, Worker #1 and Worker #2; labels: "write" (transaction 2), "error"; callout: "I mark the write as invalid"]
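The write and rollback sequences above (test the write on one worker, confirm it, propagate it to the others, or mark it invalid on error) can be sketched as below. This is a minimal model, not GraphDB's implementation: the `Master` and `Worker` classes, the status strings, and the rule that `None` is an invalid update are all assumptions made for illustration.

```python
class Worker:
    def __init__(self, name):
        self.name = name
        self.applied = []  # ids of transactions applied on this worker

    def apply(self, tx_id, update):
        # Hypothetical validity rule: an update of None is rejected.
        if update is None:
            raise ValueError("invalid update")
        self.applied.append(tx_id)


class Master:
    def __init__(self, workers):
        self.workers = workers
        self.log = []  # entries are (tx_id, update, status)

    def write(self, update):
        tx_id = len(self.log) + 1
        first, *rest = self.workers
        try:
            first.apply(tx_id, update)  # test the write on one worker first
        except ValueError:
            # The worker rejected the update: mark it invalid, do not propagate.
            self.log.append((tx_id, update, "invalid"))
            return False
        self.log.append((tx_id, update, "ok"))  # the write is confirmed
        for worker in rest:  # propagate to all other workers
            worker.apply(tx_id, update)
        return True
```

In this sketch propagation is synchronous for brevity; the slides label that step "synchronize" and do not say whether it blocks the client.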

Page 27: GraphDB™ Clusters for Everyone

Master transaction log
New worker node add operation

[Diagram: Master #1 with Transaction Log (transactions 1 to 4), Worker #1 and Worker #2; a new Worker #3 is about to join]

Page 28: GraphDB™ Clusters for Everyone

Master transaction log
New worker node add operation

[Diagram: Master #1 with Transaction Log (transactions 1 to 4), Worker #1, Worker #2 and Worker #3; label: "add"; callout: "A new worker is added"]

Page 29: GraphDB™ Clusters for Everyone

Master transaction log
New worker node add operation

[Diagram: Master #1 with Transaction Log (transactions 1 to 4), Worker #1, Worker #2 and Worker #3; label: "synchronize"; callout: "I have all updates from the start"]

Page 30: GraphDB™ Clusters for Everyone

Master transaction log
New worker node add operation

[Diagram: Master #1 with Transaction Log (transactions 1 to 4), Worker #1, Worker #2 and Worker #3; label: "synchronize"; Worker #3 receives transactions 1 to 4; callout: "I have all updates from the start"]

Page 31: GraphDB™ Clusters for Everyone

Master transaction log cleanup

• Masters keep only a limited number of transactions
• LogMaxSize controls the maximum number of stored transactions in the log
• LogMaxDepth controls the maximum time in minutes to keep transactions in the log

[Diagram: Transaction Log shown as a queue with "in" and "out" ends]
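A rough sketch of how the two limits might interact is given below. `LogMaxSize` and `LogMaxDepth` are the parameter names from the slide; the function `prune_log`, the tuple layout, and the exact eviction rule (drop the oldest entry while either limit is exceeded) are assumptions.

```python
from collections import deque


def prune_log(log, now_minutes, log_max_size, log_max_depth):
    """Drop the oldest entries while either limit is exceeded.

    log is a deque of (tx_id, created_minutes) tuples, oldest first.
    log_max_size  ~ LogMaxSize  (maximum number of stored transactions)
    log_max_depth ~ LogMaxDepth (maximum age in minutes)
    """
    while log and (
        len(log) > log_max_size
        or now_minutes - log[0][1] > log_max_depth
    ):
        log.popleft()
    return log
```

For example, with four entries created at minutes 0, 5, 50 and 55, a call at minute 60 with a size limit of 3 and a depth of 30 keeps only the last two entries.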

Page 32: GraphDB™ Clusters for Everyone

Worker node replication
New worker node add operation #2

[Diagram: Master #1 with Transaction Log (only transactions 6 to 9), Worker #1 and Worker #2 (transactions 1 to 9); a new Worker #3 is about to join]

Page 33: GraphDB™ Clusters for Everyone

Worker node replication
New worker node add operation #2

[Diagram: Master #1 with Transaction Log (only transactions 6 to 9), Worker #1, Worker #2 and Worker #3; label: "add"; callout: "A new worker is added"]

Page 34: GraphDB™ Clusters for Everyone

Worker node replication
New worker node add operation #2

[Diagram: Master #1 with Transaction Log (only transactions 6 to 9), Worker #1, Worker #2 and Worker #3; labels: "replication", "off"; callout: "I have only the recent transactions"]

Page 35: GraphDB™ Clusters for Everyone

Worker node replication
New worker node add operation #2

[Diagram: Master #1 with Transaction Log, Worker #1, Worker #2 and Worker #3; labels: "replication", "off", "write" (transaction A)]

Page 36: GraphDB™ Clusters for Everyone

Worker node replication
New worker node add operation #2

[Diagram: Master #1 with Transaction Log, Worker #1, Worker #2 and Worker #3; labels: "replication", "off", "write" (transaction A); callout: "I still have workers to use"]

Page 37: GraphDB™ Clusters for Everyone

Worker node replication
New worker node add operation #2

[Diagram: Master #1 with Transaction Log, Worker #1, Worker #2 and Worker #3; labels: "replication", "off", "write" (transaction A applied while the replication is still running)]

Page 38: GraphDB™ Clusters for Everyone

Worker node replication
New worker node add operation #2

[Diagram: Master #1 with Transaction Log, Worker #1, Worker #2 and Worker #3 (now holding transactions 1 to 9); label: "synchronize" (transaction A); callout: "Replication is over; go back to normal"]

Page 39: GraphDB™ Clusters for Everyone

Worker node replication
New worker node add operation #2

[Diagram: Master #1 with Transaction Log, Worker #1, Worker #2 and Worker #3; label: "synchronize"; transaction A is now present on all workers]

Page 40: GraphDB™ Clusters for Everyone

Worker node replication #2
New worker node add operation #3

[Diagram: Master #1 with Transaction Log (only transactions 6 to 9) and a single Worker #1 (transactions 1 to 9); a new Worker #2 is about to join]

Page 41: GraphDB™ Clusters for Everyone

Worker node replication #2
New worker node add operation #2

[Diagram: Master #1 with Transaction Log (only transactions 6 to 9), Worker #1 and Worker #2; label: "add"; callout: "A new worker is added"]

Page 42: GraphDB™ Clusters for Everyone

Worker node replication
New worker node add operation #2

[Diagram: Master #1 with Transaction Log (only transactions 6 to 9), Worker #1 and Worker #3; labels: "replication", "off"; callout: "I have only the recent transactions"]

Page 43: GraphDB™ Clusters for Everyone

Worker node replication
New worker node add operation #2

[Diagram: Master #1 with Transaction Log, Worker #1 and Worker #3; labels: "replication", "off", "write" (transaction A); callout: "No workers available in maintenance mode"]
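The two "add worker" walkthroughs differ only in whether the master's log still contains everything the new worker is missing. A hedged sketch of that decision follows; `plan_catch_up` and `writes_possible_during_replication` are hypothetical helpers, and transaction ids are assumed to be consecutive integers.

```python
def plan_catch_up(log_tx_ids, worker_last_tx, latest_tx):
    """Decide how a worker catches up with the cluster.

    Returns ("up_to_date", []), ("replay", [ids to replay from the log]),
    or ("replicate", []) when the log no longer covers the gap and a full
    copy from another worker is needed.
    """
    if worker_last_tx >= latest_tx:
        return ("up_to_date", [])
    needed = list(range(worker_last_tx + 1, latest_tx + 1))
    if set(needed) <= set(log_tx_ids):
        return ("replay", needed)
    return ("replicate", [])


def writes_possible_during_replication(active_workers):
    # The donor worker is taken offline while it is copied, so at least
    # one other active worker must remain to accept writes.
    return active_workers >= 2
```

With a log holding transactions 1 to 4, an empty worker can replay from the log; with a log holding only 6 to 9 it has to be replicated from another worker, and if that worker is the only one in the cluster, writes are refused until the copy finishes.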

Page 44: GraphDB™ Clusters for Everyone

Cluster Backup

[Diagram: Master #1 with Transaction Log, Worker #1, Worker #2 and Worker #3 at different transaction levels; label: "backup"; callout: "Select an up to date worker"]

Page 45: GraphDB™ Clusters for Everyone

Cluster Backup

[Diagram: Master #1 with Transaction Log, Worker #1, Worker #2 and Worker #3; labels: "backup", "off"]

Page 46: GraphDB™ Clusters for Everyone

Cluster Backup

[Diagram: Master #1 with Transaction Log, Worker #1, Worker #2 and Worker #3; labels: "backup", "copy"; transactions 6 and 7 are being delivered to the lagging workers]

Page 47: GraphDB™ Clusters for Everyone

Cluster Backup

[Diagram: Master #1 with Transaction Log, Worker #1, Worker #2 and Worker #3, plus a Backup store; labels: "backup", "copy"]

Page 48: GraphDB™ Clusters for Everyone

Cluster Backup

[Diagram: Master #1 with Transaction Log, Worker #1, Worker #2 and Worker #3, plus a Backup store; labels: "backup", "copy", "write" (transaction 8)]

Page 49: GraphDB™ Clusters for Everyone

Cluster Backup

[Diagram: Master #1 with Transaction Log, Worker #1, Worker #2 and Worker #3, plus a Backup store; labels: "backup", "copy", "write" (transaction 8, recorded in the log)]

Page 50: GraphDB™ Clusters for Everyone

Cluster Backup

[Diagram: Master #1 with Transaction Log, Worker #1, Worker #2 and Worker #3, plus a Backup store; labels: "backup", "copy", "write" (transaction 8, applied to the workers that are not being copied)]

Page 51: GraphDB™ Clusters for Everyone

Cluster Backup

[Diagram: Master #1 with Transaction Log, Worker #1, Worker #2 and Worker #3, plus the finished Backup; label: "synchronize"]

Page 52: GraphDB™ Clusters for Everyone

Cluster Backup

[Diagram: Master #1 with Transaction Log, Worker #1, Worker #2 and Worker #3, plus the finished Backup; transaction 8 is now present on all workers]
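The backup walkthrough (select an up-to-date worker, take it offline, copy it, then synchronize it with the writes it missed) can be modelled roughly as follows. Everything here is an assumption made for illustration: workers are plain lists of applied transaction ids, and `run_backup` is a hypothetical helper rather than a GraphDB call.

```python
def run_backup(workers, writes_during_backup):
    """workers maps a worker name to the list of transaction ids it applied.

    Returns (donor_name, snapshot). The donor is offline while the snapshot
    is taken; the remaining workers keep accepting writes, and the donor
    receives the missed writes once the copy is done.
    """
    # Select the most up-to-date worker (the first one wins on ties).
    donor = max(workers, key=lambda name: len(workers[name]))
    snapshot = list(workers[donor])  # the copy that goes to the backup

    for tx in writes_during_backup:
        for name, applied in workers.items():
            if name != donor:  # the donor is "off" during the copy
                applied.append(tx)

    # Backup finished: synchronize the donor with what it missed.
    workers[donor].extend(writes_during_backup)
    return donor, snapshot
```

The point of the design shown on the slides is that the snapshot is consistent (no writes land on the donor while it is copied) without making the cluster read-only.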

Page 53: GraphDB™ Clusters for Everyone

Master peering (1)

• Seamless hot standby failover
– In case of a single master failure
– GraphDB Enterprise should fail over to the second master without any transaction loss, errors or exceptions
– In a completely self-managed way
– Support for several geographically distributed data centers
• Persistent asynchronous transactions
– GraphDB Enterprise is always R/W
– No Tx loss even in case of restart/failure
– Accepted Tx can be asynchronously made eventually persistent
– All accepted transactions are recoverable after a server restart/failure

Page 54: GraphDB™ Clusters for Everyone

Master peering (2)

• Client Failover Utility
– Fails over to the other GraphDB master in case of a master failure
– Asynchronous and synchronous transactions are supported. Synchronous transactions wait until the update is executed on the server and check whether it succeeded.
– Persists the submitted transaction locally to guard against a possible master failure
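A client-side failover of this kind could look roughly like the sketch below. It is not the Client Failover Utility's actual interface: `submit`, `MasterDown`, and the use of plain callables for masters and a list for local persistence are all assumptions.

```python
class MasterDown(Exception):
    """Raised by a master endpoint that cannot be reached (hypothetical)."""


def submit(update, masters, pending):
    """Send an update to the first reachable master.

    masters is an ordered list of callables, one per master endpoint.
    pending stands in for durable local storage: the update is recorded
    there before sending and removed only after a master accepted it.
    """
    pending.append(update)  # persist locally first
    for send in masters:
        try:
            send(update)
        except MasterDown:
            continue  # fail over to the next master
        pending.remove(update)
        return True
    return False  # still in pending, can be retried later
```

Usage: pass the masters in order of preference; if every master is down, the update stays in `pending` and a later retry can resubmit it.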

Page 55: GraphDB™ Clusters for Everyone

Master peering

[Diagram: Master #1 and Master #2, each with its own Transaction Log, connected by a "peer" link; Workers #1 to #6 below them; a Client Failover Utility in front of each master]

Page 56: GraphDB™ Clusters for Everyone

Master peering

[Diagram: Master #1 (Transaction Log: 1, 2, 3, 4) and Master #2 (Transaction Log: 5, 6, 7, 8), each receiving a "write"; Workers #1 to #6]

Page 57: GraphDB™ Clusters for Everyone

Master peering

[Diagram: Master #1 (Transaction Log: 1, 2, 3, 4) and Master #2 (Transaction Log: 5, 6, 7, 8), each receiving a "write"; Workers #1 to #6; callouts: "Let’s synchronize first" and "No workers available in maintenance mode"]

Page 58: GraphDB™ Clusters for Everyone

Master peering

[Diagram: Master #1 and Master #2, each receiving a "write"; both Transaction Logs now show the merged order 1, 5, 2, 6, 3, 7, 4, 8; Workers #1 to #6; callouts: "Our queues are sorted and ready to apply" and "No workers available in maintenance mode"]
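The merged order 1, 5, 2, 6, 3, 7, 4, 8 is what an interleaving of two sorted queues produces. The sketch below reproduces it under a loud assumption: each transaction carries a timestamp that is comparable across masters. The slides do not say how the real ordering key is chosen, and `merge_logs` is a hypothetical helper.

```python
import heapq


def merge_logs(log_a, log_b):
    """Merge two transaction logs into one ordered queue.

    Each log is a list of (timestamp, tx_id) tuples already sorted by
    timestamp; the result is the list of tx ids in global order.
    """
    return [tx_id for _, tx_id in heapq.merge(log_a, log_b)]
```

Both masters can run the same merge on the same inputs and arrive at the same queue, which is what lets them apply the writes to their workers in an identical order.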

Page 59: GraphDB™ Clusters for Everyone

Split Brain

• Partition tolerance
– Consistency or availability
• HA cluster continues to operate after partition (AP)
– Allow each master to receive updates
– Backup before split brain
– Smart merge by using the backup after split brain recovery

Page 60: GraphDB™ Clusters for Everyone

Split Brain

[Diagram: Master #1 and Master #2, each with a Transaction Log holding transactions 1 and 2; Workers #1 to #6]

Page 61: GraphDB™ Clusters for Everyone

Split Brain

[Diagram: Master #1 and Master #2, each with a Transaction Log holding transactions 1 and 2; Workers #1 to #6]

Page 62: GraphDB™ Clusters for Everyone

Split Brain

[Diagram: Master #1 and Master #2, each with a Transaction Log holding transactions 1 and 2; Workers #1 to #6; callouts: "M2 is dead or network is down" and "M1 is dead or network is down"]

Page 63: GraphDB™ Clusters for Everyone

Split Brain

[Diagram: Master #1 and Master #2, each with a Transaction Log holding transactions 1 and 2; Workers #1 to #6; callout on each master: "Let’s wait X seconds to come back"]

Page 64: GraphDB™ Clusters for Everyone

Split Brain

[Diagram: Master #1 and Master #2, each with a Transaction Log holding transactions 1 and 2; Workers #1 to #6; each master is labelled "Split Brain"]

Page 65: GraphDB™ Clusters for Everyone

Split Brain

[Diagram: Master #1 and Master #2, each with a Transaction Log holding transactions 1 and 2; Workers #1 to #6; one master receives "write" for transactions 3 and 4, the other for transactions 5 and 6]

Page 66: GraphDB™ Clusters for Everyone

Split Brain

• Order of the writes is important
• Updates should be magically integrated

[Diagram: Master #1 (Transaction Log: 1, 2, 3, 4) and Master #2 (Transaction Log: 1, 2, 5, 6); Workers #1 to #6]

Page 67: GraphDB™ Clusters for Everyone

Split Brain

[Diagram: Master #1 and Master #2, each with a Transaction Log holding transactions 1 and 2; Workers #1 to #6; callout on each master: "Split Brain; Backup a worker"; each side has a Worker Backup holding transactions 1 and 2]

Page 68: GraphDB™ Clusters for Everyone

Split Brain

[Diagram: Master #1 and Master #2, each with a Transaction Log holding transactions 1 and 2; Workers #1 to #6; incoming "write" for transactions 3 and 4 on one side and 5 and 6 on the other; callout on each master: "Wait until the backup is done"; each side has a Worker Backup holding transactions 1 and 2]

Page 69: GraphDB™ Clusters for Everyone

Split Brain

[Diagram: Master #1 (Transaction Log: 1, 2, 3, 4) and Master #2 (Transaction Log: 1, 2, 5, 6); Workers #1 to #6; each side has a Worker Backup holding transactions 1 and 2]

Page 70: GraphDB™ Clusters for Everyone

Split Brain

[Diagram: Master #1 (Transaction Log: 1, 2, 3, 4) and Master #2 (Transaction Log: 1, 2, 5, 6); Workers #1 to #6; each side has a Worker Backup holding transactions 1 and 2; callouts: "M2 is back" and "M1 is back"]

Page 71: GraphDB™ Clusters for Everyone

Split Brain

• Start from the pre-split-brain backup
• Synchronize all updates and reapply them

[Diagram: Master #1 (Transaction Log: 1, 2, 3, 4) and Master #2 (Transaction Log: 1, 2, 5, 6); Workers #1 to #6; each side has a Worker Backup holding transactions 1 and 2]
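The recovery recipe (restore the pre-split backup, then reapply every update from both sides in order) can be sketched as below. The data model is an assumption: repository state is a set of statements, each logged update is a `(timestamp, op, statement)` tuple, and timestamps are taken to be comparable across masters. `recover` is a hypothetical helper, not the product's merge procedure.

```python
def recover(backup_state, log_m1, log_m2):
    """Rebuild a common state after a split brain.

    backup_state: set of statements saved before the split.
    log_m1, log_m2: updates each master accepted during the split, as
    (timestamp, op, statement) tuples with op "add" or "remove".
    """
    state = set(backup_state)  # start from the pre-split backup
    merged = sorted(log_m1 + log_m2, key=lambda entry: entry[0])
    for _, op, statement in merged:  # reapply in global order
        if op == "add":
            state.add(statement)
        else:  # anything else is treated as a removal in this sketch
            state.discard(statement)
    return state
```

The example in the test shows why the order matters: one side removes a statement at time 5 and the other adds it back at time 6, so the statement survives only because the updates are replayed in timestamp order.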

Page 72: GraphDB™ Clusters for Everyone

• GraphDB is the leading graph and RDF database
• For any further requests please contact us: [email protected]

Thank you!