GaianDB

GaianDB A dynamic distributed federated database Dale Lane @dalelane

description

A presentation I gave on GaianDB, a dynamic federated distributed database available on IBM alphaWorks. The presentation won't make a lot of sense without speaker notes... which I've not written yet. Sorry about that.

Transcript of GaianDB

Page 1: GaianDB

GaianDB

A dynamic distributed federated database

Dale Lane @dalelane

Page 2: GaianDB

A massively over-simplified view of data-warehousing...

Page 3: GaianDB

The “Internet of Things”

Page 4: GaianDB

GaianDB

a dynamic distributed federated database

Page 5: GaianDB

Federated data

Page 6: GaianDB

Network of distributed databases

Page 7: GaianDB

A dynamic network

Page 8: GaianDB

A dynamic network

Biologically-Inspired Self-Organisation

Exploit natural selection in nature to build better networks

Robust self-organizing network architectures

Frameworks and algorithms for robust fault-tolerant information dissemination

Robust communications with minimal complexity or human control

Page 9: GaianDB

Gaian database

[Diagram, repeated over four frames: a network of database nodes N0 to N11, with an SQL query (and, in the last frame, several SQL queries) being sent into the network]

Queries routed to all database nodes – a flood query, but retrieving only the data required to satisfy a query

Trades extra query traffic in the network for reduced data traffic, aiming to minimize total traffic

Predicated on a ‘store data locally, read data from anywhere’ paradigm
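The flood-query idea above can be sketched in a few lines. This is a minimal illustration, not GaianDB code: the `NODE_DATA` table, the meter rows and the `flood_query` helper are all hypothetical, and a plain Python predicate stands in for the SQL ‘where’ clause.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]

# Hypothetical stand-in for three nodes, each storing its own rows locally.
NODE_DATA: Dict[str, List[Row]] = {
    "N0": [{"meter": "a", "kwh": 3}, {"meter": "b", "kwh": 9}],
    "N1": [{"meter": "c", "kwh": 12}],
    "N2": [{"meter": "d", "kwh": 1}, {"meter": "e", "kwh": 15}],
}


def flood_query(predicate: Callable[[Row], bool]) -> List[Row]:
    """Send the predicate to every node; each node returns only matching rows."""
    results: List[Row] = []
    for node, rows in NODE_DATA.items():
        # Filtering happens at the node that owns the data, so rows that
        # do not match are never sent back to the querying node.
        matching = [dict(row, node=node) for row in rows if predicate(row)]
        results.extend(matching)
    return results


# Every node is asked, but only two rows travel back.
high_usage = flood_query(lambda row: row["kwh"] > 10)
```

The query reaches all three nodes, while the result set contains only the rows that satisfy it, which is the trade of query traffic for data traffic described above.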

Page 10: GaianDB

Architecture

GaianDB

Derby Engine: parsing, compilation, execution

GaianPStmtNode VTI: executes queries on physical leaf nodes, and propagates the original SQL (+ queryID & steps state info) to linked Gaian nodes

Diagram labels: Instantiates; Invokes costing methods; Pushes columns and ‘where’ clause in a structure

Data sources shown: MQ(tt) stream data, DB2, Oracle, MS SQL Server, Sybase, MySQL, flat files, in-memory tables, Derby, text index, Derby tables

Other GaianDB nodes: receive the propagated original SQL

[Diagram: the network of nodes N0 to N11 receiving an SQL query, with one node marked "Expanded Node"]

Multithreaded, breadth-first query propagation

Loop detection/handling – no duplicates
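A rough sketch of breadth-first propagation with loop detection follows. It is single-threaded for clarity (the slide describes a multithreaded implementation), and the `LINKS` topology and `propagate` function are hypothetical names chosen for illustration, not part of GaianDB.

```python
from collections import deque
from typing import Dict, List

# Hypothetical topology containing a cycle (N0-N1-N2-N0) to exercise loop detection.
LINKS: Dict[str, List[str]] = {
    "N0": ["N1", "N2"],
    "N1": ["N0", "N2", "N3"],
    "N2": ["N0", "N1"],
    "N3": ["N1"],
}


def propagate(origin: str, links: Dict[str, List[str]]) -> Dict[str, int]:
    """Return the number of steps after which each node first receives the query."""
    steps: Dict[str, int] = {origin: 0}  # also serves as the "already seen" record
    frontier = deque([origin])
    while frontier:
        node = frontier.popleft()
        for neighbour in links[node]:
            if neighbour in steps:
                continue  # loop detected: this node already has the query
            steps[neighbour] = steps[node] + 1
            frontier.append(neighbour)
    return steps


arrival = propagate("N0", LINKS)
```

Each node appears in `arrival` exactly once, so no node executes the same query twice even though the topology contains a cycle.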

Page 11: GaianDB

Performance – with 1,025 nodes

[Chart: Query time for 1025 nodes, fetching up to 1025 rows from each. x-axis: Rows fetched per node (0 to 1200); y-axis: Time (milliseconds). Series: Query Execute Time, Total Query Time, Linear (Total Query Time). Trendline: y = 4.217x + 349.251]

[Chart: Query Performance. x-axis: Number of Nodes (0 to 1200); y-axis: Query Time (milliseconds). Series: Average Query Time, Predicted Max (Layers), Predicted Min (Layers)]
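As a quick reading of the trendline y = 4.217x + 349.251 from the first chart, the sketch below evaluates it at the largest case on the slide (1025 rows per node). The function name is made up for illustration; only the two coefficients come from the slide.

```python
def predicted_total_query_ms(rows_per_node: float) -> float:
    """Evaluate the chart's trendline y = 4.217x + 349.251 (milliseconds)."""
    return 4.217 * rows_per_node + 349.251


# Total query time predicted by the fit when every node returns 1025 rows.
estimate = predicted_total_query_ms(1025)
```

Read this way, the intercept is a fixed cost of roughly 349 ms and the slope is roughly 4.2 ms for each additional row fetched per node.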

Page 12: GaianDB

Performance questions

The time to propagate a query to all of the nodes in the database, as a function of the number of database nodes (N);

The time to fetch data from across the nodes of the database to a single node, as a function of the volume of data;

The time to fetch data from across the database to multiple nodes concurrently querying, as a function of the number of nodes concurrently querying.

Page 13: GaianDB

Graph metrics

The eccentricity ε(νi) of a graph vertex νi is the maximum graph distance between νi and any other vertex νj of G, i.e. the "longest shortest path" from νi to any other vertex νj of the graph.

The maximum eccentricity is the graph diameter Gd. The minimum eccentricity is the graph radius Gr. We define the size of G as the number of vertices N, and the number of connections at each vertex νi as the vertex degree δi (1 ≤ i ≤ N).
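These definitions can be checked on a small example. The sketch below computes eccentricity with a breadth-first search and derives diameter and radius from it; it assumes a connected, undirected graph, and the four-vertex path graph is a made-up example.

```python
from collections import deque
from typing import Dict, List


def eccentricity(graph: Dict[int, List[int]], start: int) -> int:
    """Longest shortest-path distance from start to any other vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:  # first visit gives the shortest distance
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())


# Hypothetical example: a path graph 1 - 2 - 3 - 4.
G = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
ecc = {v: eccentricity(G, v) for v in G}
diameter = max(ecc.values())  # Gd
radius = min(ecc.values())    # Gr
```

For the path graph, the end vertices have eccentricity 3 and the middle vertices have eccentricity 2, so Gd = 3 and Gr = 2.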

Page 14: GaianDB

Biologically inspired self-organisation

[Chart: Graph Dimension (edges), 0 to 10, against Number of Nodes (N), 0 to 1000. Series: Radius, Diameter, (1+e)ln(N), (1-e)ln(N)]

Network growth by preferential attachment, using a fitness function at each node

Maximum vertex degree limited to 10
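A toy version of this growth rule is sketched below. The slides do not say what the fitness function is, so this sketch assumes fitness is simply a node's current degree, and that each new node makes a single link; `grow_network` and `MAX_DEGREE` are illustrative names.

```python
import random
from typing import Dict, Set

MAX_DEGREE = 10  # degree limit stated on the slide


def grow_network(n: int, seed: int = 0) -> Dict[int, Set[int]]:
    """Grow an n-node network one node at a time (n >= 2)."""
    rng = random.Random(seed)
    links: Dict[int, Set[int]] = {0: {1}, 1: {0}}
    for new in range(2, n):
        # Only nodes with spare capacity may accept another link.
        candidates = [v for v in links if len(links[v]) < MAX_DEGREE]
        # Assumed fitness: current degree, so well-connected nodes attract more links.
        weights = [len(links[v]) for v in candidates]
        target = rng.choices(candidates, weights=weights, k=1)[0]
        links[new] = {target}
        links[target].add(new)
    return links


network = grow_network(200)
```

Because every new node attaches to exactly one existing node, the result is a connected tree in which no vertex exceeds the degree limit.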

Gd = nint [ (1+e) * ln(N) ]

Gr = nint [ (1-e) * ln(N) ]

e = 0.24
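Plugging numbers into these two expressions gives a feel for how slowly the network dimensions grow. The sketch below takes `nint` to mean "nearest integer" and uses N = 1025, the network size quoted on the performance slides; the function names are illustrative.

```python
import math

E = 0.24  # value of e given on the slide


def nint(x: float) -> int:
    """Nearest integer, rounding halves up (x is positive here)."""
    return math.floor(x + 0.5)


def predicted_diameter(n: int) -> int:
    """Gd = nint[(1+e) * ln(N)]"""
    return nint((1 + E) * math.log(n))


def predicted_radius(n: int) -> int:
    """Gr = nint[(1-e) * ln(N)]"""
    return nint((1 - E) * math.log(n))


gd, gr = predicted_diameter(1025), predicted_radius(1025)
```

For N = 1025, ln(N) is about 6.93, giving Gd = 9 and Gr = 5.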

Page 15: GaianDB

Query propagation time

The predicted maximum (Tmax) and minimum (Tmin) times to execute the flood query are:

Tmax = (Gd + 1)(TL + Tp)

Tmin = (Gr + 1)(TL + Tp)

where TL = link latency and Tp = processor delay,

with the predicted execute query time from any node (Tν) being:

Tν = (ε(ν) + 1)(TL + Tp)

Hence, substituting ε(ν) = nint[B * ln(N)]:

Tν = (1 + nint[B * ln(N)]) * (TL + Tp)
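The bounds can be evaluated directly once values for TL and Tp are chosen. The slides do not state either value, so the 40 ms and 10 ms used below are purely illustrative, and `flood_time_bounds` is a hypothetical helper name. Gd and Gr are computed with e = 0.24 as on the preceding slide.

```python
import math

E = 0.24  # value of e from the graph-growth slide


def nint(x: float) -> int:
    """Nearest integer, rounding halves up (x is positive here)."""
    return math.floor(x + 0.5)


def flood_time_bounds(n: int, link_latency_ms: float, processor_delay_ms: float):
    """Return (Tmin, Tmax) in milliseconds for an n-node network."""
    per_hop = link_latency_ms + processor_delay_ms  # TL + Tp
    gd = nint((1 + E) * math.log(n))  # predicted diameter Gd
    gr = nint((1 - E) * math.log(n))  # predicted radius Gr
    return (gr + 1) * per_hop, (gd + 1) * per_hop


# Illustrative delays only; the slides do not give values for TL or Tp.
t_min, t_max = flood_time_bounds(1025, link_latency_ms=40.0, processor_delay_ms=10.0)
```

With these assumed delays and N = 1025, the bounds are (5 + 1) × 50 = 300 ms and (9 + 1) × 50 = 500 ms. Because Gd and Gr grow with ln(N), the predicted times grow only logarithmically with network size.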

Page 16: GaianDB

Measured query propagation

[Chart: Individual Query Time Scalability. x-axis: Number of Nodes (0 to 1200); y-axis: Query Time (ms). Series: Average Query Time, Predicted Max (Diameter+1), Predicted Min (Radius+1), Queried node eccentricity+1]

[Chart: Individual Query Time Scalability, for 0 to 100 nodes. x-axis: Number of Nodes; y-axis: Query Time (ms). Series: Individual Query Times, Average Query Time, Queried node eccentricity+1]

Page 17: GaianDB

Measured data fetch

[Chart: Query time to fetch 1 million rows. x-axis: Total Rows fetched (0 to 1,200,000); y-axis: Time (milliseconds). Series: Total Query Time 1025 nodes, Total Query Time 1 node, Total Query Time 1 node indexed, Linear (Total Query Time 1025 nodes), Linear (Total Query Time 1 node). Trendlines shown: y = 4.217x + 349.251 and y = 1.7383x + 678.141]

Page 18: GaianDB

Example uses

Page 19: GaianDB

Smart Metering

centralised write

Page 20: GaianDB

Smart Metering

centralised read

Page 21: GaianDB

Smart Metering

distributed federated write

Page 22: GaianDB

Smart Metering

distributed federated read

Page 23: GaianDB

Other uses...

Page 24: GaianDB

http://www.alphaworks.ibm.com/tech/gaiandb

Page 25: GaianDB

Image credits

Background: YouTube video “The Internet of Things”, IBM http://www.youtube.com/watch?v=sfEbMV295Kk

Icons: DB and envelope icons, Tim Morgan http://flickr.com/photos/timothymorgan/sets/1615269

Microsoft Excel icon, Vincent Garnier (courtesy of IconArchive) http://iconarchive.com/show/softdimension-icons-by-benjigarner/Excel-icon.html

Photo of car mechanics, Tomas http://flickr.com/photos/tma/2264878

All other images original from GaianDB work