
MGRID

Implementing MGRID

CHAR(11), Cambridge, July 12, 2011

Portavita

• Chronic disease management
• Largest online electronic health record (EHR) in the Netherlands
• Largest telemedicine project in Europe


Portavita’s growth

[Figure: number of patients in the Netherlands (axis 0 to 200000), Jun-2002 to Mar-2011; series: Anticoagulation, Diabetes, COPD, CVRM, Total]

Scaling up

• Several months orientation period, talked with VARs
• Took three months to implement

Cost of scaling up

Benefits of scaling out

• parallelize tuple streams for increased speed
• eliminate single-server memory bandwidth, processor and IO bottlenecks
• control query latency with shard size
• machines can be each other's replicas
• no negative economy of scale

The MGRID Solution

• Make parallel PostgreSQL
  • that can scale out
  • that has built-in redundancy
  • that allows online adding of hardware
  • that supports all features of core PostgreSQL (ACID, stored procedures, etc.)
• That supports medical data
  • ISO-21090 Healthcare Datatypes


Healthcare Datatypes

Physical Quantities: example

create table patient (name text, height pq, weight pq);
CREATE TABLE

insert into patient values
  ('Jack', '1.92 m', '92 kg')
 ,('Julia', '150 cm', '50 kg')
 ,('John', '188 cm', '84.3 kg')
 ,('Luke', '78 cm', '11800 g');
INSERT 0 4

create or replace function bmi(height pq, weight pq)
returns pq
as $$
  select convert($2, 'kg') / convert($1, 'm')^2;
$$ language sql immutable;
CREATE FUNCTION

select *, bmi(height, weight) from patient where height > '1.70 m'
order by weight;
 name | height | weight  |            bmi
------+--------+---------+---------------------------
 John | 188 cm | 84.3 kg | 23.8512901765504753 kg/m2
 Jack | 1.92 m | 92 kg   | 24.9565972222222222 kg/m2
(2 rows)

/* PQ contains most units used in science and engineering and can be used
 * outside the medical vertical. E.g. what is the mean travel time of light
 * from the sun to the earth?
 */
select convert(pq '1 AU' / '[c]', 's');
        convert
------------------------
 499.0047838061356433 s
(1 row)

Physical Quantities

• PQs used to document observations
• Based on Unified Code for Units of Measure
  • 294 units, among others from SI, ISO 1000, ISO 2955, ANSI X3.50, CGS, unified U.S. & British Imperial units
• Operations supported:
  • Comparison: <, > and friends
  • Arithmetic: +, −, /, ∗, power
  • Aggregation: min, max, avg, sum, var, stddev
• Indexable
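
To make the comparison and arithmetic operations concrete, here is a minimal Python sketch of a unit-aware quantity. It is not the MGRID implementation: the `PQ` class, the `UNITS` table and its four units are hypothetical, and only length and mass are tracked. Values are normalized to base units on parsing, so a height in cm can be compared with one in m, and division and power carry the dimension exponents along.

```python
from dataclasses import dataclass

# Conversion factors to base units (metre, kilogram). A tiny hypothetical
# subset for illustration, not the 294-unit table mentioned on the slide.
UNITS = {
    "m": (1.0, (1, 0)),    # (factor, (length exponent, mass exponent))
    "cm": (0.01, (1, 0)),
    "kg": (1.0, (0, 1)),
    "g": (0.001, (0, 1)),
}


@dataclass(frozen=True)
class PQ:
    value: float  # magnitude expressed in base units
    dim: tuple    # (length exponent, mass exponent)

    @classmethod
    def parse(cls, text):
        """Parse a literal such as '188 cm' into base units."""
        number, unit = text.split()
        factor, dim = UNITS[unit]
        return cls(float(number) * factor, dim)

    def __truediv__(self, other):
        # Dividing quantities subtracts dimension exponents.
        return PQ(self.value / other.value,
                  tuple(a - b for a, b in zip(self.dim, other.dim)))

    def __pow__(self, n):
        # Raising to a power multiplies dimension exponents.
        return PQ(self.value ** n, tuple(a * n for a in self.dim))

    def __gt__(self, other):
        if self.dim != other.dim:
            raise ValueError("cannot compare quantities of different dimension")
        return self.value > other.value


def bmi(height, weight):
    """Body mass index in kg/m2, mirroring the SQL function on the example slide."""
    return weight / height ** 2
```

With this sketch, `bmi(PQ.parse("188 cm"), PQ.parse("84.3 kg"))` has dimension `(-2, 1)`, that is kg/m2, and `PQ.parse("150 cm") > PQ.parse("1.70 m")` is false, matching the filter in the example query.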

Intervals and sets of point in time: example

select canonical(ivl_ts '[2004;2005[' + ivl_ts '[2005;2006[') as plus,
       canonical(ivl_ts '[2002;2010]' - ivl_ts '[2004;2005]') as minus;
    plus     |          minus
-------------+-------------------------
 [2004;2006[ | [2002;2004[;]2005;2010]
(1 row)

create table medication (name text, effectivetime ivl_ts);
insert into medication values ('Pete', '[20100316;20100514]'),
                              ('Pete', '[20100420;20100701]'),
                              ('Pete', '[20101220;20110119]'),
                              ('John', '[20100516;20100614]'),
                              ('John', '[20100620;20100801]'),
                              ('John', '[20101220;20110119]');
INSERT 0 6

select * from medication where effectivetime @> '20100620';
 name |    effectivetime
------+---------------------
 Pete | [20100420;20100701]
 John | [20100620;20100801]
(2 rows)

select name, canonical('2010' - sum(effectivetime)) as nomeds
from medication group by name;
 name |                           nomeds
------+-------------------------------------------------------------
 John | [20100101;20100516[;]20100614;20100620[;]20100801;20101220[
 Pete | [20100101;20100316[;]20100701;20101220[
(2 rows)
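
The last query subtracts the union of a patient's medication intervals from the year 2010. A rough Python sketch of that set arithmetic follows. It is a simplification under stated assumptions: the `merge` and `gaps` helpers are hypothetical, intervals are closed and day-granular, so an open boundary such as `]20100701;20101220[` appears here as the closed day range 2010-07-02 to 2010-12-19.

```python
from datetime import date, timedelta

DAY = timedelta(days=1)


def merge(intervals):
    """Union of closed day intervals, returned sorted and non-overlapping."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + DAY:
            # Overlapping or directly adjacent: extend the previous interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def gaps(window, intervals):
    """Closed day ranges inside `window` that no interval covers."""
    lo, hi = window
    result = []
    cursor = lo  # first day not yet known to be covered
    for start, end in merge(intervals):
        if end < lo or start > hi:
            continue
        if start > cursor:
            result.append((cursor, start - DAY))
        cursor = max(cursor, end + DAY)
    if cursor <= hi:
        result.append((cursor, hi))
    return result


YEAR_2010 = (date(2010, 1, 1), date(2010, 12, 31))
PETE = [
    (date(2010, 3, 16), date(2010, 5, 14)),
    (date(2010, 4, 20), date(2010, 7, 1)),
    (date(2010, 12, 20), date(2011, 1, 19)),
]
```

Calling `gaps(YEAR_2010, PETE)` yields two ranges, 1 January to 15 March and 2 July to 19 December, which correspond to the `nomeds` value shown for Pete.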

Coded values

• Controlled vocabularies in medical informatics
  • record information unambiguously
  • allow machine reasoning
• HL7v3 Coded value implementation
• Support for a large number of code systems:
  • Systematized Nomenclature of Medicine – Clinical Terms
  • HL7v3 vocabularies, all editions
  • Logical Observation Identifiers Names and Codes
  • you can add your own
• Supports
  • hierarchical code systems
  • code system versioning
• Indexable

Coded values: example

select name, code(disorder), codesystemname(disorder),
       displayname(disorder) from observation;
  name  |   code    | codesystemname |     displayname
--------+-----------+----------------+---------------------
 Willem | 71620000  | SNOMED-CT      | Fracture of femur
 Yeb    | 66308002  | SNOMED-CT      | Fracture of humerus
 Henk   | 262994004 | SNOMED-CT      | Leg sprain
(3 rows)

select name, displayname(disorder) from observation
where disorder << '284003005|Fracture of bone'::cv('SNOMED-CT');
  name  |     displayname
--------+---------------------
 Willem | Fracture of femur
 Yeb    | Fracture of humerus
(2 rows)

select name, displayname(disorder) from observation
where disorder << '127279002|Injury of lower extremity'::cv('SNOMED-CT');
  name  |    displayname
--------+-------------------
 Willem | Fracture of femur
 Henk   | Leg sprain
(2 rows)
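
The two `<<` queries behave like a descendant test in a hierarchy where a code may have more than one parent. The Python sketch below illustrates that idea only. The `PARENTS` map is a toy: it links the example codes directly to the two ancestor codes used in the queries, whereas the real code system has intermediate concepts. It also assumes a strict descendant test; the slides do not say whether `<<` matches the code itself.

```python
# Direct "is-a" links for a toy hierarchy (hypothetical, heavily simplified).
PARENTS = {
    "71620000": {"284003005", "127279002"},  # Fracture of femur
    "66308002": {"284003005"},               # Fracture of humerus
    "262994004": {"127279002"},              # Leg sprain
    "284003005": set(),                      # Fracture of bone
    "127279002": set(),                      # Injury of lower extremity
}

OBSERVATIONS = {"Willem": "71620000", "Yeb": "66308002", "Henk": "262994004"}


def is_descendant(code, ancestor, parents=PARENTS):
    """True if `ancestor` is reachable from `code` by following is-a links."""
    stack = list(parents.get(code, ()))
    seen = set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return False


def patients_under(ancestor):
    """Names whose recorded disorder is a descendant of `ancestor`, sorted."""
    return sorted(name for name, code in OBSERVATIONS.items()
                  if is_descendant(code, ancestor))
```

Here `patients_under("284003005")` returns Willem and Yeb, and `patients_under("127279002")` returns Henk and Willem, the same patients as in the two query results.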


Parallel Processing

The Idea – a sketch

[Figure: Serial Processing versus Parallel Processing]

From serial queries to parallel queries

[Figure: Relay and Cells; labels: Queries / Results using PostgreSQL API, Partitioned Queries, Partitioned Results]

• layout defines distribution
  • of tables
  • on cells
  • via attributes
  • using a degree of parallelism (dop)
• relay grid gateway
  • provides a standard PostgreSQL interface for clients
  • plans distributed queries
  • combines grid results
• cells hold partitioned data
• redundancy group
  • one complete copy of the data
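
As a rough illustration of the roles above, the following Python sketch distributes rows over cells by an attribute and lets a relay combine partial results. It is not MGRID's algorithm: the modulo placement in `cell_for`, the helper names and the in-memory "cells" are all assumptions made for the example.

```python
from collections import defaultdict


def cell_for(key, dop):
    """Pick a cell for a distribution key; modulo placement is an assumption."""
    return key % dop


def distribute(rows, key, dop):
    """Partition rows over `dop` cells according to the distribution attribute."""
    cells = defaultdict(list)
    for row in rows:
        cells[cell_for(row[key], dop)].append(row)
    return cells


def partial_avg(rows, column):
    """Work done on one cell: return (sum, count) so the relay can combine."""
    values = [row[column] for row in rows]
    return sum(values), len(values)


def relay_avg(cells, column):
    """Relay side: gather the partial results from every cell and combine them."""
    partials = [partial_avg(rows, column) for rows in cells.values()]
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


accounts = [{"aid": aid, "abalance": aid * 10} for aid in range(1, 10)]
cells = distribute(accounts, key="aid", dop=4)
```

Each cell returns a sum and a count rather than its own average, because averaging per-cell averages would be wrong when cells hold different numbers of rows.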


Performance

Test platform

• Simple grid
  • one location
  • one redundancy group
  • and up to 10 hosts
• Each host is
  • AMD X3 720
  • 16GB PC6400 DDR2
  • 3x WD RE3 250GB SATA
  • XFS, barrier = off
  • 1Gb network

Test database

• Consider the pgbench ERD:

  pgbench_accounts: aid, bid, abalance, filler
  pgbench_branches: bid, bbalance, filler
  pgbench_history:  tid, bid, aid, delta, mtime, filler
  pgbench_tellers:  tid, bid, tbalance, filler

• With a layout “per_account”
  • distribution key accounts.aid and history.aid

Select test

• Determine read only / select speed
• Query
  SELECT abalance, filler FROM pgbench_accounts WHERE aid = :aid
• Platform configuration
  • single host or
  • layout per_account dop ∈ [2, 4, 9] and #hosts = dop
• pgbench configuration:
  • #clients ∈ [8, 16, 32, 64, 96]
  • scale_factor ∈ [100, 200, 400, 800, 1300, 1800]

Select results

TPC-B test

• Determine “TPC-B” transaction speed
• Query
  BEGIN;
  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
    VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
  END;
• Platform and test configuration as before

TPC-B results

Mixed load test

• Determine results for a Portavita-like mixed load
  Query =
    select   90%
    tpc-b    9.9%
    complex  0.1% of the time
• Where complex is
  SELECT a.bid, avg(abalance) AS b FROM pgbench_accounts a
  WHERE a.bid = :bid GROUP BY a.bid ORDER BY a.bid
• Platform configuration as before
• pgbench configuration:
  • #clients = 16
  • scale_factor ∈ [200, 400, 800, 1800]
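
The query mix can be pictured as a weighted choice per transaction. The Python sketch below is one possible way to draw from such a mix, not a description of how the test was actually driven: the `MIX` table and the `pick` and `next_query` helpers are hypothetical. Weights are kept as integers per thousand so the 90% / 9.9% / 0.1% split is exact.

```python
import random

# Weights per 1000 transactions: 90%, 9.9% and 0.1%.
MIX = [("select", 900), ("tpc-b", 99), ("complex", 1)]


def pick(n):
    """Map an integer n in range(1000) to a query type by cumulative weight."""
    for name, weight in MIX:
        if n < weight:
            return name
        n -= weight
    raise ValueError("n must be in range(1000)")


def next_query(rng=random):
    """Draw the next query type at random according to MIX."""
    return pick(rng.randrange(1000))
```

Over the 1000 possible values of `n`, `pick` returns `select` 900 times, `tpc-b` 99 times and `complex` once.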

Mixed load latencies, 1 server

[Figure: four latency panels: single, scale 200; single, scale 400; single, scale 800; single, scale 1800]

Mixed load latencies, grid, constant shard size

[Figure: four latency panels: single, scale 200; grid2, scale 400; grid4, scale 800; grid9, scale 1800]
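
"Constant shard size" can be checked with a little arithmetic: in each panel the scale factor grows in step with the number of hosts. The Python sketch below assumes that rows are spread evenly over the hosts and that gridN means dop = N, and it uses pgbench's 100,000 `pgbench_accounts` rows per scale-factor unit; `CONFIGS` and `rows_per_host` are names made up for the example.

```python
ROWS_PER_SCALE = 100_000  # pgbench_accounts rows per scale-factor unit

# (panel label, number of hosts, pgbench scale factor) as on the slide.
CONFIGS = [("single", 1, 200), ("grid2", 2, 400),
           ("grid4", 4, 800), ("grid9", 9, 1800)]


def rows_per_host(hosts, scale):
    """Account rows held by each host, assuming an even distribution."""
    return scale * ROWS_PER_SCALE // hosts


for label, hosts, scale in CONFIGS:
    print(label, rows_per_host(hosts, scale))
```

Every configuration comes out at 20,000,000 account rows per host, which is what makes the four panels comparable.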


Back to Portavita

Deploy with redundancy

[Figure: deployment diagram with two locations and four redundancy groups, each drawn with hosts and two switches; other labels: private data network, VPN, application domain, load balancer, public / application data network, WAN, admin network]

Conclusions

• Parallel PostgreSQL is a solution for mixed OLTP / OLAP use cases, provided your data is partitionable
• Control complex query response time with shard size
• Using Healthcare Datatypes as UDTs (instead of ORM) increases developer productivity


Questions


Contact

MGRID

ir. Willem Dijkstra, Partner

www.mgrid.net

T +31 886 474 302

F +31 886 474 301

M +31 611 144 118

[email protected]

Oostenburgervoorstraat 100

1018 MR Amsterdam

PO BOX 1287

1000 BG Amsterdam

The Netherlands

MGRID

ir. Yeb Havinga, Partner

www.mgrid.net

T +31 886 474 303

F +31 886 474 301

M +31 652 523 546

[email protected]

Oostenburgervoorstraat 100

1018 MR Amsterdam

PO BOX 1287

1000 BG Amsterdam

The Netherlands


Backup slides

HL7v3 reference information model

Source: Grahame Grieve

Interval and sets of point in time

• Point in time is relevant to every query
• HL7v3 Point in Time implementation
• Operations supported:
  • Comparison: overlaps, contains
  • Arithmetic: +, −, intersect
  • Aggregation: sum
  • Construction: intervalafter and friends
• Indexable

Mixed load results

Mixed load results – zoom

Mixed load test 2

• Determine results for a Portavita-like mixed load
  Query =
    select   90%
    tpc-b    9.9%
    complex  0.1% of the time
• Where complex is
  SELECT h.tid as teller, SUM(delta), a.aid as account, AVG(abalance)
  FROM pgbench_history h
  JOIN pgbench_accounts a ON a.aid = h.aid
  WHERE h.bid = :bid GROUP BY h.tid, a.aid ORDER BY h.tid
• Platform and test configuration as before

Mixed load results 2