Getting Started with PL/Proxy

Peter Eisentraut, F-Secure Corporation. Presentation from PostgreSQL Conference East (PgEast) 2011. License: CC-BY.

Page 1: Getting Started with PL/Proxy

CC-BY

Getting Started with PL/Proxy

Peter Eisentraut

F-Secure Corporation

PostgreSQL Conference East 2011

Page 2: Getting Started with PL/Proxy

Concept

• a database partitioning system implemented as a procedural language
• “sharding”/horizontal partitioning
• PostgreSQL’s No(t-only)SQL solution

Page 3: Getting Started with PL/Proxy

Concept

[Diagram: four applications connect to a single frontend, which routes calls to partitions 1 to 4.]

Page 4: Getting Started with PL/Proxy

Areas of Application

• high write load
• (high read load)
• allow for some “eventual consistency”
• have reasonable partitioning keys
• use/plan to use server-side functions

Page 5: Getting Started with PL/Proxy

Example

Have: [1]

CREATE TABLE products (
    prod_id serial PRIMARY KEY,
    category integer NOT NULL,
    title varchar(50) NOT NULL,
    actor varchar(50) NOT NULL,
    price numeric(12,2) NOT NULL,
    special smallint,
    common_prod_id integer NOT NULL
);

INSERT INTO products VALUES (...);

UPDATE products SET ... WHERE ...;

DELETE FROM products WHERE ...;

plus various queries

[1] dellstore2 example database

Page 6: Getting Started with PL/Proxy

Installation

• Download: http://plproxy.projects.postgresql.org, Deb, RPM, ...
• Create language: psql -d dellstore2 -f ...../plproxy.sql

Page 7: Getting Started with PL/Proxy

Backend Functions I

CREATE FUNCTION insert_product(p_category int, p_title varchar,
    p_actor varchar, p_price numeric, p_special smallint,
    p_common_prod_id int) RETURNS int
LANGUAGE plpgsql
AS $$
DECLARE
    cnt int;
BEGIN
    INSERT INTO products (category, title, actor, price, special,
        common_prod_id)
    VALUES (p_category, p_title, p_actor, p_price, p_special,
        p_common_prod_id);
    GET DIAGNOSTICS cnt = ROW_COUNT;
    RETURN cnt;
END;
$$;

Page 8: Getting Started with PL/Proxy

Backend Functions II

CREATE FUNCTION update_product_price(p_prod_id int, p_price numeric)
    RETURNS int
LANGUAGE plpgsql
AS $$
DECLARE
    cnt int;
BEGIN
    UPDATE products SET price = p_price WHERE prod_id = p_prod_id;
    GET DIAGNOSTICS cnt = ROW_COUNT;
    RETURN cnt;
END;
$$;

Page 9: Getting Started with PL/Proxy

Backend Functions III

CREATE FUNCTION delete_product_by_title(p_title varchar) RETURNS int
LANGUAGE plpgsql
AS $$
DECLARE
    cnt int;
BEGIN
    DELETE FROM products WHERE title = p_title;
    GET DIAGNOSTICS cnt = ROW_COUNT;
    RETURN cnt;
END;
$$;

Page 10: Getting Started with PL/Proxy

Frontend Functions I

CREATE FUNCTION insert_product(p_category int, p_title varchar,
    p_actor varchar, p_price numeric, p_special smallint,
    p_common_prod_id int) RETURNS SETOF int
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON hashtext(p_title);
$$;

CREATE FUNCTION update_product_price(p_prod_id int, p_price numeric)
    RETURNS SETOF int
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON ALL;
$$;

Page 11: Getting Started with PL/Proxy

Frontend Functions II

CREATE FUNCTION delete_product_by_title(p_title varchar) RETURNS int
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON hashtext(p_title);
$$;
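From the application's point of view, the frontend functions are called like any other PostgreSQL function. A minimal sketch, assuming the backend and frontend functions from the preceding slides are installed and the cluster configuration is in place; all argument values are invented for illustration:

```sql
-- Routed to exactly one partition, chosen by hashtext(p_title).
SELECT * FROM insert_product(1, 'EXAMPLE TITLE', 'EXAMPLE ACTOR',
                             9.99, 0::smallint, 1);

-- prod_id is not the hash key, so this runs on every partition and
-- returns one row count per partition; add them up on the frontend.
SELECT sum(n) AS rows_updated
FROM update_product_price(42, 12.50) AS t(n);

-- Routed by title again; returns the number of rows deleted.
SELECT delete_product_by_title('EXAMPLE TITLE');
```

The explicit smallint cast is needed because an integer literal does not implicitly match a smallint parameter.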

Page 12: Getting Started with PL/Proxy

Frontend Query Functions I

CREATE FUNCTION get_product_price(p_prod_id int) RETURNS SETOF numeric
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON ALL;
    SELECT price FROM products WHERE prod_id = p_prod_id;
$$;

Page 13: Getting Started with PL/Proxy

Frontend Query Functions II

CREATE FUNCTION get_products_by_category(p_category int)
    RETURNS SETOF products
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON ALL;
    SELECT * FROM products WHERE category = p_category;
$$;
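Query functions are consumed as row sources. No global ordering is guaranteed across partitions, so any sorting has to happen on the frontend. A sketch, assuming the two query functions above exist; the argument values are illustrative:

```sql
-- Asks every partition for the price of one product.
SELECT * FROM get_product_price(42);

-- Rows from all partitions are combined; sort them on the frontend.
SELECT prod_id, title, price
FROM get_products_by_category(3)
ORDER BY title;
```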

Page 14: Getting Started with PL/Proxy

Unpartitioned Small Tables

-- parameter type assumed; the slide omits it
CREATE FUNCTION insert_category(p_categoryname varchar)
    RETURNS SETOF int
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON 0;
$$;
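Reads of such a table can be pinned to the same partition. A sketch, assuming the dellstore2 categories table lives only on partition 0 and that a matching categories row type also exists on the frontend; get_categories is a hypothetical name, not part of the talk:

```sql
-- Hypothetical reader for a small table kept on partition 0 only.
CREATE FUNCTION get_categories() RETURNS SETOF categories
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON 0;
    SELECT * FROM categories;
$$;
```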

Page 15: Getting Started with PL/Proxy

Which Hash Key?

• natural keys (names, descriptions, UUIDs)
• not serials (Consider using fewer “ID” fields.)
• single columns
• group sensibly to allow joins on backend

Page 17: Getting Started with PL/Proxy

Set Basic Parameters

• number of partitions (2^n), e.g. 8
• host names, e.g.
  • frontend: dbfe
  • backends: dbbe1, ..., dbbe8 (or start at 0?)
• database names, e.g.
  • frontend: dellstore2
  • backends: store01, ..., store08 (or start at 0?)
• user names, e.g. storeapp
• hardware:
  • frontend: lots of memory, normal disk
  • backends: full-sized database server

Page 18: Getting Started with PL/Proxy

Configuration

CREATE FUNCTION plproxy.get_cluster_partitions(cluster_name text)
    RETURNS SETOF text LANGUAGE plpgsql AS $$...$$;

CREATE FUNCTION plproxy.get_cluster_version(cluster_name text)
    RETURNS int LANGUAGE plpgsql AS $$...$$;

CREATE FUNCTION plproxy.get_cluster_config(IN cluster_name text,
    OUT key text, OUT val text)
    RETURNS SETOF record LANGUAGE plpgsql AS $$...$$;

Page 19: Getting Started with PL/Proxy

get_cluster_partitions

Simplistic approach:

CREATE FUNCTION plproxy.get_cluster_partitions(cluster_name text)
    RETURNS SETOF text
LANGUAGE plpgsql
AS $$
BEGIN
    IF cluster_name = 'dellstore_cluster' THEN
        RETURN NEXT 'dbname=store01 host=dbbe1';
        RETURN NEXT 'dbname=store02 host=dbbe2';
        ...
        RETURN NEXT 'dbname=store08 host=dbbe8';
        RETURN;
    END IF;
    RAISE EXCEPTION 'Unknown cluster';
END;
$$;

Page 20: Getting Started with PL/Proxy

get_cluster_version

Simplistic approach:

CREATE FUNCTION plproxy.get_cluster_version(cluster_name text)
    RETURNS int
LANGUAGE plpgsql
AS $$
BEGIN
    IF cluster_name = 'dellstore_cluster' THEN
        RETURN 1;
    END IF;
    RAISE EXCEPTION 'Unknown cluster';
END;
$$;

Page 21: Getting Started with PL/Proxy

get_cluster_config

CREATE OR REPLACE FUNCTION plproxy.get_cluster_config(
    IN cluster_name text, OUT key text, OUT val text)
    RETURNS SETOF record
LANGUAGE plpgsql
AS $$
BEGIN
    -- same config for all clusters
    key := 'connection_lifetime';
    val := 30*60;  -- 30 minutes, in seconds
    RETURN NEXT;
    RETURN;
END;
$$;
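Once the three functions exist, they can be called directly on the frontend to check what PL/Proxy will see. A sketch, assuming the simplistic versions above are installed:

```sql
-- One connection string per partition; the count must be a power of two.
SELECT * FROM plproxy.get_cluster_partitions('dellstore_cluster');

-- PL/Proxy reloads the partition list when this number changes.
SELECT plproxy.get_cluster_version('dellstore_cluster');

-- Key/value settings such as connection_lifetime.
SELECT * FROM plproxy.get_cluster_config('dellstore_cluster');
```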

Page 22: Getting Started with PL/Proxy

Table-Driven Configuration I

CREATE TABLE plproxy.partitions (
    cluster_name text NOT NULL,
    host text NOT NULL,
    port text NOT NULL,
    dbname text NOT NULL,
    PRIMARY KEY (cluster_name, dbname)
);

INSERT INTO plproxy.partitions VALUES
    ('dellstore_cluster', 'dbbe1', '5432', 'store01'),
    ('dellstore_cluster', 'dbbe2', '5432', 'store02'),
    ...
    ('dellstore_cluster', 'dbbe8', '5432', 'store08');

Page 23: Getting Started with PL/Proxy

Table-Driven Configuration II

CREATE TABLE plproxy.cluster_users (
    cluster_name text NOT NULL,
    remote_user text NOT NULL,
    local_user text NOT NULL,
    PRIMARY KEY (cluster_name, remote_user, local_user)
);

INSERT INTO plproxy.cluster_users VALUES
    ('dellstore_cluster', 'storeapp', 'storeapp');

Page 24: Getting Started with PL/Proxy

Table-Driven Configuration III

CREATE TABLE plproxy.remote_passwords (
    host text NOT NULL,
    port text NOT NULL,
    dbname text NOT NULL,
    remote_user text NOT NULL,
    password text,
    PRIMARY KEY (host, port, dbname, remote_user)
);

INSERT INTO plproxy.remote_passwords VALUES
    ('dbbe1', '5432', 'store01', 'storeapp', 'Thu1Ued0'),
    ...

-- or use .pgpass?

Page 25: Getting Started with PL/Proxy

Table-Driven Configuration IV

CREATE TABLE plproxy.cluster_version (
    id int PRIMARY KEY
);

INSERT INTO plproxy.cluster_version VALUES (1);

GRANT SELECT ON plproxy.cluster_version TO PUBLIC;

/* extra credit: write trigger that changes the version
   when one of the other tables changes */
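One possible answer to the extra-credit exercise, as a sketch rather than the author's solution: a statement-level trigger on each configuration table that increments the single version row. The function and trigger names are hypothetical; the table names are the ones defined on the preceding slides.

```sql
-- Hypothetical helper: bump the single version row after any change.
CREATE FUNCTION plproxy.bump_cluster_version() RETURNS trigger
LANGUAGE plpgsql
AS $$
BEGIN
    UPDATE plproxy.cluster_version SET id = id + 1;
    RETURN NULL;  -- return value is ignored for AFTER statement triggers
END;
$$;

CREATE TRIGGER partitions_bump_version
    AFTER INSERT OR UPDATE OR DELETE ON plproxy.partitions
    FOR EACH STATEMENT EXECUTE PROCEDURE plproxy.bump_cluster_version();

CREATE TRIGGER cluster_users_bump_version
    AFTER INSERT OR UPDATE OR DELETE ON plproxy.cluster_users
    FOR EACH STATEMENT EXECUTE PROCEDURE plproxy.bump_cluster_version();

CREATE TRIGGER remote_passwords_bump_version
    AFTER INSERT OR UPDATE OR DELETE ON plproxy.remote_passwords
    FOR EACH STATEMENT EXECUTE PROCEDURE plproxy.bump_cluster_version();
```

A statement-level trigger is enough here because only the fact that something changed matters, not which rows.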

Page 26: Getting Started with PL/Proxy

Table-Driven Configuration V

CREATE OR REPLACE FUNCTION plproxy.get_cluster_partitions(p_cluster_name text)
    RETURNS SETOF text
LANGUAGE plpgsql
SECURITY DEFINER
AS $$
DECLARE
    r record;
BEGIN
    FOR r IN
        SELECT 'host=' || host || ' port=' || port || ' dbname=' || dbname
            || ' user=' || remote_user || ' password=' || password AS dsn
        FROM plproxy.partitions
            NATURAL JOIN plproxy.cluster_users
            NATURAL JOIN plproxy.remote_passwords
        WHERE cluster_name = p_cluster_name
            AND local_user = session_user
        ORDER BY dbname  -- important
    LOOP
        RETURN NEXT r.dsn;
    END LOOP;
    IF NOT found THEN
        RAISE EXCEPTION 'no such cluster: %', p_cluster_name;
    END IF;
    RETURN;
END;
$$;

Page 27: Getting Started with PL/Proxy

Table-Driven Configuration VI

CREATE FUNCTION plproxy.get_cluster_version(p_cluster_name text)
    RETURNS int
LANGUAGE plpgsql
AS $$
DECLARE
    ret int;
BEGIN
    SELECT id INTO ret FROM plproxy.cluster_version;
    RETURN ret;
END;
$$;

Page 28: Getting Started with PL/Proxy

SQL/MED Configuration

CREATE SERVER dellstore_cluster FOREIGN DATA WRAPPER plproxy
OPTIONS (
    connection_lifetime '1800',
    p0 'dbname=store01 host=dbbe1',
    p1 'dbname=store02 host=dbbe2',
    ...
    p7 'dbname=store08 host=dbbe8'
);

CREATE USER MAPPING FOR storeapp SERVER dellstore_cluster
    OPTIONS (user 'storeapp', password 'sekret');

GRANT USAGE ON SERVER dellstore_cluster TO storeapp;

Page 29: Getting Started with PL/Proxy

Hash Functions

RUN ON hashtext(somecolumn);

• want a fast, uniform hash function
• typically use hashtext
  • problem: implementation might change
  • possible solution: https://github.com/petere/pgvihash
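PL/Proxy picks the partition from the low bits of the hash value, which is why the number of partitions must be a power of two. The mapping can be previewed on any PostgreSQL server without PL/Proxy installed. In this sketch the titles are made up and the mask 7 assumes 8 partitions:

```sql
-- partition_no is the index PL/Proxy would use with 8 partitions.
SELECT title, hashtext(title) & 7 AS partition_no
FROM (VALUES ('EXAMPLE TITLE ONE'),
             ('EXAMPLE TITLE TWO'),
             ('EXAMPLE TITLE THREE')) AS t(title);
```

Running the same query on two PostgreSQL versions is a quick way to notice the "implementation might change" problem before it moves rows to the wrong partition.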

Page 30: Getting Started with PL/Proxy

Sequences

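shard 1:

ALTER SEQUENCE products_prod_id_seq MINVALUE 1 MAXVALUE 100000000 START 1;

shard 2:

-- RESTART moves the current value into the new range; otherwise it
-- would lie below the new MINVALUE and the command would be rejected
ALTER SEQUENCE products_prod_id_seq MINVALUE 100000001
    MAXVALUE 200000000 START 100000001 RESTART WITH 100000001;

etc.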

Page 31: Getting Started with PL/Proxy

Aggregates

Example: count all products

Backend:

CREATE FUNCTION count_products() RETURNS bigint
LANGUAGE SQL STABLE
AS $$SELECT count(*) FROM products$$;

Frontend:

CREATE FUNCTION count_products() RETURNS SETOF bigint
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON ALL;
$$;

SELECT sum(x) AS count FROM count_products() AS t(x);
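A plain sum works for counts, but not for every aggregate: averaging the per-partition averages gives the wrong answer when partitions differ in size. One way around this, sketched here with a hypothetical price_stats function that is not part of the talk, is to return the partial sum and count from each partition and divide on the frontend:

```sql
-- Backend: one row with the partial sum and count for this partition.
CREATE FUNCTION price_stats(OUT total numeric, OUT cnt bigint)
LANGUAGE sql STABLE
AS $$ SELECT sum(price), count(*) FROM products $$;

-- Frontend: collects one (total, cnt) row per partition.
CREATE FUNCTION price_stats(OUT total numeric, OUT cnt bigint)
RETURNS SETOF record
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON ALL;
$$;

-- Combine the partial results; NULLIF avoids division by zero.
SELECT sum(total) / NULLIF(sum(cnt), 0) AS avg_price
FROM price_stats();
```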

Page 32: Getting Started with PL/Proxy

Dynamic Queries I

a.k.a. “cheating” ;-)

Frontend:

CREATE FUNCTION execute_query(sql text) RETURNS SETOF RECORD
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    RUN ON ALL;
$$;

Backend:

CREATE FUNCTION execute_query(sql text) RETURNS SETOF RECORD
LANGUAGE plpgsql
AS $$
BEGIN
    RETURN QUERY EXECUTE sql;
END;
$$;

Page 33: Getting Started with PL/Proxy

Dynamic Queries II

SELECT * FROM execute_query('SELECT title, price FROM products')
    AS (title varchar, price numeric);

SELECT category, sum(sum_price)
FROM execute_query('SELECT category, sum(price) FROM products GROUP BY category')
    AS (category int, sum_price numeric)
GROUP BY category;

Page 34: Getting Started with PL/Proxy

Repartitioning

• changing partitioning key is extremely cumbersome
• adding partitions is somewhat cumbersome, e.g., to split shard 0:

COPY (SELECT * FROM products
      WHERE (hashtext(title::text) & 15) <> 0) TO 'somewhere';

DELETE FROM products WHERE (hashtext(title::text) & 15) <> 0;

Better start out with enough partitions!
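Before splitting, the effect can be previewed on the shard itself. A sketch, assuming a move from 8 to 16 partitions and run on the old shard 0: rows whose new partition number is 0 stay, and the rest are the ones that the COPY and DELETE above would move.

```sql
-- On old shard 0 of 8, the new partition number can only be 0 or 8.
SELECT hashtext(title::text) & 15 AS new_partition,
       count(*) AS n_rows
FROM products
GROUP BY 1
ORDER BY 1;
```

Any other partition number in the output would mean the shard already holds misrouted rows.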

Page 35: Getting Started with PL/Proxy

PgBouncer

[Diagram: four applications connect to the frontend; the frontend reaches each of partitions 1 to 4 through its own PgBouncer.]

Use

pool_mode = statement

Page 36: Getting Started with PL/Proxy

Development Issues

• foreign keys
• notifications
• hash key check constraints
• testing (pgTAP), no validator
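A hash key check constraint makes a backend reject rows that were routed to the wrong partition. A sketch for partition 0 of 8, assuming title is the hash key as in the earlier slides; the constraint name is hypothetical, and each backend needs its own partition number on the right-hand side:

```sql
-- On the backend for partition 0: accept only rows that hash here.
ALTER TABLE products
    ADD CONSTRAINT products_partition_check
    CHECK ((hashtext(title::text) & 7) = 0);
```

The constraint depends on hashtext staying stable, so it is subject to the same caveat about the hash implementation changing.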

Page 37: Getting Started with PL/Proxy

Administration

• centralized logging
• distributed shell (dsh)
• query canceling/timeouts
• access control, firewalling
• deployment

Page 38: Getting Started with PL/Proxy

High Availability

Frontend:
• multiple frontends (DNS, load balancer?)
• replicate partition configuration (Slony, Bucardo, WAL)
• Heartbeat, UCARP, etc.

Backend:
• replicate backend shards individually (Slony, WAL, DRBD)
• use partition configuration to configure load spreading or failover

Page 39: Getting Started with PL/Proxy

Advanced Topics

• generic insert, update, delete functions
• frontend joins
• backend joins
• finding balance between function interface and dynamic queries
• arrays, SPLIT BY
• use for remote database calls
• cross-shard calls
• SQL/MED (foreign table) integration
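As a pointer for the arrays item: SPLIT lets one frontend call take an array, break it up by hash, and send each partition only its own elements. A sketch, assuming a PL/Proxy version that supports SPLIT and a hypothetical backend function delete_products_by_titles(text[]) that deletes the given titles and returns a row count:

```sql
-- Frontend: each partition receives only the titles that hash to it.
CREATE FUNCTION delete_products_by_titles(p_titles text[])
    RETURNS SETOF int
LANGUAGE plproxy
AS $$
    CLUSTER 'dellstore_cluster';
    SPLIT p_titles;
    RUN ON hashtext(p_titles);
$$;

-- One row count per partition that was called; add them up.
SELECT sum(n) AS rows_deleted
FROM delete_products_by_titles(ARRAY['TITLE A', 'TITLE B']) AS t(n);
```

Compared with calling delete_product_by_title once per title, this needs a single round trip from the application.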

Page 40: Getting Started with PL/Proxy

The End