Developing and Deploying Apps with the Postgres FDW
My Love of Developing with the Postgres FDW
...and how production tested those feelings.
Jonathan S. Katz PGConf EU 2015 - October 30, 2015
Hi! I'm Jonathan!
A Bit About Me
• @jkatz05
• Chief Technology Officer @ VenueBook
• Using Postgres since ~2004
• Been using it decently since ~2010
• One day hope to use it well ;-)
• Active Postgres community member
  • Co-Chair, PGConf US
  • Co-organizer, NYC PostgreSQL User Group
  • Director, United States PostgreSQL Association
• Have been to every PGConf.EU except Madrid :(
Disclaimer
I loooooove PostgreSQL
Disclaimer #2
I'm some sort of weird dev / DBA / business-person hybrid
Okay, done with the boilerplate.
Let's do this.
Foreign Data Wrappers in a Nutshell
• Provide a unified interface (i.e. SQL) to access different data sources
  • RDBMS (like Postgres!)
  • NoSQL
  • APIs (HTTP, Twitter, etc.)
  • Internet of things
IMHO: This is a killer feature
History of FDWs
• Released in 9.1 with a few read-only interfaces
  • SQL/MED
  • Did not include Postgres :(
• 9.3: Writeable FDWs
  • ...and did include Postgres :D
• 9.4: Supports triggers on foreign tables
• 9.5
  • IMPORT FOREIGN SCHEMA
  • Push Down API (WIP)
  • Inheritance children
Not Going Anywhere
• 9.6
  • Join Push Down
  • Aggregate API?
  • Parallelism?
  • "Hey we need some data from you, we will check back later"
So I was just waiting for a good problem to solve with FDWs
And then a couple of them came.
Some Background
Some Background
VenueBook is revolutionizing the way people think about event booking. Our platform lets venues and bookers plan together, creating a smarter and better-connected experience for all. We simplify planning, so you can have more fun!
Translation
• We have two main products:
  • A CRM platform that allows venue managers to control everything around an event.
  • A marketplace that allows event planners to source venues and book events.
Further Translation
There are a lot of moving pieces with our data.
So The Following Conversation Happened
Hey, can we build an API?
Sure, but I would want to run it as a separate application so that we can isolate the load from our primary database.
Okay, that makes sense.
Great. There is a feature in Postgres that makes it easy to talk between two separate Postgres databases, so it shouldn't be too difficult to build.
That sounds good. Let's do it!
There's one catch...
This could be a bit experimental...
I want to experiment with this thing called a "Foreign Data Wrapper", but it should make maintenance easier overall.
"OK"
Assumptions
• We are running PostgreSQL 9.4
• The schema I'm working with is slightly contrived for the purposes of demonstration
So, let's build something in our development environment
local:app jkatz$ createuser
Enter name of role to add: jkatz
Shall the new role be a superuser? (y/n) y
Yeah, of course I want superuser
# "local" is for Unix domain socket connections only
local all all trust
Yeah, of course I don't care about authentication settings.
(Pro-tip: "trust" skips authentication entirely; combined with a superuser role, privileges never get in the way)
local:app jkatz$ createdb app
Let's pretend this is how I created the main database.
CREATE TABLE venues (
    id serial PRIMARY KEY,
    name varchar(255) NOT NULL
);

CREATE TABLE events (
    id serial PRIMARY KEY,
    venue_id int REFERENCES venues (id),
    name text NOT NULL,
    total int NOT NULL DEFAULT 0,
    guests int NOT NULL,
    start_time timestamptz NOT NULL,
    end_time timestamptz NOT NULL,
    created_at timestamptz DEFAULT CURRENT_TIMESTAMP NOT NULL
);
And let's pretend this is how I created the schema for it.
And this magic function to check for availability.
CREATE FUNCTION get_availability(
    venue_id int,
    start_time timestamptz,
    end_time timestamptz
)
RETURNS bool
AS $$
    SELECT NOT EXISTS(
        SELECT 1
        FROM events
        WHERE
            events.venue_id = $1 AND
            ($2, $3) OVERLAPS (events.start_time, events.end_time)
        LIMIT 1
    );
$$ LANGUAGE SQL STABLE;
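The OVERLAPS test inside get_availability treats each period as a half-open interval (start <= time < end), so a booking that starts exactly when another one ends does not count as a conflict. A quick sketch with literal timestamps (the values are made up for illustration and are not from the slides):

```sql
-- Back-to-back periods: the first ends at 21:00, the second starts at 21:00.
-- OVERLAPS uses half-open intervals, so this yields false (no conflict).
SELECT (timestamptz '2015-10-28 18:00', timestamptz '2015-10-28 21:00')
       OVERLAPS
       (timestamptz '2015-10-28 21:00', timestamptz '2015-10-28 23:00') AS conflict;

-- Periods that share the hour from 20:00 to 21:00: this yields true.
SELECT (timestamptz '2015-10-28 18:00', timestamptz '2015-10-28 21:00')
       OVERLAPS
       (timestamptz '2015-10-28 20:00', timestamptz '2015-10-28 23:00') AS conflict;
```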
local:app jkatz$ createdb api
So let's make the API schema
CREATE SCHEMA api;
We are going to be a bit smarter about how we organize the code.
CREATE TABLE api.users (
    id serial PRIMARY KEY,
    key text UNIQUE NOT NULL,
    name text NOT NULL
);

CREATE TABLE api.venues (
    id serial PRIMARY KEY,
    remote_venue_id int NOT NULL
);

CREATE TABLE api.events (
    id serial PRIMARY KEY,
    user_id int REFERENCES api.users (id) NOT NULL,
    venue_id int REFERENCES api.venues (id) NOT NULL,
    remote_bid_id text,
    ip_address text,
    data json,
    created_at timestamptz DEFAULT CURRENT_TIMESTAMP NOT NULL
);
Our API schema
CREATE EXTENSION postgres_fdw;

CREATE SERVER app_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (dbname 'app');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER app_server;
Our setup to pull the information from the main application
CREATE SCHEMA app;

CREATE FOREIGN TABLE app.venues (
    id int,
    name text
)
SERVER app_server
OPTIONS (table_name 'venues');
We will isolate the foreign tables in their own schema
SELECT * FROM app.venues;
So that means this returns...
ERROR: relation "app.venues" does not exist
CONTEXT: Remote SQL command: SELECT id, name FROM app.venues
...what?
CREATE FOREIGN TABLE app.venues (
    id int,
    name text
)
SERVER app_server
OPTIONS (
    table_name 'venues',
    schema_name 'public'
);
By default, postgres_fdw looks for the remote table in a schema with the same name as the local foreign table's schema. If the schema names differ, you have to set schema_name explicitly.
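If the foreign table from the earlier slide already exists, re-running CREATE FOREIGN TABLE fails because the relation is already there. One way around that, sketched here under the assumption that app.venues was created without a schema_name option, is to add the option in place and then inspect the result:

```sql
-- Add the missing option to the existing foreign table instead of recreating it.
-- ADD fails if schema_name is already set; use SET in that case.
ALTER FOREIGN TABLE app.venues
    OPTIONS (ADD schema_name 'public');

-- Inspect the options currently attached to the foreign table.
SELECT foreign_table_schema, foreign_table_name, option_name, option_value
FROM information_schema.foreign_table_options
WHERE foreign_table_schema = 'app'
  AND foreign_table_name = 'venues';
```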
SELECT * FROM app.venues;
 id |     name
----+--------------
  1 | Venue A
  2 | Restaurant B
  3 | Bar C
  4 | Club D
CREATE FOREIGN TABLE app.events (
    id int,
    venue_id int,
    name text,
    total int,
    guests int,
    start_time timestamptz,
    end_time timestamptz
)
SERVER app_server
OPTIONS (
    table_name 'events',
    schema_name 'public'
);
Adding in our foreign table for events
INSERT INTO app.events (
    venue_id, name, total, guests, start_time, end_time
)
VALUES (
    1, 'Conference Party', 50000, 400, '2015-10-28 18:00', '2015-10-28 21:00'
)
RETURNING id;
ERROR: null value in column "id" violates not-null constraint
DETAIL: Failing row contains (null, 1, Conference Party, 50000, 400, 2015-10-28 22:00:00+00, 2015-10-29 01:00:00+00, 2015-10-27 22:19:10.555695+00).
CONTEXT: Remote SQL command: INSERT INTO public.events(id, venue_id, name, total, guests, start_time, end_time) VALUES ($1, $2, $3, $4, $5, $6, $7)
Huh. (The remote INSERT sends every column, id included, so the remote serial default never fires.)
Two Solutions.
Solution #1
CREATE FOREIGN TABLE app.events (
    id serial NOT NULL,
    venue_id int,
    name text,
    total int,
    guests int,
    start_time timestamptz,
    end_time timestamptz
)
SERVER app_server
OPTIONS (
    table_name 'events',
    schema_name 'public'
);
INSERT INTO app.events (
    venue_id, name, total, guests, start_time, end_time
)
VALUES (
    1, 'Conference Party', 50000, 400, '2015-10-28 18:00', '2015-10-28 21:00'
)
RETURNING id;
 id
----
  1
(1 row)
WARNING
• This is using a sequence on the local database
• The local sequence knows nothing about ids generated on the remote database, so primary keys can collide. If you need to avoid that, this is not the solution for you.
• We want to use the sequence-generating function on the foreign database
• But FDWs cannot access foreign functions
• However...
Solution #2
(on the "app" database)
CREATE SCHEMA api;

CREATE VIEW api.events_id_seq_view AS
    SELECT nextval('public.events_id_seq') AS id;
(on the "api" database)
CREATE FOREIGN TABLE app.events_id_seq_view (
    id int
)
SERVER app_server
OPTIONS (
    table_name 'events_id_seq_view',
    schema_name 'api'
);

CREATE FUNCTION app.events_id_seq_nextval()
RETURNS int
AS $$
    SELECT id FROM app.events_id_seq_view
$$ LANGUAGE SQL;

CREATE FOREIGN TABLE app.events (
    id int DEFAULT app.events_id_seq_nextval(),
    venue_id int,
    name text,
    total int,
    guests int,
    start_time timestamptz,
    end_time timestamptz
)
SERVER app_server
OPTIONS (
    table_name 'events',
    schema_name 'public'
);
INSERT INTO app.events (
    venue_id, name, total, guests, start_time, end_time
)
VALUES (
    1, 'Conference Party', 50000, 400, '2015-10-28 18:00', '2015-10-28 21:00'
)
RETURNING id;
 id
----
  4
(1 row)
Hey, can we check the availability on the api server before making an insert on the app server?
Sure, we have a function for that on "app" but... FDWs do not support foreign functions.
And we cannot use a view.
However...
dblink
• Written in 2001 by Joe Conway
• Designed to make remote PostgreSQL database calls
• The docs say:
  • See also postgres_fdw, which provides roughly the same functionality using a more modern and standards-compliant infrastructure.
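For orientation, a direct dblink call looks roughly like this. It is a sketch that assumes the dblink extension is installed on "api" and that the current role can connect to the "app" database built earlier. The column list in the AS clause is required because dblink returns a generic set of records.

```sql
-- Run a query on the "app" database and describe the shape of the result locally.
SELECT v.id, v.name
FROM dblink('dbname=app', 'SELECT id, name FROM venues')
    AS v(id int, name text);
```

Unlike a foreign table, the remote query text is arbitrary SQL, which is what makes remote function calls possible.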
(on the "api" database)
-- set up the extensions (if not already done)
CREATE EXTENSION IF NOT EXISTS plpgsql;
CREATE EXTENSION IF NOT EXISTS dblink;

-- create the wrapper function
CREATE FUNCTION app.get_availability(
    venue_id int,
    start_time timestamptz,
    end_time timestamptz
)
RETURNS bool
AS $get_availability$
DECLARE
    is_available bool;
    remote_sql text;
BEGIN
    -- %L quotes each argument as a SQL literal for the remote query
    remote_sql := format('SELECT get_availability(%L, %L, %L)', venue_id, start_time, end_time);
    SELECT availability.is_available INTO is_available
    FROM dblink('dbname=app', remote_sql) AS availability(is_available bool);
    RETURN is_available;
EXCEPTION
    -- any remote or connection error is reported as NULL (unknown)
    WHEN others THEN
        RETURN NULL::bool;
END;
$get_availability$ LANGUAGE plpgsql;
SELECT app.get_availability(1, '2015-10-28 18:00', '2015-10-28 20:00');
 get_availability
------------------
 f
(1 row)
SELECT app.get_availability(1, '2015-10-28 12:00', '2015-10-28 14:00');
 get_availability
------------------
 t
(1 row)
Works great!
Summary So Far...
• We created two separate databases with logical schemas
• We wrote some code using postgres_fdw and dblink that can
  • Read data from "app" to "api"
  • Insert data from "api" to the "app"
    • ...with the help of the sequence trick
  • Make a remote function call
Awesome! Let's Deploy
(And because we are good developers, we are going to test the deploy configuration in a staging environment, but we can all safely assume that, right? :-)
(Note: when I say "superuser" I mean a Postgres superuser)
Network Topology
• app01 (10.0.0.10): runs the "app" application
• api01 (10.0.0.20): runs the "api" application
• db01 (10.0.0.80): hosts the "app" and "api" databases
db01:postgresql postgres$ createdb -O app app
db01:postgresql postgres$ createdb -O app api
How we are setting things up
# TYPE  DATABASE  USER  ADDRESS       METHOD
# for the main user
host    app       app   10.0.0.10/32  md5
host    api       api   10.0.0.20/32  md5
# for foreign table access
local   api       app                 md5
local   app       api                 md5
pg_hba.conf setup
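The two local lines matter because app_server is defined with only dbname, so on a typical Unix setup postgres_fdw connects through the Unix domain socket on db01 rather than over TCP. As a hedged sketch, if the "app" database lived on a separate host, the server definition would carry host and port options and the matching pg_hba.conf entry would be a host line instead. The server name app_server_remote and the address 10.0.0.81 are hypothetical placeholders, not values from this setup:

```sql
-- Hypothetical: "app" lives on another machine, so connect over TCP.
CREATE SERVER app_server_remote
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host '10.0.0.81', port '5432', dbname 'app');

-- The remote pg_hba.conf would then need a host entry for the connecting
-- machine (db01 here) rather than a local one, for example:
--   host  app  api  10.0.0.80/32  md5
```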
CREATE EXTENSION postgres_fdw;
CREATE EXTENSION dblink;
So we already know to run these as a superuser on "api", right? ;-)
But if we log in as the "api" user and try to run this...
CREATE SERVER app_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (dbname 'app');
ERROR: permission denied for foreign server app_server
As a superuser, grant permission
GRANT USAGE ON FOREIGN DATA WRAPPER postgres_fdw TO api;
CREATE SERVER app_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (dbname 'app');

CREATE FOREIGN TABLE app.venues (
    id int,
    name text
)
SERVER app_server
OPTIONS (
    table_name 'venues',
    schema_name 'public'
);
Now this works! Let's run a query...
SELECT * FROM app.venues;
ERROR: user mapping not found for "api"
So we create the user mapping and...
CREATE USER MAPPING FOR api
    SERVER app_server
    OPTIONS (
        user 'api',
        password 'test'
    );
SELECT * FROM app.venues;
ERROR: permission denied for relation venues
CONTEXT: Remote SQL command: SELECT id, name FROM public.venues
You've got to be kidding me...
Go to "app" and as a superuser run this
GRANT SELECT ON venues TO api;
GRANT SELECT, INSERT, UPDATE ON events TO api;
Meanwhile, back on "api"
SELECT * FROM app.venues;
 id |     name
----+--------------
  1 | Venue A
  2 | Restaurant B
  3 | Bar C
  4 | Club D
Time to make the events work.
Get things started on the "app" database
CREATE SCHEMA api;

CREATE VIEW api.events_id_seq_view AS
    SELECT nextval('public.events_id_seq') AS id;
Back on the "api" database
-- set up the sequence functionality
CREATE FOREIGN TABLE app.events_id_seq_view (
    id int
)
SERVER app_server
OPTIONS (
    table_name 'events_id_seq_view',
    schema_name 'api'
);

CREATE FUNCTION app.events_id_seq_nextval()
RETURNS int
AS $$
    SELECT id FROM app.events_id_seq_view
$$ LANGUAGE SQL;
And when we test the sequence function...
SELECT app.events_id_seq_nextval();
ERROR: permission denied for schema api
CONTEXT: Remote SQL command: SELECT id FROM api.events_id_seq_view
SQL function "events_id_seq_nextval" statement 1
Here we go again...
On the "app" database
GRANT USAGE ON SCHEMA api TO api;
SELECT app.events_id_seq_nextval();
ERROR: permission denied for relation events_id_seq_view
CONTEXT: Remote SQL command: SELECT id FROM api.events_id_seq_view
SQL function "events_id_seq_nextval" statement 1
On "api" - ARGH...
On the "app" database
GRANT SELECT ON api.events_id_seq_view TO api;
SELECT app.events_id_seq_nextval();
ERROR: permission denied for sequence events_id_seq
CONTEXT: Remote SQL command: SELECT id FROM api.events_id_seq_view
SQL function "events_id_seq_nextval" statement 1
On "api" - STILL?!?!?!?!
On the "app" database
GRANT USAGE ON SEQUENCE events_id_seq TO api;
And on "api" - YES!
SELECT app.events_id_seq_nextval();
 events_id_seq_nextval
-----------------------
                     1
We can now create the foreign table and test the INSERT...
CREATE FOREIGN TABLE app.events (
    id int DEFAULT app.events_id_seq_nextval(),
    venue_id int,
    name text,
    total int,
    guests int,
    start_time timestamptz,
    end_time timestamptz
)
SERVER app_server
OPTIONS (
    table_name 'events',
    schema_name 'public'
);
INSERT INTO app.events (
    venue_id, name, total, guests, start_time, end_time
)
VALUES (
    1, 'Conference Party', 50000, 400, '2015-10-28 18:00', '2015-10-28 21:00'
)
RETURNING id;
 id
----
  2
Yup...we ran "GRANT SELECT, INSERT, UPDATE ON events TO api;" on "app" earlier!
And install our availability function...
CREATE FUNCTION app.get_availability(
    venue_id int,
    start_time timestamptz,
    end_time timestamptz
)
RETURNS bool
AS $get_availability$
DECLARE
    is_available bool;
    remote_sql text;
BEGIN
    remote_sql := format('SELECT get_availability(%L, %L, %L)', venue_id, start_time, end_time);
    SELECT availability.is_available INTO is_available
    FROM dblink('dbname=app user=api password=test', remote_sql) AS availability(is_available bool);
    RETURN is_available;
EXCEPTION
    WHEN others THEN
        RETURN NULL::bool;
END;
$get_availability$ LANGUAGE plpgsql;
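Hard-coding the password in the function body is easy to forget about later. The dblink documentation notes that the connection argument may also be the name of an existing foreign server, so one possible alternative is to reuse app_server and its user mapping. This is a sketch, not something the slides tested; it assumes the calling role has USAGE on app_server and a user mapping that carries the credentials.

```sql
-- Sketch: let dblink look up connection details from the foreign server and
-- user mapping, so the password lives in one place (the user mapping).
SELECT availability.is_available
FROM dblink(
    'app_server',
    format('SELECT get_availability(%L, %L, %L)',
           1,
           timestamptz '2015-10-28 18:00',
           timestamptz '2015-10-28 20:00')
) AS availability(is_available bool);
```

The same substitution could be made inside app.get_availability by replacing the connection string with 'app_server'.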
SELECT app.get_availability(1, '2015-10-28 18:00', '2015-10-28 20:00');
 get_availability
------------------
 f
(1 row)
SELECT app.get_availability(1, '2015-10-28 13:00', '2015-10-28 17:00');
 get_availability
------------------
 t
(1 row)
...and wow.
WE DID IT!!!
What did we learn?
We Learned That...
• PostgreSQL has a robust permission system
  • http://www.postgresql.org/docs/current/static/sql-grant.html
  • ...there is much more we could have done too.
• Double the databases, double the problems
• Always have a testing environment that can mimic your production environment
• ...when it all works, it is so sweet.
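Collected in one place, the grants discovered one error at a time during the deploy look like this. The GRANT statements are the ones from the slides; the privilege checks at the end are an added, optional sanity test using PostgreSQL's built-in privilege inquiry functions.

```sql
-- On "api", as a superuser: let the api role define servers with postgres_fdw.
GRANT USAGE ON FOREIGN DATA WRAPPER postgres_fdw TO api;

-- On "app", as a superuser: everything the api role touches through the FDW.
GRANT SELECT ON venues TO api;
GRANT SELECT, INSERT, UPDATE ON events TO api;
GRANT USAGE ON SCHEMA api TO api;
GRANT SELECT ON api.events_id_seq_view TO api;
GRANT USAGE ON SEQUENCE events_id_seq TO api;

-- Optional checks on "app"; each should return true once the grants are in place.
SELECT has_table_privilege('api', 'public.events', 'INSERT');
SELECT has_schema_privilege('api', 'api', 'USAGE');
SELECT has_sequence_privilege('api', 'public.events_id_seq', 'USAGE');
```

Keeping a script like this in version control is one way to make the staging and production setups repeatable.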
Conclusion
• Foreign data wrappers are incredible
• The postgres_fdw is incredible
  • ...and it is still a work in progress
• Make sure you understand its limitations
• Research what is required to properly install in production
Questions?
• @jkatz05