Scaling With SkyTools
Transcript of Scaling With SkyTools
-
8/4/2019 Scaling With SkyTools
1/74
Scaling with SkyTools & More
Scaling-Out Postgres with Skype's Open-Source Toolset
Gavin M. Roy
September 14th, 2011
-
About Me
PostgreSQL ~ 6.5
CTO @myYearbook.com
Scaled initial infrastructure
Not as involved in day-to-day database operations and development
Twitter: @Crad
-
Scaling?
-
Concurrency
[Chart: requests per second, hourly breakdown from 6am through 6am the next day]
-
Increasing Size-On-Disk
[Chart: size in GB by month, Jan through Dec]
-
Scaling and PostgreSQL Behavior
-
Size on Disk
-
Tuples, Indexes, Overhead
-
Table Size
+ Size of all combined Indexes
[Diagram labels: Relations, Indexes]
-
Constraints
Available Memory
Disk Speed
IO Bus Speed
-
Keep it in memory.
-
Get Fast Disks & I/O.
-
Process Forking + Locks
-
Client Connections
-
One Connection per Concurrent Request
-
Apache+PHP
One connection per backend for each pg_connect
-
Python
One connection per connection*
-
ODBC
One connection to Postgres per ODBC connection
-
Lock Contention?
Each backend for a connected client has to check for locks
[Diagram: Master Process, Stats Collector, Autovacuum, WAL Writer, and one Connection Backend with its Client Connection]
-
New Client Connection?
Access Share
Access Exclusive
Exclusive
Share
Share Row Exclusive
Share Update Exclusive
Row Share
Row Exclusive
[Diagram: Master Process, Stats Collector, Autovacuum, WAL Writer, and two Connection Backends, each with its Client Connection]
-
Too many connections?
Slow performance
[Diagram: Master Process, Stats Collector, Autovacuum, WAL Writer, and a growing row of Connection Backends, each with its Client Connection]
-
250 Apache Backends
x 1 Connection per Backend
x 250 Servers
= 62,500 Connections
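The multiplication on this slide can be written out as a small sketch. The function and parameter names are illustrative only and do not come from the talk; the three numbers are the ones on the slide.

```python
def total_connections(backends_per_server: int,
                      connections_per_backend: int,
                      servers: int) -> int:
    """Number of simultaneous Postgres connections the web tier can open
    when every Apache backend holds its own connection(s)."""
    return backends_per_server * connections_per_backend * servers


if __name__ == "__main__":
    # Slide numbers: 250 backends x 1 connection x 250 servers
    print(total_connections(250, 1, 250))  # 62500
```

Without pooling, every one of those connections is its own Postgres backend process, which is the problem the following slides set out to solve.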
-
Solvable Problems!
-
The Trailblazers
-
Solving Concurrency
-
pgBouncer
-
Session Pooling
-
Transactional Pooling
-
Statement Pooling
-
Connection Pooling
[Diagram: three groups of Clients, a central pgBouncer, and Postgres Server #1, #2, and #3; connection counts on the links are labeled "Tens" and "Hundreds"]
-
Add Local Pooling
[Diagram: three groups of Clients, each with a Local pgBouncer, a central pgBouncer, and Postgres Server #1, #2, and #3; connection counts on the links are labeled "Tens", "Hundreds", and "Tens"]
Tens TensTens
-
Easy to run
Usage: pgbouncer [OPTION]... config.ini
  -d, --daemon     Run in background (as a daemon)
  -R, --restart    Do a online restart
  -q, --quiet      Run quietly
  -v, --verbose    Increase verbosity
  -u, --user=      Assume identity of
  -V, --version    Show version
  -h, --help       Show this help screen and exit
-
userlist.txt
username password
foo bar
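On disk, pgBouncer expects each userlist.txt entry as two double-quoted strings on one line. The sketch below formats entries that way. The helper name is hypothetical, and the optional MD5 form ("md5" followed by the MD5 of password plus username) only matters when auth_type is md5, which is an assumption beyond what the slide shows.

```python
import hashlib


def userlist_entry(username: str, password: str, use_md5: bool = False) -> str:
    """Format one userlist.txt line: "username" "password"."""
    if use_md5:
        # PostgreSQL-style MD5 secret: "md5" + md5(password + username)
        digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
        password = "md5" + digest
    return f'"{username}" "{password}"'


if __name__ == "__main__":
    # The slide's example user "foo" with password "bar"
    print(userlist_entry("foo", "bar"))
    print(userlist_entry("foo", "bar", use_md5=True))
```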
-
pgbouncer.ini
-
Specifying Connections
[databases]
; foodb over unix socket
foodb =
; redirect bardb to bazdb on localhost
bardb = host=localhost dbname=bazdb
; access to dest database will go with single user
forcedb = host=127.0.0.1 port=300 user=baz password=foo client_encoding=UNICODE datestyle=ISO connect_query='SELECT 1'
-
Base Daemon Config
[pgbouncer]
logfile = pgbouncer.log
pidfile = pgbouncer.pid
; ip address or * which means all ip-s
listen_addr = 127.0.0.1
listen_port = 6432
; unix socket is also used for -R.
;unix_socket_dir = /tmp
-
Authentication
; any, trust, plain, crypt, md5
auth_type = trust
#auth_file = 8.0/main/global/pg_auth
auth_file = etc/userlist.txt
admin_users = user2, someadmin, otheradmin
stats_users = stats, root
-
Stats Users?
SHOW HELP|CONFIG|DATABASES|POOLS|CLIENTS|SERVERS|VERSION
SHOW FDS|SOCKETS|ACTIVE_SOCKETS|LISTS|MEM
pgbouncer=# SHOW CLIENTS;
 type | user  | database  | state  | addr      | port  | local_addr | local_port | connect_time
------+-------+-----------+--------+-----------+-------+------------+------------+---------------------
 C    | stats | pgbouncer | active | 127.0.0.1 | 47229 | 127.0.0.1  | 6000       | 2011-09-13 17:55:46
* Truncated columns for display purposes
-
psql 9.0+ Problem?
psql -U stats -p 6432 pgbouncer
psql: ERROR: Unknown startup parameter
Add to pgbouncer.ini:
ignore_startup_parameters = application_name
-
Pooling Behavior
pool_mode = statement
server_check_query = select 1
server_check_delay = 10
max_client_conn = 1000
default_pool_size = 20
server_connect_timeout = 15
server_lifetime = 1200
server_idle_timeout = 60
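To see what these limits mean together, the sketch below reads the two sizing settings with Python's standard configparser and reports how many client connections can share one server connection. The fragment repeats only values from the slide; the function name is illustrative. It assumes a single pool: default_pool_size applies per user/database pair, so a setup with several pools can hold more server connections than this shows.

```python
import configparser

# Sizing settings copied from the slide; the [pgbouncer] header is the
# section these settings live in inside pgbouncer.ini.
POOLING_FRAGMENT = """
[pgbouncer]
pool_mode = statement
max_client_conn = 1000
default_pool_size = 20
"""


def clients_per_server_connection(ini_text: str) -> float:
    """Ratio of allowed client connections to server connections in one pool."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["pgbouncer"]
    return section.getint("max_client_conn") / section.getint("default_pool_size")


if __name__ == "__main__":
    print(clients_per_server_connection(POOLING_FRAGMENT))  # 50.0
```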
-
Skytools
-
Scale-Out Reads
[Diagram: Clients, Load Balancer, pgBouncer, Canonical Database, and four Read Only Copies]
-
PGQ
-
The Ticker
-
ticker.ini
[pgqadm]
job_name = pgopen_ticker
db = dbname=pgopen
# how often to run maintenance [seconds]
maint_delay = 600
# how often to check for activity [seconds]
loop_delay = 0.1
logfile = ~/Source/pgopen_skytools/%(job_name)s.log
pidfile = ~/Source/pgopen_skytools/%(job_name)s.pid
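The %(job_name)s placeholder in logfile and pidfile follows the interpolation syntax of Python's standard configparser. As a sketch, assuming Skytools resolves these values the same way, the snippet below shows what the two paths expand to. The function name is illustrative, and expanding "~" with os.path.expanduser is an assumption about how the home directory gets resolved.

```python
import configparser
import os

# Same settings as the slide's ticker.ini, without the comment lines.
TICKER_INI = """
[pgqadm]
job_name = pgopen_ticker
db = dbname=pgopen
maint_delay = 600
loop_delay = 0.1
logfile = ~/Source/pgopen_skytools/%(job_name)s.log
pidfile = ~/Source/pgopen_skytools/%(job_name)s.pid
"""


def resolved_paths(ini_text: str) -> dict:
    """Return logfile and pidfile with %(job_name)s and ~ expanded."""
    parser = configparser.ConfigParser()  # BasicInterpolation is the default
    parser.read_string(ini_text)
    section = parser["pgqadm"]
    return {key: os.path.expanduser(section[key]) for key in ("logfile", "pidfile")}


if __name__ == "__main__":
    for name, path in resolved_paths(TICKER_INI).items():
        print(name, "=", path)
```

Because the file names derive from job_name, each ticker or replication job gets its own log and pid file without repeating the name by hand.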
-
Getting PGQ Running
Set up our ticker:
pgqadm.py ticker.ini install
Run the ticker daemon:
pgqadm.py ticker.ini ticker -d
-
Londiste
-
replication.ini
[londiste]
job_name = pgopen_to_destination
provider_db = dbname=pgopen
subscriber_db = dbname=destination
# it will be used as sql ident so no dots/spaces
pgq_queue_name = pgopen
logfile = ~/Source/pgopen_skytools/%(job_name)s.log
pidfile = ~/Source/pgopen_skytools/%(job_name)s.pid
-
Install Londiste
londiste.py replication.ini provider install
londiste.py replication.ini subscriber install
-
Start Replication Daemon
londiste.py replication.ini replay -d
-
DDL?
-
Add the Provider Tables and Sequences
londiste.py replication.ini provider add public.auth_user
-
Add the Subscriber Tables and Sequences
londiste.py replication.ini subscriber add public.auth_user
-
Great Success!
-
PL/Proxy
-
Scale-Out Reads & Writes
[Diagram: plProxy Server in front of A-F Server, G-L Server, M-R Server, and S-Z Server]
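The alphabetic split in this diagram can be sketched as a lookup on the first letter of a key. This illustrates range-based routing in general and is not how PL/Proxy is configured (the following slides route with a hash instead). The server labels come from the slide; the function and variable names are hypothetical.

```python
import bisect

# Last letter (inclusive) of each range, in order, with the slide's labels.
RANGE_ENDS = ["f", "l", "r", "z"]
SERVERS = ["A-F Server", "G-L Server", "M-R Server", "S-Z Server"]


def server_for(username: str) -> str:
    """Pick the server whose letter range contains the first character."""
    if not username:
        raise ValueError("username must not be empty")
    first = username[0].lower()
    if not "a" <= first <= "z":
        raise ValueError(f"no range covers {username!r}")
    # bisect_left finds the first range whose end letter is >= first
    return SERVERS[bisect.bisect_left(RANGE_ENDS, first)]


if __name__ == "__main__":
    for name in ("alice", "Gavin", "roy", "zed"):
        print(name, "->", server_for(name))
```

Range splits like this are easy to reason about but can load servers unevenly when names cluster on a few letters, which is one reason to hash the key instead.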
-
How does it work?
-
Simple Remote Connection
CREATE FUNCTION get_user_email(username text)
RETURNS SETOF text AS $$
    CONNECT 'dbname=remotedb';
    SELECT email FROM users WHERE username = $1;
$$ LANGUAGE plproxy;
-
Sharded Request
CREATE FUNCTION get_user_email(username text)
RETURNS SETOF text AS $$
    CLUSTER 'usercluster';
    RUN ON hashtext(username);
$$ LANGUAGE plproxy;
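RUN ON takes the integer returned by the hash function and masks it with the partition count minus one, which is why PL/Proxy requires the number of partitions to be a power of two. The sketch below mimics that selection step. zlib.crc32 is only a stand-in for PostgreSQL's hashtext(), so the partition it picks for a given name will not match a real cluster, and the function name is hypothetical. The two connection strings are the ones used on the get_cluster_partitions slide.

```python
import zlib
from typing import Sequence

PARTITIONS = [
    "dbname=part00 host=127.0.0.1",
    "dbname=part01 host=127.0.0.1",
]


def pick_partition(key: str, partitions: Sequence[str]) -> str:
    """Choose a partition by masking a hash of the key."""
    count = len(partitions)
    if count == 0 or count & (count - 1):
        raise ValueError("partition count must be a power of two")
    # Stand-in hash: NOT the same values as PostgreSQL's hashtext()
    hashed = zlib.crc32(key.encode("utf-8"))
    return partitions[hashed & (count - 1)]


if __name__ == "__main__":
    for username in ("alice", "bob", "carol"):
        print(username, "->", pick_partition(username, PARTITIONS))
```

The same key always lands on the same partition, so reads and writes for one user go to one server.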
-
Sharding Setup
Need 3 Functions:
plproxy.get_cluster_partitions(cluster_name text)
plproxy.get_cluster_version(cluster_name text)
plproxy.get_cluster_config(in cluster_name text, out key text, out val text)
-
get_cluster_partitions
CREATE OR REPLACE FUNCTION plproxy.get_cluster_partitions(cluster_name text)
RETURNS SETOF text AS $$
BEGIN
    IF cluster_name = 'usercluster' THEN
        RETURN NEXT 'dbname=part00 host=127.0.0.1';
        RETURN NEXT 'dbname=part01 host=127.0.0.1';
        RETURN;
    END IF;
    RAISE EXCEPTION 'Unknown cluster';
END;
$$ LANGUAGE plpgsql;
-
get_cluster_version
CREATE OR REPLACE FUNCTION plproxy.get_cluster_version(cluster_name text)
RETURNS int4 AS $$
BEGIN
    IF cluster_name = 'usercluster' THEN
        RETURN 1;
    END IF;
    RAISE EXCEPTION 'Unknown cluster';
END;
$$ LANGUAGE plpgsql;
-
get_cluster_config
CREATE OR REPLACE FUNCTION plproxy.get_cluster_config(
    in cluster_name text,
    out key text,
    out val text)
RETURNS SETOF record AS $$
BEGIN
    -- lets use same config for all clusters
    key := 'connection_lifetime';
    val := 30*60; -- 30m
    RETURN NEXT;
    RETURN;
END;
$$ LANGUAGE plpgsql;
-
get_cluster_config values
connection_lifetime
query_timeout
disable_binary
keepalive_idle
keepalive_interval
keepalive_count
-
SQL/MED
-
SQL/MED Cluster Definition
CREATE SERVER a_cluster FOREIGN DATA WRAPPER plproxy
OPTIONS (
    connection_lifetime '1800',
    disable_binary '1',
    p0 'dbname=part00 host=127.0.0.1',
    p1 'dbname=part01 host=127.0.0.1',
    p2 'dbname=part02 host=127.0.0.1',
    p3 'dbname=part03 host=127.0.0.1'
);
-
PL/Proxy + SQL/MED Behavior
PL/Proxy will prefer SQL/MED cluster definitions over the plproxy.get_* functions
PL/Proxy will fall back to plproxy.get_* functions if there are no SQL/MED clusters
-
SQL/MED User Mapping
CREATE USER MAPPING FOR bob
    SERVER a_cluster
    OPTIONS (user 'bob', password 'secret');
CREATE USER MAPPING FOR public
    SERVER a_cluster
    OPTIONS (user 'plproxy', password 'foo');
-
plproxyrc
https://github.com/myYearbook/plproxyrc
PL/pgSQL-based API for table-based management of PL/Proxy
Used to manage complicated PL/Proxy infrastructure @myYearbook
BSD Licensed
-
Server-to-Server
[Diagram: Postgres Server #1, Postgres Server #2, Postgres Server #3, and pgBouncer]
-
Complex PL/Proxy and pgBouncer Environment
[Diagram: Clients with Local pgBouncers, Load Balancers, plProxy Servers, pgBouncers, and Postgres Servers]
-
Other Tools and Methods?
-
Questions?