Introduction to PostgreSQL for System Administrators
-
Upload
jignesh-shah -
Category
Technology
-
view
2.182 -
download
4
Transcript of Introduction to PostgreSQL for System Administrators
Introduction to PostgreSQL for System Administrators
Jignesh ShahStaff Engineer, VMware Inc
PGEast March 2011 – NYC
© 2011 VMware Inc 2
About MySelf
Joined VMware in 2010 Database performance on vSphere
Previously at Sun Microsystems from 2000-2010 Database Performance on Solaris/Sun Systems
Work with PostgreSQL Performance Community Scaling, Bottlenecks using various workloads
My Blog: http://jkshah.blogspot.com
© 2011 VMware Inc 3
Content
Why OpenSource Databases? Why PostgreSQL? Quick Start to PostgreSQL PostgreSQL Internals PostgreSQL Monitoring
© 2011 VMware Inc 4
Why Open Source Databases?
License costs Need for “Just Enough” databases Source code access if vendor goes under or above Many available but currently two popular ones
MySQL – Great for simple, web-based workloads PostgreSQL – Great for complex, enterprise
transactional oriented workloads
© 2011 VMware Inc 5
Why PostgreSQL?
PostgreSQL is community driven Consists of Core team, Committers, Contributors,
Developers No one company owns it Great Mailing List Support Yearly releases Great strides in overall performance, replication in last
few years License is more flexible than GPL (OEM, Embedded
customers more welcome)
© 2011 VMware Inc 6
Quick Start Guide for PostgreSQL
Install PostgreSQL binaries or self build/install them
Set $PGDATA variable to a valid directory path $ initdb
$ pg_ctl start -l logfile
$ createdb dbname
$ psql dbname
© 2011 VMware Inc 7
Quick Start Guide for Remote Access
If you want to make the database access from remote servers
Add network/remote host entry in $PGDATA/pg_hba.conf Add listen='*' or specific IP address of NIC in
$PGDATA/postgresql.conf $ pg_ctl reload
$ psql -h hostname -u username -d dbname
© 2011 VMware Inc 8
Quick Start Guide for Performance Tuning
Modify following parameters in $PGDATA/postgresql.conf checkpoint_segments=16 shared_buffers=512MB wal_buffers=1MB
If disks are slow to write and willing to take few transaction loss without corrupting databases synchronous_commit=off
Separate pg_xlog on separate filesystem pg_ctl stop mv $PGDATA/pg_xlog /path/to/pg_xlog ln -s /path/to/pg_xlog $PGDATA/pg_xlog pg_ctl start -l logfile
© 2011 VMware Inc 9
Quick Start Guide for Client Connections
psql - SQL Interpreter client libpq – API access to write C applications to access the
database (used by PHP, Python, etc) JDBC – Standard JDBC for Java Programs to connect
with PostgreSQL databases Npgsql – Database access for .NET/Mono applications UnixODBC – Standard ODBC for applications to connect
with PostgreSQL
© 2011 VMware Inc 10
Quick Start Guide - PostgreSQL Administration
SQL Support: SELECT, INSERT, UPDATE, INSERT Utilities:
DDL Support: CREATE, ALTER, etc Backup: pg_dump, pg_restore Statistics: analyze Cleanup: vacuumdb (or Auto-vacuum) Misc: oid2name
© 2011 VMware Inc 11
PostgreSQL Internals – File System Layout
Files in $PGDATA postgresql.conf – PostgreSQL Configuration File pg_hba.conf - PostgreSQL Host Based Access Configuration pg_ident.conf - Mapping System names to PostgreSQL user
names PG_VERSION – Contains Version Information postmaster.pid – Contains PID of PostMaster, $PGDATA and
shared memory segment ids postmaster.opts – Contains the record for command line options
for the last stared database instance
© 2011 VMware Inc 12
PostgreSQL Internals – File System Layout
Directories in $PGDATA base – Contains per-database subdirectories pg_stat_tmp - Temporary directory for Statistics subsystem global - Database Cluster-wide tables (pg_control) pg_log – Server System logs (text or csv readable) pg_subtrans – Contains sub-transaction status data pg_xlog – WAL or Write Ahead Log directory pg_clog - Commit Status data pg_multixact - Contains multi-transaction status data pg_tblspc – Contains links to tablespace locations pg_notify - LISTEN/NOTIFY status data pg_twophase – Contains state files for prepared transactions
© 2011 VMware Inc 13
PostgreSQL Internals – Processes Layout
> pgrep -lf postgres6688 /usr/local/postgres/bin/postgres (POSTMASTER)6689 postgres: logger process 6691 postgres: writer process 6692 postgres: wal writer process 6693 postgres: autovacuum launcher process 6694 postgres: stats collector process13583 postgres: username dbname [local] idle in transaction
© 2011 VMware Inc 14
PostgreSQL Internals – Memory Layout
PostgreSQL Shared Memory
PostgreSQL Backend Process
PostgreSQL Backend Process
PostgreSQL Backend Process
© 2011 VMware Inc 15
PostgreSQL Internals – Shared Memory Layout
PROCProcArray
Auto VacuumBtree VacuumFree Space MapBackground Writer
LWLocksLock HashesPROCLOCKStatisticsSynchronized Scans
XLOG BuffersCLOG BuffersSubtrans BuffersTwo-phase StructsMulti-xact buffersShared InvalidationBuffer Descriptors Shared Buffers
© 2011 VMware Inc 16
PostgreSQL Internals – WAL Layout
WAL – Write Ahead Log pg_xlog contains 16MB segments All transactions are first logged here before they are
committed checkpoint_segments and checkpoint_timeout drives the
number of WAL/pg_xlog segments Can grow pretty big based on transaction load
© 2011 VMware Inc 17
PostgreSQL Internals – Checkpoint
Checkpoint does the following: Pushes dirty bufferpool pages to storage Syncs all filesystem cache Recycle WAL files Check for server messages indicating too-frequent
checkpoints
Checkpoint causes performance spikes (drops) while checkpoint is in progress
© 2011 VMware Inc 18
PostgreSQL Internals – MVCC
Default isolation level in PostgreSQL is READ-COMMITTED
Each Query only sees transactions completed before it started
On Query Start, PostgreSQL records: The transaction counter All transactions that are in progress
In a multi-statement transaction, a transaction's own previous queries are also visible
© 2011 VMware Inc 19
PostgreSQL Internals – MVCC
Visible Tuples must have xmin or creation xact id that: Is a committed transaction Is less than the transaction counter stored at query
start Was not in-process at query start
Visible tuples must also have an expiration transaction id (xmax) that: Is blank or aborted or Is greater than the transaction counter at query start
or Was in-process at query start
© 2011 VMware Inc 20
PostgreSQL Internals – VACUUM
VACUUM is basically MVCC Cleanup Get rid of deleted rows (or earlier versions of rows) Records free space in .fsm files Often used with Analyze to collect optimizer statistics Auto vacuum helps to do the vacuum task as and when
needed
© 2011 VMware Inc 21
PostgreSQL Internals – Lock Types
Access Share Lock – SELECT Row Share Lock - SELECT FOR UPDATE Row Exclusive Lock – INSERT, UPDATE, DELETE Share Lock – CREATE INDEX Share Row Exclusive Lock – EXCLUSIVE MODE but
allows ROW SHARE LOCK Exclusive Lock – Blocks even Row Share Lock Access Exclusive Lock – ALTER TABLE, DROP TABLE,
VACUUM and LOCK TABLE
© 2011 VMware Inc 22
PostgreSQL Internals – Query Execution
Parse Statement Separate Utilities from actual queries Rewrite Queries in a standard format Generate Paths Select Optimal Path Generate Plan Execute Plan
© 2011 VMware Inc 23
Monitoring PostgreSQL – Monitoring IO
Iostat -xcmt 5 (Helps if pg_xlog is on separate filesystem)
12/02/2010 01:09:56 AM
avg-cpu: %user %nice %system %iowait %steal %idle
8.21 0.00 2.40 0.77 0.00 88.62
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdd ($PGDATA) 0.00 46.67 0.00 3.67 0.00 0.20 110.55 0.01 1.45 0.73 0.27
sde (pg_xlog) 0.00 509.00 0.00 719.33 0.00 3.39 9.66 0.11 0.15 0.15 10.53
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
© 2011 VMware Inc 24
Monitoring PostgreSQL – Monitoring CPU/Mem
top -ctop - 01:06:27 up 20 days, 1:48, 1 user, load average: 1.41, 5.30, 2.95
Tasks: 157 total, 1 running, 156 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.0%us, 0.5%sy, 0.0%ni, 93.2%id, 0.6%wa, 0.0%hi, 0.7%si, 0.0%st
Mem: 16083M total, 5312M used, 10770M free, 213M buffers
Swap: 8189M total, 0M used, 8189M free, 4721M cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29725 pguser 20 0 2105m 55m 53m S 21 0.3 0:14.66 postgres: pguser sbtest 192.168.0.65(20121) idle in transaction
29726 pguser 20 0 2105m 55m 53m S 21 0.3 0:14.56 postgres: pguser sbtest 192.168.0.65(20122) idle in transaction
29727 pguser 20 0 2105m 55m 53m S 21 0.3 0:14.58 postgres: pguser sbtest 192.168.0.65(20123) idle in transaction
29728 pguser 20 0 2105m 55m 53m S 21 0.3 0:14.62 postgres: pguser sbtest 192.168.0.65(20124) idle in transaction
© 2011 VMware Inc 25
Monitoring PostgreSQL – Monitoring SQL
select * from pg_stat_activity;
datid | datname | procpid | usesysid | usename | application_name | client_addr | client_port | backend_start | xact_start | query_start | waiting | current_query
-------+---------+---------+----------+---------+------------------+--------------+-------------+-------------------------------
16384 | sbtest | 29601 | 10 | pguser | psql | | -1 | 2010-12-02 00:59:26.883118+00 | 2010-12-02 01:00:43.088853+00 | 2010-12-02 01:00:43.088853+00 | f | select * from pg_stat_activity;
16384 | sbtest | 29624 | 10 | pguser | | 192.168.0.65 | 50659 | 2010-12-02 01:00:27.406842+00 | 2010-12-02 01:00:43.087634+00 | 2010-12-02 01:00:43.089749+00 | f | SELECT c from sbtest where id=$1
16384 | sbtest | 29626 | 10 | pguser | | 192.168.0.65 | 50661 | 2010-12-02 01:00:27.439644+00 | 2010-12-02 01:00:43.060565+00 | 2010-12-02 01:00:43.087203+00 | f | <IDLE> in transaction
16384 | sbtest | 29627 | 10 | pguser | | 192.168.0.65 | 50662 | 2010-12-02 01:00:27.458667+00 | 2010-12-02 01:00:43.082811+00 | 2010-12-02 01:00:43.089694+00 | f | SELECT SUM(K) from sbtest where id between $1 and $2
© 2011 VMware Inc 26
Monitoring PostgreSQL – EXPLAIN ANALYZE
Explain Analyze select count(*) from sbtest where k > 1; QUERY PLAN
----------------------------------------------------------
Aggregate (cost=380.48..380.49 rows=1 width=0) (actual time=6.929..6.929 rows=1 loops=1)
-> Index Scan using k on sbtest (cost=0.00..374.58 rows=2356 width=0) (actual time=0.023..6.364 rows=3768 loops=1)
Index Cond: (k > 1)
Total runtime: 6.977 ms
(4 rows)
© 2011 VMware Inc 27
Monitoring PostgreSQL – Monitoring DB Statistics
select * from pg_stat_database;
[ RECORD 4 ]-+-----------
datid | 16384
datname | sbtest
numbackends | 1
xact_commit | 136978505
xact_rollback | 95
blks_read | 99161
blks_hit | 1481038913
tup_returned | 1138850495
tup_fetched | 986059963
tup_inserted | 3150677
tup_updated | 6145651
tup_deleted | 2050612
© 2011 VMware Inc 28
Monitoring PostgreSQL – Monitoring Table Statistics
select * from pg_stat_user_tables; -[ RECORD 1 ]----+------------------------------
relid | 16405
schemaname | public
relname | sbtest
seq_scan | 2
seq_tup_read | 0
idx_scan | 2001731
idx_tup_fetch | 46158444
n_tup_ins | 211207
n_tup_upd | 333527
n_tup_del | 111207
n_tup_hot_upd | 108633
n_live_tup | 100000
n_dead_tup | 0
last_vacuum |
last_autovacuum | 2010-12-02 01:38:44.821662+00
last_analyze |
last_autoanalyze | 2010-12-02 01:38:47.598785+00
© 2011 VMware Inc 29
Monitoring PostgreSQL – Monitoring IO Statistics
select * from pg_statio_user_tables; -[ RECORD 1 ]---+---------
relid | 16405
schemaname | public
relname | sbtest
heap_blks_read | 4588
heap_blks_hit | 29358125
idx_blks_read | 1919
idx_blks_hit | 6551658
toast_blks_read |
toast_blks_hit |
tidx_blks_read |
tidx_blks_hit |
© 2011 VMware Inc 30
Monitoring PostgreSQL – Monitoring IO Statistics
select * from pg_statio_user_indexes; -[ RECORD 1 ]-+------------
relid | 16405
indexrelid | 16412
schemaname | public
relname | sbtest
indexrelname | sbtest_pkey
idx_blks_read | 391
idx_blks_hit | 5210450
-[ RECORD 2 ]-+------------
relid | 16405
indexrelid | 16414
schemaname | public
relname | sbtest
indexrelname | k
idx_blks_read | 1528
idx_blks_hit | 1341208
© 2011 VMware Inc 31
Questions / More Information
Email: [email protected] Learn more about PostgreSQL
http://www.postgresql.org Blog: http://jkshah.blogspot.com