Post on 30-May-2015
Copyright(c)2013 NTT Corp. All Rights Reserved.
Introduction of pg_statsinfo and pg_stats_reporter
~ Statistics Reporting Tool for DBAs ~
NTT Open Source Software Center, Mitsumasa KONDO

About Me
• Official company name
  • Nippon Telegraph and Telephone Corporation
• Affiliation
  • Service Innovation Laboratory, Software Innovation Center (Researcher)
• My work
  • Middleware development for PostgreSQL
    • pg_statsinfo, pg_stats_reporter
    • High-availability PostgreSQL cluster using replication with Pacemaker
  • PostgreSQL community development
    • Improving disk I/O bottlenecks
• Past work
  • Data mining, natural language processing, machine learning, recommendation, information retrieval
  • I am still better at these than at databases :)
• Hobbies
  • Photography
  • Pure audio

Software Introduced Today
• pg_statsinfo
  • Monitors and collects PostgreSQL statistics and activities
• pg_stats_reporter
  • Visualizes the PostgreSQL statistics and activities collected by pg_statsinfo
[Figure: pg_statsinfo on DB servers A, B, and C sends database statistics and activity to the pg_statsinfo repository database, which stores them; pg_stats_reporter creates reports from the repository. A sample report created by pg_stats_reporter is shown.]

Contents
• pg_statsinfo ~ Monitor and Collect DB Statistics and Activities ~
  • What is pg_statsinfo?
  • Feature introduction
  • Demo
• pg_stats_reporter ~ Visualize DB Statistics and Activities ~
  • What is pg_stats_reporter?
  • Feature introduction
  • Demo
• Visualizing the DBT-2 Benchmark using pg_statsinfo and pg_stats_reporter
  • Introduction of DBT-2
  • DBT-2 visualized by pg_stats_reporter
  • For more performance

What is pg_statsinfo?
• Monitors and collects PostgreSQL statistics and activities
  • Collected statistics and activities
    • All tables in the pg_catalog schema
    • pg_log information
    • OS resources
• Other features
  • Report creation from the command line
  • Alert and monitoring function
  • Log management function
  • Automatic repository database management
• Other related information
  • BSD License
  • Latest version is 2.5.0
    • http://pgfoundry.org/frs/?group_id=1000422
  • Works on PostgreSQL 9.3!
  • The online manual is here
    • http://pgstatsinfo.projects.pgfoundry.org/pg_statsinfo-ja.html
[Figure: collected database statistics]

Architecture of pg_statsinfo
• Programming language
  • C
• Startup and pre-configuration
  • pg_statsinfo is started via shared_preload_libraries
  • Add the pg_statsinfo settings to postgresql.conf, and it starts together with PostgreSQL as usual
• System configuration
  • Install pg_statsinfo on the monitored instance
    • It does not need to be installed on the repository database instance
    • The monitored instance and the repository database can be the same instance
[Figure: on the monitored instance, pg_statsinfod collects statistics from pg_catalog, OS resources, and pg_log, and sends them as a snapshot to the repository database]

Features of pg_statsinfo 1/5
• Collects statistics and activities in PostgreSQL
  • All information gathered by PostgreSQL's statistics collector (e.g. pg_catalog)
    • For details of the statistics collector, please see the PostgreSQL documentation :)
    • http://www.postgresql.jp/document/9.2/html/monitoring-stats.html
  • Statistics are taken as snapshots at regular intervals
    • Default is every 10 minutes
• Analyzes pg_log and extracts activities from the logs
  • Gets activities that are output only to pg_log
    • Checkpoint activities
    • VACUUM activities
• Gets OS resource information from /proc
  • Samples every 5 seconds; when a snapshot is taken, the average of the sampled values is inserted
    • CPU usage (idle, iowait, system, user, load average)
    • Memory usage (memfree, buffers, cached, swap, dirty)
    • Disk usage (I/O size, I/O time, disk space used)
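The sampling scheme for OS resources (sample every 5 seconds, store the average when a snapshot is taken) can be illustrated with a small Python sketch. This is not pg_statsinfo's actual C implementation; the function name `average_samples` and the sample values are hypothetical and only show the averaging step.

```python
from statistics import mean


def average_samples(samples):
    """Average each metric over the samples taken since the last snapshot."""
    if not samples:
        return {}
    keys = samples[0].keys()
    return {key: mean(sample[key] for sample in samples) for key in keys}


# Three hypothetical 5-second CPU samples (percent of time in each state).
cpu_samples = [
    {"user": 10.0, "system": 5.0, "iowait": 60.0, "idle": 25.0},
    {"user": 20.0, "system": 5.0, "iowait": 50.0, "idle": 25.0},
    {"user": 30.0, "system": 5.0, "iowait": 40.0, "idle": 25.0},
]

# The single row that would be stored with the snapshot.
snapshot_row = average_samples(cpu_samples)
```

With the default settings quoted above (5-second sampling, 10-minute snapshots), each snapshot would average about 120 samples.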

Features of pg_statsinfo 2/5
• Creates reports on the command line
  • Outputs a text-format report on the command line
    • Example: for a database administrator or SQL engineer who wants to look at database statistics
  • Covers almost all report items produced by pg_stats_reporter
Command example: create a report for all monitored instances from 2013-10-01 until now
$ pg_statsinfo -U postgres -B 2013-10-01 -r ALL | less

Features of pg_statsinfo 3/5
• Automatic repository database maintenance
  • Automatically deletes old statistics stored in the repository database
    • pg_statsinfo stores its data in per-day partitions
    • So old data can be deleted with TRUNCATE
    • Deletion is therefore fast and cheap
  • Note
    • When several monitored instances share a repository, the shortest configured retention period takes priority
  • The default retention period is 1 week
[Figure: pg_statsinfo on DB server A (retention period: 1 week) and on DB server B (retention period: 2 weeks) collect database statistics and send them to the same repository database, which stores them]
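The retention rule can be sketched in Python as follows. This is an illustration only, not pg_statsinfo's code: the function names are hypothetical, and whether the boundary day itself is kept or dropped is an assumption made here for the example.

```python
from datetime import date, timedelta


def effective_retention_days(configured_days):
    """The shortest retention period among all monitored instances wins."""
    return min(configured_days)


def partitions_to_truncate(partition_dates, today, retention_days):
    """Return the daily partitions older than the retention window.

    Assumption: a partition is expired when its date is strictly before
    today minus the retention period.
    """
    cutoff = today - timedelta(days=retention_days)
    return sorted(d for d in partition_dates if d < cutoff)


# DB server A keeps 1 week, DB server B keeps 2 weeks (as in the figure).
retention = effective_retention_days([7, 14])

today = date(2013, 10, 15)
partitions = [today - timedelta(days=n) for n in range(10)]
expired = partitions_to_truncate(partitions, today, retention)
```

Because each expired day is a whole partition, it can be emptied with one TRUNCATE instead of a row-by-row DELETE.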

Features of pg_statsinfo 4/5
• Log management
  • Makes PostgreSQL's logs easy to manage
  • Log filtering
    • A log level can be set in pg_statsinfo, so two log levels can be in effect at once
      • Example: PostgreSQL logs at a lower (more verbose) level to keep detailed information, while pg_statsinfo logs at a higher level that is easier to read day to day
    • The log file name can be fixed (e.g. postgresql.log), which is convenient for log monitoring software
  • Multiple log outputs
    • Can output to both syslog and pg_log
  • Log level rewriting
    • The log level of particular log messages can be changed
      • Example: change the level of particular messages from INFO to LOG
  • Log compression and management
    • Old logs are compressed and managed automatically
[Figure: flow of extracting statistics from pg_log. pg_statsinfod reads pg_log (CSV format), formats the log, and writes the log file produced by pg_statsinfo (postgresql.log)]
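The idea of a second, stricter log level plus per-message level rewriting can be sketched in Python. This is only an illustration of the concept: `filter_log`, the `relabel` mapping, and the sample messages are hypothetical and do not reflect pg_statsinfo's configuration syntax. The severity order is the one PostgreSQL documents for log_min_messages.

```python
# Severity order documented for PostgreSQL's log_min_messages,
# from least to most severe.
SEVERITY_ORDER = [
    "DEBUG5", "DEBUG4", "DEBUG3", "DEBUG2", "DEBUG1",
    "INFO", "NOTICE", "WARNING", "ERROR", "LOG", "FATAL", "PANIC",
]
RANK = {name: index for index, name in enumerate(SEVERITY_ORDER)}


def filter_log(entries, min_level, relabel=None):
    """Keep entries at or above min_level, after optional level rewriting.

    entries: list of (level, message) tuples from the detailed log.
    relabel: hypothetical mapping of message text to a replacement level.
    """
    relabel = relabel or {}
    kept = []
    for level, message in entries:
        level = relabel.get(message, level)
        if RANK[level] >= RANK[min_level]:
            kept.append((level, message))
    return kept


detailed_log = [
    ("DEBUG1", "plan cache hit"),
    ("INFO", "vacuum finished"),
    ("WARNING", "disk nearly full"),
    ("ERROR", "deadlock detected"),
]

# Daily-reading view: only WARNING and above.
daily_view = filter_log(detailed_log, "WARNING")

# Same view, but one INFO message is promoted to LOG so it survives.
promoted_view = filter_log(
    detailed_log, "WARNING", relabel={"vacuum finished": "LOG"}
)
```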

Features of pg_statsinfo 5/5
• Alert and monitoring function (trigger function)
  • Outputs an alert log entry when a database metric exceeds its alert threshold
    • Usage: watch for the alert log entries with monitoring software
  • The alert function runs at every snapshot
  • The defaults are listed below; set values appropriate for your server
    • Settings are changed with an UPDATE statement on the statsrepo.alert table
Alert configuration table
Column name            Default  Explanation
instid                 -        Target instance ID
rollback_tps           100      Number of rollbacks per second
commit_tps             1000     Number of commits per second
garbage_size           20000    Size of garbage records in the table (%)
garbage_percent        30       Percentage of garbage records in the database (%)
garbage_percent_table  30       Percentage of garbage records in the table (%)
response_avg           10       Average query response time (sec)
response_worst         60       Worst query response time (sec)
enable_alert           true     Enable the alert function
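Since the thresholds are changed with an UPDATE on statsrepo.alert, a small Python helper can build such a statement safely. This is a sketch under assumptions: the helper name `build_alert_update` is hypothetical, the `%s` placeholders assume a DB-API driver with that parameter style (for example psycopg2), and the column names are the ones listed in the table above.

```python
# Column names taken from the alert configuration table on the slide.
ALERT_COLUMNS = {
    "rollback_tps", "commit_tps", "garbage_size", "garbage_percent",
    "garbage_percent_table", "response_avg", "response_worst",
    "enable_alert",
}


def build_alert_update(instid, **thresholds):
    """Build a parameterized UPDATE for one monitored instance.

    Returns (sql, params); nothing is executed here.
    """
    if not thresholds:
        raise ValueError("no thresholds given")
    unknown = set(thresholds) - ALERT_COLUMNS
    if unknown:
        raise ValueError(f"unknown alert columns: {sorted(unknown)}")
    columns = sorted(thresholds)
    assignments = ", ".join(f"{column} = %s" for column in columns)
    sql = f"UPDATE statsrepo.alert SET {assignments} WHERE instid = %s"
    params = [thresholds[column] for column in columns] + [instid]
    return sql, params


# Example values are illustrative, not recommendations.
sql, params = build_alert_update(1, rollback_tps=200, response_worst=30)
```

The resulting pair can be passed to a cursor's `execute(sql, params)` on a connection to the repository database.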

How to install pg_statsinfo?
1. Install the RPM file
$ su
# rpm -ivh pg_statsinfo-2.50-1.pg93.rhel6.x86_64.rpm
2. Add the configuration to postgresql.conf
# minimum configuration
shared_preload_libraries = 'pg_statsinfo'          # preload library setting
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'    # log file name setting (required)
3. Start PostgreSQL as usual
$ pg_ctl -D data start
4. If you see the following log messages, the installation succeeded!
server starting
LOG: loaded library "pg_statsinfo"
LOG: pg_statsinfo launcher started
LOG: start
LOG: installing schema: statsinfo
LOG: installing schema: statsrepo_partition
The installation procedure is described in the web manual! :)
http://pgstatsinfo.projects.pgfoundry.org/pg_statsinfo-ja.html#install

Demo of pg_statsinfo
1. Install
2. Confirm the installation
3. Collect database statistics and activities (snapshot)
4. Create a report

Tips for pg_statsinfo
• One snapshot is 300 kB to 800 kB
  • Be careful not to fill the disk with snapshots!
• Installing the software causes almost no performance degradation
  • But there is a little: in the DBT-2 benchmark we measured a 2% degradation
• To use a separate repository server, set "pg_statsinfo.repository_server" in postgresql.conf
  • The default setting is 'host=localhost port=5432'
  • If the repository database requires a password, set it in /var/lib/pgsql/.pgpass
    • pg_statsinfo runs as the postgres user
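To get a feel for the disk space that snapshots need, the figures quoted in this deck (300 kB to 800 kB per snapshot, one snapshot every 10 minutes, 1 week retention) can be combined in a short Python calculation. The function name is hypothetical, and the result is a rough estimate per monitored instance that ignores indexes and other overhead.

```python
def snapshot_storage_kb(interval_minutes, retention_days, snapshot_kb):
    """Rough repository size for one monitored instance, in kB."""
    snapshots_per_day = (24 * 60) // interval_minutes
    return snapshots_per_day * retention_days * snapshot_kb


# Defaults from the slides: 10-minute interval, 7-day retention.
low_kb = snapshot_storage_kb(10, 7, 300)
high_kb = snapshot_storage_kb(10, 7, 800)
```

With the defaults this is 1008 snapshots, so roughly 0.3 GB to 0.8 GB per instance; a shorter interval or a longer retention period scales the figure linearly.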

What is pg_stats_reporter?
• Visualizes the PostgreSQL statistics and activities collected by pg_statsinfo
  • Report items
    • Transaction activity
    • Database size
    • OS resources
    • Amount of WAL output
    • Replication state
    • Deadlock information
• Successor to pg_reporter
• Additional information
  • BSD License
  • Latest version is 2.0.0
    • http://pgfoundry.org/frs/?group_id=1000422
  • The detailed online manual is here
    • http://pgstatsinfo.projects.pgfoundry.org/pg_stats_reporter-ja.html
[Figure: a report produced by pg_stats_reporter]

Architecture of pg_stats_reporter
• Software
  • Apache + PHP + PostgreSQL
    • PHP + PostgreSQL alone is also sufficient
    • Requires PostgreSQL 8.3 or later
• Programming languages
  • PHP + JavaScript + SQL
• Libraries used
  • PHP framework
    • Smarty
  • User interface
    • jQuery, jQuery UI, tablesorter, Superfish
  • Graph drawing
    • dygraphs, jqPlot

How to Create a Report? 1/2
• From a web browser
  • A report takes only a few clicks
    ① Select the database instance to report on
    ② Push the "create new report" button
    ③ Set the period and time range of the report

How to Create a Report? 2/2
• From the command line
  • Runs in PHP's standalone mode
  • Use cases
    • Creating a report from the command line
    • Creating reports at regular intervals with crond
    • Command-line-only use does not require Apache
      • Useful if your security policy does not allow Apache to be installed
    • Keeping reports for a long time
      • Data in the repository database is kept only for a limited period
      • Reports that have been created are not erased
Command usage: create a report for 10/1 to 10/8 in report_dir
$ pg_stats_reporter -B 2013-10-01 -E 2013-10-08 -O report_dir
[LOG] Report file created: sample_localhost_5432_1_20131008-1419_20131008-1945.html
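For the crond use case, a small Python wrapper could compute the date range and assemble the command shown above. This is a sketch: the function name `weekly_report_command` and the one-week window are assumptions for illustration, and only the `-B`, `-E`, and `-O` options that appear in the command usage above are used.

```python
from datetime import date, timedelta


def weekly_report_command(today, output_dir):
    """Build the pg_stats_reporter argument list for the past 7 days."""
    begin = today - timedelta(days=7)
    return [
        "pg_stats_reporter",
        "-B", begin.isoformat(),
        "-E", today.isoformat(),
        "-O", output_dir,
    ]


# Reproduces the command from the slide for 2013-10-08.
command = weekly_report_command(date(2013, 10, 8), "report_dir")
```

A cron-driven script would pass `date.today()` and run the list with `subprocess.run(command, check=True)`; the sketch stops at building the argument list so that it can be checked without the tool installed.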

How to Create a Report? 2/2
• Report index feature
  • Creates the report and an index of reports in the report directory
  • Makes it easy to browse and sort the reports
[Figure: the report directory contains index.html (the index of reports), the pg_stats_reporter library files, and the previously created reports (report HTML 1, report HTML 2, ...)]

How to install pg_stats_reporter?
1. Install the pg_stats_reporter RPM and its dependency RPMs
$ su
# rpm -ivh httpd-2.2.15-15.el6_2.1.x86_64.rpm \
    php-5.3.3-3.el6_2.8.x86_64.rpm \
    php-common-5.3.3-3.el6_2.8.x86_64.rpm \
    php-pgsql-5.3.3-3.el6_2.8.x86_64.rpm \
    php-intl-5.3.3-3.el6_2.8.x86_64.rpm \
    pg_stats_reporter-1.0.0-1.el6.noarch.rpm
2. Edit the configuration file pg_stats_reporter.ini (the defaults are shown below)
# vim /etc/pg_stats_reporter.ini
----- configuration of repository database -----
host = localhost
port = 5432
dbname = postgres
username = postgres
password =
3. Start the Apache HTTP server
# service httpd start
4. Open the following URL
http://localhost/pg_stats_reporter/pg_stats_reporter.php
Please disable SELinux!!
The installation procedure is described in the web manual! :)
http://pgstatsinfo.projects.pgfoundry.org/pg_stats_reporter-ja.html#install

Demo of pg_stats_reporter
1. Install
2. Confirm the installation
3. Create a report

Tips for pg_stats_reporter
• Works on Android and iPad
• It is built on the jQuery UI library, so the interface design (mostly colors) is easy to change
  • The logo image can also be changed by replacing the file
• Report items can be selected
  • To do so, list the report items you need in /etc/pg_stats_reporter.ini
• For security
  • .htaccess can be used
  • Apache's usual security techniques apply as well

What is DBT-2?
• TPC-C benchmark software developed by Open Source Development Labs (OSDL)
  • Simulates ordering at a parts wholesaler
  • http://www.tpc.org/tpcc/
• The benchmark score is calculated only from the responses returned within a fixed time
  • Response time is very important!
  • It is an I/O-bound benchmark
• Main benchmark parameters
  • warehouse
    • Database size parameter
    • Each increment of 1 adds one hundred thousand records
    • Mainly used to adjust the size of the database
  • TPW
    • Transactions per warehouse
    • The number of clients prepared for each warehouse; default 10
    • With a lower TPW, the benchmark becomes CPU-bound

Transaction Tendency in DBT-2
• Main bottleneck
  • Random read/write
    • Almost all SQL plans are index scans
    • Random read/write performance and cache/buffer replacement performance are important
  • Parallel execution performance is also important
    • PostgreSQL is better at this than other RDBMSs :)
• Other characteristics
  • The SQL plans are very simple
    • Most statements access data only through index scans
  • An ideal benchmark score exists
    • If the database responds to all transactions within the time limit, it reaches the ideal score
  • The performance limit is reached when the database size is about twice the memory size
  • WAL output is lower than with pgbench, so WAL is not the bottleneck

Test Server and Settings of postgresql.conf
Server     HP DL360 G7
CPU        Xeon E5640 2.66GHz (1P/4C)
Memory     DDR3-10600R-9 18GB
RAID card  P410i / 256MB cache
Disk       4 x 146GB (1.5krpm) RAID 1+0
postgresql.conf (main changed parameters)
max_connections = 300
shared_buffers = 2458MB
work_mem = 1MB
maintenance_work_mem = 64MB
fsync = on
wal_sync_method = fdatasync
full_page_writes = on
wal_buffers = -1
archive_mode = on
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.7
random_page_cost = 2.0
effective_cache_size = 9GB
default_statistics_target = 10
log_destination = 'syslog'
autovacuum = on
Warehouses = 320 (database size is about 40 GB) and TPW = 10

Visualizing DBT-2 by pg_stats_reporter 1/5
• Transaction activity
  • The transaction rate fluctuates, because of the benchmark specification and some implementation details of PostgreSQL
    • Performance drops while a CHECKPOINT is executing
    • CHECKPOINTs were mainly triggered by checkpoint_timeout
      • postgresql.conf sets checkpoint_timeout = 15min and checkpoint_segments = 300
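Why the timeout, rather than the segment count, tends to trigger checkpoints with these settings can be checked with a little arithmetic. The Python sketch below assumes the default WAL segment size of 16 MB and a simplified model in which a checkpoint is requested once checkpoint_segments segments have been written; the function names are hypothetical.

```python
WAL_SEGMENT_MB = 16  # assumption: default WAL segment size


def wal_rate_threshold_mb_per_sec(checkpoint_segments, checkpoint_timeout_sec):
    """Sustained WAL rate above which the segment limit fires before the timeout."""
    return checkpoint_segments * WAL_SEGMENT_MB / checkpoint_timeout_sec


def seconds_to_fill_segments(checkpoint_segments, wal_rate_mb_per_sec):
    """Time needed to write checkpoint_segments segments at a given WAL rate."""
    return checkpoint_segments * WAL_SEGMENT_MB / wal_rate_mb_per_sec


# Settings from the slide: checkpoint_segments = 300, checkpoint_timeout = 15min.
threshold = wal_rate_threshold_mb_per_sec(300, 15 * 60)

# At a sustained 12 MB/sec the segment limit would be hit after this many seconds.
fill_time_at_12 = seconds_to_fill_segments(300, 12)
```

Under these assumptions, 300 segments hold 4800 MB, so the segment limit only fires first if WAL is written at a sustained rate above roughly 5.3 MB/sec for the whole 15 minutes. Timeout-driven checkpoints are therefore consistent with an average WAL rate below that figure.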

Visualizing DBT-2 by pg_stats_reporter 2/5
• Amount of WAL output
  • 4.6 GB of WAL was output from the data load to the end of the benchmark
  • During the data load, the maximum WAL rate was 54 MB/sec
  • During the benchmark run, the maximum WAL rate was 12 MB/sec
    • The WAL rate rises right after a CHECKPOINT starts, because of full-page writes

Visualizing DBT-2 by pg_stats_reporter 3/5
• CPU usage
  • iowait is the largest share, followed by idle (this indicates an I/O bottleneck)
  • The final phase of a CHECKPOINT causes a high load average
    • This is because of the long run of consecutive fsync() calls
    • PostgreSQL's CHECKPOINT logic is not good :(

Visualizing DBT-2 by pg_stats_reporter 4/5
• Updated and heavily accessed tables
  • HOT (Heap-Only Tuples) is working well!
  • The order_line and stock tables are accessed heavily
  • Each table's cache hit ratio is very high, but... (is it really? :( )
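A per-table cache hit ratio is commonly derived from PostgreSQL's block counters, for example heap_blks_hit and heap_blks_read in pg_statio_user_tables. The Python sketch below shows that common formula; whether pg_stats_reporter computes its figure exactly this way is an assumption, and the function name and sample numbers are hypothetical.

```python
def cache_hit_ratio(blks_hit, blks_read):
    """Fraction of block requests served from PostgreSQL's shared buffers.

    blks_hit:  blocks found in shared buffers.
    blks_read: blocks PostgreSQL had to request from the operating system.
    Returns None when the table has not been accessed at all.
    """
    total = blks_hit + blks_read
    if total == 0:
        return None
    return blks_hit / total


# Hypothetical counters for a heavily accessed table.
ratio = cache_hit_ratio(9900, 100)
```

One possible reading of the "is it really?" remark: these counters only describe PostgreSQL's own buffer cache. A block counted as read may still come from the OS page cache instead of the disk, so the ratio by itself does not show how much physical I/O occurred.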

Visualizing DBT-2 by pg_stats_reporter 5/5
• Query execution
  • Queries with complicated filter conditions are slow
  • Unexpectedly, COMMIT takes a long time!
    • This is because the COMMIT of a long transaction has to write a lot of WAL (WAL buffer writes)
  • The fsync() phase at the end of a CHECKPOINT makes queries slower

For More Performance
• Use direct_cp as the archive copy command
  • In archive mode, the cp command wastes a large amount of file cache, which lowers performance
  • BSD-licensed software
  • http://directcp.projects.pgfoundry.org/index.html
• Use SSDs
  • In general, the database bottleneck is random access; SSDs are 10 times faster at random access than magnetic disks
  • If you need large capacity or have a limited budget, put only the hot tables in a tablespace on SSD; this is very efficient
• Use a RAID card with a large cache
  • PostgreSQL's CHECKPOINT does not schedule its fsync() calls at all, which causes very heavy disk writes and can even lead to failover :(
  • A RAID card with a large cache may mitigate this a little

Summary
• pg_statsinfo
  • Monitors and collects PostgreSQL statistics and activities as time series
    • BSD License
    • http://pgstatsinfo.projects.pgfoundry.org/pg_statsinfo-ja.html
  • Collects all the statistics and activities a DB administrator needs
    • If you want a new kind of report, write reporting SQL against the collected information
• pg_stats_reporter
  • Visualizes the PostgreSQL statistics and activities collected by pg_statsinfo
    • BSD License
    • http://pgstatsinfo.projects.pgfoundry.org/pg_stats_reporter-ja.html
  • Useful jQuery-based interface
    • The report index feature is also useful
  • The software is easy to improve, because it is written in PHP + JavaScript
    • It is also easy to submit patches :)