Lessons PostgreSQL learned from commercial databases, and didn’t

Ilya Kosmodemiansky [email protected]

Preamble

PostgreSQL is a great database!

• (You always have to say so if you are going to say that PostgreSQL lags behind commercial databases or has some limitations)

• The only open source database technology massively used as an alternative to commercial RDBMSs

• Moreover, 10 years ago some people seriously disputed whether PostgreSQL could outperform MySQL

• Moreover, 5 years ago any Oracle-to-Postgres migration case study meant you would be 100% accepted at any PostgreSQL conference

• Only PostgreSQL has made such impressive progress!

• Well, Linux did too, but Linux is not a database system

What made that possible?

• Good initial architecture

• Well organized community work

• SQL close to standard

• Procedural languages

• Lots of things - you probably know those things if you are here

Did PostgreSQL learn something? (from commercial databases)

• Well, not directly

• At least, this is the worst possible way to start a discussion on [HACKERS]: “...we need this feature because Oracle has it...”

• Most likely, people came from Oracle, did not find some beloved instruments, and started to implement a substitute

Anecdotally

• The prominent Soviet aircraft designer Tupolev, being unofficially accused of plagiarizing some of his models, used to say that all beautiful aircraft look similar and that is why they can fly

• Tupolev’s ill-wishers believed that he had definitely plagiarized that formula as well, from some other aircraft designer...

• For aviation engineers, it was always obvious that internally the airplanes were totally different

• Anyway, databases _are_ like aircraft: the common theory beneath makes them look similar

That common theory was

Transactions

• If your data is important, use a database which supports ACID transactions

• In PostgreSQL: MVCC implementation since version 6.5 (1999), WAL since 7.1 (2001)

• Adopting MVCC instead of a pure-locking scheduler was wise (DB2 and MS SQL Server proved that over time)

• That made it possible to implement a reliable backup/recovery mechanism and replication for high availability

• And that was actually the pivotal point which started PostgreSQL adoption in enterprise-level solutions

• Ironically, the current MVCC implementation has itself become a limitation for Postgres

OK, hold on

What can actually stop you from choosing Postgres instead of Oracle or DB2?

• Write performance - Yes, absolutely

• Database size - Yes, definitely

• Lack of diagnostic tools - Yes

• We need to run PostgreSQL in a Microsoft environment - Yes

• Lack of qualified people - Maybe

• Lack of a built-in analog of RAC/PureScale - Yes and No

• We are talking about heavy workloads and comparing with enterprise licenses

Main problem

Write performance and database size

PostgreSQL uses buffered writes

[Figure: Buffered IO vs. Direct IO. Each panel shows shared_buffers, the kernel buffer, and the disks.]

• Effectively, one PostgreSQL process writes pages one by one to the kernel buffer, and that buffer is then flushed to disk

• Besides double caching, this is slow and does not allow the use of some cool features (O_ATOMIC)

• Oracle can bypass the kernel buffer using direct IO. Moreover, both Oracle’s database writer and log writer can swap threads to write asynchronously

• That is a serious limitation for reaching high TPS figures on a single instance
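The buffered write path described above can be sketched in a few lines of Python. This is only an illustration of the pattern (one page at a time into the kernel buffer, then a flush), not PostgreSQL's actual writer code; the function name `write_pages_buffered` and the file layout are assumptions made for the example, and the 8 kB page size matches PostgreSQL's default block size.

```python
import os

PAGE_SIZE = 8192  # PostgreSQL's default block size


def write_pages_buffered(path, pages):
    """Write pages one by one through the kernel buffer, then flush them.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for page in pages:
            view = memoryview(page)
            # write() only copies data into the kernel buffer; it may
            # accept fewer bytes than offered, so loop until done.
            while len(view) > 0:
                written = os.write(fd, view)
                view = view[written:]
        # fsync() asks the kernel to push the buffered pages to disk.
        os.fsync(fd)
    finally:
        os.close(fd)
```

Until `os.fsync` returns, the pages may exist both in the process's memory and in the kernel buffer, which is the double caching the slide refers to. With direct IO the kernel buffer would be skipped, at the price of alignment requirements that this sketch does not attempt to handle.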

Huge database

• Same problem - double caching

• Storage overhead

• Backup performance and recovery time

• Autovacuum performance becomes an issue

Backup performance

• No built-in parallelism

• Level 0 plus PITR only

• Keeping undo information right in the datafiles can be a problem for incremental backups

Current MVCC implementation is a limitation itself

Nothing new, I only want to mention that it may be the largest challenge for PostgreSQL in the next 20 years

• It solves only one problem, “snapshot too old” (and modern Oracle solves it better)

• Undo information, spread throughout the datafiles, brings a lot of problems
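A toy model may help show why old row versions kept inside the datafiles are costly. The sketch below is not PostgreSQL's implementation: it ignores pages, commit status, and snapshots, and every name in it (`ToyHeap`, `TupleVersion`, `oldest_active_xid`) is made up for the example. It only illustrates the general idea that an update leaves the superseded version in place until a vacuum pass removes it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TupleVersion:
    key: str
    value: int
    xmin: int                    # transaction that created this version
    xmax: Optional[int] = None   # transaction that superseded it, if any


class ToyHeap:
    """Toy 'datafile': live and dead row versions share the same storage."""

    def __init__(self) -> None:
        self.versions: List[TupleVersion] = []
        self.next_xid = 1

    def _new_xid(self) -> int:
        xid = self.next_xid
        self.next_xid += 1
        return xid

    def insert(self, key: str, value: int) -> None:
        self.versions.append(TupleVersion(key, value, xmin=self._new_xid()))

    def update(self, key: str, value: int) -> None:
        xid = self._new_xid()
        for version in self.versions:
            if version.key == key and version.xmax is None:
                # The old version is only marked as superseded; it stays put.
                version.xmax = xid
        self.versions.append(TupleVersion(key, value, xmin=xid))

    def dead_versions(self) -> int:
        return sum(1 for v in self.versions if v.xmax is not None)

    def vacuum(self, oldest_active_xid: int) -> None:
        """Drop versions superseded before the oldest still-active transaction."""
        self.versions = [
            v for v in self.versions
            if v.xmax is None or v.xmax >= oldest_active_xid
        ]
```

In this model, one row updated three times occupies four slots until `vacuum` runs, and a long-running transaction (a low `oldest_active_xid`) keeps the dead versions from being removed. Anything that reads the datafiles, backups included, sees all of those versions.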

Lack of diagnostic tools

• OK, there are plenty of them

• Tools which require kernel developer experience, such as perf, are not proper tools for a DBA

• Full-time PostgreSQL developers are not DBAs. We need to explain to them what we need and why

• Adding wait information to pg_stat_activity is a good example of such a joint effort

• And a good lesson learned from Oracle. Not the last, I hope
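As a rough illustration of how a DBA might use that wait information, the sketch below counts sessions per wait event. The `wait_event_type` and `wait_event` columns have been part of `pg_stat_activity` since PostgreSQL 9.6; everything else is an assumption for the example: the helper `summarize_waits` is hypothetical, the sample rows are invented, and running `WAIT_QUERY` against a real server would need a database driver that is not shown here.

```python
from collections import Counter

# Query a monitoring script could run periodically (driver code not shown).
WAIT_QUERY = """
SELECT wait_event_type, wait_event
FROM pg_stat_activity
WHERE state = 'active'
"""


def summarize_waits(rows):
    """Count active sessions per (wait_event_type, wait_event) pair.

    A NULL wait event means the session is not waiting on anything.
    """
    counts = Counter()
    for wait_event_type, wait_event in rows:
        if wait_event is None:
            counts[("none", "not waiting")] += 1
        else:
            counts[(wait_event_type, wait_event)] += 1
    return counts.most_common()


# Invented sample of what one poll of WAIT_QUERY might return.
sample_rows = [
    ("IO", "DataFileRead"),
    ("LWLock", "WALWriteLock"),
    ("IO", "DataFileRead"),
    (None, None),
]

for (event_type, event), sessions in summarize_waits(sample_rows):
    print(f"{event_type:8} {event:16} {sessions}")
```

Sampling the view repeatedly and aggregating like this gives a coarse picture of where sessions spend their time, without requiring perf or kernel-level tooling.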

PostgreSQL performance on Windows

• Well, there is no such thing. By the way, Oracle performs well there

• At the same time, there is a lot of PostgreSQL on Windows

• Lack of enthusiasts for proper porting

• At the same time, we support various BSDs and even Tru64 UNIX!

• Welcome to the world of open source!

Documentation

• Relatively small, but efficient, not over-engineered, covers all topics well - at first glance

• No graphic diagrams. It seems much easier to decide on a graphics format than to rework MVCC!

• No guidebooks. An application developer must read half of the documentation to install Postgres in a test environment!

• OK, there is the PostgreSQL wiki, but it is not under release control

In spite of all this

PostgreSQL is a great database!

• It is still relatively simple to start with and to live with

• It is safe. We have no listener, but we also have no thick books about securing the listener from external attacks.

• It learns fast

• Maybe it will change the global database market the way Linux changed the global operating systems market

Questions?

[email protected]