PGCon2012 - Slide release #12

PGCon 2012
The PostgreSQL Conference

Greg Smith
Peter Geoghegan
Day Talks - 2 - Friday - 2012-05-18
Room MRT 218
Start time 10:00
Duration 01:00
ID 453
Event type Lecture
Track Scaling Out
Language used for presentation English

A Batch of Commit Batching

A database commit can be the most expensive single operation that users have to wait for. Recent trends in the database industry have shown that some applications are willing to accept durability loss when it must be sacrificed to reach performance goals. And an inevitable downside of more durable approaches like Synchronous Replication is their impact on server commit speed.

Some of the fundamental limitations here are physical ones: disk rotation, network performance, and the speed of light. Recent performance improvements for PostgreSQL 9.2 aim at getting closer to the theoretical best possible behavior in every situation. It's more important than ever to tell when the limit you're hitting is a physical one, and when it's something you can address with a software change. Controlling commit batch size and the number of concurrent clients is becoming even more important as PostgreSQL is deployed onto cloud and other virtual hardware environments.
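As a rough illustration of those physical limits, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope model, not a measurement: it assumes a single client whose every commit waits for one full platter rotation (no write cache), and a signal speed of about 200,000 km/s in fiber. The function names and the default speed are assumptions made for this sketch.

```python
# Back-of-the-envelope limits on commit speed (illustrative assumptions only).

def max_commits_per_second(rpm: float) -> float:
    """Commits/sec for one client if each commit waits one full disk rotation."""
    return rpm / 60.0


def min_commit_latency_ms(rpm: float) -> float:
    """Time for one full rotation, in milliseconds."""
    return 60_000.0 / rpm


def min_round_trip_ms(distance_km: float,
                      signal_speed_km_per_s: float = 200_000.0) -> float:
    """Lower bound on a network round trip over the given one-way distance.

    The default speed is an assumed approximation for light in fiber.
    """
    return 2.0 * distance_km / signal_speed_km_per_s * 1000.0


if __name__ == "__main__":
    print(max_commits_per_second(7200))   # rotation-bound commit rate
    print(min_commit_latency_ms(7200))    # rotation time in ms
    print(min_round_trip_ms(1000))        # round trip to a standby 1000 km away
```

Under these assumptions a 7200 RPM drive caps a single client at about 120 commits per second, and a synchronous standby 1000 km away adds at least 10 ms per commit, no matter how the software is tuned.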

Four of the fundamental factors going into how expensive a commit is are atomicity, consistency, isolation, and durability, collectively referred to as ACID. PostgreSQL has always respected the durability aspects of ACID compliance. Extending that to reach onto multiple servers can significantly expand the suitability of the database for business critical applications. It will cost you though. The question isn't just how much durability you want; it's how much durability can you afford?

The innovative design used in PostgreSQL doesn't force you to make this sort of decision at the database level. Every individual commit can specify its durability requirements at any time, even in the middle of a transaction. Being able to classify your need at such a fine level allows PostgreSQL an unprecedented range of options in this area. Mission critical data that needs multi-node synchronous commit can coexist with high volume/best effort data, with each transaction fine-tuned to its position in the reliability vs. speed trade-off spectrum.
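One way to picture per-transaction durability is a small helper that wraps a batch of statements with the desired setting. The sketch below only builds SQL strings; it does not connect to a server. It relies on the `synchronous_commit` setting and `SET LOCAL`, which PostgreSQL 9.1 provides with the values `on`, `local`, and `off`. The helper name and the table names are hypothetical, chosen only for illustration.

```python
# Sketch: choose durability per transaction via synchronous_commit.
# Only SQL text is produced here; sending it to a server is left to the caller.

VALID_LEVELS = ("on", "local", "off")  # values available in PostgreSQL 9.1


def transaction_with_durability(statements, level):
    """Return the SQL for one transaction using the given durability level.

    SET LOCAL lasts only until the end of the current transaction, so other
    transactions on the same connection keep their own setting.
    """
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown synchronous_commit level: {level!r}")
    return ["BEGIN",
            f"SET LOCAL synchronous_commit TO {level}",
            *statements,
            "COMMIT"]


if __name__ == "__main__":
    # Hypothetical tables: payments are critical, page_hits are best effort.
    critical = transaction_with_durability(
        ["INSERT INTO payments (amount) VALUES (100)"], "on")
    best_effort = transaction_with_durability(
        ["INSERT INTO page_hits (url) VALUES ('/index')"], "off")
    for sql in critical + best_effort:
        print(sql + ";")
```

With a synchronous standby configured, the `on` transaction waits for the standby, `local` waits only for the local flush, and `off` returns before the flush, so both kinds of data can share one database.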

There's a second factor to consider too: client count. The Synchronous Replication implementation used for PostgreSQL 9.1 makes it possible to increase total aggregate commit throughput by scaling up the number of concurrent clients. Improvements in progress for PostgreSQL 9.2 take that basic idea and apply it more aggressively to local commits as well. Per-client commit behavior is becoming an increasingly important bottleneck to understand and design against, and it rewards careful adjustment.
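The effect of client count can be sketched with a deliberately idealized model: every flush (or standby acknowledgement) takes a fixed time, and all clients waiting on it are committed together. The function name, the fixed latency, and the optional batch cap are assumptions for this sketch, not a description of the actual PostgreSQL implementation.

```python
# Idealized group commit model: one flush acknowledges a whole batch of commits.

def aggregate_commit_rate(clients, flush_latency_ms, max_batch=None):
    """Commits/sec when each flush carries one commit per waiting client.

    max_batch optionally caps how many commits a single flush can carry.
    """
    batch = clients if max_batch is None else min(clients, max_batch)
    return batch * 1000.0 / flush_latency_ms


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(n, aggregate_commit_rate(n, 8.0))
```

In this model each client still waits the full flush time for its own commit; only the aggregate rate grows with the number of clients, which is why per-client latency and total throughput have to be benchmarked separately.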

Topics covered will include:

  • Components of commit latency
  • Application batch commits
  • Benchmarking commit speed vs. client count
  • Local commit durability options and performance
  • Improvements in progress for PostgreSQL 9.2 group commit performance
  • Remote server commit latency
  • Synchronous Replication commit options and performance
  • Per-transaction commit durability