PGCon 2008
The PostgreSQL Conference

ITAGAKI Takahiro
Day Talks - first day (2008-05-22)
Room B
Start time 15:00
Duration 01:00
ID 76
Event type lecture
Track Horizontal Scaling
Language en

Synchronous Log Shipping Replication

High availability solution to minimize downtime

NTT has developed a shared-nothing replication system for PostgreSQL, implemented with transaction log shipping and Heartbeat. The goal is to minimize system downtime and the impact on update performance. In the current implementation, failover completes within 15 seconds and the overhead is at worst 7% on heavily-updated workloads.

The replication solution achieves 99.999% availability, making it suitable for production systems. We will explain the advantages of the solution and the future direction of its development.
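For context, "99.999%" (five nines) translates into a concrete yearly downtime budget. A small illustrative calculation (the helper function below is our own, not part of the replicator):

```python
# Downtime budget implied by an availability target.
# downtime_per_year_minutes is an illustrative helper, not project code.
def downtime_per_year_minutes(availability: float) -> float:
    minutes_per_year = 365.25 * 24 * 60  # 525,960 minutes in an average year
    return minutes_per_year * (1.0 - availability)

print(round(downtime_per_year_minutes(0.99999), 2))  # prints 5.26
```

In other words, five-nines availability leaves roughly five minutes of downtime per year, which is why a 15-second failover matters.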

The goal of synchronous log shipping replication is to minimize system downtime and ensure 99.999% availability. At the same time, we need to avoid both performance degradation on heavily-updated workloads and functional restrictions.

We chose a synchronous log shipping approach to accomplish these goals. Many replicators already exist, but most of them are asynchronous or have restrictions derived from their statement-based approach. We think log shipping is the best solution for high availability and future extensibility.

We implemented the basic functionality of the replicator on PostgreSQL 8.2, with a WAL sender that ships logs over the network. For now it works like the standard warm standby configuration, but it minimizes the delay in transferring logs: the primary server can commit only after the commit log has been transferred to a standby server. After a failure of the primary server and a switchover to the standby, the standby has already received all of the transaction logs and does not need to read the last WAL records from the primary's disks. We can therefore use shared-nothing disks and build a high-availability cluster with inexpensive storage. Even when shared disks are available, this makes it possible to avoid waiting for the disks to be unmounted from the primary server.
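The key ordering constraint described above can be sketched as follows. This is a minimal simulation of the commit protocol, not NTT's actual implementation: the primary ships a transaction's commit record and blocks until the standby acknowledges it, so a client never sees a commit that the standby has not received.

```python
# Sketch of synchronous commit ordering (illustrative only; queue names
# and functions are our own, standing in for the network link).
import queue
import threading

wal_channel = queue.Queue()  # WAL records shipped primary -> standby
ack_channel = queue.Queue()  # acknowledgements standby -> primary

def standby():
    """Standby loop: receive each WAL record, then acknowledge it.
    (A real system would fsync the record to the standby's WAL first.)"""
    while True:
        record = wal_channel.get()
        ack_channel.put(record["lsn"])

def primary_commit(lsn, payload):
    """Primary: ship the commit record, then block for the standby's ack.
    Only after the ack is the commit reported to the client."""
    wal_channel.put({"lsn": lsn, "payload": payload})
    acked = ack_channel.get()  # synchronous wait: this is the ordering rule
    assert acked == lsn
    return lsn

threading.Thread(target=standby, daemon=True).start()
print(primary_commit(1, "COMMIT txn 42"))  # prints 1
```

Because the acknowledgement arrives only after the standby holds the commit record, a failover never loses an acknowledged transaction, which is what removes the need to read the last WAL records from the failed primary's disks.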

Two pieces of work remain: faster failover and load balancing. The failover delay is limited by recovery speed on the standby server. Recovery performance is improved in 8.3, so we plan to port our replicator to 8.3 and measure its performance. The next step is load balancing: if read-only queries are allowed on standby servers during a PITR, the replicator will gain load-balancing capabilities with little or no delay.