PGCon2018 - 2.5

PGCon 2018
The PostgreSQL Conference

Speakers
Anastasia Lubennikova
Schedule
Day Talks - Day 2: Friday - 2018-06-01
Room DMS 1110
Start time 15:00
Duration 00:45
Info
ID 1194
Event type Lecture
Track Hacking
Language used for presentation English

Towards scalable ACID PostgreSQL with partitioning, postgres_fdw and logical replication

The idea of using partitioning and postgres_fdw for sharding has been floating around for many years, however we are not aware of existing attempts to build a complete solution. In this talk we will share our experience on this path. Topics will include sharding itself, decent distributed transactions (isolation and atomicity), failures recovery, benchmarks and performance issues.

We are working on pg_shardman [1] -- PG extension which aims to scale-out Postgres without losing ACID semantics. Based on pg_pathman [2] and postgres_fdw modules, it supports hash-sharding of tables into arbitrary number of partitions scattered across the nodes. Any node can execute queries involving distributed tables. For each partition, user-specified number of copies is created with logical replication. We have implemented two-phase commit (2PC) commit protocol and Clock-SI [3] algorithm to provide atomic cross-node transactions with snapshot isolation (called REPEATABLE READ in Postgres) xact isolation level. In addition, pg_shardman features

  • Node addition/removal with data rebalance;
  • Distributed deadlock detection and resolution;
  • Data import with COPY FROM;
  • Limited ALTER TABLE support for distributed tables;
  • Manual (requires DBA directive) failure recovery;

We will describe how we have implemented this and highlight particular points in the endless universe of possible improvements, e.g. automated failover. Then we will discuss benchmarks, performance issues and what we can do about them. Some of the enhancements can be developed and merged into core PostgreSQL or postgres_fdw independently of pg_shardman, like asynchronous (parallel) postgres_fdw execution and optimized bulk inserts into foreign table.