PGCon 2018
The PostgreSQL Conference

Speakers: Konstantin Evteev, Mikhail Tyurin

Schedule
Day: Talks - Day 2: Friday - 2018-06-01
Room: DMS 1110
Start time: 11:00
Duration: 00:45

Info
ID: 1141
Event type: Lecture
Track: DBA
Language used for presentation: English

Recovery use cases for Logical Replication in PostgreSQL 10

In this talk, we show how our recovery use cases built around Londiste (and PgQ in general) for distributed data processing could be moved to the new Logical Replication subsystem in PostgreSQL 10. With the current implementation of Logical Replication we see only non-trivial solutions, so we could raise a number of issues with the community that come down to implementing simpler recovery mechanisms, ones as simple as configuring replication itself in PostgreSQL 10. These mechanisms involve replaying events from the replication queue, repositioning the source and the destination of replication, reinitializing a subscriber from another subscriber, rewinding the replication queue, and performing UNDO recovery on the destination side.

Avito is the biggest classifieds site in Russia and the third largest in the world (after Craigslist in the USA and 58.com in China). At Avito, ads are stored in PostgreSQL databases, and logical replication has been in active use for many years. It helps us solve the following problems: growth in data volume and in the number of requests to that data, scaling and distribution of load, delivery of data to the DWH and the search subsystems, inter-database and inter-network data synchronization, and so on. But nothing comes for free: the result is a complex distributed system. Hardware failures happen, which is natural, and you always need to be ready for them. There are plenty of sample logical replication configurations and many success stories about using it. Yet none of this documentation covers recovery after crashes and data corruption, and there are no ready-made tools for it. Over years of continuous use of PgQ replication, we have gained extensive experience, rethought a lot, and implemented our own add-ins and extensions to restore and synchronize data after crashes in distributed data processing systems.