PGCon2017 - 20180510

PGCon 2017
The PostgreSQL Conference

Speakers
Payal Singh

Schedule
Day: Talks - Day 2 - 2017-05-26
Room: DMS 1120
Start time: 13:00
Duration: 00:45

Info
ID: 1085
Event type: Podium
Track: DBA
Language used for presentation: English

Provisioning and Automating High Availability Postgres on AWS EC2

Simple automation techniques to make Postgres cluster management easier at any scale.

RDS is great. But sometimes you need more flexibility and customization than RDS allows. In this talk I will explain why one might want to set up and maintain an HA Postgres cluster on EC2 instead of RDS. I will then go through each of the steps for setting up such a system and automating installation and upgrades, replication, backups and restores with S3, PITR, monitoring with CloudWatch, and failover. This talk is especially useful for people looking to gain a better understanding of how the various AWS tools can be used together for an HA Postgres cluster, and as a how-to guide for migrating from either RDS or a self-hosted cluster to EC2.

When people hear of automation, they mostly think of systems with hundreds of servers where manual management is next to impossible. But with the advent of easy-to-learn, quick-to-set-up configuration management tools, a DBA's life can get much easier even with nothing more than a single primary-secondary database architecture. What do you get in return? You get reliability, fewer opportunities for human error, easier scaling if it is ever needed, and, most of all, ease of applying changes.

Even with the popularity of AWS tools, primarily RDS and EC2, we rarely have a 'single-tool-fits-all-databases' kind of scenario. While RDS works great for more than a few companies, it has limits that complicate things for anyone who wants more control over their database and the freedom to use more tools around it. EC2 is a great middle ground where people can still offload their machine maintenance responsibilities to Amazon while getting almost all of the freedom and power that comes with administering those machines. Here is where the essence of my talk lies: in automating both the mundane and the risky in a reliable, well-designed manner, so you can focus on what's important, the data.

In this talk, I will start with a quick introduction to popular configuration managers and explain why I chose Ansible for this use case. I will then detail how to design and implement such a setup with minimum effort to achieve maximum efficiency in the long run. I will also go through some of the pitfalls and 'gotchas' of automating a single primary-secondary Postgres architecture: approaches that may seem great at first but can later be a pain to work on, and eventually add to technical debt and frustrate colleagues.

Lastly, I will demonstrate initializing such an infrastructure with my own Ansible playbook, setting up replication, performing a failover, rebuilding/syncing the new secondary, installing extensions, upgrading Postgres versions and much more!