PGCon 2018
The PostgreSQL Conference

Takashi Menjo
Day: Talks - Day 1: Thursday - 2018-05-31
Room: DMS 1120
Start time: 14:15
Duration: 00:45
ID: 1154
Event type: Lecture
Track: Hacking
Language used for presentation: English

Introducing PMDK into PostgreSQL

Challenges and implementations towards PMEM-generation elephant

Persistent Memory (PMEM) is fast, non-volatile, byte-addressable memory that the CPU can access directly with load/store instructions, and it is already available from a few vendors. Database management systems can run faster on PMEM than on HDD or SSD. Moreover, by making applications PMEM-aware with PMDK (Persistent Memory Development Kit), we can make them much faster still. This talk presents how we hack PostgreSQL to make it PMEM-aware, and how much faster it becomes. As a first step, we focus on Write-Ahead Logging (WAL) and Relation files to improve OLTP performance and checkpoint time.

There are two ways to use PMEM as storage. One is the simple way: a direct-access (DAX) filesystem, which lets data access bypass the operating system's page cache. It can be applied without modifying PostgreSQL. The other is the hacker way: PMDK, a set of PMEM-dedicated libraries that build on DAX capabilities such as kernel-bypassing access to PMEM-mapped files and CPU-cache-bypassing memory copy. It makes PostgreSQL PMEM-aware and much faster than the simple way.

Using PMDK, we hack PostgreSQL. The targets are WAL and Relation segment files. We replace system calls with PMEM functions provided by PMDK: open, lseek, read, write and fdatasync on segment files are replaced by PMEM mapping, memory copy and memory barriers. Then, we evaluate our approach against the simple way, i.e. running unmodified PostgreSQL on a DAX filesystem. In the evaluation, we use Non-Volatile DIMM (NVDIMM) as PMEM. The results show that, for WAL, we achieve up to 1.8x more TPS in a customized INSERT-oriented benchmark. We are proposing the patches, containing approx. 1,200 insertions and deletions in total, to the community. For Relation, checkpoints take 20% less time.
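
The system-call replacement described above can be illustrated with a minimal sketch. It is not the proposed patch: it uses only portable POSIX calls (mmap, memcpy, msync) as a stand-in so that it runs without PMEM hardware, and the MappedSegment type and seg_* function names are hypothetical, introduced only for this illustration. With PMDK's libpmem, the analogous calls would be pmem_map_file for the mapping, pmem_memcpy_persist (or memcpy followed by pmem_persist) for the durable copy, and pmem_unmap for teardown.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handle for one fixed-size, memory-mapped segment file. */
typedef struct {
    char  *base;   /* start of the mapping */
    size_t size;   /* length of the mapping in bytes */
} MappedSegment;

/* Stands in for open(): size the file once and map it into memory. */
int seg_map(MappedSegment *seg, const char *path, size_t size)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t) size) != 0) {
        close(fd);
        return -1;
    }
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;
    seg->base = p;
    seg->size = size;
    return 0;
}

/* Stands in for lseek() + write() + fdatasync(): copy, then flush. */
int seg_write(MappedSegment *seg, size_t off, const void *buf, size_t len)
{
    assert(seg != NULL && seg->base != NULL);
    if (off > seg->size || len > seg->size - off)
        return -1;                  /* a fixed-length mapping cannot grow */
    memcpy(seg->base + off, buf, len);

    /* msync() requires a page-aligned start address, so round down.
     * On real PMEM, libpmem would flush CPU caches from user space. */
    size_t pagesz = (size_t) sysconf(_SC_PAGESIZE);
    size_t start = off - (off % pagesz);
    return msync(seg->base + start, (off - start) + len, MS_SYNC);
}

/* Stands in for lseek() + read(): a plain memory copy. */
int seg_read(const MappedSegment *seg, size_t off, void *buf, size_t len)
{
    assert(seg != NULL && seg->base != NULL);
    if (off > seg->size || len > seg->size - off)
        return -1;
    memcpy(buf, seg->base + off, len);
    return 0;
}

/* Releases the mapping. */
int seg_unmap(MappedSegment *seg)
{
    int rc = munmap(seg->base, seg->size);
    seg->base = NULL;
    seg->size = 0;
    return rc;
}
```

A caller would map a segment once with seg_map, then issue seg_write and seg_read at byte offsets instead of seeking on a file descriptor. The bounds check in seg_write also hints at why extending a Relation needs special handling when the backing file is a fixed-length mapping.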

We also talk about our ongoing efforts to evolve the elephant further, such as controlling NUMA effects, eliminating the overhead of SQL parsing, and handling Relation extension on PMEM-mapped fixed-length files.