PGCon2016 - 20180510

PGCon 2016
The PostgreSQL Conference

Speakers
Takashi Horikawa
Schedule
Day Talks - Day 2 - 2016-05-20
Room DMS 1110
Start time 14:00
Duration 00:45
Info
ID 945
Event type Lecture
Track Performance
Language used for presentation English

Non-volatile Memory Logging

Emerging byte-addressable, non-volatile memories (NVMs) are revolutionary because data written by a CPU store operation is non-volatile. This talk presents the architecture of a prototype and performance evaluation results that demonstrate the potential of NVMs when they are used for the WAL buffer in PostgreSQL: a transaction becomes durable as soon as its WAL records are written to the WAL buffer, with no need to wait until the WAL records are written to the storage device. In a nutshell, using NVM for the WAL buffer lets us exploit the performance of asynchronous commit without impairing transaction durability. Although the idea is simple, its implementation is not. This talk also covers the difficulties of implementing an NVM WAL buffer and how to address them. Finally, I would like to share some knowledge that I obtained through the implementation and examination of the prototype.

Emerging byte-addressable, non-volatile memories (NVMs) offer both non-volatility and fast access to data. Unlike block-addressable storage devices, they are revolutionary because data written by a CPU store operation is non-volatile. A natural way to use NVM in a DBMS is to store WAL records in it; this eliminates the need to write WAL records to a conventional block-addressable storage device, such as a hard disk or SSD, using expensive synchronous I/O operations, which increase transaction response time. Although this idea is simple, its implementation is not, for two reasons: the NVM capacity issue and the partial write problem.

First, the NVM device available for immediate use is the NVDIMM, which does not have sufficient capacity to store the WAL records generated over the system's entire running period. Therefore, WAL records stored in the NVM device ultimately have to be saved to a block-addressable storage device, in a manner similar to asynchronous commit. Second, WAL records have to be written to NVM in a well-formed manner in case the system crashes while a WAL record is being written; that is, the recovery procedure must be able to recognize whether a WAL record was completely written or not.
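The partial write problem can be illustrated with a minimal sketch. It is not the prototype's code: the record layout (a length and a CRC-32 in front of each payload), the names `append_record` and `scan_records`, and the use of a Python `bytearray` as a stand-in for the NVM region are all assumptions made for illustration. Real NVM would also need cache-line flushes and store ordering, which this simulation leaves out.

```python
import struct
import zlib

# Hypothetical record header: payload length and CRC-32 of the payload.
HEADER = struct.Struct("<II")


def append_record(buf, offset, payload):
    """Write one framed record at `offset`; return the offset just past it."""
    if not payload:
        raise ValueError("empty payload (length 0 marks unused space)")
    start = offset + HEADER.size
    end = start + len(payload)
    if end > len(buf):
        raise ValueError("NVM buffer full")
    # Payload first, header last, so a crash before the header is written
    # leaves a zero length that the scan treats as "no record here".
    buf[start:end] = payload
    HEADER.pack_into(buf, offset, len(payload), zlib.crc32(payload))
    return end


def scan_records(buf):
    """Return the complete records, stopping at the first incomplete one."""
    records = []
    offset = 0
    while offset + HEADER.size <= len(buf):
        length, crc = HEADER.unpack_from(buf, offset)
        start = offset + HEADER.size
        end = start + length
        if length == 0 or end > len(buf):
            break  # unused space or an implausible length
        payload = bytes(buf[start:end])
        if zlib.crc32(payload) != crc:
            break  # torn record: checksum does not match the payload
        records.append(payload)
        offset = end
    return records
```

In this sketch the recovery scan accepts a record only when its stored checksum matches its payload, so a record that was only partly written when the crash occurred is ignored together with everything after it.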

Against this background, I have developed a prototype based on PostgreSQL and have observed that it delivers almost the same throughput as asynchronous commit and can recover committed transactions after a system crash. The necessary modifications were confined to several source files, mainly xlog.c, so that the recovery procedure reads WAL records not only from block-addressable storage but also from the NVM WAL buffer. I would like to present the architecture of NVM WAL logging and share some knowledge that I obtained through the implementation and examination of the prototype.
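The interplay between the small NVM buffer and the large block-addressable log can be sketched as a toy model. The class `NvmWalLog` and its methods are hypothetical and are not taken from xlog.c or from the prototype. The sketch also drains the buffer inline when it fills up, whereas a real system would presumably write records out in the background, as asynchronous commit does.

```python
class NvmWalLog:
    """Toy model: a small persistent buffer in front of a large, slow log."""

    def __init__(self, nvm_capacity):
        self.nvm_capacity = nvm_capacity  # bytes available in the NVM buffer
        self.storage = []   # stands in for WAL files on block storage
        self.nvm = []       # stands in for the NVM WAL buffer
        self.nvm_bytes = 0

    def commit(self, record):
        """Append a WAL record; it counts as durable once it is in NVM."""
        if len(record) > self.nvm_capacity:
            raise ValueError("record larger than the NVM buffer")
        if self.nvm_bytes + len(record) > self.nvm_capacity:
            self.drain()  # capacity issue: make room before appending
        self.nvm.append(record)
        self.nvm_bytes += len(record)
        # No synchronous I/O to block storage on the commit path.

    def drain(self):
        """Move buffered records to block storage, oldest first."""
        self.storage.extend(self.nvm)
        self.nvm.clear()
        self.nvm_bytes = 0

    def recover(self):
        """Replay order after a crash: block storage, then the NVM buffer."""
        return self.storage + self.nvm
```

Under these assumptions, a crash loses nothing that was committed: records that were already drained are read from block storage, and the most recent ones are read from the NVM buffer.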