PGCon2014 - Final Release

PGCon 2014
The PostgreSQL Conference

Peter Geoghegan
Day Talks - Day 1 - Thu May 22 - 2014-05-22
Room Morisset 221
Start time 13:00
Duration 00:45
ID 661
Event type Lecture
Track 9.4 Features
Language used for presentation English

Why UPSERT is weird

Counterintuitive lessons learned from the implementation effort

Talk that examines implementation process on the INSERT...ON DUPLICATE KEY LOCK FOR UPDATE feature proposed for PostgreSQL.

"UPSERT" can loosely be described as a DBMS feature that allows a DML statement's author to atomically either insert a row, or on the basis of the row already existing, update that existing row instead, while safely giving little to no further thought to concurrency. One of those two outcomes must be guaranteed, regardless of concurrent activity, which is the essential property of UPSERT. Examples include MySQL's INSERT...ON DUPLICATE KEY UPDATE, or the SQL standard's MERGE statement (as implemented in a number of proprietary systems). The absence of this feature from Postgres has been a long-standing complaint from Postgres users, and an effort to implement an extension to the Postgres INSERT statement, that implements something broadly in line with user expectations in this area was undertaken in 2013.

This talk considers the experience of working on that implementation, and examines how the proposed INSERT...ON DUPLICATE KEY LOCK FOR UPDATE feature deals with concurrency issues, visibility issues, the interplay of value locking and row locking, and the general avoidance of unprincipled deadlocking. In order to come up with an implementation that satisfied user expectations, while still preserving and respecting long standing abstractions and conceptual precepts, some interesting and counterintuitive choices were made to resolve the fundamental tension in tying value locking (as always required on a limited scale for unique index enforcement) to row locking (to LOCK FOR UPDATE).

Finally, the talk examines the strategic significance of UPSERT, and further use-cases enabled by the feature in the area of multi-master replication conflict resolution.