Conference Schedule - PGCon 2023

A Pattern for a Searchable, Encrypted JSON Document Service

Date: 2023-05-31
Time: 11:00–11:45
Room: DMS 1160
Level: Intermediate
Feedback: Leave feedback

Welp, I went and did it. I designed a database outside of the database, converting Postgres to a relatively dumb storage engine.

Why? We needed to develop an identity service in which all data in the database is fully-encrypted, such that in the event of a breach the data would be all but useless. That means we can't take advantage of so many of PostgreSQL's amazing semantic tools for data management an indexing.

The New York Times Identity service adopted a pattern of a fairly simply key/value store, with a UUID primary key and an encrypted payload. But, crucially, we needed to provide an interface for looking up records by secondary IDs, like email addresses and phone numbers — without actually storing those values. What did we do instead?

We took advantage of the emerging patterns of JSON Schema, JSON Path, and JSONB indexing to create hashed JSON objects for querying secondary keys. The result is a system that puts the power of entity schema management in the hands of our users while optimizing the search performance without compromising personal data.

This talk will describe the overall design in detail, highlighting the JSON schema definition, data encryption, indexing, and the query performance enabled by PostgreSQL's GIN index and JSONPath search capabilities.


David E. Wheeler