PGCon 2008
The PostgreSQL Conference

Speakers: Jack Orenstein

Schedule
Day: Talks - first day (2008-05-22)
Room: B
Start time: 11:30
Duration: 01:00

Info
ID: 57
Event type: lecture
Track: Horizontal Scaling
Language: en

Horizontal Scalability with PostgreSQL

Archival of digital data

The Hitachi Content Archive Platform (HCAP) is a storage system for archiving digital data. The HCAP software runs on a cluster of Linux nodes and implements a shared-nothing architecture. File metadata is kept in a set of Postgres databases, one on each node of the cluster.

The data managed by Postgres is partitioned into "regions", and copies of each region are kept on multiple nodes: one master and zero or more slaves. A cluster typically has 32 to 256 regions, depending on its size, and two copies of each region.
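The region layout described above can be illustrated with a small sketch. This is not HCAP's actual scheme: the abstract does not say how keys are mapped to regions or how copies are placed on nodes. The hash-based mapping, the round-robin placement, and the names `region_for` and `place_regions` are all assumptions made for illustration.

```python
import hashlib


def region_for(key: str, num_regions: int) -> int:
    """Map a metadata key to a region number (assumed: hash modulo region count)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_regions


def place_regions(num_regions: int, nodes: list[str], copies: int = 2) -> dict[int, list[str]]:
    """Assign each region `copies` distinct nodes (assumed: round-robin).

    The first node in each list holds the master copy; the rest hold slaves.
    """
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies")
    placement = {}
    for region in range(num_regions):
        placement[region] = [nodes[(region + i) % len(nodes)] for i in range(copies)]
    return placement


# Hypothetical four-node cluster with 32 regions and two copies per region.
nodes = ["node0", "node1", "node2", "node3"]
placement = place_regions(32, nodes)
region = region_for("/archive/2008/report.pdf", 32)
master, *slaves = placement[region]
```

With round-robin placement each node ends up as master for an equal share of the regions, which is one simple way to spread both data and access across the cluster.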

Region copies are kept synchronized by a homegrown replication scheme. An update is reported as successful only after the master copy has committed it and the slaves have acknowledged receiving it; a slave need not have committed the update by then.
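The acknowledgement rule for updates can be sketched in memory as follows. This is a toy model under stated assumptions, not the HCAP implementation: the class names `MasterCopy` and `SlaveCopy`, the dictionary storage, and the choice to commit on the master before forwarding to slaves are all hypothetical. The abstract only states that both conditions must hold before success is reported.

```python
class SlaveCopy:
    """Slave copy of a region: acknowledges receipt, commits later."""

    def __init__(self):
        self.received = []   # updates acknowledged but not yet committed
        self.committed = {}  # updates applied to the slave's store

    def receive(self, key, value) -> bool:
        self.received.append((key, value))
        return True  # acknowledgement of receipt only

    def apply_pending(self) -> None:
        # Commit received updates in arrival order.
        while self.received:
            key, value = self.received.pop(0)
            self.committed[key] = value


class MasterCopy:
    """Master copy of a region: commits locally and collects slave acks."""

    def __init__(self, slaves):
        self.committed = {}
        self.slaves = slaves

    def update(self, key, value) -> bool:
        self.committed[key] = value  # commit on the master (assumed to happen first)
        acks = [slave.receive(key, value) for slave in self.slaves]
        return all(acks)  # success only if every slave acknowledged receipt


slave = SlaveCopy()
master = MasterCopy([slave])
ok = master.update("file-1", {"size": 4096})
```

Immediately after `update` returns, the slave holds the update in `received` but not yet in `committed`; calling `slave.apply_pending()` closes that gap. That window is why recovery after a failure has to account for updates that were acknowledged but never committed.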

The paper and talk will focus on how data and access are partitioned, how replication works, how data integrity is maintained across failures, and how the temporary and permanent losses of region copies are dealt with.