PGCon2010 - Final Release

PGCon 2010
The PostgreSQL Conference

Leandro Guimarães Faria Corcete DUTRA
Day Talks - 1 - 2010-05-20
Room DMS 1160
Start time 16:30
Duration 01:00
ID 228
Event type Lecture
Track DBA
Language used for presentation English

The Illustrated Elephant

Literary modeling and automatic documentation in PostgreSQL

Users of proprietary database management systems are usually bound to use graphical modeling tools with an emphasis on drawing diagrams and generating SQL DDL code from them. The process is generally error-prone and cumbersome, being based on mediochre user interfaces and generating bad SQL representing poor data models, as diagrams can hardly represent the full richness of SQL data models — specially in an SQL flavor like PostgreSQL’s. Many databases are reverse-engineered into entity-relationship diagrams, loosing much information coded into SQL features not directly supported by the diagramming tools, or in the SQL DDL source code which originally created the reverse-engineered database.

A well-kept but open secret of many database administrators is reliance on source code and automatic diagramming tools. Breaking free from the misconception that all information should be graphically represented, or even that it should be graphically created, and from the mistaken identification of modeling and drawing, such data modelers are free to use the full power of both SQL and their well-proven, flexible source code tools, all the while generating all the graphics and web pages they could possibly want automatically, using simple, fast programs which can lay out diagrams much better than most drafters.

SQL DDL coding can also be nicely combined to literary programming tools, in what we call ‘literary modeling’: interspersing SQL DDL statements in a full text explanation of the model and the reasoning behind it, we can generate both text files for database schema creation, and nicely formatted documents for reading, browsing and reference, both printed and online. These documents can, and typically will, include graphics generated automatically from either the SQL DDL or the database schema itself.

It is our tested conviction that this process is much more pleasurable and efficient than the tradicional diagram-based one.

We intend to show how, in the course of the normal life cycle of a database, data architects, administrators, programmers and users are stiffled and unnerved by the reliance on graphical, diagram-based modeling tools, the cumbersome processes they require, the problems they engender, the poor models they all but enforce, and the bad code they generate.

We will introduce, or rather remind, a few among several tools that can and do ease the productive work on rich, maintainable database models, helping generate databases that use the full range of possibilities enabled by PostgreSQL and its SQL dialect. Focusing on the most formal and familiar representation of SQL database schemas, namely SQL DDL code itself, we profit of the full range of traditional text-based coding tools, such as text editors and utilities, markup and formatting languages, source code versioning and control, build systems, scripts, hypertext and automatic layout of diagrams.

Using only tried and tested tools such as NoWeb, LaTeX, HTML, Autodoc, SQL::Fairy and SchemaSpy, and the basic ideas legated by people such as Codd (the relational model itself) and Knuth (literary programming), we present a flexible skeleton process that relies not on secrets, methodologies or black box tools and file formats, but on standards and on well-known interfaces. As we present the tools’ capabilities and possible combinations thereof, both novice and expert users will recognize some of their own practices and, hopefully, a few new ones.

We hope to inspire some interest on further implementation of the SQL standard in PostgreSQL, of the relational model aspects in SQL, and on development of the presently existing tools.

Early versions of this talk were met with interest by the Brazilian free software community in PgCon BR 2008, PgDay SP 2009, and FISL 10 (2009).