PGCon 2012
The PostgreSQL Conference
University of Ottawa, Ottawa
2012-05-15 to 2012-05-19 (5 days)
Final Release
Day change 09:00, timeslot duration 00:30

09:00 (duration 03:00), MRT 212
Mastering PostgreSQL Administration
workshop
In this two-part course you will learn the essential details of PostgreSQL configuration, security, maintenance, monitoring, tuning, backups, and recovery. The course is designed for people with experience in database administration who are new to the Postgres platform.
There will be a 1hr lunch break at noon.
The training will run for approximately two 3-hour sessions with several breaks during the class.
Every student is required to bring their own laptop running OS X, Linux, or Windows. It is recommended to download the PostgreSQL 9.x one-click installer.
During training participants will cover the following topics:
Introduction
Installation
Configuration
Security
File structure
Maintenance
Backup
Monitoring
Disk space computations
Hot standby and Replication
Disaster Recovery
Bruce Momjian, Robert Treat
http://momjian.us/main/presentations/overview.html#admin

13:00 (duration 03:00), MRT 212
Mastering PostgreSQL Administration
afternoon session
workshop, en
In this two-part course you will learn the essential details of PostgreSQL configuration, security, maintenance, monitoring, tuning, backups, and recovery. The course is designed for people with experience in database administration who are new to the Postgres platform.
There will be a 1hr lunch break at noon.
The training will run for approximately two 3-hour sessions with several breaks during the class.
Every student is required to bring their own laptop running OS X, Linux, or Windows. It is recommended to download the PostgreSQL 9.x one-click installer.
During training participants will cover the following topics:
Introduction
Installation
Configuration
Security
File structure
Maintenance
Backup
Monitoring
Disk space computations
Hot standby and Replication
Disaster Recovery
Bruce Momjian, Robert Treat
http://momjian.us/main/presentations/overview.html#admin

09:00 (duration 03:00), MRT 212
Getting Hot and Streamy with Postgres
Using Postgres' built-in replication facilities
workshop, en
An overview of Postgres' built-in replication system, starting from PITR and going to cascading replication, with demonstrations along the way. There will be more emphasis on the more recent and interesting technologies such as streaming replication and hot standby.
The talk starts with background on the WAL and how it enables all the technologies I am going to describe, then moves on to PITR, Warm Standby, Streaming Replication, Hot Standby, Synchronous Streaming Replication and Cascading Replication. I will have demonstrations along the way that build on each other as we get toward the more advanced methods. Most of the focus will be on the most practical ones, Streaming Replication and Hot Standby. The goal is that someone watching this tutorial will understand enough about how replication works in Postgres to implement and maintain it.
Phillip Sorber
OmniPITR
Configs from Demo

13:00 (duration 03:00), MRT 212
Tutorial - Configuring write-scalable PostgreSQL cluster
Postgres-XC primer and more
other, en
Postgres-XC (simply XC) is a write-scalable PostgreSQL cluster, which will be generally available by early May 2012.
So far, XC is the only write-scalable, symmetric database cluster solution available as open source.
This tutorial covers almost all the topics needed to use Postgres-XC, listed as follows:
1) Postgres-XC, what it is and what it is not
2) Postgres-XC elements -- Global Transaction Manager, Coordinator and Datanode
3) How to design a Postgres-XC cluster --- cluster configuration and table design
4) Build and installation
5) How to configure Postgres-XC
6) How to test Postgres-XC
7) Cluster-wide backup and restore
8) High availability and component failure
9) Postgres-XC as a community, be a developer!
Ashutosh Bapat, Koichi Suzuki, Michael Paquier

15:00 (duration 04:00), Royal Oak
Registration pickup
The social way to register: at the pub
other, en
Pick up your registration pack
Stop by the Royal Oak Pub on Laurier Street and get your registration pack. You'll help us avoid long lineups on Friday morning, and you get to have a beer and chat with your fellow attendees. We guarantee you'll spot someone famous.
Dan Langille

18:00 (duration 05:00), L152
Hacker Lounge
meet, greet, code, slack
other, en
A place to gather...
This is the place where many people will gather to work on their laptops, converse, code, hack, slack, and generally behave in cooperative ventures.
The times will vary, depending on when people gather. Wifi is available, but bring power strips and extension cords to share the wealth.
L152 is located in the [Residence](http://g.co/maps/8scp6), ground floor, just to the left as you pass by the front desk. Ask if you can't find it.
Dan Langille

09:30 (duration 01:00), MRT 212
Using PostgreSQL in modern enterprise web applications
Using HTML5, JavaScript, NodeJS with PostgreSQL
lecture, en
PostgreSQL's object-relational heritage makes it an outstanding choice for developing web applications that have a rich object model domain. See how PostgreSQL's object-relational features allow for building database models to support a NodeJS data service feeding a 100% JavaScript web client.
Topics include:
* The benefits of a rich domain model in enterprise software
- Comparing different architecture approaches: scripted model, table model and domain model
* The dissonance between relational databases and the object model
- Why NoSQL databases seem attractive
- Why relational is still the best choice for enterprise applications
- What is an "object relational" database?
* Using Postgres' object-relational features to reduce the friction
- Compound Types
- Querying an object hierarchy
- Object relational views
* Exploring an example
- Define a rich domain in a JavaScript MVC web client
- Build a data source using NodeJS using PostgreSQL JavaScript drivers
- Defining models in Postgres
- Extending models in Postgres
* Yes, it's got a Hemi
- Use Google's plV8js language in PostgreSQL to run JavaScript directly in the database
- Parse and process JSON payloads
John Rogelstad

11:00 (duration 01:00), MRT 212
MADlib
An open source machine learning library on RDBMS for Big Data age
lecture, en
MADlib is an open-source library for scalable in-database analytics. It
provides data-parallel implementations of mathematical, statistical and
machine learning methods for structured and unstructured data.
The MADlib
mission is to foster widespread development of scalable analytic skills,
by harnessing efforts from commercial practice, academic research, and
open-source development.
The library consists of various analytics methods including linear
regression, logistic regression, k-means clustering, decision tree,
support vector machine and more. That's not all; there is also a
super-efficient user-defined data type for sparse vectors with a number of
arithmetic methods. It can be loaded and run in PostgreSQL 8.4 to 9.1 as well as Greenplum 4.0 to 4.2.
This talk gives an overview of the concept, introducing the
problems we are tackling and our solutions to them. It will also touch on
parallel data processing, which is a very hot topic in both
research and industry these days.
Hitoshi Harada
MADlib homepage

12:00 (duration 01:00), MRT 212
Schemaverse
Learn more about the tournament format, available prizes, game mechanics or even simply discuss the idea
contest, en
Compete against your fellow PostgreSQL users for prizes and the honor of the Schemaverse Champion title.
If you would like to learn more about the tournament format, available prizes, game mechanics or even simply discuss the idea further, we welcome you to join the game's creator, Joshua 'Abstrct' McDougall, for some pizza in Room MRT-212 at 12:30 on Thursday.
Josh 'Abstrct' McDougall
The Schemaverse
Schemaverse tutorial

13:00 (duration 01:00), MRT 212
Database Ops
Easy and Effective Operation for production systems with PostgreSQL
workshop, en
The NTT (Nippon Telegraph and Telephone) group has been working to introduce PostgreSQL into its production systems, which are large and mission-critical. In doing so, we found that the lack of operational tools for PostgreSQL can be an obstacle. So we have developed tools for backup, data loading, and performance monitoring. In the talk, we will introduce these tools and show how to improve database operation using them.
The NTT group, the largest telecommunications carrier in Japan, serving more than 120 million subscribers, has been working to introduce PostgreSQL into the production systems that support its telecommunication services.
When we apply PostgreSQL to a production system, we have to do many housekeeping chores: at the beginning, you have to load initial data into PostgreSQL, and once operation starts, you have to take periodic backups to protect against data loss caused by media failure.
Many proprietary DBMSs provide operational and/or management tools to make a DBA's work easier and more efficient. Such tools also provide an easy, standard way of performing operations, which enables engineers who are not highly skilled to manage database systems well.
For PostgreSQL, not enough such tools are available, which may be an obstacle to introducing PostgreSQL into enterprise systems. So we have developed operational tools for taking backups, loading data, and monitoring performance. We also provide technical know-how about appropriate operations, so that an engineer who is not familiar with PostgreSQL can manage it well.
The talk will introduce our daily operation activities with the assistance of these tools: pg_rman for taking backups, pg_bulkload for data loading with data cleansing, and pg_statsinfo for performance monitoring. Through this talk we hope to share PostgreSQL operational know-how with many DBAs.
Tetsuo Sakata

14:30 (duration 01:00), MRT 212
Range Types and Temporal: Past, Present, and Future
en
Range Types didn't exist before, why do we need them now? How do they work? Why is "Temporal" important if we already have timestamps? How do we apply these concepts before deploying PostgreSQL 9.2? What's left to be done, and what solutions are in the works?
I'll be asking the audience these questions, so -- Err... I mean: I will be answering these questions during the talk.
Extensions, changes to core PostgreSQL, and future ideas will be described in the context of solving a simple use case from 2006. These ideas build up to the larger point that powerful types are important, and database systems should do more to support them.
Jeff Davis

09:00 (duration 00:30), MRT 218
Keynote
lecture, en
One thing is certain: databases in the future are not going to look much like today.
At Heroku, we are host to hundreds of thousands of databases that back applications which range from Berkeley class projects to Super Bowl ad campaigns. From this unusual position we have developed a unique perspective about how the data landscape is changing and can offer some thoughts on how PostgreSQL can change the way people build applications.
Slides at: http://pgcon-2012-keynote.herokuapp.com
Peter van Hardenberg
slides

09:30 (duration 01:00), MRT 218
9.2: Full Throttle Database
New Feature Grand Prix
lecture, en
Gentlemen, start your database engines! PostgreSQL 9.2 beta is here, and it's faster and more exciting than ever before. Come down to the track and join us for a high-speed tour of the next version like never before!
Gentlemen, start your database engines!
PostgreSQL 9.2 beta is here, and it's faster and more exciting than ever before. Come down to the track and join us for a high-speed tour of a database which is faster than ever before!
Starting at pole position, we will whip around the features of version 9.2, speeding through one demo after another, including:
* cascading replication
* enhanced vertical scalability
* improved performance
* index-only access
* range types
* JSON support
* better live DDL deployments
* new administrative views
It's the fastest PostgreSQL yet, and you have a shotgun seat!
Josh Berkus

11:00 (duration 01:00), MRT 218
"Cheap, Fast AND Good" .... CHECK
A checklist for database replication shoppers.
lecture, en
PostgreSQL's built-in replication has been available for a while now, yet all the previous solutions enjoy ongoing popularity. For the experts, this is hardly surprising: A) because none of the solutions was ever meant to replace anything else, and B) because replication is one single term for several different attempts at solving a subset of many different problems. For IT decision makers this can be rather confusing. As with so many things, when looking for the right replication solution, people often don't know what they are really looking for. They look over the feature lists of products and try to determine from that which product best fits their needs. But unless they actually know what they need, how is the feature list going to help them?
It works the other way around. We need to look at all the problems that can be solved with database replication in general, and identify which are or may become relevant in the case at hand. Then find the solution that solves most of them, by priority. Only once we have a prioritized list of problems to solve does the feature list of products start making sense.
This talk discusses high-level features of database replication systems and presents them in the form of use cases. These usage patterns are what will drive your decision when you look for the right replication solution to your problem(s).
Jan Wieck

13:00 (duration 01:00), MRT 218
Large Scale MySQL Migration
to PostgreSQL!
en
Once a Top-10 internet audience site. 32 million users. Billions of photos and comments, more than 6TB of them. Migrating away from MySQL to PostgreSQL!
This talk will share hindsight about the why and the how of that migration, which problems couldn't be solved without moving away, and how the solution now looks. It will cover the tools and methods used for migrating the data and will detail the new architecture. And the new home, in the cloud!
On the technical side of things, we will be talking about MySQL, mysqltocsv, pgloader, pljava, Google Protocol Buffers, pgbouncer, plproxy, PostgreSQL, pghashlib, walmgr, streaming replication. And Amazon hosting facilities too (EBS for starters).
Dimitri Fontaine

14:30 (duration 01:00), MRT 218
Unlocking the Postgres Lock Manager
lecture, en
Locking is critical for providing high concurrency for any database; you cannot fully utilize your hardware if locking is throttling its use. This talk explores all aspects of locking in Postgres by showing queries and their locks; covered lock types include row, table, shared, exclusive, and advisory lock types. The high concurrency provided by Multiversion Concurrency Control (MVCC) is also covered.
Bruce Momjian
http://momjian.us/main/presentations/internals.html#locking

16:00 (duration 01:00), MRT 218
Lightning talks
Short sharp descriptions of short topics
lightning, en
A regular feature, PGCon will have a Lightning talks session, with presentations on diverse topics.
The format remains essentially the same: in a one hour period, audiences are entertained and informed by a rapid fire series of short talks on interesting new or on-going work by individuals or groups. Slides are permitted, but not obligatory; pictures are highly recommended. Topic areas include new open source software projects, works in progress for future releases of existing projects, student projects, etc. Lightning talk topics this year may make good conference papers next year!
The number of slots is limited, and experience suggests there will be more takers than slots. Sign up well in advance to be assured a spot. The session chair this year is yet to be decided.
Our tentative list of talks is:
* Security, Why it's Awful and How To Fix It - David Fetter
* Finishing Your PostgreSQL Talk On time - Greg Smith
* PostgreSQL China PUG - Galy Lee
* pgBadger - Gilles Darold
* PostGIS 2.0 and CartoDB 1.0 - Javier de la Torre
* Seven Deadly Sins of Deployment - Josh Berkus
* pg_extractor - Keith Fiske
* Improving the PostgreSQL Experience on the Mac, One App at a Time - Mattt Thompson
* pg_stat_statements - Peter Geoghegan
* What to Do with a Cray Supercomputer - Stephen Frost
... however, this is subject to change up until the Lightning Talks actually begin.
Galy Lee, Josh Berkus, Magnus Hagander
List of talks + slides

09:30 (duration 01:00), MRT 219
Writing a foreign data wrapper
Experiences with Informix
lecture, en
Writing a foreign data wrapper (FDW) for PostgreSQL seems easy. However, there are many pitfalls.
This talk will cover experiences from writing an FDW for Informix and will discuss problems with client libraries, data type mapping, optimizer support and performance-related topics. Interested attendees will get a short overview of what they can expect from an FDW and (hopefully) learn something to do it better ;)
Bernd Helmle

11:00 (duration 01:00), MRT 219
Schemaless SQL
The Best of Both Worlds
lecture, en
Schemaless databases are a joy to use because they make it easy to iterate on your app, especially early on. And to be honest, the relational model isn't always the best fit for real-world evolving and messy data.
On the other hand, relational databases are proven, robust, and powerful. Also, over time as your data model stabilizes, the lack of well-defined schemas becomes painful.
How are we supposed to pick one or the other? Simple: pick both. Fortunately recent advances in Postgres allow for a hybrid approach that we've been using at Heroku. The hstore datatype gives you key/value in a single column, and PLV8 enables JavaScript and JSON in Postgres. These and others in turn make Postgres the best document database in the world.
We will explore the power of hstore and PLV8, explain how to use them in your project today, and examine their role in the future of data.
Will Leinweber

13:00 (duration 01:00), MRT 219
Finding Similar
Effective similarity search in database
lecture, en
Finding similar objects is a ubiquitous task in the day-to-day work of developers of information services. We present a PostgreSQL extension that provides an effective way to find similar objects in a database, as well as several usage examples. The extension provides several methods to calculate set similarity and a similarity operator with indexing support based on the GiST and GIN frameworks.
Similarity search in large databases is an important issue in today's information services, such as recommender systems. A naive implementation is slow and resource-consuming. We developed a PostgreSQL extension, called smlar, which provides several methods to calculate set similarity (all built-in data types are supported) and a similarity operator with indexing support based on the GiST and GIN frameworks. Set similarity means that smlar isn't about content similarity (it isn't interested in the nature of the objects), but about the similarity of sets. One example is a recommender system, which produces a list of recommendations based on collaborative and/or content filtering (Amazon, one of the most popular electronic commerce companies, provides recommendations based on item-item similarity). Content filtering uses a set of discrete metadata of an object to build a recommendation list of additional objects with similar properties, while collaborative filtering uses information about a user's past behaviour and similar decisions made by other users to predict objects that the user may be interested in. The smlar extension was developed with collaborative filtering in mind. It provides several methods to compute similarity between sets: jaccard, cosine and tfidf.
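To make the set-similarity measures named above concrete, here is a minimal Python sketch of Jaccard and cosine similarity over plain sets, plus the kind of brute-force scan that an index is meant to avoid. It uses the textbook definitions and is not smlar's implementation; the function names, the `purchases` data and the 0.5 threshold are made up for illustration, and tfidf is left out because it needs corpus-wide statistics.

```python
import math


def jaccard(a, b):
    """Jaccard similarity: |A & B| / |A | B| (0.0 when both sets are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(a, b):
    """Cosine similarity of two sets seen as binary vectors: |A & B| / sqrt(|A| * |B|)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def most_similar(query, candidates, measure=jaccard, threshold=0.5):
    """Brute-force scan: score every candidate, keep those at or above threshold, best first."""
    scored = [(name, measure(query, items)) for name, items in candidates.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: -pair[1])


# Hypothetical data: the set of item ids each user has bought.
purchases = {
    "alice": {1, 2, 3, 4},
    "bob": {2, 3, 4, 5},
    "carol": {7, 8, 9},
}

print(most_similar({1, 2, 3}, purchases))
```

The scan touches every candidate set, so its cost grows linearly with the number of rows; that is the naive approach the abstract contrasts with index-supported search.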
Experiments with generated and real data sets show a considerable advantage of using the smlar extension compared with a brute-force approach.
Oleg Bartunov, Teodor Sigaev

14:30 (duration 01:00), MRT 219
PL/R Tricks
Server Monitoring with Predictive Analytics
lecture, en
We will present the results of an investigation into the use of PostgreSQL and PL/R in conjunction with a server monitoring application to perform predictive analytics of server performance.
Usually server monitoring is reactive in nature. Some threshold is exceeded, and an alert is sent. By the time you receive the alert, something bad has already happened. Wouldn't it be nice to be able to foresee trouble before it rears its ugly head?
We will investigate the feasibility of applying both well-established, relatively simple forms and more advanced forms of dynamic statistical analysis to server monitoring to allow more proactive server management. The tools used will be PostgreSQL, R, and PL/R.
Jeff Hamann, Joe Conway

18:00 (duration 03:00), Out and about
Major Social Event!
sponsored by EnterpriseDB
other, en
EnterpriseDB invites all PGCon attendees to a big evening with drinks, appetizers, dinner and music on Thursday May 17th at [My Condo](http://mycondoottawa.ca/), just minutes from the conference venue in the Byward Market. We have exclusive use of My Condo between 6:00 and 9:00 pm, meaning that all 4 floors, including the 4th floor patio, will be ours to move around, mingle and catch up with fellow attendees. Food stations will be available on multiple floors, with BBQ burgers, Jambalaya, Fettuccini Primavera and Honey Mustard Chicken, so come hungry!
NOTE: Remember to bring your conference badge with you as it will be your ticket to entry at [My Condo](http://mycondoottawa.ca/).
For directions, please consult the [official conference map](http://g.co/maps/rzxqq).
MyCondo turns into a night club at 9:30pm, and everyone is welcome to stay free of charge and enjoy the evening - [check it out](http://mycondoottawa.ca/nights/)
Dan Langille
Map

21:00 (duration 03:00), L152
Hacker Lounge
meet, greet, code, slack
other, en
A place to gather...
This is the place where many people will gather to work on their laptops, converse, code, hack, slack, and generally behave in cooperative ventures.
The times will vary, depending on when people gather. Wifi is available, but bring power strips and extension cords to share the wealth.
L152 is located in the [Residence](http://g.co/maps/8scp6), ground floor, just to the left as you pass by the front desk. Ask if you can't find it.
Dan Langille

09:00 (duration 01:00), MRT 205
WAL Internals Of PostgreSQL
lecture, en
Describes the Write-Ahead Log (WAL) internals of the PostgreSQL system, and improvements that could be made to the WAL system to improve performance.
PostgreSQL uses WAL files to perform crash recovery, Point-In-Time Recovery and streaming replication.
This article will cover details of the WAL system in PostgreSQL, such as what kinds of WAL records are generated by DML operations, how WAL files are named, and what they contain.
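As a small illustration of the WAL file naming mentioned above: a WAL segment file name is 24 uppercase hexadecimal digits, read as three 8-digit fields (timeline ID, log number, segment number within that log). The Python sketch below only splits a name into those fields; the function name and the dictionary keys are hypothetical, and it does not cover other files found in the WAL directory, such as timeline history or backup label files.

```python
import re

# 24 uppercase hex digits: 8 for the timeline, 8 for the log, 8 for the segment.
WAL_SEGMENT_NAME = re.compile(r"[0-9A-F]{24}")


def parse_wal_name(name):
    """Split a WAL segment file name into its three numeric fields."""
    if not WAL_SEGMENT_NAME.fullmatch(name):
        raise ValueError(f"not a WAL segment file name: {name!r}")
    return {
        "timeline": int(name[0:8], 16),
        "log": int(name[8:16], 16),
        "segment": int(name[16:24], 16),
    }


print(parse_wal_name("000000010000000A000000FE"))
```

Because the fields are fixed-width hexadecimal, sorting segment names as plain strings within one timeline also sorts them in WAL order.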
The details of asynchronous commit and how the WAL system protects against partial page writes are also covered.
Finally, some advantages, disadvantages, and possible improvements to the PostgreSQL WAL system, relative to other RDBMSs, that could improve its performance are discussed.
Amit Kapila

10:00 (duration 01:00), MRT 205
Dear SQL Server, I'm filing for divorce.
Falling in love with the free spirit of Postgres
lecture, en
Using the StackOverFlow datasets, we'll ditch all the drama of a Microsoft stack and convert from SQL Server to Postgres on Windows. Once we do that, we'll migrate our entire DB and Web App from Microsoft to Linux using Postgres and Mono with as few code changes as possible.
Having the StackOverFlow dataset loaded into SQL Server and a mock StackOverFlow app in ASP.NET MVC3, we are going to show various ways to ETL into Postgres from SQL Server on Windows. Once that is done, we'll go over some basics of going from Postgres on Windows to Postgres on Linux as we attempt to migrate our app. Once we get our back-end moved, we'll show just how easily you can wire up ASP.NET MVC3 to Postgres and then move our entire stack to Linux using Nginx and Mono. Since I am a SQL Server DBA, I will also be adding lots of opinion on where Postgres really shines compared to SQL Server and where it doesn't. This session will be informative, entertaining and incredibly nerdy.
Rob Sullivan

11:30 (duration 01:00), MRT 205
Monitoring Ozone Levels with Postgresql
Database Streaming Replication and Monitoring
lecture, en
Postgres is used to manage data from the Ozone Monitoring Instrument aboard NASA's Aura spacecraft. The database implementation must handle large volumes of complex data transmitted continually from the satellite and generated by processing-intensive analyses performed by a team of atmospheric scientists. This talk will describe the architecture and some of the challenges faced.
Focus will be given to our replication efforts, software developed for monitoring, and ongoing work to create a decentralized network of services communicating through a RESTful interface.
NASA and its international partners operate several Earth observing satellites that closely follow one after another along the same orbital track. This coordinated group of satellites is called the Afternoon Constellation, or "A-Train" (http://atrain.nasa.gov/) for short. Four satellites currently fly in the A-Train: Aqua, CloudSat, CALIPSO, and Aura. Each satellite has one or more observational instruments that are used together in the construction of high-definition three-dimensional images of the Earth's atmosphere and to monitor changes over time. Aura's instruments include the Ozone Monitoring Instrument (OMI). Data management and processing services for data harvested by OMI are provided by the OMI Science Support Team headquartered at Goddard Space Flight Center.
Raw OMI data is received and initially processed at a ground station in Finland, then ingested into the system, where it is analyzed by scientists who submit processing jobs. Earth Science Data Types (ESDTs) are the products of these jobs, and one of the principal types of data managed in the database. Complex and abstract, ESDTs represent the interface between the raw science data and the data management system, and more than 900 are currently defined.
Our current database implementation includes 10 clusters, each running Postgres 9.0.4, and divided into three production levels: development, testing, and operations. The central operations cluster handles on average about 200 commit statements per second, contains tables as large as 160 million rows, and is configured for streaming replication. New data is continually being added to the system, and the total quantity is increasing at a rate of about 60% per year. This influx of data, in addition to scientific analyses, can cause the load on the database to vary suddenly, and monitoring software has been developed to provide early warning of potential problems.
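The two rates quoted above can be turned into rough planning numbers. The Python sketch below assumes the 200 commits per second average holds around the clock and that the 60% yearly growth compounds at a constant rate; both are simplifying assumptions for illustration, not claims from the talk, and `projected_volume` is a hypothetical helper.

```python
import math

commits_per_second = 200  # average quoted for the operations cluster
annual_growth = 0.60      # yearly growth in total data quoted above

# Commits per day, if the average rate held for all 24 hours.
commits_per_day = commits_per_second * 60 * 60 * 24

# Years for the data volume to double under constant compound growth.
doubling_years = math.log(2) / math.log(1 + annual_growth)


def projected_volume(current, years, growth=annual_growth):
    """Volume after `years` of compound growth, in the same unit as `current`."""
    return current * (1 + growth) ** years


print(commits_per_day)
print(round(doubling_years, 2))
print(round(projected_volume(100, 2), 1))
```

Under these assumptions the cluster sees about 17 million commits a day and the stored data doubles roughly every year and a half.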
The latest implementation of our software architecture uses decentralized services communicating through a RESTful interface. Databases are bundled together with their software component, and schema changes are managed using patch files. A utility has been created to apply the patches, and ensure schema consistency as the databases are amended. Perl's Rose-DB is used as an object-relational mapper, and database queries, via HTTP requests, are supported by encoding the query information into JSON. The new platform uses a different data model, making it necessary to sync between the two representations, and causing some difficulty with data duplication.
Alex Ming Lai, Marty Brandon

13:30 (duration 01:00), MRT 205
OLTP Performance Benchmarks Overview
lecture, en
Learn about various OLTP performance benchmark kits, when to use them and what to know when comparing numbers with other databases.
pgbench is widely used for micro-benchmarking PostgreSQL. However, many customers use benchmarks based on the databases prevalent in their environment.
MySQL - sysbench
SQLServer - TPC-C, DVDStore
Oracle - TPC-C
and so on.
In this session we look at OLTP benchmarks such as sysbench, dbt2 (TPC-C like), BenchmarkSQL (TPC-C like), DVDStore, etc., and see how to optimize PostgreSQL for them, along with points to consider when comparing results with other databases.
Jignesh K. Shah

15:00 (duration 01:00), MRT 205
Running libraries on PostgreSQL
The Evergreen library system's (ab)use of PostgreSQL
en
Launched by the Georgia Public Library System in September 2006 to manage and circulate materials through a consortium of over 200 public libraries, Evergreen has since been adopted by over 1,000 public and academic libraries across more than 30 states and provinces. From the beginning, Evergreen's distributed architecture has bet heavily on PostgreSQL features, relying on custom functions, triggers and rules, full-text search, XML support, inheritance, and recently HSTORE to provide reliable high-performance support for the day-to-day operations of libraries. One of the core developers of the Evergreen library system describes how the project tries to use PostgreSQL to its fullest, some of the lessons we have learned, and some of the challenges we face:
* Scalability success stories
* Replication then and now
* Normalizing metadata to the exacting/arcane rules of librarians
* TEXT vs. XML vs. MARC(XML|21)
* Full-text search challenges
* Searching across multiple languages
* Relevance, configurability, and performance
* Schema evolution and testing
Dan Scott
Evergreen project home

16:30 (duration 01:00), MRT 205
Simple SQL Change Management with Sqitch
lecture, en
SQL change management has always sucked. This talk introduces Sqitch, the VCS-aware SQL change management application that doesn’t suck. Come see how it works, learn the few simple rules you need to get the most out of it, and liberate yourself from the suckitude.
SQL change management is hard. Most “migration”-style implementations require opaque naming conventions, prefer DSLs that cover a fraction of SQL, and require duplication of code for simple changes to existing functions. Such does not have to be. And now it’s not.
Introducing [Sqitch](http://sqitch.org/), simple SQL change management that doesn’t suck. Sqitch doesn’t care what programming language your app is written in. It has no opinions as to what database to use or what its schema should look like. And it doesn’t require sequentially-named migration scripts or the use of any DSL other than SQL. Sqitch lets you write SQL migration scripts that target *your* database, and provides a simple, unintrusive interface for specifying dependencies, so that it can run things in the proper order.
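The idea of running changes "in the proper order" from declared dependencies can be sketched as a topological sort. The Python example below is a generic illustration, not Sqitch's implementation or file format: the change names are hypothetical, and it relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical changes: each name maps to the set of changes it depends on.
changes = {
    "users_table": set(),
    "orders_table": {"users_table"},
    "order_totals_view": {"orders_table", "users_table"},
}

# Deploy dependencies before the changes that need them.
deploy_order = list(TopologicalSorter(changes).static_order())

# Revert in the opposite order, so nothing is removed while still depended on.
revert_order = list(reversed(deploy_order))

print(deploy_order)
print(revert_order)
```

If the declared dependencies contain a cycle, `static_order()` raises `graphlib.CycleError`, which is a reasonable point at which to refuse to deploy.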
Best of all, when used with a version control system (initially Git), you can even modify idempotent deployment scripts between releases. Sqitch recognizes such changes, and automatically knows how to revert to earlier versions if required. And finally, Sqitch supports simple acceptance testing, so that you can be sure that your deployments are successful, and, if not, revert them.
So come to this talk to learn all about Sqitch: How it works, where to get it, and how to get the most out of managing database deployments.
David E. Wheeler
Sqitch
GitHub
CPAN
Tutorial
09:0001:00MRT 212Making your own mapsAn introduction in using free Geospatial datalectureenPostGIS is an extension to PostgreSQL that turns PostgreSQL into a superb spatial database. Storing spatial data in PostgreSQL is a great way to use up the space on your SSDs; however, using the data to make maps is much more fun. This talk is aimed at people with limited GIS experience and will talk about how to use OpenStreetMap data for map making. We will tell you how you can get free geo-spatial data from OpenStreetMap and how it can be loaded into a PostGIS database. Common methods of using and accessing your data will be discussed, including:
* Open Source desktop GIS software
* Generating custom map tiles for use on your website
* Making pretty paper maps.
This talk will introduce common tools and techniques used with PostGIS when working with OpenStreetMap data. This is a user-focused talk suitable for people who have next to no GIS background.
Steve Singer10:0001:00MRT 212On snakes and elephantsUsing Python with and in PostgreSQLlectureenPython is one of the most popular application programming languages and there's a plethora of PostgreSQL libraries and utilities for Python.
This talk will try to give an overview of the contemporary Python-PostgreSQL landscape in a way that's useful both for Python programmers starting on a PostgreSQL project and DBAs dealing with what those programmers wrote. We'll try to cover a slightly opinionated selection of libraries, frameworks and technologies and give some recommendations. The richness of the environment is sometimes confusing. Python people starting with PostgreSQL often don't know which driver or ORM library they should be using. Sometimes they're not aware of all the things PostgreSQL can offer to a Python programmer and the tools available.
On the other hand, DBAs sometimes need to debug Python programs (mis)using their database, and PostgreSQL-savvy people join or consult on projects written in Python and need to have at least a basic understanding of how Python works, particularly on the database connection front.
We'll try to make both of these groups a bit more comfortable when dealing with the other. The talk will cover available drivers, focusing especially on psycopg2 and some of its lesser-known features, and ORM libraries, focusing mainly on SQLAlchemy. We'll also discuss PL/PythonU, the possibilities it opens, along with some best practices and caveats.Jan Urbański11:3001:00MRT 212Hooks in PostgreSQLlectureenPostgreSQL's extensibility is well known. Most people have heard of user types, user operators, the new extension capability, and such. But few know about hooks in PostgreSQL. This talk will cover all kinds of hooks available in PostgreSQL, and will show some tools using them already. Since the 8.3 release, the PostgreSQL developers have added many hooks to PostgreSQL. Some extensions already make use of such hooks in the planner and in the executor. pg_stat_statements is one of the various examples available. This talk will give a broad overview of the hook system, and how to use it. We'll also see some of the extensions making use of them.Guillaume Lelarge13:3001:00MRT 212Improving foreign key concurrencyTo lock and not to blocklectureenRow locking is a mechanism that lets Postgres maintain strict consistency in certain database constraints, such as foreign keys. However, Postgres has historically only provided share and exclusive row locking, which I'll show to have significant drawbacks for concurrency. To solve the concurrency problem, two new row lock types are being introduced in release 9.2: SELECT FOR KEY SHARE and SELECT FOR KEY UPDATE. In this talk I'll explain how this new locking came to be, how it works, and how it helps significantly improve concurrency in applications.Álvaro Herrera15:0001:00MRT 212Index support for regular expression searchlectureenRegular expressions (regex) are a powerful tool for text processing. When dealing with large string collections it's important to search quickly on those collections (i.e. search using an index).
Indexing for regex search is quite a hard task. This talk presents a novel technique (and a WIP patch for PostgreSQL implementing it) for regex search using trigram indexes. The proposed technique provides more comprehensive trigram extraction than its analogues, i.e. higher performance.
There are two existing approaches to index-based regex search. The FREE indexing engine is based on extracting contiguous text fragments from the regex and performing a substring search. The Google Code Search approach presents a more sophisticated recursive analysis of the regex, extracting various regex attributes. This talk presents a novel technique of regex analysis that is based on automata transformation rather than analysis of the original regex. The superiority of the proposed technique will be demonstrated by examples and tests.
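For readers new to the idea, the following Python sketch shows the simplest form of trigram-filtered regex search: a literal fragment that every match must contain is used to narrow the candidates before the regex runs. It is closer to the fragment-extraction style of existing approaches than to the automata-based technique of the talk, and it is not the pg_trgm implementation (no padding or case folding). All function names are hypothetical, and the required fragment is supplied by hand rather than derived from the regex.

```python
import re

def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}

def build_index(strings):
    """Map each trigram to the ids of the strings that contain it."""
    index = {}
    for sid, s in enumerate(strings):
        for t in trigrams(s):
            index.setdefault(t, set()).add(sid)
    return index

def search(strings, index, pattern, required_literal):
    """Run the regex only on strings that contain every trigram of required_literal.

    A literal shorter than three characters yields no trigrams, so the
    search falls back to checking every string.
    """
    candidates = set(range(len(strings)))
    for t in trigrams(required_literal):
        candidates &= index.get(t, set())
    rx = re.compile(pattern)
    return [strings[i] for i in sorted(candidates) if rx.search(strings[i])]

strings = ["postgresql", "postgis", "mysql", "sqlite"]
index = build_index(strings)
hits = search(strings, index, r"post.*sql", "post")
```

Here the index narrows four strings to the two containing both `pos` and `ost`, and the regex then rejects `postgis`. The quality of such a filter depends entirely on how many trigrams can be extracted from the regex, which is the problem the abstract is concerned with.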
The talk will be organized as follows:
* Introduction.
* Regular expressions
* Finite automata
* pg_trgm contrib module
* Existing techniques for index-based regular expression search
* FREE indexing engine
* Google Code Search
* Proposed technique
* Description
* Examples
* Comparison with analogues
* Limitations
* Performance resultsAlexander Korotkov
WIP patch
Junghoo Cho, Sridhar Rajagopalan "A Fast Regular Expression Indexing Engine"
How Google Code Search Worked
Video of talk
16:3001:00MRT 212The PostgreSQL replication protocol, tools and opportunitieslectureenThe new binary replication protocols and tools in PostgreSQL 9.0 and 9.1 are a popular new feature - but they can also be used for other things than just replication! PostgreSQL 9.1 includes server changes to allow standalone tools to request and respond according to the replication protocol. From these, tools like pg_basebackup allow a number of new possibilities. And the infrastructure put in place in 9.1 opens opportunities for further enhancements - some already on the drawing board and some just wild ideas so far.
Magnus Hagander09:0001:00MRT 218The Horizontal StruggleImproving the Experience of Scale-OutlectureenHorizontal scale-out of applications using Postgres is typically a time-consuming, expensive, error-prone task. In spite of that, horizontal scale-out is achievable, even if not well-supported by the logical constructs Postgres exposes.
This talk is intended to share what we've learned from both our experiences at Heroku and, more importantly, the litany of customers that we are privileged to talk to about their problems. From these, a few choice gaps in functionality are highlighted for improvement. The era of horizontal scale-out has long been upon us. While some productively continue to outrun the problem by leveraging Moore's Law, others already have made the jump to fully distributed data management systems to achieve better scalability, availability, and latency around the globe. Some enthusiasts have even gone so far as to say that relational models will not gracefully survive and grow in the coming era.
The author of this talk is skeptical of this prediction, but acknowledges there are painful gaps in functionality that exist in all known generally-available production-class relational database systems in the domain of enabling scale-out of applications. He also thinks those gaps are, in all likelihood, solvable, without a huge upheaval to the implementation of Postgres or applications written against it. Here, he will attempt to draw attention to:
* The relationship between relational models, ACID, and usability
* The sacrifices in usability made by most distributed data management software not intrinsic to their advantages
* The surprisingly few basic use-cases required by most people struggling with horizontal scalability
* The current state of the art in using Postgres as a member in a distributed system
* Choice weaknesses to make progress on, and sketches of mechanisms to address them
Daniel Farina10:0001:00MRT 218A Batch of Commit BatchinglectureenA database commit can be the most expensive single operation that
its users have to wait for. Recent trends in the database industry
have proven some applications are willing to accept durability loss,
when it must be sacrificed to reach performance goals. And an inevitable
downside of more durable approaches like Synchronous Replication is
their impact on server commit speed.
Some of the fundamental limitations here are physical ones: disk rotation,
network performance, and the speed of light. Recent performance improvements
for PostgreSQL 9.2 aim at getting closer to the theoretical best
possible behavior here in every situation. It's more important than ever
to tell when the limit you're hitting is a physical one, and when it's
something you can address with a software change. Controlling commit
batch size and the number of concurrent clients is getting even more
important as PostgreSQL is deployed onto cloud and other virtual hardware
environments. Four of the fundamental factors going into how expensive a commit is are
atomicity, consistency, isolation, durability, collectively referred to
as ACID. PostgreSQL has always respected the durability aspects of ACID
compliance. Extending that to reach onto multiple servers can significantly
expand the suitability of the database for business critical applications.
It will cost you though. The question isn't just how much durability
you want; it's how much durability can you afford?
The innovative design used in PostgreSQL doesn't force you to make this
sort of decision at the database level. Every individual commit can
specify its durability requirements at any time, even in the middle
of a transaction. Being able to classify your need at such a fine level
allows PostgreSQL an unprecedented range of options in this area.
Mission critical data that needs multi-node synchronous commit can
coexist with high volume/best effort data, with each transaction
fine-tuned to its position in the reliability vs. speed trade-off
spectrum.
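As a rough sketch of what per-transaction durability can look like from a client, the Python below only builds the SQL text for one transaction; it does not connect to a server. `synchronous_commit` is a real PostgreSQL setting and `SET LOCAL` scopes a change to the current transaction, but the helper name `transaction_statements`, the table names, and the restriction to the 9.1 values `on`, `local`, and `off` are assumptions made for illustration.

```python
# Values accepted by synchronous_commit in PostgreSQL 9.1 (the scope assumed here).
LEVELS = ("on", "local", "off")

def transaction_statements(body, level):
    """Return the SQL statements for one transaction committed at the given level.

    SET LOCAL is emitted after the body to show that the durability level
    can be chosen mid-transaction, right before COMMIT.
    """
    if level not in LEVELS:
        raise ValueError("unknown synchronous_commit level: " + level)
    return ["BEGIN"] + list(body) + [
        "SET LOCAL synchronous_commit TO " + level,
        "COMMIT",
    ]

# Hypothetical tables: payments must survive a crash, click_log is best effort.
payment = transaction_statements(["INSERT INTO payments VALUES (1, 100)"], "on")
click = transaction_statements(["INSERT INTO click_log VALUES (1, 'home')"], "off")
```

Both transactions can run against the same database, which is the coexistence of mission critical and best effort data described above.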
There's a second factor to consider too: client count. The Synchronous
Replication implementation used for PostgreSQL 9.1 makes it possible
to increase total aggregate commit throughput by scaling up the
number of concurrent clients. Improvements in progress for PostgreSQL
9.2 take that basic idea and apply it more aggressively to local
commits as well. Per-client commit behavior is
becoming an increasingly important bottleneck to understand and design
against.
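The effect of client count on aggregate throughput can be illustrated with a deliberately idealized model in Python. The assumptions are mine, not the speakers': every log flush takes a fixed time, each client has at most one commit outstanding, and with group commit a single flush completes the commit of every waiting client. The function name and the 8 ms flush time are hypothetical.

```python
def commits_per_second(clients, flush_ms, group_commit):
    """Idealized aggregate commit rate for a given flush time and client count."""
    flushes_per_second = 1000.0 / flush_ms
    # Without batching each flush commits one transaction; with it, one per client.
    commits_per_flush = clients if group_commit else 1
    return flushes_per_second * commits_per_flush

# Assumed 8 ms per flush, roughly one rotation of a 7200 RPM disk.
single = commits_per_second(1, 8.0, group_commit=True)
unbatched = commits_per_second(16, 8.0, group_commit=False)
batched = commits_per_second(16, 8.0, group_commit=True)
```

In this model a single client never exceeds 125 commits per second however the server is tuned, while sixteen batched clients reach 2000 in aggregate. Real systems fall short of the ideal, which is why benchmarking commit speed against client count appears in the topic list.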
Topics covered will include:
* Components of commit latency
* Application batch commits
* Benchmarking commit speed vs. client count
* Local commit durability options and performance
* Improvements in progress for PostgreSQL 9.2 group commit performance
* Remote server commit latency
* Synchronous Replication commit options and performance
* Per-transaction commit durabilityGreg SmithPeter Geoghegan11:3001:00MRT 218Moving Day: Migrating Big Data from A to BlectureenIn 2011 we moved the Mozilla crash reporting system from old creaky hardware in San Jose to a new shiny datacenter in Phoenix. This system contains more than 40TB of data in HBase, the Hadoop database, and PostgreSQL. The data collecting app has a requirement for close to 100% uptime. On top of that we have data processing, an API, and a webapp. After many months of work, the migration went seamlessly.
In this session we’ll talk about:
- The checklist manifesto, reprised, and understanding the critical path
- How to move all that data in a reasonable timeframe
- The importance of devops culture in success
- Automating packaging and configuration and how it will save you
- Understanding the difference between old and new platforms: correctness testing, load testing, and smoke testing
Attendees should walk away with an outline of everything they’ll need to do to achieve a successful data center migration.Laura Thomson13:3001:00MRT 218PostgreSQL on AWSEC2 with somewhat reduced tearslectureenAmazon Web Services (AWS) has become a very popular platform for deploying PostgreSQL-backed applications. But it's not a standard hosting platform. We'll talk about how to get PostgreSQL to run efficiently and safely on AWS. Among the topics covered will be:
-- Selecting an EC2 instance size, and configuring it for PostgreSQL.
-- Dealing with ephemeral instance storage: What is it good for? How much do you need?
-- Elastic Block Store: How much do you need? How do you configure it for best performance?
-- AWS characteristics and quirks.
-- Why replication is not optional on AWS.
-- Backups and disaster recovery.Christophe Pettus15:0001:00MRT 218Performance Improvements in PostgreSQL 9.2Bigger servers, bigger problemslectureenThe upcoming PostgreSQL 9.2 release features a large number of performance enhancements by many different authors, including heavyweight lock manager improvements, reduced lock hold times in key hot spots, better group commit, index-only scans, better write-ahead log parallelism, sorting improvements, and a userspace AVC for sepgsql.In this talk I'll give an overview of what was changed, how it helped, lessons learned, and the challenges that remain.Robert Haas16:3001:00MRT 218Big Bad "Upgraded" PostgreSQLlectureenA few years ago, we started a project to upgrade our multi-terabyte database from 8.3 to 8.4. Along the way we encountered a number of different obstacles and roadblocks which caused us to postpone the project, but this past fall we finally made it through phase 1 of the project, which by now had become an upgrade from 8.3 to 9.1. The course of the talk will cover several tools and tactics we had to use to get pg_upgrade to complete successfully, including all the different ways that things blew up on us. We'll also discuss some of the changes we saw after the upgrade, and discuss some of the improvements we've made using new 9.1 features.
Long-time Postgres users may be familiar with the "Big Bad" series of talks, which discuss different ways we have had to bend Postgres to serve the needs of a high transaction, multi-terabyte decision support system. Our talks have featured both technical highlights and lowlights, from innovative techniques to outright server meltdown, and all the good times in between. If you are using Postgres for mission critical applications, you'll enjoy this look inside the operations of a complex system that lives on the edge. Robert Treat
Slide Info / Blog
Slides
17:3001:00MRT 218Closing sessionsprizes, auctions, fun, gamesotherenThe Traditional Closing SessionWatch the video. We raised thousands for charity.Dan Langille19:0002:00Patty Boland's (upstairs)socialouting2Major Social Event!sponsored by HerokuotherenCome and join us for an evening of food and drink at Patty Boland's in the Market.Heroku is sponsoring this event for all PGCon attendees. Dinner and drinks will be provided. See [the map on the website](http://g.co/maps/a4ztx) for directions to the venue.
NOTE: the time is 6:45pm to 8:45pm. :)
NOTE: Bring your PGCon 2012 badge for admission.
Dan Langille
Map
21:0003:00L152hacker3Hacker Loungemeet, greet, code, slackotherenA place to gather...This is the place where many people will gather to work on their laptops, converse, code, hack, slack, and generally behave in cooperative ventures.
The times will vary, depending on when people gather. wifi is available, but bring power strips and extension cords to share the wealth.
L152 is located in the [Residence](http://g.co/maps/8scp6), ground floor, just to the left as you pass by the front desk. Ask if you can't find it.Dan Langille10:0005:00Out and abouttouristTourist stuffSpend some time exploringotherenExplore OttawaOttawa has a large number of great attractions. Spend some time looking around and explore. Spend as much time as you want with us, or leave early. We will walk everywhere we go. Wear sensible shoes. Bring your camera. We'll probably have lunch somewhere along the way. Consider the weather (sun block, rain coat, umbrella, swim suit).
This is one option for Saturday. You are free to come with us or take part in all or some activities. Or make your own way around...
The agenda for the tourist day is completely wrong.
The forecast: http://www.theweathernetwork.com/weather/CAON0512
Cora's - 8 AM - Breakfast
- 179 Rideau Street - http://bit.ly/dngpnV
- if you arrive much past 8:30, you probably won't get
through in time to get to the next meeting point
- we will depart Cora's at 8:55
Rideau Center - 9:13
- south end of the Mall on the Mackenzie King bridge
- we are catching the #95 bus to direction Orleans to BLAIR 1B
- from there, at 9:27, take Bus route 129 (OC Transpo) direction
Aviation
Aviation Museum
- we'll do a guided tour first, then wander around
- departing here at 1:30
Earl of Sussex Pub
- arriving at about 2:15
- 431 Sussex Drive
Ottawa, ON K1N 9M6
(613) 562-5544
earlofsussex.ca
All times after we leave Rideau Center are very subject to change.
Dan Langille
National Memorial
Residence
Forum