PGCon 2015

PGCon 2015 The PostgreSQL Conference University of Ottawa Ottawa 2015-06-16 2015-06-20 5 final 09:00 00:15 13:30 01:00 DMS 1160 Unconference Room #1 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 14:45 01:00 DMS 1160 Unconference Room #1 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 16:00 01:30 DMS 1160 Unconference Room #1 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 13:30 01:00 DMS 1120 Unconference Room #2 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 14:45 01:00 DMS 1120 Unconference Room #2 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 16:00 01:30 DMS 1120 Unconference Room #2 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 13:30 01:00 DMS 1110 Unconference Room #3 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 14:45 01:00 DMS 1110 Unconference Room #3 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 16:00 01:30 DMS 1110 Unconference Room #3 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 09:30 00:30 DMS 1160 Coffee & light snacks pre-unconference refreshments Social lecture en Please read the link below for full details. Please read the link below for full details. 
Josh Berkus Details on wiki 10:00 01:15 DMS 1160 Unconference Room #1 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 11:30 01:30 DMS 1160 Unconference Room #1 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 13:00 01:00 DMS 1160 Lunch unconference lunch Social lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus 14:00 01:00 DMS 1160 Unconference Room #1 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 15:00 00:15 DMS 1160 Coffee & light snacks pre-unconference refreshments Social lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus 15:15 01:00 DMS 1160 Unconference Room #1 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 16:30 01:00 DMS 1160 Unconference Room #1 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 10:00 01:15 DMS 1120 Unconference Room #2 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 11:30 01:30 DMS 1120 Unconference Room #2 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 14:00 01:00 DMS 1120 Unconference Room #2 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. 
Josh Berkus Details on wiki 15:15 01:00 DMS 1120 Unconference Room #2 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 16:30 01:00 DMS 1120 Unconference Room #2 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 10:00 01:15 DMS 1110 Unconference Room #3 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 11:30 01:30 DMS 1110 Unconference Room #3 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 14:00 01:00 DMS 1110 Unconference Room #3 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 15:15 01:00 DMS 1110 Unconference Room #3 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 16:30 01:00 DMS 1110 Unconference Room #3 Unconference Day lecture en Please read the link below for full details. Please read the link below for full details. Josh Berkus Details on wiki 09:00 03:00 DMS 1140 Business Continuity, High Availability and Disaster Recovery Blueprints, validated architectures and implementation tips for HA / DR / BC Tutorial workshop en Having run Postgres reliably for years does not automatically exempt anyone from eventually having some data loss incident. Even when PostgreSQL is one of the more robust and resilient databases, disaster can still strike. When ensuring that some incident will not kill the business is your responsibility, preparation is key. 
This tutorial covers Business Continuity topics, focusing on the techniques needed to achieve adequate availability (HA) as well as prepare for disaster recovery (DR) should something happen to your database. We will describe the evolution from a stand-alone database to a fully replicated, distributed setup as well as several specific techniques to increase recoverability using some features from PostgreSQL 9.4. Backup and recovery techniques will be reviewed and evaluated in terms of RPO/RTO objectives and the associated tradeoffs. The session will close with an open scenario/interactive round of questions where specific use cases will be analyzed and solutions will be suggested. Jose Luis Tallon 12:00 01:00 DMS 1140 Tutorial lunch nom nom nom Social lecture en Please read the link below for full details. Please read the link below for full details. Dan Langille Details on wiki 13:00 03:00 DMS 1140 Out of the Box Replication in Postgres 9.4 Tutorial lecture en Most people use at least one 3rd party script/tool to manage Postgres replication when building a reliable solution for mission critical Postgres databases. Postgres 9.4's new features allow users to set up replication using built-in features without compromising reliability, removing reliance on 3rd party tools. In this tutorial, I will walk you through the replication options available in Postgres 9.4, selecting the right solution to achieve your business requirements, and things to consider for monitoring and maintaining your setup in production. This is a hands-on tutorial so please bring your laptop with you. 
I will share VMs in advance that will be used during the tutorial to set up replication with Postgres 9.4. Training VM: The Postgres Training VM can be downloaded from one of the links below. The zip file is about 2GB and it unzips to about 6GB, so people should make sure they have enough free space for it. I'll be bringing it along on a USB stick as well in case not everyone has a chance to grab it. Download the VM from any of the following links: (1) See Training VM Download Link under Links on this page (2) Google Drive: https://drive.google.com/open?id=0BxnXwkT5PRBdeVByQVYySkhIemc After downloading the file, uncompress it. To bring up the VM, you should have VirtualBox installed on the machine. An existing VM can be pulled into VirtualBox by going to the menu "Machine -> Add.." and then browsing to wherever the VM was extracted to. I highly encourage you to download the VM before the training so we can use the time to cover Postgres replication in detail. Let me know via denish@omniti.com if you see any problem downloading it. Denish Patel Training VM Download Link https://omniti.com/is/denish-patel Presentation Slides slides 15:00 04:00 Royal Oak registration Registration pickup The social way to register: at the pub Social other en Pick up your registration pack Stop by the Royal Oak Pub at 161 Laurier Street (near King Edward) and get your registration pack. You'll help us avoid long line ups on Thursday morning and you get to have a beer, and chat with your fellow attendees. We guarantee you'll spot someone famous. Dan Langille 13:30 01:30 Registration assembly Registration Bag Assembly TBA Social other en Please read the link below for full details. Please read the link below for full details. 
Dan Langille Details on wiki 09:00 00:45 DMS 1160 Opening Session Plenary lecture en TBD TBD Dan Langille 10:00 00:45 DMS 1160 9.5 Coming to You Live New features by demo 9.5 Features lecture en This all-demo, no-slide talk will show off 9.5's new features. With every new Postgres release comes new features and improvements to make your life easier. Come see some of the new 9.5 features in action and learn how this next release will make your life better. Keith Fiske http://slides.keithf4.com/pg95live 11:00 00:45 DMS 1160 Future(s) of PostgreSQL (Multi-Master) Replication BiDirectional Replication Scaling Out lecture en In the course of the BDR (BiDirectional Replication) project we have worked on delivering robust, full-featured and fast asynchronous multi-master replication for postgres. In addition we have started the UDR project, sharing most of the code and infrastructure with BDR, which provides unidirectional logical replication for the many cases where multi-master replication is not required. To implement BDR a lot of features have already been integrated into core PostgreSQL (9.4). Now that 9.4 is released and BDR/UDR is in production in several complex environments, there are some important discussions to be had about what can and what cannot be integrated into core PostgreSQL. We will discuss: * Which features are in core postgres * Which features BDR/UDR provides on top of that * What can be integrated into core PostgreSQL and how * Future features * Problems found during the development Andres Freund 12:00 00:45 DMS 1160 SchemaVersus 1v1 Schemaverse Battles Social contest en A tournament of 1v1 schemaverse battles, each round taking only about 10 minutes. No prepared scripts allowed! The Schemaverse is a space-based strategy game implemented entirely within a PostgreSQL database. Compete against other players using raw SQL commands to command your fleet. Or, if your PL/pgSQL-foo is strong, wield it to write AI and have your fleet command itself! 
This year, rather than the classic large space battle, the rounds will be 1v1 and only take 10 minutes each. There will also be NO pre-created scripts allowed. Matches will be broadcast live for all to see and a ladder updated after each round. Prizes and other fun details to be announced as we get closer to the date. Joshua McDougall The Schemaverse cheatsheet 13:00 00:45 DMS 1160 Rethinking JSONB Hacking lecture en PostgreSQL 9.4 has introduced JSONB, a structured format for storing JSON, which gives many users a new opportunity: efficiently storing and querying JSON documents inside an ACID relational database. While users have noticed great jsonb performance, their feedback also reveals some hidden problems with the current jsonb implementation. We want to discuss different approaches to resolve the aforementioned problems and present several proof-of-concepts, so we could rethink jsonb for 9.6. The first problem of jsonb is its size overhead (4-5 times) in comparison with storing decomposed json data in multiple plain tables, whereas the binary format of jsonb has very low storage overhead with regard to plain text (<4%). The overhead comes mainly from redundant storing of key names, which can be quite long, in each document. One possible solution could be a persistent key dictionary cached in shared memory. Such a solution has several infrastructure problems and it wouldn't work in the general case. For example, one can use key names as values, and the dictionary might not fit in shared memory because of the high cardinality of keys. The second problem of jsonb is its querying. With the current SQL facilities in PostgreSQL, a user can search for an array element using a subselect and the jsonb_array_elements function. However, such queries are quite awkward and lack indexing support. Another way of querying jsonb documents is the contains (@>) operator, which is compact and has indexing support. 
jsquery is also well suited for use in check constraints over jsonb, which can validate a document's schema. For example, the constraint "CHECK(jb @@ 'a is numeric and b is string and c = *'::jsquery)" ensures that the value of key "a" is numeric, the value of key "b" is a string and key "c" exists. However, jsquery is not extensible, while we need an elegant and extensible way to query json documents with indexing support. We propose to solve this problem by introducing special SQL constructions (ANYELEMENT, ANYVALUE, ANYKEY, ALL) to query json documents. Extensibility comes from the ability to use any SQL expression inside the proposed constructions. Indexing support for such queries is a challenge for the PostgreSQL infrastructure, but we think it's feasible. Another missing feature of jsonb is a suitable way to update it. Users have to implement kludges to do it. It's possible to develop a set of functions to provide a flexible way to update jsonb, or to extend the current SQL syntax with an elegant syntax similar to the one for array updates. Our goal is to discuss different approaches to resolve the aforementioned problems and present several proof-of-concepts, so we could rethink jsonb for 9.6. Alexander Korotkov Konstantin Knizhnik Oleg Bartunov 14:00 00:45 DMS 1160 PostgreSQL and the Enterprise Cloud Challenges, Solutions, & Assorted Horrors Case Studies lecture en With transactions growing at over 60% annually and customer and data growth rates even higher, Salesforce is at the leading edge of enterprise cloud computing and is always exploring and investing in highly scalable and resilient transactional computing architectures for the future. Join us and learn how the team approached and solved problems ranging from millions of lines of complex procedural SQLs to an ad-hoc customer query generator to thousands of embedded SQLs in tests and tooling infrastructure in order to further this research using a variety of conventional and innovative techniques. 
Learn about some of the challenges encountered when adapting one of the world’s largest, most sophisticated and most successful cloud based java applications to run on a secondary database. From millions of lines of hand coded and generated PL and SQL to a query generator that can process any customer defined query to tens of thousands of ad-hoc SQLs embedded in test and integrity checking tools, this is no small challenge. Our team was able to overcome these challenges using a variety of conventional and innovative tools and frameworks. Join us to hear about the challenges, solutions, and some of the horror stories that were encountered along the way. Gary Baker 15:00 00:45 DMS 1160 Update and Delete operations for jsonb providing some needed functions and operators for jsonb Hacking lecture en Postgres 9.4 introduced the new jsonb type. However, it is missing some functions, particularly for json composition, that are needed by many users. In this talk we present an extension that provides some of these functions, and work to incorporate the functions in 9.5. Operations include replacement and deletion of array elements and object fields, and composition by concatenation of objects and arrays. Arrays can also be concatenated with scalar values, and array elements can be replaced or deleted by counting from either end of the array. Thus we have the ability to use json arrays as queues and stacks, with basic push/pop and shift/unshift capability. A pretty print function for jsonb is also provided. We will also outline what work we think remains, and discuss possible ideas on how to make json composition more naturally expressed. Andrew Dunstan slides 16:15 00:45 DMS 1160 lightning Lightning talks Short sharp descriptions of short topics Plenary lightning en A regular feature, PGCon will have a Lightning talks session, with presentations on diverse topics. 
The format remains essentially the same: in a one hour period, audiences are entertained and informed by a rapid fire series of short talks on interesting new or on-going work by individuals or groups. Slides are permitted, but not obligatory; pictures are highly recommended. Topic areas include new open source software projects, works in progress for future releases of existing projects, student projects, etc. Lightning talks topics this year may make good conference papers next year! The number of slots is limited, and experience suggests there will be more takers than slots. Sign up well in advance to be assured a spot. Please subscribe to the PGCon announce mailing list and wait for the call to go out for submissions. Magnus Hagander 10:00 00:45 DMS 1120 Multi-tenancy in PostgreSQL Scaling Out lecture en This talk is about the need for multi-tenancy in PostgreSQL, and the way to achieve it. What is a multi-tenant cluster? Why is a multi-tenant cluster needed? PostgreSQL provides multi-tenancy with the following - Shared Database, Shared Namespace - Separate Databases - Shared Database, Separate Namespace However multi-tenancy means more than this. - Issues with a multi-tenant cluster - What can be done and what can we do to make it easier. This talk will propose a multi tenanted architecture for PostgreSQL, to make it the database of choice in a cloud environment. Multi tenanted architecture is one of the key requirements for any software to be efficiently deployed in the cloud. As more and more databases are made available 'as-a-Service' in cloud offerings, it is necessary to take stock of the features in PostgreSQL to analyse how cloud friendly they are, especially for a multi-tenanted infrastructure. This talk will mainly focus on what functionalities are needed in PostgreSQL to make it truly cloud friendly. PostgreSQL needs to have the functionalities that will make it the database of choice for service providers in the cloud. 
This can be achieved within the current architecture of PostgreSQL by developing new features that will satisfy these requirements. Arul Shaji 11:00 00:45 DMS 1120 Inside PostgreSQL Shared Memory Hacking lecture en This talk is for people who want to understand how PostgreSQL shares information among processes using shared memory. The talk covers the internal data page format, usage of the shared buffers, locking methods, and various other shared memory data structures. Bruce Momjian http://momjian.us/main/presentations/internals.html#shared_memory 13:00 00:45 DMS 1120 ...(Lag) What's wrong with my slave? Scaling Out lecture en Most of the time, a streaming replication slave in the same data center is so close to the master that lag can be measured in milliseconds. However when it's not, that lag can be baffling at best, and catastrophic at worst. We will look at all things lag: strategies for monitoring, configuration options to fit application needs, diagnosing common issues and real cases of 'what went wrong'. If you google "postgres streaming replication lag" (go ahead, I'll wait...) your result set will include much information on setup and monitoring, but very little on diagnosing and even less on correcting. This talk is an attempt to fill that gap. We will start with the basics of monitoring and trending over time, look at configuration options and 'gotchas' for making your slaves trusted read sources, diagnose hardware and system factors, and finally share the pain of elusive lag patterns that took days, if not weeks, to figure out. This talk takes a broad look at system health. Many factors contribute to making a database cluster run perfectly: disk speed, network latency, user query patterns, etc. It can be easy to overlook, or take for granted, things that may strongly affect how closely a slave follows the master. 
In fall of 2014 iParadigms converted 8 server clusters across two data centers to streaming replication, allowing us to find and document many such issues. Samantha Billington Lag Slides 14:00 00:45 DMS 1120 Modern SQL in PostgreSQL A lot has changed since SQL:92 Applications lecture en SQL has gone out of fashion lately --- partly due to the NoSQL movement, but mostly because SQL is often still used like 20 years ago. As a matter of fact, the SQL standard continued to evolve during the past decades resulting in the current release of 2011. In this session, we will go through the most important additions since the widely known SQL-92, explain how they work and how PostgreSQL extends them. We will cover common table expressions and window functions in detail and have a very short look at the temporal features of SQL:2011 and the related features of PostgreSQL. Markus Winand temporal features of SQL:2011 slides 10:00 00:45 DMS 1140 Go Faster with Native Compilation Schema-binding for 30% better response times Performance lecture en In this presentation I will briefly talk about the Native Compilation technology, its usage, its benefits and why it matters by current business standards. I will then elaborate on Native Compilation of relations, called "Schema Binding", which utilizes the information available about a TABLE schema during its creation to generate specialized code to access/store the tuples of the corresponding TABLE. Schema Binding technology gives up to around 30% performance improvement on the TPC-H benchmark. SQL engines are designed in a very generic way to handle all kinds of functionality, but at the same time there are many invariants which do not change for every query execution. Native Compilation is a technology to identify such invariants and generate corresponding specialized code. One such example is a table schema definition: once a schema is defined, its attribute lengths, data types and sizes remain the same. 
Hence access/storage of data for this table is always going to be the same irrespective of the data. So for this case, instead of using the generalized code, we can generate specialized code for this table, which will have to execute far fewer instructions in order to access/store data and hence perform much better. Kumar Rajeev Rastogi 11:00 00:45 DMS 1140 Heavy Duty Backup with PgBackRest DBA lecture en PgBackRest is a backup system developed at Resonate and open sourced to address issues around the backup of databases that measure in tens of terabytes. It supports per file checksums, compression, partial/failed backup resume, high-performance parallel transfer, async archiving, tablespaces, expiration, full/differential/incremental, local/remote operation via SSH, hard-linking, and more. PgBackRest is written in Perl and does not depend on rsync or tar but instead performs its own deltas which gives it maximum flexibility. This talk will introduce the features, give sample configurations, and talk about design philosophy. PgBackRest aims to be a simple backup and restore system that can seamlessly scale up to the largest databases and workloads. Instead of relying on traditional backup tools like tar and rsync, PgBackRest implements all backup features internally and features a custom protocol for communicating with remote systems. Removing reliance on tar and rsync allows better solutions to database-specific backup issues. The custom remote protocol limits the types of connections that are required to perform a backup which increases security. Each thread requires only one SSH connection for remote backups. 
Primary PgBackRest features: * Local or remote backup * Multi-threaded backup/restore for performance * Checksums * Safe backups (checks that logs required for consistency are present before backup completes) * Full, differential, and incremental backups * Backup rotation (and minimum retention rules with optional separate retention for archive) * In-stream compression/decompression * Archiving and retrieval of logs for replicas/restores built in * Async archiving for very busy systems (including space limits) * Backup directories are consistent Postgres clusters (when hardlinks are on and compression is off) * Tablespace support * Restore delta option * Restore using timestamp/size or checksum * Restore remapping base/tablespaces David Steele Project Page PostgresOpen Slides 13:00 00:45 DMS 1140 Managing your schema Using migrations for consistency, repeatability, and sanity DBA lecture en Keeping track of changes in your database schema can be challenging. In this talk I will discuss the advantages of using [Flyway](http://flywaydb.org) to effectively manage this issue. Migrations are an essential tool for both developers and administrators. Developers can quickly recreate a database from scratch and incrementally modify their development database along with their code and tests. Similarly, administrators can determine the current state of any database and easily migrate to a newer one. Most importantly, schema and data changes can be thoroughly reviewed and tested before going to production. In this talk, I will discuss the benefits of using [Flyway](http://flywaydb.org) to manage migrations. 
Specifically, I will: - Show why migrations are useful - Introduce Flyway and how to use it - Focus on using Flyway from the command line using migrations written in sql - Help you determine which changes should be in your migrations - Discuss how to create a base migration from your existing database - Cover strategies for dealing with global objects - Show how to integrate Flyway with Jenkins No Java experience is required for this talk. Jeremy Smith Flyway database migration tool 14:00 00:45 DMS 1140 PostgreSQL on FPGA Hardware and Software, Reconfigured to Work Together Hacking lecture en FPGA (Field-Programmable Gate Array) could be one of the beneficial technologies to process massive amounts of data. In this project, we are trying to build an open source platform to provide basic infrastructure in order to help PostgreSQL developers have their "own" FPGA hacking projects. Have you heard about FPGA? FPGA (Field-Programmable Gate Array) is an integrated circuit which can be reconfigured after manufacturing. FPGA allows programmers to configure their own custom hardware like writing software. In the big data era, FPGA could be one of the beneficial technologies to process massive amounts of data, because of the advantages of FPGA, including high memory bandwidth and low power consumption. However, writing code in HDL (Hardware Description Language) and implementing it on FPGA are not so easy for software/database developers. In this project, we are trying to build an open source platform to provide basic infrastructure in order to help PostgreSQL developers have their "own" FPGA hacking projects. In this talk, we would like to share a brief overview of this project, covering the following topics. - Background - Project Overview - What Is FPGA? 
- Dataflow Programming - Combining Software And Hardware - Architectural Design - Development Tools and Process - Current Implementation - Demonstration - Future Works / Road Map Satoshi Nagayasu 15:00 00:45 DMS 1140 Monitor more of PostgreSQL pg_statsinfo comes with new features DBA lecture en Much information vanishes as the server operates. pg_statsinfo/pg_stats_reporter is a monitoring tool which records such status and statistics of a PostgreSQL server and lets you see them in a graphical and interactive way. It is very usable not only for DBAs to check the health of the server daily, but for technical support to find out what happened in the past on the remote site. It is widely adopted among our systems using PostgreSQL and the new pg_statsinfo 3.1 has new features to support them more. The new pg_statsinfo 3.1 has the following new features in comparison to 2.5. This talk will introduce these features with a demo and dig inside some of them. - Collecting plan statistics. Plan statistics are based on an original pg_stat_statements-like extension named pg_store_plans, which is a similar tool to 2ndQuadrant's pg_stat_plans but still differs in some points to fit pg_statsinfo. - Statistics of autovacuum/analyze including cancellation stats. Cancellation stats would be in some situations. - Storing server logs into the repository database. Stored logs are examined using the filtering feature of pg_stats_reporter. - Storing alerts previously only emitted into the server log. Alerts get more valuable with pg_stats_reporter. Kyotaro Horiguchi 09:00 00:30 DMS lobby Coffee & light snacks From 8:30 Social lecture en Please read the link below for full details. Please read the link below for full details. Dan Langille 11:45 01:15 DMS lobby Lunch unconference lunch Social lecture en Please read the link below for full details. Please read the link below for full details. 
Dan Langille 14:45 00:15 DMS lobby Coffee & light snacks in the lobby Social lecture en Please read the link below for full details. Please read the link below for full details. Dan Langille 18:00 03:30 130 George St socialouting Major Social Event! sponsored by EnterpriseDB Social other en EnterpriseDB invites PGCon attendees to a cocktail reception on Thursday, June 18th, just minutes from the conference venue in the Byward Market. We have exclusive use of the venue between 6:00 - 9:30 pm Address: 130 George Street Ottawa ON This is the same venue as 2014, so if you remember where that is, it's there in 2015. The venue name has changed. NOTE: Remember to bring your conference badge with you, as it will be your ticket to entry. For directions, please consult the [official conference map](http://pgcon.org/) Dan Langille 10:00 00:45 DMS 1160 Warm standby done right With 9.5, it's finally possible 9.5 Features lecture en People have been setting up warm standby systems with streaming replication since version 9.0, and even longer with file-based log-shipping. However, there have been a few pitfalls that many people don't know about, while others have simply accepted the risks. PostgreSQL 9.5 brings a bunch of new features and subtle changes that make warm standby setups more robust than ever. In 9.5, the interaction between a WAL archive and failover has been revised. pg_rewind makes it possible to resynchronize an old master server after failover - even an unplanned one. Replication slots, already introduced in 9.4, make the behaviour of a standby falling behind nicer. This presentation explains the changes, and why they were needed. Finally, I'm going to walk through setting up a simple, robust, two server hot standby system, using only built-in tools and simple shell scripts, taking advantage of the new features. 
Heikki Linnakangas 11:00 00:45 DMS 1160 Scalability and Performance Improvements in PostgreSQL Performance lecture en This paper will mainly discuss the scalability and performance improvements done in PostgreSQL 9.5 and the improvements that can be done to improve the scalability for both Write and Read operations. The paper will focus on pain points of Buffer Management in PostgreSQL and the improvements done in 9.5 to improve the situation along with performance data. It will also describe in brief the performance improvements done in 9.5. It will also discuss the locking bottlenecks due to various locks (lightweight locks and spinlocks) taken during Read operation and what could be done to further scale the Read operation. The other part of the paper focuses on improving the Write-workload in PostgreSQL. In this part we will discuss the frequency of writes done by backend operations (along with data) due to limitations of the current bgwriter algorithm and some ideas to improve the performance by reducing writes done by backends. It will also discuss the concurrency bottlenecks in write operations and some ideas to mitigate them. Amit Kapila 13:00 00:45 DMS 1160 Shootout at the PAAS Corral head-to-head for PostgreSQL cloud platforms Performance en Where should you run your PostgreSQL in the cloud? Join us for a comparison of features, pricing and performance between various cloud options, including most or all of EC2, Amazon RDS, Heroku, OpenShift, Google Compute, and the Rackspace Cloud. To determine which cloud is the fastest, cheapest and best, over the next few months Josh Berkus and others will be running a series of performance benchmarks against several of the many cloud hosting options available for PostgreSQL. This will include most or all of EC2, Amazon RDS, Heroku, OpenShift, Google Compute, and the Rackspace Cloud. 
The results will be presented to you in this talk, including: * Benchmarking methodology * Cost comparison for each configuration * Feature differences * Performance scores Josh Berkus 14:00 00:45 DMS 1160 Transacting with foreign servers - Two's company, three is ... Managing transactions involving multiple foreign servers. Scaling Out lecture en PostgreSQL has Foreign Data Wrappers and they are writable too! Upcoming features like partitioning, foreign table inheritance and join push-down for foreign tables pave the path for sharding. One missing piece in the puzzle is the distributed transaction manager required to maintain the atomicity and consistency of transactions involving foreign servers. The presentation talks about the current status of such transactions and discusses the path forward towards a distributed transaction manager. Support for writable foreign tables was added in PostgreSQL 9.3. As of now, atomicity and consistency are guaranteed when a transaction makes changes to at most a single foreign server. They are not guaranteed when changes are made to multiple foreign servers. In order to achieve atomicity and consistency of transactions involving multiple foreign servers, PostgreSQL needs to take up the role of a distributed transaction manager. The talk covers the current status of distributed transactions. It further explores a protocol to drive distributed transactions and the infrastructure necessary to overcome various hardware, software and network failures during a distributed transaction. It also covers use cases like data federation and sharding. Ashutosh Bapat 15:00 00:45 DMS 1160 Improving PostgreSQL for Greater Enterprise Adoption Case Studies en PostgreSQL Enterprise Consortium (PGECons for short) is an organization that consists of major IT companies in Japan, aiming to promote PostgreSQL to enterprise users in the country.
Since PGECons was established in 2012, we have been surveying PGECons members about PostgreSQL's functions and performance to estimate how well PostgreSQL meets their requirements. In this talk, we will focus on some of the major requests from the surveys, including enhancements to table partitioning and error message handling. From the enterprise users' point of view, we would like to share the obstacles behind these requests that might limit PostgreSQL's acceptance, in order to address them together with the community. [Partitioning] The larger a database grows, the more difficult the design and operation of the system become. For example, a sales accounting application faces this difficulty, which is often solved by using horizontal table partitioning. We surveyed PGECons members on how PostgreSQL can be applied to this kind of application, and evaluated the performance of table partitioning. The survey results point to the following PostgreSQL partitioning issues: + a larger number of partitions slows query response time + the definition and usage of partitioned tables are intricate Based on additional surveys of the members about proprietary DBMS usage, we will show what is needed in partitioning and the features of the application areas that are important but difficult for PostgreSQL to support. Then, we will point out that enhancing partitioning functions will help PostgreSQL spread to those areas. [Error messages] When troubleshooting PostgreSQL, a user analyzes the problem to identify the cause based on the error logs and error messages. However, when we analyze a problem with only the SQLSTATE code, we sometimes cannot identify the exact cause because one error code is often assigned to multiple different error messages. We have similar difficulty when analyzing a problem from error messages.
To cope with these problems, Fujitsu Limited, a member of PGECons, provides customers with a list of error messages sorted by SQLSTATE code so they can look up corrective actions. In the list, an identifier is added to each item so that customers can immediately decide on corrective actions and communicate smoothly with support staff. Based on these activities, we will propose an error message system that brings the following two benefits: + faster decisions on corrective actions + better self-support by users In this presentation, we will show the needs and actions of PGECons members as enterprise users, and share ways of improvement that should help promote PostgreSQL within the community. Tetsuo Sakata Yurie Enomoto 16:00 00:45 DMS 1160 celko Data Encoding Schemes Plenary lecture en Joe Celko will talk about issues with current and historical data encoding schemes. A must-see talk. Joe Celko 17:15 00:45 DMS 1160 Closing sessions prizes, auctions, fun, games Plenary other en The Traditional Closing Session We raise money for charity. Dan Langille 10:00 00:45 DMS 1120 The Art of Performance Evaluation Performance lecture en "Contrary to common belief, performance evaluation is an art." (Raj Jain, 1991) Successful performance evaluation cannot be achieved merely by running common benchmarking tools. This talk presents fundamental principles of performance evaluation and how you can put them into practice. Do you understand what exactly "pgbench" does? Is it an appropriate workload for your performance evaluation goal? Common benchmarking tools like "pgbench" are handy for just comparing system A and system B, but if you intend to deeply understand the performance of your system, answers to these questions are critical. In order to conduct a meaningful performance evaluation, the methodology should be elegantly designed to meet the goal of the evaluation: choose metrics for the goal, and choose observation techniques for the metrics.
Each step requires careful consideration and deep knowledge about the target system. It cannot be done mechanically. This is why performance evaluation is an art. This talk presents principles of designing performance evaluations and shows how you can put them into practice by introducing the speaker's experiences of performance evaluations with PostgreSQL. Yuto Hayamizu 11:00 00:45 DMS 1120 All the Dirt on VACUUM DBA en The use of Multi-Version Concurrency Control (MVCC) is perhaps one of the most powerful features PostgreSQL has to offer, but it can be a source of confusion for new and experienced users alike. In this talk we will provide an in-depth walkthrough of why Postgres needs to vacuum and what vacuum does. Topics: - MVCC details - HOT overview - Identifying tuples to be vacuumed/frozen - VACUUM and indexes - Vacuuming heap pages Jim Nasby Recording of Talk http:// 13:00 00:45 DMS 1120 Parallel Sequential Scan Unleashing a herd of elephants Hacking lecture en Parallel query is close to becoming a reality in PostgreSQL! A year ago, much of the low-level infrastructure needed for parallelism, such as dynamic shared memory and dynamic background workers, had been completed, but no user-visible facilities made use of this infrastructure. Major work on error handling, transaction control, and state sharing has been completed, and further patches, including a patch for parallel sequential scan, are pending. In this talk, we will talk about parallel sequential scan itself, including performance considerations, the work allocation strategy, and the cost model; and we will also discuss the infrastructure that supports parallel sequential scan, including state sharing for GUCs, transaction state, snapshots, and combo CIDs; error handling and transaction management; and the handling of heavyweight locking. Finally, we'll discuss the future of parallelism in PostgreSQL now that the basic infrastructure is (mostly) complete.
Amit Kapila Robert Haas 14:00 00:45 DMS 1120 PG-Strom GPGPU meets PostgreSQL to accelerate analytic queries Performance lecture en The upcoming v9.5 supports a custom-scan interface that allows extensions to provide alternative logic to scan or join relations. PG-Strom is an extension, built on top of the custom-scan interface, that off-loads part of CPU-intensive workloads to the GPU. At this moment, it supports executing scans, joins, aggregates and sorting on GPU devices, and records 10x faster response times on some common queries. The upcoming PostgreSQL v9.5 will support a custom-scan interface that enables extensions to implement alternative logic to scan and/or join relations; this logic is executed if it is cheaper than the built-in paths. PG-Strom is an extension built on top of the custom-scan interface to process part of CPU-intensive SQL workloads on GPU processors. This feature works transparently from the application's perspective; it internally generates a native GPU binary and runs the executable on GPU devices asynchronously. The GPU is a long-established technology in the HPC field, with characteristics different from those of ordinary CPUs. It devotes a much larger percentage of the chip to ALU logic rather than cache or control units; it therefore has much higher computing capability, in the TFLOPS range, on simple, massive numeric calculations, but is not good at complicated logic. Recent GPUs pack several hundred to several thousand cores into a chip, and thus offer a much better performance-to-cost ratio. PG-Strom mediates between the world of SQL and the world of GPUs. It gives people in the SQL world the advantage of low-cost parallel query processing. This session introduces the design of PG-Strom in brief, the background technology (custom-scan interface and GPU/OpenCL), current functionality and future development, with a focus on the technology perspective. We expect the audience to be interested in data analysis, OLAP or big data.
KaiGai Kohei Presentation in PGcon/Japan 2014 (5-Dec) PG-Strom wiki entry 15:00 00:45 DMS 1120 pg_shard: Shard and scale out PostgreSQL PostgreSQL extension to scale out real-time reads and writes Scaling Out lecture en pg_shard is an open source sharding extension for PostgreSQL. It shards PostgreSQL tables for horizontal scale, and replicates them for high availability. The extension also seamlessly distributes SQL statements, without requiring any changes to the application layer. pg_shard addresses many NoSQL use-cases, and becomes more powerful with the new JSONB data type. Further, the extension leverages the rich analytic capabilities in PostgreSQL, and enables real-time analytics for big data sets. In this talk, we first summarize challenges in distributed systems associated with scaling out databases. We then describe "logical sharding", and discuss how it helps overcome these challenges. Next, we show how pg_shard uses hook APIs, such as the planner and executor hooks, to make PostgreSQL a powerful distributed database. We then cover example customer use-cases, and conclude with a futuristic demo: a distributed table with JSONB fields, backed by a dynamically changing row and columnar store. pg_shard is an open source sharding extension for PostgreSQL. It shards PostgreSQL tables for horizontal scale, and replicates them for high availability. The extension also seamlessly distributes SQL statements, without requiring any changes to the application layer. pg_shard addresses many NoSQL use-cases, and becomes more powerful with the new JSONB data type. Further, the extension leverages the rich analytic capabilities in PostgreSQL, and enables real-time analytics for big data sets. In this talk, we first summarize challenges in distributed systems: dynamically scaling a cluster when new machines are added or old ones fail, and distributed consistency semantics in the face of failures. 
We then describe "logical sharding", and show how it helps overcome these challenges. We also discuss this idea's application to Postgres. Next, we show how pg_shard uses hook APIs, such as the planner and executor hooks, to make PostgreSQL a powerful distributed database. We then cover example customer use-cases, and conclude with a futuristic demo: a distributed table with JSONB fields, backed by a dynamically changing row and columnar store. Ozgun Erdogan pg_shard 16:00 00:45 DMS 1120 Scalable MVCC Solution for Many Core Machines Performance lecture en In the current MVCC solution in PostgreSQL, ProcArrayLock is the major bottleneck on many-core machines (>120), and it can scale only up to 30 connections in a TPC-C test. We experimented with a lock-free MVCC solution, and it can scale up to 120 cores. We have taken the CSN-based solution proposed in the PostgreSQL community and implemented a lock-free version of it. Taking into account the large memory and other resources available in many-core machines, locks are avoided in all performance paths and used only in some rare paths. Dilip Kumar 10:00 00:45 DMS 1140 SchemaVersus 1v1 Schemaverse Battles Social contest en A tournament of 1v1 schemaverse battles, each round taking only about 10 minutes. No prepared scripts allowed! The Schemaverse is a space-based strategy game implemented entirely within a PostgreSQL database. Compete against other players using raw SQL commands to command your fleet. Or, if your PL/pgSQL-foo is strong, wield it to write AI and have your fleet command itself! This year, rather than the classic large space battle, the rounds will be 1v1 and only take 10 minutes each. There will also be NO pre-created scripts allowed. Matches will be broadcast live for all to see and a ladder updated after each round.
Prizes and other fun details to be announced as we get closer to the date. Joshua McDougall The Schemaverse cheatsheet 11:00 00:45 DMS 1140 GSoC2014 - Sharing Code and Experience Hacking lecture en This presentation is about my experience in the FOSS world and contributing to PostgreSQL as a Google Summer of Code 2014 student. In this presentation I'll talk about my involvement with the FOSS world and how it changed my life and career in many ways. I'll explain how Google Summer of Code works and the importance of this program to open-source communities. Some points covered: - who can apply - how to apply - how you can help the PostgreSQL community - principal events I will also describe my experience with Google Summer of Code 2014 implementing the "ALTER TABLE <name> SET {LOGGED | UNLOGGED}" statement that will be released in the next version, 9.5. I'll explain the main challenges I solved during the coding, the design that we chose and the current limitations. Future work and improvements in this area will be explained too. Fabrízio de Royes Mello Project Website Slides 13:00 00:45 DMS 1140 If you can't beat 'em, join 'em (… a pun) Why, when and how you can integrate documents and key-value pairs into your relational model Applications podium en There is a pitched battle going on between the relational, document-based, key-value and other data models. PostgreSQL is uniquely capable of leveraging many of the strengths of multiple data models with JSON(b), HSTORE, XML, ltree data types, arrays and related functions. This presentation outlines the use-cases, benefits and limitations of document-based, key-value and hierarchical data models. It then presents practical advice and code snippets for incorporating them into PostgreSQL's relational framework. The presentation ends with SQL examples and code snippets for loading, accessing and modifying (where possible) JSON, HSTORE, XML, ltree and array data types.
This presentation begins with a very quick review of the rationale, benefits and implications of the relational data model. It then does the same for document-based models and hierarchical models. The balance of the presentation works with three publicly available data sets, world-wide airports (http://ourairports.com/data/), Wikipedia Infobox key-value pairs (http://wiki.dbpedia.org/Datasets) and Google address JSON objects, showing how they can be incorporated into a simple relational model. The presentation also includes snippets of code for loading the files and accessing elements. The full SQL and shell code will be available on the web site. James Hanson 14:00 00:45 DMS 1140 Tracing PostgreSQL performance Performance lecture en “How could I trace a session in PostgreSQL” - this is quite a common question from an Oracle/DB2 DBA who is new to Postgres. Usually we start to mumble something about systemtap/perf and pg_stat_bgwriter. Although these tools are great and useful, they do not exactly meet this need, and besides, some of them are really difficult to use. In this talk, I’ll give a brief introduction to tracing and wait-event-based performance analysis in commercial databases such as Oracle and DB2, and to what can actually be done in this respect using the built-in PostgreSQL tools. Then we will consider what cannot be done with those built-in tools, and why and when a DBA needs dtrace/systemtap/perf and debuggers. I will also cover the drawbacks of these external tools, and give concrete use cases and recipes for their usage. Over the years, the community has been considering other possibilities to implement tracing, a performance event system, pg_stat_lwlocks, etc. I will give a brief overview of these attempts, and finally give some thoughts on how we can implement performance tracing less intrusively. Ilya Kosmodemiansky 15:00 00:45 DMS 1140 Shabang Scripting with Postgres Applications lecture en Sometimes bash is just the way to go!
This talk will cover tips and techniques for effective bash scripting with PostgreSQL. It will include guidance about: * Pros/cons of shell scripts * Function library creation and use * Executing SQL * Set/get PostgreSQL data from/into script variables * Keeping PostgreSQL functions in sync with scripts * Locking * Doing work in parallel * Ensuring cleanup This is a source-code heavy talk. Moderate experience with both bash scripting and PostgreSQL is needed to get the most out of it. Joe Conway 16:00 00:45 DMS 1140 Row Level Security 9.5 Features lecture en In this talk we'll review Row-Level Security (RLS), provide examples and use-cases, and discuss the work which has been done on adding Row-Level Security to PostgreSQL and the current state of that effort. PostgreSQL has long had a complex and interesting set of permissions available through the GRANT system. There is another system, which exists in many other RDBMSs, known as row-level security (RLS), where the rows returned are filtered based on a policy implemented on the table. Stephen Frost 09:30 00:30 DMS lobby Coffee & light snacks in the lobby Social lecture en Please read the link below for full details. Dan Langille 11:45 01:15 DMS lobby Lunch unconference lunch Social lecture en Please read the link below for full details. Dan Langille 14:45 00:15 DMS lobby Coffee & light snacks in the lobby Social lecture en Please read the link below for full details. Dan Langille 09:00 03:00 DMS 1160 NoSQL on ACID Tutorial workshop en NoSQL on Acid – Maximizing Results with JSONB and PostgreSQL PostgreSQL has kept up the momentum around JSON with version 9.4 featuring JSONB as demand for working with unstructured data continues to grow.
PostgreSQL 9.4 introduces the new JSONB "binary JSON" type. This new storage format for unstructured document data is higher-performance than the original JSON type, and comes with indexing, functions and operators for manipulating and integrating JSON data easily with record-oriented data in Postgres. This class will include instruction for several scenarios for working with JSON in PostgreSQL and demonstrate performance metrics. This class will also provide instruction on how to use different operations. The course will cover the following topics: - Overview of JSON: history, data types and operators - Why not HSTORE? - Intro to node.js with examples - Working with JSON: examples of SELECT, UPDATE, etc. - Integrating in applications - Performance benchmark Bruce Momjian Álvaro Hernández Tortosa 12:00 01:00 DMS 1160 Tutorial lunch nom nom nom Social lecture en Please read the link below for full details.
Dan Langille Details on wiki 13:00 03:00 DMS 1160 Introduction to Hacking Tutorial workshop en PostgreSQL is well-modularized and contains (for the most part) very clean, well-documented code, making modification relatively easy, but the core distribution includes more than one million lines of code, so it can sometimes be difficult to figure out where and how to get started. In this talk, I'll discuss the developer tools that may be useful while modifying PostgreSQL, the major subsystems within the database server, PostgreSQL coding conventions and commonly-used idioms, and just a little bit about the PostgreSQL patch submission process. Topics will include nodes, datums, memory management, system caches, and locking. This tutorial is intended for those who are at least somewhat familiar with both SQL and with C programming, but want to learn how to apply that knowledge to the PostgreSQL backend. Robert Haas