PGCon 2019
The PostgreSQL Conference

Speakers: Feng Guo
Schedule
Day: Talks - Day 2 - 2019-05-31
Room: DMS 1120
Start time: 13:00
Duration: 00:45
Info
ID: 1376
Event type: Lecture
Track: Hacking
Language used for presentation: English

Generating distributed plan for PostgreSQL

Currently, the query planner in PostgreSQL generates plans that are meant to be executed on a single PostgreSQL node. What if we want to run queries in the MPP (massively parallel processing) way? Greenplum Database is an open source MPP database built on a PostgreSQL kernel. In this talk, I will introduce how Greenplum Database generates distributed plans for PostgreSQL.

Greenplum Database is an open source MPP (massively parallel processing) database built on a PostgreSQL kernel. It is essentially several disk-oriented PostgreSQL database instances acting together as one cohesive database management system (DBMS). In particular, the optimizer in Greenplum Database is based on PostgreSQL's optimizer, which takes a query tree as input, examines the possible execution plans, and ultimately selects the plan that is expected to run the fastest.

To suit the MPP environment, the optimizer in Greenplum Database has been modified and enhanced to support Greenplum's parallel architecture. It produces plans that can be executed simultaneously across all of the parallel PostgreSQL database instances. As a result, the optimizer implementation in Greenplum Database differs from PostgreSQL's in several respects.
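As a rough illustration of what a distributed planner has to decide beyond a single-node one, consider a join between two tables whose rows are hash-distributed across instances. Matching rows must end up on the same instance before they can be joined, so the planner can join in place, redistribute one or both sides on the join column, or broadcast one side to every instance. The sketch below is a toy model, not Greenplum's optimizer code: the `Table` class, the `choose_plan` function, and the "rows moved" cost measure are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    name: str
    rows: int       # estimated total row count
    dist_key: str   # column the table is hash-distributed on


def choose_plan(left, right, left_col, right_col, segments):
    """Pick the cheapest way to co-locate rows for left.left_col = right.right_col.

    Cost is a hypothetical measure: the number of rows sent over the network.
    """
    left_ok = left.dist_key == left_col      # left already placed by join column
    right_ok = right.dist_key == right_col   # right already placed by join column

    candidates = []
    if left_ok and right_ok:
        candidates.append(("join in place", 0))
    if right_ok and not left_ok:
        candidates.append((f"redistribute {left.name}", left.rows))
    if left_ok and not right_ok:
        candidates.append((f"redistribute {right.name}", right.rows))
    # These strategies are always valid, whatever the current distribution.
    candidates.append(("redistribute both", left.rows + right.rows))
    candidates.append((f"broadcast {left.name}", left.rows * segments))
    candidates.append((f"broadcast {right.name}", right.rows * segments))

    # Select the candidate that moves the fewest rows.
    return min(candidates, key=lambda plan: plan[1])


orders = Table("orders", 1_000_000, "order_id")
customers = Table("customers", 1_000, "customer_id")
print(choose_plan(orders, customers, "customer_id", "customer_id", segments=4))
```

With these made-up sizes, copying the small `customers` table to all four instances moves fewer rows than reshuffling the large `orders` table, so the broadcast wins. A real optimizer folds this data-movement cost into its full cost model alongside scan and join costs.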

This talk will introduce the differences between the optimizers in Greenplum Database and PostgreSQL, and illustrate how Greenplum Database produces and optimizes a plan for a parallel environment.