Date: 2023-06-01
Time: 09:00–09:45
Room: DMS 1160
Level: Intermediate
Feedback: Leave feedback
In this talk, we'll be introducing a new approach for automatically determining which set of indexes to create for a given Postgres query workload, based on objectives chosen by the developer or DBA.
This approach is based on a combination of processing the Postgres query workload statistics derived from pg_stat_statements in a certain way, and a constraint programming optimization model that finds the mathematically optimal solution (set of index choices) based on a given set of constraints and objectives.
Our work focuses on index selection for a given table, and for example allows determining a set of indexes based on objectives such as minimizing the plan cost of all queries, whilst optimizing for the lowest index write overhead. In other cases, one might determine a subset of queries that should be prioritized over others. The approach we're presenting is flexible in allowing developers and DBAs to specify their intent on how a table should be indexed, whilst avoiding manual analysis of individual queries.
After this talk, you will have learned how to guide a constraint programming solver to find the right indexes for a table, and how to extract the relevant data from a Postgres instance. You will also have learned how this compares to other approaches, such as Dexter, HypoPG, and research published on automatic index selection in recent years.