If you missed my previous post on Understanding Postgres Performance its a great starting point. On this particular post I’m going to dig in to some real life examples of optimizing queries and indexes.

It all starts with stats

I wrote about some of the great new features in Postgres 9.2 in the recent announcement on support of Postgres 9.2 on Heroku. One of those awesome features, is pg_stat_statements. Its not commonly known how much information Postgres keeps about your database (beyond the data of course), but in reality it keeps a great deal. Ranging from basic stuff like table size to cardinality of joins to distribution of indexes, and with pg_stat_statments it keeps a normalized record of when queries are run.

First you’ll want to turn on pg_stat_statments:

CREATE extension pg_stat_statements;

What this means it would record both:

SELECT id 
FROM users
WHERE email LIKE '[email protected]';

and

SELECT id 
FROM users
WHERE email LIKE '[email protected]';

To a normalized form which looks like this:

SELECT id 
FROM users
WHERE email LIKE ?;

Understanding them from afar

While Postgres collects a great deal of this information dissecting it to something useful is sometimes more mystery than it should be. This simple query will show a few very key pieces of information that allow you to begin optimizing:

SELECT 
  (total_time / 1000 / 60) as total_minutes, 
  (total_time/calls) as average_time, 
  query 
FROM pg_stat_statements 
ORDER BY 1 DESC 
LIMIT 100;

The above query shows three key things:

  1. The total time a query has occupied against your system in minutes
  2. The average time it takes to run in milliseconds
  3. The query itself

Giving an output something like:

total_time    |     avg_time     |                                                                                                                                                              query
------------------+------------------+------------------------------------------------------------
 295.761165833319 | 10.1374053278061 | SELECT id FROM users WHERE email LIKE ?
 219.138564283326 | 80.24530822355305 | SELECT * FROM address WHERE user_id = ? AND current = True
(2 rows)