If you missed my previous post on Understanding Postgres Performance, it's a great starting point. In this post I'm going to dig into some real-life examples of optimizing queries and indexes.
I wrote about some of the great new features in Postgres 9.2 in the recent announcement of Postgres 9.2 support on Heroku. One of those awesome features is pg_stat_statements. It's not commonly known how much information Postgres keeps about your database (beyond the data itself), but in reality it keeps a great deal: everything from basic facts like table size, to the cardinality of joins, to the distribution of indexes. And with pg_stat_statements, it keeps a normalized record of the queries that are run.
First you'll want to turn on pg_stat_statements:
CREATE EXTENSION pg_stat_statements;
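On Heroku Postgres the module is available out of the box, but if you're running your own Postgres the extension also has to be loaded at server start before CREATE EXTENSION will do anything useful. A minimal postgresql.conf fragment:

```
# postgresql.conf — load the statistics module at server start,
# then restart Postgres before running CREATE EXTENSION
shared_preload_libraries = 'pg_stat_statements'
```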
Once enabled, it records queries in a normalized form, with the specific argument values stripped out. This means it would record both:
SELECT id
FROM users
WHERE email LIKE 'alice@example.com';
and
SELECT id
FROM users
WHERE email LIKE 'bob@example.com';
as a single normalized form, which looks like this:
SELECT id
FROM users
WHERE email LIKE ?;
While Postgres collects a great deal of this information, distilling it into something useful is sometimes more of a mystery than it should be. This simple query will surface a few key pieces of information that let you begin optimizing:
SELECT
  (total_time / 1000 / 60) AS total_minutes,
  (total_time / calls) AS average_time,
query
FROM pg_stat_statements
ORDER BY 1 DESC
LIMIT 100;
The above query shows three key things: the total number of minutes Postgres has spent running each query, the average time per call in milliseconds, and the normalized query itself.
Giving an output something like:
  total_minutes   |   average_time    |                           query
------------------+-------------------+------------------------------------------------------------
 295.761165833319 | 10.1374053278061  | SELECT id FROM users WHERE email LIKE ?
 219.138564283326 | 80.24530822355305 | SELECT * FROM address WHERE user_id = ? AND current = True
(2 rows)
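From here, a reasonable next step is to take the worst offender and look at its actual plan with EXPLAIN ANALYZE, then try an index. The sketch below targets the address query from the output above; the table and column names are taken from that example, and the partial index is just one plausible fix for your own schema, not a prescription:

```sql
-- Inspect the plan for the slow query surfaced above
-- (literal value 42 is a stand-in for a real user_id)
EXPLAIN ANALYZE
SELECT * FROM address WHERE user_id = 42 AND current = True;

-- One possible fix: a partial index covering only current addresses,
-- built without locking out writes
CREATE INDEX CONCURRENTLY index_address_current_user
  ON address (user_id)
  WHERE current = True;

-- Clear the accumulated counters so pg_stat_statements
-- reflects performance under the new index
SELECT pg_stat_statements_reset();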