How to Postgres indexes review

Vasily Petrushin - Feb 25 - - Dev Community

Hey, in addition to Part 2, I recently started to review indexes in one of my prod databases. I would say, it turned out to be very educational.

Work a bit smarter

The database has been developed for a few years, and thousands of new features and migrations have been applied. I was surprised, at how many indexes have been created and never used, or did not used for a long time.

Why index management is so important? The indexes are costly! The costs of indexing:

  1. Indexes are eating storage, you need high-speed, highly-available storage for that.
  2. Indexing takes CPU and some IOPSes on each INSERT/UPDATE/DELETE query, and gives additional latency on these queries.
  3. Indexes take time to rebuild on database restore during incident recovery.

Any index is a trade-off. On one side it accelerates queries and reduces IOPSes, on another it creates an additional workload.

At the last database review, I found unused indexes for 50GB on a 600GB database. How am I found that?
How to find unused indexes:

-- Unused indexes
SELECT 
    relname AS table_name, 
    indexrelname AS index_name, 
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0  -- No scans
ORDER BY pg_relation_size(indexrelid) DESC;
-- Total size of unused indexes
SELECT 
    pg_size_pretty(SUM(pg_relation_size(indexrelid))) AS indexes_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0;
Enter fullscreen mode Exit fullscreen mode

The unused indexes are sorted by size and it is a priority to fix first. The total size of unused indexes shows how bad the whole situation is.

How to find indexes, that cover the same columns:

-- Duplicate indexes cover the same columns
SELECT 
    indrelid::regclass AS table_name,
    array_agg(indexrelid::regclass) AS duplicate_indexes
FROM pg_index
GROUP BY indrelid, indkey
HAVING COUNT(*) > 1;
Enter fullscreen mode Exit fullscreen mode

Yes, it is very interesting and a good reason to review the indexes.

A sorted table to see the percentage of idx_scan/(seq_scan + idx_scan) when using the table. A good occasion to review the index coverage of tables.

-- Index usage vs seq scan
SELECT
    relname AS table_name,
    seq_scan, seq_tup_read,
    idx_scan, idx_tup_fetch,
    round(100 * idx_scan::numeric / NULLIF(seq_scan + idx_scan, 0), 2) AS index_usage_percent
FROM pg_stat_user_tables
ORDER BY index_usage_percent DESC NULLS LAST;
Enter fullscreen mode Exit fullscreen mode

The query example for migrations, that drops the unused index:

ALTER TABLE table_name DROP INDEX IF EXISTS index_name;
Enter fullscreen mode Exit fullscreen mode

The reasons, why this index "crisis" has happened:

  • a lot of new features were added and removed in the application, and indexes have been created, but we forgot to review and remove indexes;
  • a lot of indexes have been created because it was obvious - developers plan to filter or join by these columns, but Postgres plan in real life does not want to use it;
  • indexes have never been reviewed.

Conclusions:

  1. Indexing is required to run queries, but it is costly for the CPU on INSERTs/UPDATEs/DELETEs, and takes extra storage space.
  2. Indexes review should be included in the regular database audit and maintenance procedures.
  3. Here are the extremely useful queries to get an overview of the current situation with indexes for your Postgres database.

PS: I use AI a lot as a learning partner or an advisor, but not in production. No AI was used to write this article.

. . .