In the previous posts, I exposed how to build a table, in addition to the array datataype, to act as an index, but maintained by a trigger and queried with an explicit join. A GIN index may be preferable because it is maintained without the need to declare a trigger, and because the query planner can use it even when we query the table. However, querying the table was more efficient because my query, in addition to filtering on tag_ids
or group_ids
, filtered on the created_date
timestamp. I was able to add it in my table custom table, but not in the GIN index:
postgres=> create index posts_by_user_group_ids_created on posts_by_user using gin (group_ids, created_date);
ERROR: data type timestamp with time zone has no default operator class for access method "gin"
HINT: You must specify an operator class for the index or define a default operator class for the data type.
However, if we still want to use a GIN index, there's the possibility to combine GIN indexed column with BTREE indexed column thanks to the btree_gin
extension:
postgres=> create extension btree_gin;
CREATE EXTENSION
It is now possible to index on the group_ids[]
and tag_ids[]
with created_date timestamptz
:
postgres=> create index posts_by_user_group_ids_created on posts_by_user using gin (group_ids, created_date);
CREATE INDEX
Time: 67771.641 ms (01:07.772)
postgres=> create index posts_by_user_tag_ids_created on posts_by_user using gin (tag_ids, created_date);
CREATE INDEX
Time: 68495.108 ms (01:08.495)
First, let's have a look at the executions plans before that.
Execution plans before
I did that on the table created in the previous post, filled with 5 million rows:
postgres=> \d posts_by_user
Table "public.posts_by_user"
Column | Type | Collation | Nullable | Default
--------------+--------------------------+-----------+----------+------------------------------
user_id | bigint | | not null |
post_id | bigint | | not null | generated always as identity
group_ids | bigint[] | | |
tag_ids | bigint[] | | |
content | text | | |
created_date | timestamp with time zone | | not null |
Indexes:
"posts_by_user_pkey" PRIMARY KEY, btree (user_id, created_date, post_id)
"posts_by_user_user_id_post_id_key" UNIQUE CONSTRAINT, btree (user_id, post_id)
postgres=> select count(*) from posts_by_user;
count
---------
5000000
(1 row)
Time: 417.648 ms
Before creating the GIN index, here was my query:
postgres=> explain (analyze, buffers)
select *
from posts_by_user
where created_date > now() - interval '1 month'
and tag_ids @>'{1}'
order by created_date desc limit 100;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=275177.48..275177.73 rows=100 width=405) (actual time=188.100..188.121 rows=100 loops=1)
Buffers: shared hit=69605
-> Sort (cost=275177.48..275178.02 rows=217 width=405) (actual time=188.098..188.111 rows=100 loops=1)
Sort Key: created_date DESC
Sort Method: quicksort Memory: 124kB
Buffers: shared hit=69605
-> Bitmap Heap Scan on posts_by_user (cost=174740.49..275169.19 rows=217 width=405) (actual time=149.292..187.996 rows=193 loops=1)
Recheck Cond: (created_date > (now() - '1 mon'::interval))
Filter: (tag_ids @> '{1}'::bigint[])
Rows Removed by Filter: 37305
Heap Blocks: exact=35233
Buffers: shared hit=69605
-> Bitmap Index Scan on posts_by_user_pkey (cost=0.00..174740.44 rows=35513 width=0) (actual time=140.588..140.589 rows=39145 loops=1)
Index Cond: (created_date > (now() - '1 mon'::interval))
Buffers: shared hit=32824
Planning Time: 0.118 ms
Execution Time: 188.340 ms
(17 rows)
Time: 282.833 ms
This is not efficient: reads 32824 pages from the primary key index to filter on the timestamp, but filtering on the tag later Filter: (tag_ids @> '{1}'::bigint[])
. This is visible from Rows Removed by Filter: 38714
were read to be discarded later.
With the GIN index on (tag_ids)
, it was better, but still not optimal:
postgres=> explain (analyze, buffers)
select *
from posts_by_user
where created_date > now() - interval '1 month'
and tag_ids @>'{1}'
order by created_date desc limit 100;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=89306.17..89306.42 rows=100 width=405) (actual time=9.542..9.562 rows=100 loops=1)
Buffers: shared hit=1530
-> Sort (cost=89306.17..89306.71 rows=217 width=405) (actual time=9.540..9.552 rows=100 loops=1)
Sort Key: created_date DESC
Sort Method: quicksort Memory: 124kB
Buffers: shared hit=1530
-> Bitmap Heap Scan on posts_by_user (cost=288.80..89297.88 rows=217 width=405) (actual time=2.009..9.478 rows=193 loops=1)
Recheck Cond: (tag_ids @> '{1}'::bigint[])
Filter: (created_date > (now() - '1 mon'::interval))
Rows Removed by Filter: 26807
Heap Blocks: exact=1521
Buffers: shared hit=1530
-> Bitmap Index Scan on posts_by_user_tag_ids (cost=0.00..288.75 rows=30500 width=0) (actual time=1.682..1.682 rows=27000 loops=1)
Index Cond: (tag_ids @> '{1}'::bigint[])
Buffers: shared hit=9
Planning Time: 0.129 ms
Execution Time: 9.594 ms
(17 rows)
Time: 103.207 ms
Now, the GIN index is used to access to the specific tags from tag_ids
, but the later filtering on timestamp Filter: (created_date > (now() - '1 mon'::interval))
discards Rows Removed by Filter: 26807
rows.
Execution plan with btree_gin
Here is the plan with the btree_gin
created using gin (tag_ids, created_date)
:
postgres=> explain (analyze, buffers)
select *
from posts_by_user
where created_date > now() - interval '1 month'
and tag_ids @>'{1}'
order by created_date desc limit 100;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=917.04..917.29 rows=100 width=405) (actual time=56.695..56.710 rows=100 loops=1)
Buffers: shared hit=454
-> Sort (cost=917.04..917.58 rows=217 width=405) (actual time=56.694..56.700 rows=100 loops=1)
Sort Key: created_date DESC
Sort Method: quicksort Memory: 124kB
Buffers: shared hit=454
-> Bitmap Heap Scan on posts_by_user (cost=54.23..908.74 rows=217 width=405) (actual time=56.456..56.624 rows=193 loops=1)
Recheck Cond: ((tag_ids @> '{1}'::bigint[]) AND (created_date > (now() - '1 mon'::interval)))
Heap Blocks: exact=182
Buffers: shared hit=454
-> Bitmap Index Scan on posts_by_user_tag_ids_created (cost=0.00..54.17 rows=217 width=0) (actual time=56.427..56.428 rows=193 loops=1)
Index Cond: ((tag_ids @> '{1}'::bigint[]) AND (created_date > (now() - '1 mon'::interval)))
Buffers: shared hit=272
Planning Time: 0.155 ms
Execution Time: 57.426 ms
(15 rows)
Time: 151.462 ms
This time, all the predicates are index conditions: Index Cond: ((tag_ids @> '{1}'::bigint[]) AND (created_date > (now() - '1 mon'::interval)))
and there is no Rows Removed by Filter
to remove later.
This is the best usage of GIN index, reading around 454 buffers in 57 milliseconds (when in cache - can be much longer from disk).
However, the solution presented in the previous post, if you can build and query the additional table, is more efficient:
postgres=> explain (analyze,buffers)
select posts_by_user.*
from posts_by_user
join posts_by_tag
using(user_id, created_date, post_id)
where posts_by_tag.created_date
> now() - interval '1 month'
and tag_id =1
order by created_date desc limit 100
;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1.00..741.19 rows=59 width=405) (actual time=0.039..0.696 rows=100 loops=1)
Buffers: shared hit=419
-> Nested Loop (cost=1.00..741.19 rows=59 width=405) (actual time=0.038..0.685 rows=100 loops=1)
Buffers: shared hit=419
-> Index Only Scan Backward using posts_by_tag_pkey on posts_by_tag (cost=0.57..241.75 rows=59 width=24) (actual time=0.021..0.143
rows=100 loops=1)
Index Cond: ((tag_id = 1) AND (created_date > (now() - '1 mon'::interval)))
Heap Fetches: 100
Buffers: shared hit=19
-> Index Scan using posts_by_user_user_id_post_id_key on posts_by_user (cost=0.43..8.45 rows=1 width=405) (actual time=0.005..0.005 rows=1 loops=100)
Index Cond: ((user_id = posts_by_tag.user_id) AND (post_id = posts_by_tag.post_id))
Filter: (posts_by_tag.created_date = created_date)
Buffers: shared hit=400
Planning Time: 0.261 ms
Execution Time: 0.731 ms
(14 rows)
Time: 95.173 ms
Even when reading the same number of blocks as the GIN index, because the structure is similar, the simple Index Scan and Index Only Scan are faster than the Bitmap Scans required by GIN on PostgreSQL.
I've written those posts to show that in PostgreSQL, you have many possibilities, not limited to the normalized heap tables and btree indexes. The right choice requires a good understanding of the access patterns, and verification with the execution plan. I'll talk about modern data modeling at next PGDay:
This is not limited to PostgreSQL and should be available in all PostgreSQL-compatible databases. I've run this in RDS Aurora with PostgreSQL 12.7 compatibility. The previous post was on YugabyteDB which has GIN index, without Bitmap Scan, very similar to the table I've build, but fully automated and transparent. The btree_gin
extension is not yet there, but soon - GIN features are tracked in issue 7850