Let's say you have a query with two conditions, on two columns: WHERE bar = 2 AND baz = 4
. It can use an index on the two columns, like (bar asc, baz asc)
or, in some databases, combine two indexes, one on (bar asc)
and the other on (baz asc)
.
If this combination is not possible, the most selective index will be chosen. The two indexes solution may seem better because they will be used not only for WHERE bar = 2 AND baz = 4
but also for WHERE bar = 2
or WHERE baz = 4
alone. However, two indexes to maintain is more expensive for DML and combining with bitmap the two indexes is expensive.
Let's take an example, which comes from the following discussion Why PostgreSQL High Availability Matters and How to Achieve It which compares Spanner, PostgreSQL, YugabyteDB and others.
PostgreSQL with two indexes: Bitmap Scan
I create the following table with two indexes, one for each column:
CREATE TABLE foo (id bigserial, bar int, baz int);
CREATE INDEX foobar ON foo(bar asc);
CREATE INDEX foobaz ON foo(baz asc);
insert into foo (bar, baz) select n%97, n%101
from generate_series(1,1000000) n;
vacuum analyze foo;
A query with a condition on bar
reads the first index:
postgres=# explain (analyze, buffers, costs off)
SELECT * FROM foo WHERE bar = 2 ;
explain (analyze, buffers, costs off)
QUERY PLAN
---------------------------------------------------------------------------------
Bitmap Heap Scan on foo (actual time=1.058..6.768 rows=10310 loops=1)
Recheck Cond: (bar = 2)
Heap Blocks: exact=5406
Buffers: shared hit=5417
-> Bitmap Index Scan on foobar (actual time=0.474..0.474 rows=10310 loops=1)
Index Cond: (bar = 2)
Buffers: shared hit=11
Planning Time: 0.054 ms
Execution Time: 7.238 ms
(9 rows)
And the other index for a condition on baz
:
postgres=# explain (analyze, buffers, costs off)
SELECT * FROM foo WHERE baz = 4;
QUERY PLAN
--------------------------------------------------------------------------------
Bitmap Heap Scan on foo (actual time=1.098..6.980 rows=9901 loops=1)
Recheck Cond: (baz = 4)
Heap Blocks: exact=5405
Buffers: shared hit=5415
-> Bitmap Index Scan on foobaz (actual time=0.516..0.516 rows=9901 loops=1)
Index Cond: (baz = 4)
Buffers: shared hit=10
Planning Time: 0.060 ms
Execution Time: 7.425 ms
(9 rows)
Those two can be combined for a condition on bar
and baz
:
postgres=# explain (analyze, buffers, costs off)
SELECT * FROM foo WHERE bar = 2 AND baz = 4;
QUERY PLAN
---------------------------------------------------------------------------------------
Bitmap Heap Scan on foo (actual time=1.154..1.269 rows=102 loops=1)
Recheck Cond: ((baz = 4) AND (bar = 2))
Heap Blocks: exact=102
Buffers: shared hit=123
-> BitmapAnd (actual time=1.130..1.131 rows=0 loops=1)
Buffers: shared hit=21
-> Bitmap Index Scan on foobaz (actual time=0.483..0.483 rows=9901 loops=1)
Index Cond: (baz = 4)
Buffers: shared hit=10
-> Bitmap Index Scan on foobar (actual time=0.495..0.495 rows=10310 loops=1)
Index Cond: (bar = 2)
Buffers: shared hit=11
Planning Time: 0.069 ms
Execution Time: 1.292 ms
(14 rows)
The two indexes can serve all three queries. With the two conditions, the bitmaps from each indexes are combined (BitmapAnd
) before fetching the rows from the table. It has a little overhead but, anyway, PostgreSQL uses a Bitmap Scan to mitigate the cost of random reads to the Heap Table.
PostgreSQL with one index: used only on prefix
I drop those two indexes to create one on both columns:
DROP INDEX foobar;
DROP INDEX foobaz;
CREATE INDEX foobarbaz ON foo(bar asc, baz asc);
With condition on bar
, the index starting with it is used:
postgres=# explain (analyze, buffers, costs off)
SELECT * FROM foo WHERE bar = 2 ;
explain (analyze, buffers, costs off) QUERY PLAN
------------------------------------------------------------------------------------
Bitmap Heap Scan on foo (actual time=1.144..6.833 rows=10310 loops=1)
Recheck Cond: (bar = 2)
Heap Blocks: exact=5406
Buffers: shared hit=5406 read=12
-> Bitmap Index Scan on foobarbaz (actual time=0.532..0.532 rows=10310 loops=1)
Index Cond: (bar = 2)
Buffers: shared read=12
Planning:
Buffers: shared hit=12
Planning Time: 0.098 ms
Execution Time: 7.303 ms
(11 rows)
With condition on both columns, the index is still used:
postgres=# explain (analyze, buffers, costs off)
SELECT * FROM foo WHERE bar = 2 AND baz = 4;
QUERY PLAN
----------------------------------------------------------------------------------
Bitmap Heap Scan on foo (actual time=0.031..0.148 rows=102 loops=1)
Recheck Cond: ((bar = 2) AND (baz = 4))
Heap Blocks: exact=102
Buffers: shared hit=105
-> Bitmap Index Scan on foobarbaz (actual time=0.019..0.019 rows=102 loops=1)
Index Cond: ((bar = 2) AND (baz = 4))
Buffers: shared hit=3
Planning Time: 0.064 ms
Execution Time: 0.168 ms
(9 rows)
However, with condition on baz
, the index cannot be used:
postgres=# explain (analyze, buffers, costs off)
SELECT * FROM foo WHERE baz = 4;
QUERY PLAN
------------------------------------------------------------------------------
Gather (actual time=0.220..23.140 rows=9901 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=5406
-> Parallel Seq Scan on foo (actual time=0.009..20.276 rows=3300 loops=3)
Filter: (baz = 4)
Rows Removed by Filter: 330033
Buffers: shared hit=5406
Planning Time: 0.058 ms
Execution Time: 23.622 ms
(10 rows)
PostgreSQL has no way to skip the first column of the index and need to full scan the table.
YugabyteDB with two indexes: only one index used
I'll run the same on a YugabyteDB cluster, distributed over multiple nodes with replication for high availability. The goal is not to compare the performance (there's some network latency involved here) but the scalability of the execution plan.
I create the following table with two indexes, on each column:
CREATE TABLE foo (id bigserial, bar int, baz int);
CREATE INDEX foobar ON foo(bar asc);
CREATE INDEX foobaz ON foo(baz asc);
insert into foo (bar, baz) select n%97, n%101
from generate_series(1,1000000) n;
analyze foo;
set yb_enable_optimizer_statistics=on;
With condition on bar
the first index is used:
yugabyte=# explain (analyze, dist, costs off)
SELECT * FROM foo WHERE bar = 2 ;
QUERY PLAN
--------------------------------------------------------------------------------
Index Scan using foobar on foo (actual time=11.612..91.885 rows=10310 loops=1)
Index Cond: (bar = 2)
Storage Index Read Requests: 11
Storage Index Execution Time: 0.000 ms
Storage Table Read Requests: 11
Storage Table Execution Time: 88.000 ms
Planning Time: 4.645 ms
Execution Time: 92.575 ms
Storage Read Requests: 22
Storage Write Requests: 0
Storage Execution Time: 88.000 ms
Peak Memory Usage: 0 kB
(12 rows)
With condition on baz
the second index is used:
yugabyte=# explain (analyze, dist, costs off)
SELECT * FROM foo WHERE baz = 4;
QUERY PLAN
-------------------------------------------------------------------------------
Index Scan using foobaz on foo (actual time=10.489..88.123 rows=9901 loops=1)
Index Cond: (baz = 4)
Storage Index Read Requests: 10
Storage Index Execution Time: 0.000 ms
Storage Table Read Requests: 10
Storage Table Execution Time: 76.000 ms
Planning Time: 0.783 ms
Execution Time: 89.305 ms
Storage Read Requests: 20
Storage Write Requests: 0
Storage Execution Time: 76.000 ms
Peak Memory Usage: 0 kB
(12 rows)
There is no Bitmap Scan in YugabyteDB. PostgreSQL bitmaps are on the table pages, read through shared buffers in the single-server's memory. This cannot scale horizontally. YugabyteDB distributes the rows and do not work with pages but rows, as key-value documents. To avoid random reads, YugabyteDB leverages index-organized tables (rows stored in the primary key structure) and Index Only Scan (with no need to read the visibility from the table).
However, that means that there's no bitmap combination and, with the condition on bar
and baz
, only one index is used:
yugabyte=# explain (analyze, dist, costs off)
SELECT * FROM foo WHERE bar = 2 AND baz = 4;
QUERY PLAN
------------------------------------------------------------------------------
Index Scan using foobaz on foo (actual time=11.121..88.236 rows=102 loops=1)
Index Cond: (baz = 4)
Remote Filter: (bar = 2)
Storage Index Read Requests: 10
Storage Index Execution Time: 0.000 ms
Storage Table Read Requests: 10
Storage Table Execution Time: 88.000 ms
Planning Time: 0.081 ms
Execution Time: 88.299 ms
Storage Read Requests: 20
Storage Write Requests: 0
Storage Execution Time: 88.000 ms
Peak Memory Usage: 19 kB
(13 rows)
The most selective index is chosen and the other condition is the pushdown as Remote Filter
to avoid sending the discarded rows between nodes.
YugabyteDB with one index: used by queries
I drop those indexes to create one on both columns:
DROP INDEX foobar;
DROP INDEX foobaz;
CREATE INDEX foobarbaz ON foo(bar asc, baz asc);
With the condition on bar
, the index, starting with it, is used:
yugabyte=# explain (analyze, dist, costs off)
SELECT * FROM foo WHERE bar = 2 ;
QUERY PLAN
-----------------------------------------------------------------------------------
Index Scan using foobarbaz on foo (actual time=11.059..79.501 rows=10310 loops=1)
Index Cond: (bar = 2)
Storage Index Read Requests: 11
Storage Index Execution Time: 4.000 ms
Storage Table Read Requests: 11
Storage Table Execution Time: 68.000 ms
Planning Time: 0.065 ms
Execution Time: 80.151 ms
Storage Read Requests: 22
Storage Write Requests: 0
Storage Execution Time: 72.000 ms
Peak Memory Usage: 0 kB
(12 rows)
With condition on bar
and baz
the index, which includes both, is used:
yugabyte=# explain (analyze, dist, costs off)
SELECT * FROM foo WHERE bar = 2 AND baz = 4;
QUERY PLAN
-------------------------------------------------------------------------------
Index Scan using foobarbaz on foo (actual time=1.998..2.040 rows=102 loops=1)
Index Cond: ((bar = 2) AND (baz = 4))
Storage Index Read Requests: 1
Storage Index Execution Time: 0.000 ms
Storage Table Read Requests: 1
Storage Table Execution Time: 4.000 ms
Planning Time: 0.074 ms
Execution Time: 2.075 ms
Storage Read Requests: 2
Storage Write Requests: 0
Storage Execution Time: 4.000 ms
Peak Memory Usage: 8 kB
(12 rows)
With condition on baz
, the index is still used, skipping the first column:
yugabyte=# explain (analyze, dist, costs off)
SELECT * FROM foo WHERE baz = 4;
QUERY PLAN
---------------------------------------------------------------------------------
Index Scan using foobarbaz on foo (actual time=7.765..69.886 rows=9901 loops=1)
Index Cond: (baz = 4)
Storage Index Read Requests: 10
Storage Index Execution Time: 0.000 ms
Storage Table Read Requests: 10
Storage Table Execution Time: 60.000 ms
Planning Time: 0.062 ms
Execution Time: 70.533 ms
Storage Read Requests: 20
Storage Write Requests: 0
Storage Execution Time: 60.000 ms
Peak Memory Usage: 0 kB
(12 rows)
This last feature, called Hybrid Scan, has other names in other databases: Skip Scan, Jump Scan, Loose Index Scan. It makes compound indexes the best indexing scheme: less overhead on DML, used by many queries.
We can even make this fat index a bit fatter by including all columns selected:
CREATE INDEX foobar_covering ON foo(bar asc, baz asc) include (id);
DROP INDEX foobarbaz;
The queries will read only the index with an Index Only Scan
yugabyte=# explain (analyze, dist, costs off)
SELECT * FROM foo WHERE baz = 4;
QUERY PLAN
--------------------------------------------------------------------------------------------
Index Only Scan using foobar_covering on foo (actual time=2.866..25.780 rows=9901 loops=1)
Index Cond: (baz = 4)
Heap Fetches: 0
Storage Index Read Requests: 10
Storage Index Execution Time: 20.000 ms
Planning Time: 0.086 ms
Execution Time: 26.319 ms
Storage Read Requests: 10
Storage Write Requests: 0
Storage Execution Time: 20.000 ms
Peak Memory Usage: 8 kB
(11 rows)
With YugabyteDB a single index can serve many queries with the optimal access path. If you see BitmapAnd or BitmapOr in PostgreSQL execution plans, you probably need to create different indexes when migrating to YugabyteDB: including all columns, and with range sharding (required to allow skip scans).