One fat index, or two indexes on each columns? PostgreSQL vs. YugabyteDB

Franck Pachot - Jun 14 '23 - - Dev Community

Let's say you have a query with two conditions, on two columns: WHERE bar = 2 AND baz = 4. It can use an index on the two columns, like (bar asc, baz asc) or, in some databases, combine two indexes, one on (bar asc) and the other on (baz asc).

If this combination is not possible, the most selective index will be chosen. The two indexes solution may seem better because they will be used not only for WHERE bar = 2 AND baz = 4 but also for WHERE bar = 2 or WHERE baz = 4 alone. However, two indexes to maintain is more expensive for DML and combining with bitmap the two indexes is expensive.

Let's take an example, which comes from the following discussion Why PostgreSQL High Availability Matters and How to Achieve It which compares Spanner, PostgreSQL, YugabyteDB and others.

PostgreSQL with two indexes: Bitmap Scan

I create the following table with two indexes, one for each column:

CREATE TABLE foo (id bigserial, bar int, baz int);
CREATE INDEX foobar ON foo(bar asc);
CREATE INDEX foobaz ON foo(baz asc);
insert into foo (bar, baz) select n%97, n%101
     from generate_series(1,1000000) n;
vacuum analyze foo;
Enter fullscreen mode Exit fullscreen mode

A query with a condition on bar reads the first index:

postgres=#     explain (analyze, buffers, costs off)
               SELECT * FROM foo WHERE bar = 2 ;

    explain (analyze, buffers, costs off)
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Bitmap Heap Scan on foo (actual time=1.058..6.768 rows=10310 loops=1)
   Recheck Cond: (bar = 2)
   Heap Blocks: exact=5406
   Buffers: shared hit=5417
   ->  Bitmap Index Scan on foobar (actual time=0.474..0.474 rows=10310 loops=1)
         Index Cond: (bar = 2)
         Buffers: shared hit=11
 Planning Time: 0.054 ms
 Execution Time: 7.238 ms
(9 rows)
Enter fullscreen mode Exit fullscreen mode

And the other index for a condition on baz:

postgres=#     explain (analyze, buffers, costs off)
               SELECT * FROM foo WHERE  baz = 4;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Bitmap Heap Scan on foo (actual time=1.098..6.980 rows=9901 loops=1)
   Recheck Cond: (baz = 4)
   Heap Blocks: exact=5405
   Buffers: shared hit=5415
   ->  Bitmap Index Scan on foobaz (actual time=0.516..0.516 rows=9901 loops=1)
         Index Cond: (baz = 4)
         Buffers: shared hit=10
 Planning Time: 0.060 ms
 Execution Time: 7.425 ms
(9 rows)
Enter fullscreen mode Exit fullscreen mode

Those two can be combined for a condition on bar and baz:

postgres=#    explain (analyze, buffers, costs off)
              SELECT * FROM foo WHERE bar = 2 AND baz = 4;

                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Bitmap Heap Scan on foo (actual time=1.154..1.269 rows=102 loops=1)
   Recheck Cond: ((baz = 4) AND (bar = 2))
   Heap Blocks: exact=102
   Buffers: shared hit=123
   ->  BitmapAnd (actual time=1.130..1.131 rows=0 loops=1)
         Buffers: shared hit=21
         ->  Bitmap Index Scan on foobaz (actual time=0.483..0.483 rows=9901 loops=1)
               Index Cond: (baz = 4)
               Buffers: shared hit=10
         ->  Bitmap Index Scan on foobar (actual time=0.495..0.495 rows=10310 loops=1)
               Index Cond: (bar = 2)
               Buffers: shared hit=11
 Planning Time: 0.069 ms
 Execution Time: 1.292 ms
(14 rows)
Enter fullscreen mode Exit fullscreen mode

The two indexes can serve all three queries. With the two conditions, the bitmaps from each indexes are combined (BitmapAnd) before fetching the rows from the table. It has a little overhead but, anyway, PostgreSQL uses a Bitmap Scan to mitigate the cost of random reads to the Heap Table.

PostgreSQL with one index: used only on prefix

I drop those two indexes to create one on both columns:

DROP INDEX foobar;
DROP INDEX foobaz;
CREATE INDEX foobarbaz ON foo(bar asc, baz asc);
Enter fullscreen mode Exit fullscreen mode

With condition on bar, the index starting with it is used:

postgres=#     explain (analyze, buffers, costs off)
               SELECT * FROM foo WHERE bar = 2 ;

    explain (analyze, buffers, costs off)                                     QUERY PLAN
------------------------------------------------------------------------------------
 Bitmap Heap Scan on foo (actual time=1.144..6.833 rows=10310 loops=1)
   Recheck Cond: (bar = 2)
   Heap Blocks: exact=5406
   Buffers: shared hit=5406 read=12
   ->  Bitmap Index Scan on foobarbaz (actual time=0.532..0.532 rows=10310 loops=1)
         Index Cond: (bar = 2)
         Buffers: shared read=12
 Planning:
   Buffers: shared hit=12
 Planning Time: 0.098 ms
 Execution Time: 7.303 ms
(11 rows)

Enter fullscreen mode Exit fullscreen mode

With condition on both columns, the index is still used:

postgres=#    explain (analyze, buffers, costs off)
              SELECT * FROM foo WHERE bar = 2 AND baz = 4;

                                    QUERY PLAN
----------------------------------------------------------------------------------
 Bitmap Heap Scan on foo (actual time=0.031..0.148 rows=102 loops=1)
   Recheck Cond: ((bar = 2) AND (baz = 4))
   Heap Blocks: exact=102
   Buffers: shared hit=105
   ->  Bitmap Index Scan on foobarbaz (actual time=0.019..0.019 rows=102 loops=1)
         Index Cond: ((bar = 2) AND (baz = 4))
         Buffers: shared hit=3
 Planning Time: 0.064 ms
 Execution Time: 0.168 ms
(9 rows)
Enter fullscreen mode Exit fullscreen mode

However, with condition on baz, the index cannot be used:

postgres=#     explain (analyze, buffers, costs off)
               SELECT * FROM foo WHERE  baz = 4;

                                  QUERY PLAN
------------------------------------------------------------------------------
 Gather (actual time=0.220..23.140 rows=9901 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=5406
   ->  Parallel Seq Scan on foo (actual time=0.009..20.276 rows=3300 loops=3)
         Filter: (baz = 4)
         Rows Removed by Filter: 330033
         Buffers: shared hit=5406
 Planning Time: 0.058 ms
 Execution Time: 23.622 ms
(10 rows)
Enter fullscreen mode Exit fullscreen mode

PostgreSQL has no way to skip the first column of the index and need to full scan the table.

YugabyteDB with two indexes: only one index used

I'll run the same on a YugabyteDB cluster, distributed over multiple nodes with replication for high availability. The goal is not to compare the performance (there's some network latency involved here) but the scalability of the execution plan.

I create the following table with two indexes, on each column:

CREATE TABLE foo (id bigserial, bar int, baz int);
CREATE INDEX foobar ON foo(bar asc);
CREATE INDEX foobaz ON foo(baz asc);
insert into foo (bar, baz) select n%97, n%101
     from generate_series(1,1000000) n;
analyze foo;
set yb_enable_optimizer_statistics=on;
Enter fullscreen mode Exit fullscreen mode

With condition on bar the first index is used:

yugabyte=#     explain (analyze, dist, costs off)
               SELECT * FROM foo WHERE bar = 2 ;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Index Scan using foobar on foo (actual time=11.612..91.885 rows=10310 loops=1)
   Index Cond: (bar = 2)
   Storage Index Read Requests: 11
   Storage Index Execution Time: 0.000 ms
   Storage Table Read Requests: 11
   Storage Table Execution Time: 88.000 ms
 Planning Time: 4.645 ms
 Execution Time: 92.575 ms
 Storage Read Requests: 22
 Storage Write Requests: 0
 Storage Execution Time: 88.000 ms
 Peak Memory Usage: 0 kB
(12 rows)
Enter fullscreen mode Exit fullscreen mode

With condition on baz the second index is used:

yugabyte=#     explain (analyze, dist, costs off)
               SELECT * FROM foo WHERE  baz = 4;

                                   QUERY PLAN
-------------------------------------------------------------------------------
 Index Scan using foobaz on foo (actual time=10.489..88.123 rows=9901 loops=1)
   Index Cond: (baz = 4)
   Storage Index Read Requests: 10
   Storage Index Execution Time: 0.000 ms
   Storage Table Read Requests: 10
   Storage Table Execution Time: 76.000 ms
 Planning Time: 0.783 ms
 Execution Time: 89.305 ms
 Storage Read Requests: 20
 Storage Write Requests: 0
 Storage Execution Time: 76.000 ms
 Peak Memory Usage: 0 kB
(12 rows)
Enter fullscreen mode Exit fullscreen mode

There is no Bitmap Scan in YugabyteDB. PostgreSQL bitmaps are on the table pages, read through shared buffers in the single-server's memory. This cannot scale horizontally. YugabyteDB distributes the rows and do not work with pages but rows, as key-value documents. To avoid random reads, YugabyteDB leverages index-organized tables (rows stored in the primary key structure) and Index Only Scan (with no need to read the visibility from the table).

However, that means that there's no bitmap combination and, with the condition on bar and baz, only one index is used:

yugabyte=#    explain (analyze, dist, costs off)
              SELECT * FROM foo WHERE bar = 2 AND baz = 4;

                                  QUERY PLAN
------------------------------------------------------------------------------
 Index Scan using foobaz on foo (actual time=11.121..88.236 rows=102 loops=1)
   Index Cond: (baz = 4)
   Remote Filter: (bar = 2)
   Storage Index Read Requests: 10
   Storage Index Execution Time: 0.000 ms
   Storage Table Read Requests: 10
   Storage Table Execution Time: 88.000 ms
 Planning Time: 0.081 ms
 Execution Time: 88.299 ms
 Storage Read Requests: 20
 Storage Write Requests: 0
 Storage Execution Time: 88.000 ms
 Peak Memory Usage: 19 kB
(13 rows)
Enter fullscreen mode Exit fullscreen mode

The most selective index is chosen and the other condition is the pushdown as Remote Filter to avoid sending the discarded rows between nodes.

YugabyteDB with one index: used by queries

I drop those indexes to create one on both columns:

DROP INDEX foobar;
DROP INDEX foobaz;
CREATE INDEX foobarbaz ON foo(bar asc, baz asc);
Enter fullscreen mode Exit fullscreen mode

With the condition on bar, the index, starting with it, is used:

yugabyte=#     explain (analyze, dist, costs off)
               SELECT * FROM foo WHERE bar = 2 ;

                                     QUERY PLAN
-----------------------------------------------------------------------------------
 Index Scan using foobarbaz on foo (actual time=11.059..79.501 rows=10310 loops=1)
   Index Cond: (bar = 2)
   Storage Index Read Requests: 11
   Storage Index Execution Time: 4.000 ms
   Storage Table Read Requests: 11
   Storage Table Execution Time: 68.000 ms
 Planning Time: 0.065 ms
 Execution Time: 80.151 ms
 Storage Read Requests: 22
 Storage Write Requests: 0
 Storage Execution Time: 72.000 ms
 Peak Memory Usage: 0 kB
(12 rows)
Enter fullscreen mode Exit fullscreen mode

With condition on bar and baz the index, which includes both, is used:

yugabyte=#    explain (analyze, dist, costs off)
              SELECT * FROM foo WHERE bar = 2 AND baz = 4;

                                  QUERY PLAN
-------------------------------------------------------------------------------
 Index Scan using foobarbaz on foo (actual time=1.998..2.040 rows=102 loops=1)
   Index Cond: ((bar = 2) AND (baz = 4))
   Storage Index Read Requests: 1
   Storage Index Execution Time: 0.000 ms
   Storage Table Read Requests: 1
   Storage Table Execution Time: 4.000 ms
 Planning Time: 0.074 ms
 Execution Time: 2.075 ms
 Storage Read Requests: 2
 Storage Write Requests: 0
 Storage Execution Time: 4.000 ms
 Peak Memory Usage: 8 kB
(12 rows) 
Enter fullscreen mode Exit fullscreen mode

With condition on baz, the index is still used, skipping the first column:

yugabyte=#     explain (analyze, dist, costs off)
               SELECT * FROM foo WHERE  baz = 4;

                                   QUERY PLAN
---------------------------------------------------------------------------------
 Index Scan using foobarbaz on foo (actual time=7.765..69.886 rows=9901 loops=1)
   Index Cond: (baz = 4)
   Storage Index Read Requests: 10
   Storage Index Execution Time: 0.000 ms
   Storage Table Read Requests: 10
   Storage Table Execution Time: 60.000 ms
 Planning Time: 0.062 ms
 Execution Time: 70.533 ms
 Storage Read Requests: 20
 Storage Write Requests: 0
 Storage Execution Time: 60.000 ms
 Peak Memory Usage: 0 kB
(12 rows)
Enter fullscreen mode Exit fullscreen mode

This last feature, called Hybrid Scan, has other names in other databases: Skip Scan, Jump Scan, Loose Index Scan. It makes compound indexes the best indexing scheme: less overhead on DML, used by many queries.

We can even make this fat index a bit fatter by including all columns selected:

CREATE INDEX foobar_covering ON foo(bar asc, baz asc) include (id);
DROP INDEX foobarbaz;
Enter fullscreen mode Exit fullscreen mode

The queries will read only the index with an Index Only Scan

yugabyte=#     explain (analyze, dist, costs off)
               SELECT * FROM foo WHERE  baz = 4;

                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Index Only Scan using foobar_covering on foo (actual time=2.866..25.780 rows=9901 loops=1)
   Index Cond: (baz = 4)
   Heap Fetches: 0
   Storage Index Read Requests: 10
   Storage Index Execution Time: 20.000 ms
 Planning Time: 0.086 ms
 Execution Time: 26.319 ms
 Storage Read Requests: 10
 Storage Write Requests: 0
 Storage Execution Time: 20.000 ms
 Peak Memory Usage: 8 kB
(11 rows)
Enter fullscreen mode Exit fullscreen mode

With YugabyteDB a single index can serve many queries with the optimal access path. If you see BitmapAnd or BitmapOr in PostgreSQL execution plans, you probably need to create different indexes when migrating to YugabyteDB: including all columns, and with range sharding (required to allow skip scans).

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .