YugabyteDB Skip Scan aka Loose Index Scan on compound index

Franck Pachot - Oct 19 '22 - - Dev Community

In a previous post I tested how an Index Scan can skip on the first index column to access ranges on the second column. This required that the first column was split by range.

Here, I am testing something similar but with hash sharding. I have an equality predicate on the hash column, and skip the second one. Here is what I've run and I'll explain the result later:

\! curl -s https://raw.githubusercontent.com/FranckPachot/ybdemo/main/docker/yb-lab/client/ybwr.sql | grep -v '\watch' > ybwr.sql
\i ybwr.sql
create table tbl ( A int, B int, C int );
insert into  tbl select 0,mod(m,10),m from generate_series(1,1000000) m;
create index i1 on tbl(A hash,B asc) INCLUDE (C);
create index i2 on tbl(A hash,B asc,C asc);
set yb_enable_expression_pushdown=on;
execute snap_table;
/*+ indexonlyscan(tbl i1) */ explain analyze select C from tbl where A =  0 and C = 900042;
execute snap_table;
/*+ indexonlyscan(tbl i2) */ explain analyze select C from tbl where A =  0 and C = 900042;
execute snap_table;
/*+ indexonlyscan(tbl i1) */ explain analyze select C from tbl where A =  0 and C > 900042;
execute snap_table;
/*+ indexonlyscan(tbl i2) */ explain analyze select C from tbl where A =  0 and C > 900042;
execute snap_table;
Enter fullscreen mode Exit fullscreen mode

The idea is to compare two covering indexes where one (i2) has all columns as the index key and the other (i1) includes the last column where it is not part of the key. I query with an equality predicate on the first column (A) which is hashed, and a range predicate on the third column (C), and have no predicate on the second column (B). I have only few distinct values for the second column (B) which is where Skip Scan makes the most sense.

point query

The first test will filter on A = 0 and C = 900042

INCLUDE covering index

yugabyte=# /*+ indexonlyscan(tbl i1) */ explain analyze select C from tbl where A =  0 and C = 900042;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Index Only Scan using i1 on tbl  (cost=0.00..15.75 rows=100 width=4) (actual time=3605.845..3605.848 rows=1 loops=1)
   Index Cond: (a = 0)
   Remote Filter: (c = 900042)
   Heap Fetches: 0
 Planning Time: 0.123 ms
 Execution Time: 3605.891 ms
 Peak Memory Usage: 8 kB
(7 rows)

yugabyte=# execute snap_table;
 rocksdb_seek | rocksdb_next | rocksdb_insert | dbname / relname / tserver / tabletid / leader
--------------+--------------+----------------+------------------------------------------------
            1 |              |                | yugabyte i1 10.0.0.61 9c0a2dd1193e... L
            1 |       999999 |                | yugabyte i1 10.0.0.62 359f44ea1136... L
            1 |              |                | yugabyte i1 10.0.0.63 61591ddbdd81... L
(3 rows)
Enter fullscreen mode Exit fullscreen mode

Here, with the INCLUDE(C) index, all rows for (a = 0) have been read because there's no point access to (c = 900042)

Key covering index

yugabyte=# /*+ indexonlyscan(tbl i2) */ explain analyze select C from tbl where A =  0 and C = 900042;
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Only Scan using i2 on tbl  (cost=0.00..15.50 rows=100 width=4) (actual time=1.040..1.041 rows=1 loops=1)
   Index Cond: ((a = 0) AND (c = 900042))
   Heap Fetches: 0
 Planning Time: 0.108 ms
 Execution Time: 1.070 ms
 Peak Memory Usage: 8 kB
(6 rows)

yugabyte=# execute snap_table;
 rocksdb_seek | rocksdb_next | rocksdb_insert | dbname / relname / tserver / tabletid / leader
--------------+--------------+----------------+------------------------------------------------
           21 |           41 |                | yugabyte i2 10.0.0.63 135697ffc832... L
(1 row)
Enter fullscreen mode Exit fullscreen mode

When the column is in the index key, we can skip directly to it in the LSM-Tree structure. As there are multiple values for the second column if the index (B), there are multiple skip operations. This is efficient when the skipped column has not too many distinct values. Basically, the index access reads from multiple points in one operation.

range query

The second test will filter on A = 0 and C > 900042

INCLUDE covering index

yugabyte=# /*+ indexonlyscan(tbl i1) */ explain analyze select C from tbl where A =  0 and C > 900042;
execute snap_table;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using i1 on tbl  (cost=0.00..15.75 rows=100 width=4) (actual time=39.337..3799.021 rows=99958 loops=1)
   Index Cond: (a = 0)
   Remote Filter: (c > 900042)
   Heap Fetches: 0
 Planning Time: 0.106 ms
 Execution Time: 3808.669 ms
 Peak Memory Usage: 8 kB
(7 rows)

yugabyte=# execute snap_table;
 rocksdb_seek | rocksdb_next | rocksdb_insert | dbname / relname / tserver / tabletid / leader
--------------+--------------+----------------+------------------------------------------------
            1 |              |                | yugabyte i1 10.0.0.61 9c0a2dd1193e... L
           98 |      1000096 |                | yugabyte i1 10.0.0.62 359f44ea1136... L
            1 |              |                | yugabyte i1 10.0.0.63 61591ddbdd81... L
(3 rows)
Enter fullscreen mode Exit fullscreen mode

As the condition on C cannot be optimized with skip operation, this is the same as above: the INCLUDE index is good to have an Index Only Scan, not going to the table to eliminate more rows, but filtering is done after reading all index entries. Note that Remote Filter is an optimization where YugabyteDB pushes the filtering to the storage, but it still has to read all index entries.

Key covering index

yugabyte=# /*+ indexonlyscan(tbl i2) */ explain analyze select C from tbl where A =  0 and C > 900042;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Index Only Scan using i2 on tbl  (cost=0.00..15.50 rows=100 width=4) (actual time=6.563..595.860 rows=99958 loops=1)
   Index Cond: ((a = 0) AND (c > 900042))
   Heap Fetches: 0
 Planning Time: 0.109 ms
 Execution Time: 599.628 ms
 Peak Memory Usage: 8 kB
(6 rows)

yugabyte=# execute snap_table;
 rocksdb_seek | rocksdb_next | rocksdb_insert | dbname / relname / tserver / tabletid / leader
--------------+--------------+----------------+------------------------------------------------
          108 |       100074 |                | yugabyte i2 10.0.0.63 135697ffc832... L
(1 row)
Enter fullscreen mode Exit fullscreen mode

When the covering column used by the predicate is in the index key, this is optimized to read only what is necessary

Conclusion

A covering index holds all required columns in the index entry so that there is no need to go to the table. This is efficient when reading lot of rows by index, especially for the columns where there is some filtering on it, to discard as many as possible before going to the table

Then, the covered columns can be in the index key, or added as INCLUDE.

INCLUDE is mandatory for unique indexes where those columns are not part of the unique key. It may also be preferred where there are updates on those columns as the index entry location in the LSM-Tree will not change.

However, when there is some point or range condition on those columns, they should be in the index keys to benefit from LSM-Tree skip operation to get to the right point. And this, even if there are additional columns in between, ascending on descending, because YugabyteDB can skip on those.

This gives flexibility when defining indexes that can serve multiple queries. Usually, it is considered to put the most selective columns first, but if there are columns with few distinct values, and ASC or DESC, they may be better placed in front of the selective ones so that you have an index that can serve multiple queries. In this example, in addition to serving ((a = 0) AND (c > 900042)) efficiently, my index can also serve queries where A=0 and B=0 which would not be efficient with an index on (A HASH, C ASC) only.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .