Parallel Scan in YugabyteDB (ysql_select_parallelism=-1)

Franck Pachot - May 24 '23 - - Dev Community

When you connect to YugabyteDB YSQL (the PostgreSQL compatible API) your session executes the SQL statement from one process (the PostgreSQL backend parsing the query and executing the plan) which reads and writes, by batch of rows, from/to the tablet servers where the rows are distributed.
In some cases, it makes sense to send multiple batches of operations at the same time, so that they are processed in parallel in multiple tablet servers. In some other cases, when those operations return a lot of rows for example, this may not be efficient because those rows will be finally processed by a single process on the SQL layer. In short, scanning in parallel is efficient when combined with the pushdowns of filters and aggregations. Note that the sharding method also matters: if rows are expected to return sorted. This low-level optimization happens only on hash sharded tables from which no specific ordering is expected.

Parallelism level

The number of operations that are executed in parallel is defined by the cluster parameter --ysql_select_parallelism which defaults to -1. With this setting, the parallelism is calculated from the number of tablet servers with min(max(tservers * 2), 1), 16). With a value of

Here is how it decodes:

ysql_select_parallelism tserver count parallelism level
<0 (default -1) 1 2
<0 (default -1) 2 4
<0 (default -1) 3 6
<0 (default -1) 4 8
<0 (default -1) 5 10
<0 (default -1) 6 12
<0 (default -1) 7 14
<0 (default -1) 8 or more 16
0 Any ERROR: Operation list must not be empty
>0 Any ysql_select_parallelism

In order to test it I've setup a YugabyteDB Managed cluster with 9 tablet servers:

Image description

Example

I'm running this on YugabyteDB 2.18 where this parallelization is done only on hash-sharded tables or indexes (because row ordering may be expected on range sharding), and where there's some pushdown (Partial Aggregate or Remote Filter) for which it makes sense to scan in parallel and return a reduced set to the PostgreSQL backend. This happens in PgGate which is the layer between the PostgreSQL execution plan and DocDB distributed storage. Happening at this level, it is not directly visible from the execution plan. However, the response time gives an idea, and here is an example

with multiple tablets

I'm creating a hash sharded table (the default) with multiple tablets (distributed to multiple tservers) and query with a count() pushdown:

yugabyte=> drop table if exists demo;
DROP TABLE
yugabyte=> create table demo ( id bigint primary key)
           split into 12 tablets;
CREATE TABLE
yugabyte=> insert into demo select generate_series(1,10000000) id;
INSERT 0 10000000

yugabyte=> select yb_table_properties('demo'::regclass) 
           , (select count(*) from yb_servers());

 yb_table_properties | count 
---------------------+-------
 (12,1,f,,)          |     9
(1 row)

yugabyte=> explain analyze select count(id) from demo;
                                                   QUERY PLAN                                                   
----------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=1312.134..1312.134 rows=1 loops=1)
   ->  Seq Scan on demo  (cost=0.00..100.00 rows=1000 width=8) (actual time=1312.094..1312.102 rows=12 loops=1)
         Partial Aggregate: true
 Planning Time: 3.059 ms
 Execution Time: 1312.458 ms
 Peak Memory Usage: 64 kB
(6 rows)

Enter fullscreen mode Exit fullscreen mode

The response time is approximately one second to count 10 million rows.

with one tablet

I'm running the same with only one tablet, which leaves no room to parallel scan:

yugabyte=> drop table if exists demo;
DROP TABLE
yugabyte=> create table demo ( id bigint primary key)
           split into 1 tablets;
CREATE TABLE
yugabyte=> insert into demo select generate_series(1,10000000) id;
INSERT 0 10000000

yugabyte=> select yb_table_properties('demo'::regclass) 
           , (select count(*) from yb_servers());

 yb_table_properties | count 
---------------------+-------
 (1,1,f,,)          |     9
(1 row)

yugabyte=> explain analyze select count(id) from demo;     

                                               QUERY PLAN                                                    
------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=8298.744..8298.744 rows=1 loops=1)
   ->  Seq Scan on demo  (cost=0.00..100.00 rows=1000 width=8) (actual time=8298.732..8298.735 rows=1 loops=1)
         Partial Aggregate: true
 Planning Time: 0.039 ms
 Execution Time: 8298.819 ms
 Peak Memory Usage: 14 kB
(6 rows)

Enter fullscreen mode Exit fullscreen mode

This has a higher response time and proves that the parallel scan was efficient to reduce the response time. Note that this is different from PostgreSQL Parallel Query which distributes the query processing to multiple backends. Currently, this is disabled in YugabyteDB.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .