Scalable sequences in PostgreSQL / YugabyteDB

Franck Pachot - Mar 14 '23 - - Dev Community

You may be surprised if you come from Oracle Database but the sequence cache in PostgreSQL is per session. Which means that:

  • if you re-connect, you will see a gap. This is not a problem by the way
  • if you have new connections, their first use of the sequence may compete on updating a single row

I described this, with YugabyteDB enhancements, in:

I was showing a workaround to transparently use 8 sequences when calling nextval() with only one name, in order to distribute the updates for the problem exposed above. Here is the code:

create or replace function nextval(name text) returns bigint as $$
declare
 sequence_to_read text:=name;
begin
 if name like '%\%' escape '\' then
  sequence_to_read:=(name||mod(pg_backend_pid(),8));
 end if;
 raise log 'Reading nextval from %',sequence_to_read;
 return nextval(sequence_to_read::regclass);
end;
$$ language plpgsql;

Enter fullscreen mode Exit fullscreen mode

This overrides the nextval() function when passing a text. The builtin nextval() accepts a reglass argument, which is the oid of the sequences. You usually pass the name of the sequence as text but then it does an implicit casting to regclass.

Here, when you pass a name without the special pattern (name ending with %) it just calls the builtin function. However, with the % pattern, I add a random number from 0 to 7 to call one of those sequences. Well... not really random. I take the number from pg_backend_pid() because the goal is to hit different sequences when from different sessions but one session can use the same one as it holds a cache for it.

Of course, you can choose a different pattern. The only thing is that I want it to be fast.

To create the 8 sequences, I just generate the DDL and run it from psql with \gexec:

select format('create sequence if not exists %I cache 100 start with %s increment by 8','seq%'||n,n+1) from generate_series(0,7) n;
\gexec
Enter fullscreen mode Exit fullscreen mode

This generates the following:

yugabyte=> select format('create sequence if not exists %I cache 100 start with %s increment by 8','seq%'||n,n+1) from generate_series(0,7) n;
                                   format
-----------------------------------------------------------------------------
 create sequence if not exists "seq%0" cache 100 start with 1 increment by 8
 create sequence if not exists "seq%1" cache 100 start with 2 increment by 8
 create sequence if not exists "seq%2" cache 100 start with 3 increment by 8
 create sequence if not exists "seq%3" cache 100 start with 4 increment by 8
 create sequence if not exists "seq%4" cache 100 start with 5 increment by 8
 create sequence if not exists "seq%5" cache 100 start with 6 increment by 8
 create sequence if not exists "seq%6" cache 100 start with 7 increment by 8
 create sequence if not exists "seq%7" cache 100 start with 8 increment by 8
(8 rows)
Enter fullscreen mode Exit fullscreen mode

The all increment by 8 and start with a different number so that their nextval() do not overlap. Remember, the goal is still to get a unique number, but avoiding the single row to update when the cache is cold.

You can use that as a normal sequence, the example is in the video. This workaround should not be necessary in future versions of YugabyteDB as we implement server-side caching and push down the update logic.

Note that in YugabyteDB, you don't have the same scalability problems that you find in PostgreSQL or Oracle because we don't use B-Tree indexes and Heap Tables. There is no need to partition the sequence like Scalable Sequences in Oracle Database because new values are appended to the LSM-Tree (no hot block issue) and you will probably use hash sharding on the generated key, even if range sharding is possible if you goal is to collocate rows inserted together.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .