Postgres and Yugabyte connection latency

Frits Hoogland - Oct 3 '22 - - Dev Community

Connecting to PostgreSQL and to YugabyteDB YSQL will have a latency/delay. The reason is the database client must setup a network connection, and let the database create a server side facility to respond to the requests and keep the state of the connection.

Currently, there is a latency difference between creating a connection in PostgreSQL and creating a YugabyteDB YSQL connection. The reason for this difference lies in the fact that the creation of a connection for both PostgreSQL and YSQL creates a new operating system process by forking the postmaster, and builds a number of memory area's for which some of them require loading catalog data. Every backend holds the catalog data relevant to it in its caches individually.

This means exactly the same is happening between PostgreSQL and YSQL. PostgreSQL loads the needed catalog data from the datafile or the database buffer cache. YSQL loads the same catalog data, but because YSQL is a distributed database, we store the catalog data in DocDB at the master.

For one cache, which is the relation cache or 'relcache', PostgreSQL and therefore YSQL is performing a local optimization, which is that it builds the cache image as a file, and uses the file to quickly load the cache, instead of loading the data from the buffer cache or from file. This is described here.

For the catalog cache, or 'catcache', a minimal set of data is loaded to have just enough data for the backend to be able to work, and load more catalog data when needed.

The required catalog data for YSQL currently requires performing RPC calls to the master leader to obtain the data.

The absolute latency for both PostgreSQL and YSQL connection creation is dependent on multiple factors, such as the latency between the database client and the PostgreSQL server, the availability of the catalog data in the database buffer cache and the availability of the datafile data in the operating system page cache and if necessary the latency of doing disk IO.

For the YSQL server, it's the latency between the database client and the YSQL postmaster, the latency between the backend and the tablet server, and the latency between the tablet server and the master leader.

This means that the only way to say something sensible for your own setup, is to test the connection latency. This can be done with connection-profiler. See the BUILD.mdhttps://github.com/fritshoogland-yugabyte/connection-profiler/blob/main/BUILD.md file for installation/building instructions.

Like has been described in Postgres first database connection latency, do not use the latency of the first connection after (postmaster) startup, because this latency will be higher because of the building of the relcache files.

If we perform a first connection to take the relcache files creation hit, and then perform a test, it looks like this on my local VM PostgreSQL setup:

 % ./target/release/connection-profiler --url "host=192.168.66.100 user=postgres password=postgres" -q '' -n 5
create_connection                              2868 us
------------------------------------------------------------
============================================================
create_connection                              2659 us
------------------------------------------------------------
============================================================
create_connection                              2759 us
------------------------------------------------------------
============================================================
create_connection                              2582 us
------------------------------------------------------------
============================================================
create_connection                              2377 us
------------------------------------------------------------
============================================================
Enter fullscreen mode Exit fullscreen mode

If I then take my local YugabyteDB VM setup, it looks like this:

% ./target/release/connection-profiler --url "host=192.168.66.80 user=yugabyte password=yugabyte port=5433" -q '' -n 5
create_connection                             13131 us
------------------------------------------------------------
============================================================
create_connection                             18530 us
------------------------------------------------------------
============================================================
create_connection                             16197 us
------------------------------------------------------------
============================================================
create_connection                             14285 us
------------------------------------------------------------
============================================================
create_connection                             15840 us
------------------------------------------------------------
============================================================
Enter fullscreen mode Exit fullscreen mode

Again: do not take these latencies as the absolute truth, but test them out in your situation to understand your actual latency.

YSQL has higher latency, no doubt about that. But the point I want to make that in both cases there is latency. The absolute best way to solve the issue of latency in both situations is to make sure you are using a connection pool so that the client doesn't need to create a new connection.

If that is not possible, there is the opportunity to use an external connection pooler, of which pg-bouncer is a common choice. Please note that even though pg-bouncer can lower the latency, it doesn't take away the problem of having to perform a connection, it just makes it a lower latency action.

PostgreSQL:

% ./target/release/connection-profiler --url "host=192.168.66.100 user=postgres password=postgres port=6432" -q '' -n 5
create_connection                              1455 us
------------------------------------------------------------
============================================================
create_connection                               989 us
------------------------------------------------------------
============================================================
create_connection                               725 us
------------------------------------------------------------
============================================================
create_connection                               645 us
------------------------------------------------------------
============================================================
create_connection                               627 us
------------------------------------------------------------
============================================================
Enter fullscreen mode Exit fullscreen mode

YSQL:

% ./target/release/connection-profiler --url "host=192.168.66.80 user=yugabyte password=yugabyte port=6432" -q '' -n 5
create_connection                              1443 us
------------------------------------------------------------
============================================================
create_connection                               629 us
------------------------------------------------------------
============================================================
create_connection                               639 us
------------------------------------------------------------
============================================================
create_connection                               551 us
------------------------------------------------------------
============================================================
create_connection                               484 us
------------------------------------------------------------
============================================================
Enter fullscreen mode Exit fullscreen mode

What this shows, is that with pg-bouncer, the latency can be lowered from having to create new postgres connection to connecting to pg-bouncer, which can link the new network connection to a previously created and still existing connection.

What is also interesting to see, is that by using pg-bouncer, not only the YSQL connection creation latency gets lower than creating a regular PostgreSQL connection, but also that connection creation using pg-bouncer for both PostgreSQL as well as YSQL gets equal. Obviously, the above latencies are strictly for my situation, and should be tested for your actual situation.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .