In this post I'll explain, with examples, two pgbench options used when running concurrent activity: --client and --jobs. I find their names misleading: --client is about the number of server sessions and --jobs is about the number of client threads 🤨
Added after listening to Creston Jamison's review:
For a benchmarking tool, I like to think about resources and where they are allocated. A database connection (--client) takes resources on the database server, and an application thread (--jobs) takes resources on the database client. That's why I find those names misleading.
I'm using the short options in the examples below; here are the long equivalents:
pgbench --help | grep -E -- " -[stTcjfn],"
-n, --no-vacuum do not run VACUUM during initialization
-s, --scale=NUM scaling factor
-f, --file=FILENAME[@W] add script FILENAME weighted at W (default: 1)
-c, --client=NUM number of concurrent database clients (default: 1)
-j, --jobs=NUM number of threads (default: 1)
-n, --no-vacuum do not run VACUUM before tests
-s, --scale=NUM report this scale factor in output
-t, --transactions=NUM number of transactions each client runs (default: 10)
-T, --time=NUM duration of benchmark test in seconds
In order to run something simple and predictable, I'll use a custom script which simply waits one second in the database, select pg_sleep(1), and because I like one-liners I pass it through STDIN:
pgbench -T 30 -nf /dev/stdin <<< "select pg_sleep(1)"
transaction type: /dev/stdin
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 10
number of transactions actually processed: 10/10
latency average = 1005.139 ms
tps = 0.994887 (including connections establishing)
tps = 0.995271 (excluding connections establishing)
I've set it to run for 30 seconds and, unsurprisingly, it ran at about 1 transaction per second, given that I have 1 thread running 10 transactions through 1 client connection. Those are the defaults, -j 1 -c 1. I'll now run with different values.
-c
--client
number of concurrent database clients (default: 1)
This is the most important option for controlling the load on the database. Each client is a connection to the DB, which means a backend process, and the transactions of different clients are executed concurrently. Let's run the same as above, now with 2 clients:
pgbench -j 1 -c 2 -T 30 -nf /dev/stdin <<< "select pg_sleep(1)"
transaction type: /dev/stdin
scaling factor: 1
query mode: simple
number of clients: 2
number of threads: 1
duration: 30 s
number of transactions actually processed: 60
latency average = 1004.676 ms
tps = 1.990692 (including connections establishing)
tps = 1.990951 (excluding connections establishing)
I can achieve 2 transactions per second now, still with the 1 second latency query. Increasing the number of clients will linearly increase the throughput if there are no bottlenecks elsewhere. This is where we say "it scales".
Because a pg_sleep(1) doesn't take a lot of resources, I still have nearly the same latency with 100 connections:
pgbench -j 1 -c 100 -T 30 -nf /dev/stdin <<< "select pg_sleep(1)"
transaction type: /dev/stdin
scaling factor: 1
query mode: simple
number of clients: 100
number of threads: 1
duration: 30 s
number of transactions actually processed: 2900
latency average = 1056.748 ms
tps = 94.629933 (including connections establishing)
tps = 94.652406 (excluding connections establishing)
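The relation between the number of clients, the latency and the throughput in these runs can be checked with a small calculation. This is only a sketch: expected_tps is a hypothetical helper of mine, not a pgbench feature, and it assumes that every client always has exactly one transaction in flight, so that throughput is simply clients divided by latency.

```shell
# Hypothetical helper (not part of pgbench): estimate tps as
# clients / latency, assuming each client always has one transaction in flight.
expected_tps() {
  clients=$1
  latency_ms=$2
  awk -v c="$clients" -v l="$latency_ms" 'BEGIN { printf "%.2f\n", c * 1000 / l }'
}

expected_tps 2 1004.676     # values from the 2-client run
expected_tps 100 1056.748   # values from the 100-client run
```

This prints 1.99 and 94.63, which is in line with the tps reported by pgbench for those two runs.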
Asynchronous libpq
But you can see that I still have the default of 1 thread (--jobs=1) here. How can I run transactions concurrently through 100 connections (aka server backends, aka --client) from 1 client thread (aka --jobs)?
Here is a trace of interesting system calls about the communication with the database, with 3 clients from 1 thread:
strace -T -s 1000 -e trace=sendto,recvfrom,pselect6 -yy -o /dev/stdout pgbench -j 1 -c 3 -t 1 -nf /dev/stdin <<< "select pg_sleep(1)" | grep 5432
sendto(3<TCPv6:[[::1]:41360->[::1]:5432]>, "Q\0\0\0\30select pg_sleep(1)\n\0", 25, MSG_NOSIGNAL, NULL, 0) = 25 <0.000116>
recvfrom(3<TCPv6:[[::1]:41360->[::1]:5432]>, 0xaaac8fb02dd0, 16384, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000008>
sendto(4<TCPv6:[[::1]:41362->[::1]:5432]>, "Q\0\0\0\30select pg_sleep(1)\n\0", 25, MSG_NOSIGNAL, NULL, 0) = 25 <0.000037>
recvfrom(4<TCPv6:[[::1]:41362->[::1]:5432]>, 0xaaac8fb11750, 16384, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000007>
sendto(5<TCPv6:[[::1]:41364->[::1]:5432]>, "Q\0\0\0\30select pg_sleep(1)\n\0", 25, MSG_NOSIGNAL, NULL, 0) = 25 <0.000026>
recvfrom(5<TCPv6:[[::1]:41364->[::1]:5432]>, 0xaaac8fb1c180, 16384, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000005>
pselect6(6, [3<TCPv6:[[::1]:41360->[::1]:5432]> 4<TCPv6:[[::1]:41362->[::1]:5432]> 5<TCPv6:[[::1]:41364->[::1]:5432]>], NULL, NULL, NULL, NULL) = 2 (in [4 5]) <1.001285>
recvfrom(4<TCPv6:[[::1]:41362->[::1]:5432]>, "T\0\0\0!\0\1pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0\n\0\1\0\0\0\0C\0\0\0\rSELECT 1\0Z\0\0\0\5I", 16384, 0, NULL, NULL) = 65 <0.000009>
sendto(4<TCPv6:[[::1]:41362->[::1]:5432]>, "X\0\0\0\4", 5, MSG_NOSIGNAL, NULL, 0) = 5 <0.000327>
recvfrom(5<TCPv6:[[::1]:41364->[::1]:5432]>, "T\0\0\0!\0\1pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0\n\0\1\0\0\0\0C\0\0\0\rSELECT 1\0Z\0\0\0\5I", 16384, 0, NULL, NULL) = 65 <0.000009>
sendto(5<TCPv6:[[::1]:41364->[::1]:5432]>, "X\0\0\0\4", 5, MSG_NOSIGNAL, NULL, 0) = 5 <0.000029>
pselect6(4, [3<TCPv6:[[::1]:41360->[::1]:5432]>], NULL, NULL, NULL, NULL) = 1 (in [3]) <0.000009>
recvfrom(3<TCPv6:[[::1]:41360->[::1]:5432]>, "T\0\0\0!\0\1pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0\n\0\1\0\0\0\0C\0\0\0\rSELECT 1\0Z\0\0\0\5I", 16384, 0, NULL, NULL) = 65 <0.000014>
sendto(3<TCPv6:[[::1]:41360->[::1]:5432]>, "X\0\0\0\4", 5, MSG_NOSIGNAL, NULL, 0) = 5 <0.000035>
I can clearly see 3 calls to sendto(...->...5432...Q...select pg_sleep(1) returning immediately. Then pselect6(...:5432...:5432...:5432)...<1.001285> waits for the first response from one of them, which takes 1 second. The results are then received with recvfrom(...) from each connection.
Those are asynchronous calls, and I know many developers who have been waiting for this for years in other databases.
If I add -k to strace I can get the call stack:
sendto(5<TCPv6:[[::1]:42120->[::1]:5432]>, "Q\0\0\0\30select pg_sleep(1)\n\0", 25, MSG_NOSIGNAL, NULL, 0) = 25 <0.000031>
> /usr/lib64/libpthread-2.28.so(__send+0x34) [0x11a2c]
> /usr/lib64/libpq.so.5.13(pqsecure_raw_write+0x6f) [0x1f52f]
> /usr/lib64/libpq.so.5.13(pqSendSome+0x77) [0x19547]
> /usr/lib64/libpq.so.5.13(PQsendQuery+0x7b) [0x14f3b]
> /usr/bin/pgbench(threadRun+0x12e7) [0x84b7]
> /usr/bin/pgbench(main+0x16a7) [0x4867]
> /usr/lib64/libc-2.28.so(__libc_start_main+0xe3) [0x20e63]
> /usr/bin/pgbench(_start+0x33) [0x5653]
> /usr/bin/pgbench(_start+0x33) [0x5653]
> No DWARF information found
recvfrom(5<TCPv6:[[::1]:42120->[::1]:5432]>, 0xaaadd068c180, 16384, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.000004>
> /usr/lib64/libpthread-2.28.so(recv+0x34) [0x11864]
> /usr/lib64/libpq.so.5.13(pqsecure_raw_read+0x3b) [0x1f233]
> /usr/lib64/libpq.so.5.13(pqReadData+0xab) [0x192fb]
> /usr/lib64/libpq.so.5.13(PQconsumeInput+0x23) [0x1536b]
> /usr/bin/pgbench(threadRun+0x853) [0x7a23]
> /usr/bin/pgbench(main+0x16a7) [0x4867]
> /usr/lib64/libc-2.28.so(__libc_start_main+0xe3) [0x20e63]
> /usr/bin/pgbench(_start+0x33) [0x5653]
> /usr/bin/pgbench(_start+0x33) [0x5653]
> No DWARF information found
pselect6(6, [3<TCPv6:[[::1]:42116->[::1]:5432]> 4<TCPv6:[[::1]:42118->[::1]:5432]> 5<TCPv6:[[::1]:42120->[::1]:5432]>], NULL, NULL, NULL, NULL) = 1 (in [3]) <1.000339>
> /usr/lib64/libc-2.28.so(__select+0x74) [0xcaa5c]
> /usr/bin/pgbench(threadRun+0x14c3) [0x8693]
> /usr/bin/pgbench(main+0x16a7) [0x4867]
> /usr/lib64/libc-2.28.so(__libc_start_main+0xe3) [0x20e63]
> /usr/bin/pgbench(_start+0x33) [0x5653]
> /usr/bin/pgbench(_start+0x33) [0x5653]
> No DWARF information found
recvfrom(3<TCPv6:[[::1]:42116->[::1]:5432]>, "T\0\0\0!\0\1pg_sleep\0\0\0\0\0\0\0\0\0\10\346\0\4\377\377\377\377\0\0D\0\0\0\n\0\1\0\0\0\0C\0\0\0\rSELECT 1\0Z\0\0\0\5I", 16384, 0, NULL, NULL) = 65 <0.000042>
> /usr/lib64/libpthread-2.28.so(recv+0x34) [0x11864]
> /usr/lib64/libpq.so.5.13(pqsecure_raw_read+0x3b) [0x1f233]
> /usr/lib64/libpq.so.5.13(pqReadData+0xab) [0x192fb]
> /usr/lib64/libpq.so.5.13(PQconsumeInput+0x23) [0x1536b]
> /usr/bin/pgbench(threadRun+0x853) [0x7a23]
> /usr/bin/pgbench(main+0x16a7) [0x4867]
> /usr/lib64/libc-2.28.so(__libc_start_main+0xe3) [0x20e63]
> /usr/bin/pgbench(_start+0x33) [0x5653]
> /usr/bin/pgbench(_start+0x33) [0x5653]
> No DWARF information found
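The libpq C library used by pgbench has an asynchronous API: the call stacks above show PQsendQuery sending the query without waiting, and PQconsumeInput (which calls the internal pqReadData) reading the response once it is available.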
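The pattern seen in the trace (send everything, then wait once) can be illustrated with a shell analogy. This is not libpq and not what pgbench runs: here the shell plays the role of the single pgbench thread, and each background sleep stands in for a server backend that is busy for one second.

```shell
# Analogy only, not libpq: one "client thread" (this shell) issues three
# one-second requests without waiting for each, then waits for the answers.
start=$(date +%s)
for conn in 1 2 3; do
  sleep 1 &   # "send" the request and return immediately
done
wait          # block until all three requests have completed
end=$(date +%s)
elapsed=$((end - start))
echo "3 requests completed in about ${elapsed}s"
```

The three requests complete in roughly one second rather than three, because the waiting happens concurrently outside of the process that issued them.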
-j
--jobs
number of threads (default: 1)
Then why run multiple threads on the client? You probably don't need to, as one thread can handle hundreds of asynchronous calls.
First, the threads cannot share the connections so you cannot have more client threads than server connections:
pgbench -j 2 -c 1 -T 30 -nf /dev/stdin <<< "select pg_sleep(1)"
transaction type: /dev/stdin
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 30 s
number of transactions actually processed: 30
latency average = 1005.422 ms
tps = 0.994607 (including connections establishing)
tps = 0.994731 (excluding connections establishing)
pgbench has silently reduced --jobs to match --client (you can see that in "number of threads: 1"). The connections defined by --client are distributed among the threads defined by --jobs, and it makes no sense to have threads with no connections. However, you can have many connections per thread, as we have seen above. This still stresses the database with concurrent executions thanks to asynchronous calls.
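To picture how connections are spread over threads, here is a minimal sketch. distribute_clients is a hypothetical helper, and the even split with the remainder going to the first threads is my assumption for illustration; the exact assignment done inside pgbench may differ between versions.

```shell
# Hypothetical helper: print how many connections each thread would get,
# assuming an even split where the first threads take the remainder.
distribute_clients() {
  clients=$1
  jobs=$2
  # A thread without any connection has nothing to do, so cap the threads.
  if [ "$jobs" -gt "$clients" ]; then jobs=$clients; fi
  base=$((clients / jobs))
  extra=$((clients % jobs))
  i=0
  out=""
  while [ "$i" -lt "$jobs" ]; do
    n=$base
    if [ "$i" -lt "$extra" ]; then n=$((base + 1)); fi
    out="$out${out:+ }$n"
    i=$((i + 1))
  done
  echo "$out"
}

distribute_clients 1 2     # like -c 1 -j 2: a single thread with 1 connection
distribute_clients 100 8   # like -c 100 -j 8
```

The first call prints 1, in line with the "number of threads: 1" reported for -j 2 -c 1. The second prints 13 13 13 13 12 12 12 12, which adds up to the 100 connections.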
So what's the point of --jobs? My example was running a script that takes a long time in the database (1 second) compared to the client work, and that's why one client thread (--jobs=1) can serve many connections (--client=100) without being the bottleneck. However, if you run really short queries through many connections, the work on the client side can be significant. And as the goal of pgbench is to stress the database, you may need more threads. Don't forget that if you want to stress the CPU, you will probably not run pgbench on the database server. But then you need more connections, because there's a network component in the latency.
I'm taking an extreme example here where my custom script doesn't even call the database but takes 1 second of client time:
pgbench -j 1 -c 100 -T 30 -nf /dev/stdin <<< "\shell sleep 1"
transaction type: /dev/stdin
scaling factor: 1
query mode: simple
number of clients: 100
number of threads: 1
duration: 30 s
number of transactions actually processed: 29
latency average = 446722.970 ms
tps = 0.223852 (including connections establishing)
tps = 0.223857 (excluding connections establishing)
My single thread throttles the throughput: during 30 seconds, at most 30 transactions are possible on one thread when the client-side processing takes 1 second.
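The same reasoning gives a rough upper bound on throughput when the client side is the bottleneck. This is a sketch under a strong assumption: the client-side work blocks its thread, as \shell sleep 1 does, so a thread completes at most one transaction per unit of client time. max_client_tps is a hypothetical helper, not a pgbench option, and its result is a ceiling, not a prediction of what a run will measure.

```shell
# Hypothetical helper: upper bound on tps when each transaction blocks
# its client thread for client_s seconds (threads are capped by clients).
max_client_tps() {
  clients=$1
  jobs=$2
  client_s=$3
  if [ "$jobs" -gt "$clients" ]; then jobs=$clients; fi
  awk -v j="$jobs" -v s="$client_s" 'BEGIN { printf "%.2f\n", j / s }'
}

max_client_tps 100 1 1     # like -c 100 -j 1 with 1 s of client work
max_client_tps 100 100 1   # like -c 100 -j 100 with 1 s of client work
```

With one thread the ceiling is 1.00 tps, whatever the number of connections. With one thread per connection it rises to 100.00 tps, which is the situation where raising --jobs makes sense.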
Key points:
- -c --client is what drives the number of sessions on the server
- -j --jobs can be used if the coordination from pgbench is a bottleneck
I'll share more about pgbench, because benchmarks mean nothing if we don't understand exactly what is run and how. And pgbench, using libpq with custom scripts, is great for showing the different ways to run SQL efficiently. So I'm flagging this as the first post of a series.