In YugabyteDB, distributed SQL database using PostgreSQL, the pg_stat_statements extension is installed by default. However, statements are gathered locally in each node, which mean that you need to query pg_stat_statements
or call pg_stat_statements_reset()
on each node. YugabyteDB is cloud native, you can add and remove nodes. With YugabyteDB Managed or YugabyteDB Anywhere this is done automatically. If you are using YugabyteDB for free, you have to automate it.
I'm showing an easy way to do it here.
First, let's create a table that will hold the statements gathered from all nodes, with the host
it comes from and ts
the timestamp when it was gathered:
drop table if exists ybwr_statements ;
create table ybwr_statements as
select now() as ts,'' as host, *
from pg_stat_statements where null is not null;
Using a Create Table As Select makes it independent on the table structure which may evolve though versions.
Now, the plpgsql code to gather from all nodes, as an example.
- I query the list of servers from
yb_servers()
- I use COPY TO PROGRAM to execute
ysqlsh
, thepsql
equivalent that is present on each node - I find it from the
postmaster.opts
file. You can alsofind
it or get it from theyb-tserver
working directory - I run an
insert into ybwr_statements select '%s','%s',* from pg_stat_statements
which I pass as STDIN toysqlsh
though the COPY TO as I need an input query input anyway - I also reset with
pg_stat_statements_reset()
is the gathering is successful (callingysqlsh
with-v ON_ERROR_STOP=1
) - to debug it, I add
> /tmp/log.txt 2>&1
at the end of the command line. You may prefer to use COPY FROM PROGRAM, passing the queries with-c
, and store the output in a table.
Here it is:
do $DO$
declare i record;
begin
for i in (select now() as ts, host from yb_servers()) loop
execute format(
$COPY$
copy (select $SQL$ insert into ybwr_statements select '%s','%s',* from pg_stat_statements; select pg_stat_statements_reset(); $SQL$ ) to program
$BASH$ $(awk '{sub("/postgres/bin/postgres .*","");print}' ./postmaster.opts)/bin/ysqlsh -h $(hostname) -v ON_ERROR_STOP=1 $BASH$
$COPY$
,i.ts,i.host);
end loop;
end;
$DO$
;
Then, you can query ybwr_statements
:
select ts, host, calls, total_time, query
from ybwr_statements
order by total_time;
Here is an example on an idle database where I captured only my capturing queries:
This is an example to show the technique. You can schedule it, like every hour, and you will have hourly snapshots of queries in the ybwr_statements
table. You may choose not to call pg_stat_statements_reset()
and will diff the snapshots on querying.