A kind of flashback query in PostgreSQL

Franck Pachot - Aug 22 '21 - - Dev Community

With MVCC databases, a transaction sees a virtual snapshot of committed data, as of the beginning of the statement (in READ COMMITED) or the transaction (in REPEATABLE READ or SERIALIZABLE). In Oracle 9i , this capability has been enhanced so that we can define a snapshot that is even earlier. This was called Flashback Query. Let's see something that, in PostgreSQL, can be equivalent in some cases.

I'll run multiple concurrent sessions for this example

psql -v PROMPT1="$((1+$(jobs|wc -l)))%R%x%# "
Enter fullscreen mode Exit fullscreen mode

This connects psql and sets the prompt to the job number.

1=# drop table if exists demo;
DROP TABLE
1=# create table demo as select 1 n, transaction_timestamp(), statement_timestamp();
SELECT 1
1=# select * from demo;
 n |     transaction_timestamp     |      statement_timestamp
---+-------------------------------+-------------------------------
 1 | 2021-08-22 18:07:19.934717+00 | 2021-08-22 18:07:19.934717+00
(1 row)
Enter fullscreen mode Exit fullscreen mode

I have created a demo table with 1 row, this is committed as I'm in auto-commit.

Now I'll open a transaction that I'll leave open so that the snapshot is still available (there's nothing like Oracle UNDO_RETENTION here). And I get the internal identification for the snapshot with the pg_export_snapshot function (there's nothing like Oracle TIMESTAMP_TO_SCN)

1=# begin transaction;
BEGIN
1=*# set transaction isolation level serializable;
SET
1=*# select  pg_export_snapshot();
 pg_export_snapshot
---------------------
 00000003-0015B343-1
(1 row)
Enter fullscreen mode Exit fullscreen mode

This is the base for using the feature and be careful, especially if you are using streaming replication, keeping transactions opened can be a problem. The goal of this feature is to run multiple sessions seeing the same snapshot. For example to parallelize an export of data without compromising consistency.

I'm now opening another transaction to simulate more activity:

1=*# ^Z
[1]+  Stopped                 /usr/local/pgsql/bin/psql -v PROMPT1="$((1+$(jobs|wc -l)))%R%x%# "
[opc@C ~]$
[opc@C ~]$ psql -v PROMPT1="$((1+$(jobs|wc -l)))%R%x%# "
2=#
Enter fullscreen mode Exit fullscreen mode

I'm inserting a new row in the demo table, and commit this change:

2=# begin transaction;
BEGIN
2=*# insert into demo select 1 n, transaction_timestamp(), statement_timestamp();
INSERT 0 1
2=*# commit;
COMMIT
Enter fullscreen mode Exit fullscreen mode

I could have exited the session, but just put it in background and open a new one here:

2=# ^Z
[2]+  Stopped                 /usr/local/pgsql/bin/psql -v PROMPT1="$((1+$(jobs|wc -l)))%R%x%# "
[opc@C ~]$ psql -v PROMPT1="$((1+$(jobs|wc -l)))%R%x%# "
Enter fullscreen mode Exit fullscreen mode

From there, the default snapshot being the start of the statement, I see the 2 rows as both inserts were committed.

3=# select * from demo;
 n |     transaction_timestamp     |      statement_timestamp
---+-------------------------------+-------------------------------
 1 | 2021-08-22 18:07:19.934717+00 | 2021-08-22 18:07:19.934717+00
 1 | 2021-08-22 18:09:34.615897+00 | 2021-08-22 18:09:35.317214+00
(2 rows)
Enter fullscreen mode Exit fullscreen mode

And this is where we have more control. I'll open a serializable transaction for which the snapshot is not the current one but the one I got a while ago with pg_export_snapshot()

3=# begin transaction;
BEGIN
3=*# set transaction isolation level serializable;
SET
3=*# set transaction snapshot '00000003-0015B343-1';
SET
Enter fullscreen mode Exit fullscreen mode

This would make no sense in READ COMMITED as this snapshot would have been discarded immediately.

And here it is. At that time, only one row was committed:

3=*# select * from demo;
 n |     transaction_timestamp     |      statement_timestamp
---+-------------------------------+-------------------------------
 1 | 2021-08-22 18:07:19.934717+00 | 2021-08-22 18:07:19.934717+00
(1 row)
Enter fullscreen mode Exit fullscreen mode

What is an exported snapshot? Just some metadata from the snapshot structure, with identification of the transaction and process that exported it:

[postgres]$ cat /data/pgdata/pg_snapshots/00000003-0015B343-1

vxid:3/1422147
pid:2663858
dbid:14973
iso:3
ro:0
xmin:2133213514
xmax:2133213514
xcnt:0
sof:0
sxcnt:0
rec:0
Enter fullscreen mode Exit fullscreen mode

The import, with SET TRANSACTION SNAPSHOT, can be done only if the exporting transaction is still there. Once this snapshot adopted, you can continue on this snapshot even if the exporting transaction terminates.

Before you try to use this for crazy pseudo-solutions to exotic problems, please keep in mind the reason why it has been designed: consistent reads across parallel sessions. You have one session has some work to do on a repeatable read snapshot (like exporting data) and you want to distribute the work to multiple session reading from the same point-in-time snapshot. If you export with pg_dump, this is automatically used by with --jobs option.

It may be tempting to PREPARE TRANSACTION in order to quit the session that exported the snapshot, but that's not allowed (ERROR: cannot PREPARE a transaction that has exported snapshots)

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .