Many companies want to move from Oracle Database to PostgreSQL. From an infrastructure point of view, There are many managed services running PostgreSQL or PostgreSQL-compatible database services. And with Distributed SQL databases, Maximum Availability can be provided (here is a comparison or Oracle MAA with YugabyteDB native HA). But one question remains: can you lift-and-shift legacy applications, using the old non-ANSI SQL syntax and proprietary PL/SQL code, to PostgreSQL?
Let me break you dreams to avoid any frustration at the end: there is no magic solution. However, because there is a big market here, many companies try to add an Oracle-compatibility layer to PostgreSQL. Even if it will never be a full solution, it may ease your migration by not having to re-write all use cases immediately.
In this blog post, I'm testing IvorySQL, a fork of PostgreSQL, which provides some compatibility with some Oracle syntax, in a similar way as orafce but on other features.
What I'm testing here is documented and open-source but nothing is better than a quick test to look at the corner cases.
Start IvorySQL in Docker
I start a Centos7 container on my laptop and bash
into it:
docker exec -it $(
docker run -d --rm -p 5432:5432 centos:7 sleep infinity
) bash
I install and start IvorySQL:
ver="2.1-1"
tag=Ivory_REL_2_1
rel=https://github.com/IvorySQL/IvorySQL/releases/download
env=rhel7
yum install -y ${rel}/${tag}/ivorysql2{-libs,,-server,-contrib}-${ver}.${env}.x86_64.rpm
echo "export PATH=\$PATH:"$(dirname $(rpm -ql ivorysql2-server-${ver}.${env} |
grep "/bin/initdb$")) >> /var/lib/ivorysql/.bash_profile
su - ivorysql -c "initdb"
su - ivorysql -c "pg_ctl -l logfile start"
I connect locally with psql
:
su - ivorysql
psql
select version();
show compatible_mode;
This is the IvorySQL fork from PostgreSQL 15.1 and, by default, with Oracle-compatibility disabled:
Before looking at what is documented as Compatibility Features I'm curious about Global Unique Index implementation. Global indexes exists in Oracle. They don't in PostgreSQL, and that limits the possibilities of ensuring unique constraints on other columns than the partition key. OLTP applications rarely have only one unique key. They often have one or multiple natural keys in addition to one immutable surrogate key chose as the primary key.
This doesn't hurt scalability in YugabyteDB because all secondary indexes are global in YugabyteDB at sharding level. However, when using PostgreSQL declarative partitioning on top of it, there is the same limitation for unique constraints. There is always the possibility to create a trigger to check unicity from all local indexes, but this is not scalable. Let's see what IvorySQL proposes
Global Unique Index
I'm creating the following partitioned table and insert two rows into it:
create table demo (id bigint, ts timestamptz) partition by range(ts);
create table demo20 partition of demo for values from ('2020-01-01') TO ('2021-01-01');
create table demo21 partition of demo for values from ('2021-01-01') TO ('2022-01-01');
create table demo22 partition of demo for values from ('2022-01-01') TO ('2023-01-01');
create table demo23 partition of demo for values from ('2023-01-01') TO ('2024-01-01');
create table demo24 partition of demo for values from ('2024-01-01') TO ('2025-01-01');
create table demo25 partition of demo for values from ('2025-01-01') TO ('2026-01-01');
insert into demo(id,ts) values (1,'2022-12-25'),(1,'2023-01-01');
Here is the structure:
Now, let's say I want to enforce unicity of id
, IvorySQL has a CREATE UNIQUE INDEX ... GLOBAL option for that:
ivorysql=# create unique index demoid on demo(id) global;
ERROR: could not create unique index "demo20_id_idx"
DETAIL: Key (id)=(1) is duplicated.
The duplicates, that are from different partitions, are correctly detected which shows that it works as desired.
I delete one of them to create my index:
ivorysql=# delete demo where ts='2023-01-01';
DELETE 1
ivorysql=# create unique index demoid on demo(id) global;
CREATE INDEX
ivorysql=#
Note that I forgot the FROM by running delete demo
instead of delete from demo
. This is allowed in IvorySQL, even without changing the compatible_mode
because Oracle Database allows that. Not really useful as this will not be the more difficult code change you will have to do if you move from Oracle to PostgreSQL, but an easy one to check in the syntax compatibility list.
Back to our global index, I want to see the cost of it with EXPLAIN ANALYZE:
ivorysql=# explain (analyze, costs off, buffers)
select * from demo where id=1;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Append (actual time=0.015..0.021 rows=1 loops=1)
Buffers: shared hit=10
-> Index Scan using demo20_id_idx on demo20 demo_1 (actual time=0.007..0.007 rows=0 loops=1)
Index Cond: (id = 1)
Buffers: shared hit=2
-> Index Scan using demo21_id_idx on demo21 demo_2 (actual time=0.001..0.001 rows=0 loops=1)
Index Cond: (id = 1)
Buffers: shared hit=2
-> Seq Scan on demo22 demo_3 (actual time=0.007..0.007 rows=1 loops=1)
Filter: (id = 1)
Buffers: shared hit=1
-> Seq Scan on demo23 demo_4 (actual time=0.003..0.003 rows=0 loops=1)
Filter: (id = 1)
Buffers: shared hit=1
-> Index Scan using demo24_id_idx on demo24 demo_5 (actual time=0.001..0.001 rows=0 loops=1)
Index Cond: (id = 1)
Buffers: shared hit=2
-> Index Scan using demo25_id_idx on demo25 demo_6 (actual time=0.001..0.001 rows=0 loops=1)
Index Cond: (id = 1)
Buffers: shared hit=2
Planning:
Buffers: shared hit=5
Planning Time: 0.141 ms
Execution Time: 0.044 ms
(24 rows)
This is not what I expect from a Global Index. A point query on one id
value, which is known to be unique, is reading all partitions because the Global Unique Index is actually implemented with Local Indexes. This is a syntactic sugar to replace the creation of local indexes and trigger to look into them, but doesn't solve the scalability issue.
If you want to know more about the implementation, all is detailed in the HighGo blog, which is the company behind IvorySQL.
Non-ANSI recursive queries with CONNECT BY
Even if I've been working a lot with Oracle, I'm a big fan of ANSI syntax. Because it is standard, because I learned it earlier (DB2), and because with Oracle syntax nobody really remember where to put the (+)
and PRIOR
in a query. I used Oracle non-ANSI for the only reason that there are still many bugs with their ANSI syntax support. But With PostgreSQL, I don't really feel the need to CONNECT BY rather than WITH RECURSIVE. If you find recursive CTE hard to understand, I've a blog post to help.
IvorySQL implements some CONNECT BY features for Oracle compatibility. And when I say "legacy", I mean it. What I'll run in this paragraph comes from the Oracle 2.3 User Guide from 1981 🤓
Funny to see OATES as the boss here. Ed Oates was a co-founder of the company that became Oracle later, with Larry Ellison and Bob Miner. In the sample schema that is now famous as SCOTT/TIGER, OATES has not been renamed to ELLISON but to KING 😂
Back to our data, here is how to insert it in PostgreSQL. The COPY command is awesome - I just copy/pasted data from the Scan+OCR of the vintage User Guide:
create table emp(ename text, empno int, supr int, deptno int);
\copy emp from stdin with (header, delimiter ' ', null '<NULL>')
ENAME EMPNO SUPR DEPTNO
SMITH 7369 7902 20
ALLEN 7499 7698 30
WARD 7521 7698 30
JONES 7566 7839 20
MARTIN 7654 7698 30
BLAKE 7698 7839 30
CLARK 7782 7839 10
SCOTT 7788 7566 20
OATES 7839 <NULL> 10
TURNER 7844 7698 30
ADAMS 7876 7788 20
JAMES 7900 7698 30
FORD 7902 7566 20
MILLER 7934 7782 10
WILSON 7955 7566 20
JAKES 7956 7955 20
CARTER 7989 7698 30
\.
I run the first query found in this past-century User Guide, but on Oracle 21c. I'm just adding SYS_CONNECT_BY_PATH which shows the hierarchy for each row:
SELECT UNIQUE ENAME, EMPNO,DEPTNO,SUPR
,sys_connect_by_path(ename,'<')
FROM EMP
START WITH ENAME = 'SMITH'
CONNECT BY EMPNO = PRIOR SUPR
;
In Oracle 21c:
Now let's see how IvorySQL is compatible:
Ok, PostgreSQL doesn't support UNIQUE but DISTINCT
This doesn't work with Connect By.
Anyway, I don't need to deduplicate the result:
This ok, given that we should not expect any specific order when not having an ORDER BY. But are you sure your legacy application doesn't rely on the implicit ordering of CONNECT BY?
You should be curious about the implementation of this compatibility layer. Here is the execution plan:
QUERY PLAN
---------------------------------------------------------------
CTE Scan on cte_emp
CTE cte_emp
-> Recursive Union
-> Seq Scan on emp
Filter: (ename = 'SMITH'::text)
-> Hash Join
Hash Cond: (emp_1.empno = cte_emp_1.supr)
-> Seq Scan on emp emp_1
-> Hash
-> WorkTable Scan on cte_emp cte_emp_1
(10 rows)
IvorySQL has translated the CONNECT BY syntax to a WITH RECURSIVE one. Note that if you write your application with jOOQ you will never need to use this kind of compatible-layer because this translation can be done by jOOQ. Try it here: https://www.jooq.org/translate/
Oracle-compatible Sequences
This one is really nice because I don't like the PostgreSQL syntax to get a value from a sequence, passing the sequence name as a character string. Identifiers in SQL should be used as identifiers, may be double-quoted, but not as single-quote character strings. Here is a demo of the two possibilities:
ivorysql=# create sequence my_sequence;
CREATE SEQUENCE
ivorysql=# select nextval('my_sequence');
nextval
---------
1
ivorysql=# select my_sequence.nextval;
nextval
---------
2
Accepting the Oracle syntax has another advantage. When using Oracle Heterogenous Services with ODBC gateway to PostgreSQL, the sequence calls are not translated and fail. With this they should work as desired. It would be nice to port the same in orafce
extension.
Syntax-compatible functions
There are many functions added by IvorySQL to be Oracle-compatible. However, again, being syntax-compatible is different from being runtime compatible. Here is an example.
Here are two calls to current_timestamp
within an Oracle transaction:
set transaction isolation level read committed;
select current_timestamp from dual;
! sleep 1
select current_timestamp from dual;
rollback;
Here is the same code in IvorySQL:
set compatible_mode=oracle;
start transaction isolation level read committed;
select current_timestamp;
\! sleep 1
select current_timestamp;
rollback;
You see the problem: the timestamp is the same within a PostgreSQL transaction but not the same in an Oracle transaction. This may be a simple bug and IvorySQL is open-source. But to detect it you need strong unit tests for your database calls. Do you have that for the legacy code? If you want to reduce the effort of re-coding your application, by using a compatibility layer, you probably don't want to build lot of additional non-regression tests for this legacy code.
PL/iSQL to run PL/SQL code in PL/pgSQL
IvorySQL provides a PL/SQL-like syntax to declare stored procedures, packages and anonymous blocks. But, again, being syntax-compatible doesn't mean that the runtime behavior is the same.
Here is a simple example. I'll run the same code on Oracle and IvorySQL. Before looking at the screenshots below, please, look at the code and think about which result you expect from it. I try to insert duplicate rows, it raises an error which is catched by the exception blocl to insert a different row:
create table demo(text varchar2(20) primary key);
begin
insert into demo values('Hello');
insert into demo values('Hello');
exception
when others then insert into demo values('World');
end;
/
select * from demo;
This is a very simple example with a structure used in all PL/SQL applications.
Oracle will commit two lines, 'Hello' and 'World' because there are two successful inserts:
IvorySQL commtis only one row because and error was encountered in the main block which was then rolled back.
The reason is that, even if adopting the Oracle syntax, IvorySQL is still running with the PostgreSQL behavior where a procedural block runs in a sub-transaction and is rolled-back before going to the exception block. In both cases, the error raises a rollback to an internal savepoint. However, Oracle takes this savepoint before each statement and PostgreSQL takes it before each PL/pgSQL block.
PL/iSQL is syntax-compatible with PL/SQL but runs with a PL/pgSQL behavior. You are now in a configuration that is compatible with no other existing database.
Syntax-compatibility vs. Runtime-compatibility
In summary, IvorySQL is really interesting as an advanced orafce
which accepts Oracle syntax to avoid re-writing some code when moving from Oracle to PostgreSQL. However, it covers only a subset of what you probably use in Oracle, and the runtime behavior will always have some differences.
This means that, in my opinion, such approach cannot be used to migrate from Oracle to PostgreSQL without any effort. The time you save on not re-coding is lost in additional extensive regression tests.
I see two possible usages for it:
- keep old code for non-critical use-cases to accelerate the migration, until they are re-written for the new database. By non-critical, I mean that wrong results are not problematic, easily detected and fixed when they happen, and that concurrent access, as well as exception scenarios, have a low probability to happen
- use it for specific software in mind where you put the effort on the compatible layer rather than changing the original code, because you use a small subset of Oracle features and have extensive integration tests. It is the same idea as Babelfish for SQL Server. It will never have a 100% compatibility but can help on well-known software.
If you are moving from Oracle Database to YugabyteDB to run on open-source, cloud-native, postgres-compatible, distributed database, you are probably in a database infrastructure modernization project which is accompanied by changes in the application design and code. You may split a monolith into (micro-)services, modern languages and frameworks. In this case, you probably want to put the effort on the new application rather than troubleshooting the old legacy code though a new emulation layer. However, if you think that some IvorySQL features can be useful in YugabyteDB, do not hesitate to open a git issue for it so that we can look at it. We already have orafce
installed by default, easy to enable with create extension
but more can be added if it has a real value for users.