Inner Inverted Join in YugabyteDB & PostgreSQL

Franck Pachot - Oct 25 '23 - - Dev Community

In YugabyteDB, we have the same Join Methods as PostgreSQL, that are found in all SQL databases:

  • Nested Loop when the inner table has a fast access path with the join key, typically an index
  • Merge Join when the outer and inner tables are both sorted on the join key
  • Hash Join, which accesses the inner table by reading it into a hash table

The query planner chooses those methods. SQL is a declarative language where you mention the tables to join and the join condition, but it doesn't specify which index to use or which join method. You can control it with hints declared in a /*+ */ comment because it cannot be part of the SQL language.

Some others do it differently, and you have to mention the index to use and even the join method. I was asked if we support "Inverted Join" in YugabyteDB, and this came apparently from reading the CockroachDB documentation:
CockroachDB

CREATE INVERTED INDEX , table@indexname, and INNER INVERTED JOIN is not standard SQL syntax.

Of course, YugabyteDB supports inverted indexes with the PostgreSQL GIN index syntax, and the query planner will decide the join method without having to change the SQL query: if there's a join to a GIN index, a Nested Loop will probably be chosen. But let's test it.

The question I got was related to an array of UUID stored in an events table. In YugabyteDB I create it as in PostgreSQL:

yugabyte=> create table events 
           ( id bigserial primary key, identifiers uuid[] );
CREATE TABLE
Enter fullscreen mode Exit fullscreen mode

To have an index entry for each element of the array, I create a GIN index, as in PostgreSQL:

yugabyte=> create index events_identifiers on events
           using gin ( identifiers );
CREATE INDEX
Enter fullscreen mode Exit fullscreen mode

I'll insert five million rows with one, two or three items in the array, building random (but funny) UUID:

yugabyte=> with
     h4(hex) as ( values('C0DE'),('DEAD'),('FACE') )
   , h8(hex) as ( values('D1AB011C'),('CA5CADED') )
   , h12(hex)as ( values('C1A551F1AB1E'),('DEC1A551F1ED') )
   , u as (
 select format('%s-%s-%s-%s-%s',a.hex,b.hex,c.hex,d.hex,e.hex)::uuid u
 from h8 a,h4 b,h4 c,h4 d, h12 e
 where b.hex!=c.hex and c.hex!=d.hex
)
insert into events (identifiers)
select ARRAY[a.u] from u a
union all select ARRAY[a.u, b.u] from u a , u b
union all select ARRAY[a.u, b.u, c.u] from u a , u b , u c
union all select ARRAY[a.u, b.u, c.u, d.u] from u a , u b , u c, u d
;

INSERT 0 5421360
Enter fullscreen mode Exit fullscreen mode

Contains (@>) One Value

Here is the PostgreSQL-compatible query to count the rows that contain one UUID in their array:

yugabyte=> select count(id) from events
           where identifiers @> array['d1ab011c-dead-face-c0de-dec1a551f1ed'::uuid]
;

 count
--------
 435600
(1 row)

yugabyte=> explain (analyze, dist, verbose, costs off, summary off)
           select count(id) from events 
           where identifiers @> array['d1ab011c-dead-face-c0de-dec1a551f1ed'::uuid]
yugabyte-> ;

                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Aggregate (actual time=6090.907..6090.907 rows=1 loops=1)
   Output: count(id)
   ->  Index Scan using events_identifiers on public.events (actual time=13.114..6062.775 rows=435600 loops=1)
         Output: id, identifiers
         Index Cond: (events.identifiers @> '{d1ab011c-dead-face-c0de-dec1a551f1ed}'::uuid[])
         Storage Table Read Requests: 426
         Storage Table Read Execution Time: 5751.870 ms
         Storage Index Read Requests: 426
         Storage Index Read Execution Time: 3.791 ms
(9 rows)
Enter fullscreen mode Exit fullscreen mode

I checked the execution plan. The GIN index was used. Most of the time is spent reading the half-million rows from the table to count them.

Contains (@>) From a Join

Now, using the simple PostgreSQL INNER JOIN I do the same where the values come from another table. I generate it in a WITH clause but this can be any table.

yugabyte=> with sample as (
 select 'd1ab011c-dead-face-c0de-dec1a551f1ed'::uuid as identifier
  union all
 select 'ca5caded-face-dead-c0de-c1a551f1ab1e'::uuid as identifier
)
select sample.identifier, count(e.id)
  from sample
  inner join events e
  on e.identifiers @> array[sample.identifier]
group by sample.identifier;

              identifier              | count
--------------------------------------+--------
 ca5caded-face-dead-c0de-c1a551f1ab1e | 435600
 d1ab011c-dead-face-c0de-dec1a551f1ed | 435600
(2 rows)

yugabyte=> explain (analyze, dist, verbose, costs off, summary off)
with sample as (
 select 'd1ab011c-dead-face-c0de-dec1a551f1ed'::uuid as identifier
  union all
 select 'ca5caded-face-dead-c0de-c1a551f1ab1e'::uuid as identifier
)
select sample.identifier, count(e.id)
  from sample
  inner join events e
  on e.identifiers @> array[sample.identifier]
group by sample.identifier;
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate (actual time=12415.671..12415.671 rows=2 loops=1)
   Output: sample.identifier, count(e.id)
   Group Key: sample.identifier
   CTE sample
     ->  Append (actual time=0.001..0.005 rows=2 loops=1)
           ->  Result (actual time=0.001..0.001 rows=1 loops=1)
                 Output: 'd1ab011c-dead-face-c0de-dec1a551f1ed'::uuid
           ->  Result (actual time=0.001..0.002 rows=1 loops=1)
                 Output: 'ca5caded-face-dead-c0de-c1a551f1ab1e'::uuid
   ->  Nested Loop (actual time=13.597..12292.211 rows=871200 loops=1)
         Output: sample.identifier, e.id
         ->  CTE Scan on sample (actual time=0.003..0.010 rows=2 loops=1)
               Output: sample.identifier
         ->  Index Scan using events_identifiers on public.events e (actual time=13.243..6104.858 rows=435600 loops=2)
               Output: e.id, e.identifiers
               Index Cond: (e.identifiers @> ARRAY[sample.identifier])
               Storage Table Read Requests: 426
               Storage Table Read Execution Time: 5764.332 ms
               Storage Index Read Requests: 426
               Storage Index Read Execution Time: 3.557 ms
(20 rows)
Enter fullscreen mode Exit fullscreen mode

The execution shows that, without the need to mention it in the SQL statement, it has joined with a Nested Loop to the inner table that is accessed with the GIN index.

This is the equivalent if the INNER INVERTED JOIN, but with a PostgreSQL compatible syntax.

Performance

In the previous examples, I tested the worst case 10% of the rows had a match with the search value. To test with lower cardinality, I add one more row with two unique UUIDs:

yugabyte=> insert into events (identifiers) values (ARRAY[
  'FEA51B1E-ACED-C0DE-FEED-C1A551F1AB1E'::uuid
 ,'C0D1F1ED-DEAD-FACE-BEEF-DEC1A551F1ED'::uuid
]);

INSERT 0 1
Enter fullscreen mode Exit fullscreen mode

I use the same query as above to find those rows:

yugabyte=> explain (analyze, dist, verbose, costs off, summary off)
with sample as (
    select 'fea51b1e-aced-c0de-feed-c1a551f1ab1e'::uuid as identifier
    union all
    select 'c0d1f1ed-dead-face-beef-dec1a551f1ed'::uuid as identifier
)
select sample.identifier, count(e.id)
from sample
inner join events e
on e.identifiers @> array[sample.identifier]
group by sample.identifier;
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 HashAggregate (actual time=4.129..4.129 rows=2 loops=1)
   Output: sample.identifier, count(e.id)
   Group Key: sample.identifier
   CTE sample
     ->  Append (actual time=0.001..0.004 rows=2 loops=1)
           ->  Result (actual time=0.001..0.001 rows=1 loops=1)
                 Output: 'fea51b1e-aced-c0de-feed-c1a551f1ab1e'::uuid
           ->  Result (actual time=0.000..0.000 rows=1 loops=1)
                 Output: 'c0d1f1ed-dead-face-beef-dec1a551f1ed'::uuid
   ->  Nested Loop (actual time=2.320..4.121 rows=2 loops=1)
         Output: sample.identifier, e.id
         ->  CTE Scan on sample (actual time=0.004..0.007 rows=2 loops=1)
               Output: sample.identifier
         ->  Index Scan using events_identifiers on public.events e (actual time=2.047..2.049 rows=1 loops=2)
               Output: e.id, e.identifiers
               Index Cond: (e.identifiers @> ARRAY[sample.identifier])
               Storage Table Read Requests: 1
               Storage Table Read Execution Time: 0.916 ms
               Storage Index Read Requests: 1
               Storage Index Read Execution Time: 1.028 ms
(20 rows)
Enter fullscreen mode Exit fullscreen mode

The result comes with 2 millisecond per value (actual time=2.047..2.049 rows=1 loops=2). As with regular indexes, GIN indexes are efficient with highly selective predicates.

Outer Join

Similarly, in SQL there's no LEFT INVERTED JOIN but you can use outer join with inverted indexes:

yugabyte=> with sample as (
    select 'fea51b1e-aced-c0de-feed-c1a551f1ab1e'::uuid as identifier
    union all
    select 'c0d1f1ed-dead-face-beef-dec1a551f1ed'::uuid as identifier
    union all
    select 'ED1F1CE5-5432-7000-9000-000000000000'::uuid as identifier
)
select *
from sample
left outer join events e
on e.identifiers @> array[sample.identifier]
;
              identifier              |   id    |                                 identifiers
--------------------------------------+---------+-----------------------------------------------------------------------------
 fea51b1e-aced-c0de-feed-c1a551f1ab1e | 5421361 | {fea51b1e-aced-c0de-feed-c1a551f1ab1e,c0d1f1ed-dead-face-beef-dec1a551f1ed}
 c0d1f1ed-dead-face-beef-dec1a551f1ed | 5421361 | {fea51b1e-aced-c0de-feed-c1a551f1ab1e,c0d1f1ed-dead-face-beef-dec1a551f1ed}
 ed1f1ce5-5432-7000-9000-000000000000 |         |
(3 rows)
Enter fullscreen mode Exit fullscreen mode

This was a LEFT OUTER JOIN, showing all rows from the outer table even when there's no match in the innert table. Of course you can also declare a RIGHT OUTER JOIN but with my testcase it will return 5421362 rows.

To summarize

Inner Inverted Join is not a join method but a specific implementation of a Nested Loop to an Inverted Index, with a proprietary syntax to hint it. SQL databases do the same without the need for a specific syntax and YugabyteDB behaves like PostgreSQL: you declare a GIN indexes and the INNER JOIN will use it when it is relevant.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .