This post is a continuation of the previous one where I explained how to get to the local partition without having the partitioning key. I was connected on Mars and the customer 85338353-162a-4a62-bcb9-78fd58a6b500
was there but 1c888089-c4f2-4da4-ba61-59957b965bf8
is from Earth:
yugabyte=# select planet,id from customers
where id in ('1c888089-c4f2-4da4-ba61-59957b965bf8','85338353-162a-4a62-bcb9-78fd58a6b500');
planet | id
--------+--------------------------------------
earth | 1c888089-c4f2-4da4-ba61-59957b965bf8
mars | 85338353-162a-4a62-bcb9-78fd58a6b500
(2 rows)
Thanks to my duplicate covering indexes, this query was acheived with local reads only:
yugabyte=# explain analyze select planet,id from customers
where id in ('1c888089-c4f2-4da4-ba61-59957b965bf8','85338353-162a-4a62-bcb9-78fd58a6b500');
I'll drop those indexes to show another method:
drop index customers_moon_id_planet_earth;
drop index customers_mars_id_planet_earth;
drop index customers_earth_id_planet_moon;
drop index customers_mars_id_planet_moon;
drop index customers_earth_id_planet_mars;
drop index customers_moon_id_planet_mars;
Now, If I run the same query, it will have to read the partitions from all regions:
yugabyte=# explain analyze select planet,id from customers
where id in ('1c888089-c4f2-4da4-ba61-59957b965bf8','85338353-162a-4a62-bcb9-78fd58a6b500');
yugabyte=# explain analyze select planet,id from customers
yugabyte-# where id in ('1c888089-c4f2-4da4-ba61-59957b965bf8','85338353-162a-4a62-bcb9-78fd58a6b500');
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..46.65 rows=300 width=48) (actual time=1.798..5.347 rows=2 loops=1)
-> Index Scan using customers_earth_pkey on customers_earth (cost=0.00..14.65 rows=100 width=48) (actual time=1.798..1.800 rows=1 loops=1)
Index Cond: (id = ANY ('{1c888089-c4f2-4da4-ba61-59957b965bf8,85338353-162a-4a62-bcb9-78fd58a6b500}'::uuid[]))
-> Index Scan using customers_mars_pkey on customers_mars (cost=0.00..15.25 rows=100 width=48) (actual time=1.073..1.075 rows=1 loops=1)
Index Cond: (id = ANY ('{1c888089-c4f2-4da4-ba61-59957b965bf8,85338353-162a-4a62-bcb9-78fd58a6b500}'::uuid[]))
-> Index Scan using customers_moon_pkey on customers_moon (cost=0.00..15.25 rows=100 width=48) (actual time=2.469..2.469 rows=0 loops=1)
Index Cond: (id = ANY ('{1c888089-c4f2-4da4-ba61-59957b965bf8,85338353-162a-4a62-bcb9-78fd58a6b500}'::uuid[]))
Planning Time: 19.236 ms
Execution Time: 5.876 ms
Peak Memory Usage: 64 kB
(10 rows)
)
Now, in the previous post I mentioned that once you know the region, it is better to connect to it so that all SQL processing is local. My alternative here is to check if the customer is in the local partition. If it is not, I'll not do any remote read, but, maybe connect to another region and try there.
yb_is_local_table
We have a special function in YugabyteDB that can filter to the local partitions only:
yugabyte=# select * from customers
where id in ('1c888089-c4f2-4da4-ba61-59957b965bf8','85338353-162a-4a62-bcb9-78fd58a6b500')
and yb_is_local_table(tableoid);
id | planet | info
--------------------------------------+--------+------
85338353-162a-4a62-bcb9-78fd58a6b500 | mars | 42
(1 row)
This returned only the local customer
yugabyte=# explain analyze select * from customers
where id in ('1c888089-c4f2-4da4-ba61-59957b965bf8','85338353-162a-4a62-bcb9-78fd58a6b500')
and yb_is_local_table(tableoid);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..15.65 rows=100 width=80) (actual time=1.715..1.719 rows=1 loops=1)
-> Index Scan using customers_mars_pkey on customers_mars (cost=0.00..15.15 rows=100 width=80) (actual time=1.715..1.718 rows=1 loops=1)
Index Cond: (id = ANY ('{1c888089-c4f2-4da4-ba61-59957b965bf8,85338353-162a-4a62-bcb9-78fd58a6b500}'::uuid[]))
Filter: yb_is_local_table(tableoid)
Planning Time: 0.338 ms
Execution Time: 1.792 ms
Peak Memory Usage: 32 kB
(7 rows)
I can process this customer, and then connect elsewhere to process the other, while running exactly the same query.
Here is an example:
yugabyte=# \c yugabyte yugabyte yb-tserver-2
You are now connected to database "yugabyte" as user "yugabyte".
yugabyte=# select * from customers
where id in ('1c888089-c4f2-4da4-ba61-59957b965bf8','85338353-162a-4a62-bcb9-78fd58a6b500')
and yb_is_local_table(tableoid);
id | planet | info
--------------------------------------+--------+------
85338353-162a-4a62-bcb9-78fd58a6b500 | mars | 42
(1 row)
yugabyte=# \c yugabyte yugabyte yb-tserver-1
You are now connected to database "yugabyte" as user "yugabyte" on host "yb-tserver-1" at port "5433".
yugabyte=# select * from customers
where id in ('1c888089-c4f2-4da4-ba61-59957b965bf8','85338353-162a-4a62-bcb9-78fd58a6b500')
and yb_is_local_table(tableoid);
id | planet | info
----+--------+------
(0 rows)
yugabyte=# \c yugabyte yugabyte yb-tserver-0
You are now connected to database "yugabyte" as user "yugabyte" on host "yb-tserver-0" at port "5433".
yugabyte=# select * from customers
where id in ('1c888089-c4f2-4da4-ba61-59957b965bf8','85338353-162a-4a62-bcb9-78fd58a6b500')
and yb_is_local_table(tableoid);
id | planet | info
--------------------------------------+--------+------
1c888089-c4f2-4da4-ba61-59957b965bf8 | earth | 42
(1 row)
So, this is another way. Usually, on a geo-partitioned databases, you connect to one region and work there. If you don't know the region, the previous post explained how to maintain a map of them with indexes. But if you have some cross-region jobs or reporting, then better go to each region and process / report what is local. This doesn't need any index but uses the yb_is_local_table(tableoid)
function.