In the first post in this series, I created an RF=3 cluster with 5 nodes and defined an RF=5 placement for the tablets. Here, I'll start the same cluster but define the placement with tablespaces, to get a finer level of control. Indexes, tables and partitions can be created in a tablespace to declare their placement requirements.
Set up the YugabyteDB cluster
I'm starting 5 nodes:
docker network create -d bridge yb
docker run -d --network yb --name yb-eu-1 -p5001:5433 -p7001:7000 -p9001:9000 \
yugabytedb/yugabyte:2.15.0.0-b11 \
yugabyted start --daemon=false --listen yb-eu-1 \
--master_flags="placement_zone=1,placement_region=eu,placement_cloud=cloud" \
--tserver_flags="placement_zone=1,placement_region=eu,placement_cloud=cloud"
docker run -d --network yb --name yb-eu-2 -p5002:5433 -p7002:7000 -p9002:9000 \
yugabytedb/yugabyte:2.15.0.0-b11 \
yugabyted start --daemon=false --listen yb-eu-2 --join yb-eu-1 \
--master_flags="placement_zone=2,placement_region=eu,placement_cloud=cloud" \
--tserver_flags="placement_zone=2,placement_region=eu,placement_cloud=cloud"
docker run -d --network yb --name yb-us-1 -p5003:5433 -p7003:7000 -p9003:9000 \
yugabytedb/yugabyte:2.15.0.0-b11 \
yugabyted start --daemon=false --listen yb-us-1 --join yb-eu-1 \
--master_flags="placement_zone=1,placement_region=us,placement_cloud=cloud" \
--tserver_flags="placement_zone=1,placement_region=us,placement_cloud=cloud"
docker run -d --network yb --name yb-ap-1 -p5004:5433 -p7004:7000 -p9004:9000 \
yugabytedb/yugabyte:2.15.0.0-b11 \
yugabyted start --daemon=false --listen yb-ap-1 --join yb-eu-1 \
--master_flags="placement_zone=1,placement_region=ap,placement_cloud=cloud" \
--tserver_flags="placement_zone=1,placement_region=ap,placement_cloud=cloud"
docker run -d --network yb --name yb-au-1 -p5005:5433 -p7005:7000 -p9005:9000 \
yugabytedb/yugabyte:2.15.0.0-b11 \
yugabyted start --daemon=false --listen yb-au-1 --join yb-eu-1 \
--master_flags="placement_zone=1,placement_region=au,placement_cloud=cloud" \
--tserver_flags="placement_zone=1,placement_region=au,placement_cloud=cloud"
As yugabyted defines cloud1.datacenter1.rack1 by default but I use other names, I define the default placement, keeping RF=3 but with at least one replica in each of the two eu zones:
docker exec -i yb-eu-1 yb-admin -master_addresses yb-eu-1:7100,yb-eu-2:7100,yb-us-1:7100 \
modify_placement_info \
cloud.eu.1:1,cloud.eu.2:1,cloud.us.1:0,cloud.ap.1:0,cloud.au.1:0 \
3
docker exec -i yb-eu-1 yb-admin -master_addresses yb-eu-1:7100,yb-eu-2:7100,yb-us-1:7100 \
set_preferred_zones cloud.eu.1 cloud.eu.2
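The placement string passed to modify_placement_info lists one cloud.region.zone:minimum entry per zone, followed by the replication factor. As a minimal sketch, assuming the total of the per-zone minimums must not exceed the replication factor, the following helper checks a placement string before it is sent to yb-admin. The function name check_placement is hypothetical and not part of any YugabyteDB tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of yb-admin): sums the per-zone minimums of a
# placement string such as "cloud.eu.1:1,cloud.eu.2:1,cloud.us.1:0" and checks
# that the total does not exceed the replication factor.
check_placement() {
  placement=$1
  rf=$2
  total=0
  # Split the string on commas, one block per zone.
  old_ifs=$IFS
  IFS=,
  for block in $placement; do
    # The minimum replica count is the number after the last colon.
    min=${block##*:}
    total=$((total + min))
  done
  IFS=$old_ifs
  if [ "$total" -le "$rf" ]; then
    echo "ok: $total minimum replicas for RF=$rf"
  else
    echo "error: $total minimum replicas exceed RF=$rf"
    return 1
  fi
}

# The default placement used in this post: one replica in each eu zone,
# and the third replica wherever the cluster decides.
check_placement "cloud.eu.1:1,cloud.eu.2:1,cloud.us.1:0,cloud.ap.1:0,cloud.au.1:0" 3
# prints: ok: 2 minimum replicas for RF=3
```

With the minimums summing to 2 for RF=3, one replica per tablet is left unconstrained, which is why the third copy can land in us, ap or au.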
Tablespace
In the first post of this series, I defined the number of replicas and the placement blocks with yb-admin modify_placement_info cloud.eu.1:1,cloud.eu.2:1,cloud.us.1:1,cloud.ap.1:1,cloud.au.1:1 5 and the leader preference with yb-admin set_preferred_zones cloud.eu.1 cloud.eu.2. This was at cluster level. Here I'll leave the cluster default (RF=3, as defined above) and create an eu_preferred tablespace with the same placement as in that first post:
psql -p 5001 -c '
create tablespace eu_preferred with (
 replica_placement=$placement$
{
  "num_replicas": 5,
  "placement_blocks": [{
      "cloud": "cloud",
      "region": "eu",
      "zone": "1",
      "min_num_replicas": 1,
      "leader_preference": 1
    },
    {
      "cloud": "cloud",
      "region": "eu",
      "zone": "2",
      "min_num_replicas": 1,
      "leader_preference": 1
    },
    {
      "cloud": "cloud",
      "region": "us",
      "zone": "1",
      "min_num_replicas": 1
    },
    {
      "cloud": "cloud",
      "region": "ap",
      "zone": "1",
      "min_num_replicas": 1
    },
    {
      "cloud": "cloud",
      "region": "au",
      "zone": "1",
      "min_num_replicas": 1
    }
  ]
}$placement$)
'
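To confirm what was stored, the tablespace can be looked up in the catalog. This is a minimal sketch, assuming the placement JSON is exposed in the spcoptions column of pg_tablespace. The function name tablespace_options_sql is hypothetical: it only prints the query, so that it can be piped to psql on any node.

```shell
#!/bin/sh
# Hypothetical helper: prints a catalog query for the tablespace named in $1.
# Assumption: the replica_placement option is visible in pg_tablespace.spcoptions.
tablespace_options_sql() {
  cat <<SQL
select spcname, spcoptions from pg_tablespace where spcname = '$1';
SQL
}

# Usage, assuming the yb-eu-1 container from this post is exposed on port 5001:
#   tablespace_options_sql eu_preferred | psql -p 5001
```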
Leader and Follower placement
I run the same as in the previous posts but with the table creation in the new tablespace:
psql -p 5005 -e <<SQL
drop table if exists demo;
create table demo tablespace eu_preferred
as select generate_series(1,1000) n;
update demo set n=n+1;
\watch 0.01
SQL
The leaders are in eu, and reads and writes happen there.
I have followers in all regions.
The tables in this tablespace have the same distribution as in the first blog post of this series.
Default distribution
Tablespaces give control over specific tables, but the cluster configuration remains the default for all others. When you don't mention a tablespace, the distribution follows the default tablet placement for the cluster:
psql -p 5005 -e <<SQL
drop table if exists simple ;
create table simple
as select generate_series(1,1000) n;
update simple set n=n+1;
\watch 0.01
SQL
This uses the default for my cluster: two replicas in eu, with the preferred leaders there, and one follower in any other region.
The updates go to the leaders.
Now, connecting to au and enabling follower reads:
psql -p 5005 -e <<SQL
set yb_read_from_followers=on;
set default_transaction_read_only = on;
explain analyze select * from simple;
\watch 0.01
SQL
Because this simple table uses the default cluster distribution (5 nodes with RF=3), no node holds a copy of all the data, so the reads are distributed across all nodes.
This is another advantage of follower reads: the reads are distributed even when the leaders are not.
Thanks to the default placement and tablespaces, I can decide which tables are replicated to all regions and which ones are distributed, with specific consideration for tablet leaders in both cases. By tables, I also mean indexes and partitions. Here I considered eu as the main region to place the leaders, but with declarative partitioning this can also depend on a business value in a table column, like the user country.
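The partitioning idea can be sketched as follows: a table is list-partitioned on a country column, and the partition for European countries is created in the eu_preferred tablespace while the others follow the cluster default. The table names (users, users_eu, users_other), the columns and the country codes are hypothetical, chosen only for illustration. The function only prints the SQL so that it can be piped to psql.

```shell
#!/bin/sh
# Hypothetical sketch: prints DDL that places one partition in eu_preferred.
# Table names, columns and country codes are illustrative assumptions.
partition_sketch_sql() {
  cat <<'SQL'
create table users (
  id bigint,
  country text,
  name text,
  primary key (id, country)
) partition by list (country);
-- rows for these countries use the placement of the eu_preferred tablespace
create table users_eu partition of users for values in ('FR', 'DE') tablespace eu_preferred;
-- all other rows follow the default placement of the cluster
create table users_other partition of users default;
SQL
}

# Usage, assuming the cluster and the tablespace from this post exist:
#   partition_sketch_sql | psql -p 5001 -e
```

The partition key is part of the primary key because a primary key on a partitioned table must include the partitioning column.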