Set Up Cross-Cluster Replication for Data Migration in Elasticsearch

Anshul Kichara - Nov 3 - Dev Community

Cross Cluster Replication (CCR) is a powerful feature in Elasticsearch that facilitates seamless data migration and disaster recovery by allowing real-time replication of data from a remote source cluster to a target cluster. In this guide, we’ll cover setting up CCR to migrate data from a Remote Cluster in Singapore to a Local Cluster in Mumbai. This step-by-step process involves configuring both clusters, creating follower indices, and monitoring the replication status. By the end, you’ll have a reliable setup for cross-regional data replication, enhancing both data redundancy and accessibility across geographical locations.

Pre-requisites

You need the "manage" cluster privilege on the local cluster.
Both clusters need a license that includes cross-cluster replication.
In the local cluster, all master nodes should have the remote_cluster_client role. The local cluster must also have at least one node with both a data role and the remote_cluster_client role.
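The role requirement can be checked before going further. The sketch below is a minimal illustration, assuming you have already fetched the JSON returned by `GET /_nodes` on the local cluster; the helper names and the sample node data are hypothetical.

```python
# Minimal sketch: inspect a GET /_nodes response for CCR role requirements.
# `sample_nodes` is made-up data that mimics the response shape
# ({"nodes": {<node_id>: {"name": ..., "roles": [...]}}}).

def nodes_missing_role(nodes_info, role="remote_cluster_client"):
    """Return the sorted names of nodes whose roles list lacks `role`."""
    missing = []
    for node in nodes_info.get("nodes", {}).values():
        if role not in node.get("roles", []):
            missing.append(node.get("name", "<unnamed>"))
    return sorted(missing)


def has_data_node_with_client_role(nodes_info):
    """True if at least one node has a data role AND remote_cluster_client."""
    for node in nodes_info.get("nodes", {}).values():
        roles = node.get("roles", [])
        is_data = any(r == "data" or r.startswith("data_") for r in roles)
        if is_data and "remote_cluster_client" in roles:
            return True
    return False


sample_nodes = {
    "nodes": {
        "a1": {"name": "mumbai-master-1", "roles": ["master", "remote_cluster_client"]},
        "b2": {"name": "mumbai-data-1", "roles": ["data", "remote_cluster_client"]},
        "c3": {"name": "mumbai-data-2", "roles": ["data_hot"]},
    }
}

print(nodes_missing_role(sample_nodes))
print(has_data_node_with_client_role(sample_nodes))
```

Any node name reported by `nodes_missing_role` would need `remote_cluster_client` added to `node.roles` in its elasticsearch.yml if it is expected to talk to the remote cluster.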

Steps:
Establish trust between your clusters. This can be done in two ways: API key authentication or TLS certificate authentication.
To set up API key authentication, enable the remote cluster server on every node of the remote cluster (Singapore). In elasticsearch.yml, specify the domain of your remote cluster.
Generate a certificate authority (CA) and a wildcard server certificate/key pair that works on all nodes of your cluster. Reference the resulting .p12 keystore in your elasticsearch.yml.
Add the SSL keystore password to the Elasticsearch keystore rather than storing it in plain text.
Restart the cluster.
For additional instructions, refer to the Elastic documentation on API key authentication.
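Once the remote cluster server is running, two requests complete the trust setup: one creates a cross-cluster API key on the remote cluster, the other registers the remote on the local cluster. The sketch below only builds the request bodies. The alias `singapore`, the index pattern, and the host name are placeholders, and the endpoint paths in the comments should be checked against the documentation for your Elasticsearch version.

```python
import json

# Minimal sketch: request bodies for API key based remote cluster setup.
# All names (alias, index pattern, host) are illustrative assumptions.

def cross_cluster_api_key_body(name, leader_patterns):
    """Body for POST /_security/cross_cluster/api_key (run on the REMOTE cluster).

    Grants replication access to the leader indices matching the patterns.
    """
    return {
        "name": name,
        "access": {"replication": [{"names": list(leader_patterns)}]},
    }


def remote_cluster_settings_body(alias, proxy_address):
    """Body for PUT /_cluster/settings (run on the LOCAL cluster).

    Registers the remote under `alias` using proxy mode.
    """
    return {
        "persistent": {
            "cluster": {
                "remote": {
                    alias: {"mode": "proxy", "proxy_address": proxy_address}
                }
            }
        }
    }


api_key_body = cross_cluster_api_key_body("ccr-mumbai", ["logs-*"])
settings_body = remote_cluster_settings_body("singapore", "es.singapore.example.com:9443")

print(json.dumps(api_key_body, indent=2))
print(json.dumps(settings_body, indent=2))
```

The encoded key returned by the first request is typically stored in the local cluster's keystore under `cluster.remote.<alias>.credentials` before the remote is registered, so it never appears in elasticsearch.yml.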

You can check your cluster connection from Stack Management > Data > Remote Clusters.
Configure privileges by creating separate roles on the local and remote clusters, and then create a user with the required roles.
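As a rough illustration of the role split, the sketch below builds two role definitions of the kind Elastic's CCR tutorial describes: a read-oriented role on the remote cluster and a follow-oriented role on the local cluster. The role names and index names are hypothetical, and the exact privilege set may differ by version and authentication model, so treat it as a starting point rather than a definitive list.

```python
# Minimal sketch: role bodies for PUT /_security/role/<role_name>.
# Index names and role names are placeholders, not values from this setup.

def remote_ccr_role(leader_index):
    """Role for the REMOTE (leader) cluster: read access to the leader index."""
    return {
        "cluster": ["read_ccr"],
        "indices": [
            {"names": [leader_index], "privileges": ["monitor", "read"]}
        ],
    }


def local_ccr_role(follower_index):
    """Role for the LOCAL (follower) cluster: manage CCR and write to the follower."""
    return {
        "cluster": ["manage_ccr"],
        "indices": [
            {
                "names": [follower_index],
                "privileges": ["monitor", "read", "write", "manage_follow_index"],
            }
        ],
    }


remote_role = remote_ccr_role("logs-singapore")
local_role = local_ccr_role("logs-mumbai")
print(remote_role)
print(local_role)
```

A user on each cluster is then assigned the matching role, which keeps the replication account from holding broader privileges than it needs.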

Finally, create a follower index in your local cluster to replicate data from a selected index in the remote cluster.
Elasticsearch initializes the follower through remote recovery, which copies the Lucene segment files from the leader index to the follower. The follower's status shows as Paused during this initial transfer and switches to Active once it completes.
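The follower can also be created through the REST API instead of Kibana. The sketch below assembles the create-follower request; the index names and the `singapore` alias are placeholders, and sending the request (with whatever HTTP client and credentials you use) is left out.

```python
# Minimal sketch: build the create-follower request for the LOCAL cluster.
# Nothing is sent over the network; the function only returns the pieces.

def follow_request(follower_index, remote_alias, leader_index):
    """Return (method, path, body) for the CCR create-follower API."""
    # wait_for_active_shards=1 makes the call wait until the follower's
    # primary shard is active before returning.
    path = f"/{follower_index}/_ccr/follow?wait_for_active_shards=1"
    body = {"remote_cluster": remote_alias, "leader_index": leader_index}
    return "PUT", path, body


method, path, body = follow_request("logs-mumbai", "singapore", "logs-singapore")
print(method, path)
print(body)
```

The `remote_cluster` value must match the alias under which the remote was registered on the local cluster.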

When new data is written to the leader index, the follower continues to replicate those updates.
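To confirm the follower is keeping up, the response of `GET /<follower_index>/_ccr/stats` can be reduced to a simple lag figure. The sketch below assumes the response has already been fetched and parsed; the sample data is invented and only mirrors the fields used here.

```python
# Minimal sketch: compute per-index replication lag from a CCR stats response.
# Lag is measured in operations: leader_global_checkpoint minus
# follower_global_checkpoint, summed across the index's shards.

def replication_lag(stats):
    """Map each follower index name to its total outstanding operations."""
    lag = {}
    for index in stats.get("indices", []):
        total = 0
        for shard in index.get("shards", []):
            behind = shard["leader_global_checkpoint"] - shard["follower_global_checkpoint"]
            total += max(0, behind)
        lag[index["index"]] = total
    return lag


sample_stats = {
    "indices": [
        {
            "index": "logs-mumbai",
            "shards": [
                {"leader_global_checkpoint": 1024, "follower_global_checkpoint": 1000},
                {"leader_global_checkpoint": 500, "follower_global_checkpoint": 500},
            ],
        }
    ]
}

print(replication_lag(sample_stats))
```

A lag that stays near zero suggests the follower is current; a value that keeps growing usually points to a network or capacity problem worth investigating.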

Additional Considerations for CCR

Operational Continuity: You do not need to bring down connections or stop reads and writes in the Singapore cluster to set up Cross-Cluster Replication (CCR). CCR can be configured while the Singapore cluster continues to serve reads and writes.
Replication Time: As a rough guide, replicating 1 GB of data takes around 1 minute from the remote to the local cluster. Actual throughput depends on network bandwidth between the regions and on cluster load.
Ongoing Replication: CCR continuously replicates data from the remote to the local cluster. As long as both clusters are operational, there should be no downtime. However, if there’s a network disruption or if either cluster experiences an outage, replication might temporarily pause until connectivity is restored.
Downtime during failover: If you have pre-configured your Mumbai cluster to handle writes as soon as a failover is triggered, then the downtime can be minimized or even eliminated entirely.
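For the failover case, a follower has to be converted into a regular index before it can accept writes. The sketch below lists the usual sequence of API calls for that promotion; the index name is a placeholder, and the calls are returned as data rather than executed.

```python
# Minimal sketch: the request sequence that turns a follower index into a
# regular, writable index on the LOCAL cluster during failover.

def promote_follower_steps(follower_index):
    """Return the ordered (method, path) calls to promote a follower."""
    return [
        ("POST", f"/{follower_index}/_ccr/pause_follow"),  # stop replication
        ("POST", f"/{follower_index}/_close"),             # unfollow needs a closed index
        ("POST", f"/{follower_index}/_ccr/unfollow"),      # convert to a regular index
        ("POST", f"/{follower_index}/_open"),              # reopen for reads and writes
    ]


steps = promote_follower_steps("logs-mumbai")
for method, path in steps:
    print(method, path)
```

Unfollowing is one-way: to resume replication afterwards, a follower index has to be created again, so this sequence is best kept for a real failover or the final cutover of a migration.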

For more information, see: Cross Cluster Replication for Data migration in Elasticsearch.
