"etcd is a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines. It gracefully handles leader elections during network partitions and can tolerate machine failure, even in the leader node."
In a K8s cluster, etcd is part of the master and stores all of the cluster's data: its configuration data, its state, and its metadata. Since K8s is a distributed system, it needs a distributed data store like etcd. Any member of the etcd cluster can serve reads and accept writes, and Kubernetes components reach that data through the API server.
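To make "key-value store" concrete, here is a toy in-memory sketch of how cluster objects can be laid out as keys and read back by prefix. It is not the etcd client API: the `ToyStore` class, the object names, and the JSON-looking values are made up for illustration. The only real detail is that Kubernetes keeps its objects under a `/registry` prefix by default.

```python
from typing import Dict, List, Tuple


class ToyStore:
    """A toy stand-in for a key-value store (hypothetical, not etcd itself)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        """Create or overwrite a key."""
        self._data[key] = value

    def get_prefix(self, prefix: str) -> List[Tuple[str, str]]:
        """Return every (key, value) pair whose key starts with `prefix`, sorted by key."""
        return sorted((k, v) for k, v in self._data.items() if k.startswith(prefix))


if __name__ == "__main__":
    store = ToyStore()
    # Illustrative keys; the values are placeholders, not the real stored encoding.
    store.put("/registry/pods/default/web-1", '{"phase": "Running"}')
    store.put("/registry/pods/default/web-2", '{"phase": "Pending"}')
    store.put("/registry/configmaps/default/app-config", '{"LOG_LEVEL": "info"}')

    # "List all pods in the default namespace" becomes a prefix read.
    for key, value in store.get_prefix("/registry/pods/default/"):
        print(key, value)
```

Listing a kind of object becomes a read over a key prefix, which is why a simple key-value model is enough to hold the whole cluster state.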
It gets more complicated than that, since in the real world etcd is deployed as a cluster of its own. That cluster contains at least 3 nodes for the sake of good replication, durability, and high availability.
Communication between the nodes is handled by the Raft consensus algorithm; watch the animation here.
ProTip: use at least 5 nodes for a production etcd cluster.
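The 3-node and 5-node numbers follow from how Raft works: the cluster can only make progress while a majority (a quorum) of its members is reachable. A minimal sketch of that arithmetic, with function names chosen here for illustration:

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting nodes."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    print("members  quorum  tolerated failures")
    for n in range(1, 8):
        print(f"{n:>7}  {quorum(n):>6}  {fault_tolerance(n):>18}")
```

A 3-node cluster survives 1 failure and a 5-node cluster survives 2. An even-sized cluster buys nothing extra: 4 nodes still tolerate only 1 failure, which is why odd sizes are the usual choice.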
Security is everyone's job!
Access to etcd is equivalent to root permission in the cluster. Because the data is so sensitive, grant access only to the nodes that actually need to reach the etcd cluster.
Make sure to set up firewall rules or use the security features provided by etcd.
Plan for:
- Securing communication with certificates
- Limiting access to etcd clusters with authentication
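As a rough sketch of the first item, the helper below assembles an etcd command line with TLS and client-certificate verification turned on for both client and peer traffic. The flags are etcd's own; the helper name, the certificate file names, and the addresses are assumptions for illustration, and a real member needs further flags (name, data directory, initial cluster, and so on).

```python
from typing import List


def secure_etcd_args(cert_dir: str, client_url: str, peer_url: str) -> List[str]:
    """Build a partial etcd command line with TLS on client and peer traffic.

    File names under `cert_dir` are hypothetical; adjust them to your PKI layout.
    """
    for url in (client_url, peer_url):
        if not url.startswith("https://"):
            raise ValueError(f"expected an https:// URL, got {url!r}")
    return [
        "etcd",
        f"--listen-client-urls={client_url}",
        f"--advertise-client-urls={client_url}",
        f"--listen-peer-urls={peer_url}",
        # Client-to-server traffic: serve TLS and require a client certificate
        # signed by the trusted CA.
        f"--cert-file={cert_dir}/server.crt",
        f"--key-file={cert_dir}/server.key",
        "--client-cert-auth",
        f"--trusted-ca-file={cert_dir}/ca.crt",
        # Member-to-member traffic: the same idea for peers.
        f"--peer-cert-file={cert_dir}/peer.crt",
        f"--peer-key-file={cert_dir}/peer.key",
        "--peer-client-cert-auth",
        f"--peer-trusted-ca-file={cert_dir}/ca.crt",
    ]


if __name__ == "__main__":
    args = secure_etcd_args("/etc/etcd/pki", "https://10.0.0.1:2379", "https://10.0.0.1:2380")
    print(" \\\n  ".join(args))
```

For the second item, etcd v3 also has built-in user and role authentication, switched on with `etcdctl user add root` followed by `etcdctl auth enable`.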
Scale
You probably won't need more than 5 nodes. Scaling etcd increases availability but reduces write performance quite drastically, because etcd is strongly consistent: every write has to be replicated to a majority of the members before it is committed. Don't overdo it.
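A simplified model shows where the slowdown comes from. Assume the leader sends a write to all followers in parallel and commits once a majority of the cluster (counting itself) has acknowledged it. The round-trip times below are made-up numbers, and the model ignores disk syncs and leader load, so treat it as an illustration rather than a benchmark.

```python
from typing import Sequence


def commit_latency_ms(follower_rtts_ms: Sequence[float]) -> float:
    """Time until a write is acknowledged by a majority of the cluster.

    The leader counts as one vote, so it only has to wait for
    quorum - 1 followers, i.e. the (quorum - 1)-th fastest one.
    """
    members = len(follower_rtts_ms) + 1  # followers plus the leader
    quorum = members // 2 + 1
    acks_needed = quorum - 1
    if acks_needed == 0:
        return 0.0  # single-node cluster: nothing to wait for
    return sorted(follower_rtts_ms)[acks_needed - 1]


if __name__ == "__main__":
    # Hypothetical round-trip times from the leader to each follower, in ms.
    three_nodes = [2.0, 40.0]
    five_nodes = [2.0, 40.0, 45.0, 50.0]
    print("3 nodes:", commit_latency_ms(three_nodes), "ms")
    print("5 nodes:", commit_latency_ms(five_nodes), "ms")
```

With these numbers the 3-node cluster commits after 2.0 ms, because one nearby follower completes the majority. The 5-node cluster has to wait 40.0 ms for a second, more distant follower. Each extra pair of members raises the number of acknowledgements every write must wait for.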
That was "what is etcd in 2 minutes or less".
Enjoyed it? Check out the Kubernetes bitesize series.