etcd backup & restore in Kubernetes
etcd is a consistent and highly available key-value store that Kubernetes uses as the backing store for all cluster data. If the cluster is disrupted, that data should be recoverable from backups. etcd is a leader-based distributed system, so it is preferable to run it as a cluster with an odd number of members; a five-member cluster is recommended in production. An etcd cluster stays available as long as a majority of its members (a quorum) is healthy, so it tolerates the failure of a minority of members: one in a three-member cluster, two in a five-member cluster.
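The reason for preferring an odd member count can be checked with a little arithmetic. The sketch below assumes the usual majority rule (quorum = floor(N/2) + 1); the function names quorum and fault_tolerance are made up for illustration and are not part of etcd or etcdctl.

```shell
# Quorum size for a cluster of N members under the majority rule: floor(N/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of member failures the cluster can lose while still keeping a quorum.
fault_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

# Print the numbers for small clusters to compare odd and even sizes.
for n in 1 2 3 4 5 6 7; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

Going from three to four members raises the quorum from 2 to 3 but still tolerates only one failure, which is why an even-sized cluster adds cost without adding resilience.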
Access to etcd is equivalent to root permission in the cluster, so ideally only the API server should have access to it.
If any API servers are running in your cluster, you should not attempt to restore instances of etcd. Instead, follow these steps to restore etcd:
stop all API server instances
restore state in all etcd instances
restart all API server instances
# check where etcd stores data
etcd.service
[...]
--data-dir=/var/lib/etcd

# force etcdctl to use API v3
export ETCDCTL_API=3

# take a snapshot and inspect it
etcdctl snapshot save snapshot.db
etcdctl snapshot status snapshot.db

# to restore, stop the API server first
service kube-apiserver stop

# restore the snapshot into a new data-dir
etcdctl snapshot restore snapshot.db --data-dir=/var/lib/etcd-from-backup

# change the etcd service configuration to use the new data-dir
etcd.service
[...]
--data-dir=/var/lib/etcd-from-backup

# restart services
systemctl daemon-reload
service etcd restart
service kube-apiserver start

# remember to specify the certificate files and the endpoint with all of the etcdctl commands above!
# the exception is when etcd is on the controlplane node and etcdctl is invoked from there
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/etcd/ca.crt \
--cert=/etc/etcd/etcd-server.crt \
--key=/etc/etcd/etcd-server.key

# trusted-ca-file, cert-file and key-file can be obtained from the description of the etcd Pod or from the service definition file
etcdctl        etcd.service
--endpoints    --listen-client-urls
--cacert       --trusted-ca-file
--cert         --cert-file
--key          --key-file
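To avoid retyping the TLS flags, the snapshot commands above could be wrapped in a small function. This is only a sketch: backup_etcd and the ETCD_ENDPOINT, ETCD_CACERT, ETCD_CERT and ETCD_KEY variables are hypothetical names, and the default values are the example endpoint and paths from these notes, which will likely differ on a real cluster. Note also that from etcd v3.5 on, snapshot status and snapshot restore are deprecated in etcdctl and offered by etcdutl instead.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the snapshot commands from the notes in one function.
# backup_etcd and the ETCD_* variables are hypothetical names; the defaults
# are the example endpoint and certificate paths used in these notes.

export ETCDCTL_API=3

ETCD_ENDPOINT="${ETCD_ENDPOINT:-https://127.0.0.1:2379}"
ETCD_CACERT="${ETCD_CACERT:-/etc/etcd/ca.crt}"
ETCD_CERT="${ETCD_CERT:-/etc/etcd/etcd-server.crt}"
ETCD_KEY="${ETCD_KEY:-/etc/etcd/etcd-server.key}"

backup_etcd() {
  local snapshot="$1"
  if [ -z "$snapshot" ]; then
    echo "usage: backup_etcd <snapshot-file>" >&2
    return 2
  fi

  # Save a snapshot over TLS; stop here if the save fails.
  etcdctl \
    --endpoints="$ETCD_ENDPOINT" \
    --cacert="$ETCD_CACERT" \
    --cert="$ETCD_CERT" \
    --key="$ETCD_KEY" \
    snapshot save "$snapshot" || return 1

  # Inspect the saved file (reads the local file, so no TLS flags are needed).
  etcdctl snapshot status "$snapshot"
}
```

After sourcing the file, a call such as backup_etcd snapshot.db would save and then inspect the snapshot; overriding ETCD_CACERT and the other variables in the environment adapts it to the paths found in the etcd Pod description or service definition file.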