etcd is the distributed key-value store that Kubernetes, and plenty of other systems, trust with their most critical state. A single etcd node works, but it is a single point of failure: lose that machine and you lose the source of truth. High availability comes from running etcd as a cluster, where the data is replicated across nodes and the cluster keeps serving as long as a majority survives.
In this tutorial you will set up a 3-node etcd cluster on Ubuntu: install the current etcd release, configure each member, run it under systemd, verify cluster health with etcdctl, and, most importantly, kill a node and watch the cluster keep working.
This setup uses plain HTTP between nodes, which is fine for a trusted private network or a lab. For production, or as the external datastore for Kubernetes, add TLS: How to Setup etcd Cluster with TLS Encryption.
Why 3 Nodes? Quorum Explained
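etcd uses the Raft consensus algorithm. Every write must be acknowledged by a quorum, a majority of members, before it is committed. For a cluster with n members, quorum is floor(n/2)+1, that is, integer division with the remainder dropped, so 3 members need 2 and 5 members need 3: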
| Cluster size | Quorum | Failures tolerated |
|---|---|---|
| 1 | 1 | 0 |
| 2 | 2 | 0 |
| 3 | 2 | 1 |
| 5 | 3 | 2 |
Two lessons from that table:
- Never build a 2-node etcd cluster. Quorum for 2 nodes is 2, so losing either node halts the cluster. You pay for double the hardware and get no extra availability; with two machines that can each stop the cluster, you arguably get less. Three is the minimum useful cluster.
- Bigger is not automatically better. A 5-node cluster tolerates 2 failures, which covers most realistic scenarios. Beyond that, every write has to replicate to more machines, so write latency suffers. Use odd numbers only. A 4th node adds replication cost without raising failure tolerance.
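The quorum arithmetic is easy to check for any cluster size. The sketch below is only an illustration: the function names quorum and tolerated are made up for this example, and it relies on shell integer division to drop the remainder.

```shell
# quorum N: majority needed for a cluster of N members.
# Shell arithmetic is integer-only, so N / 2 already rounds down.
quorum() { echo $(( $1 / 2 + 1 )); }

# tolerated N: how many members can fail while a quorum is still reachable.
tolerated() { echo $(( $1 - ($1 / 2 + 1) )); }

# Print the figures for cluster sizes 1 through 7.
for n in 1 2 3 4 5 6 7; do
  printf 'size=%d quorum=%d tolerated=%d\n' "$n" "$(quorum "$n")" "$(tolerated "$n")"
done
```

Sizes 3 and 4 both report tolerated=1, which is the reason to stick with odd numbers.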
Prerequisites
- Three servers running Ubuntu 22.04 or 24.04 LTS, on a network with low latency between them
- Ports 2379 (client) and 2380 (peer) open between the nodes. If you run UFW, see Setup Firewall Using UFW on Ubuntu
- A user with sudo privileges on all three
- Fast disks help. etcd fsyncs every write, and slow storage is the top cause of unhealthy clusters
The Use Case
 ____________              ____________
|            |            |            |
|   etcd 1   |------------|   etcd 2   |
|____________|      |     |____________|
                    |
               _____|______
              |            |
              |   etcd 3   |
              |____________|
- etcd1: 192.168.5.100
- etcd2: 192.168.5.101
- etcd3: 192.168.5.102
Step 1: Install etcd on All Three Nodes
Download the official binary release from GitHub. Set the version once. Check the etcd releases page for the current stable version:
ETCD_VER=v3.6.1
wget -q --show-progress \
"https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz"
Extract and install the binaries:
tar zxf etcd-${ETCD_VER}-linux-amd64.tar.gz
sudo mv etcd-${ETCD_VER}-linux-amd64/etcd* /usr/local/bin/
You get three binaries: etcd (the server), etcdctl (the client), and etcdutl (offline maintenance tools). Confirm the install:
etcd --version
etcdctl version
Repeat on all three nodes.
Step 2: Create the Member Configuration
Each member gets an environment file at /etc/etcd. The files differ only in the member’s own name and IP. The ETCD_INITIAL_CLUSTER line listing all members is identical everywhere.
On etcd1:
sudo nano /etc/etcd
ETCD_NAME=etcd1
ETCD_DATA_DIR=/var/lib/etcd
ETCD_LISTEN_CLIENT_URLS=http://192.168.5.100:2379,http://127.0.0.1:2379
ETCD_LISTEN_PEER_URLS=http://192.168.5.100:2380
ETCD_ADVERTISE_CLIENT_URLS=http://192.168.5.100:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=http://192.168.5.100:2380
ETCD_INITIAL_CLUSTER=etcd1=http://192.168.5.100:2380,etcd2=http://192.168.5.101:2380,etcd3=http://192.168.5.102:2380
ETCD_INITIAL_CLUSTER_STATE=new
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
On etcd2, same file with the name and IPs changed:
ETCD_NAME=etcd2
ETCD_DATA_DIR=/var/lib/etcd
ETCD_LISTEN_CLIENT_URLS=http://192.168.5.101:2379,http://127.0.0.1:2379
ETCD_LISTEN_PEER_URLS=http://192.168.5.101:2380
ETCD_ADVERTISE_CLIENT_URLS=http://192.168.5.101:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=http://192.168.5.101:2380
ETCD_INITIAL_CLUSTER=etcd1=http://192.168.5.100:2380,etcd2=http://192.168.5.101:2380,etcd3=http://192.168.5.102:2380
ETCD_INITIAL_CLUSTER_STATE=new
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
On etcd3:
ETCD_NAME=etcd3
ETCD_DATA_DIR=/var/lib/etcd
ETCD_LISTEN_CLIENT_URLS=http://192.168.5.102:2379,http://127.0.0.1:2379
ETCD_LISTEN_PEER_URLS=http://192.168.5.102:2380
ETCD_ADVERTISE_CLIENT_URLS=http://192.168.5.102:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=http://192.168.5.102:2380
ETCD_INITIAL_CLUSTER=etcd1=http://192.168.5.100:2380,etcd2=http://192.168.5.101:2380,etcd3=http://192.168.5.102:2380
ETCD_INITIAL_CLUSTER_STATE=new
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
What the key settings mean:
- ETCD_LISTEN_CLIENT_URLS (port 2379) is where clients (etcdctl, Kubernetes, your apps) connect. Including 127.0.0.1 lets you run etcdctl locally on the node without flags.
- ETCD_LISTEN_PEER_URLS (port 2380) is where other cluster members connect for Raft replication.
- ETCD_INITIAL_CLUSTER is the full member roster used to bootstrap the cluster. It must be identical on all nodes, and the names must match each node’s ETCD_NAME.
- ETCD_INITIAL_CLUSTER_STATE=new says this is a fresh cluster, not a member joining an existing one.
- ETCD_INITIAL_CLUSTER_TOKEN is a shared token that prevents members of different clusters on the same network from accidentally joining each other.
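Since the three files differ only in name and IP, one way to avoid copy-paste slips is to generate them from a single roster. This is a minimal sketch, not part of etcd: render_member_env is a hypothetical helper, and the roster and addresses are the ones used in this tutorial.

```shell
# Roster from this tutorial; edit to match your own nodes.
ROSTER="etcd1=http://192.168.5.100:2380,etcd2=http://192.168.5.101:2380,etcd3=http://192.168.5.102:2380"

# render_member_env NAME IP: print the environment file for one member.
# Only NAME and IP vary; the roster line is the same for every member.
render_member_env() {
  cat <<EOF
ETCD_NAME=${1}
ETCD_DATA_DIR=/var/lib/etcd
ETCD_LISTEN_CLIENT_URLS=http://${2}:2379,http://127.0.0.1:2379
ETCD_LISTEN_PEER_URLS=http://${2}:2380
ETCD_ADVERTISE_CLIENT_URLS=http://${2}:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=http://${2}:2380
ETCD_INITIAL_CLUSTER=${ROSTER}
ETCD_INITIAL_CLUSTER_STATE=new
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
EOF
}

# Example: print the file for etcd2 and review it before installing it.
render_member_env etcd2 192.168.5.101
```

Review the output, then write it to the environment file on the matching node (for example by piping it through sudo tee).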
Step 3: Create the systemd Service
On all three nodes, create the unit file:
sudo nano /etc/systemd/system/etcd.service
[Unit]
Description=etcd key-value store
Documentation=https://etcd.io/docs/
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=/etc/etcd
ExecStart=/usr/local/bin/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
Enable and start the service on all three nodes:
sudo systemctl daemon-reload
sudo systemctl enable --now etcd
Timing note: the first node will sit waiting for peers. The cluster cannot elect a leader until a quorum (2 of 3) is up, so start at least two nodes within about a minute of each other. If a node gives up with an election timeout error, just start it again once the others are running.
Check the service on each node:
systemctl status etcd
Step 4: Verify the Cluster
List the members:
etcdctl member list -w table
+------------------+---------+-------+---------------------------+---------------------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS |
+------------------+---------+-------+---------------------------+---------------------------+
| 685732e85e851bdd | started | etcd1 | http://192.168.5.100:2380 | http://192.168.5.100:2379 |
| 8940390e3669b48e | started | etcd2 | http://192.168.5.101:2380 | http://192.168.5.101:2379 |
| 345721e85e456bzw | started | etcd3 | http://192.168.5.102:2380 | http://192.168.5.102:2379 |
+------------------+---------+-------+---------------------------+---------------------------+
Then check health across all endpoints, not just the local one:
ENDPOINTS=http://192.168.5.100:2379,http://192.168.5.101:2379,http://192.168.5.102:2379
etcdctl --endpoints=$ENDPOINTS endpoint health
http://192.168.5.100:2379 is healthy: successfully committed proposal: took = 8.4ms
http://192.168.5.101:2379 is healthy: successfully committed proposal: took = 9.1ms
http://192.168.5.102:2379 is healthy: successfully committed proposal: took = 9.8ms
etcdctl endpoint status shows which member is currently the Raft leader (the IS LEADER column in the table output):
etcdctl --endpoints=$ENDPOINTS endpoint status -w table
Finally, prove replication works. Write on one node, read on another:
# on etcd1
etcdctl put greeting "hello from etcd1"
# on etcd3
etcdctl get greeting
greeting
hello from etcd1
Step 5: Simulate a Node Failure
A high availability setup you have never failure-tested is a hope, not a setup. Stop etcd on one node:
# on etcd2
sudo systemctl stop etcd
From another node, check health again:
etcdctl --endpoints=$ENDPOINTS endpoint health
The dead endpoint reports unhealthy, and the other two keep answering. Writes still work, because 2 of 3 is still a quorum:
etcdctl put survived "yes"
etcdctl get survived
If etcd2 held the leader role, the remaining members elect a new leader within seconds (watch it happen in endpoint status -w table). Now bring the node back:
# on etcd2
sudo systemctl start etcd
It rejoins, catches up on the writes it missed, and endpoint health goes green across the board. That round trip (kill, verify, restore, verify) is the whole point of running a cluster.
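To watch availability during the test rather than checking once afterwards, you can keep writing while you stop and start a member. The sketch below assumes the ENDPOINTS list from Step 4; probe_writes, the key name probe, and the PROBE_INTERVAL variable are illustrative choices, not etcd features.

```shell
ENDPOINTS=http://192.168.5.100:2379,http://192.168.5.101:2379,http://192.168.5.102:2379

# probe_writes COUNT: attempt COUNT writes and report each result.
# Waits PROBE_INTERVAL seconds between attempts (default 1).
probe_writes() {
  ok=0
  failed=0
  i=1
  while [ "$i" -le "$1" ]; do
    if etcdctl --endpoints="$ENDPOINTS" put probe "$i" >/dev/null 2>&1; then
      ok=$((ok + 1))
      echo "$(date +%T) write $i ok"
    else
      failed=$((failed + 1))
      echo "$(date +%T) write $i FAILED"
    fi
    i=$((i + 1))
    sleep "${PROBE_INTERVAL:-1}"
  done
  echo "ok=$ok failed=$failed"
}
```

Run probe_writes 60 on etcd1, then stop and start etcd on etcd2 from a second terminal. A short run of slow or failed writes while a new leader is elected would not be surprising; a long run of failures suggests quorum was actually lost.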
Common Problems and Troubleshooting
A node logs election timeout or publish error: etcdserver: request timed out at first start.
The node came up alone and could not find a quorum. Make sure etcd is started on at least two nodes, then restart the failed one. This is a bootstrap-timing issue, not a configuration error.
member ... has already been bootstrapped.
The node has an old data directory from a previous attempt. If you are re-bootstrapping a fresh cluster (and the data is disposable), stop etcd, sudo rm -rf /var/lib/etcd, and start again on all nodes. Never do this on a cluster with real data.
Cluster forms but etcdctl from another machine cannot connect.
Check ETCD_LISTEN_CLIENT_URLS includes the node’s real IP (not just 127.0.0.1) and that port 2379 is open in the firewall between client and node.
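A quick reachability check from the client machine narrows this down before you start editing config. This sketch assumes bash and the coreutils timeout command are available on the client; port_open and check_client_ports are hypothetical helper names.

```shell
# port_open HOST PORT: succeed if a TCP connection opens within 2 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash must be installed.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# check_client_ports HOST...: report whether each host accepts connections on 2379.
check_client_ports() {
  for host in "$@"; do
    if port_open "$host" 2379; then
      echo "$host:2379 reachable"
    else
      echo "$host:2379 NOT reachable"
    fi
  done
}
```

Call it as check_client_ports 192.168.5.100 192.168.5.101 192.168.5.102. If a node is reachable from itself but not from the client, the firewall or the listen address is the likely culprit.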
One member keeps falling behind or flapping.
Look at disk latency first: etcd logs warnings like slow fsync when storage cannot keep up. etcd wants low-latency disks (SSD/NVMe); a member on slow storage degrades the whole cluster’s write path.
Names and URLs mismatch: member count is unequal or unmatched member while checking PeerURLs.
The ETCD_INITIAL_CLUSTER roster must be byte-identical on every node, and each ETCD_NAME must appear in it with exactly the peer URL that node advertises. Diff the three /etc/etcd files.
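The name-and-URL half of that rule can be checked mechanically on each node. A minimal sketch, assuming the environment file contains plain KEY=value lines as in Step 2 (so it can be sourced by the shell); check_member_env is a hypothetical helper.

```shell
# check_member_env FILE: verify that NAME=ADVERTISED_PEER_URL from FILE
# appears as one comma-separated entry in ETCD_INITIAL_CLUSTER.
check_member_env() {
  (
    # Source the file in a subshell so its variables do not leak out.
    . "$1" || exit 2
    entry="${ETCD_NAME}=${ETCD_INITIAL_ADVERTISE_PEER_URLS}"
    case ",${ETCD_INITIAL_CLUSTER}," in
      *",${entry},"*)
        echo "ok: ${entry} is in the roster"
        ;;
      *)
        echo "MISMATCH: ${entry} not found in ${ETCD_INITIAL_CLUSTER}"
        exit 1
        ;;
    esac
  )
}
```

Pass it the absolute path of the node's environment file. It only covers one node's internal consistency; comparing the roster line across nodes still needs a diff.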
Best Practices
Run an odd number of members. 3 for most setups, 5 when you need to tolerate two simultaneous failures. Even numbers add cost without adding tolerance.
Take regular snapshots. etcdctl snapshot save /backup/etcd-$(date +%F).db gives you a point-in-time backup you can restore a cluster from with etcdutl snapshot restore. Ship the snapshots off the etcd nodes. Automated Encrypted Backups with Restic on Ubuntu covers a clean way to do that on a schedule.
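A snapshot is only useful if it is readable, so it is worth inspecting each one right after saving it. The wrapper below is a sketch: take_snapshot is a hypothetical name, and it assumes etcdctl can reach the local member on its default endpoint.

```shell
# take_snapshot DIR: save a dated snapshot into DIR, then print its
# summary (hash, revision, key count, size) with etcdutl.
take_snapshot() {
  file="$1/etcd-$(date +%F).db"
  etcdctl snapshot save "$file" || return 1
  etcdutl snapshot status "$file" -w table
}
```

Running take_snapshot /backup from cron or a systemd timer on one member gives you the dated files described above; copying them off the node remains a separate step.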
Add TLS before trusting the cluster with anything real. Plain HTTP means any machine on the network can read and write your cluster state. The TLS setup is a modest extension of what you built here: How to Setup etcd Cluster with TLS Encryption.
Monitor it. etcd exposes Prometheus metrics on the client port at /metrics out of the box. Leader changes, fsync latency, and quorum health are the ones to alert on. See Setup Prometheus and Grafana on Ubuntu.
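Before wiring up Prometheus, you can eyeball the same signals with curl. The filter below is a small sketch; key_metrics is a hypothetical helper, and the three metric families it keeps are one reasonable starting set rather than a complete list.

```shell
# key_metrics: reduce Prometheus text on stdin to a few etcd health signals:
# whether the member sees a leader, how many leader changes it has seen,
# and the running sum/count of WAL fsync durations.
key_metrics() {
  grep -E '^(etcd_server_has_leader|etcd_server_leader_changes_seen_total|etcd_disk_wal_fsync_duration_seconds_(sum|count)) '
}

# Usage, on any member:
#   curl -s http://127.0.0.1:2379/metrics | key_metrics
```

A has_leader value of 0, or a leader-change counter that keeps climbing, is worth investigating before anything else.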
Conclusion
You built a 3-node etcd cluster: identical member rosters bootstrap the cluster, systemd keeps each member alive, etcdctl endpoint health and member list confirm the cluster state, and you proved the availability claim by stopping a member and writing anyway.
The most common reason to build this is as the external datastore for a highly available Kubernetes control plane. How to Install Kubernetes Single Master is the place to start on that path, and the TLS variant of this cluster is the one to use for it.