Why does Zookeeper need an odd number of nodes?

Bikas Katwal
3 min read · Mar 28, 2019

In this blog post, I will explain why Zookeeper, like other quorum-based distributed systems, should run with an odd number of nodes, and I will try to keep it as simple as I can :)

I use Zookeeper as the example, but the same reasoning applies to any distributed system that relies on a majority quorum.

Before getting into the “why” part, let's understand a few basics:

  • Zookeeper runs in either standalone mode or quorum mode.
    What is a quorum? It is the minimum number of nodes in a ZK cluster that must be up and running for the cluster to work.
  • An update request from a client is considered safe once it has been written to at least as many nodes as the quorum size.
  • A Zookeeper cluster stays up and running despite node failures as long as it can still form a quorum (that is, the number of live nodes is equal to or greater than the quorum size).

Example:

  • For a 9-node ZK cluster, let's say the quorum size is 5. Any client update must then be stored on at least 5 nodes in the cluster to be considered safe.
  • Even if 4 nodes go down, the cluster can still form a quorum of 5 nodes and keeps serving client requests.

Now the question that comes to mind is: how do we decide the safest and most efficient quorum size? The answer is (n/2 + 1), where “n” is the total number of nodes and the division is integer division (the fraction is dropped). So a 5-node cluster needs (5/2 + 1) = 2 + 1 = 3 nodes to form a quorum.
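
To make the arithmetic concrete, here is a minimal Python sketch of the formula. The function name `quorum_size` is my own for illustration and is not part of any Zookeeper API; the `//` operator is Python's integer (floor) division.

```python
def quorum_size(n: int) -> int:
    """Smallest strict majority of n nodes: floor(n / 2) + 1."""
    if n < 1:
        raise ValueError("a cluster needs at least one node")
    return n // 2 + 1


# Quorum sizes for a few cluster sizes used in this post.
for n in (3, 5, 6, 9):
    print(f"{n} nodes -> quorum of {quorum_size(n)}")
```

For the 9-node example above this gives a quorum of 5, and for the 5-node cluster a quorum of 3.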

Why (n/2 + 1)?

Let’s consider a 5-node cluster in which 3 nodes are in data centre DC1 and 2 nodes are in a different data centre, DC2.

Zookeeper Cluster with 5 nodes spread across 2 data centres

Assume the quorum size of the above cluster is 2.

Now, if there is a network failure between the two data centres, the nodes in each data centre can still form a quorum of 2 on their own. Hence, both groups start accepting write requests from clients. As a result, the data in the two data centres diverges, because the servers in one data centre can’t communicate their updates to the servers in the other. This is the well-known “split-brain” problem, where two or more subsets of the cluster function independently.

A quorum size of (n/2 + 1) rules out the split-brain problem: a quorum is then always a strict majority, and two disjoint groups of nodes cannot both contain more than half of the cluster. Hence, for the above cluster, the minimum number of nodes required to form a quorum is 5/2 + 1 = 3. In the above scenario, this means the 3 nodes in DC1 can still form a quorum, while the 2 nodes in DC2 cannot and therefore do not accept any write requests.
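
The scenario can be checked with a small Python sketch. This is only an illustration of the counting argument, not how Zookeeper itself detects partitions; the helper `partitions_with_quorum` and the list `dc_sizes` are hypothetical names of my own.

```python
def partitions_with_quorum(partition_sizes, quorum):
    """Return the indices of the partitions that have enough nodes to form a quorum."""
    return [i for i, size in enumerate(partition_sizes) if size >= quorum]


dc_sizes = [3, 2]  # index 0 is DC1 (3 nodes), index 1 is DC2 (2 nodes)

# With a quorum size of 2, both sides of the network split can accept writes.
too_small = partitions_with_quorum(dc_sizes, quorum=2)

# With a majority quorum of 5 // 2 + 1 = 3, at most one side can.
majority = partitions_with_quorum(dc_sizes, quorum=sum(dc_sizes) // 2 + 1)

print("quorum 2:", too_small)  # both DC1 and DC2 qualify -> split-brain
print("quorum 3:", majority)   # only DC1 qualifies
```

With the quorum set to 2, both data centres qualify; with the majority quorum of 3, only DC1 does.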

Now let’s answer our original question with an example: why is it good practice to create a ZK cluster with an odd number of nodes?

Let’s say we have a ZK cluster of 5 nodes. This means we need a minimum of 3 nodes (5/2 + 1):
(i) to form a quorum
OR
(ii) for Zookeeper to keep serving client requests.
Thus, a 5-node cluster can tolerate the failure of up to 2 (5 – 3) nodes.

Similarly, for a 6-node cluster, we need 4 nodes (6/2 + 1):
(i) to form a quorum
OR
(ii) for the Zookeeper cluster to keep running.
This in turn means that a 6-node cluster can also tolerate the failure of only up to 2 (6 – 4) nodes.

So, even with the extra node, the 6-node cluster has the same failure tolerance of 2 nodes as the 5-node cluster. The sixth node only adds the overhead of managing one more node. In conclusion, growing the cluster from an odd to an even number of nodes doesn’t give any advantage in fault tolerance.
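
The same comparison can be tabulated for other cluster sizes. Below is a minimal Python sketch, assuming the majority quorum of (n/2 + 1) used throughout this post; the function names are my own and purely illustrative.

```python
def quorum_size(n: int) -> int:
    """Majority quorum: floor(n / 2) + 1."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Nodes that can fail while the remaining nodes still form a quorum."""
    return n - quorum_size(n)


print("nodes  quorum  tolerated failures")
for n in range(1, 10):
    print(f"{n:>5}  {quorum_size(n):>6}  {tolerated_failures(n):>18}")
```

Each even-sized cluster tolerates exactly as many failures as the odd-sized cluster one node smaller: 3 and 4 nodes both tolerate 1 failure, 5 and 6 both tolerate 2, and 7 and 8 both tolerate 3.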

Bikas Katwal

Coder | Distributed Systems | Search | Software Engineer @Walmartlabs