HeartBeat
Show a server is available by periodically sending a message to all the other servers.
Problem
When multiple servers form a cluster, each server is responsible for storing some portion of the data, based on the partitioning and replication schemes used. Timely detection of server failures is important so that corrective action can be taken, such as making another server responsible for handling requests for the data held on the failed server.
Solution
Periodically send a request to all the other servers indicating liveness of the sending server. Select the request interval to be more than the network round trip time between the servers. All the listening servers wait for the timeout interval, which is a multiple of the request interval. In general,
For more details, go to Chapter 07 of the online ebook at oreilly.com.
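The relationship between the intervals described in the solution can be illustrated with a minimal Python sketch. It is not the implementation from the book: the names `timeout_interval`, `HeartbeatTracker`, `send_heartbeats` and `ManualClock`, the default multiple of 5, and the in-process "network" (a direct method call instead of a real message) are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Iterable, List


def timeout_interval(round_trip: float, request_interval: float,
                     multiple: int = 5) -> float:
    """Derive the listener timeout as a multiple of the request interval.

    The default multiple of 5 is an illustrative assumption, not a
    recommendation from the source.
    """
    if request_interval <= round_trip:
        raise ValueError("request interval must exceed the network round trip time")
    if multiple < 1:
        raise ValueError("multiple must be at least 1")
    return multiple * request_interval


class ManualClock:
    """Hypothetical clock that only moves when told to; handy for demos and tests."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


class HeartbeatTracker:
    """Listener side: records when each server was last heard from."""

    def __init__(self, timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout = timeout
        self.clock = clock
        self.last_heard: Dict[str, float] = {}

    def on_heartbeat(self, server_id: str) -> None:
        # Called whenever a heartbeat from server_id arrives.
        self.last_heard[server_id] = self.clock()

    def failed_servers(self) -> List[str]:
        # A server counts as failed once no heartbeat has arrived within the
        # timeout. Servers that were never heard from are not tracked here.
        now = self.clock()
        return sorted(server for server, heard_at in self.last_heard.items()
                      if now - heard_at > self.timeout)


def send_heartbeats(sender_id: str, peers: Iterable[HeartbeatTracker]) -> None:
    # Sender side: one round of heartbeats to all the other servers.
    # A real sender would repeat this every request interval over the network.
    for peer in peers:
        peer.on_heartbeat(sender_id)
```

In this sketch, a listener built with `HeartbeatTracker(timeout_interval(0.05, 0.25, 4), clock)` tolerates up to one second of silence from a peer before `failed_servers()` reports it. Injecting the clock keeps the failure check deterministic, which is why `ManualClock` is used here rather than sleeping on a real timer.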
This pattern is part of Patterns of Distributed Systems
23 November 2023