Clock-Bound Wait

Wait to cover the uncertainty in time across cluster nodes before reading and writing values so that values can be correctly ordered across cluster nodes.

Problem

Consider a key-value store where values are stored with a timestamp to designate each version. Any cluster node that handles a client request will be able to read the latest version using the current timestamp at the request processing node.

In the diagram below, the value Before Dawn is updated to the value After Dawn at time 2 as per Green's clock. Both Alice and Bob are trying to read the latest value for title. Alice's request is processed by cluster node Amber, while Bob's request is processed by cluster node Blue. Amber's clock is lagging at 1, so when Alice reads the latest value, Amber returns Before Dawn. Blue's clock is at 2, so when Bob reads the latest value, Blue returns After Dawn.

This violates what is known as external consistency. If Alice and Bob now make a phone call, Alice will be confused: Bob will tell her that the latest value is After Dawn, while her cluster node is showing Before Dawn.

The same is true if Green's clock is fast and the writes happen in the future as per Amber's clock.
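The scenario above can be sketched in a few lines of Python. This is only an illustration, not the implementation of any particular store: the VersionedStore class and its put and get methods are hypothetical names, the clocks are plain integers, and the write time of 1 for Before Dawn is an assumption (the text only says that After Dawn is written at time 2).

```python
class VersionedStore:
    """Hypothetical store that keeps every version of a key with its timestamp."""

    def __init__(self):
        self._versions = {}  # key -> list of (timestamp, value)

    def put(self, key, timestamp, value):
        self._versions.setdefault(key, []).append((timestamp, value))

    def get(self, key, read_timestamp):
        # Return the newest version whose timestamp is not after read_timestamp.
        visible = [(ts, v) for ts, v in self._versions.get(key, [])
                   if ts <= read_timestamp]
        if not visible:
            return None
        return max(visible, key=lambda version: version[0])[1]


store = VersionedStore()
store.put("title", 1, "Before Dawn")  # assumed earlier write
store.put("title", 2, "After Dawn")   # written at time 2 as per Green's clock

amber_clock = 1  # Amber is lagging
blue_clock = 2

alice_sees = store.get("title", amber_clock)  # request handled by Amber
bob_sees = store.get("title", blue_clock)     # request handled by Blue
print(alice_sees, "|", bob_sees)  # prints "Before Dawn | After Dawn"
```

Both reads ask for the latest value at the same real-world moment, yet they disagree, because each node uses its own clock as the read timestamp.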

This is a problem if the system's timestamp is used as a version for storing values, because system timestamps are not monotonic. Clock values from two different servers cannot and should not be compared. When Hybrid Clock is used as a version in Versioned Value, it allows values to be ordered on a single server as well as across different servers when the values are causally related. However, hybrid clocks (or any other kind of Lamport Clock) can only give a 'partial order'. This means that values which are not causally related and are stored by two different clients across different nodes cannot be ordered. This creates a problem when using a timestamp to read values across cluster nodes. If the read request originates on a cluster node with a lagging clock, it probably won't be able to read the most up-to-date versions of the given values.

Solution

While reading or writing, cluster nodes wait until the clock values on every node in the cluster are guaranteed to be above the timestamp assigned to the value.

If the difference between clocks is very small, write requests can wait without adding a great deal of overhead. As an example, assume the maximum clock offset across cluster nodes is 10 ms. (This means that, at any given point in time, the slowest clock in the cluster is lagging behind the fastest one by at most 10 ms.) To guarantee that every other cluster node has its clock past time t, the cluster node that handles any write operation will have to wait until t + 10 ms before storing the value.
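The write-side wait in this example might look like the following minimal sketch. It assumes the 10 ms maximum offset from the text is known and fixed; write_with_wait, system_clock_ms and FakeClock are hypothetical names introduced only for illustration, and the store is a plain dictionary. The clock and sleep functions are passed in so the demo can run deterministically with a fake clock instead of really sleeping.

```python
import time

MAX_CLOCK_OFFSET_MS = 10  # assumed upper bound on clock offset across the cluster


def system_clock_ms():
    """Wall-clock time in milliseconds."""
    return time.time_ns() // 1_000_000


def write_with_wait(store, key, value, clock_ms=system_clock_ms,
                    sleep=time.sleep, max_offset_ms=MAX_CLOCK_OFFSET_MS):
    """Pick timestamp t, wait until the local clock reaches t + max offset, then store."""
    write_timestamp = clock_ms()
    wait_until = write_timestamp + max_offset_ms
    # Loop because a real sleep() may return slightly early.
    while True:
        remaining_ms = wait_until - clock_ms()
        if remaining_ms <= 0:
            break
        sleep(remaining_ms / 1000)
    store.setdefault(key, []).append((write_timestamp, value))
    return write_timestamp


class FakeClock:
    """Deterministic clock for the demo: sleeping simply advances the time."""

    def __init__(self, start_ms):
        self.now_ms = start_ms

    def read(self):
        return self.now_ms

    def sleep(self, seconds):
        self.now_ms += round(seconds * 1000)


green = FakeClock(start_ms=2)
store = {}
t = write_with_wait(store, "title", "After Dawn",
                    clock_ms=green.read, sleep=green.sleep)
print(t, green.now_ms, store)
```

With the fake clock starting at 2, the value is stored with timestamp 2, but only once the writing node's own clock has reached 12. Under the stated 10 ms bound, no other node's clock can still be behind 2 at that point.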

For more details, go to Chapter 24 of the online ebook at oreilly.com.

This pattern is part of Patterns of Distributed Systems

23 November 2023