Patterns of Distributed Systems

by Unmesh Joshi


On November 1, 2019, my inbox contained a message from Unmesh Joshi, one of our senior developers in India. He had observed developers struggling with the core distributed systems concepts they needed to understand in order to work effectively with modern tools like Kafka, Cassandra, and Zookeeper. He had tried teaching the theory behind key concepts in distributed systems, but found that his colleagues struggled to fully grasp the consequences. So he tried a more code-centric approach: he explored the code driving these core open-source systems and built simplified implementations designed to highlight and teach the theoretical concepts. This was more successful, and his email was about how to take this training further.

We decided that developing a series of patterns would be a good direction and set out on what turned out to be a four-year journey. More than most aspects of software development, distributed system design often requires the kind of mathematical analysis provided by tools such as formal methods. But as challenging as it is to understand how the theory works, there's still a considerable jump between what appears in a paper and what can be implemented in a practical system. By studying the code of systems that power our online services every day (which often required learning new languages and frameworks), Unmesh was able to formulate the common solutions embedded in this code into more general patterns. Building skeletal implementations of these patterns ensured he properly understood the oft-subtle behaviors and trade-offs.

To communicate what he'd learned, he then drafted patterns, sent them to me and other interested Thoughtworkers, reflected on reviews, and published developed drafts here for wider consumption. As the pattern collection took shape, he contacted Pearson to turn this into a book, which I am proud to add to my signature series.

The final book contains thirty patterns, each illustrated with explanatory text, many with sequence diagrams to explain the complex interactions, and all with code samples that clarify the all-important details. Understanding these patterns provides a solid foundation for understanding how distributed systems work. In particular, they illuminate the most gnarly problem that these systems face: how to ensure that data can be distributed in order to increase availability and resilience, without running into paradoxes when multiple writers try to update at the same time.

Further Reading

Catalog of Patterns

A list of all of the patterns in the book, together with deep links into the relevant chapter of the online ebook.