Recent Changes

Here is a list of recent updates to the site. You can also get this information as an RSS feed, and I announce new articles on Twitter and Mastodon.

I use this page to list both new articles and additions to existing articles. Since I often publish articles in installments, many entries on this page will be new installments to recently published articles. Such announcements are indented and don’t show up in the recent changes sections of my home page.


Etsy's observability and ML infrastructure teams moving to the cloud

Tue 22 Nov 2022 09:46 EST

Tim Cochran and Keyur Govande continue their account of how Etsy used the cloud to scale up by describing the journey of two teams: observability and ML infrastructure.

more…


Using the cloud to scale Etsy

Thu 17 Nov 2022 09:01 EST

Etsy is a well-known marketplace for craft items. The pandemic led to a huge spike in growth, with the site going from 46 million buyers to 90 million buyers in two years. Etsy coped with this, with no bottlenecks in the business. One aspect of how they did this was a shift to Google Cloud. Tim Cochran and Keyur Govande begin this story by describing the strategic principles that guided this effort and the incremental federated approach that they took.

more…


Using CWs

Wed 16 Nov 2022 10:50 EST

One of the new features on Mastodon for a recovering twitterer is the CW field for new posts. CW stands for Content Warning. When I’m composing a post, if I press the CW button, I have the option of putting a short phrase into a dialog. Readers will initially only see that short phrase, and need to click a button to see more. But there are a lot of strong opinions on when and how to use it - how do I navigate that?

more…


Multiple Mastodon Accounts

Tue 08 Nov 2022 19:44 EST

The usual Twitter convention is to follow the whole person, meaning one Twitter account for a person would tweet on many different subjects. In the Fediverse, however, that's not encouraged, so we'll see many people having multiple Mastodon (and other) accounts.

more…


Your organization should run its own Mastodon server

Mon 07 Nov 2022 12:53 EST

The latest crisis at Twitter has led to a big surge of interest in Mastodon and the broader Fediverse of open social media platforms. My colleague Julien Deswaef has long been an advocate of the Fediverse. Here he explains why organizations should take control of their own social media platform by running their own Mastodon server.

more…


An appeal to Americans who aren't inclined to vote in the midterm elections

Wed 02 Nov 2022 14:10 EDT

In the United States, we have midterm elections coming up. Many people aren't interested in politics, or feel there is nobody worthwhile to vote for. If you're an American inclined to skip voting in these midterms, I'd appreciate it if you read my appeal.

more…


Twitter feed now cross-posts to Mastodon

Wed 02 Nov 2022 09:50 EDT

One of the main things I wanted to do with Mastodon was to replicate my Twitter feed there, so that folks who would rather follow me on Mastodon could get everything. To do this, I used moa.party. You have to give it credentials to access both your Twitter and Mastodon feeds, which is a little worrisome, but my Mastodon-aware colleagues have used it without problems. It allows cross-posting in either or both directions, but I've set it up to just go from Twitter to Mastodon. It's pretty simple and seems to be working. So if you'd like to follow my Twitter feed from Mastodon, you can now do so.

I'll be monitoring the follower count for the Mastodon account. If lots of people follow me on Mastodon, I'll probably do more with it. So following my Mastodon feed is a vote for me to put more effort into it. But for the moment, I expect it to be a simple copy of what I post on Twitter.


Exploring Mastodon

Tue 01 Nov 2022 15:00 EDT

I've been a heavy user of Twitter over the last decade, and while Musk's purchase of Twitter hasn't got me running for the exit, it has prompted me to take a look at possible alternatives should Twitter change into something no longer worthwhile for me. The obvious alternative is for me to explore the fediverse with a Mastodon account. As I explore using Mastodon, I'll make some notes here so that others can learn from my explorations.

more…


Bliki: ConwaysLaw

Thu 20 Oct 2022 10:02 EDT

Pretty much all the practitioners I favor in Software Architecture are deeply suspicious of any kind of general law in the field. Good software architecture is very context-specific, analyzing trade-offs that resolve differently across a wide range of environments. But if there is one thing they all agree on, it's the importance and power of Conway's Law. Important enough to affect every system I've come across, and powerful enough that you're doomed to defeat if you try to fight it.

The law is probably best stated, by its author, as: [1]

Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure.

-- Melvin Conway

Conway's Law is essentially the observation that the architectures of software systems look remarkably similar to the organization of the development teams that built them. It was originally described to me by saying that if a single team writes a compiler, it will be a one-pass compiler, but if the team is divided into two, then it will be a two-pass compiler. Although we usually discuss it with respect to software, the observation applies broadly to systems in general. [2]

As my colleague Chris Ford said to me: "Conway understood that software coupling is enabled and encouraged by human communication." If I can talk easily to the author of some code, then it is easier for me to build up a rich understanding of that code. This makes it easier for my code to interact with, and thus be coupled to, that code. Not just in terms of explicit function calls, but also in the implicit shared assumptions and way of thinking about the problem domain.

We often see how inattention to the law can twist system architectures. If an architecture is designed at odds with the development organization's structure, then tensions appear in the software structure. Module interactions that were designed to be straightforward become complicated, because the teams responsible for them don't work together well. Beneficial design alternatives aren't even considered because the necessary development groups aren't talking to each other.

A dozen or two people can have deep and informal communications, so Conway's Law indicates they will create a monolith. That's fine - so Conway's Law doesn't impact our thinking for smaller teams. It's when the humans need organizing that Conway's Law should affect decision making.

The first step in dealing with Conway's Law is to know not to fight it. I still remember one sharp technical leader, who was just made the architect of a large new project that consisted of six teams in different cities all over the world. “I made my first architectural decision,” he told me. “There are going to be six major subsystems. I have no idea what they are going to be, but there are going to be six of them.”

This example recognized the big impact location has on human communication. Putting teams on separate floors of the same building is enough to significantly reduce communication. Putting teams in separate cities, and time zones, further gets in the way of regular conversation. The architect recognized this, and realized that he needed to take this into account in his technical design from the beginning. Components developed in different time zones needed to have a well-defined and limited interaction because their creators would not be able to talk easily. [3]

A common mismatch with Conway's Law is where an ActivityOriented team organization works at cross-purposes to feature development. Teams organized by software layer (e.g. front-end, back-end, and database) lead to dominant PresentationDomainDataLayering structures, which is problematic because each feature needs close collaboration between the layers. Similarly, dividing people along the lines of life-cycle activity (analysis, design, coding, testing) means lots of hand-offs to get a feature from idea to production.

Accepting Conway's Law is superior to ignoring it, and in the last decade, we've seen a third way to respond to this law. Here we deliberately alter the development team's organization structure to encourage the desired software architecture, an approach referred to as the Inverse Conway Maneuver [4]. This approach is often talked about in the world of microservices, where advocates advise building small, long-lived BusinessCapabilityCentric teams that contain all the skills needed to deliver customer value. By organizing autonomous teams this way, we employ Conway's Law to encourage similarly autonomous services that can be enhanced and deployed independently of each other. This, indeed, is why I describe microservices as primarily a tool to structure a development organization.

Responses to Conway's Law

Ignore: Don't take Conway's Law into account, because you've never heard of it, or you don't think it applies (narrator: it does).
Accept: Recognize the impact of Conway's Law, and ensure your architecture doesn't clash with designers' communication patterns.
Inverse Conway Maneuver: Change the communication patterns of the designers to encourage the desired software architecture.

While the inverse Conway maneuver is a useful tool, it isn't all-powerful. If you have an existing system with a rigid architecture that you want to change, changing the development organization isn't going to be an instant fix. Instead it's more likely to result in a mismatch between developers and code that adds friction to further enhancement. With an existing system like this, the point of Conway's Law is that we need to take into account its presence while changing both organization and code base. And as usual, I'd recommend taking small steps while being vigilant for feedback.

Domain-Driven Design plays a role with Conway's Law to help define organization structures, since a key part of DDD is to identify BoundedContexts. A key characteristic of a Bounded Context is that it has its own UbiquitousLanguage, defined and understood by the group of people working in that context. Such contexts form ways to group people around a subject matter that can then align with the flow of value.

The key thing to remember about Conway's Law is that the modular decomposition of a system and the decomposition of the development organization must be done together. This isn't just at the beginning; evolution of the architecture and reorganizing the human organization must go hand-in-hand throughout the life of an enterprise.

Further Reading

Recognizing the importance of Conway's Law means that budding software architects need to think about IT organization design. Two worthwhile books on this topic are Agile IT Organization Design by Narayan and Team Topologies by Skelton and Pais.

Birgitta Böckeler, Mike Mason, James Lewis and I discuss our experiences with Conway's Law on the ThoughtWorks Technology Podcast.

Acknowledgements

Bill Codding, Birgitta Boeckeler, Camilla Crispim, Chris Ford, Gabriel Sadaka, Matteo Vaccari, Michael Chaffee, and Unmesh Joshi reviewed drafts of this article and suggested improvements.

Notes

1: The source for Conway's Law is an article written by Melvin Conway in 1968. It was published by Datamation, one of the most important journals for the software industry at that time. It was later dubbed “Conway’s Law” by Fred Brooks in his hugely influential book The Mythical Man-Month. I ran into it there at the beginning of my career in the 1980s, and it has been a thought-provoking companion ever since.

2: As Conway mentions, consider how the social problems around poverty, health care, housing, and education are influenced by the structures of government.

3: While location makes a big contribution to in-person communication patterns, one of the features of remote-first working is that it reduces the role of distance, as everyone is communicating online. Conway's Law still applies, but it's based on the online communication patterns. Time zones still have a big effect, even online.

4: The term “inverse Conway maneuver” was coined by Jonny LeRoy and Matt Simons in an article published in the December 2010 issue of the Cutter IT Journal.

Revisions

2022-10-24: I added the paragraph about the inverse Conway maneuver and rigid architectures. I also added the footnote about remote-first working.


Negotiate a balanced product investment mix

Wed 19 Oct 2022 08:02 EDT

Rick and Kennedy conclude their article on the bottleneck caused by tension between product and engineering. This final section addresses balancing between under- and over-engineering in the product's technical infrastructure.

more…


Creating multidisciplinary stream-aligned teams to escape the product-vs-engineering bottleneck

Tue 18 Oct 2022 09:50 EDT

Rick and Kennedy continue explaining how to deal with the lack of collaboration between product and engineering. This installment advises creating multidisciplinary stream-aligned teams and establishing team working agreements.

more…


Getting out of the product-v-engineering bottleneck by identifying your "first team"

Wed 12 Oct 2022 10:38 EDT

Rick and Kennedy start their discussion of how to break through the product-v-engineering bottleneck by getting people to identify and focus on their "first team", and to develop a shared understanding of how a business creates value.

more…


Bottleneck #03: Product v Engineering

Mon 10 Oct 2022 11:16 EDT

In the third article on the Bottlenecks of Scaleups, Rick Kick and Kennedy Collins talk about the bottleneck that occurs when friction develops between product and engineering. In this first installment they discuss the signs that show this friction is occurring: with finger pointing and engineering lacking a sense of product context, as the teams communicate but don't collaborate.

more…


Request Waiting List

Fri 09 Sep 2022 12:39 EDT

Nodes often have to contact several other nodes to form a quorum to handle a client request. Unmesh Joshi explains how a waiting list keeps track of the outstanding requests and sorts out what to do when it receives enough responses.
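
As a rough illustration of the idea (my own minimal sketch, not code from Unmesh's article), a waiting list can be modeled as a map from a correlation id to a callback, with a shared callback that fires once a quorum of responses has arrived. The names RequestWaitingList and QuorumCallback are hypothetical, and timeouts for requests that never get a reply are left out.

```python
class QuorumCallback:
    """Collects responses and fires once when enough of them have arrived."""

    def __init__(self, quorum, on_quorum):
        self.quorum = quorum
        self.on_quorum = on_quorum
        self.responses = []
        self.done = False

    def on_response(self, response):
        self.responses.append(response)
        # Fire exactly once, as soon as the quorum is reached.
        if not self.done and len(self.responses) >= self.quorum:
            self.done = True
            self.on_quorum(list(self.responses))


class RequestWaitingList:
    """Tracks outstanding requests by correlation id (hypothetical sketch)."""

    def __init__(self):
        self._pending = {}

    def add(self, correlation_id, callback):
        self._pending[correlation_id] = callback

    def handle_response(self, correlation_id, response):
        # Unknown or already handled ids are silently ignored.
        callback = self._pending.pop(correlation_id, None)
        if callback is not None:
            callback.on_response(response)


# Usage: one client request fans out to three replicas with ids 1, 2, 3.
results = []
waiting_list = RequestWaitingList()
callback = QuorumCallback(quorum=2, on_quorum=results.append)
for correlation_id in (1, 2, 3):
    waiting_list.add(correlation_id, callback)
```

Here the client request completes after any two of the three replicas respond; the third response is still accepted but no longer triggers anything.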

more…


Request Batch

Tue 06 Sep 2022 11:21 EDT

When distributing data leads to lots of small messages around a cluster, then network latency and the request processing time (including serialization and deserialization of the request on the server side) can add significant overhead. Unmesh Joshi shows how requests can be combined into a Request Batch to improve throughput.
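
To make the idea concrete, here is a minimal Python sketch of my own (not taken from the article) that queues small requests and sends them together once a size limit would be exceeded. The RequestBatcher name, the send function, and the byte limit are all assumptions for illustration; a real implementation would typically also flush on a timer so that a quiet period doesn't leave requests waiting.

```python
from typing import Callable, List


class RequestBatcher:
    """Combines small requests into batches (hypothetical sketch)."""

    def __init__(self, send: Callable[[List[bytes]], None], max_batch_bytes: int):
        self._send = send              # assumed function that ships one batch
        self._max = max_batch_bytes    # assumed size limit per batch
        self._pending: List[bytes] = []
        self._pending_bytes = 0

    def add(self, request: bytes) -> None:
        # Flush first if this request would push the batch over the limit.
        if self._pending and self._pending_bytes + len(request) > self._max:
            self.flush()
        self._pending.append(request)
        self._pending_bytes += len(request)

    def flush(self) -> None:
        if self._pending:
            self._send(self._pending)
            self._pending = []
            self._pending_bytes = 0


# Usage: three 4-byte requests with a 10-byte limit go out as two batches.
sent = []
batcher = RequestBatcher(send=sent.append, max_batch_bytes=10)
for payload in (b"aaaa", b"bbbb", b"cccc"):
    batcher.add(payload)
batcher.flush()
```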

more…


Key-Range Partitions

Thu 25 Aug 2022 09:15 EDT

A Fixed Partition provides a good way to distribute data over many nodes when clients are accessing a single key at a time. If, however, a client wants a range of values, such as all names from "a" to "f", then they'll need to access every node. Unmesh Joshi explains how a Key-Range Partition provides a better alternative for this kind of data access.
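
A small sketch of my own shows why sorted key ranges help with this kind of query. It assumes a hypothetical cluster of four nodes with made-up split points; because the ranges are ordered, a range query only needs the nodes whose ranges overlap it.

```python
from bisect import bisect_right

# Hypothetical split points: keys below "g" live on node-1,
# keys from "g" up to (not including) "n" on node-2, and so on.
SPLIT_POINTS = ["g", "n", "t"]
NODES = ["node-1", "node-2", "node-3", "node-4"]


def node_for(key: str) -> str:
    """Find the single node whose key range contains the key."""
    return NODES[bisect_right(SPLIT_POINTS, key)]


def nodes_for_range(start: str, end: str) -> list:
    """Find the nodes holding any key in the inclusive range [start, end]."""
    first = bisect_right(SPLIT_POINTS, start)
    last = bisect_right(SPLIT_POINTS, end)
    return NODES[first:last + 1]
```

With these assumed split points, a query for names from "a" to "f" touches only node-1, rather than every node in the cluster.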

more…


Fixed Partitions

Tue 23 Aug 2022 11:55 EDT

When partitioning data across a set of cluster nodes we need a uniform distribution and to be able to add and remove nodes to the cluster without causing a lot of data to be moved around. Unmesh Joshi explains how to do this by allocating data to a large number of virtual fixed partitions which are then allocated to the nodes.
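
The two-level mapping can be sketched in a few lines of Python. This is my own simplified illustration, not the article's code: the partition count of 1024, the use of MD5 as a stable hash, and the function names are all assumptions. The point is that keys map to partitions once and for all, so adding a node only moves whole partitions.

```python
import hashlib

NUM_PARTITIONS = 1024  # assumed value; fixed up front and never changed


def partition_for(key: str) -> int:
    """Map a key to a fixed partition using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def assign_partitions(nodes):
    """Spread the fixed partitions round-robin over the nodes."""
    return {p: nodes[p % len(nodes)] for p in range(NUM_PARTITIONS)}


def add_node(assignment, new_node):
    """Move only enough partitions to give the new node a fair share."""
    owned = {}
    for partition, node in assignment.items():
        owned.setdefault(node, []).append(partition)
    target = NUM_PARTITIONS // (len(owned) + 1)
    updated = dict(assignment)
    for _ in range(target):
        # Take one partition from whichever node currently holds the most.
        donor = max(owned, key=lambda node: len(owned[node]))
        updated[owned[donor].pop()] = new_node
    return updated
```

In this sketch, going from three nodes to four reassigns a quarter of the partitions, and the partition for any given key never changes.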

more…


Emergent Leader

Thu 18 Aug 2022 10:56 EDT

Peer-to-peer systems treat each cluster node as equal; there is no strict leader. This means there is no explicit leader election process as happens in the Leader and Followers pattern. However, there still needs to be one cluster node acting as cluster coordinator for tasks such as assigning data partitions to other cluster nodes, tracking when new cluster nodes join or fail, and taking corrective actions. Unmesh Joshi explains how this is resolved with an Emergent Leader.

more…


Clock-Bound Wait

Wed 17 Aug 2022 10:22 EDT

Although he's been quiet for a while on here, Unmesh Joshi has been working hard on more of his Patterns of Distributed Systems. In this first of a new batch, he looks at the difficulty of getting consistent reads from servers in the presence of the inevitable drifts between system clocks. A Clock-Bound Wait adds a small wait time on a request for a value at a recent time. This way the server can be sure it's providing the correct value should it have changed within the window of the clock lag.

more…


Advocate, educator, and authorial stance

Tue 19 Jul 2022 13:20 EDT

When I'm writing, or mentoring others in writing, about a particular technique I prefer to take the role of an educator rather than that of an advocate. When doing that, I see two main stances an author can take. One is to focus on the trade-offs between this technique and its alternatives; the other is to focus on the merits of the particular technique and not discuss the alternatives.

more…


Legacy Displacement: Revert to Source

Thu 07 Jul 2022 10:07 EDT

Legacy systems often act as integration hubs, ingesting source data to pass on to downstream systems. A new downstream system can decouple itself from legacy by finding the source of data to the legacy and integrating directly to that instead. Ian Cartwright, Rob Horn, and James Lewis describe this Revert to Source pattern, explaining that this part of legacy displacement often also allows a new system to take advantage of upgrades to source data that the legacy had neglected.

more…


Product Backlog Building Canvas

Tue 14 Jun 2022 10:15 EDT

Many software teams describe desired product capabilities as a product backlog: a list of user stories. These stories capture who needs the work, what the work is, and why it's needed. Too often teams expect a product owner to be the sole source of the backlog, but anyone could (and should) write user stories. Paulo Caroli teaches teams to use a Product Backlog Building Canvas, which provides a simple process to develop user stories, starting with describing personas for product users and the activities they do. These activities yield features: their interactions with the product. Features are broken down into backlog items, which can then be formulated into user stories from the background of personas and activities.

more…


Agile Book Club interview on Refactoring

Thu 28 Apr 2022 11:33 EDT

James Shore's Art of Agile Development is my favorite single-volume book on agile software development. A reason for that is its serious emphasis on the technical practices that are essential to making it work effectively. James and I discuss the role of refactoring for software development, the nature of design changes we see, and how to break down big changes into small pieces.

more…


How I use Twitter

Tue 26 Apr 2022 08:50 EDT

A couple of recent conversations about Twitter were nudging me into writing about how I use Twitter even before The Muskover developed. Twitter has become an important part of my online life, and my online life is a big part of what I do. But like any tool, Twitter can be used in many different ways, and how you use it affects how useful it can be.

more…


photostream 128

Tue 19 Apr 2022 17:58 EDT

Heian-jingu Shrine

Kyoto, Japan (2004)


Transitional Architecture

Mon 28 Mar 2022 14:49 EDT

The core to a successful legacy displacement is the gradual replacement of legacy with new software, as this allows benefits to be delivered early and circumvents the risks of a Big Bang. During displacement the legacy and new system will have to operate simultaneously, allowing behavior to be split between old and new.

Ian Cartwright, Rob Horn, and James Lewis explain how to build and evolve a Transitional Architecture that supports this collaboration as it changes over time. For this to work, intermediate configurations may require integrations that have no place in the target architecture of the new system.

Or to put this more directly - you will have to invest in work that will be thrown away.

more…


Investing in the hiring process

Wed 16 Mar 2022 10:06 EDT

Tim Cochran and Roni Smith complete their article by looking at how scaleups need to invest in the hiring process to overcome the talent bottleneck. They add a case study from our experiences with talent acquisition at Thoughtworks.

more…


How to get out of the talent bottleneck

Tue 15 Mar 2022 10:59 EDT

Tim Cochran and Roni Smith explore how scaleups can get out of the hiring bottleneck by using technology and innovation as a hiring differentiator, hiring T-shaped and non-senior developers, and embracing remote working.

more…


How scaleups get constrained by talent

Thu 10 Mar 2022 10:31 EST

The second bottleneck in the series looks at talent, and how scaleups struggle to hire enough good people. Tim Cochran and Roni Smith explain how the small network and informal processes that allow early stage startups to grow begin to fail during the scaleup phase, and what signs indicate a new approach is needed.

more…


How to get out of the tech debt bottleneck

Wed 09 Mar 2022 11:03 EST

Tim Cochran and Carl Nygard finish their examination of the tech debt bottleneck by looking at how to get out of it. This includes close collaboration between product and engineering, a strategy for the four phases of a startup's journey, and empowering teams to fix the tech debt problems.

more…