Recent Changes

Here is a list of recent updates to the site. You can also get this information as an RSS feed, and I announce new articles on Twitter and Mastodon.

I use this page to list both new articles and additions to existing articles. Since I often publish articles in installments, many entries on this page will be new installments to recently published articles. Such announcements are indented and don’t show up in the recent changes sections of my home page.


Bliki: DiffDebugging

Mon 04 Dec 2023 00:00 EST

Regression bugs are newly appeared bugs in features of the software that have been around for a while. When hunting them, it is usually valuable to figure out which change in the software caused them to appear. Looking at that change can give invaluable clues about where the bug is and how to squash it. There isn't a well-known term for this form of investigation, but I call it Diff Debugging.

Diff debugging only works if we have our code in version control, but fortunately these days that's the norm. But there are some more things that are needed to make it work effectively. We need Reproducible Builds, so that we can run old versions of the software easily. It helps greatly to have small commits, due to high-frequency integration. That way when we find the guilty commit, we can more easily narrow down what happened.

To find the commit that bred the bug, we begin by finding any past version without the bug. Mark this as a last-good version and the current version as the earliest-bad. Then find the commit half-way between the two and see if the bug is there. If so then this commit becomes the earliest-bad, otherwise it becomes the last-good. Repeat the process (which is a “half-interval” or “binary” search) until we've got the guilty commit.
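The search described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a replacement for a real tool: the function name `find_guilty_commit` and the `has_bug` callback are hypothetical, and it assumes the commits form a simple linear history, ordered oldest to newest, in which the first commit is known to be good and the last is known to be bad.

```python
from typing import Callable, Sequence


def find_guilty_commit(commits: Sequence[str],
                       has_bug: Callable[[str], bool]) -> str:
    """Return the first commit that shows the bug.

    Assumes commits[0] is a known last-good version and commits[-1]
    is the known earliest-bad version (both are assumptions of this sketch).
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")

    last_good = 0
    earliest_bad = len(commits) - 1

    # Halve the interval until the good and bad commits are adjacent.
    while earliest_bad - last_good > 1:
        middle = (last_good + earliest_bad) // 2
        if has_bug(commits[middle]):
            earliest_bad = middle   # bug already present: look earlier
        else:
            last_good = middle      # still fine: look later

    return commits[earliest_bad]
```

Because the interval halves on every step, a history of a thousand commits needs only about ten checks, which is why the approach stays practical even when each check involves building and running an old version.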

If we use git, then the git bisect command will automate much of this for us. If we can write a test that will show the presence of the bug, then git bisect can use that too, automating the whole process of finding the guilty commit.
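As a rough sketch of what such an automated check might look like: `git bisect run` judges each commit by the exit code of the command it is given, where 0 means good, 125 means the commit cannot be tested and should be skipped, and any other code from 1 to 127 means bad. The helper names below (`bisect_exit_code`, `check_commit`) and the idea of separate build and test commands are assumptions made for illustration.

```python
import subprocess

# Exit codes that `git bisect run` interprets.
GOOD, BAD, SKIP = 0, 1, 125


def bisect_exit_code(built: bool, bug_present: bool) -> int:
    """Translate the outcome of checking one commit into an exit code."""
    if not built:
        return SKIP  # this commit can't be judged, let bisect pick another
    return BAD if bug_present else GOOD


def check_commit(build_cmd: list[str], test_cmd: list[str]) -> int:
    """Build and test the currently checked-out commit.

    Both commands are hypothetical; the test command is assumed to exit
    non-zero when the bug is present.
    """
    if subprocess.run(build_cmd).returncode != 0:
        return bisect_exit_code(built=False, bug_present=False)
    bug_present = subprocess.run(test_cmd).returncode != 0
    return bisect_exit_code(built=True, bug_present=bug_present)
```

A wrapper script that ends with `sys.exit(check_commit(...))` could then be handed to git: after `git bisect start`, `git bisect bad`, and `git bisect good <last-good-commit>`, running `git bisect run python check_bug.py` (the script name is a placeholder) lets git walk the history on its own.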

I often find diff debugging to be useful within a programming session. If I have slow tests that take a few minutes to run, I might program for half-an-hour running only a subset of the most relevant tests. As long as I commit after every green test run, I can use diff debugging should one of those slower tests fail. Such is the value of committing extremely frequently, even if the commits are so small that I feel it's best to squash them for the long-term history. Some IDEs make this easier by keeping a local history automatically that is finer-grained than the commits to version control.

Revisions

I originally posted this page on 2004-06-01. In its original form it was more of a casual experience report. I rewrote it on 2023-12-04 to make it more like a definition of the term. Diff debugging isn't a term that's caught on much in the industry, but I haven't seen another term generally used to describe it.


How to tackle unreliability of coding assistants

Tue 28 Nov 2023 10:21 EST

Over the last year, lots of developers have incorporated LLM coding assistants into their work, finding them a useful tool. But one of the problems of these tools is that they are unreliable, often coming up with poor or outright wrong-headed suggestions. Birgitta Böckeler continues her exploration of GenAI for developers by passing on what she's learned about how to think about this unreliability, and why it may be good to call your LLM tool “Dusty”.

more…


Patterns of Distributed Systems is published by Pearson

Fri 24 Nov 2023 14:11 EST

During the last four years, my colleague Unmesh Joshi has been developing a collection of patterns to help us all better understand how modern distributed systems work. We've been publishing drafts of these patterns on this site. Now these have turned into a book, published by Addison-Wesley in my signature series. As such, we've now removed the work-in-progress drafts from this site, and have replaced them with a catalog of pattern summaries. For those with a subscription to oreilly.com, we have deep links from the summaries to the relevant chapter of the online book.

more…


Three reasons a liberal arts degree helped me succeed in tech

Thu 09 Nov 2023 10:12 EST

My colleague Sannie Lee has met many students who are looking into getting into technology, taking narrow professionally-oriented majors. Sannie, however, has found that a traditional liberal-arts degree has given her skills that are highly relevant to her work as a product manager.

more…


Enhancing the Headless Component

Tue 07 Nov 2023 10:00 EST

In the second (and final) part of his explanation of React Headless Components, Juntao Qiu explores how a headless component allows us to create a visually different component that provides the same base behavior, and how it encourages better factoring as we extend base behavior further.

more…


Current thoughts on social media

Thu 02 Nov 2023 12:33 EDT

It's now been a year since The Muskover. What does my use of social media look like now, both as a reader and a writer?

more…


Headless Component: a pattern for composing React UIs

Wed 01 Nov 2023 10:34 EDT

As React UI controls become more sophisticated, complex logic can get intertwined with the visual representation. This makes it hard to reason about the behavior of the component, hard to test it, and necessary to build similar components that need a different look. Juntao Qiu tackles this by using a Headless Component, which extracts all non-visual logic and state management, separating the brain of a component from its looks.

more…


How is GenAI different from other code generators?

Tue 19 Sep 2023 08:57

How is code generation with GenAI different from more "traditional" code generators? The newest memo in Birgitta Böckeler's explorations of GenAI talks about abstraction levels in software engineering, and on which levels GenAI sits in the translation of our thoughts into zeros and ones.

more…


Technology Strategy for Emerging Technologies and Markets

Thu 24 Aug 2023 09:44 EDT

Sarah Taraporewalla completes her study of building a technology strategy that's integrated with strategic business interests. This final strategic direction considers the ever-changing future, suggesting lines of inquiry to consider the impact of new technologies, market trends, and broader social-political changes.

more…


Demo Front-End: A front-end application to test and explore an API

Wed 23 Aug 2023 10:37 EDT

Many software teams create services exposed as APIs, designed to be consumed by other software and thus without any user-interface. Such services are hard to demonstrate, as they effectively just dump pages of JSON. A demo front-end is a simple user-interface just used to manipulate such an API. Matteo Vaccari describes how and why to build one - showing its usefulness both in explaining the API's capabilities to stakeholders and to help client developers explore how to interact with the API.

more…


Strategic Directions supporting the people

Tue 22 Aug 2023 11:22 EDT

Having a robust digital talent strategy is a competitive advantage in today’s fiercely competitive market. It enables businesses to have the right talent and competencies to meet current and future demand, achieve business goals, and stay on track for digital transformation aspirations. Sarah Taraporewalla continues her article on how to create an integrated business and technology strategy by looking at questions raised by two strategic directions that support people: culture and internal systems.

more…


Bottlenecks of Scaleups #05: Resilience and Observability

Tue 22 Aug 2023 10:21 EDT

Here is a new article in the bottlenecks of scaleups series, looking at resilience and observability. Startups tend to only address resilience when their systems are already down, often taking a very reactive approach. For a scaleup, excessive system downtime represents a significant bottleneck to the organization, both from the effort expended on restoring function and also from the impact of customer dissatisfaction. Punit Lad and Carl Nygard explain that to move past this, resilience needs to be built into the business objectives, which will influence the architecture, design, product management, and even governance of business systems.

more…


TDD with GitHub Copilot

Thu 17 Aug 2023 14:44 EDT

At Thoughtworks, we are strong practitioners of Test Driven Development (TDD). Naturally this leads to the question of how generative AI can help with this technique. Paul Sobocinski writes a brief memo explaining how some of our teams have used TDD with GitHub Copilot. As ever, co-pilot can't be relied on to fly the plane, but can suggest some useful ideas for the red and green steps. It isn't very helpful for the all-important refactoring step.

more…


Final parts of the cost bottleneck of scaleups

Thu 17 Aug 2023 00:00

Sofia Tania and Stefania Stefansdottir complete their examination of how to overcome the bottleneck that costs impose on a scaleup. In this final installment, they look at how to review and govern the technology portfolio, optimize rates, and provide a general view of cost efficiency initiatives during the life-cycle of a scaleup.

more…


Strategic directions: minimizing risk and being data driven

Wed 16 Aug 2023 15:49 EDT

Time for two more strategic directions from Sarah Taraporewalla. These look at the questions that need to be investigated when reducing cost, minimizing operation risk, and enabling data-driven decision making.

more…


Bottlenecks of Scaleups Webinar: Sep 7 2023

Tue 15 Aug 2023 14:35 EDT

Join Tim Cochran and Rickey Zachary on Thursday, September 7th 1-2pm EST for a webinar on Bottlenecks of Scaleups: Technology mistakes every growing startup makes. They researched Thoughtworks' portfolio to analyze why companies struggle to scale, spanning across Technology, People, and Product. The webinar covers areas such as experimentation culture, overcomplicated architectures, onboarding, developer productivity, product and engineering collaboration, and cost efficiency.

more…


Strategic directions to build a strong foundation

Tue 15 Aug 2023 10:16 EDT

Any business that wants to grow needs to be built on strong and stable foundations. While these strategic directions are often familiar to technical folks it's important that the improvements to engineering align with the themes that resonate with the rest of the organization. Sarah Taraporewalla illustrates how this appears with two strategic directions that consider improvements in efficiency and quality.

more…


Actions to sustain cost control for scaleups

Tue 15 Aug 2023 10:15 EDT

When scaling up, getting costs under control is vital to stabilizing financial health. But as a weight-loss expert might say, the key to long-term health is to do things that promote a sustainable lifestyle. Sofia Tania and Stefania Stefansdottir now move on to begin to describe these initiatives: federated accountability, visibility, and nudges towards better financial discipline.

more…


We will miss Stefan Tilkov

Mon 14 Aug 2023 14:58 EDT

We are saddened today to learn of the passing of Stefan Tilkov. We’ve met Stefan several times at workshops and conferences and always enjoyed his good company and insightful views. He advocated many of the things that we and our colleagues support - and as well as skillfully explaining these ideas, he also backed them up with concrete experience from his work. We shall really miss his wise contributions online, and regret that we won’t get the chance to chat with him again. We offer our condolences to his family, friends, and colleagues at INNOQ.


First stage of reducing costs for a scaleup

Thu 10 Aug 2023 11:17 EDT

When scaleups need to start working on cost efficiency, our experience is that they need to form a cost optimization team to work on the immediate steps that are needed. In this installment Sofia and Stefania describe how to understand the primary cost drivers, together with the levers to get them under control.

more…


Tech strategy for new customer segments and inorganic growth

Thu 10 Aug 2023 10:12 EDT

Sarah Taraporewalla moves on to the remaining two strategic directions that are part of growing the business. Expanding into new customer segments can introduce new operational processes or channels. Inorganic growth (eg acquisitions and mergers) needs an understanding of the drivers of increased value and the long term expectation (eg merge or keep separate and divest) for the business unit.

more…


Coding assistants do not replace pair programming

Thu 10 Aug 2023 09:31 EDT

In her 5th memo about exploring GenAI for software development, Birgitta Böckeler answers the frequently asked question of whether coding assistants are making the practice of Pair Programming obsolete. Spoiler alert: They don't.

more…


Creating an integrated business and technology strategy

Tue 08 Aug 2023 16:05

My colleague Sarah Taraporewalla describes an approach to building technology strategy that challenges the convention. It starts by identifying the overall strategic directions that the organization is considering, and using common aspects of these directions to indicate the investigations needed for the organization to understand the technology implications raised by that strategic change. This first installment looks at two of these directions: expanding the business by creating complementary products, and expanding the business into new markets and regions.

more…


Exploring Gen AI: how can in-line assistance get in the way?

Thu 03 Aug 2023 11:37 EDT

While coding assistants like Copilot can improve speed and flow, they can also disrupt it. Birgitta Böckeler looks at two ways in which they can get in the way: amplifying bad or outdated practices, and review fatigue.

more…


A case study of getting out of the costs bottleneck

Tue 01 Aug 2023 10:19 EDT

Sofia and Stefania begin their examination of how to get out of the bottleneck of surging cloud costs by relating a case study from a recent client - illustrating how the cost control can be separated into reduce and sustain phases.

more…


Exploring Gen AI: When is in-line assistance useful?

Tue 01 Aug 2023 09:51 EDT

The most widely used form of coding assistance in Thoughtworks at the moment is in-line code generation in the IDE, where an IDE extension generates suggestions for the developer as they are typing. Birgitta Böckeler looks at the factors that impact the usefulness of these suggestions, indicating where they lead to safe waters, and the rocks that we need to look out for.

more…


Bottlenecks of Scaleups #04: Cost Efficiency

Mon 31 Jul 2023 09:30 EDT

As startups begin to grow rapidly, they often find that early decisions that helped them find a product/market fit lead to excessive costs once traffic increases. These costs can threaten a scaleup's ability to grow. Sofia Tania and Stefania Stefansdottir have worked with many scaleups in this predicament and share their approach to understanding and reducing these costs.

more…


Exploring Gen AI - Three versions of a median

Thu 27 Jul 2023 10:34 EDT

Birgitta Böckeler continues her explorations in using LLMs, this time by asking GitHub Copilot to write a median function. It gave her three suggestions to choose from. The experience shows you still have to know what you're doing when asking LLMs to write code, since the LLM's programming skills are often rather flawed.

more…


Exploring Gen AI - The toolchain

Wed 26 Jul 2023 10:52 EDT

My colleague Birgitta Böckeler has long been one of our senior technology leaders in Germany. She's now moved into a new role coordinating our work with Generative AI and its effect on software delivery practices. She's decided to publish her exploration in a series of memos. The first memo looks at the current toolchain for LLMs, categorizing them by what tasks they help with, how we interact with the LLM, and where they come from.

more…


Bliki: TeamTopologies

Tue 25 Jul 2023 09:25 EDT

Any large software effort, such as the software estate for a large company, requires a lot of people - and whenever you have a lot of people you have to figure out how to divide them into effective teams. Forming Business Capability Centric teams helps software efforts to be responsive to customers’ needs, but the range of skills required often overwhelms such teams. Team Topologies is a model for describing the organization of software development teams, developed by Matthew Skelton and Manuel Pais. It defines four forms of teams and three modes of team interactions. The model encourages healthy interactions that allow business-capability centric teams to flourish in their task of providing a steady flow of valuable software.

The primary kind of team in this framework is the stream-aligned team, a Business Capability Centric team that is responsible for software for a single business capability. These are long-running teams, thinking of their efforts as providing a software product to enhance the business capability.

Each stream-aligned team is full-stack and full-lifecycle: responsible for front-end, back-end, database, business analysis, feature prioritization, UX, testing, deployment, monitoring - the whole enchilada of software development. They are Outcome Oriented, focused on business outcomes rather than Activity Oriented teams focused on a function such as business analysis, testing, or databases. But they also shouldn't be too large: ideally each one is a Two Pizza Team. A large organization will have many such teams, and while they have different business capabilities to support, they have common needs such as data storage, network communications, and observability.

A small team like this calls for ways to reduce their cognitive load, so they can concentrate on supporting the business needs, not on (for example) data storage issues. An important part of doing this is to build on a platform that takes care of these non-focal concerns. For many teams a platform can be a widely available third party platform, such as Ruby on Rails for a database-backed web application. But for many products there is no single off-the-shelf platform to use, so a team is going to have to find and integrate several platforms. In a larger organization they will have to access a range of internal services and follow corporate standards.

This problem can be addressed by building an internal platform for the organization. Such a platform can do that integration of third-party services, near-complete platforms, and internal services. Team Topologies classifies the team that builds this (unimaginatively-but-wisely) as a platform team.

Smaller organizations can work with a single platform team, which produces a thin layer over an externally provided set of products. Larger platforms, however, require more people than can be fed with two pizzas. The authors are thus moving to describe a platform grouping of many platform teams.

An important characteristic of a platform is that it's designed to be used in a mostly self-service fashion. The stream-aligned teams are still responsible for the operation of their product, and direct their use of the platform without expecting an elaborate collaboration with the platform team. In the Team Topologies framework, this interaction mode is referred to as X-as-a-Service mode, with the platform acting as a service to the stream-aligned teams.

Platform teams, however, need to build their services as products themselves, with a deep understanding of their customers' needs. This often requires that they use a different interaction mode, collaboration mode, while they build that service. Collaboration mode is a more intensive partnership form of interaction, and should be seen as a temporary approach until the platform is mature enough to move to X-as-a-Service mode.

So far, the model doesn't represent anything particularly inventive. Breaking organizations down between business-aligned and technology support teams is an approach as old as enterprise software. In recent years, plenty of writers have expressed the importance of making these business capability teams be responsible for the full-stack and the full-lifecycle. For me, the bright insight of Team Topologies is focusing on the problem that having business-aligned teams that are full-stack and full-lifecycle means that they are often faced with an excessive cognitive load, which works against the desire for small, responsive teams. The key benefit of a platform is that it reduces this cognitive load.

A crucial insight of Team Topologies is that the primary benefit of a platform is to reduce the cognitive load on stream-aligned teams

This insight has profound implications. For a start it alters how platform teams should think about the platform. Reducing client teams' cognitive load leads to different design decisions and a different product roadmap than those of platforms intended primarily for standardization or cost-reduction. Beyond the platform, this insight leads Team Topologies to develop their model further by identifying two more kinds of team.

Some capabilities require specialists who can put considerable time and energy into mastering a topic important to many stream-aligned teams. A security specialist may spend more time studying security issues and interacting with the broader security community than would be possible as a member of a stream-aligned team. Such people congregate in enabling teams, whose role is to grow relevant skills inside other teams so that those teams can remain independent and better own and evolve their services. To achieve this, enabling teams primarily use the third and final interaction mode in Team Topologies. Facilitating mode involves a coaching role, where the enabling team isn't there to write and ensure conformance to standards, but instead to educate and coach their colleagues so that the stream-aligned teams become more autonomous.

Stream-aligned teams are responsible for the whole stream of value for their customers, but occasionally we find an aspect of a stream-aligned team's work that is sufficiently demanding that it needs a dedicated group to focus on it, leading to the fourth and final type of team: the complicated-subsystem team. The goal of a complicated-subsystem team is to reduce the cognitive load of the stream-aligned teams that use that complicated subsystem. That's a worthwhile division even if there's only one client team for that subsystem. Mostly, complicated-subsystem teams strive to interact with their clients using X-as-a-Service mode, but will need to use collaboration mode for short periods.

Team Topologies includes a set of graphical symbols to illustrate teams and their relationships. Those shown here are from the current standards, which differ from those used in the book. A recent article elaborates on how to use these diagrams.

Team Topologies is designed explicitly recognizing the influence of Conway's Law. The team organization that it encourages takes into account the interplay between human and software organization. Advocates of Team Topologies intend its team structure to shape the future development of the software architecture into responsive and decoupled components aligned to business needs.

George Box neatly quipped: "all models are wrong, some are useful". Thus Team Topologies is wrong: complex organizations cannot be simply broken down into just four kinds of teams and three kinds of interactions. But constraints like this are what makes a model useful. Team Topologies is a tool that impels people to evolve their organization into a more effective way of operating, one that allows stream-aligned teams to maximize their flow by lightening their cognitive load.

Acknowledgements

Andrew Thal, Andy Birds, Chris Ford, Deepak Paramasivam, Heiko Gerin, Kief Morris, Matteo Vaccari, Matthew Foster, Pavlo Kerestey, Peter Gillard-Moss, Prashanth Ramakrishnan, and Sandeep Jagtap discussed drafts of this post on our internal mailing list, providing valuable feedback.

Matthew Skelton and Manuel Pais kindly provided detailed comments on this post, including sharing some of their recent thinking since the book.

Further Reading

The best treatment of the Team Topologies framework is the book of the same name, published in 2019. The authors also maintain the Team Topologies website and provide education and training services. Their recent article on team interaction modeling is a good intro to how the Team Topologies (meta-)model can be used to build and evolve a model of an organization. [1]

Much of Team Topologies is based on the notion of Cognitive Load. The authors explored cognitive load in Tech Beacon. Jo Pearce expanded on how cognitive load may apply to software development.

The model in Team Topologies resonates well with much of the thinking on software team organization that I've published on this site. You can find this collected together at the team organization tag.

Notes

1: To be more strict in my modeling lingo, I would say that Team Topologies usually acts as a meta-model. If I use Team Topologies to build a model of an airline's software development organization, then that model shows the teams in the airline classified according to Team Topologies's terminology. I would then say that the Team Topologies model is a meta-model to my airline model.
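The distinction between the meta-model and a model may be easier to see in code. The following is only a loose sketch: the enums and classes play the part of the Team Topologies meta-model (four team types, three interaction modes), while the airline teams at the bottom are invented purely for illustration and do not describe any real organization.

```python
from dataclasses import dataclass
from enum import Enum


# --- Meta-model: the vocabulary that Team Topologies provides ---

class TeamType(Enum):
    STREAM_ALIGNED = "stream-aligned"
    PLATFORM = "platform"
    ENABLING = "enabling"
    COMPLICATED_SUBSYSTEM = "complicated-subsystem"


class InteractionMode(Enum):
    COLLABORATION = "collaboration"
    X_AS_A_SERVICE = "x-as-a-service"
    FACILITATING = "facilitating"


@dataclass(frozen=True)
class Team:
    name: str
    type: TeamType


@dataclass(frozen=True)
class Interaction:
    from_team: Team
    to_team: Team
    mode: InteractionMode


# --- Model: one (hypothetical) airline described with that vocabulary ---

booking = Team("Booking", TeamType.STREAM_ALIGNED)
platform = Team("Internal Platform", TeamType.PLATFORM)
security = Team("Security Guild", TeamType.ENABLING)

airline_model = [
    Interaction(platform, booking, InteractionMode.X_AS_A_SERVICE),
    Interaction(security, booking, InteractionMode.FACILITATING),
]
```

Here `airline_model` is the model, and the types it is built from are the meta-model: a different organization would reuse the same enums and classes but populate them with its own teams and interactions.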