Recent Changes
Here is a list of recent updates to the site. You can also get this information as an RSS feed, and I announce new articles on Twitter and Mastodon.
I use this page to list both new articles and additions to existing articles. Since I often publish articles in installments, many entries on this page are new installments of recently published articles; such announcements are indented and don’t show up in the recent changes section of my home page.
Wed 10 Apr 2024 11:00 EDT
Alessio Ferri and Tom Coggrave
complete their article about introducing seams into mainframe systems by
looking at how we can use data replication. Done well, it can provide a
rapid start to a displacement effort, but we must be wary of it
coupling new systems to the legacy schema.
more…
Thu 04 Apr 2024 09:39 EDT
Mainframe processing is often organized into pipelines of batch
processes consuming and creating data files. Alessio
Ferri and Tom Coggrave show a couple of ways
they found to introduce seams into these pipelines, so that processing
could be replaced by steps in a replacement platform.
more…
Tue 02 Apr 2024 09:46 EDT
Alessio Ferri and Tom Coggrave move
on to analyzing the internal seams of the application by treating
the database as a coarse-grained seam. They describe how they introduced
seams into the database readers and writers.
more…
Thu 28 Mar 2024 12:26 EDT
As the enmuskification of Twitter continues, I’ve increasingly heard that more people are using LinkedIn to keep up with new professional material. So, a couple of weeks ago, I set up my LinkedIn account, so people can follow me on that platform.
I’ve always avoided LinkedIn - I’ve found the whole vibe of connections rather off-putting. I get too much spam from people wanting to connect as it is. But LinkedIn has added a “creator mode”, which encourages people to follow someone for posts rather than the bi-directional connection. It seems to be working reasonably well so far, so I’ve decided that I shall post all updates here to that account too. However I’m still avoiding connections, so please don’t send me a connection request unless we’ve done some substantial work together.
I’m still posting to X (Twitter), but as it steadily deteriorates, it gets less of my attention. If you follow me there, I recommend switching to one of my other feeds if you can.
I don’t particularly like LinkedIn for its feed, as I have little control over it. I never like the site’s suggestions, preferring simple feeds of people. Sadly, LinkedIn doesn’t have lists, and pushes everything someone likes into the single connection feed. This leads to Too Many Posts, which means I don’t pay much attention to any of it.
Wed 27 Mar 2024 12:48 EDT
John Kordyback, a treasured colleague and friend, died last week, aged 64.
more…
Wed 27 Mar 2024 10:36 EDT
Alessio Ferri and Tom Coggrave start
detailing the seams they explored with two areas of external interfaces.
Batch input files are copied to new implementations while comparing the output of the
processing pipelines. API access points can be covered with a proxy and
traffic gradually directed to the new implementation.
more…
Tue 26 Mar 2024 09:32 EDT
Mainframe systems continue to run much of the world's computing workload,
but it's often difficult to add new features to support growing business
needs. Furthermore, the architectural challenges that make them slow to
enhance also make them hard to replace. To reduce the risk involved, we
use an incremental approach to legacy displacement, gradually replacing
legacy capabilities with implementations in modern technology. This
strategy requires us to introduce seams into the mainframe system: points
at which we can divert logic flow into newer services. In a recent
project Alessio Ferri and Tom Coggrave
investigated several approaches to introduce these seams into a
long-lived mainframe system.
more…
Tue 19 Mar 2024 09:33 EDT
Abi Noda and Tim Cochran complete
their article on qualitative metrics by outlining how to capture them
effectively. They discuss the mental steps that people go through as
they respond to a survey and provide a template to get started when assessing
developer experience. A final section looks at how qualitative and
quantitative metrics work together: often by starting with qualitative metrics to
establish baselines and determine where to focus, followed by
quantitative metrics to drill deeper into specific areas.
more…
Fri 15 Mar 2024 11:04 EDT
From time to time people ask me for a copy of the code I used in the
opening chapter of Refactoring, so
they can follow along themselves. I had Reasons for not providing this code,
specifically laziness. Fortunately Emily Bache is more dedicated, and she
has set up a GitHub repository - the Theatrical
Players Refactoring Kata - with the code, and enough tests to make it
reasonable to do the refactoring.
The repository goes further than this, however, in that it includes similar
sample code in a dozen languages, including C, Java, Rust, and Python.
She has recently posted a video to her YouTube
channel, which outlines why she encourages folks to use this code while
they are reading that chapter. Her channel includes a lot of videos on good
code technique, and she has a Patreon for readers to support
her work.
Wed 13 Mar 2024 10:36 EDT
Abi Noda and Tim Cochran continue their
discussion on using qualitative metrics to assess the productivity of
development teams. In this installment they classify qualitative metrics into
attitudinal and behavioral metrics. We also see that qualitative metrics
allow you to measure things that are otherwise unmeasurable, provide missing
visibility, and supply necessary context for quantitative data.
more…
Tue 12 Mar 2024 09:36 EDT
Measuring developer productivity is a difficult challenge.
Conventional metrics focused on development cycle time and throughput are
limited, and there aren't obvious answers for where else to turn.
Qualitative metrics offer a powerful way to measure and understand
developer productivity using data derived from developers themselves.
Abi Noda and Tim Cochran begin their
discussion by explaining what a qualitative metric is and
why we shouldn't reject them for being subjective or unreliable.
more…
Wed 06 Mar 2024 10:33 EST
When pair programming, it's important to rotate the pairs frequently,
but many organizations that do pair programming are reluctant to do that.
Gabriel Robaina and Kieran Murphy asked
the question: “What if we rotate pairs every day?” and worked with three
teams through an exercise of daily pair rotation. They developed a
lightweight methodology to help teams reflect on the benefits and
challenges of pairing and how to solve them. Initial fears were overcome
and teams discovered the benefits of frequently rotating pairs. They
learned that frequent pair swapping greatly enhances the benefits of
pairing. Their article shares the methodology they
developed, their observations, and some common fears and insights shared
by the participating team members.
more…
Tue 05 Mar 2024 10:03 EST
When we gradually replace a legacy system, we have plenty of cases
where the legacy system and its replacement need to interact. Since these
legacy systems are often difficult, and costly, to change, we need a
mechanism that can integrate elements of the replacement while minimizing
the impact to the legacy system. Ian Cartwright, Rob Horn, and
James Lewis explain how we can use Event Interception on
state-changing events, allowing us to forward them to the
replacement.
more…
Tue 27 Feb 2024 09:17 EST
Improvements in communications technology have led to an increasing number of
teams that work in a Remote-First
style, a trend that was boosted by the forced isolation of the Covid-19 pandemic.
But a team that operates remotely still benefits from face-to-face gatherings,
and should do them every few months.
Remote-first teams have everyone in a separate location, communicating
entirely by email, chat, video and other communication tools. It has definite
benefits: people can be recruited to the team from all over the world, and we can
involve people with care-giving responsibilities. Wasteful hours of
frustrating commutes can be turned into productive or recuperative time.
But however capable folks may be at remote working, and however nifty modern
collaboration tools become, there is still nothing like being in the same
place with the other members of a team. Human interactions are always richer
when they are face-to-face. Video calls too easily become transactional, with
little time for the chitchat that builds a proper human relationship. Without
those deeper bonds, misunderstandings fester into serious relationship
difficulties, and teams can get tangled in situations that would be
effectively resolved if everyone were able to talk in person.
A regular pattern I see from those who are effective in remote-first work
is that they ensure regular face-to-face meetings. During these they schedule
those elements of work that are done better together. Remote work
is more effective for tasks that require solo concentration, and modern tools
can make remote pairing workable. But tasks that require lots of input from
many people with rapid feedback are much easier to do when everyone is in the
same room. No video-conference system can create that depth of
interaction; staring at a computer screen to see what other people are doing
is draining, with no opportunity to pop out for a coffee together to break up the
work. Debates about product strategy, explorations of systems architecture,
explorations of new ground - these are common tasks for when the team is
assembled.
For people to work effectively together they need to trust each other, and be
aware of how much they can rely on each other. Trust is hard to develop
online, where we miss the social cues that occur when we are in the
same room. Thus the most valuable part of a face-to-face gathering isn't
the scheduled work, it's chitchat while getting a coffee, and conviviality
over lunch. Informal conversations, mostly not about work, forge the human
contact that makes the work interactions more effective.
Those guidelines suggest what the content for a face-to-face should be.
Working together is both valuable in its own right, and an important part of
team bonding. So we should set a full day of work, focusing on those tasks
that benefit from the low-latency communication that comes from being
together. We should then include what feels like too much time for
breaks, informal chatter, and opportunities to step outside the office. I
would avoid any artificial “team building” exercises, if only because of how
much I hate them. Those who do gatherings like this stress the value of
everyone being energized afterwards, and thus able to be more effective in the
following weeks.
Remote teams can be formed at large distances, and it's common to see
members separated by hours of travel. For such teams, the rule of thumb I would
use is to get together for a week every two or three months. After the team
has become seasoned they may then decide to reduce the frequency, but I would
worry if a team isn't having at least two face-to-face meetings a year. If a
team is all in the same city, but using a remote-first style to reduce
commuting, then they can organize shorter gatherings, and do them more
frequently.
This kind of gathering may lead to rethinking of how to configure office
space. Much has been made of how offices are far less used since the pandemic.
Offices could well become less of a day-to-day workspace, and more a location
for these kinds of irregular team gatherings. This leads to a need for
flexible and comfortable team gathering spaces.
Some organizations may balk at the costs of travel and accommodation for a
team assembly like this, but they should think of it as an investment in the
team's effectiveness. Neglecting these face-to-faces leads to teams getting
stuck, heading off in the wrong direction, plagued with conflict, and people
losing motivation. Compared to this, saving on airplanes and hotels is a false
economy.
Further Reading
Remote-first is one form of remote work; I explore the different styles
of remote working and their trade-offs in Remote versus Co-located
Work.
At Thoughtworks, we learned the importance of regular face-to-face
gatherings for remote teams when we first started our offshore development
centers nearly two decades ago. These generated the practices I describe in Using an Agile Software Process with
Offshore Development.
Remote work, particularly when crossing time zones, puts a greater
premium on asynchronous patterns of collaboration. My colleague Sumeet
Moghe, a product manager, goes into depth on how to do this in his book
The Async-First Playbook.
Atlassian, a software product company, has recently entirely shifted to
remote working, and published a report on its
experiences. They have learned that it's wise for teams to have a
face-to-face gathering roughly three times per year. Claire Lew surveyed remote-first teams in 2018, noting that a quarter
of their respondents did retreats “several times a year”. 37Signals has
operated as a remote-first company for nearly two decades and schedules meetups twice a year.
Acknowledgements
Alejandro Batanero, Andrew Thal, Chris Ford, Heiko Gerin, Kief Morris, Kuldeep Singh, Matt Newman, Michael Chaffee, Naval
Prabhakar, Rafael Detoni, and Ramki Sitaraman discussed drafts of
this post on our internal mailing list.
Tue 13 Feb 2024 12:22 EST
LLM engineering involves much more than just prompt design or prompt
engineering. Here David Tan and Jessie
Wang reflect on how regular engineering practices such as
testing and refactoring helped them deliver a prototype LLM
application rapidly and reliably.
more…
Wed 31 Jan 2024 11:57 EST
Tim and Prem finish their article on effective
onboarding. They discuss the value of pair programming, setting up
personal environments, and removing friction from the process.
more…
Tue 30 Jan 2024 09:33 EST
Tim and Prem continue outlining the steps for an
effective onboarding process. They talk about including new hires in the
company culture, nailing the post-offer and first-day experience, and
investing in self-service knowledge management.
more…
Thu 25 Jan 2024 13:20 EST
I’ve been using Emacs for many years, using it for any writing for my website, writing my books, and most of my programming. (Exceptions have been IntelliJ IDEA for Java and RStudio for R.) As such I’ve been happy to see a lot of activity in the last few years to improve Emacs’s capabilities, making it feel rather less like an evolutionary dead end. One of the biggest improvements to my Emacs experience is using regexs for completion lists.
Many Emacs commands generate lists of things to pick from. When I want to visit (open) a file, I type the key combination to find a file, and Emacs pops up a list of candidate files in the minibuffer (a special area for interacting with commands). These file lists can be quite long, particularly should I ask for a list of all files in my current project.
To specify the file I want, I can type some text to filter the list, so if I want to open the file articles/simple/2024-emacs-completion.md I might type emacs. I don’t have to get only that one file; just filtering to a small enough list is often enough.
There’s a particular style of regex builder that I find the most helpful, one that separates regexs by spaces. This would allow me to type articles emacs to get a list of any file paths that contain “articles” and “emacs” in their file path. It essentially turns the string “articles emacs” into the regex \\(articles\\).*\\(emacs\\). Better yet, such a matcher allows me to type the regexs in any order, so that “emacs articles” would also match. This way, once the first regex pops up a filtered list, I can use a second regex to pick the one I want, even if the distinguishing regex is earlier than my initial search.
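The scheme is simple enough to sketch in a few lines. Here’s a hypothetical TypeScript rendering (the function and file names are mine, not Ivy’s): each space-separated word becomes its own regex, and a candidate survives only if every regex matches somewhere, in any order.

```typescript
// Hypothetical sketch of any-order, space-separated regex matching.
// Each word of the input is treated as a regex in its own right; a
// candidate matches only if every one of them matches somewhere.
function makeMatcher(input: string): (candidate: string) => boolean {
  const regexes = input.trim().split(/\s+/).map((w) => new RegExp(w));
  return (candidate) => regexes.every((re) => re.test(candidate));
}

// Illustrative file list (made-up paths).
const files = [
  "articles/simple/2024-emacs-completion.md",
  "articles/refactoring-pipelines.md",
  "emacs.d/init.el",
];

// "articles emacs" and "emacs articles" filter to the same single path.
const matches = files.filter(makeMatcher("articles emacs"));
```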
Installing such a completion matcher has had a remarkable effect on my use of Emacs, since it makes it a breeze to filter large lists when interacting with commands. One of the most significant of these is how it changes my use of M-x, the key combo that brings up a list of all interactive Emacs functions. With a regex matcher to filter the list, I can invoke an Emacs command by its name with just a few keystrokes. That way I don’t have to remember the keyboard shortcut. With this, I invoke commands that I use less frequently through M-x. I don’t list all open buffers very often, so rather than try to remember the key combination for it, I just type M-x ib and ibuffer quickly pops up. This is helped by the fact that the command I use for M-x (counsel-M-x) inserts a “^” as the first character in the regex, which anchors the first regex to the beginning of the line. Since I prefix all my self-written functions with mf-, I can easily find my own functions, even if they have a long name. I wrote a command to remove the domain from a URL; I call it mf-url-remove-domain and can invoke it with M-x mf url.
There are quite a few packages in Emacs that do this kind of matching, enough to be rather confusing. The one I’m using these days is Ivy. By default it uses a space-separated regex matcher, but one that doesn’t support any order. To configure it the way I like it I use
(setq ivy-re-builders-alist '((t . ivy--regex-ignore-order)))
Ivy is part of a package called counsel that includes various commands that enhance these kinds of selections.
Ivy isn’t the only tool that does this kind of thing. Indeed the world of completion tools in Emacs is one I find very confusing: lots of tools with overlaps and interactions that I don’t really understand. The tools in this territory include Helm, company, Vertico, and Consult. Mastering Emacs has an article on Understanding Minibuffer Completion, but it doesn’t explain how the mechanisms it talks about fit in with what Ivy does, and I haven’t spent the time to figure it all out.
And as a general note, I strongly recommend the book Mastering Emacs to learn how to use this incredible tool. Emacs has so many capabilities, that even a decades-old user like me found that book led to “I didn’t know it could do that” moments.
For those that are curious, here are the relevant bits of my Emacs config
(use-package ivy
:demand t
:diminish ivy-mode
:config
(ivy-mode 1)
(counsel-mode 1)
(setq ivy-use-virtual-buffers t)
(setq ivy-use-selectable-prompt t)
(setq ivy-ignore-buffers '("\\` " "\\`\\*magit"))
(setq ivy-re-builders-alist '(
(t . ivy--regex-ignore-order)
))
(setq ivy-height 10)
(setq counsel-find-file-at-point t)
(setq ivy-count-format "(%d/%d) "))
(use-package counsel
:bind (
("C-x C-b" . ivy-switch-buffer)
("C-x b" . ivy-switch-buffer)
("M-r" . counsel-ag)
("C-x C-d" . counsel-dired)
("C-x d" . counsel-dired)
)
:diminish
:config
(global-set-key [remap org-set-tags-command] #'counsel-org-tag))
(use-package swiper
:bind(("M-C-s" . swiper)))
(use-package ivy-hydra)
Wed 24 Jan 2024 10:45 EST
Tim and Prem begin their discussion of how to get out of the
difficulties of onboarding by explaining how to create a path to
effectiveness for new hires. Such a path outlines the needs of the employee
and how the onboarding process should fulfill them.
more…
Tue 23 Jan 2024 09:13 EST
The last year has been a hard one for the technology industry, which
faced its greatest waves of cutbacks and layoffs since the dotcom crash at
the beginning of the century. As 2024 begins, we're seeing the early signs of a
turn-around, which hopefully means technology organizations will soon be thinking of
hiring again. Should such happy days return, firms will again run into
the common problem of
taking too long for new hires to become effective. Tim
Cochran and Premanand Chandrasekaran address this in the sixth part of our
series on the bottlenecks of scaleups. In this first installment, Tim and Prem
look at the signs that a growing organization is running into this bottleneck.
more…
Thu 18 Jan 2024 10:01 EST
At the turn of the century, I was lucky to be involved in several
projects that developed the practice of Continuous Integration. I wrote
up our lessons from this work in an article on my website, where it
continues to be an oft-referenced resource for this important practice.
Late last year a colleague contacted me to say that the article, now
nearly twenty years old, was still useful, but was showing its age. He
sent me some suggested revisions, and I used this as a trigger to do a
thorough revision of the article, considering every section in the
original, and adding new ones to deal with issues that have appeared in the last
two decades.
During those decades Feature Branching has been widely adopted in the
industry. Many colleagues of mine feel that Continuous Integration is a
better fit for many teams. I hope this article will help readers assess
if this is the case, and if so, how to implement Continuous Integration
effectively.
more…
Thu 04 Jan 2024 09:13 EST
When working with a legacy system it is valuable to identify and create seams:
places where we can alter the behavior of the system without editing
source code. Once we've found a seam, we can use it to break dependencies to
simplify testing, insert probes to gain observability, and redirect program
flow to new modules as part of legacy displacement.
Michael Feathers coined the term “seam” in the context of legacy systems in
his book Working Effectively with Legacy Code. His definition: “a seam is a place
where you can alter behavior in your program without editing in that place”.
Here's an example of where a seam would be handy. Imagine some code to
calculate the price of an order.
// TypeScript
export async function calculatePrice(order:Order) {
const itemPrices = order.items.map(i => calculateItemPrice(i))
const basePrice = itemPrices.reduce((acc, i) => acc + i.price, 0)
const discount = calculateDiscount(order)
const shipping = await calculateShipping(order)
const adjustedShipping = applyShippingDiscounts(order, shipping)
return basePrice + discount + adjustedShipping
}
The function calculateShipping hits an external service, which is slow (and expensive), so we don't want to hit it when testing. Instead we want to introduce a stub, so we can provide a canned and deterministic response for each of the testing scenarios. Different tests may need different responses from the function, but we can't edit the code of calculatePrice inside the test. Thus we need to introduce a seam around the call to calculateShipping, something that will allow our test to redirect the call to the stub.
One way to do this is to pass the function for calculateShipping as a parameter
export async function calculatePrice(order:Order, shippingFn: (o:Order) => Promise<number>) {
const itemPrices = order.items.map(i => calculateItemPrice(i))
const basePrice = itemPrices.reduce((acc, i) => acc + i.price, 0)
const discount = calculateDiscount(order)
const shipping = await shippingFn(order)
const adjustedShipping = applyShippingDiscounts(order, shipping)
return basePrice + discount + adjustedShipping
}
A unit test for this function can then substitute a simple stub.
const shippingFn = async (o:Order) => 113
expect(await calculatePrice(sampleOrder, shippingFn)).toStrictEqual(153)
Each seam comes with an enabling point: “a place where you can make the decision to use one behavior or another” [WELC]. Passing the function as a parameter opens up an enabling point in the caller of calculateShipping.
This now makes testing a lot easier: we can put in different values of shipping costs, and check that applyShippingDiscounts responds correctly. Although we had to change the original source code to introduce the seam, any further changes to that function don't require us to alter that code; the changes all occur in the enabling point, which lies in the test code.
Passing a function as a parameter isn't the only way we can introduce a seam. After all, changing the signature of calculateShipping may be fraught, and we may not want to thread the shipping function parameter through the legacy call stack in the production code. In this case a lookup may be a better approach, such as using a service locator.
export async function calculatePrice(order:Order) {
const itemPrices = order.items.map(i => calculateItemPrice(i))
const basePrice = itemPrices.reduce((acc, i) => acc + i.price, 0)
const discount = calculateDiscount(order)
const shipping = await ShippingServices.calculateShipping(order)
const adjustedShipping = applyShippingDiscounts(order, shipping)
return basePrice + discount + adjustedShipping
}
class ShippingServices {
  static #soleInstance: ShippingServices
  static init(arg?:ShippingServices) {
    this.#soleInstance = arg || new ShippingServices()
  }
  static async calculateShipping(o:Order) {return this.#soleInstance.calculateShipping(o)}
  async calculateShipping(o:Order) {return legacy_calculateShipping(o)}
  // ... more services
}
The locator allows us to override the behavior by defining a subclass.
class ShippingServicesStub extends ShippingServices {
  calculateShippingFn: typeof ShippingServices.calculateShipping =
    (o) => {throw new Error("no stub provided")}
  async calculateShipping(o:Order) {return this.calculateShippingFn(o)}
  // ... more services
}
We can then use an enabling point in our test
const stub = new ShippingServicesStub()
stub.calculateShippingFn = async (o:Order) => 113
ShippingServices.init(stub)
expect(await calculatePrice(sampleOrder)).toStrictEqual(153)
This kind of service locator is a classical object-oriented way to set up a
seam via function lookup, which I'm showing here to indicate the kind of
approach I might use in other languages, but I wouldn't use this approach in
TypeScript or JavaScript. Instead I'd put something like this into a module.
export let calculateShipping = legacy_calculateShipping
export function reset_calculateShipping(fn?: typeof legacy_calculateShipping) {
calculateShipping = fn || legacy_calculateShipping
}
We can then use the code in a test like this
const shippingFn = async (o:Order) => 113
reset_calculateShipping(shippingFn)
expect(await calculatePrice(sampleOrder)).toStrictEqual(153)
As the final example suggests, the best mechanism to use for a seam depends
very much on the language, available frameworks, and indeed the style of the
legacy system. Getting a legacy system under control means learning how to
introduce various seams into the code to provide the right kind of enabling
points while minimizing the disturbance to the legacy software. While a
function call is a simple example of introducing such seams, they can be much
more intricate in practice. A team can spend several months figuring out how
to introduce seams into a well-worn legacy system. The best mechanism for
adding seams to a legacy system may be different to what we'd do for similar
flexibility in a green field.
Feathers's book focuses primarily on getting a legacy system under test, as that is often the key to being able to work with it in a sane way. But seams have more uses than that. Once we have a seam, we are in a position to place probes into the legacy system, allowing us to increase the observability of the system. We might want to monitor calls to calculateShipping, figuring out how often we use it, and capturing its results for separate analysis.
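With the module-based seam shown earlier, such a probe is just another function we swap in. Here's a hypothetical sketch (the id field on Order and the withProbe name are my own, not part of the article's code):

```typescript
// Minimal stand-ins so the sketch is self-contained; the id field
// here is an assumption for illustration.
interface Order { id: string }
type ShippingFn = (o: Order) => Promise<number>;

// Record of observed calls, kept for separate analysis later.
const observations: { orderId: string; shipping: number }[] = [];

// Wrap any shipping function so every call and its result is
// recorded, without changing what callers of the seam see.
function withProbe(fn: ShippingFn): ShippingFn {
  return async (order) => {
    const shipping = await fn(order);
    observations.push({ orderId: order.id, shipping });
    return shipping;
  };
}

// Installed through the same enabling point the tests use:
//   reset_calculateShipping(withProbe(legacy_calculateShipping))
```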
But probably the most valuable use of seams is that they
allow us to migrate behavior away from the legacy.
A seam might redirect high-value customers to a different shipping calculator.
Effective legacy displacement is founded on introducing seams into the legacy
system, and using them to gradually move behavior into a more modern environment.
Seams are also something to think about as we write new software; after all, every new system will become legacy sooner or later. Much of my design advice is about building software with appropriately placed seams, so we can easily test, observe, and enhance it. If we write our software with testing in mind, we tend to get a good set of seams, which is a reason why Test Driven Development is such a useful technique.
Tue 02 Jan 2024 18:42 EST
Another year, another time to pick six favorite musical
discoveries. 2023 includes ambient bluegrass, Afro-Andean funk, Northumbrian
smallpipes, dancing kora, and Ukrainian folk jazz.
more…
Wed 13 Dec 2023 00:00 EST
Throughout my career, people have compared software development to
“traditional” engineering, usually in a way to scold software developers for not
doing a proper job. As someone who got his degree in Electronic Engineering,
this resonated with me early in my career. But this way of thinking is flawed
because most people have the wrong impression of how engineering works in
practice.
Glenn Vanderburg has spent a lot of time digging
into these misconceptions, and I strongly urge anyone who wants to compare
software development to engineering to watch his talk Real Software Engineering. It's also well worth
listening to his interview on the podcast Oddly
Influenced. Sadly I've not been able to persuade him to write this
material down - it would make a great article.
Another good thinker on this relationship is Hillel Wayne. He interviewed a
bunch of “crossovers” - people who had worked both in traditional engineering
and in software. He wrote up what he learned in a series of essays, starting
with Are We Really Engineers?
Mon 11 Dec 2023 14:40 EST
Test-Driven Development (TDD) is a technique for building
software that guides software development by writing tests. It was
developed by Kent
Beck in the late 1990's as part of
Extreme Programming. In essence we follow three simple
steps repeatedly:
- Write a test for the next bit of functionality you want to add.
- Write the functional code until the test passes.
- Refactor both new and old code to make it well structured.
Although these three steps, often summarized as Red - Green -
Refactor, are the heart of the process, there's also a vital initial
step where we write out a list of test cases first. We then pick one of these
tests, apply red-green-refactor to it, and once we're done pick the next.
Sequencing the tests properly is a skill; we want to pick tests that drive us
quickly to the salient points in the design. During the process we should add
more tests to our lists as they occur to us.
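As a minimal sketch of the cycle (a hypothetical example of my own, not one from the book), suppose the next test on our list concerns leap years:

```typescript
// Red: first write a test for the next bit of functionality —
// console.assert stands in for a real test framework here.
// Green: then write just enough functional code to make it pass.
function isLeapYear(year: number): boolean {
  return year % 4 === 0 && (year % 100 !== 0 || year % 400 === 0);
}

console.assert(isLeapYear(2024) === true);
console.assert(isLeapYear(1900) === false); // century years need the 400 rule
console.assert(isLeapYear(2000) === true);

// Refactor: with the tests green, we can now reshape the code —
// say, extracting a divisibleBy helper — confident that the tests
// will catch any change in behavior.
```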
Writing the test first, what XPE2 calls
Test-First Programming, provides two main benefits. Most obviously it's a way
to get SelfTestingCode, since we can only write some functional
code in response to making a test pass. The second benefit is that thinking
about the test first forces us to think about the interface to the code first.
This focus on interface and how you use a class helps us separate interface
from implementation, a key element of good design that many programmers
struggle with.
The most common way that I hear to screw up TDD is neglecting
the third step. Refactoring the code to keep it clean is a key part
of the process, otherwise we just end up with a messy aggregation of
code fragments. (At least these will have tests, so it's a less
painful result than most failures of design.)
Revisions
I originally posted this page on 2005-03-05. Inspired by Kent's
canonical post, I updated it on 2023-12-11.
Mon 04 Dec 2023 00:00 EST
Regression bugs are newly appeared bugs in features of the software that have been around
for a while. When hunting them, it is usually valuable to figure out which change
in the software caused them to appear. Looking at that change can give
invaluable clues about where the bug is and how to squash it. There isn't a
well-known term for this form of investigation, but I call it Diff Debugging.
Diff debugging only works if we have our code in version control, but
fortunately these days that's the norm. But there are some more things that
are needed to make it work effectively. We need Reproducible Builds, so that we can run old versions of
the software easily. It helps greatly to have small commits, due to high-frequency
integration. That way when we find the guilty commit, we can more easily
narrow down what happened.
To find the commit that bred the bug, we begin by finding any past version
without the bug. Mark this as a last-good version and the current
version as the earliest-bad. Then find the commit half-way between the
two and see if the bug is there. If so then this commit becomes the earliest-bad,
otherwise it becomes the last-good. Repeat the process (which is a
“half-interval” or “binary” search) until we've got the guilty commit.
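The search itself takes only a few lines. Here's a hypothetical TypeScript sketch, where isBad stands in for checking out a commit and running the failing test (the names are mine):

```typescript
// Binary search for the commit that bred the bug. `commits` is
// ordered oldest-first; we assume the first commit is known-good
// and the last is known-bad.
function findGuiltyCommit(
  commits: string[],
  isBad: (commit: string) => boolean,
): string {
  let lastGood = 0;
  let earliestBad = commits.length - 1;
  while (earliestBad - lastGood > 1) {
    const mid = Math.floor((lastGood + earliestBad) / 2);
    if (isBad(commits[mid])) earliestBad = mid; // bug already present here
    else lastGood = mid;                        // bug not yet introduced
  }
  return commits[earliestBad];
}

// Suppose the bug appeared in c4: every commit from c4 onwards fails.
const commits = ["c1", "c2", "c3", "c4", "c5", "c6"];
const guilty = findGuiltyCommit(commits, (c) => c >= "c4");
// guilty === "c4"
```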
If we use git, then the git
bisect command will automate much of this for us. If we can write a test
that will show the presence of the bug, then git bisect can use that too,
automating the whole process of finding the guilty commit.
I often find diff debugging to be useful within a programming session. If I
have slow tests that take a few minutes to run, I might program for
half-an-hour running only a subset of the most relevant tests. As long as I
commit after every green test run, I can use diff debugging should one of
those slower tests fail. Such is the value of committing extremely frequently,
even if the commits are so small that I feel it's best to squash them in the long-term
history. Some IDEs make this easier by automatically keeping a local history
that is finer-grained than the commits to version control.
Revisions
I originally posted this page on 2004-06-01. In its original form it was
more of a casual experience report. I rewrote it on 2023-12-04 to make it
more like a definition of the term. Diff debugging isn't a term that's
caught on much in the industry, but I haven't seen another term generally
used to describe it.
Tue 28 Nov 2023 10:21 EST
Over the last year, lots of developers have incorporated LLM coding
assistants into their work, finding them a useful tool. But one of the
problems of these tools is that they are unreliable, often coming up with
poor or outright wrong-headed suggestions. Birgitta
Böckeler continues her exploration of GenAI for developers by
passing on what she's learned about how to think about this unreliability, and
why it may be good to call your LLM tool “Dusty”.
more…
Fri 24 Nov 2023 14:11 EST
During the last four years, my colleague Unmesh Joshi has
been developing a collection of patterns to help us all better understand
how modern distributed systems work. We've been publishing drafts of these
patterns on this site. Now these have turned into a book, published
by Addison-Wesley in my signature series. As such, we've now removed
the work-in-progress drafts from this site, and have replaced them with a
catalog of pattern
summaries. For those with a subscription to oreilly.com, we have deep
links from the summaries to the relevant chapter of the online book.
more…
Thu 09 Nov 2023 10:12 EST
My colleague Sannie Lee has met many students who are looking to
get into technology, taking narrow professionally-oriented majors.
Sannie, however, has found that a traditional liberal-arts degree has given
her skills that are highly relevant to her work as a product manager.
more…
Tue 07 Nov 2023 10:00 EST
In the second (and final) part of his explanation of React Headless Components
Juntao Qiu explores how a headless component allows us to
create a visually different component that does the same base behavior, and
how it encourages better factoring as we extend base behavior further.
more…