Design bliki

CraftmanshipAndTheCrevasse design 19 January 2011 Reactions

Dan North's recent blog post on software craftsmanship has unleashed a lot of blog discussions (which I summarize below, if you're interested). There's a lot in there, but one of his themes particularly resonated with me, hence this post.

Dan North (aka tastapod)

Before I get to that, however, I just want to push one element off to the side. I've long felt that debates about metaphors for software development are tedious. While MetaphoricQuestioning has its place I'm fundamentally uninterested in whether software development is a craft, an art, a trade, or a dessert topping.

The point that matters to me isn't about the craftsmanship metaphor, but more a characteristic of the movement that seems to have sprung up in the last couple of years. From my outsider perspective, the primary force that's energized the software craftsmanship community is a reaction to the change in the agile movement. In the early days of the agile virus, the dominant strain was Extreme Programming, which has a lot to say about technical practices. Now the dominant agile strains are Scrum and Lean, which don't care very much about programming - and thus those people who primarily self-identify as programmers feel a large part of their life is no longer important in the agile world.

The software craftsmanship world, therefore, is a place where programming can become front-and-center again. People can talk about testing, how to learn and use functional languages, principles of good design, etc. The management and analysis issues can then be left to the debilitated agile community. There's much I sympathize with here. I accept the DesignStaminaHypothesis which suggests that you need to pay attention to good technical practices if you want to develop software effectively. So a movement that gives these issues attention is important. But there also lies a danger, that by focusing too much on technical issues the craftsmanship movement will underplay the equally vital role of customer communication.

One thing I like so much about Kent Beck's work is that there's a real balance between the relationship with the customer and the skills required to execute properly on our half of the bargain. I remember him saying at the AgileManifestoMeeting that his primary aim with Extreme Programming was to heal the divide between software developers and their customers. This divide, which Dan and I characterized as the "yawning crevasse of doom", is one of the most important problems in software development.

A large part of the blame for the crevasse lies in organizational habits that are founded on the notion that programmers and customers are such different creatures that they can't communicate (and shouldn't try). But many programmers seem to go along gladly with widening the crevasse too. Several years ago I was struck by Eric Evans's observation that as developers get more senior they tend to focus more on technical issues and don't tend to engage in understanding the domain they are working in. Domain-Driven Design is very much about trying to change that - although often it seems to get bogged down in discussions of topics such as how to use dependency injection on your repositories. (My own career has also suffered from this. As I've developed as a general software development pundit, I've no longer been able to work in domain modeling - even though that's always been my favorite part of software development.)

So just as Scrum and Lean exacerbate this problem by neglecting the technical skills, I fear craftsmanship in turn may make the crevasse worse by neglecting the relationship skills. My ideal of a programmer is someone who is not just skilled in the craft of programming, but is also energized by learning about the domain by communicating with domain experts so that she can participate in finding the best ways to get software to help make customers rock at what they do. Paraphrasing Dan, the software shouldn't be at the center of a programmer's world; instead a programmer should focus on the benefit that the software is supposed to deliver.

The blog chatter

(includes items posted in response to this entry.)

  • Dan North's original post
  • Liz Keogh explains her discomfort with the software craftsmanship manifesto.
  • Gil Zilberfeld draws a comparison between software craftsmanship and
  • Jason Gorman wants us to avoid getting hung up on labels
  • Michael Feathers looks for more deliberate practice in our work.
  • George Dinwiddie provides a physical example of why quality work is important to a customer, and how certification and licensing doesn't help.
  • Dan North expands and clarifies some of his earlier points.
  • Bob Martin says software craftsmanship is only about programmers tired of writing crap.
  • Bob Martin thinks my fears are groundless (I hope he's right).

IntegrationContractTest design 12 January 2011 Reactions

One of the most common cases of using a TestDouble is when you are communicating with an external service. Typically such services are being maintained by a different team, they may be subject to slow and unreliable networks, and they may be unreliable themselves. That's why a test double is handy: it stops your own tests from being slow and unreliable. But testing against a double always raises the question of whether the double is indeed an accurate representation of the external service, and what happens if the external service changes its contract.

A good way to deal with this is to run your own tests against the double, but to periodically run a separate set of integration contract tests that check that all the calls against your test doubles return the same results as a call to the external service would. A failure in any of these integration contract tests implies you need to update your test doubles, and probably your code, to take into account the service contract change.

These tests need not be run as part of your regular deployment pipeline. Your regular pipeline is based on the rhythm of changes to your code, but these tests need to be based on the rhythm of changes to the external service. Often running just once a day is plenty.

A failure in an integration contract test shouldn't necessarily break the build in the same way that a normal test failure would. It should, however, trigger a task to get things consistent again. This may involve updating the tests and code to bring them back into consistency with the external service. Just as likely it will trigger a conversation with the keepers of the external service to talk about the change and alert them to how their changes are affecting other applications.

This communication with the external service supplier is even more important if this service is being used as part of a production application. In these cases an integration contract change may break a production application, triggering an emergency fix and an urgent conversation with the supplier team.

To reduce the chances of unexpected breaks in integration contracts, it's useful to move to a Consumer Driven Contracts approach. You can facilitate this by letting the supplier team have copies of your integration contract tests so they can run them as part of their build pipeline.

When testing an external service like this, it's usually best to do so against a test instance of the external service. Occasionally you'll have no choice but to hit the production instance. At that point you'll need to talk to the suppliers to let them know what's happening and be extra careful with what the tests do.

Integration contract tests check the contract of external service calls, but not necessarily the exact data. Often a stub will snapshot a response as at a particular date, since the format of the data matters rather than the actual data. In this case the integration contract test needs to check that the format is the same, even if the actual data has changed.
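As a rough sketch of what "same format, different data" can mean in practice, the following reduces a JSON-like response to its structure and compares the stub's snapshot with a reply from the service. The function names and the sample responses are made up for illustration, and describing a list by its first element is a deliberate simplification.

```python
def shape(value):
    """Reduce a JSON-like value to its structure: keys and type names, no data."""
    if isinstance(value, dict):
        return {key: shape(item) for key, item in value.items()}
    if isinstance(value, list):
        # Simplification: describe a list by the shape of its first element only.
        return [shape(item) for item in value[:1]]
    return type(value).__name__


def same_format(stub_response, service_response):
    """True when both responses have the same structure, whatever the data."""
    return shape(stub_response) == shape(service_response)


# Hypothetical snapshot held by the test double, and two possible live replies.
stub_snapshot = {"id": 17, "name": "Rex", "tags": ["dog"]}
live_new_data = {"id": 99, "name": "Tom", "tags": ["cat", "indoor"]}
live_new_contract = {"id": "99", "name": "Tom", "tags": ["cat"]}

print(same_format(stub_snapshot, live_new_data))      # data changed, format did not
print(same_format(stub_snapshot, live_new_contract))  # "id" became a string
```

In a real contract test the live reply would come from a call to the test instance of the service rather than from a literal.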

One of the best ways to build these test doubles is to use a SelfInitializingFake.

ReproducibleBuild design 30 November 2010 Reactions

One of the prevailing assumptions that fans of Continuous Integration have is that builds should be reproducible. By this we mean that at any point you should be able to take some older version of the system that you are working on and build it from source in exactly the same way as you did then.

This isn't called out as a key practice in the sources I usually refer to on the build process. I think that's because it's an underlying assumption - one that's considered so obvious there's no need to explain it.

One of the driving reasons to have reproducible builds is to ensure we can deal with problems in past releases that are still used. If we released software to a customer a year ago, and they now report a serious bug with it, it's important to be able to recreate that software so that we can deliver a fix.

But let's assume a case where you're releasing software every week to a hosted environment. Let's also assume you have a solid Continuous Delivery process and are thus confident that you can promulgate a bug fix by either waiting until the next release or (if really critical) doing an early release. Do you then still need reproducible builds?

In a scenario where you: receive a bug report, reproduce the bug on head, fix it on head and either wait or immediately release - then you don't. But there are cases when it's still very handy to have reproducible builds.

What happens when you get a bug report and you can't reproduce it? Do you just declare it fixed and move on? I wouldn't be happy with that response. Firstly I'd want to be sure I really understood the bug - so I'd want to check out the released version of the software, build it, and ensure I could then reproduce it. To be confident in reproducing the bug, I'd need to reproduce the build. Furthermore even if I'm confident that the bug got fixed en passant during recent development, I'd still argue that there's at least one test missing. I'd want to write that test and verify that it passes now and fails against the released build.

Another case is a regression. A customer contacts you and says there's a bug now that wasn't there before. Such bugs can hide a long time before they wake up and wave their feelers at you. Maybe it only occurs when the first of the month falls on a Monday. Either way you now have software that you think worked two months ago but now has a bug.

Here having reproducible builds gives you the ability to use DiffDebugging. Your customer is pretty sure that you didn't have this problem two months ago. That was build 20000; you're now on build 28000. So you check out build 20000 and look to see if the bug is there. It isn't, so you try build 24000; not there either, so next is 26000. Before long you know the bug first appeared with revision 26543 (modern version control systems have features to help you do this). Now you look at the diffs between revision 26543 and its parent - often this approach makes it much easier to find a bug.
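The search described above is a binary search over builds, which is what tools such as `git bisect` automate. Here is a minimal sketch of the idea, assuming the bug stays present in every build after the one that introduced it. The function name is invented, and `has_bug` is a stand-in for "check out that build, rebuild it, and see whether the bug shows up".

```python
def first_bad_build(good, bad, has_bug):
    """Return the earliest build in (good, bad] for which has_bug is true.

    Assumes `good` does not show the bug, `bad` does, and that the bug
    stays present in every build after the one that introduced it.
    """
    while bad - good > 1:
        middle = (good + bad) // 2
        if has_bug(middle):
            bad = middle   # bug already present: look earlier
        else:
            good = middle  # bug not yet present: look later
    return bad


def has_bug(build):
    # Stand-in for reproducing the build and testing it; the bug is assumed
    # to have arrived in build 26543, as in the example in the text.
    return build >= 26543


print(first_bad_build(20000, 28000, has_bug))
```

Each probe halves the range, so 8000 candidate builds need about a dozen reproduced builds rather than thousands.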

FeatureToggle design 29 October 2010 Reactions

One of the most common arguments in favor of FeatureBranch is that it provides a mechanism for pending features that take longer than a single release cycle. Imagine you are releasing into production every two weeks, but need to build a feature that's going to take three months to complete. How do you use Continuous Integration to keep everyone working on the mainline without revealing a half-implemented feature on your releases? We run into this issue quite a lot and feature toggles are a handy tool to deal with it.

(I've seen a lot of names for this concept tossed around: feature bits, flags, flippers, switches and the like. There seems no generally accepted name yet.)

The basic idea is to have a configuration file that defines a bunch of toggles for various features you have pending. The running application then uses these toggles in order to decide whether or not to show the new feature.
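A minimal sketch of such a configuration file and the code that reads it might look like this. The INI layout, the `[toggles]` section name, and the toggle names are all assumptions chosen for illustration; any configuration format would do.

```python
import configparser

# Hypothetical toggle file contents; in practice this would be read from disk.
TOGGLE_CONFIG = """
[toggles]
petSurvey = off
newPricing = on
"""


def load_toggles(text):
    """Parse the toggle definitions into a dict of name -> bool."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep toggle names case-sensitive
    parser.read_string(text)
    return {name: parser.getboolean("toggles", name) for name in parser["toggles"]}


def is_enabled(toggles, name):
    """Unknown toggles count as off, so an undefined feature stays hidden."""
    return toggles.get(name, False)


toggles = load_toggles(TOGGLE_CONFIG)
print(is_enabled(toggles, "petSurvey"))
print(is_enabled(toggles, "newPricing"))
```

Treating an undefined toggle as off is a design choice here: it errs on the side of hiding a pending feature rather than revealing it.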

Most of these decisions occur in the user-interface of the application. So if you are building a web application using JSP, you may use a set of JSP tags to surround any user-interface parts of a pending feature.

    <toggle name="petSurvey">
      <p>Take our new <a href = 'petSurvey'>pet survey</a></p>
    </toggle>

The implementation of the toggle tag then just passes through the content if the toggle is set to on, and skips it otherwise. Other UI technologies will use different details, but the basic notion of wrapping the pending elements is the same.

This technique works best when the number of entry points to the new feature is small. There could be many screens in the pet survey feature, but if there's only one link on the home page that gets you there, then that's the only element that needs to be protected with the toggle tag. Therefore it's best to limit the entry points as much as you can, but for some features that will be difficult and you'll need lots of toggle tags.

Some features, such as introducing a new pricing algorithm, may have no user-interface elements. Here the test of the toggle would be in the application code; it could be as crude as a conditional test, or something more sophisticated like a strategy wired through dependency injection.
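To illustrate the strategy variant, here is a small sketch in which the toggle is consulted once, when the pricing strategy is chosen, rather than in a conditional at every call site. The pricing rules, function names, and the `newPricing` toggle are all hypothetical.

```python
def standard_price(total_cents):
    """Current pricing: charge the order total as it stands."""
    return total_cents


def discounted_price(total_cents):
    """Hypothetical pending algorithm: 10% off orders of 100.00 or more."""
    if total_cents >= 10000:
        return total_cents * 90 // 100
    return total_cents


def choose_pricing(toggles):
    """Pick the strategy once; callers never look at the toggle themselves."""
    if toggles.get("newPricing", False):
        return discounted_price
    return standard_price


# The chosen function would be injected into whatever code prices orders.
price_live = choose_pricing({"newPricing": False})
price_pending = choose_pricing({"newPricing": True})
print(price_live(20000), price_pending(20000))
```

Keeping the toggle check in one place also makes it easier to delete when the toggle is retired.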

Most feature toggles I've heard about are set at run-time, but I've also seen cases set at build time. The advantage of a build time toggle is that none of the new feature's code gets compiled into the released executable, although that's rarely much of an advantage. Run-time toggles make it easier to set up your build pipelines and to run tests with various configurations of features. They also facilitate canary releasing and A/B testing, and make it easier to roll back should a new feature misbehave in production.

One danger with feature toggles is an accidental exposure, when someone forgets to wrap the UI feature in a toggle tag. This is awkward to test, since it's difficult to form a test that nothing that should be hidden is visible without calling out the individual elements - which are likely to be forgotten at the same time.

A common question we hear about feature toggles concerns testing - does using feature toggles mean a combinatorial explosion of tests? I think this concern is a red herring. Using feature toggles doesn't mean you have to do any more testing than you need to do with feature branches, it just makes it easier to run the alternatives. The key is to focus testing on the combination of pending features that you expect to deploy together.
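A quick sketch of the difference in scale, using three invented toggle names: the full set of combinations doubles with every toggle, while the set you actually plan to deploy can stay at a handful.

```python
from itertools import product

# Hypothetical pending features.
pending = ["petSurvey", "newPricing", "fastCheckout"]

# Every possible on/off combination: 2 ** len(pending) configurations.
all_combinations = [
    dict(zip(pending, values))
    for values in product([False, True], repeat=len(pending))
]

# The configurations we expect to deploy (an assumption for this example):
configs_to_test = [
    # what is live today, with every pending feature off
    {"petSurvey": False, "newPricing": False, "fastCheckout": False},
    # what we expect to switch on in the next release
    {"petSurvey": True, "newPricing": True, "fastCheckout": False},
]

print(len(all_combinations), len(configs_to_test))
```

Each entry in `configs_to_test` would drive one run of the test suite with those toggle settings.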

It's very important to retire the toggles once the pending features have bedded down in production. This involves removing the definitions from the configuration file and all the code that uses them. Otherwise you will get a pile of toggles that nobody can remember how to use. In one memorable example I heard of, it required making a special recompilation of the Linux kernel to handle enough command line switches.

In some cases you can use the basic thinking behind feature toggles, without the actual toggles, by implementing the entry UI elements of a feature last. A danger of this is lack of end-to-end testing, which you can deal with by using subcutaneous testing, or a backdoor into a UI.

Feature toggles can be used for permanent variable configuration too, such as different versions of a software for different contexts. This is a different usage to handling pending features but most of the implementation is the same. If you use feature toggles for other scenarios too, it's wise to clearly separate the pending feature case from the permanent cases.

While feature toggles are a valuable tool in the box, they are a second-best option. The best thing to do with such features is to find a way to gradually release them into production as you are building them. This gradual release gets you a return on investment earlier as well as superior feedback. I would always look for a way to do that, and only use feature toggles when I can't find such an option. Most people don't try hard enough to gradually release new capabilities.

To probe further...

(Thanks to Kent Beck and Christian Gruber for tweets that reminded me of points I forgot to include.)

BlueGreenDeployment design 1 March 2010 Reactions

One of the goals that my colleagues and I urge on our clients is that of a completely automated deployment process. Automating your deployment helps reduce the frictions and delays that crop up in between getting the software "done" and getting it to realize its value. Dave Farley and Jez Humble are finishing up a book on this topic - Continuous Delivery. It builds upon many of the ideas that are commonly associated with Continuous Integration, driving more towards this ability to rapidly put software into production and get it doing something. Their section on blue-green deployment caught my eye as one of those techniques that's underused, so I thought I'd give a brief overview of it here.

One of the challenges with automating deployment is the cut-over itself, taking software from the final stage of testing to live production. You usually need to do this quickly in order to minimize downtime. The blue-green deployment approach does this by ensuring you have two production environments, as identical as possible. At any time one of them, let's say blue for the example, is live. As you prepare a new release of your software you do your final stage of testing in the green environment. Once the software is working in the green environment, you switch the router so that all incoming requests go to the green environment - the blue one is now idle.

Blue-green deployment also gives you a rapid way to roll back - if anything goes wrong you switch the router back to your blue environment. There's still the issue of dealing with missed transactions while the green environment was live, but depending on your design you may be able to feed transactions to both environments in such a way as to keep the blue environment as a backup when the green is live. Or you may be able to put the application in read-only mode before cut-over, run it for a while in read-only mode, and then switch it to read-write mode. That may be enough to flush out many outstanding issues.
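The cut-over and rollback can be pictured with a toy router. This is only an illustration of the switching idea, with an invented `Router` class and environments modelled as plain functions; a real setup would reconfigure a load balancer or similar.

```python
class Router:
    """Sends every request to whichever of two environments is live."""

    def __init__(self, environments, live):
        self.environments = environments  # name -> request handler
        self.live = live

    def idle(self):
        """Name of the environment not currently receiving traffic."""
        return next(name for name in self.environments if name != self.live)

    def handle(self, request):
        return self.environments[self.live](request)

    def switch(self):
        """Cut over to the idle environment; calling it again rolls back."""
        self.live = self.idle()


router = Router(
    {
        "blue": lambda request: "v1 handled " + request,   # current release
        "green": lambda request: "v2 handled " + request,  # new release
    },
    live="blue",
)

print(router.handle("order"))
router.switch()   # cut over to green
print(router.handle("order"))
router.switch()   # something went wrong: roll back to blue
print(router.handle("order"))
```

Because rollback is the same operation as cut-over, it is exercised on every release rather than only in an emergency.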

The two environments need to be different but as identical as possible. In some situations they can be different pieces of hardware, or they can be different virtual machines running on the same (or different) hardware. They can also be a single operating environment partitioned into separate zones with separate IP addresses for the two slices.

An advantage of this approach is that it's the same basic mechanism as you need to get a hot-standby working. Hence this allows you to test your disaster-recovery procedure on every release. (I hope that you release more frequently than you have a disaster.)

The fundamental idea is to have two environments that are easy to switch between; there are plenty of ways to vary the details. One project did the switch by bouncing the web server rather than working on the router. Another variation would be to use the same database, making the blue-green switches for web and domain layers.

This technique has been "out there" for ages, but I don't see it used as often as it should be. Some foggy combination of Dan North and Jez Humble came up with the name.

TechnicalDebtQuadrant design 14 October 2009 Reactions

There have been a few posts over the last couple of months about TechnicalDebt that have raised the question of what kinds of design flaws should or shouldn't be classified as Technical Debt.

A good example of this is Uncle Bob's post saying a mess is not a debt. His argument is that messy code, produced by people who are ignorant of good design practices, shouldn't be a debt. Technical Debt should be reserved for cases when people have made a considered decision to adopt a design strategy that isn't sustainable in the longer term, but yields a short term benefit, such as making a release. The point is that the debt yields value sooner, but needs to be paid off as soon as possible.

To my mind, the question of whether a design flaw is or isn't debt is the wrong question. Technical Debt is a metaphor, so the real question is whether or not the debt metaphor is helpful in thinking about how to deal with design problems, and how to communicate that thinking. A particular benefit of the debt metaphor is that it's very handy for communicating to non-technical people.

I think that the debt metaphor works well in both cases - the difference is in the nature of the debt. A mess is a reckless debt which results in crippling interest payments or a long period of paying down the principal. We have a few projects where we've taken over a code base with a high debt and found the metaphor very useful in discussing with client management how to deal with it.

The debt metaphor reminds us about the choices we can make with design flaws. The prudent debt to reach a release may not be worth paying down if the interest payments are sufficiently small - such as if it were in a rarely touched part of the code-base.

So the useful distinction isn't between debt or non-debt, but between prudent and reckless debt.

There's another interesting distinction in the example I just outlined. Not only is there a difference between prudent and reckless debt, there's also a difference between deliberate and inadvertent debt. The prudent debt example is deliberate because the team knows they are taking on a debt, and thus puts some thought into whether the payoff for an earlier release is greater than the costs of paying it off. A team ignorant of design practices is taking on its reckless debt without even realizing how much hock it's getting into.

Reckless debt may not be inadvertent. A team may know about good design practices, even be capable of practicing them, but decide to go "quick and dirty" because they think they can't afford the time required to write clean code. I agree with Uncle Bob that this is usually a reckless debt, because people underestimate where the DesignPayoffLine is. The whole point of good design and clean code is to make you go faster - if it didn't people like Uncle Bob, Kent Beck, and Ward Cunningham wouldn't be spending time talking about it.

Dividing debt into reckless/prudent and deliberate/inadvertent implies a quadrant, and I've only discussed three cells. So is there such a thing as prudent-inadvertent debt? Although such a thing sounds odd, I believe that there is - and it's not just common but inevitable for teams that are excellent designers.

I was chatting with a colleague recently about a project he'd just rolled off from. The project delivered valuable software, the client was happy, and the code was clean. But he wasn't happy with the code. He felt the team had done a good job, but now they realize what the design ought to have been.

I hear this all the time from the best developers. The point is that while you're programming, you are learning. It's often the case that it can take a year of programming on a project before you understand what the best design approach should have been. Perhaps one should plan projects to spend a year building a system that you throw away and rebuild, as Fred Brooks suggested, but that's a tricky plan to sell. Instead what you find is that the moment you realize what the design should have been, you also realize that you have an inadvertent debt. This is the kind of debt that Ward talked about in his video.

The decision of paying the interest versus paying down the principal still applies, so the metaphor is still helpful for this case. However a problem with using the debt metaphor for this is that I can't conceive of a parallel with taking on a prudent-inadvertent financial debt. As a result I would think it would be difficult to explain to managers why this debt appeared. My view is this kind of debt is inevitable and thus should be expected. Even the best teams will have debt to deal with as a project goes on - even more reason not to recklessly overload it with crummy code.