Design bliki
BlueGreenDeployment
design, 1 March 2010
One of the goals that my colleagues and I urge on our clients is
that of a completely automated deployment process. Automating your
deployment helps reduce the frictions and delays that crop up in
between getting the software "done" and getting it to realize its
value. Dave Farley and Jez Humble are finishing up a book on this topic
- Continuous Delivery. It builds upon many of the ideas that are commonly
associated with Continuous Integration, driving more towards this ability
to rapidly put software into production and get it doing something. Their
section on blue-green deployment caught my eye as one of those techniques
that's underused, so I thought I'd give a brief overview of it here.

One of the challenges with automating deployment is the cut-over
itself, taking software from the final stage of testing to live
production. You usually need to do this quickly in order to minimize
downtime. The blue-green deployment approach does this by ensuring
you have two production environments, as identical as possible. At
any time one of them, let's say blue for the example, is live. As you
prepare a new release of your software you do your final stage of
testing in the green environment. Once the software is working in the
green environment, you switch the router so that all incoming
requests go to the green environment - the blue one is now idle.

Blue-green deployment also gives you a rapid way to roll back - if
anything goes wrong you switch the router back to your blue
environment. There's still the issue of dealing with missed
transactions while the green environment was live, but depending on
your design you may be able to feed transactions to both
environments in such a way as to keep the blue environment as a
backup when the green is live. Or you may be able to put the application
in read-only mode before cut-over, run it for a while in read-only
mode, and then switch it to read-write mode. That may be enough to
flush out many outstanding issues.

The two environments need to be different but as identical as
possible. In some situations they can be different pieces of
hardware, or they can be different virtual machines running on the
same (or different) hardware. They can also be a single operating
environment partitioned into separate zones with separate IP
addresses for the two slices.

An advantage of this approach is that it's the same basic
mechanism as you need to get a hot-standby working. Hence this
allows you to test your disaster-recovery procedure on every
release. (I hope that you release more frequently than you have a
disaster.)

The fundamental idea is to have two environments that are easy to
switch between; there are plenty of ways to vary the
details. One project did the switch by bouncing the web server
rather than working on the router. Another variation would be to use
the same database, making the blue-green switches for web and domain
layers.

This technique has been "out there" for ages, but I don't see it
used as often as it should be.

Some foggy combination of Dan North and Jez Humble came up with the
name.
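The cut-over and rollback described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a deployment tool: the `DeploymentEnvironment` and `Router` classes, their members, and the version strings are all hypothetical names invented for the example, and a real router would be network infrastructure rather than an in-process object.

```csharp
using System;

// Hypothetical model of one of the two production environments.
public sealed class DeploymentEnvironment
{
    public string Name { get; }
    public string Version { get; private set; }

    public DeploymentEnvironment(string name, string version)
    {
        Name = name;
        Version = version;
    }

    // Installing a release only touches this environment, never live traffic.
    public void Deploy(string version) => Version = version;

    public string Handle(string request) => $"{Name} ({Version}) handled {request}";
}

// Hypothetical router that sends every request to whichever environment is live.
public sealed class Router
{
    public DeploymentEnvironment Live { get; private set; }
    public DeploymentEnvironment Idle { get; private set; }

    public Router(DeploymentEnvironment live, DeploymentEnvironment idle)
    {
        Live = live;
        Idle = idle;
    }

    public string Route(string request) => Live.Handle(request);

    // Cut-over and rollback are the same operation: swap live and idle.
    public void Switch() => (Live, Idle) = (Idle, Live);
}

public static class BlueGreenDemo
{
    public static void Main()
    {
        var blue = new DeploymentEnvironment("blue", "1.0");
        var green = new DeploymentEnvironment("green", "1.0");
        var router = new Router(live: blue, idle: green);

        green.Deploy("1.1");                        // new release goes to the idle environment
        Console.WriteLine(router.Route("GET /"));   // still served by blue
        router.Switch();                            // cut-over
        Console.WriteLine(router.Route("GET /"));   // now served by green
        router.Switch();                            // rollback if anything goes wrong
        Console.WriteLine(router.Route("GET /"));   // blue again
    }
}
```

The sketch deliberately leaves out the hard part mentioned above: transactions handled by green while it was live are not replayed into blue on rollback.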

TechnicalDebtQuadrant
design, 14 October 2009
There have been a few posts over the last couple of months about
TechnicalDebt that have raised the question of what kinds of design
flaws should or shouldn't be classified as Technical Debt.

A good example of this is Uncle Bob's post saying a
mess is not a debt. His argument is that messy code, produced by
people who are ignorant of good design practices, shouldn't be a
debt. Technical Debt should be reserved for cases when people have
made a considered decision to adopt a design strategy that isn't
sustainable in the longer term, but yields a short term benefit,
such as making a release. The point is that the debt yields value
sooner, but needs to be paid off as soon as possible.

To my mind, the question of whether a design flaw is or isn't
debt is the wrong question. Technical Debt is a metaphor, so the
real question is whether or not the debt metaphor is helpful in
thinking about how to deal with design problems, and how to
communicate that thinking. A particular benefit of the debt metaphor
is that it's very handy for communicating to non-technical people.

I think that the debt metaphor works well in both cases - the
difference is in the nature of the debt. A mess is a reckless debt which
results in crippling interest payments or a long period of paying
down the principal. We have a few projects where we've taken over a
code base with a high debt and found the metaphor very useful in
discussing with client management how to deal with it.

The debt metaphor reminds us about the choices we can make with
design flaws. The prudent debt to reach a release may not be
worth paying down if the interest payments are sufficiently small -
such as if it were in a rarely touched part of the code-base.

So the useful distinction isn't between debt and non-debt, but
between prudent and reckless debt.

There's another interesting distinction in the example I just
outlined. Not only is there a difference between prudent and
reckless debt, there's also a difference between deliberate and
inadvertent debt. The prudent debt example is deliberate because the
team knows they are taking on a debt, and thus puts some thought into
whether the payoff for an earlier release is greater than the
costs of paying it off. A team ignorant of design practices is
taking on its reckless debt without even realizing how much hock
it's getting into.

Reckless debt may not be inadvertent. A team may know about good
design practices, even be capable of practicing them, but decide to
go "quick and dirty" because they think they can't afford the time
required to write clean code. I agree with Uncle Bob that this is
usually a reckless debt, because people underestimate where the
DesignPayoffLine is. The whole point of good design and
clean code is to make you go faster - if it didn't people like Uncle
Bob, Kent Beck, and Ward Cunningham wouldn't be spending time
talking about it.

Dividing debt into reckless/prudent and deliberate/inadvertent
implies a quadrant, and I've only discussed three cells. So is there
such a thing as prudent-inadvertent debt? Although such a thing
sounds odd, I believe that there is - and it's not just common but
inevitable for teams that are excellent designers.

I was chatting with a colleague recently about a project he'd
just rolled off from. The project had delivered valuable software,
the client was happy, and the code was clean. But he wasn't happy
with the code. He felt the team had done a good job, but now they
realize what the design ought to have been.

I hear this all the time from the best developers. The point is
that while you're programming, you are learning. It's often the case
that it can take a year of programming on a project before you
understand what the best design approach should have been. Perhaps
one should plan projects to spend a year building a system that you
throw away and rebuild, as Fred Brooks suggested, but that's a
tricky plan to sell. Instead what you find is that the moment you
realize what the design should have been, you also realize that you
have an inadvertent debt. This is the kind of debt that Ward talked
about in his video.

The decision of paying the interest versus paying down the
principal still applies, so the metaphor is still helpful for this
case. However a problem with using the debt metaphor for this is
that I can't conceive of a parallel with taking on a
prudent-inadvertent financial debt. As a result I would think it
would be difficult to explain to managers why this debt appeared. My
view is this kind of debt is inevitable and thus should be
expected. Even the best teams will have debt to deal with as a
project goes on - even more reason not to recklessly overload it
with crummy code. 

FeatureBranch
design, 3 September 2009
With the rise of Distributed Version Control Systems (DVCS) such
as git and Mercurial, I've seen more conversations about strategies
for branching and merging and how they fit in with Continuous
Integration (CI). There's a bit of confusion here, particularly
on the practice of feature branching and how it fits in with CI.
Simple (isolated) Feature Branch
The basic idea of a feature branch is that when you start work on
a feature (or story if you prefer that term) you take a branch of
the repository to work on that feature. In a DVCS, you'll do this
in your personal repository, but the same kind of thing works in a
centralized VCS too.

I'm going to illustrate this with a series of diagrams. I have a
shared project mainline, colored blue, and two developers, colored
purple and green (since the developers' names are Reverend Green and
Professor Plum).

I'm using labeled colored boxes (eg P1 and P2) to represent
local commits on the branch. Arrows between branches represent
merges; the merge boxes are colored orange to make them stand
out. In this case there are updates, say a couple of bug-fixes,
applied to the mainline (presumably by Mrs Peacock). When these
happen our developers merge them into their work. To give this a
sense of time, I'll assume we're looking at a few days' work here,
with each developer committing to their local branch roughly once a day.

In order to ensure things are working properly, they can run
builds and tests on their branch. Indeed for this article I'll
assume that each commit and merge comes with an automated build and
test on the branch it's on.

The advantage of feature branching is that each developer can
work on their own feature and be isolated from changes going on
elsewhere. They can pull in changes from the mainline at their own
pace, ensuring they don't break the flow of their
feature. Furthermore it allows the team to choose its features for
release. If Reverend Green takes too long, we can release with just
Professor Plum's changes. Or we may want to delay Professor Plum's
feature, perhaps because we are uncertain that the feature works the
way we want to release it. In this case we just tell the professor
to not merge his changes into mainline until we are ready for the
feature. This is called cherry-picking: the team decides
which features to merge in before release.

Attractive though that picture looks, there can be trouble
ahead.

Although our developers can develop their features in isolation,
at some point their work does have to be integrated. In this case
Professor Plum easily updates the mainline with his own
changes. There's no merge here because he's already incorporated the
mainline changes into his own branch (there will be a build). Things
are, however, not so simple for Reverend Green: he needs to merge all
of his changes (G1-6) with all of Professor Plum's (P1-5).

(At this point many users of DVCSs may feel I'm missing
something as this is a simple, perhaps simplistic view of feature
branching. I'll get to a more involved scheme later.)

I've made this a big merge box as it's a scary merge. It may be
just fine: the developers may have been working on completely
separate parts of the code base with no interaction, in which case
the merge will go smoothly. But they may be working on bits that do
interact, in which case here be dragons.

The dragons can come in many forms, and tooling can help slay
some of them. The most obvious dragon is the complexity of
merging the source code and dealing with conflicts as developers
edit the same files. Modern DVCSs actually handle this rather well,
indeed somewhat magically. Git has quite the reputation for dealing
with complicated merges. So much so that the textual issues of
merging are much better than they used to be - indeed I'll go so far
as to discount textual conflicts for the purposes of this
article.

The problem I worry more about is a semantic conflict. A simple
example of this is when Professor Plum changes the name of a method
that Reverend Green's code calls. Refactoring tools allow you to
rename a method safely, but only on your code base. So if G1-6
contain new code that calls foo, Professor Plum can't tell in his
code base as he doesn't have it. You only find out on the big merge.

A function rename is a relatively obvious case of a semantic
conflict. In practice they can be much more subtle. Tests are the
key to discovering them, but the more code there is to merge the
more likely you'll have conflicts and the harder it is to fix
them. It's the risk of conflicts, particularly semantic conflicts,
that makes big merges scary.

This fear of big merges also acts as a deterrent to
refactoring. Keeping code clean is a constant effort; to do it well
requires everyone to keep an eye out for cruft and fix it wherever
they see it. However this kind of refactoring on a feature branch is
awkward because it makes the Big Scary Merge worse. The result we
see is that teams using feature branches shy away from refactoring
which leads to uglier code bases.
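To make the subtler kind of semantic conflict concrete, here is a small sketch of what the merged code base might look like. Everything in it is an assumption made for illustration: the `Pricing` and `Invoice` classes, the discount rule, and the idea that one branch changed a method from returning a percentage to returning a fraction are all hypothetical. The merge is textually clean and the code compiles, yet the result is wrong, which only a test reveals.

```csharp
using System;

public static class Pricing
{
    // Professor Plum's branch (hypothetical): this used to return a percentage
    // such as 10. He changed it to return a fraction such as 0.10 and updated
    // every caller that existed in his own code base.
    public static decimal LoyaltyDiscount(int nights) => nights >= 5 ? 0.10m : 0m;
}

public static class Invoice
{
    // Reverend Green's branch (hypothetical): new code written against the old
    // meaning, where the discount was a percentage. No textual conflict arises
    // because the two developers edited different files.
    public static decimal Total(decimal price, int nights) =>
        price * (100m - Pricing.LoyaltyDiscount(nights)) / 100m;
}

public static class SemanticConflictDemo
{
    public static void Main()
    {
        decimal expected = 900m;   // 10% off a price of 1000
        decimal actual = Invoice.Total(1000m, nights: 5);
        Console.WriteLine(actual == expected
            ? "test passed"
            : $"test failed: expected {expected}, got {actual}");
    }
}
```

With six commits on each side to search through, tracking a failure like this back to its cause is much harder than it would be after merging one or two.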
Continuous Integration
It's these problems that Continuous Integration was designed to
solve. With Continuous Integration my diagram looks like this.

There's a lot more merging going on here, but merging is one of
those things that's much easier to do frequently and small rather
than rarely and large. As a result if Professor Plum is changing
some code that Reverend Green relies on, the Reverend will find it
early, such as when he merges in P1-2. At that point he's only got
to modify G1-2 to work with the changes, rather than G1-6.

CI is effective at removing the problem of big merges, but it's
also a vital communication mechanism. In this scenario the potential
conflict will actually appear when Professor Plum merges G1 and
realizes that Reverend Green is actively building on Plum's
libraries. At this point Professor Plum can go and find Reverend
Green and they can discuss how their two features interact. It may
be that Professor Plum's feature requires some changes that don't
mesh well with Reverend Green's changes. By looking at both their
features they can come up with a better design that affects both
their work-streams. With the isolated feature branches our
developers don't discover this till late, probably too late to do
much about it. Communication is one of the key factors in software
development and one of CI's most important features is that it
facilitates human communication.

It's important to note that, most of the time, feature branching
like this is a different approach to CI. One of the principles of CI
is that everyone commits to the mainline every day. So unless
feature branches only last less than a day, running a feature branch
is a different animal to CI. I've heard people say they are doing CI
because they are running builds, perhaps using a CI server, on every
branch with every commit. That's continuous building, and a Good
Thing, but there's no integration, so it's not CI.
Promiscuous Integration
Earlier I said parenthetically that there are other ways of doing
feature branching. Say Professor Plum and Reverend Green take tea
together early in the cycle. While chatting they discover they are
working on features that interact. At this point they may choose to
integrate with each other directly, like this.

With this approach they only push to the mainline at the end, as
before. But they merge frequently with each other, so this avoids
the Big Scary Merge. The point here is that the primary issue with
the isolated feature branching scheme is its isolation. When you
isolate the feature branches, there is a risk of a nasty conflict
growing without you realizing it. Then the isolation is an illusion,
and will be shattered painfully sooner or later.

So is this more ad-hoc integration a form of CI or a different
animal entirely? I think it is a different animal; again, a key point
of CI is that everyone integrates to the mainline every
day. Integrating across feature branches, which I shall call
promiscuous integration (PI), doesn't involve or even need a
mainline. I think this difference is important.
I see CI as primarily giving birth to
a release candidate at each commit. The job of the CI system and
deployment process is to disprove the production-readiness of a
release candidate. This model relies on the need to have some
mainline that represents the current shared, most up to date
picture of complete.
--Dave Farley
Promiscuous Integration vs Continuous Integration
So if it's different, is PI better than CI, or, more
realistically, under what circumstances is PI better than CI?

With CI, you lose the ability to use the VCS to do cherry
picking. Every developer is touching mainline, so all features grow
in the mainline. With CI, the mainline must always be healthy, so in
theory (and often in practice) you can safely release after any
commit. Having a half built feature or a feature you'd rather not
release yet won't damage the other functionality of the software,
but may require some masking if you don't want it to be visible in
the user-interface. This can be as simple as not including a menu
item in the UI to trigger the feature.

PI can provide some middle ground here. It allows Reverend Green
the choice of when to incorporate Professor Plum's changes. If
Professor Plum makes some core API changes in P2, then Reverend
Green can import P1-2 but leave the others until Professor Plum's
feature is put onto the release.

One worry with all this picking and choosing is that PI makes it
really hard to keep track of who has what in their branch. In
practice, it seems tooling pretty much solves this problem. DVCSs
keep a clear track of changes and their origins and can figure out
that when Professor Plum pulls G3 he already has G2 but doesn't have
B2. I may have made mistakes drawing the diagram by hand, but tools
do keep track of these things well.

On the whole, however, I don't think cherry-picking with the VCS
is a good idea.
Feature Branching is a poor man's
modular architecture, instead of building systems with the ability
to easy swap in and out features at runtime/deploytime they couple
themselves to the source control providing this mechanism through
manual merging.
--Dan Bodart
I much prefer designing the software in such a way that makes it
easy to enable or disable features through configuration changes. My
colleague Paul Hammant calls this Branch by Abstraction. This requires
you to put some thought into what
needs to be modularized and how to control that variation, but we've
found the result to be far less messy than relying on the VCS.

The main thing that makes me nervous about PI is the influence on
human communication. With CI the mainline acts as a communication
point. Even if Professor Plum and Reverend Green never talk, they
will discover the nascent conflict - within a day of it
forming. With PI they have to notice they are working on interacting
code. An up-to-date mainline also makes it easy for someone to be
sure they are integrating with everyone; they don't have to poke
around to find out who is doing what - so less chance of some
changes being hidden until a late integration.

PI arose out of open-source work, and it could be that the less
intensive tempo of open-source work is a factor here. In a full time
job, you work several hours a day on a project. This makes it easier
for features to be worked in priority. With an open source project
people often put in an hour here, and the next hour a few days later. A feature
may take one developer quite a while to complete while other
developers with more time are able to get features into a releasable
state earlier. In this situation cherry picking can be more
important.

It's important to realize that the tools you use are largely
independent of the integration strategy you use. Although many
people associate DVCSs with feature branching, they can be used with
CI. All you need to do is mark one branch on one repository as the
mainline. If everyone pulls and pushes to that every day, then you
have a CI mainline. Indeed with a disciplined team, I would usually
prefer to use a DVCS on a CI project rather than a centralized one. With a
less disciplined team I would worry that a DVCS would nudge people
towards long lived branches, while a centralized VCS and a
reluctance to branch nudge them towards frequent mainline
commits.

Paul Hammant may be right: "I wonder though, if a team
should not be adept with trunk-based development before they move to
distributed."
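Returning to the earlier preference for enabling or disabling features through configuration rather than cherry-picking with the VCS, a minimal sketch of that idea might look like the following. The `FeatureToggles` and `Menu` classes, the feature name `loyalty-points`, and the menu items are hypothetical, and a real system would most likely load the enabled set from a configuration file instead of passing it in code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toggle registry; the enabled set stands in for configuration.
public sealed class FeatureToggles
{
    private readonly HashSet<string> enabled;

    public FeatureToggles(IEnumerable<string> enabledFeatures)
    {
        enabled = new HashSet<string>(enabledFeatures, StringComparer.OrdinalIgnoreCase);
    }

    public bool IsEnabled(string feature) => enabled.Contains(feature);
}

public static class Menu
{
    // Masks an unfinished feature by leaving its menu item out of the UI,
    // even though the code for it already lives on the mainline.
    public static List<string> Build(FeatureToggles toggles)
    {
        var items = new List<string> { "Search rooms", "My bookings" };
        if (toggles.IsEnabled("loyalty-points"))
        {
            items.Add("Loyalty points");
        }
        return items;
    }
}

public static class FeatureToggleDemo
{
    public static void Main()
    {
        var release = new FeatureToggles(new string[0]);
        var preview = new FeatureToggles(new[] { "loyalty-points" });

        Console.WriteLine(string.Join(", ", Menu.Build(release)));
        Console.WriteLine(string.Join(", ", Menu.Build(preview)));
    }
}
```

Both builds come from the same mainline commit; only the configuration differs, so the choice of which features to release no longer depends on which branches were merged.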

SelfInitializingFake
design, 4 August 2009
One of the classic cases for using a TestDouble is
when you call a remote service. Remote services are usually slow and
often unreliable, so using a double is a good way to make your tests
faster and more stable.

When you're querying a remote service, you need to find a way to
load the expected data into your double. One way to do this is to
use what I'm dubbing a self-initializing fake. The basic plan is
simple. The first time you call the fake it passes the call onto
the actual remote service, and as it returns the data it takes and
saves a copy. Further calls just return the copy. In a sense this is
like a cache, but with the important difference that there is no
attempt to handle cache invalidation, which is handy as that's one
of the TwoHardThings.

I've called this a fake, as that seems the closest fit from the
various varieties of test doubles. The other reasonable alternative
is a stub, but the distinction here is that a stub needs setting up
when you build the fixture, while fakes are autonomous.

The interesting thing about a self-initializing fake is how you deal
with situations where the remote service changes its response.

One time I saw this approach was with a database controlled by
another application. In this case the data did change,
frequently. This is unhelpful for tests, because automated tests
rely on getting the same answers to the same questions. But usually
tests don't care whether the data is up to date or not, so saving an
old value worked just fine.

I ran into this again recently while chatting with my colleague
Josh Price. In his case the remote data was supposedly static, but
occasionally there were changes, which would imply that the system
he was developing needed to change - usually to handle formatting
issues. In this case he had a special test suite that would get all
self-initializing fakes to call the remote service and check that they
returned the same value that was saved.

In this case early stages of their build pipeline ran against the
fake, and the last (slowest) stage ran against the service
itself.

One interesting problem was that the remote service required
some unimportant parameters which changed from call to call but
didn't alter the results. These were stripped out of the URL when
the fake looked the values up from the store.

(Thanks to Josh Price, Darren Cotterill, and Gerard Meszaros for
their help with this piece.)
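The basic plan described above can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: the `IRemoteService` interface, the `RemoteService` stand-in, and the method names are hypothetical, and the saved responses are kept in an in-memory dictionary, whereas a fake used across test runs would need to persist them somewhere such as a file.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface for the remote service being doubled.
public interface IRemoteService
{
    string Fetch(string query);
}

// Stand-in for the slow, unreliable remote service; counts calls for the demo.
public sealed class RemoteService : IRemoteService
{
    public int Calls { get; private set; }

    public string Fetch(string query)
    {
        Calls++;
        return "result for " + query;
    }
}

public sealed class SelfInitializingFake : IRemoteService
{
    private readonly IRemoteService real;
    private readonly Dictionary<string, string> saved = new Dictionary<string, string>();

    public SelfInitializingFake(IRemoteService real)
    {
        this.real = real;
    }

    public string Fetch(string query)
    {
        // The first call for a query goes to the real service and is saved.
        if (!saved.TryGetValue(query, out var response))
        {
            response = real.Fetch(query);
            saved[query] = response;
        }
        // Later calls return the saved copy; nothing is ever invalidated.
        return response;
    }

    // Mirrors the special test suite described above: re-query the real
    // service and report whether every saved answer still matches.
    public bool SavedResponsesStillMatch()
    {
        foreach (var entry in saved)
        {
            if (real.Fetch(entry.Key) != entry.Value) return false;
        }
        return true;
    }
}

public static class SelfInitializingFakeDemo
{
    public static void Main()
    {
        var remote = new RemoteService();
        var fake = new SelfInitializingFake(remote);

        fake.Fetch("room rates");
        fake.Fetch("room rates");
        Console.WriteLine($"remote calls: {remote.Calls}");   // prints "remote calls: 1"
        Console.WriteLine(fake.SavedResponsesStillMatch());
    }
}
```

Stripping unimportant parameters, as in the URL case above, would amount to normalizing `query` before it is used as the dictionary key.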

ComposedRegex
design, 24 July 2009
One of the most powerful tools in writing maintainable code is
to break large methods into well-named smaller methods - a technique
Kent Beck refers to as the Composed Method pattern.
People can read your programs much more quickly and accurately
if they can understand them in detail, then chunk those details
into higher level structures.
--Kent Beck
What works for methods often works for other things as well. One
area that I've run into a couple of times where people fail to do
this is with regular expressions.

Let's say you have a file full of rules for scoring frequent
sleeper points for a hotel chain. The rules all look rather like:
score 400 for 2 nights at Minas Tirith Airport
We need to pull out the points (400), the number of nights (2), and
the hotel name (Minas Tirith Airport) for each of these rows.

This is an obvious task for a regex, and I'm sure right now
you're thinking - oh yes we need:
const string pattern = @"^score\s+(\d+)\s+for\s+(\d+)\s+nights?\s+at\s+(.*)";
Then our three values just pop out of the groups.

I don't know whether or not you're comfortable in understanding
how that regex works and whether it's correct. If you're like me you
have to look at a regex like this and carefully figure out what it's
saying. I often find myself counting parentheses so I can see where
the groups line up (not actually that hard in this case, but I've
seen plenty of others where it's tougher).

You may have read advice to take a pattern like this and to
comment it. (This often needs a switch when you turn it into a regex,
such as RegexOptions.IgnorePatternWhitespace in .NET.)
That way you can write it like this.
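protected override string GetPattern() {
  // Whitespace and # comments inside the pattern are only ignored when
  // the regex is built with RegexOptions.IgnorePatternWhitespace.
  const string pattern =
    @"^score
    \s+
    (\d+) # points
    \s+
    for
    \s+
    (\d+) # number of nights
    \s+
    night
    s? # optional plural
    \s+
    at
    \s+
    (.*) # hotel name
    ";
  return pattern;
}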
This is easier to follow, but comments never quite satisfy
me. Occasionally I've been accused of saying comments are bad, and
that you shouldn't use them. This is wrong, in both senses.
Comments are not bad - but there are often better options. I always
try to write code that doesn't need comments, usually by good
naming and structure. (I can't always succeed, but I feel I do more
often than not.)

People often don't try to structure regexs, but I find it
useful. Here's one way of doing this one.
const string scoreKeyword = @"^score\s+";
const string numberOfPoints = @"(\d+)";
const string forKeyword = @"\s+for\s+";
const string numberOfNights = @"(\d+)";
const string nightsAtKeyword = @"\s+nights?\s+at\s+";
const string hotelName = @"(.*)";
const string pattern = scoreKeyword + numberOfPoints +
forKeyword + numberOfNights + nightsAtKeyword + hotelName;
I've broken down the pattern into logical chunks and put them
together again at the end. I can now look at that final expression
and understand the basic chunks of the expression, diving into the
regex for each one to see the details.

Here's another alternative that seeks to separate the whitespace to
make the actual regexs look more like tokens.
const string space = @"\s+";
const string start = "^";
const string numberOfPoints = @"(\d+)";
const string numberOfNights = @"(\d+)";
const string nightsAtKeyword = @"nights?\s+at";
const string hotelName = @"(.*)";
const string pattern = start + "score" + space + numberOfPoints + space +
"for" + space + numberOfNights + space + nightsAtKeyword +
space + hotelName;
I find this makes the individual tokens a bit clearer, but all
those space variables make the overall structure harder to
follow. So I prefer the previous one.

But this does raise a question. All of the elements are separated
by space, and putting in lots of space variables or \s+
in the patterns feels wet. The nice thing about breaking out the
regexs into sub-strings is that I can now use the programming logic
to come up with abstractions that suit my particular purpose
better. I can write a method that will take sub strings and join
them up with whitespace.
private String composePattern(params String[] arg) {
return "^" + String.Join(@"\s+", arg);
}
Using this method, I then have.
const string numberOfPoints = @"(\d+)";
const string numberOfNights = @"(\d+)";
const string hotelName = @"(.*)";
string pattern = composePattern("score", numberOfPoints,
"for", numberOfNights, "nights?", "at", hotelName);
You may not use any of these exact alternatives yourself, but I
do urge you to think about how to make regular expressions
clearer. Code should not need to be figured out; it should just be
read.
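As a usage sketch, the composed pattern can be applied to the sample rule to pull the three values out of the groups. The `SleeperRuleParser` class, its `Parse` method, and the tuple it returns are hypothetical names introduced here; the fragments and the joining helper follow the snippets above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SleeperRuleParser
{
    // Joins the fragments with whitespace, as in composePattern above.
    private static string ComposePattern(params string[] parts) =>
        "^" + string.Join(@"\s+", parts);

    public static (int Points, int Nights, string Hotel) Parse(string rule)
    {
        const string numberOfPoints = @"(\d+)";
        const string numberOfNights = @"(\d+)";
        const string hotelName = @"(.*)";
        string pattern = ComposePattern("score", numberOfPoints,
            "for", numberOfNights, "nights?", "at", hotelName);

        Match match = Regex.Match(rule, pattern);
        if (!match.Success)
        {
            throw new FormatException("Not a scoring rule: " + rule);
        }

        // Groups are numbered in the order the fragments were composed.
        return (int.Parse(match.Groups[1].Value),
                int.Parse(match.Groups[2].Value),
                match.Groups[3].Value);
    }

    public static void Main()
    {
        var rule = Parse("score 400 for 2 nights at Minas Tirith Airport");
        Console.WriteLine($"{rule.Points} points, {rule.Nights} nights, {rule.Hotel}");
        // prints "400 points, 2 nights, Minas Tirith Airport"
    }
}
```

Note that the pattern is an ordinary `string` rather than a `const`, because a C# constant cannot be initialized from a method call.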
Updates
In this discussion I've made the elements for the composed
regexs be local variables. A variation is to take commonly used
regex elements and use them more widely. This can be handy to use
common regexs that are needed in lots of places. My colleague
Carlos Villela comments that one thing to watch out for is if
these fragments are not well-formed, ie having an opening
parenthesis that's closed in another fragment. This can be tricky
to debug. I've not felt the need to do it, so haven't run into
this problem.

A few people mentioned using fluent interfaces (internal DSLs)
as a more readable alternative
to regexs. I see this as a separate thing. Regexs don't bother
me if they are small, indeed I prefer a small regex to an
equivalent fluent interface. It's the composition that counts,
which you can do with either technique.

Some others mentioned named capture groups. Like comments, I
find these are better than the raw regex, but still find a
composed structure more readable. The point of composition is that
it breaks the overall regex into small pieces that are easier to
understand.

RequestStreamMap
design, 1 July 2009
Hang around my colleagues at ThoughtWorks and you soon get the
impression that the only good Enterprise Service Bus (ESB) is a dead
ESB. Jim Webber refers to them as Egregious Spaghetti Boxes. So it's
not uncommon to hear tales of attempts to get them out of systems
that don't need them.

Battle was joined at one client and it brought to mind my younger
days playing D&D. Webber swings but misses as the ESB is AC 2,
Evan gets a hit and rolls 2d8 for 6 damage. Erik finally kills it
by casting "Summon Request Stream Map".

So what was Erik Dörnenburg's
decisive spell? Essentially the idea was to take a simple request
and show how the data for the request and response made their way
through the layers of the application. Erik printed out all the code
that you needed to read to understand how this would work - which
ran to several pages. He also produced this diagram.

It's currently fashionable in agile circles to do Value Stream
Mapping as a way to uncover waste in a software development
process. I think of this as a request stream map because it similarly
takes a request and shows how it moves through the layers allowing
us to visualize what's going on and think about the cost and value of
the layers.

Layering is an essential tool for building software
applications. But like most essential things in life, excess can
be almost as much of a problem as too little. A visualization like
this (or the multiple pages of code) can help you find where "just
enough" is.

One hazard, however. If you do need to transform data from one
form to another, it's usually better to do a few little
transformations than one big transformation. You want to avoid
unnecessary transformations, not compress the ones you need.