encapsulation · language feature · object collaboration design


In programming, the fundamental notion of an object is the bundling of data and behavior. This provides a common data context when writing a set of related functions. It also provides an interface for manipulating the data that allows the object to control access to that data, making it easy to support derived data and prevent invalid modifications of data. Many languages provide explicit syntax to define classes, which act as definitions for objects. But if you have a language with first-class functions and closures, you can use these constructs to create objects using the Function As Object pattern (originally described by Eugene Wallingford).

Here is an example of a simplistic person object, done using the function-as-object style in JavaScript. [1]

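// assumes LocalDate and ChronoUnit have been imported from js-joda
function createPerson(name) {
  let birthday;
  return {
    name: () => name,
    setName: (aString) => name = aString,
    birthday: () => birthday,
    setBirthday: (aLocalDate) => birthday = aLocalDate,
    age: age,
    canTrust: canTrust,
  };
  function age() {
    // the end date was lost from this listing; LocalDate.now() is an assumed stand-in
    return birthday.until(LocalDate.now(), ChronoUnit.YEARS);
  }
  function canTrust() {
    return age() <= 30;
  }
}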

The outer form of a function-as-object is a function, which is called as a constructor function. The result of the call is, in essence, a hashmap of functions [2] which acts as a method selector. This map captures the state of any variables in the function in a closure, allowing the data to persist beyond a single function invocation. This result hashmap can be treated like a classical object.

const kent = createPerson("kent");
const youngEnoughToTrust = kent.canTrust();

Looking at the function-as-object from a classical OO point of view:

A common alternative implementation of this pattern is to return a function as the method selector rather than the hashmap which is the natural method selector in JavaScript. To use a function as the method selector, I'd return a function whose first argument is the name of the method to invoke. The function body then switches on that value (see Wallingford for more on this).
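As a rough sketch of that alternative, here is a cut-down person whose constructor returns a single dispatching function instead of a hashmap. The name `select`, the rest-parameter style, and the choice to throw on an unknown method are illustrative assumptions, not taken from Wallingford's paper.

```javascript
// Function-as-object where the returned function itself is the method selector.
function createPerson(name) {
  // the returned function dispatches on the method name passed as its first argument
  return function select(method, ...args) {
    switch (method) {
      case "name":
        return name;
      case "setName":
        name = args[0];
        return name;
      default:
        throw new Error(`unknown method: ${method}`);
    }
  };
}

const kent = createPerson("kent");
kent("setName", "Kent");
console.log(kent("name")); // prints "Kent"
```

The state still lives in the closure; only the way a caller picks a method has changed.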

The function-as-object approach has been around for a long time. I've seen it described in Lisp many times, and it's been widely used in JavaScript (until ES6, JavaScript had a very limited notion of classes). It's often used as an argument that a specific syntax for classes isn't necessary, which is the equivalent of object-aficionados arguing that you don't need first-class functions when you can write a class with a single "call" method. As a consequence many people in the JavaScript world argue against using the ES6 class syntax. Personally, I like having both first-class functions and first-class classes, and prefer ES6's class syntax.

Further Reading

Eugene Wallingford coined the name "Function as Object" in his 1999 pattern language "Envoy". His paper is worth reading for more details on this, including using a function as the method selector and delegation to support some notion of inheritance. The examples in the paper use Scheme.


Chris Ford, Fred George, James Shore, Kevin Yeung, Lucas Lego, Matteo Vaccari, Rob Miles, and Eugene Wallingford commented on drafts of this post


1: For date handling I'm using js-joda, a port of the Joda-Time library that cleaned up the appalling mess that was Java's date and time handling. I'm glad js-joda is repeating the service of bringing sanity to date and time handling.

2: In JavaScript terminology it's called an object, although it is a JavaScript object, not the classical object that we're trying to create. I'll thus refer to it as a hashmap, to try and reduce the confusion.

3: In ES6 I can use shorthand property names to remove the duplication by replacing "age: age," with "age,".

if you found this article useful, please share it. I appreciate the feedback and encouragement


25 January 2017

delivery · testing


Synthetic monitoring (also called semantic monitoring [1]) runs a subset of an application's automated tests against the live production system on a regular basis. The results are pushed into the monitoring service, which triggers alerts in case of failures. This technique combines automated testing with monitoring in order to detect failing business requirements in production.

In the age of small independent services and frequent deployments it's very difficult to test pre-production with the exact same combination of versions as they will later exist in production. One way to mitigate this problem is to extend testability from pre-production into production environments - the idea behind QA in production. Doing this shifts the mindset from a focus on Mean-Time-Between-Failures (MTBF) towards a focus on Mean-Time-To-Recovery (MTTR).

A technique for this is synthetic monitoring, which we used at a client who is a digital marketplace for cars with millions of classifieds across a dozen countries. They have close to a hundred services in production, each deployed multiple times a day. Tests are run in a ContinuousDelivery pipeline before the service is deployed to production. The dependencies for the integration tests do not use TestDoubles; instead the tests run against components in production.

Here is an example of these tests that's well suited for synthetic monitoring. It impersonates a user adding a classified to her list of favourites. The steps she takes are as follows:

In order to exclude test requests from analytics we add a parameter (such as excluderequests=true) to the URL. The parameter is handed over transitively to all downstream services, each of which suppresses analytics and third party scripts when it is set to true.
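A minimal sketch of that hand-over, using the standard URL API. The helper names and the example hosts are hypothetical; only the excluderequests parameter comes from the text.

```javascript
// Hypothetical helpers: copy the test marker from an incoming request URL
// onto the URL of a downstream call, so the flag travels transitively.
const TEST_PARAM = "excluderequests";

function isSyntheticRequest(incomingUrl) {
  return new URL(incomingUrl).searchParams.get(TEST_PARAM) === "true";
}

function propagateTestFlag(incomingUrl, downstreamUrl) {
  const downstream = new URL(downstreamUrl);
  if (isSyntheticRequest(incomingUrl)) {
    downstream.searchParams.set(TEST_PARAM, "true");
  }
  return downstream.toString();
}

// Each service consults the same flag before emitting analytics events.
function shouldTrackAnalytics(incomingUrl) {
  return !isSyntheticRequest(incomingUrl);
}
```

Keeping the check in one function means every service suppresses analytics by the same rule.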

We could use the excluderequests parameter to mark the data as synthetic in the backend datastores. In our case this isn't relevant since we re-use the same user account and clean out its state at the beginning of the test. The downside is that we cannot run this test concurrently. Alternatively, we could create a new user account for each test run. To make the test users easily identifiable these accounts would have a specific prefix or postfix in the email address. Another option would be to have a custom HTTP header that would be sent in every request to identify it as a test, though this is more common for APIs.

Our tests run with the Selenium webdriver and are executed with PhantomJS every 5 minutes against the service in production. The test results are fed into the monitoring system and displayed on the team's dashboard. Depending on the importance of the tested feature, failures can also trigger alerts for on-call duties.
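The step of feeding results into monitoring might look roughly like the following. Everything here is a hypothetical sketch: `check` stands for any function that throws when the user journey fails, `report` stands in for the real monitoring client, and the check is synchronous for brevity where a real browser-driven test would be asynchronous.

```javascript
// Run one synthetic check and turn the outcome into a record that a
// monitoring system could ingest.
function runSyntheticCheck(name, check, report, now = () => Date.now()) {
  const startedAt = now();
  let status = "ok";
  let error = null;
  try {
    check();
  } catch (e) {
    status = "failed";
    error = e.message;
  }
  const result = { name, status, error, durationMs: now() - startedAt };
  report(result);
  return result;
}

// Scheduling it every 5 minutes could then be a timer around the call, e.g.
// setInterval(() => runSyntheticCheck("add-favourite", journey, push), 5 * 60 * 1000);
// where journey and push are placeholders for the real test and reporter.
```

Reporting a failure as data, instead of letting the exception escape, is what lets the monitoring side decide whether it warrants an alert.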

A selection of Broad Stack Tests at the top of the Test Pyramid are well suited to use for synthetic monitoring. These would be UI tests, User Journey Tests, User Acceptance tests or End-to-End tests for web applications; or Consumer-Driven Contract tests (CDCs) for APIs. An alternative to running a suite of UI tests — for example in the context of batch processing jobs — would be to feed a synthetic transaction into the system and assert on its desired final state such as a database entry, a message on a queue or a file in a directory.


Thanks to Henry Lawson for his feedback.

And a special thanks to Martin Fowler for his support, suggestions and time spent helping us improve this Bliki.


1: Ryan Murray coined the term "semantic monitoring" and it appeared on the ThoughtWorks Technology Radar in late 2012. However "synthetic monitoring" seems to be the more widely used term, and usefully builds on the notion of synthetic transactions.



certification · delivery · continuous integration


Continuous Integration is a popular technique in software development. At conferences many developers talk about how they use it, and Continuous Integration tools are common in most development organizations. But we all know that any decent technique needs a certification program — and fortunately one does exist. Developed by one of the foremost experts in continuous delivery and devops, it’s known for being remarkably rapid to administer, yet very insightful for its results. Although it’s quite mature, it isn’t as well known as it should be, so as a fan of the technique I think it’s important for me to share this certification program with my readers. Are you ready to be certified for Continuous Integration? And how will you deal with the shocking truth that taking the test will reveal?

By now my regular readers are wondering if they’ve come across a parody post [1], and yes I am having a little fun with my opening teaser. But like any good joke there’s an important kernel of truth buried in it. There is a remarkably good test for proper Continuous Integration that was created by Jez Humble - and he certainly is a leading expert in ContinuousDelivery. It’s also a rapid test: he often administers it to his audience during his talks. The only problem is that I’ve never heard him refer to it as a certification test - which just shows his lack of vision for money-making schemes.

He usually begins the certification process by asking his audience to raise their hands if they do Continuous Integration. Usually most of the audience raise their hands.

He then asks them to keep their hands up if everyone on their team commits and pushes to a shared mainline (usually shared master in git) at least daily.

Over half the hands go down.

He then asks them to keep their hands up if each such commit causes an automated build and test. Half the remaining hands are lowered.

Finally he asks if, when the build fails, it’s usually back to green within ten minutes. [2]

With that last question only a few hands remain. Those are the people who pass his certification test.

It’s a simple set of questions, but it gets to the core of what Continuous Integration is about. The whole idea is that nobody is working on a code base that deviates significantly from anyone else’s. Continuous Integration means the team knows what the current state of the code truly is, we avoid big risky merges, and people can refactor as much as they need to.

The reason so many people raise their hands at the beginning is the common view that Continuous Integration means running some “Continuous Integration Server” against their feature branches. But Continuous Integration — as it was originally described and named by Kent Beck as part of ExtremeProgramming — has nothing to do with tools. At the beginning it was a human workflow and Jim Shore made an excellent argument that it should be that. The idea of running a daemon process against a source code repository came later, and while it is helpful, it’s only Continuous Integration if it’s run on a shared mainline that people commit to every day. Running such a daemon otherwise, such as on every FeatureBranch, is Daemonic Continuous Integration that debases the name [3], yielding a workflow that doesn't give you the benefits that make the whole thing worth the effort.

Further Reading

For more details on Continuous Integration, see my main article; while written in 2006, it's still a solid summary and definition of the technique. Jez explains why Continuous Integration is a foundation for Continuous Delivery. He states the three questions in the FAQ on that page. Paul Duvall wrote the definitive book on Continuous Integration. Watch Jez administer the certification test at GOTO Chicago in 2014 (sadly there was no camera on the audience).


All credit for the three questions goes to Jez, whose talks I've always enjoyed. A conversation with Paul Hammant triggered me to come up with the term Daemonic Continuous Integration, which I hope will catch on for this particularly annoying piece of cargo culting.


1: In general, I'm not a fan of software certification schemes, as they usually fail the CertificationCompetenceCorrelation.

2: For this step, "green" counts as passing the commit build, typically compilation and unit tests. While we usually expect a full DeploymentPipeline to be run for release to production, a repository should be fine for developers to work on after the commit build is green. You should have a commit build that takes no more than ten minutes, so quickly fixing it and re-running the commit build works if the fix is easy. If you can't fix and get a green commit build within ten minutes, then you should revert to the last green build.

3: The problem of Daemonic Continuous Integration leads some people to use the name Trunk-Based Development, arguing that SemanticDiffusion has rendered the term “Continuous Integration” useless. While I understand their view, I believe that we shouldn’t give in to semantic diffusion, instead we need to keep working at re-explaining the proper meaning of Continuous Integration, just as we should with other terms under this kind of semantic assault (such as “agile” and “refactoring”).



metrics · clean code


During my career, I've heard many arguments about how long a function should be. This is a proxy for the more important question - when should we enclose code in its own function? Some of these guidelines were based on length, such as functions should be no larger than fit on a screen [1]. Some were based on reuse - any code used more than once should be put in its own function, but code only used once should be left inline. The argument that makes most sense to me, however, is the separation between intention and implementation. If you have to spend effort looking at a fragment of code to figure out what it's doing, then you should extract it into a function and name the function after that “what”. That way when you read it again, the purpose of the function leaps right out at you, and most of the time you won't need to care about how the function fulfills its purpose - which is the body of the function.

Once I accepted this principle, I developed a habit of writing very small functions - typically only a few lines long [2]. Any function more than half-a-dozen lines of code starts to smell to me, and it's not unusual for me to have functions that are a single line of code [3]. The fact that size isn't important was brought home to me by an example that Kent Beck showed me from the original Smalltalk system. Smalltalk in those days ran on black-and-white systems. If you wanted to highlight some text or graphics, you would reverse the video. Smalltalk's graphics class had a method for this called 'highlight', whose implementation was just a call to the method 'reverse' [4]. The name of the method was longer than its implementation - but that didn't matter because there was a big distance between the intention of the code and its implementation.

Some people are concerned about short functions because they are worried about the performance cost of a function call. When I was young, that was occasionally a factor, but that's very rare now. Optimizing compilers often work better with shorter functions which can be cached more easily. As ever, the general guidelines on performance optimization are what counts. Sometimes inlining the function later is what you'll need to do, but often smaller functions suggest other ways to speed things up. I remember people objecting to having an isEmpty method for a list when the common idiom is to use aList.length == 0. But here using the intention-revealing name on a function may also support better performance if it's faster to figure out if a collection is empty than to determine its length.
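To make the isEmpty point concrete, here is a minimal sketch of a collection where the two questions have different costs. The LinkedList class is an illustrative assumption; it deliberately does not store its size.

```javascript
// A singly linked list that does not store its size: length must walk every
// node, while isEmpty only needs to look at the head.
class LinkedList {
  constructor() {
    this.head = null;
  }
  push(value) {
    this.head = { value, next: this.head };
  }
  get length() {
    let count = 0;
    for (let node = this.head; node !== null; node = node.next) count++;
    return count;
  }
  isEmpty() {
    return this.head === null;
  }
}
```

A caller writing aList.length == 0 pays for a full traversal, whereas aList.isEmpty() states the intention and lets the list answer in constant time.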

Small functions like this only work if the names are good, so you need to pay good attention to naming. This takes practice, but once you get good at it, this approach can make code remarkably self-documenting. Larger scale functions can read like a story, and the reader can choose which functions to dive into for more detail as she needs it.


Brandon Byars, Karthik Krishnan, Kevin Yeung, Luciano Ramalho, Pat Kua, Rebecca Parsons, Serge Gebhardt, Srikanth Venugopalan, and Steven Lowe discussed drafts of this post on our internal mailing list.

Christian Pekeler reminded me that nested functions don't fit my sizing observations.


1: Or in my first programming job: two pages of line printer paper - around 130 lines of Fortran IV

2: Many languages allow you to use functions to contain other functions. This is often used as a scope reduction mechanism, such as using the Function as Object pattern to implement a class. Such functions are naturally much larger.

3: Length of my functions

Recently I got curious about function length in the toolchain that builds this website. It's mostly Ruby and runs to about 15 KLOC. Here's a cumulative frequency plot for the method body lengths

As you see there's lots of small methods there - half of the methods in my codebase are two lines or less. (lines here are non-comment, non-blank, and excluding the def and end lines.)

Here's the data in a crude tabular form (I'm feeling too lazy to turn it into proper HTML tables).

         lines.freq  lines.cumfreq  lines.cumrelfreq
[1,2)           875            875         0.4498715
[2,3)           264           1139         0.5856041
[3,4)           195           1334         0.6858612
[4,5)           120           1454         0.7475578
[5,6)           116           1570         0.8071979
[6,7)            69           1639         0.8426735
[7,8)            75           1714         0.8812339
[8,9)            46           1760         0.9048843
[9,10)           50           1810         0.9305913
[10,15)          98           1908         0.9809769
[15,20)          24           1932         0.9933162
[20,50)          12           1944         0.9994859

4: The example is in Kent's excellent Smalltalk Best Practice Patterns, in Intention Revealing Message.



bad things


Sometimes when I work with some data, that data is more precise than I expect. One might think that would be a good thing; after all, precision is good, so more is better. But hidden precision can lead to some subtle bugs.

const validityStart = new Date("2016-10-01");   // JavaScript
const validityEnd = new Date("2016-11-08");
const isWithinValidity = aDate => (aDate >= validityStart && aDate <= validityEnd);
const applicationTime = new Date("2016-11-08 08:00");

assert.notOk(isWithinValidity(applicationTime));  // NOT what I want

What happened in the above code is that I intended to create an inclusive date range by specifying the start and end dates. However I didn't actually specify dates, but instants in time, so I'm not marking the end date as November 8th, I'm marking the end as the time 00:00 on November 8th. As a consequence any time (other than midnight) within November 8th falls outside the date range that's intended to include it.

Hidden precision is a common problem with dates, because it's sadly common to have a date creation function that actually provides an instant like this. It's an example of poor naming, and indeed general poor modeling of dates and times.
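One way to avoid the trap is to compare calendar days only. This is a sketch under stated assumptions: `toLocalDay` is a hypothetical helper, and the dates are built from numeric parts so that everything is in local time. A proper date type, such as a LocalDate from a date library, would be the cleaner fix.

```javascript
// Compare on calendar dates only, by truncating each instant to local
// midnight before comparing. Months are zero-based in the Date constructor.
const toLocalDay = (aDate) =>
  new Date(aDate.getFullYear(), aDate.getMonth(), aDate.getDate());

const validityStart = new Date(2016, 9, 1);  // 1 October 2016
const validityEnd = new Date(2016, 10, 8);   // 8 November 2016

const isWithinValidity = (aDate) => {
  const day = toLocalDay(aDate).getTime();
  return day >= toLocalDay(validityStart).getTime()
      && day <= toLocalDay(validityEnd).getTime();
};

const applicationTime = new Date(2016, 10, 8, 8, 0); // 08:00 on 8 November
console.log(isWithinValidity(applicationTime)); // prints true
```

With the time of day stripped before the comparison, any instant on 8 November falls inside the inclusive range.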

Dates are a good example of the problems of hidden precision, but another culprit is floating point numbers.

const tenCharges = [
  0.10, 0.10, 0.10, 0.10, 0.10,
  0.10, 0.10, 0.10, 0.10, 0.10,
];
const discountThreshold = 1.00;
const totalCharge = tenCharges.reduce((acc, each) => acc + each);
assert.ok(totalCharge < discountThreshold);   // NOT what I want

When I just ran it, a log statement showed totalCharge was 0.9999999999999999. This is because floating point doesn't exactly represent many values, leading to a little invisible precision that can show up at awkward times.

One conclusion from this is that you should be extremely wary of representing money with a floating point number. (If you have a fractional currency part like cents, then usually it's best to use integers on the fractional value, representing €5.00 with 500, preferably within a money type.) The more general conclusion is that floating point is tricksy when it comes to comparisons (which is why test framework asserts always have a precision for comparisons).
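Here is a small sketch of both conclusions: the same ten charges held as integer cents, and a comparison with an explicit tolerance for cases where floats are unavoidable. The names and the default tolerance of 1e-9 are illustrative assumptions.

```javascript
// Hold money as an integer count of cents, so sums are exact.
const tenChargesInCents = Array(10).fill(10);   // ten charges of €0.10
const discountThresholdInCents = 100;           // €1.00
const totalInCents = tenChargesInCents.reduce((acc, each) => acc + each, 0);

console.log(totalInCents === discountThresholdInCents); // prints true

// Format only at the edges, when showing the amount to a person.
const formatEuros = (cents) => `€${(cents / 100).toFixed(2)}`;
console.log(formatEuros(totalInCents)); // prints "€1.00"

// When floats are unavoidable, compare with an explicit tolerance.
const nearlyEqual = (a, b, tolerance = 1e-9) => Math.abs(a - b) <= tolerance;
console.log(nearlyEqual(0.1 + 0.2, 0.3)); // prints true
```

A full money type would also carry the currency and guard against mixing currencies, which this sketch leaves out.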


Arun Murali, James Birnie, Ken McCormack, and Matteo Vaccari discussed a draft of this post on our internal mailing list.


domain driven design · API design


When programming, I often find it's useful to represent things as a compound. A 2D coordinate consists of an x value and y value. An amount of money consists of a number and a currency. A date range consists of start and end dates, which themselves can be compounds of year, month, and day.

As I do this, I run into the question of whether two compound objects are the same. If I have two point objects that both represent the Cartesian coordinates of (2,3), it makes sense to treat them as equal. Objects that are equal due to the value of their properties, in this case their x and y coordinates, are called value objects.

But unless I'm careful when programming, I may not get that behavior in my programs.

Say I want to represent a point in JavaScript.

const p1 = {x: 2, y: 3};
const p2 = {x: 2, y: 3};
assert(p1 !== p2);  // NOT what I want

Sadly that test passes. It does so because JavaScript tests equality for JavaScript objects by looking at their references, ignoring the values they contain.

In many situations using references rather than values makes sense. If I'm loading and manipulating a bunch of sales orders, it makes sense to load each order into a single place. If I then need to see if Alice's latest order is in the next delivery, I can take the memory reference, or identity, of Alice's order and see if that reference is in the list of orders in the delivery. For this test, I don't have to worry about what's in the order. Similarly I might rely on a unique order number, testing to see if Alice's order number is on the delivery list.

Therefore I find it useful to think of two classes of object: value objects and reference objects, depending on how I tell them apart [1]. I need to ensure that I know how I expect each object to handle equality and to program them so they behave according to my expectations. How I do that depends on the programming language I'm working in.

Some languages treat all compound data as values. If I make a simple compound in Clojure, it looks like this.

> (= {:x 2, :y 3} {:x 2, :y 3})
true

That's the functional style - treating everything as immutable values.

But if I'm not in a functional language, I can still often create value objects. In Java for example, the default point class behaves how I'd like.

assertEquals(new Point(2, 3), new Point(2, 3)); // Java

The way this works is that the point class overrides the default equals method with the tests for the values. [2] [3]

I can do something similar in JavaScript.

class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  equals (other) {
    return this.x === other.x && this.y === other.y;
  }
}
const p1 = new Point(2,3);
const p2 = new Point(2,3);
assert(p1.equals(p2));

The problem with JavaScript here is that this equals method I defined is a mystery to any other JavaScript library.

const somePoints = [new Point(2,3)];
const p = new Point(2,3);
assert.isFalse(somePoints.includes(p)); // not what I want

//so I have to do this
assert(somePoints.some(i => i.equals(p)));

This isn't an issue in Java because Object.equals is defined in the core library and all other libraries use it for comparisons (== is usually used only for primitives).

One of the nice consequences of value objects is that I don't need to care about whether I have a reference to the same object in memory or a different reference with an equal value. However if I'm not careful that happy ignorance can lead to a problem, which I'll illustrate with a bit of Java.

Date retirementDate = new Date(Date.parse("Tue 1 Nov 2016"));

// this means we need a retirement party
Date partyDate = retirementDate;

// but that date is a Tuesday, let's party on the weekend
partyDate.setDate(5);

assertEquals(new Date(Date.parse("Sat 5 Nov 2016")), retirementDate);
// oops, now I have to work three more days :-(

This is an example of an Aliasing Bug: I change a date in one place and it has consequences beyond what I expected [4]. To avoid aliasing bugs I follow a simple but important rule: value objects should be immutable. If I want to change my party date, I create a new object instead.

Date retirementDate = new Date(Date.parse("Tue 1 Nov 2016"));
Date partyDate = retirementDate;

// treat date as immutable
partyDate = new Date(Date.parse("Sat 5 Nov 2016"));

// and I still retire on Tuesday
assertEquals(new Date(Date.parse("Tue 1 Nov 2016")), retirementDate);

Of course, it makes it much easier to treat value objects as immutable if they really are immutable. With objects I can usually do this by simply not providing any setting methods. So my earlier JavaScript class would look like this: [5]

class Point {
  constructor(x, y) {
    this._data = {x: x, y: y};
  }
  get x() {return this._data.x;}
  get y() {return this._data.y;}
  equals (other) {
    return this.x === other.x && this.y === other.y;
  }
}
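For a team that wants the language to enforce immutability, a variant using Object.freeze is one possibility. This is only a sketch: the `withX` method is a hypothetical addition showing how a "change" produces a new value.

```javascript
// A Point whose fields cannot be reassigned after construction.
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
    Object.freeze(this); // later writes fail silently, or throw in strict mode
  }
  equals(other) {
    return this.x === other.x && this.y === other.y;
  }
  // "changing" a point means creating a new one
  withX(newX) {
    return new Point(newX, this.y);
  }
}

const p = new Point(2, 3);
const moved = p.withX(5);
console.log(p.x, moved.x); // prints 2 5
```

Freezing is shallow, which is enough here because the fields are primitives.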

While immutability is my favorite technique to avoid aliasing bugs, it's also possible to avoid them by ensuring assignments always make a copy. Some languages provide this ability, such as structs in C#.

Whether to treat a concept as a reference object or value object depends on your context. In many situations it's worth treating a postal address as a simple structure of text with value equality. But a more sophisticated mapping system might link postal addresses into a sophisticated hierarchic model where references make more sense. As with most modeling problems, different contexts lead to different solutions. [6]

It's often a good idea to replace common primitives, such as strings, with appropriate value objects. While I can represent a telephone number as a string, turning it into a telephone number object makes variables and parameters more explicit (with type checking when the language supports it), provides a natural focus for validation, and avoids inapplicable behaviors (such as doing arithmetic on integer id numbers).

Small objects, such as points, monies, or ranges, are good examples of value objects. But larger structures can often be programmed as value objects if they don't have any conceptual identity or don't need to share references around a program. This is a more natural fit with functional languages that default to immutability. [7]

I find that value objects, particularly small ones, are often overlooked - seen as too trivial to be worth thinking about. But once I've spotted a good set of value objects, I find I can create a rich behavior over them. For a taste of this try using a Range class and see how it prevents all sorts of duplicate fiddling with start and end attributes by using richer behaviors. I often run into code bases where domain-specific value objects like this can act as a focus for refactoring, leading to a drastic simplification of a system. Such a simplification often surprises people, until they've seen it a few times - by then it is a good friend.
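As an illustration of the kind of Range class meant here, the following is a minimal sketch over comparable values such as numbers. The method names and the inclusive-bounds choice are assumptions made for the example.

```javascript
// A small, immutable range value object with inclusive bounds.
class Range {
  constructor(start, end) {
    if (start > end) throw new Error("start must not be after end");
    this.start = start;
    this.end = end;
    Object.freeze(this);
  }
  includes(value) {
    return value >= this.start && value <= this.end;
  }
  overlaps(other) {
    return this.start <= other.end && other.start <= this.end;
  }
  equals(other) {
    return this.start === other.start && this.end === other.end;
  }
}

const morning = new Range(9, 12);
const lunch = new Range(12, 13);
console.log(morning.includes(10));    // prints true
console.log(morning.overlaps(lunch)); // prints true, since the bounds are inclusive
```

Callers ask the range a question instead of repeating start and end comparisons at every call site.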


James Shore, Beth Andres-Beck, and Pete Hodgson shared their experiences of using value objects in JavaScript.

Graham Brooks, James Birnie, Jeroen Soeters, Mariano Giuffrida, Matteo Vaccari, Ricardo Cavalcanti, and Steven Lowe provided valuable comments on our internal mailing lists.

Further Reading

Vaughn Vernon's description is probably the best in-depth discussion of value objects from a DDD perspective. He covers how to decide between values and entities, implementation tips, and the techniques for persisting value objects.

The term started gaining traction in the early noughties. Two books that talk about them from that time are PoEAA and DDD. There was also some interesting discussion on Ward's Wiki.

One source of terminological confusion is that around the turn of the century some J2EE literature used "value object" for Data Transfer Object. That usage has mostly disappeared by now, but you might run into it.


1: In Domain-Driven Design the Evans Classification contrasts value objects with entities. I consider entities to be a common form of reference object, but use the term "entity" only within domain models while the reference/value object dichotomy is useful for all code.

2: Strictly this is done in awt.geom.Point2D, which is a superclass of awt.Point.

3: Most object comparisons in Java are done with equals - which is itself a bit awkward since I have to remember to use that rather than the equals operator ==. This is annoying, but Java programmers soon get used to it since String behaves the same way. Other OO languages can avoid this - Ruby uses the == operator, but allows it to be overridden.

4: There is robust competition for the worst feature of the pre-Java-8 date and time system - but my vote would be this one. Thankfully we can avoid most of this now with Java 8's java.time package.

5: This isn't strictly immutable since a client can manipulate the _data property. But a suitably disciplined team can make it immutable in practice. If I was concerned that a team wouldn't be disciplined enough I might use freeze. Indeed I could just use freeze on a simple JavaScript object, but I prefer the explicitness of a class with declared accessors.

6: There is more discussion of this in Evans's DDD book.

7: Immutability is valuable for reference objects too - if a sales order doesn't change during a get request, then making it immutable is valuable; and that would make it safe to copy it, if that were useful. But that wouldn't make the sales order be a value object if I'm determining equality based on a unique order number.



bad things


Aliasing occurs when the same memory location is accessed through more than one reference. Often this is a good thing, but frequently it occurs in an unexpected way, which leads to confusing bugs.

Here's a simple example of the bug.

Date retirementDate = new Date(Date.parse("Tue 1 Nov 2016"));

// this means we need a retirement party
Date partyDate = retirementDate;

// but that date is a Tuesday, let's party on the weekend
partyDate.setDate(5);

assertEquals(new Date(Date.parse("Sat 5 Nov 2016")), retirementDate);
// oops, now I have to work three more days :-(

What's happening here is that when we do the assignment, the partyDate variable is assigned a reference to the same object that retirementDate refers to. If I then alter the internals of that object (with setDate) then both variables are updated, since they refer to the same thing.
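The same effect is easy to reproduce in JavaScript, whose Date is also mutable. The sketch below shows the alias and then one way around it, copying before changing; the variable names are illustrative.

```javascript
// Aliasing: both names refer to the same Date object.
const retirementDate = new Date(2016, 10, 1); // 1 November 2016 (months are zero-based)
const aliasedPartyDate = retirementDate;
aliasedPartyDate.setDate(5);
console.log(retirementDate.getDate()); // prints 5, the retirement moved too

// Copying first gives the party its own object to change.
const originalRetirement = new Date(2016, 10, 1);
const copiedPartyDate = new Date(originalRetirement.getTime());
copiedPartyDate.setDate(5);
console.log(originalRetirement.getDate()); // prints 1
```

Copying works, but it relies on every caller remembering to do it, which is why an immutable date type is the more robust answer.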

Although aliasing is a problem in that example, in other contexts it's what I expect.

Person me = new Person("Martin");
Person articleAuthor = me;
me.setPhoneNumber("999"); // assumed setter, the counterpart of getPhoneNumber
assertEquals("999", articleAuthor.getPhoneNumber());

It's common to want to share records like this, and then if it changes, it changes for all references. This is why it's useful to think of reference objects, which we deliberately share [1], and Value Objects, for which we don't want this kind of shared update behavior. A good way to avoid shared updates of value objects is to make value objects immutable.

Functional languages, of course, prefer everything to be immutable. So if we want changes to be shared, we need to handle that as the exception rather than the rule. Immutability is a handy property, one that makes it harder to create several kinds of bugs. But when things do need to change, immutability can introduce complexity, so it's by no means a free breakfast.


Graham Brooks and James Birnie's comments on our internal mailing list led me to write this post.

Further Reading

The term aliasing bug has been around for a while. It appears in Eric Raymond's Jargon file in the context of the C language where the raw memory accesses make it even more unpleasant.


1: The Evans Classification has the notion of Entity, which I see as a common form of reference object.
