Musings on Ebooks

It's only just over a year since I got my first ebook reader. Now I only buy paper books when I really have to. I wrote my last book thinking of it primarily as a paper book, but that will be the last time; in the future electronic forms will be at the front of my mind. These changes will completely alter the landscape of books, but other than that, the next steps aren't that clear.

05 May 2011

For a full-time technologist, I can be remarkably luddite sometimes. Even as 2009 drew to a close, I was resisting getting hold of an ebook reader. While the weight of the traditional paper reader was a drag, I appreciated its lack of DRM and the battery life too much to switch to an electronic form. Now a year later, I prefer to buy books in electronic form. I only get paper for a book I really want and can't get virtually.

As an avid reader, and somewhat successful writer, I can't but be moved by this moment in history. I think we are at a major shift for the way people interact with what we currently think of as books. I have no certainties about where we are going, but feel compelled to describe the landscape as I see it now.

As a reader

My initial revelation was the iPad. Although I had expected to like the device, I was surprised by how much I liked it. After a few months of using it I decided to get a Kindle as well. I did this in the full knowledge that it was really a waste, since the Kindle software on the iPad did everything I needed, but as an author I wanted to get a feel of what many readers would be using. What I discovered was that the Kindle is so light (and small) that I carry it with me on trips in addition to the iPad. It contributes negligible extra weight, and its lightness makes it more comfortable to read prose books.

The fact that I travel so much is a big reason why I like tablets. In the past trading off reading material for weight was always a fraught decision, particularly with long trips to non-English speaking places. Now I can easily load up with a couple more books than I need, and even then all I need is an internet connection if I need any more. I found this very handy on my recent trip to India, particularly when my favorite newspaper launched its iPad app.

The shift to digital changes many of our assumptions about books, and not always in a good way. The ownership rules about a physical book are well-understood: I have a clearly delineated thing that I can sell, lend, or give away. Virtual books limit these options - but they give us new ones. As with other digital products, we have to develop and get used to a whole new model for ownership and access. As with most changes we'll regret what we lose twice as much as appreciate what we'll gain until we get to the new status quo, whatever that becomes.

As a writer

For most of the time I was writing my DSL book, I imagined it as a physical book, much like my other books. That won't be the case again. If I summon up the enthusiasm to write another book, I reckon I'll be imagining most people reading it on various tablet devices, with only a minority using paper, if paper is an option at all.

As I got close to publication for the DSL book, I began to recognize this new reality. As well as the paper form, I have readers on iPad, Kindle, the Safari Books Online website[1], and even the iPhone. Each of these has different characteristics, which requires me to rethink elements of how I construct the book.

Take one simple example - the diagrams. My books have always been monochromatic, so when I draw diagrams I do them in black and white. This is still true for the paper and kindle forms, but not for the web and epub versions. But this makes things tricky. Color is an opportunity to provide another channel of semantic information, but I have to be wary of using it if a large segment of my audience can't see the colors. Do I make the most of the opportunity of the richer format, or do I stick with the lesser, but more available capability?

One definite change I made with this book was to number my sections. In the past I've not done this, as people can refer to something within the book by a page number. But since ebooks don't have pages, I now need a regular way to point into the book. Mostly this does not affect the design of the book, since section numbers and cross-references can easily be automatically generated in the production software, but it does make a subtle difference. In P of EAA, my chapters were groups of related patterns. However if I did that I'd end up with subsections with 4 part numbers, which I prefer to avoid. So for the DSL book I gave each pattern its own chapter. This led to lots of chapters, but allowed me to retain 3 part section numbers throughout the book. My suspicion is that we'll find more of these subtle interactions between the logical structure of a book and the way that structure plays out in its various representations. While it's nice to focus on logical structure in a way that's independent of its presentation, there are always cases like this that disturb the comfortable abstraction. [2]
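To make the numbering point concrete, here is a minimal sketch of generating section numbers from a book's logical structure. It is an illustration only, not the production software used for the book: the `Section` class, the function names, and the sample titles are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Section:
    """A node in the book's logical structure (hypothetical model)."""
    title: str
    children: list[Section] = field(default_factory=list)


def number_sections(sections, prefix=()):
    """Return (number, title) pairs, numbering each section by its position."""
    numbered = []
    for index, section in enumerate(sections, start=1):
        parts = prefix + (index,)
        numbered.append((".".join(map(str, parts)), section.title))
        numbered.extend(number_sections(section.children, parts))
    return numbered


def max_depth(sections):
    """Number of parts in the longest section number the structure produces."""
    return max((1 + max_depth(s.children) for s in sections), default=0)


# Hypothetical pattern with one section and one subsection.
pattern = Section("Pattern A", [Section("How it works", [Section("A variation")])])

# Chapters that group patterns: chapter > pattern > section > subsection.
grouped = [Section("A chapter of related patterns", [pattern])]

# One chapter per pattern: pattern > section > subsection.
per_pattern = [pattern]

for number, title in number_sections(grouped):
    print(number, title)
```

With the grouped layout the deepest heading gets a 4 part number, while the one-chapter-per-pattern layout stays at 3 parts; `max_depth` reports this for each layout.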

On the form

With all this change, it seems easy for people to predict the end of books. I don't see this. For me the book isn't so much the physical form, or the delivery mechanism. Instead it's the packaging of the content. What makes a book hard to write, at least for me, is pulling together and organizing a large body of material. This is why a good book is more than a couple of dozen articles. Tablets don't change the need to take material and organize it like this, although they do alter the way you can present that material.

Some people argue that videos will kill off books. I don't think so, essentially because video is really such a poor medium for communicating things in depth. I can read text far faster than I can watch a video of someone talking. I can easily skim text and jump backwards and forwards. I can easily follow cross-references. All of this makes me prefer text to audio/visual when I'm trying to understand something. (It's also why I'm not a big fan of conference talks.)

It is interesting to me to see how video can be incorporated into books. Certainly for mechanical tasks, it's really nice to have a snippet of video to explain how to do something. I find it interesting to experiment to see how motion graphics can help explain concepts. But I still prefer text as the primary medium, with video playing a supporting role.


One of the open questions for ebooks is what format to use. There are four formats that are in reasonably common use: PDF, epub, kindle, and HTML.

  • PDF has been a common print + electronic format for many years. As a result it's a mature format that you can read on lots of devices. Its biggest problem (and arguably its strength) is that it's delivered with a fixed display (page) size in mind. This means it doesn't reflow well for different sized tablets. On the other hand it does allow good control for publications that take design very seriously.
  • epub is the open format for books. Unlike PDF it supports differently sized devices, which is why many people (including me) see this as a more serious choice for books. I've read epub books on my iPad but I haven't yet experimented with producing them.
  • kindle (mobi) is Amazon's proprietary format. Its disadvantage is that it's tied to Amazon, which is less of an issue at the moment since Amazon have such a big hold on distribution. I'm told the format is more limited than epub (but again I haven't tried it, other than as a reader). Amazon believe that controlling the format and tying it to their own hardware and software readers will allow them to do more with the format in the future. It certainly will be interesting to see the contrast between open epub and closed kindle over the next few years.
  • HTML isn't generally thought of as an ebook format, since it's designed so much for online usage. But it is important to several publishers if they can make books available through a web delivery mechanism. Further developments with HTML 5 and offline storage may lead to more people exposing books as HTML with the offline storage used to allow people to read the books without an internet connection.

At the moment I'm certainly unsure where things will go. On the whole I don't think PDF is a good route for electronic publication since the fixed page size is much more of a problem than a help. Kindle's format offers fewer options, but has a big reach of readers - especially given the kindle hardware's nice form factor. The fact that it's proprietary is also a negative, but again Amazon's reach currently makes up for it. Essentially for book material you need to do both epub and kindle, but then have to figure out how to live with the limitations of the kindle compared to epub. Do you limit yourself to what will work in both, or produce two versions that can play to the strengths of both platforms?

The other thought is to go beyond the ebook formats themselves and publish material as a tablet app - which allows full use of the capabilities of the platform itself. At the moment I'm not tempted by this. There's too much churn in the app formats - and when I'm working on a book I want something that will last for a decade or so. This does, however, raise some interesting questions of what can be done with relatively open formats, such as HTML 5, and a tablet form factor. Or perhaps some specialized variant of book format for particular kinds of book.[3]

The relationship between publishers and readers

I remember an early editor of mine at Pearson[4] talking about how much he was looking forward to disintermediating the big book stores. The distribution of physical books takes a big chunk of revenue - usually just over 50% of a book's cover price goes to the book stores and book distributors (such as Ingram). Since an author's royalties are usually based on what the publisher gets, reducing this would improve matters for authors too. As I write this, Borders is bankrupt, so the disintermediation could be said to be in full swing. However what seems to be happening is a change of distributor. Amazon has become the big player in book distribution over the last decade. As well as doing a good job with physical books, they have got an early lead in electronic books too.

The alternative is for publishers to get a direct relationship with readers. A great example of a company doing this is the Pragmatic Programmers (usually referred to as the "prags"). The best way to buy an ebook from the prags is to go directly to their website. Once you get the book you can download it in multiple formats: PDF, epub, and kindle. You can download it as often as you like and in as many formats as you like. You can get a paper copy too at a reduced price. I really like that I can read their books in multiple formats easily. And I'm sure their authors like the fact that they get more money by disintermediating the distributors.

Large publishers, such as Pearson, have different challenges to small ones. If they try to disintermediate, they can run into nasty fights with their distributors. They also have existing legal agreements to honor. On the other hand they can carry enough weight to influence even such a large distributor as Amazon.

One of the open questions around all this is what does the reader buy? The traditional approach is that the reader pays for the representation of a book: each physical copy of a book is a separate thing to pay for. The Prags' model is different: when I buy an ebook from them I am buying access to the content of the book - and I can take as many representations (epub, kindle, etc) as I like. The only representation I pay extra for is the paper one, should I wish to have it.

The Economist's model is like this too: I've been a subscriber to The Economist for a long time and it's always given me a copy of the paper representation plus access to the web site. Over time they've added audio downloads and iPhone/iPad apps. All of these representations are included in the subscription price.

In contrast I was a subscriber to Zagat's. When they came out with an iPhone app, I had to pay for it again to read the same content as on their website. There the payment was for the representation (website or app) rather than for content.[5]

The Production Process

The demands of ebooks place a lot of pressure on the production process. If you want to be able to produce output for multiple different formats easily, you need a highly automated process. Furthermore the source files for that process need to be based on the semantic structure of the book, as opposed to its physical representation on paper.

Here again the prags have led the way (in no small part because they are programmers themselves). They have built up a complete tool-chain that starts from an XML source file capturing the semantic structure of the book and produces camera-ready output for print as well as multiple ebook formats. On top of that they make use of source control on the book text itself to enhance collaboration during the book production process.
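As a rough illustration of what a semantic source buys you, the sketch below renders one small XML document into two different representations. It is not the prags' tool-chain, whose details aren't described here; the element names (`book`, `chapter`, `para`, `term`) and the two renderers are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical semantic source: the tags say what things are, not how they look.
SOURCE = """
<book title="Example Book">
  <chapter title="First Chapter">
    <para>Some text with a <term>key term</term> in it.</para>
  </chapter>
</book>
""".strip()


def render_html(book):
    """Render the book as simple HTML, showing each <term> as emphasis."""
    lines = [f"<h1>{escape(book.get('title'))}</h1>"]
    for chapter in book.findall("chapter"):
        lines.append(f"<h2>{escape(chapter.get('title'))}</h2>")
        for para in chapter.findall("para"):
            body = escape(para.text or "")
            for child in para:
                # Only the renderer decides how a term looks in this format.
                body += f"<em>{escape(child.text or '')}</em>"
                body += escape(child.tail or "")
            lines.append(f"<p>{body}</p>")
    return "\n".join(lines)


def render_text(book):
    """Render the same source as plain text with underlined chapter titles."""
    lines = [book.get("title").upper(), ""]
    for chapter in book.findall("chapter"):
        title = chapter.get("title")
        lines += [title, "-" * len(title)]
        for para in chapter.findall("para"):
            lines.append("".join(para.itertext()))
    return "\n".join(lines)


book = ET.fromstring(SOURCE)
print(render_html(book))
print(render_text(book))
```

Because the source only records that something is a term, each output format is free to present it differently, and adding another format means adding another renderer rather than editing the text.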

When I decided to stick with Pearson for my DSL book, I had to do without the prags' expertise. For my previous book I had used a similar approach during the drafting of the book, but cut over to a more old fashioned approach once I reached the final draft. For my DSL book I was determined to keep an automated, version-controlled process throughout the entire system. This was somewhat of a struggle with Pearson, since it's a big company that has to mostly deal with authors who are used to a more traditional process. Fortunately the people I worked with were keen to support this style of working and were able to work with me, although it did require a fair bit of programming on my part to make it all flow.

In the future I firmly believe this kind of automation will become more of an imperative, particularly if we want to push the boundaries of what a book should look like in a world of multiple tablets.

The Value of Publishers for Authors

When I've been asked about how publishers are valuable to an author, my emphasis has been on getting books into bookstores. My regular counter-example was Dorset House, a publisher who did some fine books that you could never get hold of. But the move to internet sales changes the equation. At the moment one could argue that the only bookstore that matters is Amazon. If you can get your books there, does a publisher help?

Self-publishing has always had a reputation of being for cranks, yet there are some well regarded exceptions - Edward Tufte is a good example. With ebooks there is much more of an argument for self-publishing. There's a growing segment of authors like Amanda Hocking and Joe Konrath who are doing well selling lots of copies of very cheap books (under $5). By self-publishing there may be a route to lower the prices of books to the reader and still get more than the traditional route offers. (An author usually gets about 5-10% of the cover price of a book.)

I've known other tech-authors who have given self-publishing a try and found it more trouble than it was worth. And there are the Prags, who went into self-publishing and turned it into a full-time business. (Ed Yourdon followed that same path a couple of decades earlier.) Both are examples of how common it is for people to underestimate the amount of work it takes to publish - for people seem to either give up or turn it into their full-time job. But with such disruption to the book business I cannot exclude the possibility that there will be a shift that makes it worthwhile for independent authors.


1: Safari Books Online

A web business owned jointly by Pearson and O'Reilly. Membership in Safari Books Online isn't cheap, but it does allow you access to all of their technical books, which is a very handy resource.

2: Physical Structure and Logical Structure

And that's without even considering cases where the physical representation is part of the book's design. In Refactoring, the opening chapter uses the left and right pages to show before and after code segments. I think that worked really well, but it's impossible to do with ebooks.

3: Specialized Book Formats

A good example of these is travel guides. Their topic is such that a specialized information schema, based on point-of-interest rather than chapters and paragraphs, makes sense. Lonely Planet is shifting to thinking of their content as a populated database that can be represented in various formats, which include tablet apps as well as epub documents and their website.

4: Pearson and Addison-Wesley

Publishers, like many enterprises, have undergone lots of consolidation over the years. As a result there are separate notions of publisher and imprint. A publisher is a company that publishes books. Thus far, all my books have been published by Pearson. However you don't see Pearson's name that prominently on my books; instead you see Addison-Wesley. Addison-Wesley is the imprint, essentially the brand-name used for books like mine. Pearson owns many imprints including Prentice Hall, Peachpit, and Addison-Wesley. In some cases the imprint and the publisher are the same, such as O'Reilly.

Pearson is in fact much larger even than that. Pearson also includes Longman books, The Economist, and the Financial Times.

5: I ended up dropping the website subscription in favor of the iPhone app since it was far more convenient.

Significant Revisions

05 May 2011: First Published

22 February 2011: Started Drafting