||20 April 2009
Update: video of the Intentional talk at DSL DevCon is now
Several years ago, my then colleague Matt Foemmel, dissatisfied
with the tools with which we were building software, managed to get
in touch with Charles Simonyi to find out more about the shadowy Intentional Software. What
he saw impressed him, and he persuaded me and other ThoughtWorkers
to get involved too. What we saw was a tool with startling
potential, but we remained frustrated by the secrecy and lack of
urgency to release. That frustration ended last week.
Last week I set off for Chris Sells's DSL DevCon, and
Magnus Christerson - Intentional's product manager - suggested
I pop in to see how they were getting on. After several years of "Real
Soon Now", I was unsure, but Rebecca Parsons, my colleague who has
been keeping regular contact with Intentional, said that now would
be a good time.
I spent a fascinating and exciting day at their office in
Bellevue. It's not that I saw anything particularly new - these were
all ideas and capabilities that had been around for a while - but
there was a realness and maturity that I hadn't seen before. Indeed
Intentional had released a version 1.0 of their product a few weeks
earlier. The usual approach is to trumpet a version 1.0 release of a
ground-breaking product from the mountaintops. Only Intentional
would make such a release and not bother to tell anyone. Indeed as I
write this there's no mention of their product on their website - if
you want more information you have to talk to them.
This isn't a comprehensive discussion of their tool
(called the Intentional Domain Workbench); I haven't had time to
put something like that together. But I hope my scattered thoughts
and observations will be interesting. The Intentional Domain
Workbench is a LanguageWorkbench, indeed it's one of
the systems that made me coin that term. A Language Workbench is a
tool that allows people to design
DomainSpecificLanguages: not simply to parse them, but build
a comprehensive environment that includes rich editing. In
Intentional's case this is a ProjectionalEditing
One of the examples they have is the state machine example I use
for my book
introduction. The workbench allows you to define the schema of
the semantic model state machine in its schema definition
language. In order to manipulate state machines you define
projections of the semantic model. One of the striking features of
the Intentional Domain Workbench is its ability to support multiple
projections of the same semantic model. For the state machine
example they've defined projections in several of the DSLs I've used
in discussing the example: XML, custom syntax, and Ruby. All three
of these projections are reversible, meaning that you can edit
through them, updating the semantic model and other
projections. Switching between the projections is just a matter of
selecting a menu item.
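To make the idea of several projections over one semantic model concrete, here is a minimal Python sketch. It is not how the Intentional Domain Workbench is implemented; the `StateMachine` class, the two-line custom syntax, and the XML element names are all assumptions made up for illustration. The custom-syntax projection is reversible (it has both a writer and a reader), while the XML projection is write-only here to keep the sketch short.

```python
from dataclasses import dataclass, field
from xml.etree import ElementTree as ET


@dataclass
class StateMachine:
    """Semantic model: state names plus (source, event, target) transitions."""
    states: list = field(default_factory=list)
    transitions: list = field(default_factory=list)


def to_custom(model):
    """Project the model into a small, invented custom text syntax."""
    lines = [f"state {s}" for s in model.states]
    lines += [f"{src} on {ev} => {dst}" for src, ev, dst in model.transitions]
    return "\n".join(lines)


def from_custom(text):
    """Reverse projection: rebuild the model from the custom syntax."""
    model = StateMachine()
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "state":
            model.states.append(parts[1])
        else:
            # Expected shape: <source> on <event> => <target>
            src, _on, ev, _arrow, dst = parts
            model.transitions.append((src, ev, dst))
    return model


def to_xml(model):
    """Project the same model into XML (element names are hypothetical)."""
    root = ET.Element("stateMachine")
    for s in model.states:
        ET.SubElement(root, "state", name=s)
    for src, ev, dst in model.transitions:
        ET.SubElement(root, "transition", source=src, event=ev, target=dst)
    return ET.tostring(root, encoding="unicode")
```

Editing the custom text, calling `from_custom`, and then calling `to_xml` is this sketch's equivalent of editing through one projection and switching to another: both views are derived from the same model, so neither can drift from the other.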
They also had read-only projections in fluent C#, command-query
C#, and a state machine diagram. Although they hadn't set up the
diagram to be editable, the workbench can handle editable
diagrammatic representations. In another example they
show an electronic circuit which is editable in both a
tree structured property sheet projection and in a circuit diagram
The circuit diagram also showed another really powerful feature
of the workbench - the ability to fluidly integrate example
executions with the program definition. In the electronic circuit
case, this means that you can give the various elements of the
circuit properties and the model will calculate the impedance of
various parts of the circuit and display them as you are editing the
circuit. Of course you can build a custom program to do this kind of
thing - but the point is that this behavior comes easily as part of
a DSL definition in the workbench.
Combining example execution with program definition is one of the
features of spreadsheets - and may be a reason why spreadsheets have
become so successful as an environment for
LayProgrammers. It's also a notion that's been propelling
much of Jonathan Edwards's interesting and wild ideas. My
sense is that interesting DSLs in language workbenches will have
this characteristic, particularly if they are aimed at being used by
Another way that you can combine execution with specification is
with test cases. They have an example of a pension workbench, built
with Capgemini, that allows actuaries to enter formulas using full
mathematical notation, together with FIT-like tables to show test
cases. These test cases are live with the appropriate red/green
behavior as you edit the formulas.
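The red/green behavior can be approximated in a few lines of Python. This is only a sketch of the idea of a FIT-like table re-evaluated against a formula; the `annual_pension` formula, its divisor, and the table rows are invented for illustration and have nothing to do with the real pension workbench.

```python
def annual_pension(final_salary, years_of_service, accrual_divisor=60):
    """Hypothetical formula: salary * years / divisor, in whole currency units."""
    return final_salary * years_of_service // accrual_divisor


def run_table(formula, rows):
    """Evaluate every (inputs, expected) row and colour it green or red."""
    results = []
    for inputs, expected in rows:
        actual = formula(*inputs)
        colour = "green" if actual == expected else "red"
        results.append((inputs, expected, actual, colour))
    return results


# Each row is ((final_salary, years_of_service), expected_pension).
TABLE = [
    ((60000, 30), 30000),
    ((48000, 10), 8000),
    ((48000, 10), 9000),  # deliberately wrong expectation, shows up red
]
```

A live environment would call something like `run_table` again on every edit to the formula, so the colours always reflect the current definition.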
The pension workbench also illustrates the combination of
multiple languages. When you look at a pension document on the
screen, you're looking at three independent languages:
word processed text for the prose, mathematical notation for the
formulae, and test case tables. These languages are developed
independently but integrated in the workbench's core data structure
(called the Intentional Tree). This integration
extends to the execution too - you can step into a test case and
delve into the intermediate values in the mathematical formulae.
In order to make these things run, you have to include behavior
with the semantic model. Intentional have developed their own
general purpose language, whose working name is CL1, to do this. CL1 can
look like a superset of C#, but such a view is again a projection of
the core semantic model. I found it interesting that this is a
similar feature to JetBrains MPS who have their "base language"
which projects into a Java-like general purpose
language. Increasingly, large parts of these tools are programmed using this
in-workbench general purpose language.
The intended way of working is that developers use the
Intentional Domain Workbench to build a domain-specific
workbench. They provide a runtime (the Intentional Domain Runtime)
for them to run without language editing capabilities. So Capgemini
used the Intentional Domain Workbench to build the Pension Workbench
as their own product. The Intentional Domain Workbench allows you to
define new model schemas and projections, while the Pension
Workbench allows you to build pension plans using these languages.
The Intentional system is primarily arranged in the .NET
ecosystem. Both the workbench and runtime run on the CLR and core
parts of them are written in C#. The workbench makes it really easy
to generate .NET assemblies that can be automatically loaded into
the workbench for testing or run with the runtime. Custom
workbenches can generate code for any environment, and Intentional
have done some work with another partner that involves generating
Java code so that people can specify behavior in the custom
workbench and then deploy the resulting system in a Java environment.
An interesting aspect of the implementation is that they handle
representational transformations by using lots of little
transformations rather than one large one. As an example, code
generating C# from a semantic model involves about a dozen small
transforms lined up in a pipeline similar to a multi-stage compiler,
the last step being a transformation from a C# AST to text. Much of
their internal design goes into making this approach efficient so
you can happily string together a lot of small transforms without
worrying about any efficiency cost. A further consequence is that
the pipeline of transforms for code-generation is very similar to
that used for editing projections.
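The shape of such a pipeline can be sketched in Python as a chain of small functions, each doing one step, with a final step that turns an AST into text. The step names, the tuple-based AST, and the output syntax below are assumptions for illustration only; the real pipeline has about a dozen steps and targets C#.

```python
from functools import reduce


def lower_events(model):
    """Step 1: turn each (source, event, target) tuple into a dict node."""
    return [{"source": s, "event": e, "target": t} for s, e, t in model]


def group_by_source(nodes):
    """Step 2: group transitions under their source state."""
    grouped = {}
    for n in nodes:
        grouped.setdefault(n["source"], []).append((n["event"], n["target"]))
    return grouped


def to_ast(grouped):
    """Step 3: build a tiny target-language AST out of nested tuples."""
    return [("case", src, [("when", ev, dst) for ev, dst in arms])
            for src, arms in grouped.items()]


def ast_to_text(ast):
    """Final step: print the AST as text."""
    out = []
    for _case, src, arms in ast:
        out.append(f"case {src}:")
        out += [f"    on {ev} -> {dst}" for _when, ev, dst in arms]
    return "\n".join(out)


def pipeline(*steps):
    """Compose steps left to right into a single transform."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


generate = pipeline(lower_events, group_by_source, to_ast, ast_to_text)
```

Because each step only knows about its own input and output shapes, a step can be tested, replaced, or reused in a different pipeline (for example one that feeds an editor view instead of a text file) without touching the others.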
A common problem with tools that use projectional editing is how
they deal with version control. Often the answer is to just let
multiple people edit the same store simultaneously, which makes many
serious developers quake. The Intentional Domain Workbench has a built in
version control mechanism that records all the changes made to the
Intentional Tree and can do commits and merges at the tree
level. You then see diffs in languages by doing another projection.
An interesting feature of this version control approach
is that you can commit with conflicts and the conflicts are
committed into the repository as conflicts. Unlike with text files
they don't mess up your text - you have a real data structure
present, so you can find the conflicts and fix them. The developers
use this feature to commit a conflict they can't sort out to a
branch so that developers more familiar with the conflicted area can
update to the branch and fix it.
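One way to picture "conflicts committed as conflicts" is a three-way merge over a tree in which an unresolved difference becomes an ordinary node. The Python sketch below uses nested dicts as a stand-in for the Intentional Tree; the `Conflict` class and the merge rules are assumptions for illustration, not a description of Intentional's actual mechanism, and `None` is used as a simplistic marker for "key absent".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conflict:
    """A conflict stored in the tree as ordinary data."""
    base: object
    ours: object
    theirs: object


def merge(base, ours, theirs):
    """Three-way merge of nested dicts; leaves are compared by equality."""
    if ours == theirs:
        return ours
    if ours == base:
        return theirs      # only their side changed
    if theirs == base:
        return ours        # only our side changed
    if all(isinstance(t, dict) for t in (base, ours, theirs)):
        merged = {}
        for key in sorted(base.keys() | ours.keys() | theirs.keys()):
            value = merge(base.get(key), ours.get(key), theirs.get(key))
            if value is not None:
                merged[key] = value
        return merged
    # Both sides changed the same node differently: record it, don't fail.
    return Conflict(base, ours, theirs)


def find_conflicts(tree, path=()):
    """Yield the path of every Conflict node left in a merged tree."""
    if isinstance(tree, Conflict):
        yield path
    elif isinstance(tree, dict):
        for key, value in tree.items():
            yield from find_conflicts(value, path + (key,))
```

Since the merged result is still a well-formed tree, it can be stored, branched, and queried with `find_conflicts` later by whoever knows the conflicted area best.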
The fact that editing is done on an intentional tree rather than
text also changes some other things. For example unordered
collections are tagged so that a change in the ordering of the
elements in an editor doesn't trigger a conflict. You can also
include domain-specific conflict detection and resolution
Historically, Intentional's lack of releases has been one
problem; their secrecy is another. To see anything real about the
Intentional Domain Workbench has required what Neal Ford refers to as an
UnforgivenContract. Intentional have given some public
talks, but they've really boiled down to saying "trust us, we
have some really cool technology". We'd known that indeed they had,
but couldn't explain to people why.
So I awaited the talk at DSL DevCon, given by Magnus and Shane Clifford
(their development manager), with quite some expectation. They said they
were going to finally open the curtain. Would they - and how would
They started worryingly, with the usual unrevealing Powerpoints,
but then they switched to showing the workbench and the curtain finally
opened. To gauge the reaction, take a look at
- @pandemonial Quite impressed! This is sweet!
Multiple domains, multiple langs, no question is going
- @csells OK, watching a live electrical circuit
rendered and working in a C# file is pretty damn cool.
- @jolson Two words to say about the Electronics
demo for Intentional Software: HOLY CRAPOLA. That's it, my brain has
- @gblock This is not about snazzy demos, this is about completely
changing the world we know it.
- @twleung ok, the intellisense for the actuarial formulas
is just awesome
- @lobrien This is like seeing a 100-mpg carburetor : OMG someone is going to buy this and put it in a vault!
Afterwards a couple of people said it was the most important demo
they'd ever seen, comparing it even to the Mother of all
Demos. For many there was a sense that the whole world of
software development had just changed.
(Many thanks to Chris Sells and co for organizing this conference
and inviting me to speak. They also made a video of the
So now what? There's more to all this than a demo can
reveal. Right now we want to get several of our hands on the
workbench and kick its tires - hard. Assuming it passes
that test, we want to use it on commercial projects and see how
it works for real. No system designed using the Intentional Domain Workbench
has yet gone live, and as any agilist knows you never really
understand something till you deploy it into production every week.
Shortly the other major similar workbench to this - JetBrains's
Meta Programming System - will have version 1.0 released as open-source. So this year could
well be the year when these Language Workbenches will finally step
out into the light and see their first external pilot projects. (I
should also mention that the MetaEdit workbench has been out for a
while, although it hasn't had much visibility.) I don't know whether
these workbenches will change the face of programming as we know it,
after all I once thought Smalltalk was going to be our future; but these
workbenches do have the potential to be such a profound
change. Certainly I'm excited that we're now on the next, more
public, stage of this journey.
||4 February 2009
One danger that DSL advocates need to guard against is the notion
that first you design a DSL, then people use it. Like any other
piece of software, a successful DSL will evolve. This means that
scripts written in an earlier version of a DSL may fail when run
with a later version.
Like many properties of DSLs, good and bad, this is really very
much the same as happens with a library. If you take a library from
someone and they upgrade the library, you may end up stuck. In
essence DSLs don't really do anything to change that. Your DSL
definition is essentially a PublishedInterface and you
have to deal with the consequences just the same.
This problem can be more prominent with external DSLs. Many
changes to an internal DSL can be handled through refactoring tools
(for those languages that have them). But refactoring tools won't
help with an external DSL. In practice this difference is less of an
issue than it might seem. An internal DSL with scripts that are
outside the control of the DSL implementors won't be picked up with
refactoring. So the only difference between internal and external
lies with DSL scripts within the same code base.
One technique for handling evolution of DSLs is to provide tools
that automatically migrate a DSL from one version to another. These
can be run either during an upgrade, or automatically should you try
to run an old version script against a new version.
There are two broad ways to handle migration. The first is an
incremental migration strategy. This is essentially the same notion
that's used by people doing evolutionary database
design. For every change you do to your DSL definition, create a
migration program that automatically migrates DSL scripts from the
old version to the new version.
An important part of incremental
migration is that you keep the changes as small as you can. Imagine
you are upgrading from version 1 to 2, and have ten changes you want
to make to your DSL definition. In this case, don't create just one
migration script to migrate from version 1 to 2, instead create at
least 10 scripts. Change the DSL definition one feature at a time,
and write a migration script for each change. You may find it useful
to break it down even more and add a feature with more than one step
(and thus more than one migration). The way I've described it may
sound like more work than a single script, but the point is that
migrations are much easier to write if they are small, and it's easy
to chain multiple migrations together. As a result you'll be much
faster writing ten scripts than one.
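A minimal Python sketch of chained incremental migrations follows. Each migration is a small function from script text to script text, and the list order is the order in which the DSL changes were made. The keyword renames and the semicolon rule are invented examples, and the text-level regex rewriting is a deliberate simplification (it would also touch matching words inside comments).

```python
import re


def rename_keyword(old, new):
    """Build a migration that renames one whole-word keyword."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    return lambda script: pattern.sub(new, script)


def drop_trailing_semicolons(script):
    """Migration for a (hypothetical) change that made semicolons illegal."""
    return "\n".join(line.rstrip().rstrip(";") for line in script.splitlines())


# One entry per small change to the DSL definition, oldest first.
MIGRATIONS = [
    rename_keyword("trigger", "event"),
    rename_keyword("goto", "transition_to"),
    drop_trailing_semicolons,
]


def migrate(script, applied=0):
    """Apply every migration after the first `applied` ones, in order."""
    for step in MIGRATIONS[applied:]:
        script = step(script)
    return script
```

For example, `migrate("trigger doorClosed;\ngoto active;")` yields `"event doorClosed\ntransition_to active"`, while a script that already has the first two changes can be brought up to date with `migrate(script, applied=2)`.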
The other approach is model-based migration. This is a
tactic you can use if you are using a Semantic
Model (which is something I almost always recommend). With this
approach you support multiple parsers for your language, one for
each released version. (So you only do this for version 1 and 2, not
for the intermediate steps.) Each parser populates the semantic
model. When you use a semantic model, the parser's behavior is
pretty simple, so it's not too much trouble to have several of them
around. You then run the appropriate parser for the version of
script you are working with. This handles multiple versions, but
doesn't migrate the scripts. To do the migration you write a
generator from the semantic model that generates a DSL script
representation. This way you can run the parser for a version 1
script, populate the semantic model, and then emit a version 2
script from the generator.
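Here is a small Python sketch of that arrangement, assuming a made-up state machine DSL whose version 1 writes transitions as three bare words and whose version 2 adds `on` and `=>`. The semantic model is just a list of `(source, event, target)` tuples; all names and both syntaxes are hypothetical.

```python
def parse_v1(script):
    """Version 1 syntax: '<source> <event> <target>' per line."""
    return [tuple(line.split()) for line in script.splitlines() if line.strip()]


def parse_v2(script):
    """Version 2 syntax: '<source> on <event> => <target>' per line."""
    model = []
    for line in script.splitlines():
        if line.strip():
            src, _on, ev, _arrow, dst = line.split()
            model.append((src, ev, dst))
    return model


def generate_v2(model):
    """Generator: emit a version 2 script from the semantic model."""
    return "\n".join(f"{s} on {e} => {d}" for s, e, d in model)


# One parser per released version, all populating the same model.
PARSERS = {1: parse_v1, 2: parse_v2}


def migrate_to_v2(script, version):
    """Parse with the matching parser, then regenerate as version 2."""
    return generate_v2(PARSERS[version](script))
```

Note that both parsers stay trivial because all the meaning lives in the model, and that anything the model does not capture (blank lines here, comments in a real DSL) is lost on the way through.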
One problem with the model-based approach is that it's easy to
lose stuff that doesn't matter to the semantics, but is something
that the script writers want to keep. Comments are the obvious
example. This is exacerbated if there's too much smarts in the
parser, but then the need to migrate this way may help encourage the
parsers to stay dumb - which is a Good Thing.
If the change to the DSL is big enough, you may not be able to
transform a version 1 script into a version 2 semantic model. In
which case you may need to keep a version 1 model (or intermediate
model) around and give it the ability to emit a version 2
I don't have a strong preference between these two alternatives.
Migration scripts can be run by script programmers themselves
when needed, or automatically by the DSL system. In order to run
automatically it's very useful to have the script record which
version of the DSL it is so the parser can detect it easily and
trigger the resulting migrations.
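As a sketch of version detection in Python, assume scripts carry their version in a first-line comment such as `# dsl-version: 2`. That header format, the default of version 1 for unmarked scripts, and the function names are all assumptions for illustration.

```python
import re

# Hypothetical header convention: first line is '# dsl-version: <n>'.
VERSION_LINE = re.compile(r"^#\s*dsl-version:\s*(\d+)\s*$")


def detect_version(script, default=1):
    """Read the version from the first line, or fall back to `default`."""
    lines = script.splitlines()
    match = VERSION_LINE.match(lines[0]) if lines else None
    return int(match.group(1)) if match else default


def pending_steps(script, current_version):
    """List the version numbers whose upgrade migrations still need to run."""
    found = detect_version(script)
    if found > current_version:
        raise ValueError(f"script is version {found}, newer than {current_version}")
    return list(range(found, current_version))
```

With this in place the DSL system can check every script it loads and run only the migrations it is missing, rather than relying on script authors to remember.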
||22 December 2008
One of the tricky things about writing about external
DomainSpecificLanguages is that I'm walking through
territory already heavily tracked by the programming languages
community. Programming language research has always been a popular
area of academic activity, and I'm the first to admit that I don't
have anywhere near as much depth in this topic as many people who've
been studying in this space for years. So inevitably the question
comes up as to why such a noob as me thinks he can write a book in
this well trodden ground?
The primary reason is that nobody else has written a
practitioner-oriented book on DSLs. I like topics like this that are
well-trodden but not well written about. However as I've spent time
walking these pathways I think there's another factor in the
There's a lot of work on programming languages out there, but
almost all of it has concentrated on general purpose programming
languages. DSLs are seen as a small and simple subset of general
purpose programming thinking. As a result people think that what's
true for general purpose languages is also true for DSLs (with the
implication that DSLs are too small to be worth thinking much about).
I'm increasingly of the opposite conclusion. The rules for DSLs
are different to the rules for general purpose languages - and this
applies on multiple dimensions.
The first is in language design. I was talking with a language
designer who I have a lot of respect for, and he stressed that a key
feature of languages was the ability to define new
abstractions. With DSLs I don't think this is the case. In most DSLs
the DSL chooses the abstraction you work with, if you want different
abstractions you use a different DSL (or maybe extend the DSL you're
using). Sometimes there's a role for new abstractions, but those
cases are the minority and when they do happen the abstractions are
limited. Indeed I think the lack of ability to define new
abstractions is one of the things that distinguishes DSLs from
general purpose languages.
Differences also occur in the approach that you use for
implementing the tools that go with languages. A constant issue for
general purpose languages is dealing with large inputs, since
realistic programs will have thousands or millions of lines of
code. As a result many tools and techniques for using them involve
aspects that make parsing harder to follow but support these large
inputs. DSL scripts tend to be much smaller, so these trade-offs
In my work I've put a lot of emphasis on using a DSL to populate
a Semantic Model, using that model as the basis for any further
processing: interpretation, visualization, or code generation. Lots
of language writing I've seen tends to emphasize code generation,
often generating code directly from the grammar file. Intermediate
representations are not talked about much, and when they do appear
they are more in the form of an Abstract Syntax Tree rather than a
semantic model. Serious compilers do use intermediate
representations, such as program dependence graphs, but these are
seen (rightly) as advanced topics. I think Semantic Models are a
really valuable tool in simplifying the use of a DSL, allowing you
to separate the parsing from the semantics.
Since DSLs are less expressive, you can
design a simpler language for them. Much of the language community's writing
talks about how to handle the difficulties of a complex general
purpose language, while the challenge of DSLs is to write a language
that is readable to the intended audience (which may well include
non-programmers) and also should be easy to parse (to simplify the
maintenance of the parser). Not only does this lead to different
decisions on the design of a language, it also means that you only
really need a subset of the features of parser generators.
A consequence of this is that DSLs are written with the expectation
that each individual DSL won't solve the whole problem at hand and
often you need to combine DSLs. Traditional language thinking hasn't
explored the idea of composable languages that much, but I think
this topic is very important as DSLs develop. Thinking about
composable languages should have significant effects on both language
design and language tools.
So I'm increasingly coming around to the thinking that DSLs
inspire some seriously different ways of thinking about programming
languages. It may also lead to developing different kinds of parsing
tools that are more suited for DSL work - usually tools that are
simpler. I hope the increased attention that DSLs are getting these
days will lead to more people treating DSLs as first class subjects
of study rather than a simplistic form of general purpose languages.
||15 December 2008
Will DSLs allow business people to write software rules
without involving programmers?
When people talk about DSLs it's common to raise the question of
business people writing code for themselves. I like to apply the
COBOL inference to this line of thought. That is that one of the
original aims of COBOL was to allow people to write software without
programmers, and we know how that worked out. So when any scheme is
hatched to write code without programmers, I have to ask what's
special this time that would make it succeed where COBOL (and so
many other things) have failed.
I do think that programming involves a particular mind-set, an
ability to both give precise instructions to a machine and the
ability to structure a large amount of such instructions to make a
comprehensible program. That talent, and the time involved to
understand and build a program, is why programming has resisted
being disintermediated for so long. It's also why many
"non-programming" environments end up breeding their own class of
That said, I do think that the greatest potential benefit of DSLs
comes when business people participate directly in the writing of
the DSL code. The sweet spot, however, is in making DSLs
business-readable rather than business-writable. If
business people are able to look at the DSL code and understand it,
then we can build a deep and rich communication channel between
software development and the underlying domain. Since this is the Yawning
Crevasse of Doom in software, DSLs have great value if they can
help address it.
With a business-readable DSL, programmers write the code but they
show that code frequently to business people who can understand what
it means. These customers can then make changes, maybe draft some
code, but it's the programmers who make it solid and do the
debugging and testing.
This isn't to say that there's no benefit in a business-writable
DSL. Indeed a couple of years ago some colleagues of mine built a
system that included just that, and it was much
appreciated by the business. It's just that the effort in creating a
decent editing environment, meaningful error messages, debugging and
testing tools raises the cost significantly.
While I'm quick to use the COBOL inference to diss many tools
that seek to avoid programmers, I also have to acknowledge the big
exception: spreadsheets. All over the world surprisingly big business
functions are run off the back of Excel. Serious programmers tend to
look down their noses at these, but we need to take them more
seriously and try to understand why they have been as successful as
they are. It's certainly part of the reason that drives many
LanguageWorkbench developers to provide a different
vision of software development.
||28 October 2008
Oslo is a project at Microsoft, of which various things have been
heard but with little detail until this week's PDC conference. What we
have known is that it has something to do with
ModelDrivenSoftwareDevelopment and DomainSpecificLanguages.
A couple of weeks ago I got an early peek behind the curtain as I,
and my language-geek colleague Rebecca Parsons, went through a preview
of the PDC coming-out talks with Don Box, Gio
Della-Libera and Vijaye Raji.
It was a very interesting presentation, enough to convince me that
Oslo is a technology to watch. It's broadly a Language
Workbench. I'm not going to attempt a comprehensive review of the
tool here, but just my scattered impressions from the walk-through. It
was certainly interesting enough that I thought I'd publish my
impressions here. With the public release at the PDC I'm sure you'll
be hearing a lot more about it in the coming weeks. As I describe my
thoughts I'll use a lot of the language I've been developing for my book, so you may find
the terminology a little dense.
Oslo has three main components:
- a modeling language (currently code-named M) for textual DSLs
- a design surface (named Quadrant) for graphical DSLs
- a repository (without a name) that stores semantic
models in a relational database.
(All of these names are current code names. The marketing
department will still use the same smarts that replaced "Avalon and
Indigo" with "WPF and WCF". I'm just hoping they'll rename "Windows"
to "Windows Technology Foundation".)
The textual language environment is bootstrapped and provides three base languages:
- MGrammar: defines grammars for Syntax
- MSchema: defines schemas for a Semantic Model
- MGraph: is a textual language for representing the
population of a Semantic Model. So while MSchema represents types,
MGraph represents instances. Lispers might think of MGraph as
s-expressions with an ugly syntax.
You can represent any model in MGraph, but the syntax is often not
too good. With MGrammar you can define a grammar for your own DSL
which allows you to write scripts in your own DSL and build a parser to
translate them into something more useful.
Using the state machine example from my book introduction, you
could define a state machine semantic model with MSchema. You could
then populate it (in an ugly way) with MGraph. You can build a decent
DSL to populate it using MGrammar to define the syntax and to drive a
There is a grammar compiler (called mg) that will take
an input file in MGrammar and compile it into what they call an image
file, or .mgx file. This is different to most parser generator
tools. Most parser generator tools take the grammar and generate code
which has to be compiled into a parser. Instead Oslo's tools compile
the grammar into a binary form of the parse rules. There's then a
separate tool (mgx) that can take an input script and a
compiled grammar and output the MGraph representation of the syntax
tree of the input script.
More likely you can take the compiled grammar and add it to your
own code as a resource. With this you can call a general parser
mechanism that Oslo provides as a .NET framework, supply the reference
to the compiled grammar file, and generate an in-memory syntax
tree. You can then walk this syntax tree and use it to do whatever you
will - the parsing strategy I refer to as Tree
The parser gives you a syntax tree, but that's often not the same as
a semantic model. So usually you'll write code to walk the tree and
populate a semantic model defined with MSchema. Once you've done this
you can easily take that model and store it in the repository so that
it can be accessed via SQL tools. Their demo showed entering some data
via a DSL and accessing corresponding tables in the repository,
although we didn't go into complicated structures.
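The gap between a syntax tree and a semantic model is easy to show in Python. The sketch below does not use any Oslo API; the syntax tree is a plain nested tuple standing in for whatever a parser would hand back, and the `State` class and tree layout are assumptions for illustration. The point is that the tree refers to target states by name, while the model links real objects, so populating it takes two passes.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Semantic model node; transitions map an event name to a State."""
    name: str
    # Excluded from repr and equality because the object graph is cyclic.
    transitions: dict = field(default_factory=dict, repr=False, compare=False)


def build_model(tree):
    """Walk ('machine', [('state', name, [('on', event, target), ...]), ...])."""
    _tag, state_nodes = tree
    # Pass 1: create every state so forward references can be resolved.
    states = {name: State(name) for _t, name, _arms in state_nodes}
    # Pass 2: replace target names with references to State objects.
    for _t, name, arms in state_nodes:
        for _on, event, target in arms:
            states[name].transitions[event] = states[target]
    return states


SYNTAX_TREE = ("machine", [
    ("state", "idle", [("on", "doorClosed", "active")]),
    ("state", "active", [("on", "timeout", "idle")]),
])
```

After `build_model(SYNTAX_TREE)`, following `model["idle"].transitions["doorClosed"]` gives the actual `active` state object, which is what later interpretation or code generation wants to work with.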
You can also manipulate the semantic model instance with
Quadrant. You can define a graphical notation for a schema and then
the system can project the model instance creating a diagram using
that notation. You can also change the diagram which updates the
model. They showed a demo of two graphical projections of a model,
updating one updated the other using Observer
Synchronization. In that way using Quadrant seems like a similar
style of work to a graphical Language Workbench such as MetaEdit.
As they've been developing Oslo they have been using it on other
Microsoft projects to gain experience in its use. Main ones so far
have been with ASP, Workflow, and web services.
More on M
We spent most of the time looking at the textual environment. They have a
way of hooking up a compiled grammar to a text editing control to
provide a syntax-aware text editor with various completion and
highlighting goodness. Unlike tools such as MPS, however, it is
still a text editor. As a result you can cut and paste stretches of
text and manipulate text freely. The tool will give you squigglies if
there's a problem parsing what you've done, but it preserves the
editing text experience.
I think I like this. When I first came across it, I rather liked
the MPS notion of: "it looks like text, but really it's a structured
editor". But recently I've begun to think that we lose a lot that
way, so the Oslo way of working is appealing.
Another nice text language tool they have is an editor to help
write MGrammars. This is a window divided into three vertical
panes. The center pane contains MGrammar code, the left pane contains
some input text, and the right pane shows the MGraph representation of
parsing the input text with the MGrammar. It's very example
driven. (However it is transient, unlike tests.) The tool resembles
the capability in Antlr to process sample text right away with a
grammar. In the conversation Rebecca referred to
this style as "anecdotal testing" which is a phrase I must remember to
The parsing algorithm they use is a GLR parser. The grammar syntax
is comparable to EBNF and has notation for Tree Construction expressions. They
use their own variant of regex notation in the lexer to be more
consistent with their other tools, which will probably throw people
like me more used to ISO/Perl regexp notation. It's mostly similar,
but different enough to be annoying.
One of the nice features of their grammar notation is
that they have provided constructs to easily make parameterized rules -
effectively allowing you to write rule subroutines. Rules can also be
given attributes (aka annotations), in a similar way to .NET's
language attributes. So you can make a whole language case insensitive
by marking it with an attribute. (Interestingly they use "@" to mark
an attribute, as in the Java syntax.)
The default way a grammar is run is to do tree construction. As it
turns out the tree construction is the behavior of the default class
that gets called by the grammar while it's processing some input. This
class has an interface and you can write your own class that
implements this. This would allow you to do embedded translation and
embedded interpretation. It's not the same as code actions, as the
action code isn't in the grammar, but in this other class. I reckon
this could well be better since the code inside actions often swamp
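The general pattern of a parser that reports to a replaceable handler can be sketched in Python. This is not Oslo's interface; the handler method names, the two handler classes, and the toy "sum" language are all invented to illustrate the idea. The parser itself contains no action code: swapping the handler changes whether it builds a tree or interprets the input as it goes.

```python
class TreeBuilder:
    """Default-style handler: build a syntax tree."""

    def number(self, text):
        return ("num", text)

    def add(self, left, right):
        return ("add", left, right)


class Evaluator:
    """Alternative handler: interpret while parsing."""

    def number(self, text):
        return int(text)

    def add(self, left, right):
        return left + right


def parse_sum(source, handler):
    """Parse 'n + n + ...' and report each reduction to the handler."""
    tokens = source.split()
    if not tokens:
        raise SyntaxError("empty input")
    result = handler.number(tokens[0])
    for i in range(1, len(tokens), 2):
        if tokens[i] != "+" or i + 1 >= len(tokens):
            raise SyntaxError(f"unexpected input near token {i}")
        result = handler.add(result, handler.number(tokens[i + 1]))
    return result
```

Calling `parse_sum("1 + 2 + 3", TreeBuilder())` returns a nested tuple tree, while `parse_sum("1 + 2 + 3", Evaluator())` returns `6`, with the grammar logic untouched in both cases.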
They talked a bit about the ability to embed one language in
another and switch the parsers over to handle this gracefully -
heading into territory that's been explored by Converge. We didn't look at this deeply
but that would be interesting.
An interesting tidbit they mentioned was that originally they
intended to only have the tools for graphical languages. However they
found that graphical languages just didn't work well for many problems
- including defining schemas. So they developed the textual tools.
(Here's a thought for the marketing department. If you stick with
the name "M" you could use this excellent film for
marketing inspiration ;-))
Plainly this tool hovers in the same space as tools like
Intentional Software and JetBrains MPS that I dubbed as Language
Workbenches in 2005. Oslo doesn't exactly fit the definition for a
language workbench that I gave back then. In particular the textual
component isn't a projectional editor and you don't have to use a
storage representation based on the abstract representation (semantic
model), instead you can store the textual source in a more
conventional style. This lesser reliance on a persistent abstract
representation is similar to Xtext. At some point I really need to
rethink what I consider the defining elements of a Language Workbench
to be. For the moment let's just say that Xtext and Oslo feel like
Language Workbenches and until I revisit the definition I'll treat
them as such.
One particularly interesting point in this comparison is comparing
Oslo with Microsoft's
DSL tools. They are different tools with a lot of overlap, which
makes you wonder if there's a place for both of them. I've heard vague
"they fit together" phrases, but am yet to be convinced. It could be
one of those situations (common in big companies) where multiple
semi-competing projects are developed. Eventually this could lead to
one being shelved. But it's hard to speculate about this as much
depends on corporate politics and it's thus almost impossible to get a
straight answer out of anyone (and even if you do, it's even harder to
tell if it is a straight answer).
The key element that Oslo shares with its cousins is that it
provides a toolkit to define new languages, integrate them together,
and define tooling for those languages. As a result you get the
freedom of syntax of external DomainSpecificLanguages
with decent tooling - something that deals with one of the main
disadvantages of external DSLs.
Oslo supports both textual and graphical DSLs and seems to do so
reasonably evenly (although we spent more time on the textual). In
this regard it seems to provide more variety than MPS and Intentional
(structured textual) and MetaEdit/Microsoft's DSL tools (graphical). It's also
different in its textual support in that it provides real free text
input not the highly structured text input of Intentional/MPS.
Using a compiled grammar that plugs into a text editor strikes me
as a very nice route for supporting entering DSL scripts. Other tools
either require you to have the full language workbench machinery or to
use code generation to build editors. Passing around a representation
of the grammar that I could plug into an editor strikes me as a good
way to do it. Of course if that language workbench is Open Source (as
I'm told MPS will be), then that may make this issue moot.
One of the big issues with storing stuff like this in a repository
is handling version control. The notion that we can all collaborate on
a single shared database (the moral equivalent of a team editing one
copy of its code on a shared drive) strikes me as close to
irresponsible. As a result I tend to look askance at any vendors who
suggest this approach. The Oslo team suggests, wisely, that you treat
the text files as the authoritative source which allows you to use
regular version control tools. Of course the bad news for many
Microsoft shops would be that this tool is TFS (or, god-forbid, VSS),
but the great advantage of using plain text files as your source is
that you can use any of the multitude of version control systems to manage them.
One general thing I liked was that most of the tools leant towards
run-time interpretation rather than code generation and
compilation. Traditionally parser generators and many language
workbenches assume you are going to generate code from your models
rather than interpreting them. Code generation is all very well, but
it always has this messy feel to it - and tends to lead to all sorts
of ways to trip you up. So I do prefer the run-time emphasis.
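As a rough illustration of the difference, here is a small Python sketch under my own assumptions: the rule model, the operator table, and the function names are all hypothetical and have nothing to do with Oslo's actual tools. The interpreter walks the model directly each time it is called, while the generator has to emit source text that is then compiled in a separate step.

```python
# Hypothetical model: a list of (field, operator, threshold) conditions.
RULES = [("age", ">=", 18), ("balance", ">", 0)]

# Operators the interpreter understands.
OPS = {
    ">=": lambda a, b: a >= b,
    ">": lambda a, b: a > b,
}


def interpret(rules, record):
    """Run-time interpretation: walk the model on every call."""
    return all(OPS[op](record[name], limit) for name, op, limit in rules)


def generate(rules):
    """Code generation: emit source text that must then be compiled."""
    tests = " and ".join(
        f"record[{name!r}] {op} {limit!r}" for name, op, limit in rules
    )
    return f"def check(record):\n    return {tests}\n"


# The extra build step that code generation requires.
namespace = {}
exec(generate(RULES), namespace)
check = namespace["check"]
```

Both routes give the same answers for the same model. The generated route adds an artifact (the emitted source) that has to be kept in step with the model, which is where much of the messy feel comes from.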
It was only a couple of hours, so I can't make any far-reaching
judgements about Oslo. I can, however, say it looks like some very
interesting technology. What I like about it is that it seems to
provide a good pathway to using language workbenches. Having Microsoft
behind it would be a big deal although we do need to
remember that all sorts of things were promised about Longhorn that
never came to pass. But all in all I think this is an interesting
addition to the Language Workbench scene and a tool that could make
DSLs much more prevalent.
||9 September 2008
I was asked to put together a discussion of DSLs for
non-technical types. Maybe I've been reading too much Stephen O'Grady, but I felt an
irresistible urge to do it in a Q and A manner. So here it is.
What is a Domain Specific Language?
A Domain Specific Language (DSL) is a computer
programming language of limited expressiveness focused on a
particular domain. Most languages you hear of are General
Purpose Languages, which can handle most things you run into
during a software project. Each DSL can only handle one specific
aspect of a system.
So you wouldn't write a whole project in a DSL?
No. Most projects will use one general purpose language
and several DSLs.
Are they a new idea?
Not at all. DSLs have been used extensively in Unix
circles since the early days of that system. The lisp community
often talks of creating DSLs in lisp and then using the DSLs to
implement the logic. Most IT projects use several DSLs - you
might have heard of things like CSS, SQL, regular expressions and the like.
So why are they getting a lot of noise now?
Probably because of Ruby and Rails. Ruby as a language has
many features that make it easy to develop DSLs and the people
who got involved in the Ruby community have been familiar with
this approach from elsewhere, so they took advantage of these
features. In particular Rails uses several DSLs which have
played a big role in making it so easy to use. This in turn
encouraged more people to take up these ideas.
Another reason is that many Java and C# systems need to have
some of their behavior defined in a more dynamic way. This led
to complex XML files that are difficult to comprehend, which in
turn led to people exploring DSLs again.
So DSLs can be used with languages other than Ruby?
Yes, as I indicated DSLs have been around for much
longer than Ruby has. Ruby has an unobtrusive syntax and
meta-programming features that make it easier to create more
elegant internal DSLs than languages like C# and Java. But there
are useful internal DSLs in Java and C#.
What's the distinction between internal and external DSLs?
An internal DSL is just a particular idiom of
writing code in the host language. So a Ruby internal DSL is
Ruby code, just written in a particular style which gives a more
language-like feel. As such they are often called Fluent
Interfaces or Embedded DSLs. An external DSL
is a completely separate language that is parsed into data that
the host language can understand.
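A small sketch may help here. It uses Python as the host language, and the scheduling domain, the Schedule model, and the script syntax are all invented for illustration. The internal DSL is ordinary Python written as a fluent interface; the external DSL is a separate little language that gets parsed into the same data.

```python
from dataclasses import dataclass, field


@dataclass
class Schedule:
    """Hypothetical model that both DSL styles populate."""

    events: list = field(default_factory=list)


class ScheduleBuilder:
    """Internal DSL: plain host-language code in a fluent style."""

    def __init__(self):
        self._schedule = Schedule()

    def event(self, name):
        self._schedule.events.append({"name": name})
        return self  # returning self is what allows chaining

    def at(self, time):
        self._schedule.events[-1]["time"] = time
        return self

    def build(self):
        return self._schedule


def parse_schedule(script):
    """External DSL: parse lines such as 'event standup at 09:30'."""
    schedule = Schedule()
    for line in script.strip().splitlines():
        words = line.split()
        if len(words) != 4 or words[0] != "event" or words[2] != "at":
            raise ValueError(f"cannot parse: {line!r}")
        schedule.events.append({"name": words[1], "time": words[3]})
    return schedule


internal = (
    ScheduleBuilder()
    .event("standup").at("09:30")
    .event("demo").at("16:00")
    .build()
)

external = parse_schedule("""
event standup at 09:30
event demo at 16:00
""")
```

Both routes end up with the same Schedule. The internal form is limited by the host language's syntax, while the external form is free to look however you like but needs its own parser.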
Why are people interested in DSLs?
I see DSLs as having two main benefits. The most common
benefit is that they make certain kinds of code easier to
comprehend, which makes it much easier to modify, thus
improving programmer productivity. This is worthwhile all on
its own and is relatively easy to achieve.
The most interesting benefit, however, is that a well designed
DSL can be understandable by business people, allowing them to
directly comprehend the code that implements their business rules.
So is this the hook - business people write the rules themselves?
In general I don't think so. It's a lot of work to make
an environment that allows business people to write their own
rules. You have to make a comfortable editing tool, debugging
tools, testing tools, and so on. You get most of the benefit of
business facing DSLs by doing enough to allow business people to
be able to read the rules. They can then review them for
accuracy, talk about them with the developers and draft changes
for developers to implement properly. Getting DSLs to be
business readable is far less effort than business writable, but
yields most of the benefits. There are times where it's worth
making the effort to make the DSLs business-writable, but it's a
more advanced goal.
Do you need special (ie expensive) tools?
In general, no. Internal DSLs just use the regular
facilities of the programming language that you are using
anyway. External DSLs do require you to use some special tools
- but these are open source and are very mature. The biggest
problem with these tools is that most developers aren't
familiar with them and believe they are harder to use than
they really are (a problem exacerbated by poor documentation).
There are exceptions on the horizon, however. These are a
class of tools that I call a
LanguageWorkbench. These tools allow you to define
DSLs more easily, and also provide sophisticated editors for
them. Tools like this make it more feasible to make business-writable DSLs.
So is this a repeat of the dream of developing
software without programming (or programmers)?
That was the intent of COBOL, and I don't think there's
any reason to think that DSLs will succeed where COBOL (and so
many others) failed. What I think is important is that DSLs
allow business people and developers to collaborate more
effectively because they can talk about a common set of precise
rules that are the executable code.
When should I consider making a DSL?
When you are looking at an aspect of a system with rich
business rules or work-flow. A well-written DSL should allow
customers to understand the rules by which the system works.
Isn't this going to lead to a cacophony of languages
that people will find hard to learn?
We already have a cacophony of frameworks that
programmers have to learn. That's the inevitable consequence of
reusable software, which is the only way we can get a handle on
all the things software has to do these days. In essence a DSL
is nothing more than a fancy facade over a framework. As a
result they contribute little complexity over what is already
there. Indeed a good DSL should make things better by making
these frameworks easier to use.
But won't people create lots of bad DSLs?
Of course, just like people create bad frameworks. But
again I'd argue that bad DSLs don't do much additional harm
compared to the cost of bad frameworks.