Dsl bliki



IntentionalSoftware dsl 20 April 2009

Several years ago, my then colleague Matt Foemmel, dissatisfied with the tools with which we were building software, managed to get in touch with Charles Simonyi to find out more about the shadowy Intentional Software. What he saw impressed him, and he persuaded me and other ThoughtWorkers to get involved too. What we saw was a tool with startling potential, but we remained frustrated by the secrecy and lack of urgency to release. That frustration ended last week.

Last week I set off for Chris Sells's DSL DevCon, and Magnus Christerson - Intentional's product manager - suggested I pop in to see how they were getting on. After several years of "Real Soon Now", I was unsure, but Rebecca Parsons, my colleague who has been keeping in regular contact with Intentional, said that now would be a good time.

I spent a fascinating and exciting day at their office in Bellevue. It's not that I saw anything particularly new - these were all ideas and capabilities that had been around for a while - but there was a realness and maturity that I hadn't seen before. Indeed Intentional had released a version 1.0 of their product a few weeks earlier. The usual approach is to trumpet a version 1.0 release of a ground-breaking product from the mountaintops. Only Intentional would make such a release and not bother to tell anyone. Indeed as I write this there's no mention of their product on their website - if you want more information you have to talk to them.

What's There

This isn't a comprehensive discussion of their tool (called the Intentional Domain Workbench); I haven't had time to put something like that together. But I hope my scattered thoughts and observations will be interesting. The Intentional Domain Workbench is a LanguageWorkbench - indeed it's one of the systems that led me to coin that term. A Language Workbench is a tool that allows people to design DomainSpecificLanguages: not simply to parse them, but to build a comprehensive environment that includes rich editing. In Intentional's case this is a ProjectionalEditing environment.

One of the examples they have is the state machine example I use for my book introduction. The workbench allows you to define the schema of the state machine's semantic model in its schema definition language. In order to manipulate state machines you define projections of the semantic model. One of the striking features of the Intentional Domain Workbench is its ability to support multiple projections of the same semantic model. For the state machine example they've defined projections in several of the DSLs I've used in discussing the example: XML, custom syntax, and Ruby. All three of these projections are reversible, meaning that you can edit through them, updating the semantic model and the other projections. Switching between the projections is just a matter of selecting a menu item.
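To ground the discussion, here's a rough C# sketch of the kind of state machine Semantic Model such a schema would describe. The class and member names are purely illustrative - this is my own minimal sketch, not Intentional's schema language.

    // A minimal state machine Semantic Model, in the spirit of the book's example.
    // All names here are illustrative.
    using System.Collections.Generic;

    public class StateMachine
    {
        public State Start { get; }
        public StateMachine(State start) => Start = start;
    }

    public class State
    {
        public string Name { get; }
        private readonly Dictionary<string, State> transitions = new();

        public State(string name) => Name = name;

        // Wire an event code to the state it leads to.
        public void AddTransition(string eventCode, State target) =>
            transitions[eventCode] = target;

        // Returns null if the event isn't handled in this state.
        public State TargetFor(string eventCode) =>
            transitions.TryGetValue(eventCode, out var target) ? target : null;
    }

Each projection - XML, custom syntax, or Ruby - is then just a different way of reading and editing the same populated model.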

They also had read-only projections in fluent C#, command-query C, and a state machine diagram. Although they hadn't set up the diagram to be editable, the workbench can handle editable diagrammatic representations. In another example they show an electronic circuit which is editable in both a tree structured property sheet projection and in a circuit diagram projection.

The circuit diagram also showed another really powerful feature of the workbench - the ability to fluidly integrate example executions with the program definition. In the electronic circuit case, this means that you can give the various elements of the circuit properties and the model will calculate the impedance of various parts of the circuit and display them as you are editing the circuit. Of course you can build a custom program to do this kind of thing - but the point is that this behavior comes easily as part of a DSL definition in the workbench.

Combining example execution with program definition is one of the features of spreadsheets - and may be a reason why spreadsheets have become so successful as an environment for LayProgrammers. It's also a notion that's been propelling many of Jonathan Edwards's interesting and wild ideas. My sense is that interesting DSLs in language workbenches will have this characteristic, particularly if they are aimed at being used by lay-programmers.

Another way that you can combine execution with specification is with test cases. They have an example of a pension workbench, built with Capgemini, that allows actuaries to enter formulas using full mathematical notation, together with FIT-like tables to show test cases. These test cases are live, with the appropriate red/green behavior as you edit the formulas.
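To give a flavor of what "live" test cases mean, here's a small C# sketch of the underlying idea: FIT-style rows evaluated against a formula, turning red or green. The names, the formula signature, and the tolerance are all my own invention, not anything from the Pension Workbench.

    using System;
    using System.Collections.Generic;

    // A FIT-style table row: inputs plus the expected result.
    public record TestRow(double Salary, int YearsOfService, double ExpectedBenefit);

    public static class FormulaTests
    {
        // Re-run whenever the formula under test is edited, so the table stays "live".
        public static void Run(Func<double, int, double> benefitFormula, IEnumerable<TestRow> rows)
        {
            foreach (var row in rows)
            {
                double actual = benefitFormula(row.Salary, row.YearsOfService);
                bool pass = Math.Abs(actual - row.ExpectedBenefit) < 0.01;
                string status = pass ? "GREEN" : "RED";
                Console.WriteLine($"{status}: expected {row.ExpectedBenefit}, got {actual}");
            }
        }
    }

The workbench does the equivalent continuously as you edit, rather than waiting for an explicit test run.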

The pension workbench also illustrates the combination of multiple languages. When you look at a pension document on the screen, you're looking at three independent languages: word processed text for the prose, mathematical notation for the formulae, and test case tables. These languages are developed independently but integrated in the workbench's core data structure (called the Intentional Tree). This integration extends to the execution too - you can step into a test case and delve into the intermediate values in the mathematical formulae.

In order to make these things run, you have to include behavior with the semantic model. Intentional have developed their own general purpose language, whose working name is CL1, to do this. CL1 can look like a superset of C#, but such a view is again a projection of the core semantic model. I found it interesting that this is a similar feature to JetBrains MPS, which has its "base language" that projects into a Java-like general purpose language. Increasingly, these tools are themselves programmed using this in-workbench general purpose language.

The intended way of working is that developers use the Intentional Domain Workbench to build a domain-specific workbench. Intentional provide a runtime (the Intentional Domain Runtime) so these domain-specific workbenches can run without the language-editing capabilities. So Capgemini used the Intentional Domain Workbench to build the Pension Workbench as their own product. The Intentional Domain Workbench allows you to define new model schemas and projections, while the Pension Workbench allows you to build pension plans using these languages.

The Intentional system lives primarily in the .NET ecosystem. Both the workbench and runtime run on the CLR and core parts of them are written in C#. The workbench makes it really easy to generate .NET assemblies that can be automatically loaded into the workbench for testing or run with the runtime. Custom workbenches can generate code for any environment, and Intentional have done some work with another partner that involves generating Java code, so that people can specify behavior in the custom workbench and then deploy the resulting system in a Java environment.

An interesting aspect of the implementation is that they handle representational transformations by using lots of little transformations rather than one large one. As an example, generating C# code from a semantic model involves about a dozen small transforms lined up in a pipeline, much like a multi-stage compiler, with the last step being a transformation from a C# AST to text. Much of their internal design goes into making this approach efficient, so you can happily string together a lot of small transforms without worrying about any efficiency cost. A further consequence is that the pipeline of transforms for code generation is very similar to that used for editing projections.
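Here's a rough C# sketch of that pipeline style, with made-up type names, to show why small transforms compose so cheaply.

    // A code-generation pipeline built from many small model-to-model transforms,
    // the last of which renders a C#-like AST to text. Types are illustrative.
    public interface ITransform<TIn, TOut>
    {
        TOut Apply(TIn input);
    }

    public static class TransformChaining
    {
        // Chain two transforms into one; repeated chaining gives the multi-stage
        // pipeline, e.g. modelToAst.Then(normalizeAst).Then(astToText).
        public static ITransform<TIn, TOut> Then<TIn, TMid, TOut>(
            this ITransform<TIn, TMid> first, ITransform<TMid, TOut> second) =>
            new Composite<TIn, TMid, TOut>(first, second);

        private class Composite<TIn, TMid, TOut> : ITransform<TIn, TOut>
        {
            private readonly ITransform<TIn, TMid> first;
            private readonly ITransform<TMid, TOut> second;

            public Composite(ITransform<TIn, TMid> first, ITransform<TMid, TOut> second)
            {
                this.first = first;
                this.second = second;
            }

            public TOut Apply(TIn input) => second.Apply(first.Apply(input));
        }
    }

Each stage stays small enough to understand and test in isolation, which is the point of the approach.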

A common problem with tools that use projectional editing is how they deal with version control. Often the answer is to just let multiple people edit the same store simultaneously, which makes many serious developers quake. The Intentional Domain Workbench has a built-in version control mechanism that records all the changes made to the Intentional Tree and can do commits and merges at the tree level. You then see diffs in languages by doing another projection.

An interesting feature of this version control approach is that you can commit with conflicts, and the conflicts are committed into the repository as conflicts. Unlike with text files, conflicts don't mess up your text: you still have a real data structure, so you can find the conflicts and fix them. The developers use this feature to commit a conflict they can't sort out to a branch, so that developers more familiar with the conflicted area can update to the branch and fix it.

The fact that editing is done on an Intentional Tree rather than text also changes some other things. For example, unordered collections are tagged so that a change in the ordering of the elements in an editor doesn't trigger a conflict. You can also include domain-specific conflict detection and resolution behavior.
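As a sketch of what such domain-aware merging can look like - my own code, not Intentional's - treating a tagged unordered collection as a set makes reorderings vanish and lets additions and removals from two branches combine cleanly.

    using System.Collections.Generic;
    using System.Linq;

    public static class UnorderedMerge
    {
        // True if an edit only reordered elements; such an edit needs no merge at all.
        public static bool IsReorderOnly(IEnumerable<string> baseline, IEnumerable<string> edited) =>
            new HashSet<string>(baseline).SetEquals(edited);

        // Merge two branches of an unordered collection: keep additions from both sides,
        // drop removals from both sides, and ignore ordering entirely.
        public static ISet<string> Merge(ISet<string> baseline, ISet<string> left, ISet<string> right)
        {
            var result = new HashSet<string>(baseline);
            result.UnionWith(left.Except(baseline));      // additions made on the left branch
            result.UnionWith(right.Except(baseline));     // additions made on the right branch
            result.ExceptWith(baseline.Except(left));     // removals made on the left branch
            result.ExceptWith(baseline.Except(right));    // removals made on the right branch
            return result;
        }
    }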

Going Public

Historically, Intentional's lack of releases has been one problem; their secrecy is another. To see anything real about the Intentional Domain Workbench has required what Neal Ford refers to as an UnforgivenContract. Intentional have given some public talks, but they've really boiled down to saying "trust us, we have some really cool technology". We'd known that indeed they had, but we couldn't explain to people why.

So I awaited the talk at DSL DevCon, given by Magnus and Shane Clifford (their development manager), with quite some expectation. They said they were going to finally open the curtain. Would they - and how would people react?

They started worryingly, with the usual unrevealing Powerpoints, but then they switched to showing the workbench and the curtain finally opened. To gauge the reaction, take a look at Twitter.

  • @pandemonial Quite impressed! This is sweet! Multiple domains, multiple langs, no question is going unanswered
  • @csells OK, watching a live electrical circuit rendered and working in a C# file is pretty damn cool.
  • @jolson Two words to say about the Electronics demo for Intentional Software: HOLY CRAPOLA. That's it, my brain has finally exploded.
  • @gblock This is not about snazzy demos, this is about completely changing the world we know it.
  • @twleung ok, the intellisense for the actuarial formulas is just awesome
  • @lobrien This is like seeing a 100-mpg carburetor : OMG someone is going to buy this and put it in a vault!

Afterwards a couple of people said it was the most important demo they'd ever seen, comparing it even to the Mother of all Demos. For many there was a sense that the whole world of software development had just changed.

(Many thanks to Chris Sells and co for organizing this conference and inviting me to speak. They also made a video of the talk available.)

So now what? There's more to all this than a demo can reveal. Right now we want to get several of our hands on the workbench and kick its tires - hard. Assuming it passes that test, we want to use it on commercial projects and see how it works for real. No system designed using the Intentional Domain Workbench has yet gone live, and as any agilist knows you never really understand something till you deploy it into production every week.

Shortly the other major similar workbench - JetBrains's Meta Programming System - will have its version 1.0 released as open source. So this year could well be the year when these Language Workbenches finally step out into the light and see their first external pilot projects. (I should also mention that the MetaEdit workbench has been out for a while, although it hasn't had much visibility.) I don't know whether these workbenches will change the face of programming as we know it - after all, I once thought Smalltalk was going to be our future - but these workbenches do have the potential to be such a profound change. Certainly I'm excited that we're now on the next, more public, stage of this journey.


DslMigration dsl 4 February 2009

One danger that DSL advocates need to guard against is the notion that first you design a DSL, then people use it. Like any other piece of software, a successful DSL will evolve. This means that scripts written in an earlier version of a DSL may fail when run with a later version.

Like many properties of DSLs, good and bad, this is really very much the same as what happens with a library. If you take a library from someone and they upgrade the library, you may end up stuck. In essence DSLs don't really do anything to change that. Your DSL definition is essentially a PublishedInterface and you have to deal with the consequences just the same.

This problem can be more prominent with external DSLs. Many changes to an internal DSL can be handled through refactoring tools (for those languages that have them), but refactoring tools won't help with an external DSL. In practice this problem is less of an issue than it might seem: scripts for an internal DSL that are outside the control of the DSL implementors won't be picked up by refactoring either. So the only difference between internal and external lies with DSL scripts within the same code base.

One technique for handling evolution of DSLs is to provide tools that automatically migrate a DSL from one version to another. These can be run either during an upgrade, or automatically should you try to run an old version script against a new version.

There are two broad ways to handle migration. The first is an incremental migration strategy. This is essentially the same notion that's used by people doing evolutionary database design. For every change you do to your DSL definition, create a migration program that automatically migrates DSL scripts from the old version to the new version.

An important part of incremental migration is that you keep the changes as small as you can. Imagine you are upgrading from version 1 to 2, and have ten changes you want to make to your DSL definition. In this case, don't create just one migration script to migrate from version 1 to 2; instead create at least ten scripts. Change the DSL definition one feature at a time, and write a migration script for each change. You may find it useful to break it down even further and add a feature in more than one step (and thus with more than one migration). The way I've described it may sound like more work than a single script, but the point is that migrations are much easier to write if they are small, and it's easy to chain multiple migrations together. As a result you'll be much faster writing ten scripts than one.
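Here's a minimal C# sketch of the chaining idea, assuming each migration is a simple text-to-text function tagged with the version it upgrades to (all names are mine, for illustration only).

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // One tiny migration per change to the DSL definition; to upgrade a script,
    // run every migration above the script's current version, in order.
    public record Migration(int ToVersion, Func<string, string> Upgrade);

    public static class DslMigrator
    {
        public static string MigrateTo(string script, int scriptVersion, int targetVersion,
                                       IEnumerable<Migration> migrations)
        {
            var steps = migrations
                .Where(m => m.ToVersion > scriptVersion && m.ToVersion <= targetVersion)
                .OrderBy(m => m.ToVersion);
            foreach (var step in steps)
                script = step.Upgrade(script);   // each step is small and easy to write
            return script;
        }
    }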

The other approach is model-based migration. This is a tactic you can use if you are using a Semantic Model (which is something I almost always recommend). With this approach you support multiple parsers for your language, one for each released version. (So you only do this for version 1 and 2, not for the intermediate steps.) Each parser populates the semantic model. When you use a semantic model, the parser's behavior is pretty simple, so it's not too much trouble to have several of them around. You then run the appropriate parser for the version of script you are working with. This handles multiple versions, but doesn't migrate the scripts. To do the migration you write a generator from the semantic model that generates a DSL script representation. This way you can run the parser for a version 1 script, populate the semantic model, and then emit a version 2 script from the generator.
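In outline - with illustrative type names rather than anything from a real tool - the model-based approach looks like this:

    // One parser per released DSL version populates the same Semantic Model;
    // a single generator emits the latest syntax from that model.
    public class SemanticModel { /* the domain objects live here */ }

    public interface IDslParser
    {
        SemanticModel Parse(string script);   // e.g. a Version1Parser and a Version2Parser
    }

    public class Version2ScriptGenerator
    {
        public string Generate(SemanticModel model) =>
            /* walk the model and print version-2 syntax */ "";
    }

    public static class ModelBasedMigrator
    {
        // Migrate a version-1 script: parse with the old parser, emit with the new generator.
        public static string Migrate(string v1Script, IDslParser v1Parser,
                                     Version2ScriptGenerator generator) =>
            generator.Generate(v1Parser.Parse(v1Script));
    }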

One problem with the model-based approach is that it's easy to lose stuff that doesn't matter to the semantics but is something that the script writers want to keep. Comments are the obvious example. This is exacerbated if there's too much smarts in the parser, though the need to migrate this way may help encourage the parsers to stay dumb - which is a Good Thing.

If the change to the DSL is big enough, you may not be able to transform a version 1 script into a version 2 semantic model. In which case you may need to keep a version 1 model (or intermediate model) around and give it the ability to emit a version 2 script.

I don't have a strong preference between these two alternatives.

Migration scripts can be run by script programmers themselves when needed, or automatically by the DSL system. In order to run them automatically it's very useful to have the script record which version of the DSL it was written for, so the parser can detect it easily and trigger the necessary migrations.
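For the automatic case, here's a minimal sketch of one possible convention - entirely my own invention - where the script declares its version in a header line that the parser reads before deciding which migrations to run.

    using System.Text.RegularExpressions;

    // A script declares its DSL version in a header line, e.g. "#dsl-version: 3".
    public static class VersionDetector
    {
        private static readonly Regex Header =
            new(@"^#dsl-version:\s*(\d+)", RegexOptions.Multiline);

        public static int VersionOf(string script)
        {
            var match = Header.Match(script);
            return match.Success ? int.Parse(match.Groups[1].Value) : 1;  // assume the first release
        }
    }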


DslExceptionalism dsl 22 December 2008

One of the tricky things about writing about external DomainSpecificLanguages is that I'm walking through territory already heavily tracked by the programming languages community. Programming language research has always been a popular area of academic activity, and I'm the first to admit that I don't have anywhere near the depth in this topic of the many people who've been studying this space for years. So inevitably the question comes up as to why such a noob as me thinks he can write a book on this well-trodden ground.

The primary reason is that nobody else has written a practitioner-oriented book on DSLs. I like topics like this that are well-trodden but not well written about. However, as I've spent time walking these pathways I think there's another factor at work.

There's a lot of work on programming languages out there, but almost all of it has concentrated on general purpose programming languages. DSLs are seen as a small and simple subset of general purpose programming thinking. As a result people think that what's true for general purpose languages is also true for DSLs (with the implication that DSLs are too small to be worth thinking much about).

I'm increasingly of the opposite conclusion. The rules for DSLs are different to the rules for general purpose languages - and this applies on multiple dimensions.

The first is in language design. I was talking with a language designer who I have a lot of respect for, and he stressed that a key feature of languages was the ability to define new abstractions. With DSLs I don't think this is the case. In most DSLs the DSL chooses the abstraction you work with, if you want different abstractions you use a different DSL (or maybe extend the DSL you're using). Sometimes there's a role for new abstractions, but those cases are the minority and when they do happen the abstractions are limited. Indeed I think the lack of ability to define new abstractions is one of the things that distinguishes DSLs from general purpose languages.

Differences also occur in the approach that you use for implementing the tools that go with languages. A constant issue for general purpose languages is dealing with large inputs, since realistic programs will have thousands or millions of lines of code. As a result, many tools, and the techniques for using them, accept making parsing harder to follow in order to support these large inputs. DSL scripts tend to be much smaller, so these trade-offs work out differently.

In my work I've put a lot of emphasis on using a DSL to populate a Semantic Model, using that model as the basis for any further processing: interpretation, visualization, or code generation. Lots of the language writing I've seen tends to emphasize code generation, often generating code directly from the grammar file. Intermediate representations are not talked about much, and when they do appear they are more in the form of an Abstract Syntax Tree than a semantic model. Serious compilers do use intermediate representations, such as program dependence graphs, but these are seen (rightly) as advanced topics. I think Semantic Models are a really valuable tool in simplifying the use of a DSL, allowing you to separate the parsing from the semantics.

Since DSLs are less expressive, you can design a simpler language for them. Much of the language community's writing talks about how to handle the difficulties of a complex general purpose language, while the challenge with DSLs is to write a language that is readable to the intended audience (which may well include non-programmers) and is also easy to parse (to simplify the maintenance of the parser). Not only does this lead to different decisions on the design of a language, it also means that you only really need a subset of the features of parser generators.

A consequence of this is that DSLs are written with the expectation that each individual DSL won't solve the whole problem at hand, and often you need to combine DSLs. Traditional language thinking hasn't explored the idea of composable languages that much, but I think this topic is very important as DSLs develop. Thinking about composable languages should have significant effects on both language design and language tools.

So I'm increasingly coming around to the thinking that DSLs inspire some seriously different ways of thinking about programming languages. It may also lead to developing different kinds of parsing tools that are more suited for DSL work - usually tools that are simpler. I hope the increased attention that DSLs are getting these days will lead to more people treating DSLs as first class subjects of study rather than a simplistic form of general purpose languages.


BusinessReadableDSL dsl 15 December 2008

Will DSLs allow business people to write software rules without involving programmers?

When people talk about DSLs it's common to raise the question of business people writing code for themselves. I like to apply the COBOL inference to this line of thought: one of the original aims of COBOL was to allow people to write software without programmers, and we know how that worked out. So when any scheme is hatched to write code without programmers, I have to ask what's special this time that would make it succeed where COBOL (and so many other things) have failed.

I do think that programming involves a particular mind-set: the ability both to give precise instructions to a machine and to structure a large amount of such instructions into a comprehensible program. That talent, and the time involved to understand and build a program, is why programming has resisted being disintermediated for so long. It's also why many "non-programming" environments end up breeding their own class of programmers-in-fact.

That said, I do think that the greatest potential benefit of DSLs comes when business people participate directly in the writing of the DSL code. The sweet spot, however, is in making DSLs business-readable rather than business-writable. If business people are able to look at the DSL code and understand it, then we can build a deep and rich communication channel between software development and the underlying domain. Since this is the Yawning Crevasse of Doom in software, DSLs have great value if they can help address it.

With a business-readable DSL, programmers write the code but they show that code frequently to business people who can understand what it means. These customers can then make changes, maybe draft some code, but it's the programmers who make it solid and do the debugging and testing.

This isn't to say that there's no benefit in a business-writable DSL. Indeed a couple of years ago some colleagues of mine built a system that included just that, and it was much appreciated by the business. It's just that the effort in creating a decent editing environment, meaningful error messages, debugging and testing tools raises the cost significantly.

While I'm quick to use the COBOL inference to diss many tools that seek to avoid programmers, I also have to acknowledge the big exception: spreadsheets. All over the world, surprisingly big business functions are run off the back of Excel. Serious programmers tend to look down their noses at these, but we need to take them more seriously and try to understand why they have been as successful as they are. That success is certainly part of what drives many LanguageWorkbench developers to provide a different vision of software development.


Oslo dsl 28 October 2008

Oslo is a project at Microsoft, of which various things have been heard but few details given until this week's PDC conference. What we have known is that it has something to do with ModelDrivenSoftwareDevelopment and DomainSpecificLanguages.

A couple of weeks ago I got an early peek behind the curtain as I, and my language-geek colleague Rebecca Parsons, went through a preview of the PDC coming-out talks with Don Box, Gio Della-Libera and Vijaye Raji. It was a very interesting presentation, enough to convince me that Oslo is a technology to watch. It's broadly a Language Workbench. I'm not going to attempt a comprehensive review of the tool here, but just my scattered impressions from the walk-through. It was certainly interesting enough that I thought I'd publish my impressions here. With the public release at the PDC I'm sure you'll be hearing a lot more about it in the coming weeks. As I describe my thoughts I'll use a lot of the language I've been developing for my book, so you may find the terminology a little dense.

Oslo has three main components:

  • a modeling language (currently code-named M) for textual DSLs
  • a design surface (named Quadrant) for graphical DSLs
  • a repository (without a name) that stores semantic models in a relational database.

(All of these names are current code names. The marketing department will still use the same smarts that replaced "Avalon and Indigo" with "WPF and WCF". I'm just hoping they'll rename "Windows" to "Windows Technology Foundation".)

The textual language environment is bootstrapped and provides three base languages:

  • MGrammar: defines grammars for Syntax Directed Translation.
  • MSchema: defines schemas for a Semantic Model.
  • MGraph: a textual language for representing the population of a Semantic Model. So while MSchema represents types, MGraph represents instances. Lispers might think of MGraph as s-expressions with an ugly syntax.

You can represent any model in MGraph, but the syntax is often not too good. With MGrammar you can define a grammar for your own DSL which allows you to write scripts in your own DSL and build a parser to translate them into something more useful.

Using the state machine example from my book introduction, you could define a state machine semantic model with MSchema. You could then populate it (in an ugly way) with MGraph. You can build a decent DSL to populate it using MGrammar to define the syntax and to drive a parser.

There is a grammar compiler (called mg) that will take an input file in MGrammar and compile it into what they call an image file, or .mgx file. This is different to most parser generator tools. Most parser generator tools take the grammar and generate code which has to be compiled into a parser. Instead Oslo's tools compile the grammar into a binary form of the parse rules. There's then a separate tool (mgx) that can take an input script and a compiled grammar and output the MGraph representation of the syntax tree of the input script.

More likely you can take the compiled grammar and add it to your own code as a resource. With this you can call a general parser mechanism that Oslo provides as a .NET framework, supply the reference to the compiled grammar file, and generate an in-memory syntax tree. You can then walk this syntax tree and use it to do whatever you will - the parsing strategy I refer to as Tree Construction.

The parser gives you a syntax tree, but that's often not the same as a semantic model. So usually you'll write code to walk the tree and populate a semantic model defined with MSchema. Once you've done this you can easily take that model and store it in the repository so that it can be accessed via SQL tools. Their demo showed entering some data via a DSL and accessing the corresponding tables in the repository, although we didn't go into complicated structures.

You can also manipulate the semantic model instance with Quadrant. You can define a graphical notation for a schema and then the system can project the model instance, creating a diagram using that notation. You can also change the diagram, which updates the model. They showed a demo of two graphical projections of a model, where updating one updated the other using Observer Synchronization. In that way using Quadrant seems like a similar style of work to a graphical Language Workbench such as MetaEdit.
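Observer Synchronization here is just the classic observer pattern: each projection subscribes to the shared model instance and redraws whenever anything changes it. A bare-bones C# sketch, with illustrative names:

    using System;
    using System.Collections.Generic;

    // Two or more projections observe one model instance; an edit through any
    // projection updates the model, which then notifies every registered view.
    public class ModelInstance
    {
        private readonly List<Action> observers = new();
        private string text = "";

        public string Text
        {
            get => text;
            set { text = value; NotifyObservers(); }   // any edit notifies all projections
        }

        public void Subscribe(Action onChanged) => observers.Add(onChanged);

        private void NotifyObservers()
        {
            foreach (var observer in observers) observer();
        }
    }

Each projection subscribes once and refreshes itself on notification, so a change made in one diagram shows up immediately in the other.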

As they've been developing Oslo they have been using it on other Microsoft projects to gain experience in its use. Main ones so far have been with ASP, Workflow, and web services.

More on M

We spent most of the time looking at the textual environment. They have a way of hooking up a compiled grammar to a text editing control to provide a syntax-aware text editor with various completion and highlighting goodness. Unlike tools such as MPS, however, it is still a text editor. As a result you can cut and paste stretches of text and manipulate text freely. The tool will give you squigglies if there's a problem parsing what you've done, but it preserves the editing text experience.

I think I like this. When I first came across it, I rather liked the MPS notion of: "it looks like text, but really it's a structured editor". But recently I've begun to think that we lose a lot that way, so the Oslo way of working is appealing.

Another nice text language tool they have is an editor to help write MGrammars. This is a window divided into three vertical panes. The center pane contains MGrammar code, the left pane contains some input text, and the right pane shows the MGraph representation of parsing the input text with the MGrammar. It's very example driven. (However it is transient, unlike tests.) The tool resembles the capability in Antlr to process sample text right away with a grammar. In the conversation Rebecca referred to this style as "anecdotal testing" which is a phrase I must remember to steal.

The parsing algorithm they use is a GLR parser. The grammar syntax is comparable to EBNF and has notation for Tree Construction expressions. They use their own variant of regex notation in the lexer to be more consistent with their other tools, which will probably throw people like me who are more used to ISO/Perl regexp notation. It's mostly similar, but different enough to be annoying.

One of the nice features of their grammar notation is that they have provided constructs to easily make parameterized rules - effectively allowing you to write rule subroutines. Rules can also be given attributes (aka annotations), in a similar way to .NET's language attributes. So you can make a whole language case insensitive by marking it with an attribute. (Interestingly they use "@" to mark an attribute, as in the Java syntax.)

The default way a grammar is run is to do tree construction. As it turns out, the tree construction is the behavior of the default class that gets called by the grammar while it's processing some input. This class has an interface and you can write your own class that implements it. This would allow you to do embedded translation and embedded interpretation. It's not the same as code actions, as the action code isn't in the grammar but in this other class. I reckon this could well be better, since the code inside actions often swamps the grammar.

They talked a bit about the ability to embed one language in another and switch the parsers over to handle this gracefully - heading into territory that's been explored by Converge. We didn't look at this deeply but that would be interesting.

An interesting tidbit they mentioned was that originally they intended to only have the tools for graphical languages. However they found that graphical languages just didn't work well for many problems - including defining schemas. So they developed the textual tools.

(Here's a thought for the marketing department. If you stick with the name "M" you could use this excellent film for marketing inspiration ;-))

Comparisons

Plainly this tool hovers in the same space as tools like Intentional Software and JetBrains MPS that I dubbed as Language Workbenches in 2005. Oslo doesn't exactly fit the definition for a language workbench that I gave back then. In particular the textual component isn't a projectional editor and you don't have to use a storage representation based on the abstract representation (semantic model), instead you can store the textual source in a more conventional style. This lesser reliance on a persistent abstract representation is similar to Xtext. At some point I really need to rethink what I consider the defining elements of a Language Workbench to be. For the moment let's just say that Xtext and Oslo feel like Language Workbenches and until I revisit the definition I'll treat them as such.

One particularly interesting point in this comparison is comparing Oslo with Microsoft's DSL tools. They are different tools with a lot of overlap, which makes you wonder if there's a place for both of them. I've heard vague "they fit together" phrases, but am yet to be convinced. It could be one of those situations (common in big companies) where multiple semi-competing projects are developed. Eventually this could lead to one being shelved. But it's hard to speculate about this as much depends on corporate politics, and it's thus almost impossible to get a straight answer out of anyone (and even if you do, it's even harder to tell if it is a straight answer).

The key element that Oslo shares with its cousins is that it provides a toolkit to define new languages, integrate them together, and define tooling for those languages. As a result you get the freedom of syntax of external DomainSpecificLanguages with decent tooling - something that deals with one of the main disadvantages of external DSLs.

Oslo supports both textual and graphical DSLs and seems to do so reasonably evenly (although we spent more time on the textual). In this regard it seems to provide more variety than MPS and Intentional (structured textual) and MetaEdit/Microsoft's DSL tools (graphical). It's also different in its textual support in that it provides real free text input not the highly structured text input of Intentional/MPS.

Using a compiled grammar that plugs into a text editor strikes me as a very nice route for supporting entering DSL scripts. Other tools either require you to have the full language workbench machinery or to use code generation to build editors. Passing around a representation of the grammar that I could plug into an editor strikes me as a good way to do it. Of course if that language workbench is Open Source (as I'm told MPS will be), then that may make this issue moot.

One of the big issues with storing stuff like this in a repository is handling version control. The notion that we can all collaborate on a single shared database (the moral equivalent of a team editing one copy of its code on a shared drive) strikes me as close to irresponsible. As a result I tend to look askance at any vendors who suggest this approach. The Oslo team suggests, wisely, that you treat the text files as the authoritative source which allows you to use regular version control tools. Of course the bad news for many Microsoft shops would be that this tool is TFS (or, god-forbid, VSS), but the great advantage of using plain text files as your source is that you can use any of the multitude of version control systems to store it.

A general thing I liked was that most of the tools leant towards run-time interpretation rather than code generation and compilation. Traditionally, parser generators and many language workbenches assume you are going to generate code from your models rather than interpreting them. Code generation is all very well, but it always has this messy feel to it - and tends to lead to all sorts of ways to trip you up. So I do prefer the run-time emphasis.

It was only a couple of hours, so I can't make any far-reaching judgements about Oslo. I can, however, say it looks like some very interesting technology. What I like about it is that it seems to provide a good pathway to using language workbenches. Having Microsoft behind it would be a big deal although we do need to remember that all sorts of things were promised about Longhorn that never came to pass. But all in all I think this is an interesting addition to the Language Workbench scene and a tool that could make DSLs much more prevalent.


DslQandA dsl 9 September 2008

I was asked to put together a discussion of DSLs for non-technical types. Maybe I've been reading too much Stephen O'Grady, but I felt an irresistible urge to do it in a Q and A manner. So here it comes.

What is a Domain Specific Language?

A Domain Specific Language (DSL) is a computer programming language of limited expressiveness focused on a particular domain. Most languages you hear of are General Purpose Languages, which can handle most things you run into during a software project. Each DSL can only handle one specific aspect of a system.

So you wouldn't write a whole project in a DSL?

No. Most projects will use one general purpose language and several DSLs.

Are they a new idea?

Not at all. DSLs have been used extensively in Unix circles since the early days of that system. The lisp community often talks of creating DSLs in lisp and then using the DSLs to implement the logic. Most IT projects use several DSLs - you might have heard of things like CSS, SQL, regular expressions and the like.

So why are they getting a lot of noise now?

Probably because of Ruby and Rails. Ruby as a language has many features that make it easy to develop DSLs and the people who got involved in the Ruby community have been familiar with this approach from elsewhere, so they took advantage of these features. In particular Rails uses several DSLs which have played a big role in making it so easy to use. This in turn encouraged more people to take up these ideas.

Another reason is that many Java and C# systems need to have some of their behavior defined in a more dynamic way. This led to complex XML files that are difficult to comprehend, which in turn led to people exploring DSLs again.

So DSLs can be used with languages other than Ruby?

Yes, as I indicated DSLs have been around for much longer than Ruby has. Ruby has an unobtrusive syntax and meta-programming features that make it easier to create more elegant internal DSLs than languages like C# and Java. But there are useful internal DSLs in Java and C#.

What's the distinction between internal and external DSLs?

An internal DSL is just a particular idiom of writing code in the host language. So a Ruby internal DSL is Ruby code, just written in a particular style which gives a more language-like feel. As such they are often called Fluent Interfaces or Embedded DSLs. An external DSL is a completely separate language that is parsed into data that the host language can understand.
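To make the contrast concrete, here's a toy C# sketch - every name in it is invented for illustration - where the same trivial configuration is expressed once through an internal DSL (a fluent interface) and once through an external one (a little text language that gets parsed).

    using System;
    using System.Collections.Generic;

    // Internal DSL: a fluent interface, so the "script" is just host-language code.
    public class SwitchBuilder
    {
        private readonly Dictionary<string, bool> settings = new();
        private string current = "";

        public SwitchBuilder Turn(string device) { current = device; return this; }
        public SwitchBuilder On()  { settings[current] = true;  return this; }
        public SwitchBuilder Off() { settings[current] = false; return this; }
        public IDictionary<string, bool> Build() => settings;
    }

    public static class Demo
    {
        public static void Main()
        {
            // Internal DSL: plain method chaining in the host language.
            var fromInternal = new SwitchBuilder().Turn("heater").On().Turn("fan").Off().Build();

            // External DSL: the same configuration as free text, parsed into the same kind of data.
            string script = "heater on\nfan off";
            var fromExternal = new Dictionary<string, bool>();
            foreach (var line in script.Split('\n'))
            {
                var parts = line.Trim().Split(' ');
                fromExternal[parts[0]] = parts[1] == "on";
            }

            Console.WriteLine(string.Join(", ", fromInternal));
            Console.WriteLine(string.Join(", ", fromExternal));
        }
    }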

Why are people interested in DSLs?

I see DSLs as having two main benefits. The most common benefit is that they make certain kinds of code easier to comprehend, which makes it much easier to modify, thus improving programmer productivity. This is worthwhile all on its own and is relatively easy to achieve.

The most interesting benefit, however, is that a well designed DSL can be understandable by business people, allowing them to directly comprehend the code that implements their business rules.

So is this the hook - business people write the rules themselves?

In general I don't think so. It's a lot of work to make an environment that allows business people to write their own rules. You have to make a comfortable editing tool, debugging tools, testing tools, and so on. You get most of the benefit of business facing DSLs by doing enough to allow business people to be able to read the rules. They can then review them for accuracy, talk about them with the developers and draft changes for developers to implement properly. Getting DSLs to be business readable is far less effort than business writable, but yields most of the benefits. There are times where it's worth making the effort to make the DSLs business-writable, but it's a more advanced goal.

Do you need special (ie expensive) tools?

In general, no. Internal DSLs just use the regular facilities of the programming language that you are using anyway. External DSLs do require you to use some special tools - but these are open source and are very mature. The biggest problem with these tools is that most developers aren't familiar with them and believe they are harder to use than they really are (a problem exacerbated by poor documentation).

There are exceptions on the horizon, however. These are a class of tools that I call a LanguageWorkbench. These tools allow you to define DSLs more easily, and also provide sophisticated editors for them. Tools like this make it more feasible to make business-writable DSLs.

So is this a repeat of the dream of developing software without programming (or programmers)?

That was the intent of COBOL, and I don't think there's any reason to believe that DSLs will succeed where COBOL (and so many others) failed. What I think is important is that DSLs allow business people and developers to collaborate more effectively because they can talk about a common set of precise rules that are the executable code.

When should I consider making a DSL?

When you are looking at an aspect of a system with rich business rules or workflow. A well-written DSL should allow customers to understand the rules by which the system works.

Isn't this going to lead to a cacophony of languages that people will find hard to learn?

We already have a cacophony of frameworks that programmers have to learn. That's the inevitable consequence of reusable software, which is the only way we can get a handle on all the things software has to do these days. In essence a DSL is nothing more than a fancy facade over a framework. As a result they contribute little complexity over what is already there. Indeed a good DSL should make things better by making these frameworks easier to use.

But won't people create lots of bad DSLs?

Of course, just like people create bad frameworks. But again I'd argue that bad DSLs don't do much additional harm compared to the cost of bad frameworks.