WORK-IN-PROGRESS: - this material is still under development
Explains the levels of instance, schema, meta-model and meta-meta model.
Last significant update: 11 Dec 07
When you read about models and modeling languages, you soon enter a world where you can run into confusions between models and meta-models. Here's how I try to keep my mind straight.
As usual I'll base the discussion on an example. We'll consider a system that simply controls a bunch of lights in rooms with switches. One light can have multiple switches, and throwing any of its switches toggles the state of the light.
To describe this situation we can draw various models. I'll begin by drawing a state model of a light.
This state diagram represents a model of how lights and switches work. This morning I switched the light on as I came downstairs for breakfast. If my house used this system, which I'll pretend for a while it does, then my throwing the switch this morning sent an event to the light, causing it to change its state to on. That little bit of action is an execution of the model. To use other words, the state model is a program and my throwing the switch this morning is an execution of that program.
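To make the "model as program" idea concrete, here is a minimal Java sketch of the light's state model. The names (`LightStateDemo`, `LightState`, `onToggle`) are invented for illustration and assume the diagram has just two states, off and on, with a single toggle event.

```java
class LightStateDemo {
    // The state model: two states and one rule for the toggle event.
    enum LightState {
        OFF, ON;

        // Throwing a switch sends a toggle event, which flips the state.
        LightState onToggle() {
            return this == OFF ? ON : OFF;
        }
    }

    public static void main(String[] args) {
        LightState stairsLight = LightState.OFF;  // the light before breakfast
        stairsLight = stairsLight.onToggle();     // one execution of the model
        System.out.println(stairsLight);          // prints ON
    }
}
```

The enum plays the part of the program; each call to `onToggle` is one execution of it.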
Related to this state model we have some objects: lights and switches. I can represent the schema for these like this, as a simple data model.
I could describe some of my lights in my house, by saying there is an instance of light in the stairs connected to two switches: one at the top of the stairs and one at the bottom. The bottom of the stairs has a second switch connected to a light outside the front door.
If I was feeling ambitious I could even draw a diagram of this for my stairs light, here using UML notation.
These two diagrams are related. The class diagram is a schema that defines the legal relationships that my instances (shown in the instance diagram) can be involved in. My hall light diagram is one of an infinite number of instance diagrams I could draw that conform to the light and switch schema.
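The same schema and instance relationship shows up in code. The sketch below is only an approximation of the diagrams: the class and field names are hypothetical, locations are reduced to plain strings, and each switch is assumed to control exactly one light.

```java
import java.util.List;

class LightSchemaDemo {
    // Schema: these classes stand in for the class diagram.
    static class Light {
        final String location;
        Light(String location) { this.location = location; }
    }

    static class Switch {
        final String location;
        final Light controls; // assumption: each switch controls exactly one light
        Switch(String location, Light controls) {
            this.location = location;
            this.controls = controls;
        }
    }

    // Instances: one of many object graphs that conform to the schema.
    static List<Switch> stairsWiring() {
        Light stairs = new Light("stairs");
        Light frontDoor = new Light("outside front door");
        return List.of(
            new Switch("top of stairs", stairs),
            new Switch("bottom of stairs", stairs),
            new Switch("bottom of stairs", frontDoor));
    }

    // Counts the switches wired to the light at the given location.
    static long switchesFor(String lightLocation, List<Switch> switches) {
        return switches.stream()
            .filter(s -> s.controls.location.equals(lightLocation))
            .count();
    }

    public static void main(String[] args) {
        System.out.println(switchesFor("stairs", stairsWiring())); // prints 2
    }
}
```

The classes fix which relationships are legal; `stairsWiring` is just one of the many instance graphs they allow.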
Here's another way of drawing the details of lights and switches, using some custom symbology.
In this situation we have a different notation in the diagram, but the core information that the diagram contains is exactly the same as the UML diagram in Figure 3. In Model Driven Development (MDD) circles it is the underlying model that's the key, not the representation. A single model can have multiple representations in different forms (diagrams, text, tables, etc).
I started this section with a state machine diagram to describe the behavior of a light switch. This state machine diagram can also be thought of as an instance diagram of some form of schema. I can represent this schema as a class diagram.
Just as my lights and switches schema defines the lights, switches, and locations in my instance diagram, the state diagram schema defines states, transitions, and events. In MDD terms, the schema of Figure 5 is usually called the metamodel of state diagrams. The notion here is that Figure 1 is a model of light switches and Figure 5 is a model of state models, thus a model of models, thus a metamodel.
So when we draw a formal diagram, we can specify the metamodel of that diagram. That metamodel defines the various types that appear in the diagram and how they relate to each other.
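As a rough illustration, a state diagram metamodel can be written as a handful of classes, with the light's state diagram then built as instances of them. Everything here (`State`, `Transition`, `Event`, `fire`, `buildLightModel`) is a hypothetical simplification, not the metamodel from the figure.

```java
import java.util.ArrayList;
import java.util.List;

class StateMetamodelDemo {
    // Metamodel: the types that any state diagram is built from.
    static class Event {
        final String name;
        Event(String name) { this.name = name; }
    }

    static class Transition {
        final Event trigger;
        final State target;
        Transition(Event trigger, State target) {
            this.trigger = trigger;
            this.target = target;
        }
    }

    static class State {
        final String name;
        final List<Transition> transitions = new ArrayList<>();
        State(String name) { this.name = name; }

        void addTransition(Event trigger, State target) {
            transitions.add(new Transition(trigger, target));
        }

        // Returns the state reached by the event, or this state if nothing matches.
        State fire(Event event) {
            for (Transition t : transitions) {
                if (t.trigger == event) return t.target;
            }
            return this;
        }
    }

    // Model: the light's state diagram, expressed as instances of the metamodel.
    static State buildLightModel(Event toggle) {
        State off = new State("off");
        State on = new State("on");
        off.addTransition(toggle, on);
        on.addTransition(toggle, off);
        return off; // assumed initial state
    }

    public static void main(String[] args) {
        Event toggle = new Event("toggle");
        State current = buildLightModel(toggle);
        current = current.fire(toggle);
        System.out.println(current.name); // prints on
    }
}
```

The three classes define what may appear in any state diagram; `buildLightModel` is one particular diagram that conforms to them.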
We can go a step further. The state metamodel is itself a formal diagram, and thus can have its own metamodel. A somewhat simplistic version of this metamodel looks like this:
Inevitably this kind of model often gets referred to as a meta-metamodel, since it is the metamodel for the metamodel of state diagrams.
At this point you might think we can go on forever, constantly drawing a metamodel for every metamodel we create. In practice, however, we don't. That's because once we get to the meta-meta point, our meta-models become bootstrapped - they are capable of defining themselves. The state meta-metamodel is like this: it's capable of being the metamodel for itself.
At this point you may be able to start seeing a clear progression of levels of models and metamodels. If so you've got company, because this notion is common in MDD circles. Indeed the Object Management Group came up with a standard definition for these levels as part of their Meta-Object Facility (MOF), part of what is broadly known these days as Model Driven Architecture.
| Level Number | Name | People Example |
|---|---|---|
| M0 | Instances | type = Person name = Sara-Jane Smith Sex = Female |
| M1 | Model | type = Class name = Person Attributes = name, sex |
| M2 | Metamodel | type = Class name = Class Attributes = name, attributes |
| M3 | Meta-Metamodel | type = Class name = Class Attributes = name, attributes |
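Java's reflection API offers a loose analogy to these levels, including the point where the tower closes on itself. This is only an analogy, not MOF; the `Person` class and the `typeOf` helper are invented for the sketch.

```java
class ModelLevelsDemo {
    // Roughly M1: a class describing people.
    static class Person {
        final String name;
        final String sex;
        Person(String name, String sex) {
            this.name = name;
            this.sex = sex;
        }
    }

    // Steps up one level: from a thing to the thing that describes it.
    static Class<?> typeOf(Object o) {
        return o.getClass();
    }

    public static void main(String[] args) {
        Person sara = new Person("Sara-Jane Smith", "Female"); // roughly M0
        Class<?> m1 = typeOf(sara); // the Person class
        Class<?> m2 = typeOf(m1);   // java.lang.Class
        Class<?> m3 = typeOf(m2);   // java.lang.Class again

        System.out.println(m1.getSimpleName()); // prints Person
        System.out.println(m2.getSimpleName()); // prints Class
        System.out.println(m2 == m3);           // prints true
    }
}
```

Just as the M2 and M3 rows of the table are identical, `java.lang.Class` is described by itself, so stepping up again yields nothing new.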
This standard is intuitively appealing and you often hear people in MDD circles referring to it: "this is at M2". However push a bit and you usually find people getting terribly confused about modeling levels. There are various reasons for this.
One reason is the use of the word model. Like many similar concepts it's a victim of the Type Instance Homonym. It's reasonable to refer to Figure 4 as a model of the lighting system of my house. But this isn't consistent with the OMG levels, as that diagram is at M0 (the instance level) not M1 (the model level). It's equally natural to refer to Figure 2 as a model for lights and switches. That's the Type Instance Homonym: it's not strictly consistent to refer to both the instances and the schema by the same word. However as humans we do it all the time and it's unnatural not to do it. So my first problem with the OMG model levels is that you should be careful with the word "model" and remember it can refer to either instances or schema.
Even if we were to banish the word "model" from our formal vocabulary there are still problems with the levels. As I discussed this lighting example I drew parallels between the state diagram of Figure 1 and the light and switch diagram of Figure 4. Looking at the OMG levels, it seems natural that the light and switch diagram (Figure 4) is M0 and its schema (Figure 2) is M1. Yet it also seemed natural to say that Figure 5 is the metamodel for state models - that implies it's M2 and thus Figure 1 is M1. Yet why is the state model M1 and the lights and switch model M0?
One way to look at it is that M0 corresponds to the instances in a running program. I can easily imagine that I have an instance of class light in my running program that corresponds to my hall light - this reinforces that the light and switch diagram is M0. But with the state model it very much depends on how I implement it. I could implement a state model with explicit instances for each state and object links for the transitions. In that case you can argue that the state diagram is M0. However I could also implement the state diagram by putting selection logic into my light classes (either through hand coding or code generation). In this case the state diagram has no corresponding instances; instead it's modifying classes. That would imply it's M1.
Hopefully by now I've made you very wary of the OMG modeling levels, or indeed of other similar schemes. Things often don't tend to level out in this neat kind of way, especially when we also blend in intuitive use of the word "model". A level system may make sense for a particular set of languages, but it isn't useful universally - and indeed trying to organize things in absolute levels can easily cause a lot of confusion.
I tend to think of a model as having many things: multiple instantiations, a schema, multiple notations and representations. Depending on context and background the speaker may use "model" to mean all of this or any of the pieces. Usually there is enough context to figure it out, or the differences don't really matter that much. But it's a term to be wary of as confusion is quite common.
With that background, "metamodel" refers to the language that we use to represent the schema of a model. In that sense it's natural to say Figure 5 is the schema of state models. It also implies that the light and switch schema (Figure 2) is the metamodel for the light and switch diagram (Figure 4). I think it's more useful to use metamodel as a relative term than to think about it in absolute terms like "M2".
In this scheme a meta-metamodel is a shaky term - I prefer to think of a model that is bootstrappable. Bootstrappable is a characteristic of structural models and of grammars. Both define syntax - a bootstrappable data model defines abstract syntax and a grammar defines concrete syntax.
So how does the MDD view of the world weave in the notion of a Domain Specific Language? Indeed what do we mean by a language in this context? One example of a DSL is the lights and switches model - a DSL for showing where they are and how they are connected together. As I mentioned earlier the UML instance diagram and the custom diagram are different representations of the same underlying model. The MDDers would say that the metamodel is the abstract syntax of the language and the two diagrams are different concrete syntaxes of the language. In this way you can think of lights and switches as one language with two representations.
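The idea of one abstract syntax with several concrete syntaxes can be sketched in code. The example below swaps the two diagrams for two textual stand-ins, a tiny line-based notation and a Java fluent interface, both of which populate the same model. The `Connection` record, the `->` notation, and the `Wiring` builder are all assumptions made for illustration (records need Java 16 or later).

```java
import java.util.ArrayList;
import java.util.List;

class TwoSyntaxesDemo {
    // Abstract syntax: a switch at one location controls a light at another.
    record Connection(String switchAt, String lightAt) {}

    // Concrete syntax 1: text, one "switch location -> light location" per line.
    static List<Connection> parse(String text) {
        List<Connection> model = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (line.isBlank()) continue;
            String[] parts = line.split("->");
            model.add(new Connection(parts[0].trim(), parts[1].trim()));
        }
        return model;
    }

    // Concrete syntax 2: a fluent interface that builds the same model.
    static class Wiring {
        private final List<Connection> model = new ArrayList<>();
        private String currentSwitch;

        Wiring switchAt(String location) {
            currentSwitch = location;
            return this;
        }

        Wiring controlsLightAt(String location) {
            model.add(new Connection(currentSwitch, location));
            return this;
        }

        List<Connection> build() {
            return model;
        }
    }

    public static void main(String[] args) {
        List<Connection> fromText = parse(
            "top of stairs -> stairs\n"
            + "bottom of stairs -> stairs\n"
            + "bottom of stairs -> front door");
        List<Connection> fromFluent = new Wiring()
            .switchAt("top of stairs").controlsLightAt("stairs")
            .switchAt("bottom of stairs").controlsLightAt("stairs")
            .switchAt("bottom of stairs").controlsLightAt("front door")
            .build();
        System.out.println(fromText.equals(fromFluent)); // prints true
    }
}
```

Both routes end in an identical list of `Connection` objects, which is the sense in which an MDDer would call them two concrete syntaxes of one language.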
This is a very different way of looking at things from the programming languages community. A particular language (Java, say) can be described by multiple grammars and be processed into multiple abstract syntax trees. In the programming language world the representation is the central thing. Imagine we came up with an alternative representation of Java - one with different keywords, "begin/end" instead of "{}", maybe even allowing multiple files to declare members of a class. These could all be thought of as a different concrete syntax for the basic idea of Java, but many people would think of such a thing as a different language.
Yet the notion of multiple representations for the same language does make some sense. If I took the light and switch diagram and made a slight alteration to the light bulb icon, would that be enough to make it a separate language - particularly if I could choose which light bulb icon I wanted in a preferences file somewhere? Similarly we think nothing of altering the color scheme for syntax highlighting for the same language.
So let's consider the opening example of this book. In that case I came up with multiple representations: custom syntax, JRuby internal DSL, Java fluent interface. All of these followed the same underlying model - indeed my code transformed them all to identical instances of the single Java framework. Does this mean they are different concrete syntaxes of the same DSL, or multiple DSLs that all transform to a single model?
It was a conversation with Rebecca that gave me the key to all this. Rebecca made the point that language is about human communication. As a result the language is different if it seems different to the human using it, whether or not there is the same representation behind the scenes. Of course this introduces its own blurriness - there's no strict way to define what makes a language seem different. Convention tells us that color doesn't matter, but is there a formal reason why that is the case? One of the reasons the MDD community is comfortable with multiple representations is that it has tools that allow you to easily change a representation. But it's the human view that draws the line, however blurry it may be.