|
One of my early favorite books was Tsichritzis
and Lochovsky's book on Data Models. The book discussed
different models for thinking about data, in particular the three
models most discussed at the time: RelationalDataModel,
HierarchicDataModel and NetworkDataModel.
These days I don't see people talking about the pros and cons of
different data models very much. The people who talk about data
models tend to be database people, and most people seem to think
that that contest was long since won by the
relational model.
However I don't think that was ever really true, and things are
getting more and more interesting. There's no doubt that the relational model is
dominant in the database world, almost every business application
I run into these days uses a SQL database, which is close enough
to relational for 99% of the population. However if you look at
in-memory data structures you see a different world. Here the
network model reigns supreme. Indeed people often go to great
efforts to turn a relational model on disk to a
network model in memory (this, I think is one of the
reasons why the AnemicDomainModel is so popular.) I find this interesting. Time and time again I ask people why
they bother to go through all the fuss of turning relations into
records. The answer I get back pretty much always boils down to
the same thing - most developers find the network model easier
to deal with than the relational model. Certainly this is not
always the case, but I do think there's a majority of those that
prefer records. This may have something to do with the fact that while SQL works
well for databases, we don't have the equivalent for in-memory
processing. Thinking relationally is one of the things I find
interesting about ADO.NET, yet again I see many people who don't
want to work with datasets in a relational style. The other thing that's steadily challenging the current data
model picture is the rise of XML. XML based technologies like
XPath and XQuery provide a standard way of accessing hierarchic
data structures. In many ways the fact the XML provides a standard
textual serialization of that data is just a bonus to having this
standard way of querying and manipulating hierarchic data. A fundamental technological shift will I think cause further data
model churn. As memory sizes grow as fast as prices drop, we are
increasingly reaching the point where most databases can be kept
entirely in memory. Couple this with a mechanism for durable
changes, and you have a whole different kind of database with
fundamentally different assumptions about what it takes to
perform. (See Prevayler as
an example of this kind of thinking. I have no idea how valid
their performance numbers are, but they could be a couple of orders
of magnitude off and still be impressive.) So maybe its time again to dust off those assumptions about which
data models make sense, and get start thinking about some of the
basics of these models. My sense is that different kinds of data
work well with different kinds of models. The
relational model is perfect for tabular data, but
suck really badly if you want to store The
Tempest. So it makes sense to be aware of different data
models, the technologies that use them, and which ones suit what
kinds of data.
|