Stuart Kent - Software Modeling and Visualization
There's been some debate between some of my MSFT colleagues (Alan Wills, Steve Cook, Jack Greenfield) and Grady Booch and others over at IBM around UML and DSLs. For those interested, Grady actually posted an article on UML and DSLs back in May last year - seems to have gone unnoticed. It touched on some of the themes that have been under discussion. My first blog entry was a response to this article.
A particular theme that crops up in the discussion is the issue of tools versus language. Grady seems to want to keep the two separate, whereas we believe that the two are closely linked - a language designed to be used in a tool is going to be different to a language designed to be used on paper.
I touched on this line of discussion in an article a few months back: http://blogs.msdn.com/stuart_kent/articles/181565.aspx
An example I used was that of class diagrams, and I pointed out a couple of aspects of class diagrams which may have been designed differently had the original intention been to use them within a tool rather than on paper. Now I can point you at an example notation that addresses some of these issues. It is the notation we use for defining domain models in our DSL Tools. Here's a sample:
Nodes are classes, lines are relationships or inheritance arrows. Nodes are organized into relationship and inheritance trees, which can be expanded or collapsed, making it easy to navigate and drill into large models (not something you do on paper).
A relationship line has the role information (would be association end information in UML 2) annotated in the middle of the line rather than at the ends: a role is represented as a triangle or rectangle containing a multiplicity, and the name of a role is annotated using a label (in the diagram above reverse role names have been hidden as they are usually less interesting).
Annotating role information in the middle of the line not only makes it easier to automatically place labels, it also means that relationship lines can be channelled together, enabling an expandable relationship tree to be automatically constructed and laid out. An example of channelling is given by the diagram below, where you can see all the relationship lines sourced on Page channelled together so that they connect to Page at the same point:
The point here is that this notation was designed from the start using criteria such as 'must support autolayout' and 'must support easy navigation of large models'. If the criteria were 'must be easily cut up into page-size chunks' and 'must be easy to sketch on a whiteboard', then the notation may well have turned out differently.
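To make the channelling idea a little more concrete, here is a minimal Python sketch. It is not how the DSL Tools implement layout; it only shows the grouping step, where every relationship line sourced on the same class is collected into one channel so that a layout routine could attach the whole group at a single point. Apart from Page, the class and role names are invented for illustration.

```python
from collections import defaultdict

def channel_by_source(relationships):
    """Group relationship lines by source class.

    Each group can then share one connection point on the source shape,
    because the role labels sit mid-line rather than at the line ends.
    """
    channels = defaultdict(list)
    for source, role, target in relationships:
        channels[source].append((role, target))
    return dict(channels)

# Hypothetical domain model: (source class, role name, target class).
relationships = [
    ("Page", "Fields", "Field"),
    ("Page", "Buttons", "Button"),
    ("Wizard", "Pages", "Page"),
]

channels = channel_by_source(relationships)

# Print each channel as a small expandable tree rooted at the source class.
for source, branches in channels.items():
    print(source)
    for role, target in branches:
        print(f"  +-- {role} --> {target}")
```

With labels at the line ends, as in UML class diagrams, the two lines leaving Page could not share an attachment point without their labels colliding.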
As a number of my colleagues have already pointed out, the December download of the DSL Tools is now available. Three walkthroughs giving more detail on how to use these bits should also be available soon - hopefully before Christmas, if not early in the New Year.
Not only do these bits allow you to create a graphical designer in Visual Studio for your own domain specific language, but also they create the environment for executing text artefact templates against domain data created using your new designer. For example, such templates can be used to generate code from the data or reports about it.
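As a rough illustration of the idea, and not of the actual DSL Tools template syntax, the following Python sketch runs a text template over some domain data to generate code. The domain data, the template and the function name are all invented for the example; the standard library's string.Template stands in for a real template engine.

```python
from string import Template

# Hypothetical domain data, standing in for what a designer would produce.
domain_data = [
    {"name": "Page", "properties": ["title", "order"]},
    {"name": "Field", "properties": ["label"]},
]

# A text template with placeholders that are filled in from the domain data.
CLASS_TEMPLATE = Template(
    "class $name:\n"
    "    def __init__(self, $args):\n"
    "$body"
)

def generate_class(concept):
    """Render one class definition from one domain concept."""
    args = ", ".join(concept["properties"])
    body = "".join(f"        self.{p} = {p}\n" for p in concept["properties"])
    return CLASS_TEMPLATE.substitute(name=concept["name"], args=args, body=body)

generated = "\n".join(generate_class(c) for c in domain_data)
print(generated)
```

A report about the data could be produced the same way by swapping the template; the domain data stays untouched.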
Artefact templates are an example of tools that consume the data. A designer is an example of a tool that creates it. DSL tools are all about enabling organizations to automate more and more aspects of their software development process, using domain specific abstractions. But however good the designers or editors are for creating the domain data, if they don't provide easy access to that data by other tools then this won't be possible.
There is work to do in this area. In the shorter term this includes persisting the data in a domain specific XML format, rather than the generic format used in the December bits, and providing the hooks for other tools to load persisted data and access it through a domain specific API (noting that we already generate the appropriate hooks to access the domain data through a domain specific API from within artefact templates). In the longer term, it's the authoring experience for building the consumption tools themselves that could be tackled. Examples of such tools would be tools for performing transformations between data in different domains, tools to synchronize and reconcile data across mappings, tools to simulate or animate behaviors specified by a DSL, and so on.
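The difference between a generic and a domain specific persistence format can be sketched as follows. Both XML fragments and the Page class are invented for illustration and are not the formats used by the DSL Tools; the point is only that the domain specific file, together with a small domain specific API, is much easier for another tool to consume.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical generic format: every element and property is described indirectly.
GENERIC_XML = """
<model>
  <element type="Page" id="1">
    <property name="title" value="Welcome"/>
  </element>
  <element type="Page" id="2">
    <property name="title" value="Summary"/>
  </element>
</model>
"""

# Hypothetical domain specific format: the vocabulary of the domain is the vocabulary of the file.
DOMAIN_XML = """
<wizard>
  <page title="Welcome"/>
  <page title="Summary"/>
</wizard>
"""

@dataclass
class Page:
    """A tiny domain specific API: consumers see pages, not generic elements."""
    title: str

def load_pages_generic(text):
    root = ET.fromstring(text)
    pages = []
    for element in root.findall("element[@type='Page']"):
        title = element.find("property[@name='title']").get("value")
        pages.append(Page(title))
    return pages

def load_pages_domain(text):
    root = ET.fromstring(text)
    return [Page(page.get("title")) for page in root.findall("page")]

generic_pages = load_pages_generic(GENERIC_XML)
domain_pages = load_pages_domain(DOMAIN_XML)
print(generic_pages == domain_pages)
```

Both loaders recover the same pages, but the generic one has to know how types and properties are encoded before it can get at anything.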
A third kind of tool is a tool to check the well-formedness of data. This is somewhere between a tool that consumes data and a tool that creates data - checking can be an intrinsic part of both the creation and consumption process. For example, data that is input to artefact generators may need to pass a set of validation checks (in addition to ones which are intrinsic to the domain data) to guarantee successful artefact generation.
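A minimal sketch of such checking, assuming a generator that emits one class per concept and therefore needs unique names that are valid identifiers. The checks, the data and the function names are hypothetical; the first check is the kind that might be intrinsic to the domain, the second is specific to the generator.

```python
def check_unique_names(concepts):
    """Domain check: no two concepts may share a name."""
    seen, errors = set(), []
    for concept in concepts:
        if concept["name"] in seen:
            errors.append(f"duplicate name: {concept['name']}")
        seen.add(concept["name"])
    return errors

def check_valid_identifiers(concepts):
    """Generator check: names must be usable as class names in the output."""
    return [
        f"not a valid identifier: {concept['name']!r}"
        for concept in concepts
        if not concept["name"].isidentifier()
    ]

def validate(concepts, checks):
    """Run every check and collect all the errors rather than stopping at the first."""
    errors = []
    for check in checks:
        errors.extend(check(concepts))
    return errors

concepts = [{"name": "Page"}, {"name": "Page"}, {"name": "my field"}]
errors = validate(concepts, [check_unique_names, check_valid_identifiers])
for error in errors:
    print(error)
```

A generator would refuse to run while the error list is non-empty; a designer could run the same checks as the data is created.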
To conclude, the DSL Tools are not just about making it easy to build visual designers, tools for creating data, but also about making it easy to build tools that need to consume it and check it, in the context of automating aspects of the software development process. At the very least, we need to make sure that the data is easily accessed, both through APIs and the way the data is persisted in XML files. Even better if we can provide direct support for authoring such tools, such as the text artefact template technology.
I've just returned from OOPSLA and Redmond, and am pleased to see that the DSL tools workbench site is now live, delayed by a few teething problems with process - we should get better at this with each release of content. You can now download a preview version of the object model editor, which is accompanied by a document providing a walkthrough of that tool.
We'll be putting up some more content over the next few weeks - whitepapers, a video, that kind of thing - and planning to drop another preview, this time including the wizard and code generators (see my earlier posting), around Christmas.
When I've cleared the backlog of work from my two weeks away, I hope to return to more technical topics, probably starting with an explanation of the notation used in the object model editor.
I can now point you at a couple of announcements about the technology I've been working on. Here's the Microsoft Press announcement:
The exciting aspect of this is that we're going to start making the technology available to the community as soon as we can. Here's the site to watch:
I'll post to my blog as soon as some content is available - should be before the end of the week.
We gave a demo of some of the technology during Rick Rashid's keynote at the OOPSLA conference. Unfortunately I expect there'll be some flak about this - watching the keynote, our demo felt a bit like a 'commercial break'. I'll respond to the flak when it arrives.
So not quite the launch I'd hoped for, but that shouldn't detract from the technology itself. Here's a quick heads up on what we've been doing:
Our first release, at the end of this week, will be a preview of the object model editor. Previews of the other components should be available by the end of the year.
Here's what Microsoft are doing at OOPSLA this year: http://msdn.microsoft.com/architecture/community/events/oopsla2004/
Network connections permitting, I'll be blogging for the couple of days I'm there.
OOPSLA will be an interesting experience for me this year. In the past I've attended and presented as a researcher from the academic community. I wonder if the experience will be any different now I'm on the other side of the fence. It will also be a great opportunity to meet up with old friends - especially since I was unable to make the UML conference, the first time I've missed it since its inception.
And watch out for those announcements around software factories and domain specific languages...
I typed this entry a few days ago, but then managed to lose it through a set of circumstances I'm too embarrassed to tell you about. It's always better second time around in any case.
Anyway, reading this recent post from Simon Johnston prompted a few thoughts that I'd like to share. In summary, Simon likens UML to a craftsman's toolbox, that in the hand of a skilled craftsman can produce fine results. He then contrasts this with the domain specific language approach and software factories, suggesting that developers are all going to be turned into production-line workers - no more craftsmen. The argument goes something like this: developers become specialized in a software factory to work on only one aspect of the product through a single, narrowly focussed domain specific language (DSL); they do their work in silos, without any awareness of what the others are doing; this may increase the productivity of the individual developers, but leads to a less coherent solution.
Well, I presume this is a veiled reference to the recent book on Software Factories, written by Jack Greenfield and Keith Short, architects in my product group at Microsoft, and to which I contributed a couple of chapters with Steve Cook. The characterization of software factories suggested by Simon is at best an over-simplification of the vision presented in this book.
I trained as a mathematician. When constructing a proof in mathematics there are two approaches. Go back to the original definitions, the first principles, and work out your proof from there; or build on top of theorems already proven by others. The advantage of the first approach is that all you have to learn is the first principles and then you can set your hand to anything. The problem is that it will take you a very long time to prove all but the simplest theorems, and you'll continually be treading over ground you've trod many times before. The problem with the second approach is that you have to learn a lot more, including new notations (dare I say DSLs), and you inevitably end up becoming a specialist in a particular branch of the subject; but in that area you'll be a lot more productive. And it is not unknown for different areas of mathematics to combine to prove some of the more sophisticated theorems.
With software factories we're saying that to become more productive we need to get more domain specific so that we can provide more focused tooling that cuts out the grunt work and lets us get on with the more challenging and exciting parts of the job. As with mathematics, the ability to invent domain specific notations, and, in our case, the automated tools to support them, is critical to this enterprise. And sophisticated factories (that is, most of them) will combine expertise from different domains, both horizontal and vertical, to get the job done, just as different branches of mathematics can combine to tackle tricky problems.
So our vision of software factories is closer to the desirable situation described by Simon towards the end of his article, where he talks about the need for a "coherent set of views into the problem". Each DSL looks at the problem, the software system being built or maintained, from a particular perspective. These perspectives need to be combined with the other views to give a complete picture. If developers specialize in one perspective or another, then so be it, but that doesn't mean that they can sit in silos and not communicate with the others in the team. There are always overlaps between views and work done by one will impact the work of another. But, having more specialized tooling should avoid a lot of error-prone grunt work, and will make the team as a whole far more productive as a result.
So what about UML in all this? To return to Simon's toolbox analogy (and slightly tongue-in-cheek), UML is like having a single hand drill in the toolbox, which we've got to try and use to drill all sizes of hole (for large holes you drill a number of small holes close together), and in all kinds of material; some materials you won't be able to drill into at all. DSLs, on the other hand, are like having a toolbox full of drill bits of all different sizes, each designed to drill into a particular material. And in a software factory, you support your DSLs with integrated tooling, which is like providing the electric hammer-drill: you'll be a lot more productive with these specialist tools, and even do things you couldn't manage before, like drill holes in concrete.
So I don't see UML as a central part of the software factory/DSL story. I see it first and foremost as a language for (sketching) the design of object-oriented programs - at least this is its history and its primary use to date. Later versions of UML, in particular the upcoming UML 2, have tried to extend its reach by adding to the bag of notations that it includes. At best, this bag is useful inspiration in the development of some DSLs, but I doubt very much that they'll get used exactly as specified in the standard - as far as conformance against the standard can be checked that is...
I read on Jim Steel's blog that back in August there was lots of discussion on OMG lists about what makes a compliant MDA tool. I followed a link from his blog to this entry, and there were three aspects that intrigued me.
I used the phrase 'premature standardization' in an earlier post today. I'm rather pleased with it, as it is a crisp expression of something that has vexed me for some time, namely the tendency of standards efforts in the software space to transform themselves into a research effort of the worst kind - one run by a committee. I have certainly observed this first hand, where what seemed to be happening was not standardization of technologies that existed and were proven, but instead paper designs for technology that might be useful in the future. Of course, then I was an academic researcher so was quite happy to contribute, with the hope that my ideas would have a better chance of seeing the light of day as part of an industry standard than being buried deep in a research paper. I also valued the exposure to the concentration of clever and experienced people from the industry sector. But now, as someone from that sector developing products and worrying every day about whether those products are going to solve those difficult and real problems for our customers, I do wonder about the value of trying to standardize something which hasn't been tried and tested in the field, and, in some cases, not even prototyped. To my mind, efforts should be made to standardize a technology when:
Even if all these tests come up positive, it is rarely necessary to standardize all aspects of the technology, just that part which is preventing the competing technologies from interoperating: a square plug really will not fit in a round hole, so my French electrical appliance cannot be used in the UK, unless of course I use an adaptor...
If we apply the above tests to technologies for the development of DSLs, I'd say that we currently fail at least two of them. Which means that IMHO standardization of metamodelling and model transformation technologies is premature. We need a lot more innovation, a lot more tools, and, above all, many more customer testimonials that this stuff 'does what it says on the tin'.
Two posts in the same day. I guess I'm making up for the two month gap.
Anyway, Alan Wills and I are giving a tutorial at the UML conference in Lisbon in October. The tutorial is called "How to design and use Domain Specific Modeling Languages" and is on Tuesday 12th October in the afternoon. We promise you not much presentation, interesting exercises and lots of discussion and reflection.
Let's just recap where I've got to so far on the theme of modelling languages and tools. I started out with a reaction to an article by Grady Booch on DSLs, in particular why UML is not really the right tool for the job if this is the direction you want to go. I then talked about code generation, my thoughts prompted by an interesting article by Dan Haywood. Then, in the third entry, I talked about designing a visual language (strictly we should say pictorial or graphical language, as a textual language is also visual), focusing on the difference between designing one on paper and one to be used in a tool.
So what next? Well I'd like to return to the topic of DSLs, in particular try to pin down what is meant by the term 'domain specific language', why we need them, and how we can make it easier to build them. As I seem to be incapable of writing short entries, I've hived off the main content to a separate article.
A domain specific language is a language that's tuned to describing aspects of the chosen domain. Any language can be domain specific, provided you are able to identify the domain it is specific to and demonstrate that it is tuned to describe aspects of that domain. C# is a language specific to the (rather broad) domain of OO software. It's not a DSL for writing insurance systems, though. You could use it to write the software for an insurance system, but it's not exactly tuned to that domain.
So what is meant by the term 'domain'? A common way to think about domains is to categorize them according to whether they are horizontal or vertical. Vertical domains include, for example: insurance systems, telephone billing systems, aircraft control systems, and so on. Horizontal domains include, for example, the bands in the classic waterfall method: requirements analysis, specification, design, implementation, deployment. New domains emerge by intersecting verticals and horizontals. So, for example, there is the domain of telephone billing systems implementation, which could have a matching DSL for programming telephone billing systems.
Domains can be broad or narrow, where broad ones can be further subdivided into narrow ones. So one can talk about the domain of real-time systems, with one sub-domain being aircraft control systems. Or the domain of web-based systems for conducting business over the internet, with a sub-domain being those particular to insurance versus another sub-domain of those dealing in electrical goods, say. And domains may overlap. For example, the domain of airport baggage control systems includes elements of real-time systems (the conveyer belts etc. that help deliver the luggage from the check-in desks to the aircraft) and database systems (to make a record of all the luggage checked in, its weight and who it belongs to, etc.).
So there are lots of domains. But is it necessary to have a language specific to each of them? Couldn't we just identify a small number of general purpose languages that cover the broad domains, and just use those for the sub-domains as well?
What we notice in this approach is that users demand general purpose languages that have extensibility mechanisms which allow the base language to be customized to narrower domains. There's always a desire to identify domain specific abstractions, because the right abstractions can help separate out the things that vary between systems in a domain and things that are common between them: you then only have to worry about the things that vary when defining systems in that domain.
Two extensibility mechanisms in common use today are:
These mechanisms take you so far, but do not exactly deliver customized languages that intuitively capture those domain specific abstractions - the problem is that the base language gets in the way. Using OO code frameworks is not exactly easy: it requires you to understand all or most of the mechanisms of the base language; then, although you get clues from the names of classes, methods and properties on where the extension points are, there is no substitute for good documentation, a raft of samples and understanding the framework architecture (patterns used and so on). Stereotypes and tagged values in UML are powerful in that you can decorate a model with virtually any data you like, but that data is generally unstructured and untyped, and often the intended meaning takes you a long way from the meaning of the language as described in the standard. Neither OO framework mechanisms nor UML extensibility mechanisms allow you to customize the concrete notation of the language, though some UML tools allow stereotypes to be identified with bitmaps that can be used to decorate the graphical notation.
Instead of defining extensibility mechanisms in the language, why not just open up the tools used to define languages in the first place, either to customize an existing language or create a new one?
Well, it could be argued that designing languages is hard, and tooling them (especially programming languages) even harder. And the tools used to support the language design process can only be used by experts. That probably is the case for programming languages, but I'm not sure it needs to be the case for (modelling) languages that might target other horizontal domains (e.g. design, requirements analysis, business modelling), where we are less interested in efficient, robust and secure execution of expressions in the language, and more interested in using them for communication, analysis and as input to transformations. Analysis may involve some execution, animation or simulation, but, as these models are not the deployed software, it doesn't have to be as efficient, robust or secure. Other forms of analysis include consistency checking with other models, possibly expressed in other DSLs, taking metrics from a model, and so on. Code generation is an obvious transformation that is performed on models, but equally one might translate models into other (non-code) models.
It could also be argued that having too many languages is a barrier to communication - too much to learn. I might be persuaded to agree with that statement, but only where the languages involved are targeted at the same domain and express the same concepts differently for no apparent reason (e.g. UML reduced the number of languages for sketching OO designs to one). Though it is worth pointing out that just having one language in a domain can lead to stagnation, and for domains where the languages and technologies are immature, inevitably there will be a plethora of different approaches until natural selection promotes the most viable ones - unless of course this process is interrupted by premature standardization :-). On the other hand, where a language is targeted on a broad domain, and then customized using its own extensibility mechanisms, the result carries a whole new layer of meaning (OO frameworks, stereotypes in UML), or even an entirely different meaning (some advanced uses of stereotypes). In the former case, there is a chance that someone who just understands the base language might be able to understand the extension without help; in the latter case, I'd argue that the use of the base language can actually hinder understanding, as it replaces the meaning of existing notation with something different.
Finally, whether we like it or not, people and organizations will continue to invent and use their own DSLs. Some of these may never ever get completed and will continue to evolve. Just look at the increasing use of XML to define DSLs to help automate the software development process - input to code generators, deployment scripts and so on. Yes, XML is a toolkit for defining DSLs; it's just that there are certain things missing: you can't define your own notation, certainly not a graphical one; the one you get is verbose; validation of well-formedness is weak.
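To illustrate the last point, here is a small Python sketch using an invented XML deployment script. The element and attribute names are hypothetical. The XML parser accepts the document because it is well-formed XML, yet it says nothing about whether the document makes sense in the domain; that check has to be written separately.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML-based DSL for deployment: well-formed, but 'web02' is never declared.
DEPLOYMENT = """
<deployment>
  <server name="web01"/>
  <install package="shop" on="web02"/>
</deployment>
"""

# Parsing succeeds: as far as XML is concerned there is nothing wrong.
root = ET.fromstring(DEPLOYMENT)

# A domain rule that XML well-formedness knows nothing about:
# every install step must target a declared server.
servers = {server.get("name") for server in root.findall("server")}
dangling = [
    install.get("on")
    for install in root.findall("install")
    if install.get("on") not in servers
]
print(dangling)
```

A schema can tighten the structure of such a file, but rules of this cross-referencing kind are the ones that tend to end up hand-coded in each tool that reads it.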
Am I going to tell you what a toolkit for building domain specific modelling languages should look like? Soon I hope, but I've run out of time now. And I'm sure that some folks reading this will give feedback with pointers to their own kits.
One parting thought. In this entry, I have given the impression that you identify a domain and then define one or more languages to describe it. But perhaps it's the other way round: the language defines the domain…
In this posting I continue on the theme of designing tools and notations to support modeling.
On and off I've spent the last eight years thinking about the design of graphical (as opposed to purely textual) notations for use in software development. Until I started to build tools to support these notations, my thinking was unintentionally skewed towards how a notation would be used on paper or a whiteboard. As soon as I began building tools I realized that there's a whole range of facilities for viewing, navigating and manipulating models, available in a tool but not on paper or whiteboard. Perhaps I should have realized this from just using modeling tools, but it seems that something more was required to make it sink in! Anyway, I have also noticed that these facilities can not always be exploited if they are not taken into account when the notation is designed: there is a difference between designing a notation to be used in a tool and one to be used on paper. I have written a short article which lays out the argument in a little more detail, and gives a couple of concrete examples. As always, your feedback is valued.
Tools make available a whole range of facilities for viewing, navigating and manipulating models through diagrams, which are not available when using paper or a whiteboard. Unfortunately, these facilities can not always be exploited if they are not taken into account when the notation is designed: there is a difference between designing a notation to be used in a tool and one to be used on paper.
Some of the facilities available in a tool which are not available on paper:
To see how designing a notation for use in a tool can lead to different results than if the focus is paper or whiteboard use, let's take a look at two well-known notations: UML class diagrams and UML state machines. The former, I would argue, has not been well designed for use in a tool; whereas the latter benefits from features which can be exploited in a tool.
Class diagrams. A common use of class diagrams is to reverse engineer the diagram from code. This helps to understand and communicate designs reflected in code. Industrial scale programs have hundreds of classes; frameworks even more. Ideally, one would like a tool to create readable and intuitive diagrams automatically from code. However, the design of the notation militates against this. Top of my list of problems is the fact that the design of associations, with labels at each end, precludes channeling of lines, where channeling allows many lines to join a node at the same endpoint (inheritance arrows are often channeled to join the superclass shape at the same point). Because labels are placed at the ends of association lines, each line has to touch a class shape at a different end point in order to display the labels. This exacerbates line crossing and often means that class nodes need to be much larger than they ought to be, making diagrams less compact than they ought to be and making a good layout much harder to achieve automatically.
State Machines. State machines can get large with many levels of nesting. This problem can be mitigated using zoom facilities, by a control that allows one to expand/collapse the inside of a state or a link that launches the nested state machine in a new window. As transitions can cross state boundaries, using any of these facilities means that you'll need to distinguish between when a transition (represented by an arrow) is sourced/targeted on a state nested inside and currently hidden from view, or on the state whose inside has been collapsed. This distinction requires a new piece of notation. In UML 1.5, the notation of stubbed transitions (transitions sourced or targeted on a small solid bar) was introduced to distinguish between a transition which touched a boundary and one which crossed it. Interestingly, in UML 2 this notation has been replaced by notation for entry/exit points. In this scheme, one can interrupt a transition crossing the boundary of a state by directing it through an entry or exit point drawn on the edge of the state. This is the state machine equivalent of page continuation symbols that one gets with flowcharts. However, it can be used to assist with expand/collapse in a tool: collapsing the inside of a state leaves the entry/exit point visible, so one can still see the difference between transitions which cross the boundary and those which don't. In one sense, this isn't quite as flexible a solution as the solid bar notation, as, for expand/collapse to work in all cases, it requires all transitions that cross a boundary to be interrupted by an entry/exit point. I guess a tool could introduce them and remove them as needed, but I must confess that I prefer the stubbed transition notation for the expand/collapse purpose - seems more natural somehow.
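The distinction a tool has to draw when a state is collapsed can be sketched in a few lines of Python. The state machine below is invented, and the '(stub)' marker simply stands in for whichever notation is used (a stubbed transition or an entry/exit point); it is a sketch of the rule, not of any particular tool.

```python
# Hypothetical nested state machine: each state maps to its parent (None for top level).
parent = {"Active": None, "Idle": None, "Dialing": "Active", "Connected": "Active"}

transitions = [
    ("Idle", "Dialing"),       # crosses the boundary of Active
    ("Connected", "Idle"),     # crosses the boundary of Active
    ("Active", "Idle"),        # touches the boundary of Active
    ("Dialing", "Connected"),  # lies entirely inside Active
]

def visible_end(state, collapsed):
    """Return (state drawn on the diagram, True if the real endpoint is hidden inside it)."""
    drawn, hidden = state, False
    current = state
    while parent[current] is not None:
        current = parent[current]
        if current in collapsed:
            # Keep walking up so the outermost collapsed ancestor wins.
            drawn, hidden = current, True
    return drawn, hidden

def render(transitions, collapsed):
    """Describe each transition as it would appear with the given states collapsed."""
    lines = []
    for source, target in transitions:
        s, s_hidden = visible_end(source, collapsed)
        t, t_hidden = visible_end(target, collapsed)
        if s == t and s_hidden and t_hidden:
            continue  # both ends are hidden inside the same collapsed state
        s_mark = " (stub)" if s_hidden else ""
        t_mark = " (stub)" if t_hidden else ""
        lines.append(f"{s}{s_mark} -> {t}{t_mark}")
    return lines

rendered = render(transitions, collapsed={"Active"})
for line in rendered:
    print(line)
```

Without the marker, the first and third transitions would both be drawn as plain arrows touching Active, and the reader could not tell which one really ends on a nested state.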
To conclude, designing a notation for use in a tool can lead to different decisions than if the focus is on paper or whiteboard use. However, much as we might like to think that a notation will only be used in a tool, there will always be times when we need to see or create it on paper or a whiteboard, and this has to be balanced against the desire to take advantage of what tools can offer. For example, it is always a good idea to incorporate symbols to support 'page continuation', which will make it easier to provide automated assistance for cutting a large virtual diagram into page-sized chunks (and, as we have seen, such symbols can also support other facilities like expand/collapse). And it is always worth considering whether the notation is sketchable. If not, it may be possible to define a simplified version which is. For example, one could provide alternatives for sophisticated shapes or symbols that look great in a tool, but are very hard to use when sketching.
An important part of my job is to analyze requirements and write specifications. I have found the technique of storyboarding to be extremely effective. A storyboard is a concrete rendition of a particular scenario - could be a use case or an XP story (see Martin Fowler on UseCasesAndStories for an explanation of the difference). Not only are storyboards great for sorting out requirements, they are also effective in communicating what needs to be built to developers, and as a starting point for the development of tests.
If what you are building is exposed to the user through UI, then the most likely form of your storyboard is a click-through of the UI. If what you are building is infrastructure, say an API, then your storyboard might be a click-through of the experience a developer goes through in authoring code that uses the API, or it might be a filmstrip - a series of object diagrams that illustrate what happens to the state of the system as methods are executed.
Tools that I have found to be particularly effective in developing storyboards, especially those that click-through a UI experience, are Powerpoint and Visio. You can really bring storyboards alive in Powerpoint using a combination of moving through slides, using its custom animation capabilities, and by mixing and matching graphics from multiple sources. And the Windows XP template in Visio is pretty useful too. I've put up an article with specific hints and tips on using Powerpoint and Visio for this purpose.
Interestingly, I soon move from sketching a storyboard on paper or the whiteboard to committing it to Powerpoint, as the latter imposes a discipline that forces you to make decisions and care about the detail. It's so easy to continue hand-waving and putting off those hard decisions at the whiteboard. I don't spend a large amount of time getting the UI 'just so'. It just needs to be good enough to tease out those important decisions about what the requirements actually are.
If anyone has suggestions for alternative tools that could be used for this purpose, then please add them as comments to this post.
Here are a few techniques I have found useful for building storyboards or click-throughs using Powerpoint and Visio. If you have further suggestions please add them as comments to this article.
Making parts appear and disappear
Use custom animation in PowerPoint. You can change the order in which things appear/disappear, and decide whether the effect should happen on mouse click or automatically after the previous effect. Alternatively, copy the slide and add/remove the part to/from the new slide; the animation then happens through slide transitions.
Don't do everything in one slide
Otherwise the animation will become unmanageable. I tend to have one slide per step in the scenario. Use your common sense to decide on the granularity of steps.
Use 'callouts' to comment on aspects of the scenario
A callout is a text box with an attached line that can point to a particular aspect of the graphic. I tend to color my text boxes light yellow (like a yellow post-it note). Remember callouts can be made to appear and disappear too. Using callouts saves having separate slides of text notes, which avoids breaking up the flow. The notes can also be set in context.
How to get a mouse pointer to move
Paste in a bitmap of the pointer. Select the bitmap. Use SlideShow > Custom Animation > Add Effect > Motion Paths to define a path along which the pointer should move.
Build/get a graphic for your application shell
A graphic of the application shell can be used as a backdrop for all the other animation. Use Alt-PrintScreen to create a bitmap of the shell for your running application. This works if you're building a plugin for an existing application, or you have an existing application with a similar shell to the one you're storyboarding. Alternatively build a graphic of the shell using the Windows XP template in Visio.
Underlying Dan's article seemed to be the assumption that models are just used as input to code generation. To be fair, the article was entirely focused on the OMG's view of model driven development, dubbed MDA, which tends to lean that way. My own belief is that there are many useful things you can use models for, other than code generation, but that's the topic of a different post. I'll just focus here on code generation.
So which path to follow? Translationist or elaborationist?
In the translationist approach, the model is really a programming language and the code generator a compiler. Unless you are going to debug the generated (compiled) code, this means that you'll need to develop a complete debugging and testing experience around the so-called modeling language. This, in turn, requires the language to be precisely defined, and to be rich enough to express all aspects of the target system. If the language has graphical elements, then this approach is tantamount to building a visual programming language. The construction of such a language and associated tooling is a major task that requires specialist skills. It will probably be done by a tool vendor in domains where there is enough of a market to warrant the initial investment. Indeed, one doesn't have to look far for examples. There are several companies who have built businesses on the back of this approach to MDA, especially in the domain of real-time, embedded systems. And, for obvious reasons, they have been leading efforts to define a programming language subset of UML, called Executable UML, xUML or xtUML, depending on which company you talk to.
In contrast, the elaborationist approach to code generation does not require the same degree of specialist skill or upfront investment. It can start out small and grow organically. However, there are pitfalls to watch out for. Here are some that I've identified:
Of course, we have been talking to our customers and partners about their needs in this area. But we're always keen to receive more feedback. If you've been using code generation, then I'd like to hear from you. Has it been successful? What techniques have you been using to write the generators? To write the models? What pitfalls have you encountered? What development tools would have made the job easier?
Here's a great article on MDA that my colleague Steve Cook pointed me at:
A couple of highlights for me are:
In this, my first posting, I want to react to a recent posting by Grady Booch on his blog. I've got a fair bit to say, so I've put it in an article.
"I was delighted to see today's report that Sun has announced support for the UML in their tools. This comes on the heels of Microsoft telegraphing their support for modeling in various public presentations, although the crew in Redmond seem to trying to downplay the UML open standard in lieu of non-UML domain-specific languages (DSL). […] There is no doubt that different domains and different stakeholders are best served by visualizations that best speak their language - the work of Edward Tufte certainly demonstrates that - but there is tremendous value in having a common underlying semantic model for all such stakeholders. Additionally, the UML standard permits different visualizations, so if one follows the path of pure DSLs, you essentially end up having to recreate the UML itself again, which seems a bit silly given the tens of thousand of person-hours already invested in creating the UML as an open, public standard."
Let's start with the statement "There is no doubt that different domains and different stakeholders are best served by visualizations that best speak their language". This seems to imply that a domain-specific language is just about having a different graphical notation - that the semantics that underpin different DSLs are in fact the same, and only the notation changes. This view is further reinforced by the statement "but there is tremendous value in having a common underlying semantic model for all such stakeholders".
How can one disagree with this? Well, what if the semantic model excludes the concepts that the stakeholders actually want to express? If you examine how folks use UML, especially those who are trying to practice model driven development, you'll see that exactly the opposite is happening. Model driven development is forcing people to get very precise about the semantics of the models they write (in contrast to the sometimes contradictory, confusing and ambiguous semantics that appears in the UML spec). This precision is embodied in the code generators and model analysis tools that are being written. And, surprise, surprise, there are differences from one domain to another, from one organization to another. Far from there being a common semantics, there are significant and actual differences between the interpretations of models being taken. And how are these semantic differences being exposed notationally? Well, unfortunately, you are rather limited in UML in how you can adapt the notation. About all you can do is decorate your diagrams with stereotypes and tagged values. This leads to (a) ugly diagrams and (b) significant departure from the standard UML semantics for those diagrams (as far as this can be pinned down).
I speak from experience. I once worked on the Enterprise Application Integration UML profile, which has recently been ratified as a standard by the OMG. The game here was to find a way of expressing the concepts you wanted, using UML diagrams decorated with stereotypes and trying to align with semantics of diagrams as best you could. It boiled down to trying to find a way of expressing your models in a UML modelling tool, at first to visualize them, and then, if you were brave enough, to get hold of the XMI and generate code from them. So in the EAI profile, we bastardized class diagrams to define component types (including using classes to define kinds of port), and object diagrams were used to define the insides of composite components, by representing the components and ports as objects, and wires as links. Now, you can hardly say that this follows the "standard" semantics of UML class diagrams and object diagrams.
And this gets to the heart of the matter. People are using UML to express domain-specific models, not because it is the best tool for the job, but because it saves them having to build their own visual modelling tool (which they perceive as something requiring a great deal of specialist expertise). And provided they can get their models into an XML format (XMI), however ugly that is, they can at least access them for code generation and the like. Of course, people can use XML for this as well, provided they don't care about seeing their models through diagrams, and they are prepared to edit XML directly.
So, rather than trying to force everyone to use UML tools, we should be making it easy for people to build their own designers for languages tailored to work within a particular domain or organization's process, not forgetting that these languages will often be graphical and that we'll want to get at and manipulate the models programmatically. And this does not preclude vendors building richer designers to support those (often horizontal) domains where it is perceived that the additional investment in coding is worthwhile. Indeed, Microsoft are releasing some designers in this category with the next release of Visual Studio.
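To make the point about getting at models programmatically a little more concrete, here is a small Python sketch. It assumes a made-up XML format that names components, ports and wires directly, rather than encoding them as stereotyped classes and objects in XMI. The element names, the sample model, and the `Port` and `connect` names emitted into the generated text are all hypothetical; nothing here is the EAI profile's actual format or any shipping tool's output.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain-specific model: the domain concepts are the vocabulary.
MODEL_XML = """
<system name="OrderFlow">
  <component name="OrderIntake">
    <port name="orders_out" direction="out"/>
  </component>
  <component name="Billing">
    <port name="orders_in" direction="in"/>
  </component>
  <wire from="OrderIntake.orders_out" to="Billing.orders_in"/>
</system>
"""


def load_model(xml_text):
    """Parse the model into a system name, a component table and a wire list."""
    root = ET.fromstring(xml_text)
    components = {
        c.get("name"): [(p.get("name"), p.get("direction")) for p in c.findall("port")]
        for c in root.findall("component")
    }
    wires = [(w.get("from"), w.get("to")) for w in root.findall("wire")]
    return root.get("name"), components, wires


def generate_code(xml_text):
    """Emit skeleton source text; Port and connect are assumed runtime helpers."""
    system, components, wires = load_model(xml_text)
    lines = [f"# Generated from model '{system}'"]
    for name, ports in components.items():
        lines.append(f"class {name}:")
        if not ports:
            lines.append("    pass")
        for port, direction in ports:
            lines.append(f"    {port} = Port({direction!r})")
    for source, target in wires:
        lines.append(f"connect({source}, {target})")
    return "\n".join(lines)


generated = generate_code(MODEL_XML)
print(generated)
```

Because the format speaks in terms of components, ports and wires, the generator does not need to reinterpret classes and links to recover what the modeler meant.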