Stuart Kent - Building developer tools at Microsoft - @sjhkent
When I'm creating domain models, the ability to place properties on a relationship is proving very useful.
For example, at the moment I'm remodelling our designer definition format (we generate it from a domain model) and have relationships between Connector and Decorator and between Shape and Decorator. Decorators have a position, which is modelled as an enumeration (inner-top-right, etc.), but the values of the enumeration are different depending on whether the decorator is on a Shape or on a Connector (inner-top-right is not a meaningful position for a connector). Without properties on relationships, we'd have to subclass Decorator to ShapeDecorator and ConnectorDecorator, and all we'd be adding would be a differently typed Position property in each case. With properties on relationships, we can just attach a differently typed Position property to the relationships from Shape to Decorator and from Connector to Decorator, respectively - no subclassing required.
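To make the idea concrete, here is a minimal Python sketch of relationships that carry their own, differently typed Position property. It is only an illustration, not the DSL Tools format: the class names ShapeHasDecorator and ConnectorHasDecorator and every enumeration value other than inner-top-right are invented for the example.

```python
from dataclasses import dataclass
from enum import Enum


class ShapePosition(Enum):
    INNER_TOP_RIGHT = "inner-top-right"
    INNER_BOTTOM_LEFT = "inner-bottom-left"   # hypothetical value


class ConnectorPosition(Enum):
    SOURCE_END = "source-end"                 # hypothetical value
    TARGET_END = "target-end"                 # hypothetical value


@dataclass(frozen=True)
class Decorator:
    name: str


@dataclass(frozen=True)
class Shape:
    name: str


@dataclass(frozen=True)
class Connector:
    name: str


# The relationships carry the Position property, each with its own
# enumeration type, so Decorator itself needs no subclasses.
@dataclass(frozen=True)
class ShapeHasDecorator:
    shape: Shape
    decorator: Decorator
    position: ShapePosition


@dataclass(frozen=True)
class ConnectorHasDecorator:
    connector: Connector
    decorator: Decorator
    position: ConnectorPosition


label = Decorator("NameLabel")
on_shape = ShapeHasDecorator(Shape("ClassShape"), label, ShapePosition.INNER_TOP_RIGHT)
on_connector = ConnectorHasDecorator(Connector("Association"), label, ConnectorPosition.TARGET_END)
```

The same Decorator takes part in both links, and the position lives on the link. Python does not enforce the annotations at run time, so a static type checker would be needed to reject a ShapePosition on a connector link.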
UML has associations which are like our relationships. You can attach properties (attributes) to associations via their association classes. UML also has qualified associations, where you can index links of the associations by a property - e.g. an integer or a position. But it seems to me that one could achieve the effect of qualified associations by adding attributes to association classes, as we add properties to relationships. So, in my mind, if you've got association classes, qualified associations are redundant.
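Here is a small Python sketch of that argument, under my own assumptions: the bank/account example and all the names in it are hypothetical. The link object plays the role of the association class, and the qualified view is derived by indexing the links on one of its attributes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    holder: str


@dataclass(frozen=True)
class BankHasAccount:
    """Association class: one link from a bank to an account."""
    bank: str
    account: Account
    account_number: int  # attribute on the link, acting as the qualifier


links = [
    BankHasAccount("Northern", Account("Ada"), 1001),
    BankHasAccount("Northern", Account("Grace"), 1002),
]


def qualified(all_links, bank):
    """Derive the qualified view: qualifier value -> linked accounts."""
    index = defaultdict(list)
    for link in all_links:
        if link.bank == bank:
            index[link.account_number].append(link.account)
    return dict(index)


by_number = qualified(links, "Northern")
```

Nothing here needs a separate qualifier concept: the index is just a view computed from an attribute of the link.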
Am I missing something?
Last night the conference banquet was held for the MoDELS conference in Jamaica. The MoDELS conference changed its name this year - it used to be known as the UML conference, and UML is still used in the byline. I'm the general chair for the conference this year, and Microsoft sponsored the banquet. As you know, Microsoft is arguing for domain specific languages, with UML playing a useful role in some circumstances within that approach.
So I was mildly amused to see the title printed on the menu that interpreted the acronym 'UML' as 'Universal Modeling Language'. I also noticed that alongside the UML logo on the conference program, the acronym is expanded as 'United Modeling Language'. Having experienced OMG politics at first hand during the standardization process for UML 2.0, I find the latter interpretation particularly ironic.
This reminds me of a game I once played with colleagues in a quiet moment at an OMG meeting. It's surprising how many substitutions for the letter 'U' one can come up with. Kept us going for a good 30 minutes.
I'm in Jamaica at the MoDELS 05 conference.
Yesterday I attended a workshop on model transformation, where a number of different techniques were presented. The organizers had asked all submitters to apply their technique to a standard example (the object-relational mapping) so it was quite easy to compare the different approaches. There were also some excellent discussions. Here's my distillation of key take-aways:
1) Most of the approaches modeled the tracing data (i.e. the mapping itself) in some way. Transformations created both the tracing data as well as the target model. Some of the approaches (e.g. triple graph grammars) used the tracing data generated by one application of the transformation as input into the next application.
2) Rules that use pattern matching were a common theme running through most (though not all) techniques. Placing priorities on rules was one way of controlling the order in which rules are processed. Some combined rules with imperative code, to put control structure around rules or to 'finish' off work done by a rule. Some techniques used constraint solvers to avoid writing any imperative code at all.
3) There was an interesting discussion about specifying and testing transformations. One could argue that a dedicated model transformation language, if it is any good, is high level enough not to require a separate specification. Even so, it's still necessary to validate that the transformation meets the business need: does it produce the expected results for designated example input models? So we need testing frameworks for delivering input models to a set of rules and checking the output is correct. How do we check the output is correct? It depends on what it is. If the model is executable, you can test its execution to see that it has the desired behavior. If not you can at least inspect it and check that it is well formed. One can also write well formedness constraints on the tracing model (see 1) and check that generated traces are well formed. This then led into a discussion about debugging transformation rules... here again the tracing data may be useful information, especially if the order in which rules have been fired, hence tracing information created, is also kept.
(An aside: In building DSL Tools we have been faced with the issue of specifying code generators, which are expressed as text templates. An effective way of specifying them, we have found, is to describe the behavior of the generated code for various combinations of inputs.)
4) Another interesting discussion emerged around the topic of bidirectional mappings and model management. Suppose we have two models where one is not fully generated from the other - they are both edited directly in some way. The goal is then to keep them consistent and help the user bring them back to consistency, and there would probably need to be UI specific to the particular transformation in question to do this. Again, tracing information seems important in this scenario. But now consider a team scenario with multiple models and multiple mappings between them. Different team members make changes to models, checking them into source control. How do you go about keeping all models consistent with one another across the mappings? What are the steps a developer must take when they check in? When dealing with a large code base through source control, you soon learn to use diff and merge tools and also soon learn that there are different levels of consistency: will the code build? does it run all tests? does it meet all scenarios? I think it's a similar situation with models. We need diff and merge tools for models, and they have to be domain specific, just like the languages. We need to start considering what the different levels of consistency are: are there any unresolved cross references between model elements? are the models well-formed? are the mappings between models consistent?
5) Finally there was a brief discussion about diagrams and diagram layout. The point being that if you run a transformation to create a new model, then what about creating and laying out the diagram to go with that model (assuming a graphical modeling language here)? And what about getting that layout to be a transformation of the layout of the source model?
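To illustrate points 1 and 3, here is a minimal Python sketch of a transformation that produces tracing data alongside the target model, plus a well-formedness check on the traces. It is not any of the techniques presented at the workshop; the rule names, the type mapping and the data classes are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    type: str


@dataclass(frozen=True)
class Class:
    name: str
    attributes: tuple


@dataclass(frozen=True)
class Column:
    name: str
    sql_type: str


@dataclass(frozen=True)
class Table:
    name: str
    columns: tuple


@dataclass(frozen=True)
class Trace:
    rule: str        # which rule fired
    source: object   # element of the input model
    target: object   # element of the output model


SQL_TYPES = {"string": "VARCHAR", "int": "INTEGER"}  # assumed type mapping


def transform(classes):
    """Map each class to a table, recording a trace per rule firing, in firing order."""
    tables, traces = [], []
    for cls in classes:
        columns = []
        for attr in cls.attributes:
            col = Column(attr.name, SQL_TYPES[attr.type])
            columns.append(col)
            traces.append(Trace("attribute2column", attr, col))
        table = Table(cls.name, tuple(columns))
        tables.append(table)
        traces.append(Trace("class2table", cls, table))
    return tables, traces


def trace_is_well_formed(classes, tables, traces):
    """Every class has exactly one class2table trace, pointing at an output table."""
    for cls in classes:
        hits = [t for t in traces if t.rule == "class2table" and t.source is cls]
        if len(hits) != 1 or hits[0].target not in tables:
            return False
    return True


person = Class("Person", (Attribute("name", "string"), Attribute("age", "int")))
tables, traces = transform([person])
```

Because the traces are appended in the order the rules fire, the list doubles as a simple debugging log of the run.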
Interesting, uh? Well I thought so, at least.
If you're interested in Workflow then you'll want to have a look at Windows Workflow Foundation, announced at last week's PDC. Here are some links to get you going:
The main page: http://msdn.microsoft.com/windowsvista/building/workflow/
An introductory article.
Dave Green's blog. Dave is the architect of Windows Workflow.
A new release of DSL Tools is now available. You can download it from:
The readme included in the zip file provides more information. This is hot off the press - the updates to the main DSL Tools site haven't filtered through yet.
The list of known issues that accompanies this release is at:
This release still works with VS2005 Beta2 - same as the May release. Quoting from the readme, new this release:
As Jochen points out, our next release should be available very soon after the RTM release of VS2005, and will work with that release. Other features planned for that release are:
[edited to update link to download page, instead of the file itself]
Jack Greenfield asked me to mention that the organizing committee have extended the deadline for submission of position papers to the software factories workshop being held at OOPSLA05. 26th August is the new deadline.
Unfortunately I won't be at OOPSLA this year, as I'm general chair for the MoDELS conference in Jamaica, and I don't want to take any more time away from getting our V1 of DSL Tools out the door. There are some good workshops there too, including a workshop on model transformation for which I'm on the programme committee. Better get your skates on though - the submission deadline for that is August 15th, next Monday!
Also the DSL Tools team is looking for new graduates to be developers in Cambridge, UK. Contact Steve Cook if you are interested.
Edward Bakker has been blogging his experience of using DSL Tools: http://www.edwardbakker.nl/
This is great feedback for us. Edward, rest assured that we are fixing the "keeping dd in synch with dmd" problem for the V1 release, as one of the many things that we'll be doing.
Indeed, the reason I've been a little quiet on my blog recently is because we have been engaged in an intense period of planning, nailing down the scenarios and feature set that we'll be targeting for V1. I hope to be able to post a roadmap to V1, with some details of those features and scenarios fairly soon after I get back from vacation - should be early September.
I'm involved in a workshop on model to model transformations at the MoDELS conference this year.
The call for papers is at http://sosym.dcs.kcl.ac.uk/events/mtip/
An interesting feature of this workshop is that they're asking all participants to apply their favourite transformation techniques to a common mapping problem, so the workshop can more easily contrast and compare approaches. The results should be interesting.
For all those who've installed the May 2005 version of DSL Tools (the one that works on VS2005 Beta 2), our friends over at ModeliSoft have just updated their tool - the one that maintains a designer definition as the domain model changes - to work with the new version. The announcement is here. Have fun.
[Edit - for those of you who can't see the post on the forum, the url is http://www.modelisoft.com/Dmd2dd.aspx]
Martin has just put up an article on Language Workbenches - IDEs for creating and using DSLs.
As you'd expect from Martin, this is an insightful piece, with enough follow-on links to keep you interested and busy for days.
One of the links is to a second article on code generation. Here Martin explains how to write a code generator for a DSL. The first point that comes out is the important distinction between concrete and abstract syntax. This distinction allows a language to have a number of concrete views, which map to the same abstract structure, which code generators then take as input. This saves having to rewrite the code generator every time you add a new concrete view to the language. In our own DSL Tools, we are emphasizing graphical and XML concrete syntaxes for languages. We also generate an API from a language definition which allows direct access to the abstract data structures in memory for purposes such as code generation (all the file handling is done for you).
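The distinction can be sketched in a few lines of Python. This is only an illustration under my own assumptions: XML and JSON stand in for the two concrete views (our tools emphasize graphical and XML syntaxes), and the Entity structure and the generator are invented for the example.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """Abstract syntax: the only thing the generator ever sees."""
    name: str
    fields: tuple


def parse_xml(text):
    """Concrete view 1: XML."""
    root = ET.fromstring(text)
    return Entity(root.get("name"), tuple(f.get("name") for f in root.findall("field")))


def parse_json(text):
    """Concrete view 2: JSON."""
    data = json.loads(text)
    return Entity(data["name"], tuple(data["fields"]))


def generate(entity):
    """One generator, written once against the abstract structure."""
    lines = [f"class {entity.name}:"]
    args = ", ".join(entity.fields)
    lines.append(f"    def __init__(self, {args}):")
    lines += [f"        self.{f} = {f}" for f in entity.fields]
    return "\n".join(lines)


xml_view = '<entity name="Customer"><field name="id"/><field name="email"/></entity>'
json_view = '{"name": "Customer", "fields": ["id", "email"]}'
```

Adding a third concrete view means writing one more parser; generate stays untouched.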
Martin continues, in the article, to talk about code generation itself. The first approach he demonstrates is not really a generator at all, but rather an interpreter. This is written in plain code and makes use of reflection. The second approach uses text templates. In generating code for designers from definitions of DSLs, we have found text templates to be our preferred method for writing code generators. We wrote our own text templating engine, which is included as part of DSL Tools. We have taken great care to architect the engine so that it can be integrated into different contexts, which means that it can be hosted in different environments (e.g. inside Visual Studio or not) and can accept inputs from multiple sources. For DSLs, we've built a Visual Studio host and the extensions that allow direct access within templates to models in memory through the generated APIs mentioned above. My colleague Gareth Jones has blogged about the engine, and its use in a DSL Tools context is illustrated in the walkthroughs that are part of the DSL Tools download. We're actively working on more complete documentation for the engine itself, including the APIs.
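For readers who haven't met text templates, here is a minimal sketch of the general idea using Python's standard string.Template. It is not our engine and says nothing about its syntax or APIs; the template text and the render_class helper are invented for the example.

```python
from string import Template

# Boilerplate with holes; the model data fills the holes.
CLASS_TEMPLATE = Template(
    "public partial class $name\n"
    "{\n"
    "$properties"
    "}\n"
)
PROPERTY_TEMPLATE = Template("    public $type $name;\n")


def render_class(name, properties):
    """Expand the templates for one class given (name, type) pairs."""
    body = "".join(PROPERTY_TEMPLATE.substitute(type=t, name=n) for n, t in properties)
    return CLASS_TEMPLATE.substitute(name=name, properties=body)


output = render_class("Decorator", [("Name", "string"), ("Width", "double")])
```

The appeal is that the template looks like the code it produces, so whoever maintains the generator can read the output shape straight off the template.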
Aspects that Martin did not touch on in his article include the issues of orchestrating the generation of multiple files from multiple sources, integration with source control (though it is a moot point whether generated files should be checked into source control or not), as well as how to handle cases where 100% code generation is not feasible. Particularly tricky are the cases where code in the same file has to be further added to - skeleton code is generated, but the programmer has to fill in method bodies, for example. We haven't answers to these yet, but they're on the roadmap.
I see a transcript of the DSL Tools web chat is now available. Members of the DSL Tools team answered questions from customers for an hour or so, and there's some interesting material in there.
I was hoping to participate in this chat, but Elsa, the latest addition to our family, arrived later than expected.
Folks probably haven't noticed because of the sporadic nature of my blog entries, but I have been out for two weeks on paternity leave. We now have another daughter called Elsa-Maude. She joins her three sisters and brother.
I see that whilst I've been out the team put out another release of DSL Tools. I'd flagged this some weeks ago. Jochen has the details. He indicates that we have reworked the text templating engine from previous releases. Unfortunately, we were not able to do it full justice in the documentation. We hope to put that right soon, but in the meantime I see that Gareth has posted more information in his blog.
Also, TechEd is currently running in the US and our team is represented. Pedro Silva is filing a daily report.
Meanwhile, time to catch up and get on with the next release...
Back in the days when I was an academic and researcher, I used to teach Software Engineering. There are many interpretations of this term, but the focus in my classes was on turning a set of vague requirements into a tangible, detailed spec from which you could reliably cut code. I didn't go much for teaching the text book stuff - waterfall versus iterative and all that - but rather encouraged students to try out techniques for themselves to see what works and doesn't.
Perhaps unsurprisingly given my background, I preached a modelling approach. We'd start out scripting and playing out scenarios (yes, we would actually role play the scenarios in class). I didn't go in for use case diagrams - never really understood, still don't, how a few ellipses, some stick men and arrows helped - but I guess the scenario scripts could be viewed as textual descriptions of use cases. We'd then turn these scripts into filmstrips. For the uninitiated, these are sequences of object diagrams (snapshots), illustrating how the state of the system being modelled changes as you run through the script. I learnt this technique when teaching Catalysis courses for Desmond D'Souza - indeed, my first assignment was to help Alan Wills, the co-author of the Catalysis book, and now my colleague, teach a week course somewhere in the Midlands. The technique is great, and I still swear by it as the way to start constructing an OO model. From the filmstrips, we'd develop an OO analysis model, essentially an OO model of the business processes. This was class diagrams, plus invariant constraints written in English or more formally, plus lists of actions, plus some pre/post specs of these actions. Then would come the job of turning this model into an OO design model for the (non-distributed, self-contained) program written in Java. And the scripts and accompanying filmstrips could be turned into tests and test data.
Well, that was the theory, anyway. In practice, only a very few students really got it end-to-end, though most picked up enough to still do well in the exam. Reflecting on it now, here are some of my observations:
I now find myself in a role in which most of my time is spent doing what I was trying to teach, though there are a couple of differences:
Here are my observations from the experience so far:
If I was back teaching again, I think I would focus much less on specific notations, and much more on the need to track scenarios through to detailed features, and have coherent specs of the details that communicate the decisions made. I'd also look forward to the prospect of greater automation and use of code generation and software factory techniques. If you've got the right domain specific languages, then models expressed in those languages can replace reams of English spec, and code generators sourced on those models can replace a lot of hand coding. However, they have to be languages matched to the problem, and I suspect that for most systems there's still going to be old fashioned spec work to do.
[edited soon after original post to correct some formatting issues]
I just came across this post about XMI from Steven Kelly over at Metacase. In particular, he quotes some figures about usage of the various XMI versions: he's conducted a web search for XMI files out there on the web. The first thing that struck me was how few there are, and secondly how very few there are (34) that use the latest version (2.0) released in 2003.
Steven also makes the observation that XMI is just an XML document containing sufficient information to describe models. Provided your tool stores its models in XML files and/or provides API access to allow you to create models using the API, it's no big deal to write your own importer in code or using XSLT. And if you want to import an XMI file into a domain specific tool, you'd have to do this in any case, because it is very likely your model will be full of stereotypes and tagged values which will need special interpretation that would not be provided by an off-the-shelf importer.
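As a rough illustration of how little code such an importer needs, here is a Python sketch using the standard xml.etree.ElementTree module. The input is a simplified, XMI-like document, not real XMI: the element names, attribute names and the 'table' tagged value are all invented, and the interpretation of the 'entity' stereotype is exactly the kind of domain specific rule an off-the-shelf importer would not know about.

```python
import xml.etree.ElementTree as ET

# Simplified, XMI-like input; element and attribute names are hypothetical.
DOCUMENT = """
<model>
  <class name="Order" stereotype="entity">
    <taggedValue tag="table" value="ORDERS"/>
  </class>
  <class name="OrderForm" stereotype="screen"/>
</model>
"""


def import_entities(text):
    """Keep only classes stereotyped 'entity' and read their 'table' tagged value."""
    entities = {}
    for cls in ET.fromstring(text).findall("class"):
        if cls.get("stereotype") != "entity":
            continue
        tags = {tv.get("tag"): tv.get("value") for tv in cls.findall("taggedValue")}
        # Domain specific default: fall back to the upper-cased class name.
        entities[cls.get("name")] = tags.get("table", cls.get("name").upper())
    return entities


entities = import_entities(DOCUMENT)
```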
In another post, Steven talks about XMI[DI] which also supports diagram interchange. Another fact: a 20-class diagram takes nearly 400KB. That was a bit of a surprise to me. But it's his observation that "Standards are great, but I think they work best when they arise from best practice based on good theory, not when some committee tries to create something without ever having done it in practice". This strikes a chord with me: it's what I've called premature standardization.
I see that the architect of the Guidance Automation Toolkit (GAT), Wojtek Kozaczynski, has started blogging. I worked with Wojtek closely for a couple of months at the inception of GAT (fun it was too), and have been continuing to work with him and his team to merge our text templating technologies - the next version of GAT will contain the new, merged engine, as will the next version of DSL Tools (should be available by the end of the month, and will work with VS2005 Beta2). So soon folks will be able to install both GAT and DSL Tools, and it will be very interesting to see how they get used in combination.
Anyway, his second post is a pocket history of how GAT came to be and makes for an interesting read.
My last post was about this toolkit, pointing at a webcast that gave a demo. But at that point there was no download. Well, now there is. Just visit the GAT workshop site.
I've just noticed that a webcast on the Guidance Automation Toolkit (GAT) is now available. This is some emerging technology that should soon be made available in a download. Harry Pierson has a nice description over on his blog.
GAT and DSL Tools are both key technologies for realising the software factories vision - they tackle different aspects of the problem. What GAT brings to the table is a notion of recipe and recipe spawning. In its simplest form, a recipe is a wizard that gathers information from the user, then does stuff in Visual Studio, based on that information and information in the environment, thereby automating one or more steps of the software development process. A typical example of 'stuff' would be to create a set of new items in the solution, perhaps further configure a project, perhaps add one or more new projects, and so on. All these things that are created would be based on templates, which get filled in by the information supplied in the wizard. But it's not restricted to creating stuff; you can also delete stuff, perform refactoring operations, whatever really, provided you can work out how to do it programmatically. A really neat feature of GAT is the notion of recipe spawning: one thing a recipe can do is create new recipes and attach them to items in the solution. This is crucial to automating guidance, where there are many steps to be performed and often repeated. With GAT, you automate the individual steps as recipes, then use recipe spawning to guide folks to the next steps that need to be performed, by spawning recipes which are revealed to you in the context of the items created (or which have been manipulated) by the recipe you've just applied. A spawned recipe can be a one-off action, which disappears when done, or can hang around to be repeated as many times as you like.
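Recipe spawning is easier to see in a toy model than in prose. The Python sketch below is purely conceptual and has nothing to do with the actual GAT APIs: the Recipe and Solution classes, the two example recipes and the item names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Recipe:
    name: str
    action: Callable       # action(solution, item, inputs) -> list of (item, Recipe)
    one_off: bool = True   # one-off recipes disappear once they have run


@dataclass
class Solution:
    items: list = field(default_factory=list)
    attached: dict = field(default_factory=dict)   # item -> recipes offered on it

    def attach(self, item, recipe):
        self.attached.setdefault(item, []).append(recipe)

    def run(self, item, recipe, **inputs):
        # Run the recipe, then attach whatever recipes it spawned.
        for target, spawned in recipe.action(self, item, inputs):
            self.attach(target, spawned)
        if recipe.one_off:
            self.attached[item].remove(recipe)


def add_operation(solution, item, inputs):
    solution.items.append(f"{item}/{inputs['operation']}.cs")
    return []   # spawns nothing


def create_service(solution, item, inputs):
    project = inputs["name"]
    solution.items.append(project)
    # Spawn a repeatable follow-on recipe on the project just created.
    return [(project, Recipe("Add operation", add_operation, one_off=False))]


solution = Solution()
start = Recipe("Create service", create_service)
solution.attach("solution", start)
solution.run("solution", start, name="Billing")

next_step = solution.attached["Billing"][0]
solution.run("Billing", next_step, operation="Charge")
solution.run("Billing", next_step, operation="Refund")
```

After the run, the one-off "Create service" recipe is gone from the solution node, while "Add operation" is still offered on the Billing project, which is the guidance effect described above.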
If you think of the DSL tools as a factory for building designers, then you can see how GAT and DSL Tools can work together. DSL Tools has a wizard for creating a solution in VS used to build a graphical designer. This is effectively a recipe. One of the things a recipe creates is a domain model, based on whatever language template was chosen when running the wizard. The domain model can be edited using a graphical designer (created solely for the purpose of editing domain models), then code generation templates (another key technology) are used to generate code for some aspects of the designer from the domain model. There's another DSL involved as well, the designer definition, from which other aspects of the code are generated. So here's a little factory involving (so far) one recipe and two DSLs.
[13 May 2005: Added this to the GAT category in my blog.]
As a footnote, I should also confess to some involvement with GAT. I spent a little time working with Wojtek and Tom at the inception of GAT, in particular on the notion of recipes and recipe spawning. It's great to see this work come to fruition, and it will be even better when the tools are available for download.
Update: Corrected the spelling of 'Harry Pierson'. Apologies Harry...
In case you haven't seen it, there's been some interesting discussion about n'ary and binary relationships over on the DSL Tools Forum.
A set of interesting questions was posted to the DSL Tools Newsgroup recently, so I've decided to reply to them here. The text from the newsgroup posting appears like this.
Hello -- I quickly ran through the walkthroughs and worked a little with the beta version of your tools, and they are neat. Having designed a modeling language and build a few models, one thing which I would like to do is 'execute' those models. I want to write a C# plugin for Visual Studio which uses an automatically-generated domain-specific API to query and perhaps modify the models programmatically. Based on what the plugin finds in the models, it can do some other useful work. Let's say I want to do some domain-specific analysis, where there isn't any existing analysis framework which correctly supports my domain. In that case, I might as well roll my own analysis framework as a plug-in which is integrated with VS's DSL tools. What I don't want to do is serialize the models to XML and have my independent tool read in the XML file, create an internal representation of the models in memory, and then do stuff. It's a waste of time. I want to integrate my analysis tool with VS and access my models...directly.
These are exactly the kinds of scenario we are envisaging. As I discussed in a past entry, creating models is not much use if it's difficult or impossible for other tools to consume them.
So, my hope is:
Will this be supported? If so, can you publish a walkthrough about this? The models are only worth so much if they're only good for xml/text/code generation --software isn't the only thing which needs to be modeled.
The models are held in memory (we call it the in-memory store). As well as giving access to CRUD operations, this supports transactional processing and event firing. We also generate domain specific APIs from domain models - indeed, you can see what these APIs look like if you look at e.g. XXXX.dmd.cs generated from the XXXX.dmd using the template XXXX.dmd.mdfomt in a designer solution. These APIs work against the generic framework, thus allowing both generic and domain specific access to model data. However, we still have some work to do to make all this easily available, including making some improvements to the generic APIs and doing some repackaging of code. The goal would be that you'd be able to use the dll generated from a domain model to load models into memory from XML files, access them through generic and domain specific APIs, and then save them back to XML files. We will also be overhauling the XML serialization, so that models will get stored in domain specific, customized XML - see Gareth's posting for some details around this.
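To give a feel for what "transactional processing and event firing" means for a tool author, here is a toy Python sketch. It is not the DSL Tools API, which is in C# and still changing: the Store class, its methods and the event tuples are all invented for the illustration.

```python
from contextlib import contextmanager


class Store:
    """Toy in-memory store: changes happen in transactions; events fire on commit."""

    def __init__(self):
        self.elements = {}
        self.listeners = []
        self._pending = None

    @contextmanager
    def transaction(self):
        snapshot = dict(self.elements)
        self._pending = []
        try:
            yield self
        except Exception:
            self.elements = snapshot   # roll back to the state before the transaction
            self._pending = None
            raise
        events, self._pending = self._pending, None
        for event in events:           # notify only after a successful commit
            for listener in self.listeners:
                listener(event)

    def add(self, name, **properties):
        if self._pending is None:
            raise RuntimeError("changes must be made inside a transaction")
        self.elements[name] = properties
        self._pending.append(("added", name))


log = []
store = Store()
store.listeners.append(log.append)

with store.transaction():
    store.add("Shape", kind="class")

try:
    with store.transaction():
        store.add("Broken", kind="class")
        raise ValueError("abort")
except ValueError:
    pass
```

An analysis plug-in of the kind described in the question would sit where the listener is: it sees committed changes only, never the half-finished state of an aborted transaction.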
As for VS plugins, these will be supported in some way, for example via the addition of custom menus to your designer, or by writing 'standalone' tools integrated into VS making use of the existing VS extensibility features.
On the issue of timing, the API work will happen over the next few months, the serialization work after that. We will continue to put out new preview releases as new features are introduced. Walkthroughs, other documentation and samples will be provided with the new features.
Now we've got the March release out of the door, I'm sure folks are going to ask soon what's in the next release and when to expect it.
First the 'when' bit. As Harry Pierson has already indicated, we expect the when to be shortly after VS2005 Beta2 is released, where 'shortly after' = a small number of weeks. At this point we'll be moving from VS2005 Beta1 to VS2005 Beta2.
Now the 'what'. We're focusing on two feature areas next release (at least that's the plan, usual disclaimers apply):
The above should mean that users will be far less restricted than they are at present in the kind of designer they can build.
We're also making an investment on quality in this cycle, ramping up the automated testing & fixing a whole swathe of bugs.
And after the next release?
Well, here are some of the features in the pipeline: richer notations, constraints and validation, a proper treatment of serialization in XML (see this entry from Gareth), better hooks for code customization of generated designers, deployment of designers to other machines, multiple diagrams viewing a model, better hooks for writing your own tools to consume model data (as explained in this post), ...
No doubt lots of my colleagues will point you at this, including Steve himself.
But here is a great interview with Steve Cook, giving lots of detailed answers to questions about software factories, DSLs, MDA and UML.
In his announcement of the March release of DSL Tools, Gareth mentioned that we now have a designer definition (DD) file validator. This validates the DD file for anything that is not caught by XSD validation, including whether the cross references to the domain model are correct. It also validates those aspects of the domain model which impact the mapping of the designer definition to the domain model. For example, it will check that the XML Root class is mapped to the diagram defined in the DD file. Errors and warnings appear in the Visual Studio errors window whenever you try to generate code from the DD file (i.e. any of the code generators in the Designer project) and disappear the next time you try if the error has been fixed.
You may not have realized this, but the domain model designer also includes some validation. It gets invoked whenever you try to save the file, or you can invoke it from the ValidateAll context menu. Try, for example, giving two classes the same name and then invoking ValidateAll.
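The two kinds of check mentioned above (duplicate class names in the domain model, and cross references from the designer definition that don't resolve) can be sketched in a few lines of Python. This is only a conceptual sketch, not our validation framework: the dictionary layout, the "shape_maps" key and the message wording are all assumptions.

```python
from collections import Counter


def validate(domain_model, designer_definition):
    """Return a list of (severity, message) pairs; an empty list means valid."""
    problems = []
    names = Counter(domain_model["classes"])
    # Check 1: class names in the domain model must be unique.
    for name, count in names.items():
        if count > 1:
            problems.append(("error", f"duplicate class name '{name}'"))
    # Check 2: every cross reference from the designer definition must resolve.
    for shape, cls in designer_definition["shape_maps"].items():
        if cls not in names:
            problems.append(("error", f"shape '{shape}' maps to unknown class '{cls}'"))
    return problems


model = {"classes": ["Task", "Task", "Flow"]}
definition = {"shape_maps": {"TaskShape": "Task", "NoteShape": "Comment"}}
problems = validate(model, definition)
```

Returning a list of problems rather than raising on the first one matches how the errors window behaves: everything wrong is reported in one pass.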
As Gareth indicated, this validation is implemented on top of a validation framework that we will be leveraging to allow users to include validation as part of their own designers, or to call it directly through the API, from within a text generation template, for example, as a precondition to code generation. We'd be interested to hear from users about what they would do with such features, whether they think this to be an important set of features (the customers we have talked with so far do), what kind of authoring experience they would expect or want, and any suggestions for other features in this general area (for example, how else would you like validation to be exposed through the UI of a designer). You can provide this feedback as comments to this posting, as comments to Gareth's post, or through the DSL Tools newsgroup, or as suggestions through the feedback center.
I've been pondering what the fundamental problems are that we and others are trying to solve with DSLs, Software Factories, Model Driven Software Development, and the like. I've distilled it down to two key problems:
DSLs help with the first of these because they let you codify information that would otherwise be scattered and repeated in many development artefacts. The idea is that to change that information you change it in a single domain specific viewpoint or model, and the changes are propagated to all the artefacts that would otherwise need to be changed by hand. Of course the interesting problem here is how the propagation is performed, and one common approach is to propagate by regenerating the development artefacts by merging the information in the domain specific model with boilerplate. This works best if you can separate out generated aspects of artefacts from hand written aspects, for example by using C# partial classes. In this way you avoid the task of copying boilerplate code and making changes in designated places, and when things change you avoid multiple manual updates.
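Here is a small Python sketch of regenerating from a model while keeping hand-written code separate. Python has no partial classes, so the sketch swaps in a different technique, a generated base class with a hand-written subclass; the model layout, the template and the class names are all invented for the example.

```python
from string import Template

# Boilerplate to be merged with the model; only this part is ever regenerated.
BOILERPLATE = Template(
    "class ${name}Base:\n"
    "    FIELDS = $fields\n"
    "\n"
    "    def __init__(self, **values):\n"
    "        for field in self.FIELDS:\n"
    "            setattr(self, field, values.get(field))\n"
)


def regenerate(model):
    """Merge the model with boilerplate; safe to rerun whenever the model changes."""
    return BOILERPLATE.substitute(name=model["name"], fields=repr(tuple(model["fields"])))


# Load the generated source (in practice it would be written to its own file).
namespace = {}
exec(regenerate({"name": "Customer", "fields": ["id", "email"]}), namespace)


# Hand-written part: kept apart from the generated code and never overwritten.
class Customer(namespace["CustomerBase"]):
    def display_name(self):
        return f"{self.email} (#{self.id})"


customer = Customer(id=7, email="a@example.com")
```

Adding a field to the model and rerunning regenerate changes the base class only; display_name is untouched, which is the point of the separation.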
If it is not possible to cleanly separate the generated aspects from the hand written ones then more sophisticated synchronization techniques will be required, but I'm not going to go into that now.
And once you start thinking in this way, you then discover you can have multiple domain specific viewpoints contributing different aspects to your development artefacts. And then you discover that you can relate these viewpoints, synchronizing information between them and generating one from another. You're treading a path towards software factories.
Domain specific models created to help solve the first problem also tend to be more abstract and provide new perspectives on the system. They hide detail and can reveal connections that are difficult to find by looking directly at the development artefacts, especially when those models are visualized through diagrams. This contributes to the second problem: they provide viewpoints on the system which it is often easier to connect to business requirements. One can then go a step further, and build new viewpoints specifically focused on expressing and communicating the business requirements, and set up connections between those viewpoints and viewpoints of the system which can be monitored and synchronized as one or the other changes.
We see customers already leveraging such techniques in their development processes, codifying their DSLs using XML or UML + stereotypes & tagged values, for example. They also tell us they are having problems with these technologies, and it is those problems that we're trying to address with DSL tools. I'll go into more depth on this, and reveal more of what we're planning to help solve these problems, in future posts.
As posted by Gareth.