Stuart Kent - Software Modeling and Visualization
Jack Greenfield asked me to mention that the organizing committee have extended the deadline for submission of position papers to the software factories workshop being held at OOPSLA05. 26th August is the new deadline.
Unfortunately I won't be at OOPSLA this year, as I'm general chair for the MoDELS conference in Jamaica, and I don't want to take any more time away from getting our V1 of DSL Tools out the door. There are some good workshops there too, including a workshop on model transformation for which I'm on the programme committee. Better get your skates on though - the submission deadline for that is August 15th, next Monday!
Also the DSL Tools team is looking for new graduates to be developers in Cambridge, UK. Contact Steve Cook if you are interested.
Edward Bakker has been blogging his experience of using DSL Tools: http://www.edwardbakker.nl/
This is great feedback for us. Edward, rest assured that we are fixing the "keeping dd in synch with dmd" problem for the V1 release, as one of the many things that we'll be doing.
Indeed, the reason I've been a little quiet on my blog recently is because we have been engaged in an intense period of planning, nailing down the scenarios and feature set that we'll be targeting for V1. I hope to be able to post a roadmap to V1, with some details of those features and scenarios fairly soon after I get back from vacation - should be early September.
I'm involved in a workshop on model to model transformations at the MoDELS conference this year.
The call for papers is at http://sosym.dcs.kcl.ac.uk/events/mtip/
An interesting feature of this workshop is that they're asking all participants to apply their favourite transformation techniques to a common mapping problem, so the workshop can more easily contrast and compare approaches. The results should be interesting.
For all those who've installed the May 2005 version of DSL Tools (the one that works on VS2005 Beta 2), our friends over at ModeliSoft have just updated their tool - the one that maintains a designer definition as the domain model changes - to work with the new version. The announcement is here. Have fun.
[Edit - for those of you who can't see the post on the forum, the url is http://www.modelisoft.com/Dmd2dd.aspx]
Martin has just put up an article on Language Workbenches - IDEs for creating and using DSLs.
As you'd expect from Martin, this is an insightful piece, with enough follow-on links to keep you interested and busy for days.
One of the links is to a second article on code generation. Here Martin explains how to write a code generator for a DSL. The first point that comes out is the important distinction between concrete and abstract syntax. This distinction allows a language to have a number of concrete views, which map to the same abstract structure, which code generators then take as input. This saves having to rewrite the code generator every time you add a new concrete view to the language. In our own DSL Tools, we are emphasizing graphical and XML concrete syntaxes for languages. We also generate an API from a language definition which allows direct access to the abstract data structures in memory for purposes such as code generation (all the file handling is done for you).
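The concrete/abstract split can be made tangible with a small sketch - illustrative Python only, not the DSL Tools API - in which two different concrete syntaxes parse to the same abstract structure, and a single generator consumes only that structure:

```python
from dataclasses import dataclass
from xml.etree import ElementTree

# Abstract syntax: the structure code generators consume.
@dataclass
class StateMachine:
    name: str
    states: list

# One concrete syntax: a small XML dialect.
def parse_xml(text):
    root = ElementTree.fromstring(text)
    return StateMachine(root.get("name"),
                        [s.get("name") for s in root.findall("state")])

# Another concrete syntax: a terse line-oriented notation.
def parse_text(text):
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return StateMachine(lines[0], lines[1:])

# The generator sees only the abstract structure, so it works
# unchanged whichever concrete syntax the model came in through.
def generate(model):
    consts = "\n".join(f"    {s.upper()} = '{s}'" for s in model.states)
    return f"class {model.name}States:\n{consts}\n"

xml_model = parse_xml(
    '<machine name="Door"><state name="open"/><state name="closed"/></machine>')
txt_model = parse_text("Door\nopen\nclosed")
assert generate(xml_model) == generate(txt_model)
```

Adding a third concrete view to this toy language would mean writing one more parser; the generator is untouched, which is exactly the economy the abstract syntax buys you.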
Martin continues, in the article, to talk about code generation itself. The first approach he demonstrates is not really a generator at all, but rather an interpreter, written in plain code and making use of reflection. The second approach uses text templates. In generating code for designers from definitions of DSLs, we have found text templates to be our preferred method for writing code generators. We wrote our own text templating engine, which is included as part of DSL Tools. We have taken great care to architect the engine so that it can be integrated into different contexts, which means that it can be hosted in different environments (e.g. inside Visual Studio or not) and can accept inputs from multiple sources. For DSLs, we've built a Visual Studio host and the extensions that allow direct access within templates to models in memory through the generated APIs mentioned above. My colleague Gareth Jones has blogged about the engine, and its use in a DSL Tools context is illustrated in the walkthroughs that are part of the DSL Tools download. We're actively working on more complete documentation for the engine itself, including the APIs.
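The basic idea of template-based generation fits in a few lines. This is just an illustrative Python analogue using `string.Template` - not our templating engine, whose syntax and hosting model are different - showing boilerplate text with placeholders filled from a model:

```python
from string import Template

# Boilerplate with placeholders; everything outside the
# $-placeholders is emitted verbatim.
template = Template(
    "public class $name\n"
    "{\n"
    "    // generated from the $dsl model; do not edit by hand\n"
    "}\n")

# Merge the model (here just a dict) with the boilerplate,
# once per class in the model.
def generate(model):
    return "\n".join(template.substitute(name=c, dsl=model["dsl"])
                     for c in model["classes"])

print(generate({"dsl": "OrderFlow", "classes": ["Customer", "Order"]}))
```

A real engine adds control flow, expressions, and host integration on top of this substitution core, but the generator/interpreter distinction Martin draws is already visible: the template is data merged with a model, not code that walks the model directly.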
Aspects that Martin did not touch on in his article include orchestrating the generation of multiple files from multiple sources, integration with source control (though it is a moot point whether generated files should be checked into source control or not), and how to handle cases where 100% code generation is not feasible - particularly tricky are the cases where code in the same file has to be further added to: skeleton code is generated, but the programmer has to fill in method bodies, for example. We don't have answers to all of these yet, but they're on the roadmap.
I see a transcript of the DSL Tools web chat is now available. Members of the DSL Tools team answered questions from customers for an hour or so, and there's some interesting material in there.
I was hoping to participate in this chat, but Elsa, the latest addition to our family, arrived later than expected.
Folks probably haven't noticed because of the sporadic nature of my blog entries, but I have been out for two weeks on paternity leave. We now have another daughter called Elsa-Maude. She joins her three sisters and brother.
I see that whilst I've been out the team put out another release of DSL Tools. I'd flagged this some weeks ago. Jochen has the details. He indicates that we have reworked the text templating engine from previous releases. Unfortunately, we were not able to do it full justice in the documentation. We hope to put that right soon, but in the meantime I see that Gareth has posted more information in his blog.
Also, TechEd is currently running in the US and our team is represented. Pedro Silva is filing a daily report.
Meanwhile, time to catch up and get on with the next release...
Back in the days when I was an academic and researcher, I used to teach Software Engineering. There are many interpretations of this term, but the focus in my classes was on turning a set of vague requirements into a tangible, detailed spec from which you could reliably cut code. I didn't go much for teaching the text book stuff - waterfall versus iterative and all that - but rather encouraged students to try out techniques for themselves to see what works and doesn't.
Perhaps unsurprisingly given my background, I preached a modelling approach. We'd start out scripting and playing out scenarios (yes, we would actually role play the scenarios in class). I didn't go in for use case diagrams - never really understood, still don't, how a few ellipses, some stick men and arrows helped - but I guess the scenario scripts could be viewed as textual descriptions of use cases. We'd then turn these scripts into filmstrips. For the uninitiated, these are sequences of object diagrams (snapshots), illustrating how the state of the system being modelled changes as you run through the script. I learnt this technique when teaching Catalysis courses for Desmond D'Souza - indeed, my first assignment was to help Alan Wills, the co-author of the Catalysis book, and now my colleague, teach a week-long course somewhere in the Midlands. The technique is great, and I still swear by it as the way to start constructing an OO model. From the filmstrips, we'd develop an OO analysis model, essentially an OO model of the business processes. This consisted of class diagrams, plus invariant constraints written in English or more formally, plus lists of actions, plus some pre/post specs of these actions. Then would come the job of turning this model into an OO design model for the (non-distributed, self-contained) program written in Java. And the scripts and accompanying filmstrips could be turned into tests and test data.
Well, that was the theory, anyway. In practice, only a very few students really got it end-to-end, though most picked up enough to still do well in the exam. Reflecting on it now, here are some of my observations:
I now find myself in a role in which most of my time is spent doing what I was trying to teach, though there are a couple of differences:
Here are my observations from the experience so far:
If I was back teaching again, I think I would focus much less on specific notations, and much more on the need to track scenarios through to detailed features, and have coherent specs of the details that communicate the decisions made. I'd also look forward to the prospect of greater automation and use of code generation and software factory techniques. If you've got the right domain specific languages, then models expressed in those languages can replace reams of English spec, and code generators sourced on those models can replace a lot of hand coding. However, they have to be languages matched to the problem, and I suspect that for most systems there's still going to be old fashioned spec work to do.
[edited soon after original post to correct some formatting issues]
I just came across this post about XMI from Steven Kelly over at Metacase. In particular, he quotes some figures about usage of the various XMI versions, based on a web search for XMI files out there on the web. The first thing that struck me was how few there are; the second, how very few (34) use the latest version (2.0), released in 2003.
Steven also makes the observation that XMI is just an XML document containing sufficient information to describe models. Provided your tool stores its models in XML files and/or provides an API for creating models, it's no big deal to write your own importer in code or using XSLT. And if you want to import an XMI file into a domain specific tool, you'd have to do this in any case, because it is very likely your model will be full of stereotypes and tagged values which need special interpretation that an off-the-shelf importer would not provide.
In another post, Steven talks about XMI[DI], which also supports diagram interchange. Another fact: a 20-class diagram takes nearly 400KB. That was a bit of a surprise to me. But what really strikes a chord with me is his observation that "Standards are great, but I think they work best when they arise from best practice based on good theory, not when some committee tries to create something without ever having done it in practice" - it's what I've called premature standardization.
I see that the architect of the Guidance Automation Toolkit (GAT), Wojtek Kozaczynski, has started blogging. I worked with Wojtek closely for a couple of months at the inception of GAT (fun it was too), and have been continuing to work with him and his team to merge our text templating technologies - the next version of GAT will contain the new, merged engine, as will the next version of DSL Tools (should be available by the end of the month, and will work with VS2005 Beta2). So soon folks will be able to install both GAT and DSL Tools, and it will be very interesting to see how they get used in combination.
Anyway, his second post is a pocket history of how GAT came to be and makes for an interesting read.
My last post was about this toolkit, pointing at a webcast that gave a demo. But at that point there was no download. Well, now there is. Just visit the GAT workshop site.
I've just noticed that a webcast on the Guidance Automation Toolkit (GAT) is now available. This is some emerging technology that should soon be made available in a download. Harry Pierson has a nice description over on his blog.
GAT and DSL Tools are both key technologies for realising the software factories vision - they tackle different aspects of the problem. What GAT brings to the table is a notion of recipe and recipe spawning. In its simplest form, a recipe is a wizard that gathers information from the user, then does stuff in Visual Studio, based on that information and information in the environment, thereby automating one or more steps of the software development process. A typical example of 'stuff' would be to create a set of new items in the solution, perhaps further configure a project, perhaps add one or more new projects, and so on. All these things that are created would be based on templates, which get filled in by the information supplied in the wizard. But it's not restricted to creating stuff; you can also delete stuff, perform refactoring operations, whatever really, provided you can work out how to do it programmatically. A really neat feature of GAT is the notion of recipe spawning: one thing a recipe can do is create new recipes and attach them to items in the solution. This is crucial to automating guidance, where there are many steps to be performed, often repeatedly. With GAT, you automate the individual steps as recipes, then use recipe spawning to guide folks to the next steps that need to be performed, by spawning recipes which are revealed in the context of the items created (or manipulated) by the recipe you've just applied. A spawned recipe can be a one-off action, which disappears when done, or can hang around to be repeated as many times as you like.
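As a very rough sketch of the recipe idea - Python with invented names, since the real GAT is a .NET framework hosted inside Visual Studio - a recipe takes the information the wizard gathered, performs its actions, and records follow-up recipes to spawn against the items it created:

```python
import pathlib
import tempfile

# Hypothetical recipe: the arguments stand in for what the wizard
# pages would gather; the action here is just writing a skeleton file.
def add_service_recipe(solution_dir, service_name):
    path = pathlib.Path(solution_dir) / f"{service_name}Service.cs"
    path.write_text(f"// skeleton for {service_name}, created by recipe\n")
    # Recipe spawning: attach a follow-up recipe to the new item,
    # guiding the user to the next step (modelled here as plain data).
    return {"created": str(path),
            "spawned": [("AddOperationRecipe", str(path))]}

with tempfile.TemporaryDirectory() as d:
    result = add_service_recipe(d, "Billing")
    assert result["created"].endswith("BillingService.cs")
    assert result["spawned"][0][0] == "AddOperationRecipe"
```

The point of the sketch is the return value: each recipe leaves behind not just artefacts but the next recipes to run against them, which is how multi-step guidance gets chained together.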
If you think of the DSL tools as a factory for building designers, then you can see how GAT and DSL Tools can work together. DSL Tools has a wizard for creating a solution in VS used to build a graphical designer. This is effectively a recipe. One of the things a recipe creates is a domain model, based on whatever language template was chosen when running the wizard. The domain model can be edited using a graphical designer (created solely for the purpose of editing domain models), then code generation templates (another key technology) are used to generate code for some aspects of the designer from the domain model. There's another DSL involved as well, the designer definition, from which other aspects of the code are generated. So here's a little factory involving (so far) one recipe and two DSLs.
[13 May 2005: Added this to the GAT category in my blog.]
As a footnote, I should also confess to some involvement with GAT. I spent a little time working with Wojtek and Tom at the inception of GAT, in particular on the notion of recipes and recipe spawning. It's great to see this work come to fruition, and it will be even better when the tools are available for download.
Update: Corrected the spelling of 'Harry Pierson'. Apologies Harry...
In case you haven't seen it, there's been some interesting discussion about n-ary and binary relationships over on the DSL Tools Forum.
A set of interesting questions were posted to the DSL Tools Newsgroup recently, so I've decided to reply to them here. The text from the newsgroup posting appears like this.
Hello -- I quickly ran through the walkthroughs and worked a little with the beta version of your tools, and they are neat. Having designed a modeling language and built a few models, one thing which I would like to do is 'execute' those models. I want to write a C# plugin for Visual Studio which uses an automatically-generated domain-specific API to query and perhaps modify the models programmatically. Based on what the plugin finds in the models, it can do some other useful work. Let's say I want to do some domain-specific analysis, where there isn't any existing analysis framework which correctly supports my domain. In that case, I might as well roll my own analysis framework as a plug-in which is integrated with VS's DSL tools. What I don't want to do is serialize the models to XML and have my independent tool read in the XML file, create an internal representation of the models in memory, and then do stuff. It's a waste of time. I want to integrate my analysis tool with VS and access my models...directly.
These are exactly the kinds of scenario we are envisaging. As I discussed in a past entry, creating models is not much use if it's difficult or impossible for other tools to consume them.
So, my hope is:
Will this be supported? If so, can you publish a walkthrough about this? The models are only worth so much if they're only good for xml/text/code generation --software isn't the only thing which needs to be modeled.
The models are held in memory (we call it the in-memory store). As well as giving access to CRUD operations, this supports transactional processing and event firing. We also generate domain specific APIs from domain models - indeed, you can see what these APIs look like if you look at e.g. XXXX.dmd.cs generated from the XXXX.dmd using the template XXXX.dmd.mdfomt in a designer solution. These APIs work against the generic framework, thus allowing both generic and domain specific access to model data. However, we still have some work to do to make all this easily available, including making some improvements to the generic APIs and doing some repackaging of code. The goal would be that you'd be able to use the dll generated from a domain model to load models into memory from XML files, access them through generic and domain specific APIs, and then save them back to XML files. We will also be overhauling the XML serialization, so that models will get stored in domain specific, customized XML - see Gareth's posting for some details around this.
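To give a feel for the shape of this - a toy Python sketch, nothing to do with the actual .NET implementation - here is a generic store supporting CRUD, transactions and events, with a 'generated' domain specific wrapper layered on top of it:

```python
import copy

# Toy generic in-memory store: CRUD plus transactions and change events.
class Store:
    def __init__(self):
        self.elements = {}
        self.listeners = []
        self._snapshot = None

    def begin(self):                  # start a transaction
        self._snapshot = copy.deepcopy(self.elements)

    def commit(self):                 # keep the changes
        self._snapshot = None

    def rollback(self):               # discard changes since begin()
        self.elements = self._snapshot
        self._snapshot = None

    def add(self, elem_id, props):    # CRUD 'create', firing events
        self.elements[elem_id] = props
        for fire in self.listeners:
            fire("added", elem_id)

# Stand-in for a generated, domain specific API over the generic store.
class LibraryModel:
    def __init__(self, store):
        self.store = store
    def add_book(self, title):
        self.store.add(title, {"kind": "Book", "title": title})

events = []
store = Store()
store.listeners.append(lambda kind, elem_id: events.append((kind, elem_id)))
store.begin()
LibraryModel(store).add_book("Catalysis")
store.rollback()
# The event fired during the transaction, but the rollback undid the change.
assert events == [("added", "Catalysis")] and store.elements == {}
```

The two-layer structure is the point: tools can work generically against the store, or through the typed, domain specific wrapper generated from the domain model.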
As for VS plugins, these will be supported in some way, for example via the addition of custom menus to your designer, or by writing 'standalone' tools integrated into VS making use of the existing VS extensibility features.
On the issue of timing, the API work will happen over the next few months, the serialization work after that. We will continue to put out new preview releases as new features are introduced. Walkthroughs, other documentation and samples will be provided with the new features.
Now we've got the March release out of the door, I'm sure folks are going to ask soon what's in the next release and when to expect it.
First the 'when' bit. As Harry Pierson has already indicated, we expect the when to be shortly after VS2005 Beta2 is released, where 'shortly after' = a small number of weeks. At this point we'll be moving from VS2005 Beta1 to VS2005 Beta2.
Now the 'what'. We're focusing on two feature areas next release (at least that's the plan, usual disclaimers apply):
The above should mean that users will be far less restricted than they are at present in the kind of designer they can build.
We're also making an investment on quality in this cycle, ramping up the automated testing & fixing a whole swathe of bugs.
And after the next release?
Well, here are some of the features in the pipeline: richer notations, constraints and validation, a proper treatment of serialization in XML (see this entry from Gareth), better hooks for code customization of generated designers, deployment of designers to other machines, multiple diagrams viewing a model, better hooks for writing your own tools to consume model data (as explained in this post), ...
No doubt lots of my colleagues will point you at this, including Steve himself.
But here is a great interview with Steve Cook, giving lots of detailed answers to questions about software factories, DSLs, MDA and UML.
In his announcement of the March release of DSL Tools, Gareth mentioned that we now have a designer definition (DD) file validator. This validates the DD file for anything that is not caught by XSD validation, including whether the cross references to the domain model are correct. It also validates those aspects of the domain model which impact the mapping of the designer definition to the domain model. For example, it will check that the XML Root class is mapped to the diagram defined in the DD file. Errors and warnings appear in the Visual Studio errors window whenever you try to generate code from the DD file (i.e. run any of the code generators in the Designer project), and disappear the next time you try, once the error has been fixed.
You may not have realized this, but the domain model designer also includes some validation. It gets invoked whenever you try to save the file, or you can invoke it from the ValidateAll context menu. Try, for example, giving two classes the same name and then invoking ValidateAll.
As Gareth indicated, this validation is implemented on top of a validation framework, which we will be leveraging to allow users to include validation as part of their own designers, or to call it directly through the API - from within a text generation template, for example, as a precondition to code generation. We'd be interested to hear from users about what they would do with such features, whether they think this is an important set of features (customers we have talked with so far do), what kind of authoring experience they would expect or want, and any suggestions for other features in this general area (for example, how else would you like validation to be exposed through the UI of a designer?). You can provide this feedback as comments to this posting, as comments to Gareth's post, through the DSL Tools newsgroup, or as suggestions through the feedback center.
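To give a flavour of what a rule-based validation framework looks like - purely an illustrative Python sketch, not our API - rules are registered against the model and run together by a ValidateAll-style entry point, yielding the errors and warnings that a designer would surface in the errors window:

```python
from collections import Counter

rules = []

# Registration: a decorator collects validation rules.
def rule(fn):
    rules.append(fn)
    return fn

# Example rule, mirroring the duplicate-name check mentioned above.
@rule
def unique_names(model):
    counts = Counter(c["name"] for c in model["classes"])
    return [f"error: duplicate class name '{n}'"
            for n, c in counts.items() if c > 1]

# The ValidateAll entry point: run every registered rule, gather results.
def validate_all(model):
    problems = []
    for r in rules:
        problems.extend(r(model))
    return problems

model = {"classes": [{"name": "Customer"}, {"name": "Customer"}]}
assert validate_all(model) == ["error: duplicate class name 'Customer'"]
```

The same entry point can then be hooked up to save, to a context menu, or called from a text template as a precondition to generation, which is the sort of reuse the framework is meant to enable.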
I've been pondering what the fundamental problems are that we and others are trying to solve with DSLs, Software Factories, Model Driven Software Development, and the like. I've distilled it down to two key problems:
DSLs help with the first of these because they let you codify information that would otherwise be scattered and repeated in many development artefacts. The idea is that to change that information you change it in a single domain specific viewpoint or model, and the changes are propagated to all the artefacts that would otherwise need to be changed by hand. Of course the interesting problem here is how the propagation is performed, and one common approach is to propagate by regenerating the development artefacts by merging the information in the domain specific model with boilerplate. This works best if you can separate out generated aspects of artefacts from hand written aspects, for example by using C# partial classes. In this way you avoid the task of copying boilerplate code and making changes in designated places, and when things change you avoid multiple manual updates.
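The generated/hand-written separation can be sketched simply. C# partial classes split one class across a generated file and a hand-written file; this Python sketch approximates the same discipline with a regenerated base class plus a hand-written subclass:

```python
# --- "generated" half: rewritten wholesale whenever the model changes ---
class CustomerBase:
    def __init__(self, name):
        self.name = name  # this property came from the domain model

# --- hand-written half: never touched by the generator ---
class Customer(CustomerBase):
    def greeting(self):
        return f"Hello, {self.name}"

assert Customer("Ada").greeting() == "Hello, Ada"
```

Because the generator owns one file outright, regeneration is a blind overwrite with no merging; the hand-written code survives untouched, which is exactly why clean separation makes the propagation problem tractable.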
If it is not possible to cleanly separate the generated aspects from the hand written ones then more sophisticated synchronization techniques will be required, but I'm not going to go into that now.
And once you start thinking in this way, you then discover you can have multiple domain specific viewpoints contributing different aspects to your development artefacts. And then you discover that you can relate these viewpoints, synchronizing information between them and generating one from another. You're treading a path towards software factories.
Domain specific models created to help solve the first problem, also tend to be more abstract and provide new perspectives on the system. They hide detail and can reveal connections that it is difficult to find by looking directly at the development artefacts, especially when those models are visualized through diagrams. This contributes to the second problem: they provide viewpoints on the system which it is often easier to connect to business requirements. One can then go a step further, and build new viewpoints specifically focused on expressing and communicating the business requirements, and set up connections between those viewpoints and viewpoints of the system which can be monitored and synchronized as one or other change.
We see customers already leveraging such techniques in their development processes, codifying their DSLs using XML or UML + stereotypes & tagged values, for example. They also tell us they are having problems with these technologies, and it is those problems that we're trying to address with DSL tools. I'll go into more depth on this, and reveal more of what we're planning to help solve these problems, in future posts.
As posted by Gareth.
I see that Ramesh on the class designer team has posted a note about collection associations. The basic idea is that when visualising code one can choose to elide the collection class, so, for example, you'll see an association from Customer to Order instead of an association from Customer to IList.
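In code terms - a Python sketch of the C# situation - the collection class is an implementation detail that the diagram is better off hiding:

```python
from dataclasses import dataclass, field

@dataclass
class Order:
    total: float

@dataclass
class Customer:
    name: str
    # In the C# code this would be an IList<Order>; the elided diagram
    # draws a 0..* association straight from Customer to Order instead
    # of an association from Customer to the list type.
    orders: list = field(default_factory=list)

c = Customer("Ada")
c.orders.append(Order(9.99))
assert len(c.orders) == 1
```

The elision keeps the diagram faithful to the code while still saying what matters at the design level: a Customer has many Orders.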
This may seem a small matter, but when I used to teach OO design and programming, any first sketch of the design as a UML class diagram would almost always elide the collection class. So it used to really annoy me that when it came to writing out the code, the class diagrams which we produced in TogetherJ could not maintain this elision - not if you wanted to keep diagram and code in sync. This made the diagrams far less useful for communicating a design than they could have been. (There are many great features that the Together tool offered, but its treatment of associations always used to bug me.)
So, hats off to the CD team for getting this aspect just right.
Here's a lot of detailed questions from Kimberly Marcantonio, an active participant in the DSL Tools Newsgroup. I thought it would be more useful to the community to publicise the answers on my blog. Indeed, expect me to do this more, when the answers are likely to be of interest to those following the DSL Tools. The questions also touch on issues concerning the direction we're taking with this technology. I've tried to be open in my responses, without making firm commitments. I hope soon to be more precise about our plans for new features and their roll out.
In what follows, Kimberly's text is in blue, and my responses are in red italic...
I am currently trying to model the Corba Component Model using this DSL package and have run into the following problems/questions:
We've had the following question posted to the newsgroup on the DSL Tools site:
"Can I create a template of my own... right now? Two templates are available but my model doesn't fit into any. Can I use the 'Blank Template' and delete the existing model and create my own model. Is this possible right now? If I do this what all will I need to change wrt to the .dd file and resource files."
The basic model here is to design your DSL - or, more specifically, the tools to support that DSL - based on an existing template. In the December CTP we only provide two templates: a blank one, with virtually nothing in it; and a template which started life as a simple architecture chart language and has evolved to include one of everything that is currently supported in the DD file. Over time we'll be updating and expanding this set.
Using these templates you can build your own language. As the questioner suggests, you do this by choosing one or other of the templates in the wizard, and then updating the domain model (deleting stuff, renaming stuff, adding stuff), and then updating the dd file and the resources files to match. The end to end walkthrough takes you through this process for the construction of a UIP Chart language. So, in short, you can go create a designer now for whatever language you like, with the big proviso that it fits within the realms of what the dd file currently supports.
And there's the rub. Updating the .dd file is not as easy as it should be, and what is currently supported is rather limiting (e.g. see the known issues list). We will shortly be releasing (within the next couple of weeks) another preview which will fix some key bugs and provide a dd file validator which should make it easier to work with dd files. Our plan after that is to release a preview that will work atop the Beta2 release of Visual Studio 2005, which will be more robust, and will relax some of the limitations imposed by the dd file. At that time, we should also be in a position to be more precise about our plans until the end of the year.
Now back to the first question: "Can I create a template of my own... right now?". The short answer is no. Well, at least we've provided no documentation and to do it manually can be a painstaking job. We have an internal tool that automates the process of creating the wizard templates from samples, but that would take some work to make available to customers. We are also not fixed on the current format for templates. I'd be very interested to hear of scenarios where customers would find the ability to create their own templates useful or essential. Who would the templates be for - yourself? Someone else? What kinds of language, or aspects of a language, would you want to bake into a template?
Fred Thwaites has asked a question in response to this post. He asks:
"Does this imply that in general users of DSL in VS2005 will need to be aware of this metamodel, or will VS2005 come with a comprehensive set of predefined DSL's filling all the cells of Jack Greenfield's software factory schema grid."
"Secondly, how fragmented do you feel DSL's will become? Do you expect that in most cases the VS supplied DSL will suffice, or do you see companies, departments and even projects regularly creating or extending DSL's."
In answer to the first question, those who build DSL tools using our technology will be aware of the metamodel - or domain model, as we call it - which is a key part of defining such tools. Users of the tools so built will not need to be aware of this model, although, of course, they will need to understand the concepts in the domain which it encodes. However, we do not expect most tool-building users to start from scratch every time; instead we (and possibly others) will provide 'vanilla' (and not so vanilla) languages in the form of templates to get started. If you try out the December release of the DSL tools, or even just read through the walkthroughs, you will see that the process of building a designer begins by running a wizard, in which you can select a language template on which to base your own DSL. The set of templates is limited at the moment, but we have plans to expand it.
In answer to the second question, a key motivation for the DSLs approach is that as you get serious about automating aspects of your software development processes, you find that the 'off-the-shelf' languages/tools either don't quite fit your needs, or fail completely to address your domain. So I fully expect companies, possibly departments, and perhaps even projects to create, customize and extend DSLs, although in many cases they'll do so from an existing template - it won't be necessary to start from scratch very often.
It is also worth noting that DSLs will evolve as requirements for automation evolve - for example, as the scope of code generation grows. The process might go something like this: I have a DSL for describing executable business processes, and I'm generating a fair chunk of my on-line systems from these models. This DSL has encoded some information very specific to my organization, for example the particular set of roles that my users can play and the particular set of channels through which they can interact with the business process. As these don't change very frequently, it's easier just to make a list of them as part of the definition of the DSL (it simplifies the use of the DSL, simplifies the code generated, etc.), rather than extend the DSL to allow new ones to be defined. If they need to be changed later, then the DSL can be updated, and a new version of the tools released (with appropriate migration tools as well). I then observe that by extending that DSL, or complementing it with another to describe security aspects, say (noting that the description of security parameters will need to reference elements of the executable business process model), I can then extend the reach of my code generators to put in the necessary plumbing to handle security.