Stuart Kent - Building developer tools at Microsoft - @sjhkent
I promised a while ago to publish a roadmap for what we're doing with DSL Tools, post VS2008. Now that VS2008 and the VS2008 SDK have just shipped (thanks Gareth for providing a post which points at both announecements) now seems a good time honour that promise.
There are two main themes in how we expect DSL Tools to develop in the future:
I mentioned the second theme soon after the DSL Tools team joined the Visual Studio platform team. That is still in our sights and will be part of our roadmap for the VS SDK as a whole, which we'll publicize when it's ready. For the remainder of this post, I’ll focus on the first theme.
Roadmap for VS codename Rosario
Below is a list of features we have in the backlog for Rosario. We’re not promising to complete all of them, but expect to complete a significant subset. Some of the features may appear earlier in a service pack for VS2008.
The two main investments we’re considering post-Rosario are:
Well, that's all for now. Please provide your feedback on this. What do you like? What don't you like? What's on your wish list?
The first public CTP of Visual Studio 10
Anyone attending or watching the goings on at Microsoft's Professional Developers Conference (PDC) will have heard about the next version of Visual Studio.
If you were at PDC, you will have received a VPC containing a Community Technology Preview of Visual Studio 10. If you were not, you might know that you can download it at: Visual Studio 2010 and .NET Framework 4.0 CTP.
What about the DSL Tools?
If you are interested in the DSL Tools, and try this CTP, you might be disappointed not to find anything new in DSL Tools. This doesn't mean we haven't been busy. In fact we've been developing a number of highly-requested new features, but unfortunately did not get these integrated into this CTP release. We'll share them with customers in our next preview release.
Also, in the CTP you'll find a suite of new designers from Team Architect, including a set of UML designers. These have all been built using DSL Tools, and some of the new features we're delivering are directly to support this effort.
The new features
So what are the new features then? Below is a summary. They've been driven by a combination of supporting internal teams such as Team Architect, and responding to customer feedback. We'll blog more about these features over the coming weeks, including, we hope, publishing some videos demo'ing them as a taster whilst you wait for the next preview.
Dsl Libraries. Share fragments of DSL definitions between different designers.
Dsl extensibility. Extend the domain model and behavior for a DSL after it's been deployed, including DSLs shipped by a third party.
Readonly. Selectively switch off the ability to edit models and model elements in a designer.
Forms-based UI. Easily bind models to winforms and WPF-based forms UI. IMS now implements the necessary databinding interfaces.
Modelbus. A new piece of platform to support cross-referencing between models and interaction between designers. This has been one of customers' main requests.
T4 precompile. Precompile text templates so that they can be deployed for use on machines that do not have VS installed. (This applies to scenarios where text templates take inputs from data sources other than DSL models. We've had requests from internal teams to use text templates in scenarios which don't use models created from DSLs.)
An important part of my job is to analyze requirements and write specifications. I have found the technique of storyboarding to be extremely effective. A storyboard is a concrete rendition of a particular scenario - could be a use case or an XP story (see Martin Fowler on UseCasesAndStories for an explanation of the difference). Not only are storyboards great for sorting out requirements, they are also effective in communicating what needs to be built to developers, and as a starting point for the development of tests.
If what you are building is exposed to the user through UI, then the most likely form of your storyboard is a click-through of the UI. If what you are building is infrastructure, say an API, then your storyboard might be a click-through of the experience a developer goes through in authoring code that uses the API, or it might be a filmstrip - a series of object diagrams that illustrate what happens to the state of the system as methods are executed.
Tools that I have found to be particularly effective in developing storyboards, especially those that click-through a UI experience, are Powerpoint and Visio. You can really bring storyboards alive in Powerpoint using a combination of moving through slides, using its custom animation capabilities, and by mixing and matching graphics from multiple sources. And the Windows XP template in Visio is pretty useful too. I've put up an article with specific hints and tips on using Powerpoint and Visio for this purpose.
Interestingly, I soon move from sketching a storyboard on paper or the whiteboard to committing it to Powerpoint, as the latter imposes a discipline that forces you to make decisions and care about the detail. It's so easy to continue hand-waiving and putting off those hard decisions at the whiteboard. I don't spend a large amount of time getting the UI 'just so'. It just needs to be good enough to tease out those important decisions about what the requirements actually are.
If anyone has suggestions for alternative tools that could be used for this purpose, then please add them as comments to this post.
There's been some debate between some of my MSFT colleagues (Alan Wills, Steve Cook, Jack Greenfield) and Grady Booch and others over at IBM around UML and DSLs. For those interested, Grady actually posted an article on UML and DSLs back in May last year - seems to have gone unnoticed. It touched on some of the themes that have been under discussion. My first blog entry was a response to this article.
A particular theme that crops up in the discussion is the issue of tools versus language. Grady seems to want to keep the two separate, whereas we believe that the two are closely linked - a language designed to be used in a tool is going to be different to a language designed to be used on paper.
I touched on this line of discussion in an article a few months back: http://blogs.msdn.com/stuart_kent/articles/181565.aspx
An example I used was that of class diagrams, and I pointed out a couple of aspects of class diagrams which may have been designed differently had the original intention been to use them within a tool rather than on paper. Now I can point you at an example notation that addresses some of these issues. It is the notation we use for defining domain models in our DSL Tools. Here's a sample:
Nodes are classes, lines are relationships or inheritance arrows. Nodes are organized into relationship and inheritance trees, which can be expanded or collapsed, making it easy to navigate and drill into large models (not something you do on paper).
A relationship line has the role information (would be association end information in UML 2) annotated in the middle of the line rather than at the ends: a role is represented as a triangle or rectangle containing a multiplicity, and the name of a role is annotated using a label (in the diagram above reverse role names have been hidden as they are usually less interesting).
Annotating role information in the middle of the line not only makes it easier to automatically place labels, it also means that relationship lines can be chanelled together enabling an expandable relationship tree to be automatically constructed and layed out. An example of channelling is given by the diagram below, where you can see all the relationship lines sourced on Page channelled together so that they connect to Page at the same point:
The point here is that this notation was designed from the start using criteria such as 'must support autolayout' and 'must support easy navigation of large models'. If the criteria were 'must be easily cut up into page size chunks', must be easy to sketch on a whiteboard' then the notation may well have turned out different.
Tools make available a whole range of facilities for viewing, navigating and manipulating models through diagrams, which are not available when using paper or a whiteboard. Unfortunately, these facilities can not always be exploited if they are not taken into account when the notation is designed: there is a difference between designing a notation to be used in a tool and one to be used on paper.
Some of the facilities available in a tool which are not available on paper:
To see how designing a notation for use in a tool can lead to different results than if the focus is paper or whiteboard use, let's take a look at two well-known notations. UML class diagrams and UML state machines. The former, I would argue, has not been well designed for use in a tool; whereas the latter benefits from features which can be exploited in a tool.
Class diagrams. A common use of class diagrams is to reverse engineer the diagram from code. This helps to understand and communicate designs reflected in code. Industrial scale programs have hundreds of classes; frameworks even more. Ideally, one would like a tool to create readable and intuitive diagrams automatically from code. However, the design of the notation mitigates against this. Top of my list of problems is the fact that the design of associations, with labels at each end, precludes channeling on lines, where channeling allows many lines to join a node at the same endpoint (inheritance arrows are often channeled to join the superclass shape at the same point). Because labels are placed at the ends of association lines, each line has to touch a class shape at a different end point in order to display the labels. This exacerbates line crossing and often means that class nodes need to be much larger than they ought to be, making diagrams less compact than they ought to be and much harder to achieve a good layout automatically.
State Machines. State machines can get large with many levels of nesting. This problem can be mitigated using zoom facilities, by a control that allows one to expand/collapse the inside of a state or a link that launches the nested state machine in a new window. As transitions can cross state boundaries, using any of these facilities means that you'll need to distinguish between when a transition (represented by an arrow) is sourced/targeted on a state nested inside and currently hidden from view, or on the state whose inside has been collapsed. This distinction requires a new piece of notation. In UML 1.5, the notation of stubbed transitions (transitions sourced or targeted on a small solid bar) was introduced to distinguish between a transition which touched a boundary and one which crossed it. Interestingly, in UML 2 this notation has been replaced by notation for entry/exit points. In this scheme, one can interrupt a transition crossing the boundary of a state by directing it through an entry or exit point drawn on the edge of the state. This is the state machine equivalent of page continuation symbols that one gets with flowcharts. However, it can be used to assist with expand/collapse in a tool: collapsing the inside of a state, leaves the entry/exit point visible, so one can still see the difference between transitions which cross the boundary and those which don't. In one sense, this isn't quite as flexible a solution as the solid bar notation, as, for expand/collapse to work in all cases, it requires all transitions that cross a boundary to be interrupted by an entry/exit point. I guess, a tool could introduce them and remove them as needed, but I must confess that I prefer the stubbed transition notation for the expand/collapse purpose - seems more natural somehow.
To conclude, designing a notation for use in a tool can lead to different decisions than if the focus is on paper or whiteboard use. However, much as we might like to think that a notation will only be used in a tool, there will always be times when we need to see or create it on paper or a whiteboard, and this has to be balanced against the desire to take advantage of what tools can offer. For example, it is always a good idea to incorporate symbols to support 'page continuation', which will make it easier to provide automated assistance for cutting a large virtual diagram into page-sized chunks (and, as we have seen, such symbols can also support other facilities like expand/collapse). And it is always worth considering whether the notation is sketchable. If not, it may be possible to define a simplified version which is. For example, one could provide alternatives for sophisticated shapes or symbols, that look great in tool, but are very hard to use when sketching.
If you're a student, then Microsoft has just announced a new program called DreamSpark to get Visual Studio 2008 Professional Edition (and Expression Studio and Windows Server 2003 and XNA Game Studio and SQL Server 2005) for free.
As Aaron points out, once you've got Visual Studio you can download the VS SDK (also for free) and start extending it.
If there are any academics reading my blog (I'm hoping that some of my academic colleagues might take a peek now and again), please let your students know about this.
Putting on my Lecturer's hat, I'd encourage all Computer Science students to try out both Visual Studio and Eclipse before they leave college, as it's very likely that a potential employer will be using one or other, and sometimes both. Also, the more experience you have with different development environments and programming languages the more transferrable and rounded your computing skills are likely to be.
I can now point you at a couple of announcements about the technology I've been working on. Here's the Microsoft Press announcement:
The exciting aspect of this is that we're going to start making the technology available to the community as soon as we can. Here's the site to watch:
I'll post to my blog as soon as some content is available - should be before the end of the week.
We gave a demo of some of the technology during Rick Rashid's keynote at the OOPSLA conference. Unfortunately I expect there'll be some flak about this - watching the keynote, our demo felt a bit like a 'commercial break'. I'll respond to the flak when it arrives.
So not quite the launch I'd hoped for, but that shouldn't detract from the technology itself. Here's a quick heads up on what we've been doing:
Our first release, at the end of this week, will be a preview of the object model editor. Previews of the other components should be available by the end of the year.
We've had the following question posted to the newsgroup on the DSL Tools site:
"Can I create a template of my own... right now? Two templates are available but my model doesn't fit into any. Can I use the 'Blank Template' and delete the existing model and create my own model. Is this possible right now? If I do this what all will I need to change wrt to the .dd file and resource files."
The basic model here is to design your DSL, or, more specifically the tools to support that DSL, based on an existing template. In the December CTP we aonly provide two templates: a blank one, with virtually nothing in it; and a template which started life as a simple architecture chart language and has evolved to include one of everything that is currently supported in the DD file. Over time we'll be updating and expanding this set.
Using these templates you can build your own language. As the questioner suggests, you do this by choosing one or other of the templates in the wizard, and then updating the domain model (deleting stuff, renaming stuff, adding stuff), and then updating the dd file and the resources files to match. The end to end walkthrough takes you through this process for the construction of a UIP Chart language. So, in short, you can go create a designer now for whatever language you like, with the big proviso that it fits within the realms of what the dd file currently supports.
And there's the rub. Updating the .dd file is not as easy as it should be, and what is currently supported is rather limiting (e.g. see the known issues list). We will shortly be releasing (within the next couple of weeks) another preview which will fix some key bugs and provide a dd file validator which should make it easier to work with dd files. Our plan after that is to release a preview that will work atop the Beta2 release of Visual Studio 2005, which will be more robust, and will relax some of the limitations imposed by the dd file. At that time, we should also be in a position to be more precise about our plans until the end of the year.
Now back to the first question: "Can I create a template of my own... right now?". The short answer is no. Well, at least we've provided no documentation and to do it manually can be a painstaking job. We have an internal tool that automates the process of creating the wizard templates from samples, but that would take some work to make available to customers. We are also not fixed on the current format for templates. I'd be very interested to hear of scenarios where customers would find the ability to create their own templates useful or essential. Who would the templates be for - yourself? Someone else? What kinds of language, or aspects of a language, would you want to bake into a template?
Now that Visual Studio 2005 has been released to manufacture (RTM), folks are asking us when to expect a version of DSL Tools that works with the RTM release. Well, you won't have to wait long - it should be available within the next two or three weeks. And we've got two new features lined up for you as well:
Deployment, where you create a setup project for your designer authoring solution using a new DSL Tools Setup project template, which, when built, will generate a setup.exe and a .msi for installing your designer on another machine on which VS2005 standard or above is installed.
Validation, where it's now possible to add constraints to your DSL definition which can be validated against models built using your designer. Errors and warnings get posted in the VS errors window. We support various launch points for validation - Open, Save, Validate menu, and these are configurable. All the plumbing is generated for you - all you have to do is write validation methods in partial classes - one per domain class to which you want to attach constraints - which follow a fairly straightforward pattern.
We're also going to release our first end to end sample about the same time. The sample is a designer for modelling the flow in Wizard UI's (it's state chart like), from which working code is generated.
There aren't really seven, but it makes for a good title (look up "seven stages shakespeare" if you're feeling puzzled at this point.)
Anyway, I've been watching a debate between Harry, Gareth and now Steven, on the process of building models, including how alike or unalike that is to programming. There also seem to be a number of different terms being suggested for the different states of models - terms like complete/incomplete or precise/imprecise. Now I start to worry when I see terms like that without concrete examples to back them up. So I thought I'd weigh in with some comments, and give a concrete example to illustrate what I mean.
The general point I'd like to make is that I think models and programs do go through different states, like complete or incomplete, but I wouldn't try to use generic terms to describe these states. I think those states are language, dare I say domain, specific, and depend largely on the tools we have for checking and processing expressions in those languages, be they programs or models. So a program goes through states such as 'compiles without error' and 'executes' and 'passes all tests' and 'has zero reported bugs against it' and 'has passed customer acceptance testing'. At least the first three of those states are very specific to programming - we know what it means to compile a program, to execute it, to test it. We'll have to find different terms for different kinds of model.
And if I was going to use a generic term for the sliding scale against which we can judge the state of a model or program, I would use the term 'fidelity', as it has connotations of trustworthiness, confidence and dependibility, which terms like 'precision' and 'completeness' don't really have.
So now for an example of stages a model might go through, taken from my very recent experience as we are developing Dsl Tools (I apologize for the mild case of navel-gazing here).
In the last few days I've been developing a new domain model for the new Dsl definition language that will replace the designer definition (.dsldd) file we currently use in Dsl Tools. There are various stages which that model goes through. I don't bother with sketching it on paper, as the graphical design surface is good enough for me to do that directly in the designer. I miss out lots of information initially, and the model is generally not well-formed most of the time. However, I then get to a point where I'm basically happy with the design (or as happy as I can be without actually testing it out) and I'm ready to get it to the next level of fidelity. Let's call this 'Initial Design Complete'.
For the next step, I start by running a code generation template against the model, which generates an XSD. This usually fails initially, so I then spend time adding missing information to the model and correcting information that is already there. Finally the XSD generation works - I have more faith in my model than I had before. This is the 'XSD generation succesful' state.
Now I've got the XSD, I develop example xml files which represent instances of the domain model I'm building (in this case, particular dsl definitions) and check they validate against the XSD. I find out, in this stage, whether the model I'm building captures the concepts required to define Dsl's. Can I construct a definition for a particular Dsl? As these examples are built, I refine the domain model, and regenerate the XSD, until I have a set of examples that I'm happy with. The fidelity of the model is increased and I'm much more confident that it's correct. We might call this the 'All candidate Dsl Definitions can be expressed' state.
The next stage, which will now mostly be done by developers I work with, is to write the code generators of this new format that generate working designers. We'll write automated tests against those generated designers and exercise them to check that they have the behavior we expected to get from the Dsl definitions provided as input to the code generators. This process is bound to find issues with the original model, which will get updated accordingly. This might be the 'Semantics of language implemented' state, where here the semantics is encoded as a set of designer code generators.
So how is this process similar to programming? Well, with programming I'd probably do my initial sketches of the design on a whiteboard or paper. I might use a tool like the new class designer in Visual Studio, which essentially visualizes code, and just ignore compile errors during this stage.
When I think the design is about right I'll then do the work to get the code to compile. I guess that's a bit like doing the work in the modeling example above to generate the XSD.
Once I've got the code to compile, I'll try and execute it. I'll build tests and check that the results are what I and the customer expect. This, perhaps, is analagous to me writing out specific examples of dsl definitions in Xml and checking them against the generated XSD.
We could probably debate for ages how alike generating an XSD from a domain model is to compiling a program, or how alike checking that the domain model is right by building example instances in XML is to executing and testing a program. I don't think it really matters - the general point is that there are various stages of running artefacts through tools, which check and test and generally increase our confidence in the artefact we're developing - they increase the fidelity of the artefact. Exactly what those stages are depends on the domain - what the artefact is, what language is used to express it, and what tools are used to process it.
Now it would be interesting to look at what some of these states should be for languages in different domains. For example, what are they for languages describing business models...
[10th February 2006: Fixed some typos and grammatical errors.]
Underlying Dan's article seemed to be the assumption that models are just used as input to code generation. To be fair, the article was entirely focused on the OMG's view of model driven development, dubbed MDA, which tends to lean that way. My own belief is that there are many useful things you can use models for, other than code generation, but that's the topic of a different post. I'll just focus here on code generation.
So which path to follow? Translationist or elaborationist?
In the translationist approach, the model is really a programming language and the code generator a compiler. Unless you are going to debug the generated (compiled) code, this means that you'll need to develop a complete debugging and testing experience around the so-called modeling language. This, in turn, requires the language to be precisely defined, and to be rich enough to express all aspects of the target system. If the language has graphical elements, then this approach is tantamount to building a visual programming language. The construction of such a language and associated tooling is a major task that requires specialist skills. It will probably be done by a tool vendor in domains where there is enough of a market to warrant the initial investment. Indeed, one doesn't have to look far for examples. There are several companies who have built businesses on the back of this approach to MDA, especially in the domain of real-time, embedded systems. And, for obvious reasons, they have been leading efforts to define a programming language subset of UML, called Executable UML, xUML or xtUML, depending on which company you talk to.
In contrast, the elaborationist approach to code generation does not require the same degree of specialist skill or upfront investment. It can start out small and grow organically. However, there are pitfalls to watch out for. Here's some that I've identified:
Of course, we have been talking to our customers and partners about their needs in this area. But we're always to keen to receive more feedback. If you've been using code generation, then I'd like to hear from you. Has it been successful? What techniques have you been using to write the generators? To write the models? What pitfalls have you encountered? What development tools would have made the job easier?
Like buses, you wait a while then two come at once. That's right, just a week after we released the latest version of DSL Tools, we've also released an update to the samples. This download includes:
(1) is brand new. (2) is an update of the samples we released a few weeks ago.
And next? We should get some more documentation up soon, and we're working on integrating the release into the VS SDK. After that, you'll have to wait a while as we do some root and branch work on some of the APIs and replace the .dsldmd and .dsldd with a single .dsl format, with it's own editor (yes, editing .dsldd files will become a thing of the past). We'll also be supporting two more pieces of new notation - port shapes and swimlanes, and providing richer and more flexible support for persisting models in XML files. Our goal is a release candidate at the end of March.
One of the most common asks from customers of the DSL Tools has been support for integrating models and designers, that is the ability to have models of the same or different DSL cross-reference one another, and the ability to write simple code that exploits these cross-references to implement interesting interactions between designers.
In VS2010 we’re providing this support which can now be previewed in VS2010 Beta1 release. Jean-Marc has just posted a sample (StateMachineOverModelBus.zip, documentation) that illustrates how the modelbus can be used to achieve the behaviors described above, and I’ve just posted a short video demonstrating the behaviors and giving a quick insight into what you need to do to realize them in your own designers.
Details on how you can provide feedback can be found on our the DSL Tools Code Gallery landing page. Or by all means provide feedback on this blog.
Three tutorial documents which 'walk through' various aspects of the December release of our DSL Tools are now available for download in a zip file.
You can also navigate to them from the DSL Tools December release.
The Oslo modeling platform was announced at Microsoft's PDC and we've been asked by a number of customers what the relationship is between DSL Tools and Oslo. So I thought it would be worth clearing the air on this. Keith Short from the Oslo team has just posted on this very same question. I haven’t much to add really, except to clarify a couple of things about DSL Tools and VSTS Team Architect.
As Keith pointed out, some commentators have suggested that DSL Tools is dead. This couldn’t be further from the truth. Keith himself points out that "both products have a lifecycle in front of them". In DSL Tools in Visual Studio 2010 I summarize the new features that we're shipping for DSL Tools in VS 2010, and we'll be providing more details in future posts. In short, the platform has expanded to support forms-based designers and interaction between models and designers. There's also the new suite of designers from Team Architect including a set of UML designers and technology specific DSLs coming in VS 2010. These have been built using DSL Tools. Cameron has blogged about this, and there are now some great videos describing the features, including some new technology for visualizing existing code and artifacts. See this entry from Steve for details.
The new features in DSL Tools support integration with the designers from team architect, for example with DSLs of your own, using the new modelbus, and we're exploring other ways in which you can enhance and customize those designers without having to taking the step of creating your own DSL. Our T4 text templating technology will also work with these designers for code generation and will allow access to models across the modelbus. You may also be interested in my post Long Time No Blog, UML and DSLs which talks more about the relationship between DSLs and UML.
I've just returned from OOPSLA and Redmond, and am pleased to see that the DSL tools workbench site is now live, delayed by a few teething problems with process - we should get better at this with each release of content. You can now download a preview version of the object model editor, which is accompanied by a document providing a walkthrough of that tool.
We'll be putting up some more content over the next few weeks - whitepapers, a video, that kind of thing - and planning to drop another preview, this time including the wizard and code generators (see my earlier posting), around Christmas.
When I've cleared the backlog of work from my two weeks away, I hope to return to more technical topics, probably starting with an explanation of the notation used in the object model editor.
"I was delighted to see today's report that Sun has announced support for the UML in their tools. This comes on the heels of Microsoft telegraphing their support for modeling in various public presentations, although the crew in Redmond seem to trying to downplay the UML open standard in lieu of non-UML domain-specific languages (DSL). […] There is no doubt that different domains and different stakeholders are best served by visualizations that best speak their language - the work of Edward Tufte certainly demonstrates that - but there is tremendous value in having a common underlying semantic model for all such stakeholders. Additionally, the UML standard permits different visualizations, so if one follows the path of pure DSLs, you essentially end up having to recreate the UML itself again, which seems a bit silly given the tens of thousand of person-hours already invested in creating the UML as an open, public standard."
Let's start with the statement "There is no doubt that different domains and different stakeholders are best served by visualizations that best speak their language". This seems to imply that a domain specific-language is just about having a different graphical notation - that the semantics that underpin different DSLs is in fact the same - only the notation changes. This view is further reinforced by the statement "but there is tremendous value in having a common underlying semantics model for all such stakeholders".
How can one disagree with this? Well, what if the semantic model excludes the concepts that the stakeholders actually want to express? If you examine how folks use UML, especially those who are trying to practice model driven development, you'll see that exactly the opposite is happening. Model driven development is forcing people to get very precise about the semantics of the models they write (in contrast to the sometimes contradictory, confusing and ambiguous semantics that appears in the UML spec). This precision is embodied in the code generators and model analysis tools that are being written. And, surprise, surprise, there are differences from one domain to another, from one organization to another. Far from there being a common semantics, there are significant and actual differences between the interpretations of models being taken. And how are these semantic differences being exposed notationally? Well, unfortunately, you are rather limited in UML with how you can adapt the notation. About all you can do is decorate your diagrams with stereotypes and tagged values. This leads to (a) ugly diagrams and (b) significant departure from the standard UML semantics for those diagrams (as far as this can be pinned down).
I speak from experience. I once worked on the Enterprise Application Integration UML profile, which has recently been ratified as a standard by the OMG. The game here was to find a way of expressing the concepts you wanted, using UML diagrams decorated with stereotypes and trying to align with semantics of diagrams as best you could. It boiled down to trying to find a way of expressing your models in a UML modelling tool, at first to visualize them, and then, if you were brave enough, to get hold of the XMI and generate code from them. So in the EAI profile, we bastardized class diagrams to define component types (including using classes to define kinds of port), and object diagrams were used to define the insides of composite components, by representing the components and ports as objects, and wires as links. Now, you can hardly say that this follows the "standard" semantics of UML class diagrams and object diagrams.
And this gets to the heart of the matter. People are using UML to express domain-specific models, not because it is the best tool for the job, but because it saves them having to build their own visual modelling tool (which they perceive as something requiring a great deal of specialist expertise). And provided they can get their models into an XML format (XMI), however ugly that is, they can at least access them for code generation and the like. Of course, people can use XML for this as well, provided they don't care about seeing their models through diagrams, and they are prepared to edit XML directly.
So, rather than trying to force everyone to use UML tools, we should be making it easy for people to build their own designers for languages tailored to work within a particular domain or organization's process, not forgetting that these languages will often be graphical and that we'll want to get at and manipulate the models programmatically. And this does not preclude vendors building richer designers to support those (often horizontal) domains where it is perceived that the additional investment in coding is worthwhile. Indeed, Microsoft are releasing some designers in this category with the next release of Visual Studio.
I typed this entry a few days ago, but then managed to lose it through a set of circumstances I'm too embarrassed to tell you about. It's always better second time around in any case.
Anyway, reading this recent post from Simon Johnston prompted a few thoughts that I'd like to share. In summary, Simon likens UML to a craftsman's toolbox, that in the hand of a skilled craftsman can produce fine results. He then contrasts this with the domain specific language approach and software factories, suggesting that developers are all going to be turned into production-line workers - no more craftsman. The argument goes something like this: developers become specialized in a software factory to work on only one aspect of the product through a single, narrowly focussed domain specific language (DSL); they do their work in silos, without any awareness of what the others are doing; this may increase productivity of the individual developers, but lead to a less coherent solution.
Well, I presume this is a veiled reference to the recent book on Software Factories, written by Jack Greenfield and Keith Short, architects in my product group at Microsoft, and to which I contributed a couple of chapters with Steve Cook. The characterization of sofware factories suggested by Simon is at best an over-simplification of the vision presented in this book.
I trained as a mathematician. When constructing a proof in mathematics there are two approaches. Go back to the original definitions, the first principles, and work out your proof from there; or build on top of theorems already proven by others. The advantage of the first approach is that all you have to learn is the first principles and then you can set your hand to anything. The problem, is that it will take you a very long to time to prove all but the simplest theorems, and you'll continually be treading over ground you've trod many times before. The problem with the second approach is that you have to learn a lot more, including new notations (dare I say DSLs) and inevitably end up becoming a specialist in a particular branch of the subject; but in that area you'll be a lot more productive. And it is not unknown for different areas of mathematics to combine to prove some of the more sophisticated theorems.
With software factories we're saying that to become more productive we need to get more domain specific so that we can provide more focused tooling that cuts out the grunt work and let's us get on with the more challenging and exciting parts of the job. As with mathematics, the ability to invent domain specific notations, and, in our case, the automated tools to support them, is critical to this enterprise. And sophisticated factories (that is, most of them) will combine expertise from different domains, both horizontal and vertical, to get the job done, just as different branches of mathematics can combine to tackle tricky problems.
So our vision of software factories is closer to the desirable situation described by Simon towards the end of his article, where he talks about the need for a "coherent set of views into the problem". Each DSL looks at the problem, the software system being built or maintained, from a particular perspective. These perspectives need to be combined with the other views to give a complete picture. If developers specialize in one perspective or another, then so be it, but that doesn't mean that they can sit in silos and not communicate with the others in the team. There are always overlaps between views and work done by one will impact the work of another. But, having more specialized tooling should avoid a lot of error-prone grunt work, and will make the team as a whole far more productive as a result.
So what about UML in all this? To return to Simon's toolbox analogy (and slightly toungue-in-cheek) UML is like having a single hand drill in the toolbox, which we've got to try and use to drill all sizes of hole (for large holes you drill a number of small holes close together), and in all kinds of material; some materials you won't be able to drill into at all. DSLs, on the other hand, is like having a toolbox full of drill bits of all different sizes, each designed to drill into a particular material. And in a software factory, you support your DSLs with integrated tooling, which is like providing the electric hammer-drill: you'll be a lot more productive with these specialist tools, and even do things you couldn't manage before, like drill holes in concrete.
So I don't see UML as a central part of the software factory/DSL story. I see it first and foremost as a language for (sketching) the design of object-oriented programs - at least this is its history and its primary use to date. Later versions of UML, in particular the upcoming UML 2, have tried to extend its reach by adding to the bag of notations that it includes. At best, this bag is useful inspiration in the development of some DSLs, but I doubt very much that they'll get used exactly as specified in the standard - as far as conformance against the standard can be checked that is...
This is the question asked in the title of a post on our modeling and tools forum, and rather than answer directly there, I thought it would make more sense to post an answer here, as it’s then easier to cross-reference in the future.
The body of the post actually reads:
Various Microsoft attempts at MDD have failed or been put on the back burner: WhiteHorse, Software Factories, Oslo. Does Microsoft have any strategy for Model Driven Development? Will any of the forementioned tools ever see the light of day?
Various Microsoft attempts at MDD have failed or been put on the back burner: WhiteHorse, Software Factories, Oslo.
Does Microsoft have any strategy for Model Driven Development? Will any of the forementioned tools ever see the light of day?
First we need to clarify some definitions. I distinguish between the following:
Model-assisted development, where models are used to help with development but are not the prime artefacts – they don’t contribute in any direct way to the executing code. Examples would be verifying or validating code against models or even just using models to think through or communicate designs. UML is often used for the latter.
Model-driven development, where models are the prime artefacts, with large parts of the executing code generated from them. Custom, hand-written code is used to complete the functionality of the software, filling in those parts which are not represented in the model.
Model-centric development, where models are the prime artefacts, but interpreted directly by the runtime. Custom, hand-written code is used to complete the functionality, with an API provided to access model data and talk to the runtime as necessary.
These are not separated approaches, and development projects may use a combination of all three. In fact, I imagine a continuous dial that ranges from model-assisted development through to model-centric development, such as the one illustrated below (starting at the left, the first five tabs are examples of model-assisted development).
The challenge is to make the movement through the dial as seamless and integrated as possible, and also to make sure that these approaches can be adopted incrementally and in a way that integrates with mainstream development (agile) practices. Only then will we be able to unlock the potential productivity benefits of these approaches for the broader development community.
In Microsoft, there are a number of shipping products which support or employ one or more of these approaches.
In Visual Studio, we have DSL Tools and T4, which together support model-driven development. New functionality was added to both in VS2010, and T4, for template-based code generation, continues to see broader adoption, for example in the ASP .Net community as evidenced by this blog post. for example. Many customers use DSL Tools for their own internal projects, and we continually get questions on this forum and through other channels on this.
Visual Studio 2010 edition introduced a set of architecture tools: a set of UML Tools (built using DSL Tools), a tool for visualizing dependencies between code as graphs, and a tool for validating expected dependencies between different software components (layers). I would place all these tools under the category of model-assisted development, although the VS2010 feature pack 2 does provide some support for generating code from UML models, which allows these tools to be used for model-driven development, if MDD with UML is your thing.
Visual Studio LightSwitch combines a model-driven (generates code) and model-centric (directly interprets the model) approach to the development of business applications. It is a great example of how models can really make the developer more productive.
Outside of Visual Studio, the Dynamix line of products, also take a combined model-centric/model-driven approach to the domains of ERP and CRM business applications. For example, take a look at this blog post.
I am an architect with the Visual Studio Ultimate team, who are responsible for DSL Tools, T4, and the architecture tools. In that team, we are now looking into how to take these tools forward, focused very much on how to make software developers and testers more productive. Part of this will be to consolidate and integrate some of what we already have, as well as integrate better with other parts of Visual Studio and mainstream development practices. Another part will be to add new features and capabilities and target support on specific frameworks and domains – as LightSwitch does. We’re focused on addressing the challenge set out above, and to deliver value incrementally.
As we start to deliver the next wave of tools, I look forward to a continued conversation with our customers about the direction we’re headed.
Addendum (8/4/2011): I forgot to mention that another problem we’re currently focused on is how deliver a much more seamless integration between models and code, which is essential to meet the challenge described above.
Here's a lot of detailed questions from Kimberly Marcantonio, an active participant in the DSL Tools Newsgroup. I thought it would be more useful to the community to publicise the answers on my blog. Indeed, expect me to do this more, when the answers are likely to be of interest to those following the DSL Tools. The questions also touch on issues concerning the direction we're taking with this technology. I've tried to be open in my responses, without making firm commitments. I hope soon to be more precise about our plans for new features and their roll out.
In what follows, Kimberly's text is in blue, and my responses are in red italic...
I am currently trying to model the Corba Component Model using this DSL package and have run into the following problems/questions:
I read on Jim Steel's blog that back in August there was lots of discussion on OMG lists about what makes a compliant MDA tool. I followed a link from his blog to this entry, and there were three aspects that intrigued me.
A new sample has been published of a hosting a DSL in the VS2008 isolated shell. See http://www.codeplex.com/storyboarddesigner.
The isolated shell is a stripped out VS shell who's UI and splash screen you can customize. VS packages, including DSLs, can be hosted in both full VS and in your own customized shell without change. This sample shows how to do this specifically in the case where you want to host a DSL in a custom shell, for example to distribute to a designer to non-developer stakeholders. See the VSX development center for more details.
This sample shows a storyboarding DSL, which a product manager or business analyst might do to envision the requirements of an application. In addition to demonstrating hosting in the shell, the sample also shows a couple of cool customizations on a DSL, including a very simple approach to linking diagrams, generation of web pages for the storyboard, and export of images off the design surface.
Here's a screenshot:
Thanks to Clarius for putting this up.
A set of interesting questions were posted to the DSL Tools Newsgroup recently, so I've decided to reply to them here. The text from the newsgroup posting appears like this.
Hello -- I quickly ran through the walkthroughs and worked a little with the beta version of your tools, and they are neat. Having designed a modeling language and build a few models, one thing which I would like to do is 'execute' those models. I want to write a C# plugin for Visual Studio which uses an automatically-generated domain-specific API to query and perhaps modify the models programmatically. Based on what the plugin finds in the models, it can do some other useful work. Let's say I want to do some domain-specific analysis, where there isn't any existing analysis framework which correctly supports my domain. In that case, I might as well roll my own analysis framework as a plug-in which is integrated with VS's DSL tools. What I don't want to do is serialize the models to XML and have my independent tool read in the XML file, create an internal representation of the models in memory, and then do stuff. It's a waste of time. I want to integrate my analysis tool with VS and access my models...directly.
These are exactly the kinds of scenario we are envisaging. As I discussed in a past entry, creating models is not much use if it's difficult or impossible for other tools to consume them.
So, my hope is:
Will this be supported? If so, can you publish a walkthrough about this? The models are only worth so much if they're only good for xml/text/code generation --software isn't the only thing which needs to be modeled.
The models are held in memory (we call it the in-memory store). As well as giving access to CRUD operations, this supports transactional processing and event firing. We also generate domain specific APIs from domain models - indeed, you can see what these APIs look like if you look at e.g. XXXX.dmd.cs generated from the XXXX.dmd using the template XXXX.dmd.mdfomt in a designer solution. These APIs work against the generic framework, thus allowing both generic and domain specific access to model data. However, we still have some work to do to make all this easily available, including making some improvements to the generic APIs and doing some repackaging of code. The goal would be that you'd be able to use the dll generated from a domain model to load models into memory from XML files, access them through generic and and domain specific APIs, and then save them back to XML files. We will also be overhauling the XML serialization, so that models will get stored in domain specific, customized XML - see Gareth's posting for some details around this.
As for VS plugins, these will be supported in some way, for example via the addition of custom menus to your designer, or by writing 'standalone' tools integrated into VS making use of the existing VS extensibility features.
On the issue of timing, the API work will happen over the next few months, the serialization work after that. We will continue to put out new preview releases as new features are introduced. Walkthroughs, other documentation and samples will be provided with the new features.
I see that Ramesh on the class designer team has posted a note about collection associations. The basic idea is that when visualising code one can choose to elide the collection class, so, for example, you'll see an association from Customer to Order instead of an association from Customer to IList.
This may seem a small matter, but when I used to teach OO design and programming, any first sketch of the design as a UML class diagram would almost always elide the collection class. So it used to really annoy me that when it came to writing out the code, the class diagrams which we produced in TogetherJ could not maintain this elision - not if you wanted to keep diagram and code in sync. This made the diagrams far less useful for communicating a design than they could have been. (There are many great features that the Together tool offered, but its treatment of associations always used to bug me.)
So, hats off to the CD team for getting this aspect just right.
Fred Thwaites has asked a question in response to this post. He asks:
"Does this imply that in general users of DSL in VS2005 will need to be aware of this metamodel, or will VS2005 come with a comprehensive set of predefined DSL's filling all the cells of Jack Greenfields software factory schema grid."
"Secondly, how fragmented do you feel DSL's will become? Do you expect that in most cases the VS supplied DSL will suffice, or do you see companies, departments and even projects regulally creating of extending DSL's."
In answer to the first question, those who build the DSL tools using our technology will be aware of the metamodel - or domain model as we call it - which is a key part of defining such tools. Users of the the tools so built will not need to be aware of this model, although, of course, they will need to understand the concepts in the domain which it encodes. However, we do not expect most tool-building users to start from scratch everytime, but instead we (and possibly others) will provide 'vanilla' (and not so vanilla) languages in the form of templates to get started. If you try out the December release of the DSL tools, or even just read through the walkthroughs, you will see that the process of building a designer begins by running a wizard, in which you can select a language template on which to base your own DSL. The set if templates is limited at the moment, but we have plans to expand it.
In answer to the second question, a key motivation for the DSLs approach is that as you get serious about automating aspects of your software development processes you find that the 'off-the-shelf' languages/tools either don't quite fit your needs, or fail completely to address your domain. So I fully expect companies, possibly departaments and perhaps projects, creating, customizing and extending DSLs, although in many cases they'll do so from an existing template - it won't be necessary to start from scratch very often.
It is also worth noting that DSLs will evolve as requirements for automation evolve - for example, as the scope of code generation grows. The process might go something like this: I have a DSL for describing executable business processes, and I'm generating a fair chunk of my on-line systems from these models. This DSL has encoded some information very specific to my organization, for example the particular set of roles that my users can play and the particular set of channels through which they can interact with the business process. As these don't change very frequently, it's easier just to make a list of them as part of the definition of the DSL (simplifies the use of the DSL, simplifies the code generated, etc.), rather than extend the DSL to allow new ones to be defined. If they need to be changed later, then the DSL can be udpated, and a new version of the tools released (with appropriate migration tools as well). I then observe that by extending that DSL, or complementing it with another to describe security aspects, say, (noting that the description of security parameters will need to reference elements of the executable business process model), I can then extend the reach of my code generators to put in the necessary plumbing to handle security.