Stuart Kent - Building developer tools at Microsoft - @sjhkent

  • stuart kent's blog

    Collection associations in class designer

    • 2 Comments

    I see that Ramesh on the class designer team has posted a note about collection associations. The basic idea is that when visualising code one can choose to elide the collection class, so, for example, you'll see an association from Customer to Order instead of an association from Customer to IList.  

    This may seem a small matter, but when I used to teach OO design and programming, any first sketch of the design as a UML class diagram would almost always elide the collection class. So it used to really annoy me that when it came to writing out the code, the class diagrams which we produced in TogetherJ could not maintain this elision - not if you wanted to keep diagram and code in sync. This made the diagrams far less useful for communicating a design than they could have been. (There are many great features that the Together tool offered, but its treatment of associations always used to bug me.)
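    To make the elision concrete, here is a minimal sketch (in Python rather than C#, and with invented names): the design-level association runs from Customer to Order, while the collection class that implements it is exactly the detail a diagram can hide.

```python
# Hypothetical illustration, not Class Designer output: the design-level
# association is Customer -> Order, even though the implementation
# interposes a collection class (the thing the diagram can elide).

class Order:
    def __init__(self, order_id: int) -> None:
        self.order_id = order_id

class Customer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.orders: list[Order] = []   # the collection class a diagram can hide

    def add_order(self, order: Order) -> None:
        self.orders.append(order)

c = Customer("Acme")
c.add_order(Order(1))
c.add_order(Order(2))
assert len(c.orders) == 2
```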

    So, hats off to the CD team for getting this aspect just right.

      

  • stuart kent's blog

    Answers to questions on the domain model designer and future features

    • 5 Comments

    Here are a number of detailed questions from Kimberly Marcantonio, an active participant in the DSL Tools newsgroup. I thought it would be more useful to the community to publicise the answers on my blog. Indeed, expect me to do this more when the answers are likely to be of interest to those following the DSL Tools. The questions also touch on issues concerning the direction we're taking with this technology. I've tried to be open in my responses, without making firm commitments. I hope soon to be more precise about our plans for new features and their roll-out.

    In what follows, Kimberly's text is in blue, and my responses are in red italic...

    I am currently trying to model the Corba Component Model using this DSL package and have run into the following problems/questions:

    1. Why is it that you can not currently move the classes (drag and drop) around the canvas? This leads to very spread out models that take up a lot of space and do not print well. Is there a better way to print these models to make sure that they fit onto one page?
      We took the decision to automate as much of the layout as possible. You can use the 'Bring Definition Here' and 'Create Root' context menu items, when a node in the diagram is selected, to control where definitions of classes appear in the diagram. Our experience is that this gives a reasonable amount of control over diagram layout, without losing the significant advantages of autolayout. I'd be interested to know if anyone has tried using these, and whether this helps with the printing issue. We would like to add facilities for creating partial diagrams, perhaps showing relationship structures as the true graphs they are, but have to balance this against the myriad other features we need to build (e.g. see the comments about constraints and serialization below).
    2. Can you offer further explanation into when to use an embedded relationship, and a Reference relationship? I understand the use of Inheritance, but often do not know when to use the other two. Also I feel as if there should just be a regular connection, for sometimes I feel these two types of connections are not fitting. Is containment the same as embedding?
      Embedding and reference are used to drive behaviours of the underlying tools - or, to be more accurate, will be used to drive the behaviour of the underlying tools. At the moment they drive the deletion behaviour of a designer: the default behaviour is taken from the diagram, with deletion propagated across embedding relationships but not reference relationships, though this can be overridden using the delete behaviour designer in the DMD. This information will also be used to drive the XML serialization format (the approach to serialization is only an interim measure at the moment) and the default layout of the explorer. There are other aspects of behaviour where this kind of information is useful, though I won't go into that here. Also see the answers below.
    3. Also could I have more information as to what the XML root is used for? I sometimes feel as if my diagrams have no root, or multiple roots, yet this is not supported.
      The current serialization solution is only an interim measure. XML root is used to indicate which element is used at the top of the tree when a model is serialized into a file, and the kind of element a diagram must map to. Our actual approach to serialization should be richer and more domain-specific than this, and the constraint requiring a diagram to map to the XML root is likely to be relaxed.
    4. Is there anyway to enforce constraints in this modeling language, such as OCL (Object-Constraint Language)?
      Not yet, but constraint validation is in our plans. We'll probably just use .NET languages to write the bodies of constraints initially, as that brings with it IntelliSense and debugging support, but all the plumbing into the designer, reporting of errors, etc. will be handled for you.
    5. Is it possible to have more than one .dmd file in a project? If so, do you have one .dd file for all of these .dmd files, or many .dd files, one for each .dmd file?
      Yes, you can have more than one .dmd file per project, and indeed you can generate the code for each one you have (currently you'll need to make copies of the three .mdfomt files in the ObjectModel directory, giving them names that match the name of your .dmd file and editing the <%@ modelFile ... %> line). However, at the moment a .dd file can only refer to one .dmd file. Our plans include the ability to define many designers per .dmd, and to have one designer able to view models from multiple .dmd files. Exactly how we'll do this (there are a number of design options) is yet to be worked out. Basically, we're in the business of being able to create multiple designers which can provide different perspectives on the same model, as well as being able to break down the definition of a domain model into component parts. At least that's the plan.
    6. If you can have multiple .dmd files can you reference classes on other models?
      Yes, though the mechanism is a bit clunky at the moment. To create a reference to a class in another domain model, create a class with the same name and namespace, and then set its IsLoaded property to be false. The code generated from the domain model will put in the appropriate references, though, thinking about it, I don't think the designer will quite do the right thing (it needs to reference both models and ensure that their definitions are loaded into the store on startup).
    7. Can you show Bi-directional connections?
      All relationships are bidirectional. It's just that we have chosen to overlay the definition of relationships with a tree navigator - the diagram reflects one way of walking, as a tree, a graph of objects of classes defined in the model, connected by links which are instances of relationships in the model. This tree is used as the default for behaviours in the designer which require one, such as serialization to XML, viewing a model in the explorer and deletion behaviour. At present, the DMD allows you to override this default for deletion behaviour in the Delete Behavior Designer. In future versions, we hope to provide a means of defining similar overrides to drive XML serialization and the explorer behaviour. Also see the answer to (2).
    8. Can you cut across the tree hierarchy?
      If I understand the question correctly, yes. You can define relationships between any two classes, including different nodes in the tree. When you do so, a 'use node' will be created for the target class, as a child of the source class, wherever the definition of the source class appears. You can do this for both embedding and reference relationships. Also see answer to (2).  
    9. Can two classes share the same value property?
      No. We follow standard OO practice in this regard. So the only way to achieve this result is to have a common superclass which defines the value property.
    10. Why is it not possible to cut and paste? This would make it easier to create similar classes.
      This is just a feature we have not implemented yet.
    11. If B is a child of A, and C is a child of B, does C have to be a child of A?
      I assume you're not talking about inheritance here, but embedding relationships, and that you are asking whether the definition of C must appear beneath the definition of B, which appears beneath the definition of A. It is possible to have the definition of C appear as a root on the diagram, or anywhere C is referenced - as the target of an embedding or reference relationship, or as a child in an inheritance relationship. Select the node where you want the definition to appear and choose the 'Bring Definition Here' option, or choose 'Create Root' if you want the definition to appear at the root of the diagram. Details are in the DMD walkthrough. Also see the answer to (2).
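    The delete-propagation rule from the answer to (2) - cascade across embedding relationships, stop at reference relationships - is easy to sketch outside the tool. This is a hypothetical Python illustration, not DSL Tools code; the class and property names are invented:

```python
# Toy model of the default deletion behaviour described above: deleting an
# element cascades across embedding links but not reference links.

class Element:
    def __init__(self, name):
        self.name = name
        self.embedded = []    # embedding links: children owned by this element
        self.references = []  # reference links: peers merely pointed at
        self.deleted = False

    def delete(self):
        self.deleted = True
        for child in self.embedded:   # cascade to owned children
            child.delete()
        # referenced elements are deliberately left alone

model = Element("Model")
diagram = Element("Diagram")
style = Element("SharedStyle")
model.embedded.append(diagram)
diagram.references.append(style)

model.delete()
assert diagram.deleted        # embedded child is deleted with its owner
assert not style.deleted      # referenced element survives
```

    In the real tools the default comes from the domain model and can be overridden per relationship in the delete behaviour designer; here the rule is simply hard-coded.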
  • stuart kent's blog

    Creating your own DSL / DSL template

    • 3 Comments

    We've had the following question posted to the newsgroup on the DSL Tools site:

    "Can I create a template of my own... right now? Two templates are available but my model doesn't fit into any. Can I use the 'Blank Template' and delete the existing model and create my own model. Is this possible right now? If I do this what all will I need to change wrt to the .dd file and resource files."

    The basic model here is to design your DSL - or, more specifically, the tools to support that DSL - based on an existing template. In the December CTP we only provide two templates: a blank one, with virtually nothing in it; and a template which started life as a simple architecture chart language and has evolved to include one of everything that is currently supported in the DD file. Over time we'll be updating and expanding this set.

    Using these templates you can build your own language. As the questioner suggests, you do this by choosing one or other of the templates in the wizard, then updating the domain model (deleting stuff, renaming stuff, adding stuff), and then updating the .dd file and the resource files to match. The end-to-end walkthrough takes you through this process for the construction of a UIP Chart language. So, in short, you can go and create a designer now for whatever language you like, with the big proviso that it fits within the realms of what the .dd file currently supports.

    And there's the rub. Updating the .dd file is not as easy as it should be, and what is currently supported is rather limiting (e.g. see the known issues list). We will shortly be releasing (within the next couple of weeks) another preview which will fix some key bugs and provide a dd file validator which should make it easier to work with dd files. Our plan after that is to release a preview that will work atop the Beta2 release of Visual Studio 2005, which will be more robust, and will relax some of the limitations imposed by the dd file. At that time, we should also be in a position to be more precise about our plans until the end of the year.

    Now back to the first question: "Can I create a template of my own... right now?". The short answer is no. Well, at least we've provided no documentation and to do it manually can be a painstaking job. We have an internal tool that automates the process of creating the wizard templates from samples, but that would take some work to make available to customers. We are also not fixed on the current format for templates. I'd be very interested to hear of scenarios where customers would find the ability to create their own templates useful or essential. Who would the templates be for - yourself? Someone else? What kinds of language, or aspects of a language, would you want to bake into a template?

  • stuart kent's blog

    DSLs and customization

    • 2 Comments

    Fred Thwaites has asked a question in response to this post. He asks:

    "Does this imply that in general users of DSL in VS2005 will need to be aware of this metamodel, or will VS2005 come with a comprehensive set of predefined DSL's filling all the cells of Jack Greenfields software factory schema grid."

    and

    "Secondly, how fragmented do you feel DSL's will become? Do you expect that in most cases the VS supplied DSL will suffice, or do you see companies, departments and even projects regulally creating of extending DSL's."

    In answer to the first question, those who build DSL tools using our technology will be aware of the metamodel - or domain model, as we call it - which is a key part of defining such tools. Users of the tools so built will not need to be aware of this model, although, of course, they will need to understand the concepts in the domain which it encodes. However, we do not expect most tool-building users to start from scratch every time; instead we (and possibly others) will provide 'vanilla' (and not so vanilla) languages in the form of templates to get started with. If you try out the December release of the DSL Tools, or even just read through the walkthroughs, you will see that the process of building a designer begins by running a wizard, in which you can select a language template on which to base your own DSL. The set of templates is limited at the moment, but we have plans to expand it.

    In answer to the second question, a key motivation for the DSLs approach is that, as you get serious about automating aspects of your software development processes, you find that the 'off-the-shelf' languages/tools either don't quite fit your needs or fail completely to address your domain. So I fully expect companies, possibly departments and perhaps even projects to create, customize and extend DSLs, although in many cases they'll do so from an existing template - it won't be necessary to start from scratch very often.

    It is also worth noting that DSLs will evolve as requirements for automation evolve - for example, as the scope of code generation grows. The process might go something like this: I have a DSL for describing executable business processes, and I'm generating a fair chunk of my on-line systems from these models. This DSL has encoded some information very specific to my organization, for example the particular set of roles that my users can play and the particular set of channels through which they can interact with the business process. As these don't change very frequently, it's easier just to make a list of them as part of the definition of the DSL (this simplifies the use of the DSL, simplifies the code generated, etc.), rather than extend the DSL to allow new ones to be defined. If they need to be changed later, then the DSL can be updated, and a new version of the tools released (with appropriate migration tools as well). I then observe that by extending that DSL, or complementing it with another to describe security aspects, say (noting that the description of security parameters will need to reference elements of the executable business process model), I can extend the reach of my code generators to put in the necessary plumbing to handle security.

  • stuart kent's blog

    New blog

    • 0 Comments
    My friend Jean Bezivin, a well-known figure in the 'model engineering' research community, has started a blog. There are already a couple of good posts worth reading. I look forward to reading more.
  • stuart kent's blog

    Walkthroughs for DSL tools available

    • 5 Comments

    Three tutorial documents which 'walk through' various aspects of the December release of our DSL Tools are now available for download in a zip file.

    You can also navigate to them from the DSL Tools December release.

  • stuart kent's blog

    Why we view a domain model as a tree

    • 9 Comments

    In some feedback to my last posting, there was a question about why we visualized a domain model as a tree, with the point that this seemed inappropriate for some domain models. This is an interesting question, and warrants a more public response. So, here goes.

    • The visualization is a tree view of a graph of classes (actually, tree views of two graphs: the inheritance graph and the relationship graph). In the tool you have a good deal of control over how you want to view the graph as a tree.
    • The tree view certainly enables expand/collapse and automated layout, both of which we wanted in the tool.
    • But, perhaps most importantly, many of the behaviours required in a designer, and in other tools implemented using code generated from the domain model, require one to walk the graph as a tree (and often in different ways). Examples include: deletion behaviour, how a model appears in an explorer, how it gets serialized to XML, how constraint evaluation cascades through a model, and so on. We have also found that although the trees required for each of these can be subtly different, they all seem to be variations on some common, core tree. So we have chosen to make one tree central in the visualization, and to have editing experiences for defining variants of it. At the moment, the only variant supported is the definition of deletion behaviour.
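    The tree-walk idea in the last bullet can be illustrated with a toy serializer: the first visit to a node emits its definition in place, and any later visit emits a reference instead, much as 'use nodes' stand in for a class wherever its definition appears elsewhere. This is a hypothetical Python sketch, not the tool's actual serializer; the element names are invented:

```python
# Serialize an object graph by walking it as a tree: the first time a node
# is reached its definition is written inline; subsequent reachings emit a
# reference. Where a definition lands depends on walk order, mirroring the
# effect of 'Bring Definition Here'.

def to_xml(node, children, seen=None):
    seen = set() if seen is None else seen
    if node in seen:
        return f'<ref target="{node}"/>'   # already emitted: reference it
    seen.add(node)
    body = "".join(to_xml(c, children, seen) for c in children.get(node, []))
    return f'<{node}>{body}</{node}>'

# A graph, not a tree: Model owns PageA and PageB, and PageA also points at PageB.
children = {"Model": ["PageA", "PageB"], "PageA": ["PageB"]}
xml = to_xml("Model", children)
assert xml == '<Model><PageA><PageB></PageB></PageA><ref target="PageB"/></Model>'
```

    Note that PageB's definition lands under PageA simply because that walk reached it first; a different traversal order would move the definition and turn the other occurrence into the reference.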
  • stuart kent's blog

    The UML / DSL debate

    • 6 Comments

    There's been some debate between some of my MSFT colleagues (Alan Wills, Steve Cook, Jack Greenfield) and Grady Booch and others over at IBM around UML and DSLs. For those interested, Grady actually posted an article on UML and DSLs back in May last year - seems to have gone unnoticed. It touched on some of the themes that have been under discussion. My first blog entry was a response to this article.

    A particular theme that crops up in the discussion is the issue of tools versus language. Grady seems to want to keep the two separate, whereas we believe that the two are closely linked - a language designed to be used in a tool is going to be different to a language designed to be used on paper.

    I touched on this line of discussion in an article a few months back: http://blogs.msdn.com/stuart_kent/articles/181565.aspx

    An example I used was that of class diagrams, and I pointed out a couple of aspects of class diagrams which may have been designed differently had the original intention been to use them within a tool rather than on paper. Now I can point you at an example notation that addresses some of these issues. It is the notation we use for defining domain models in our DSL Tools. Here's a sample:

    Domain model for UIP Chart Language

    Nodes are classes, lines are relationships or inheritance arrows. Nodes are organized into relationship and inheritance trees, which can be expanded or collapsed, making it easy to navigate and drill into large models (not something you do on paper).

    A relationship line has the role information (would be association end information in UML 2) annotated in the middle of the line rather than at the ends: a role is represented as a triangle or rectangle containing a multiplicity, and the name of a role is annotated using a label (in the diagram above reverse role names have been hidden as they are usually less interesting).

    Annotating role information in the middle of the line not only makes it easier to automatically place labels, it also means that relationship lines can be channelled together, enabling an expandable relationship tree to be automatically constructed and laid out. An example of channelling is given by the diagram below, where you can see all the relationship lines sourced on Page channelled together so that they connect to Page at the same point:

    Domain model for UIP Chart Language

    The point here is that this notation was designed from the start using criteria such as 'must support autolayout' and 'must support easy navigation of large models'. If the criteria had been 'must be easily cut up into page-size chunks' and 'must be easy to sketch on a whiteboard', then the notation may well have turned out differently.

  • stuart kent's blog

    Tools to create, tools to consume, tools to check

    • 2 Comments

    As a number of my colleagues have already pointed out, the December download of the DSL Tools is now available. Three walkthroughs giving more detail on how to use these bits should also be available soon - hopefully before Christmas, if not early in the New Year.

    Not only do these bits allow you to create a graphical designer in Visual Studio for your own domain-specific language, but they also create the environment for executing text artefact templates against domain data created using your new designer. For example, such templates can be used to generate code from the data, or reports about it.

    Artefact templates are an example of tools that consume the data. A designer is an example of a tool that creates it. DSL tools are all about enabling organizations to automate more and more aspects of their software development process, using domain specific abstractions. But however good the designers or editors are for creating the domain data, if they don't provide easy access to that data by other tools then this won't be possible.

    There is work to do in this area. In the shorter term this includes persisting the data in a domain specific XML format, rather than the generic format used in the December bits, and providing the hooks for other tools to load persisted data and access it through a domain specific API (noting that we already generate the appropriate hooks to access the domain data through a domain specific API from within artefact templates). In the longer term, it's the authoring experience for building the consumption tools themselves that could be tackled. Examples of such tools would be tools for performing transformations between data in different domains, tools to synchronize and reconcile data across mappings, tools to simulate or animate behaviors specified by a DSL, and so on.
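    As a rough illustration of a consuming tool, here is a toy artefact template in Python. The real artefact templates have their own syntax and a generated domain-specific API; here `string.Template` and a list of dictionaries stand in for both, and all names are invented:

```python
# Sketch of a tool that consumes domain data: a toy artefact template that
# generates code from model elements created in a designer.

from string import Template

template = Template("class ${name}:\n    pass\n")

# Stand-in for domain data loaded from a persisted model.
model = [{"name": "Customer"}, {"name": "Order"}]

artefact = "".join(template.substitute(element) for element in model)

assert "class Customer:" in artefact
assert "class Order:" in artefact
```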

    A third kind of tool is a tool to check the well-formedness of data. This is somewhere between a tool that consumes data and a tool that creates data - checking can be an intrinsic part of both the creation and consumption processes. For example, data that is input to artefact generators may need to pass a set of validation checks (in addition to ones which are intrinsic to the domain data) to guarantee successful artefact generation.
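    A checking tool of this kind can be sketched in the same toy setting: run a list of constraints over the domain data, and only proceed to artefact generation when all of them pass. This is hypothetical code, not the DSL Tools API:

```python
# Sketch of a checking tool: constraints run over the domain data before
# artefact generation, which proceeds only if every constraint holds.

def unique_names(model):
    names = [e["name"] for e in model]
    return len(names) == len(set(names))

def names_are_identifiers(model):
    return all(e["name"].isidentifier() for e in model)

constraints = [unique_names, names_are_identifiers]

model = [{"name": "Customer"}, {"name": "Order"}]
errors = [c.__name__ for c in constraints if not c(model)]
assert errors == []   # safe to run the artefact generators

bad_model = [{"name": "Customer"}, {"name": "Customer"}]
assert [c.__name__ for c in constraints if not c(bad_model)] == ["unique_names"]
```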

    To conclude, the DSL Tools are not just about making it easy to build visual designers - tools for creating data - but also about making it easy to build tools that consume and check that data, in the context of automating aspects of the software development process. At the very least, we need to make sure that the data is easily accessed, both through APIs and through the way the data is persisted in XML files. Even better if we can provide direct support for authoring such tools, as we do with the text artefact template technology.

  • stuart kent's blog

    DSL workbench now live

    • 2 Comments

    I've just returned from OOPSLA and Redmond, and am pleased to see that the DSL tools workbench site is now live, delayed by a few teething problems with process - we should get better at this with each release of content. You can now download a preview version of the object model editor, which is accompanied by a document providing a walkthrough of that tool.

    We'll be putting up some more content over the next few weeks - whitepapers, a video, that kind of thing - and planning to drop another preview, this time including the wizard and code generators (see my earlier posting), around Christmas.

    When I've cleared the backlog of work from my two weeks away, I hope to return to more technical topics, probably starting with an explanation of the notation used in the object model editor.

  • stuart kent's blog

    What I've been working on

    • 6 Comments

    I can now point you at a couple of announcements about the technology I've been working on. Here's the Microsoft Press announcement:

    http://www.microsoft.com/presspass/press/2004/oct04/10-26OOPSLAEcosystemPR.asp

    The exciting aspect of this is that we're going to start making the technology available to the community as soon as we can. Here's the site to watch:

    http://lab.msdn.microsoft.com/vs2005/teamsystem/workshop/

    I'll post to my blog as soon as some content is available - should be before the end of the week.

    We gave a demo of some of the technology during Rick Rashid's keynote at the OOPSLA conference. Unfortunately I expect there'll be some flak about this - watching the keynote, our demo felt a bit like a 'commercial break'. I'll respond to the flak when it arrives.

    So not quite the launch I'd hoped for, but that shouldn't detract from the technology itself. Here's a quick heads-up on what we've been doing:

    • A wizard that creates a solution in VS which when built installs a graphical designer as a first class tool hosted in VS. The designer can be customized to the Domain Specific Language (DSL) of your choice.
    • The wizard allows you to choose from a selection of language templates. You can define your own templates based on designers you've already built.
    • Once you've run the wizard, you can edit a couple of XML files to customize the designer; code is generated from these files. One file allows you to define and customize the graphical notation and other aspects of the designer, like the explorer, properties window and so on. The other file defines the concepts that underpin the language in the form of an object model (a metamodel, if you're familiar with that term). We have a graphical designer for editing the object model.
    • You then just build the solution, hit F5 and get a second copy of VS open in which you can use the designer you've just built.

    Our first release, at the end of this week, will be a preview of the object model editor. Previews of the other components should be available by the end of the year. 

  • stuart kent's blog

    Microsoft at OOPSLA

    • 1 Comments

    Here's what Microsoft are doing at OOPSLA this year: http://msdn.microsoft.com/architecture/community/events/oopsla2004/

    Network connections permitting, I'll be blogging for the couple of days I'm there.

    OOPSLA will be an interesting experience for me this year. In the past I've attended and presented as a researcher from the academic community. I wonder if the experience will be any different now I'm on the other side of the fence. It will also be a great opportunity to meet up with old friends - especially since I was unable to make the UML conference, the first time I've missed it since its inception.

    And watch out for those announcements around software factories and domain specific languages...

  • stuart kent's blog

    Not going to UML conference, but will be at OOPSLA

    • 0 Comments
    I mentioned in an earlier post that I was giving a tutorial at the UML conference with Alan Wills. The tutorial is still happening, but unfortunately I won't have a hand in presenting it - I'm not able to make it to the conference. However, I will be at OOPSLA on Tuesday (26 Oct) and Wednesday (27 Oct).
  • stuart kent's blog

    UML, DSLs and software factories: let those analogies flow...

    • 9 Comments

    I typed this entry a few days ago, but then managed to lose it through a set of circumstances I'm too embarrassed to tell you about. It's always better second time around in any case.

    Anyway, reading this recent post from Simon Johnston prompted a few thoughts that I'd like to share. In summary, Simon likens UML to a craftsman's toolbox, which in the hands of a skilled craftsman can produce fine results. He then contrasts this with the domain-specific language approach and software factories, suggesting that developers are all going to be turned into production-line workers - no more craftsmen. The argument goes something like this: developers become specialized in a software factory to work on only one aspect of the product through a single, narrowly focussed domain specific language (DSL); they do their work in silos, without any awareness of what the others are doing; this may increase the productivity of individual developers, but leads to a less coherent solution.

    Well, I presume this is a veiled reference to the recent book on Software Factories, written by Jack Greenfield and Keith Short, architects in my product group at Microsoft, and to which I contributed a couple of chapters with Steve Cook. The characterization of software factories suggested by Simon is at best an over-simplification of the vision presented in this book.

    I trained as a mathematician. When constructing a proof in mathematics there are two approaches: go back to the original definitions, the first principles, and work out your proof from there; or build on top of theorems already proven by others. The advantage of the first approach is that all you have to learn is the first principles, and then you can set your hand to anything. The problem is that it will take you a very long time to prove all but the simplest theorems, and you'll continually be treading over ground you've trodden many times before. The problem with the second approach is that you have to learn a lot more, including new notations (dare I say DSLs), and you inevitably end up becoming a specialist in a particular branch of the subject; but in that area you'll be a lot more productive. And it is not unknown for different areas of mathematics to combine to prove some of the more sophisticated theorems.

    With software factories we're saying that to become more productive we need to get more domain specific, so that we can provide more focused tooling that cuts out the grunt work and lets us get on with the more challenging and exciting parts of the job. As with mathematics, the ability to invent domain-specific notations, and, in our case, the automated tools to support them, is critical to this enterprise. And sophisticated factories (that is, most of them) will combine expertise from different domains, both horizontal and vertical, to get the job done, just as different branches of mathematics can combine to tackle tricky problems.

    So our vision of software factories is closer to the desirable situation described by Simon towards the end of his article, where he talks about the need for a "coherent set of views into the problem". Each DSL looks at the problem, the software system being built or maintained, from a particular perspective. These perspectives need to be combined with the other views to give a complete picture. If developers specialize in one perspective or another, then so be it, but that doesn't mean that they can sit in silos and not communicate with the others in the team. There are always overlaps between views, and work done by one will impact the work of another. But having more specialized tooling should avoid a lot of error-prone grunt work, and will make the team as a whole far more productive as a result.

    So what about UML in all this? To return to Simon's toolbox analogy (and slightly tongue-in-cheek): UML is like having a single hand drill in the toolbox, which we've got to try and use to drill all sizes of hole (for large holes you drill a number of small holes close together) and in all kinds of material; some materials you won't be able to drill into at all. DSLs, on the other hand, are like having a toolbox full of drill bits of all different sizes, each designed to drill into a particular material. And in a software factory you support your DSLs with integrated tooling, which is like providing the electric hammer-drill: you'll be a lot more productive with these specialist tools, and can even do things you couldn't manage before, like drill holes in concrete.

    So I don't see UML as a central part of the software factory/DSL story. I see it first and foremost as a language for (sketching) the design of object-oriented programs - at least this is its history and its primary use to date. Later versions of UML, in particular the upcoming UML 2, have tried to extend its reach by adding to the bag of notations that it includes. At best, this bag is useful inspiration in the development of some DSLs, but I doubt very much that they'll get used exactly as specified in the standard - as far as conformance against the standard can be checked that is...

     

  • stuart kent's blog

    What does it mean to be MDA compliant?

    • 0 Comments

    I read on Jim Steel's blog that back in August there was lots of discussion on OMG lists about what makes a compliant MDA tool. I followed a link from his blog to this entry, and there were three aspects that intrigued me.

    • The top two proposed criteria for a conformant MDA tool require that such a tool use OMG modelling standards (UML, MOF, XMI) to represent and interchange models. Perhaps this shouldn't be a surprise, but I have seen the term MDA used as a catch-all term for anything remotely automated concerning models. Folks should be careful about using the term MDA: use a more generic term if you don't use OMG standards to do your modelling.
    • The fourth proposed criterion puts platform independence at the heart of MDA. Platform independence is just one possible benefit of abstraction; there are many others. A problem I have with the MDA concept is its narrow interpretation of the wider field of model driven development, or model driven engineering, or just model driven stuff, in particular its emphasis on platform independence.
    • Many of the proposed criteria are challenged by valid points made by the writer of this entry, illustrating how hard it is to actually pin down what it means to be compliant with MDA. And surely a standard is only a standard if you can test objectively and concretely what it means to be compliant with that standard?
  • stuart kent's blog

    Premature standardization

    • 5 Comments

    I used the phrase 'premature standardization' in an earlier post today. I'm rather pleased with it, as it is a crisp expression of something that has vexed me for some time, namely the tendency of standards efforts in the software space to transform themselves into a research effort of the worst kind - one run by a committee. I have certainly observed this first hand, where what seemed to be happening was not standardization of technologies that existed and were proven, but instead paper designs for technology that might be useful in the future. Of course, then I was an academic researcher so was quite happy to contribute, with the hope that my ideas would have a better chance of seeing the light of day as part of an industry standard than being buried deep in a research paper. I also valued the exposure to the concentration of clever and experienced people from the industry sector. But now, as someone from that sector developing products and worrying every day about whether those products are going to solve those difficult and real problems for our customers, I do wonder about the value of trying to standardize something which hasn't been tried and tested in the field, and, in some cases not even prototyped. To my mind, efforts should be made to standardize a technology when:

    • There are two or more competing technologies which are essentially the same in concept, but different in concrete form
    • The technologies are proven to work - there is sufficient evidence that the technologies can deliver the promised benefits
    • There is more value to the customer in working out the differences, than would be gained through the innovation that stems from the technologies competing head-to-head

    Even if all these tests come up positive, it is rarely necessary to standardize all aspects of the technology, just that part which is preventing the competing technologies from interoperating: a square plug really will not fit in a round hole, so my French electrical appliance cannot be used in the UK, unless of course I use an adaptor...

    If we apply the above tests to technologies for the development of DSLs, I'd say that we currently fail at least two of them. Which means that IMHO standardization of metamodelling and model transformation technologies is premature. We need a lot more innovation, a lot more tools, and, above all, many more customer testimonials that this stuff 'does what it says on the tin'.

  • stuart kent's blog

    UML conference

    • 3 Comments

    Two posts in the same day. I guess I'm making up for the two month gap.

    Anyway, Alan Wills and I are giving a tutorial at the UML conference in Lisbon in October. The tutorial is called "How to design and use Domain Specific Modeling Languages" and is on Tuesday 12th October in the afternoon. We promise not much presentation, but interesting exercises and lots of discussion and reflection.

  • stuart kent's blog

    Back from vacation, more on DSLs

    • 0 Comments
    It's been a long time since my last entry - those few who were following my blog have probably given up by now. The interruption in service has been due to (a) family vacation and (b) moving house. One of these days I'll wax lyrical about the inadequacies of the English system for buying and selling houses…

     

    Let's just recap where I've got to so far on the theme of modelling languages and tools. I started out with a reaction to an article by Grady Booch on DSLs, in particular why UML is not really the right tool for the job if this is the direction you want to go. I then talked about code generation, my thoughts prompted by an interesting article by Dan Haywood. Then, in the third entry, I talked about designing a visual language (strictly we should say pictorial or graphical language, as a textual language is also visual), focusing on the difference between designing one on paper and one to be used in a tool.

     

    So what next? Well I'd like to return to the topic of DSLs, in particular try to pin down what is meant by the term 'domain specific language', why we need them, and how we can make it easier to build them. As I seem to be incapable of writing short entries, I've hived off the main content to a separate article.  

     

  • stuart kent's blog

    More ruminations on DSLs

    • 4 Comments

    A domain specific language is a language that's tuned to describing aspects of the chosen domain. Any language can be domain specific, provided you are able to identify the domain it is specific to and demonstrate that it is tuned to describe aspects of that domain. C# is a language specific to the (rather broad) domain of OO software. It's not a DSL for writing insurance systems, though. You could use it to write the software for an insurance system, but it's not exactly tuned to that domain.

     

    So what is meant by the term 'domain'? A common way to think about domains is to categorize them according to whether they are horizontal or vertical. Vertical domains include, for example: insurance systems, telephone billing systems, aircraft control systems, and so on. Horizontal domains include, for example, the bands in the classic waterfall method: requirements analysis, specification, design, implementation, deployment. New domains emerge by intersecting verticals and horizontals. So, for example, there is the domain of telephone billing systems implementation, which could have a matching DSL for programming telephone billing systems.

     

    Domains can be broad or narrow, where broad ones can be further subdivided into narrow ones. So one can talk about the domain of real-time systems, with one sub-domain being aircraft control systems. Or the domain of web-based systems for conducting business over the internet, with a sub-domain being those particular to insurance versus another sub-domain of those dealing in electrical goods, say. And domains may overlap. For example, the domain of airport baggage control systems includes elements of real-time systems (the conveyor belts etc. that help deliver the luggage from the check-in desks to the aircraft) and database systems (to make a record of all the luggage checked in, its weight and who it belongs to, etc.).

     

    So there are lots of domains. But is it necessary to have a language specific to each of them? Couldn't we just identify a small number of general purpose languages that cover the broad domains, and just use those for the sub-domains as well?

     

    What we notice in this approach is that users demand general purpose languages that have extensibility mechanisms which allow the base language to be customized to narrower domains. There's always a desire to identify domain specific abstractions, because the right abstractions can help separate out the things that vary between systems in a domain and things that are common between them: you then only have to worry about the things that vary when defining systems in that domain.

     

    Two extensibility mechanisms in common use today are:

    • class inheritance and delegate methods, which allow one to create OO code frameworks;
    • stereotypes and tagged values in UML which provide primitive mechanisms for attaching additional data to models.

    These mechanisms take you so far, but do not exactly deliver customized languages that intuitively capture those domain specific abstractions - the problem is that the base language gets in the way. Using OO code frameworks is not exactly easy: it requires you to understand all or most of the mechanisms of the base language; then, although you get clues from the names of classes, methods and properties on where the extension points are, there is no substitute for good documentation, a raft of samples and understanding the framework architecture (patterns used and so on). Stereotypes and tagged values in UML are powerful in that you can decorate a model with virtually any data you like, but that data is generally unstructured and untyped, and often the intended meaning takes you a long way from the meaning of the language as described in the standard. Neither OO framework mechanisms nor UML extensibility mechanisms allow you to customize the concrete notation of the language, though some UML tools allow stereotypes to be identified with bitmaps that can be used to decorate the graphical notation.
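    To make the contrast concrete, here is a minimal sketch in Python (all names are hypothetical, not drawn from any real framework). The first mechanism is a framework extension point: the base class fixes the workflow and a subclass fills in the varying step. The second is the UML-style tagged-value approach: arbitrary key/value pairs attached to a model element, flexible but untyped.

```python
# Hypothetical sketch of the two extensibility mechanisms discussed above.

# 1. An OO framework extension point: the base class fixes the workflow,
#    and subclasses fill in the step that varies (template method pattern).
class ReportGenerator:
    def run(self):
        body = self.render_body()  # the extension point
        return f"== Report ==\n{body}"

    def render_body(self):
        raise NotImplementedError

class InsuranceReport(ReportGenerator):
    def render_body(self):
        return "policies: 3"

# 2. UML-style tagged values: arbitrary key/value pairs attached to a model
#    element. Flexible, but unstructured and untyped - nothing catches a
#    misspelled tag or a value of the wrong type.
model_element = {"name": "Customer", "tagged_values": {"persistent": "yes"}}

print(InsuranceReport().run())
print(model_element["tagged_values"].get("persistnet"))  # typo silently yields None
```

    The framework gives structure but demands knowledge of the base language's mechanisms; the tagged values demand nothing, and check nothing.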

     

    Instead of defining extensibility mechanisms in the language, why not just open up the tools used to define languages in the first place, either to customize an existing language or create a new one?

     

    Well, it could be argued that designing languages is hard, and tooling them (especially programming languages) even harder. And the tools used to support the language design process can only be used by experts. That probably is the case for programming languages, but I'm not sure it needs to be the case for (modelling) languages that might target other horizontal domains (e.g. design, requirements analysis, business modelling), where we are less interested in efficient, robust and secure execution of expressions in the language, and more interested in using them for communication, analysis and as input to transformations. Analysis may involve some execution, animation or simulation, but, as these models are not the deployed software, it doesn't have to be as efficient, robust or secure. Other forms of analysis include consistency checking with other models, possibly expressed in other DSLs, taking metrics from a model, and so on. Code generation is an obvious transformation that is performed on models, but equally one might translate models into other (non-code) models.

     

    It could also be argued that having too many languages is a barrier to communication - too much to learn. I might be persuaded to agree with that statement, but only where the languages involved are targeted at the same domain and express the same concepts differently for no apparent reason (e.g. UML reduced the number of languages for sketching OO designs to one). Though it is worth pointing out that just having one language in a domain can lead to stagnation, and for domains where the languages and technologies are immature, inevitably there will be a plethora of different approaches until natural selection promotes the most viable ones - unless of course this process is interrupted by premature standardization :-). On the other hand, where a language is targeted on a broad domain, and then customized using its own extensibility mechanisms, the result carries a whole new layer of meaning (OO frameworks, stereotypes in UML), or even an entirely different meaning (some advanced uses of stereotypes). In the former case, there is a chance that someone who just understands the base language might be able to understand the extension without help; in the latter case, I'd argue that the use of the base language can actually hinder understanding, as it replaces the meaning of existing notation with something different.

     

    Finally, whether we like it or not, people and organizations will continue to invent and use their own DSLs. Some of these may never ever get completed and will continue to evolve. Just look at the increasing use of XML to define DSLs to help automate the software development process - input to code generators, deployment scripts and so on. Yes, XML is a toolkit for defining DSLs; it's just that there are certain things missing: you can't define your own notation, certainly not a graphical one; the one you get is verbose; validation of well-formedness is weak.
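    As a small illustration of that weakness, here is a sketch in Python (the element names are made up): the parser accepts any well-formed document for free, while the domain rules - say, that every state machine must declare an initial state - have to be checked by hand-written code.

```python
# Sketch: XML well-formedness is checked for free, but domain validation
# for an XML-based DSL must be hand-rolled (element names are hypothetical).
import xml.etree.ElementTree as ET

doc = """<stateMachine name="Door">
  <state name="Open"/>
  <state name="Closed"/>
</stateMachine>"""

root = ET.fromstring(doc)  # succeeds: the document is well-formed XML

def validate(machine):
    """Domain rules that well-formedness checking cannot express."""
    errors = []
    if not machine.findall("state"):
        errors.append("state machine has no states")
    if not machine.findall("initial"):
        errors.append("no initial state declared")
    return errors

print(validate(root))  # ['no initial state declared']
```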

     

    Am I going to tell you what a toolkit for building domain specific modelling languages should look like? Soon I hope, but I've run out of time now. And I'm sure that some folks reading this will give feedback with pointers to their own kits.

     

    One parting thought. In this entry, I have given the impression that you identify a domain and then define one or more languages to describe it. But perhaps it's the other way round: the language defines the domain…

  • stuart kent's blog

    Designing graphical notations: for paper or tools?

    • 1 Comments

    In this posting I continue on the theme of designing tools and notations to support modeling.

    On and off I've spent the last eight years thinking about the design of graphical (as opposed to purely textual) notations for use in software development. Until I started to build tools to support these notations, my thinking was unintentionally skewed towards how a notation would be used on paper or a whiteboard. As soon as I began building tools I realized that there's a whole range of facilities for viewing, navigating and manipulating models, available in a tool but not on paper or whiteboard. Perhaps I should have realized this from just using modeling tools, but it seems that something more was required to make it sink in! Anyway, I have also noticed that these facilities cannot always be exploited if they are not taken into account when the notation is designed: there is a difference between designing a notation to be used in a tool and one to be used on paper. I have written a short article which lays out the argument in a little more detail, and gives a couple of concrete examples. As always, your feedback is valued.

  • stuart kent's blog

    Designing notations for use in tools

    • 8 Comments

    Tools make available a whole range of facilities for viewing, navigating and manipulating models through diagrams, which are not available when using paper or a whiteboard. Unfortunately, these facilities cannot always be exploited if they are not taken into account when the notation is designed: there is a difference between designing a notation to be used in a tool and one to be used on paper.

     

    Some of the facilities available in a tool which are not available on paper:

    • Expand/collapse. The use of expand/collapse buttons to show/hide detail inside shapes, descendants in tree layouts of node-line diagrams, and so on.
    • Zoom in/out. The ability to focus in on one area of a diagram revealing all the detail, or zoom out just to see the diagram at various scales.
    • Navigation. The ability to navigate around a diagram, or between diagrams, by clicking on a link, using scroll bars, searching, and so on.
    • (Virtually) no limit on diagram size (though limited screen size). Paper is physically limited in size, so one is forced to split a large diagram over multiple sheets; there is no choice. In contrast, a tool can hold a very large virtual diagram, even if one only has a screen-sized window to look at it. And the screen is not such a limitation when combined with expand/collapse, zoom and good navigation facilities.
    • Automatic construction/layout of diagrams. The ability to construct a diagram just from model information (the user doesn't have to draw any shapes explicitly at all), and/or the ability to layout shapes on a diagram surface. The best diagram is one which is created and (intuitively) laid out completely automatically.
    • Explorer. Provides a tree view on a model, which can be as simple as an alphabetical list of all the elements in the underlying model, possibly categorized by their type. The explorer can provide an index into models, which provides a valuable aid to navigation in many cases.
    • Properties window. A grid which allows you to view and edit detailed information about the selected element.
    • Tooltips. Little popups that appear when you hover over symbols in a diagram. They can be used to reveal detailed information that otherwise can remain hidden.
    • Forms. An alternative way of viewing and editing detailed information about the model, selected elements or element.
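    As an illustration, here is a minimal sketch in Python (the model classes are hypothetical) of how a tool might treat the first of these, expand/collapse, as a query over the underlying model, with the diagram surface rendering only the nodes the query yields.

```python
# Hypothetical sketch: expand/collapse as a query over the model - the
# diagram surface renders exactly the nodes this function yields.
class Node:
    def __init__(self, name, children=(), collapsed=False):
        self.name = name
        self.children = list(children)
        self.collapsed = collapsed

def visible(node):
    """Names of the nodes shown on the diagram, honouring collapsed flags."""
    names = [node.name]
    if not node.collapsed:
        for child in node.children:
            names.extend(visible(child))
    return names

active = Node("Active", collapsed=True,
              children=[Node("Dialing"), Node("Connected")])
diagram = Node("Phone", children=[active, Node("Idle")])

print(visible(diagram))  # ['Phone', 'Active', 'Idle']
```

    Keeping the show/hide state in the model rather than the picture is what lets the same model drive the explorer, tooltips and forms as well as the diagram.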

    To see how designing a notation for use in a tool can lead to different results than if the focus is paper or whiteboard use, let's take a look at two well-known notations: UML class diagrams and UML state machines. The former, I would argue, has not been well designed for use in a tool; whereas the latter benefits from features which can be exploited in a tool.

     

    Class diagrams. A common use of class diagrams is to reverse engineer the diagram from code. This helps to understand and communicate designs reflected in code. Industrial scale programs have hundreds of classes; frameworks even more. Ideally, one would like a tool to create readable and intuitive diagrams automatically from code. However, the design of the notation militates against this. Top of my list of problems is the fact that the design of associations, with labels at each end, precludes channeling on lines, where channeling allows many lines to join a node at the same endpoint (inheritance arrows are often channeled to join the superclass shape at the same point). Because labels are placed at the ends of association lines, each line has to touch a class shape at a different end point in order to display the labels. This exacerbates line crossing and often means that class nodes need to be much larger than they ought to be, making diagrams less compact than they ought to be and a good automatic layout much harder to achieve.

     

    State Machines. State machines can get large with many levels of nesting. This problem can be mitigated using zoom facilities, by a control that allows one to expand/collapse the inside of a state or a link that launches the nested state machine in a new window. As transitions can cross state boundaries, using any of these facilities means that you'll need to distinguish between when a transition (represented by an arrow) is sourced/targeted on a state nested inside and currently hidden from view, or on the state whose inside has been collapsed. This distinction requires a new piece of notation. In UML 1.5, the notation of stubbed transitions (transitions sourced or targeted on a small solid bar) was introduced to distinguish between a transition which touched a boundary and one which crossed it. Interestingly, in UML 2 this notation has been replaced by notation for entry/exit points. In this scheme, one can interrupt a transition crossing the boundary of a state by directing it through an entry or exit point drawn on the edge of the state. This is the state machine equivalent of page continuation symbols that one gets with flowcharts. However, it can be used to assist with expand/collapse in a tool: collapsing the inside of a state, leaves the entry/exit point visible, so one can still see the difference between transitions which cross the boundary and those which don't. In one sense, this isn't quite as flexible a solution as the solid bar notation, as, for expand/collapse to work in all cases, it requires all transitions that cross a boundary to be interrupted by an entry/exit point. I guess, a tool could introduce them and remove them as needed, but I must confess that I prefer the stubbed transition notation for the expand/collapse purpose - seems more natural somehow.
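    The bookkeeping a tool needs for this is simple enough to sketch. In the hypothetical Python model below, collapsing a composite state means any transition with an endpoint hidden inside it must be redrawn as a stub (or routed through an entry/exit point):

```python
# Hypothetical sketch: find the transitions that need stub notation when a
# composite state is collapsed - those with an endpoint hidden inside it.
children = {"Active": {"Dialing", "Connected"}}
transitions = [("Idle", "Dialing"), ("Connected", "Idle"), ("Idle", "Active")]

def needs_stub(transition, collapsed_state):
    """True if either endpoint is nested inside the collapsed state."""
    hidden = children[collapsed_state]
    source, target = transition
    return source in hidden or target in hidden

print([t for t in transitions if needs_stub(t, "Active")])
# ('Idle', 'Active') touches the boundary but does not cross it,
# so it keeps its ordinary arrow.
```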

     

    To conclude, designing a notation for use in a tool can lead to different decisions than if the focus is on paper or whiteboard use. However, much as we might like to think that a notation will only be used in a tool, there will always be times when we need to see or create it on paper or a whiteboard, and this has to be balanced against the desire to take advantage of what tools can offer. For example, it is always a good idea to incorporate symbols to support 'page continuation', which will make it easier to provide automated assistance for cutting a large virtual diagram into page-sized chunks (and, as we have seen, such symbols can also support other facilities like expand/collapse). And it is always worth considering whether the notation is sketchable. If not, it may be possible to define a simplified version which is. For example, one could provide alternatives for sophisticated shapes or symbols, that look great in a tool, but are very hard to use when sketching.

  • stuart kent's blog

    Storyboarding

    • 2 Comments

    An important part of my job is to analyze requirements and write specifications. I have found the technique of storyboarding to be extremely effective. A storyboard is a concrete rendition of a particular scenario - could be a use case or an XP story (see Martin Fowler on UseCasesAndStories for an explanation of the difference). Not only are storyboards great for sorting out requirements, they are also effective in communicating what needs to be built to developers, and as a starting point for the development of tests.

    If what you are building is exposed to the user through UI, then the most likely form of your storyboard is a click-through of the UI. If what you are building is infrastructure, say an API, then your storyboard might be a click-through of the experience a developer goes through in authoring code that uses the API, or it might be a filmstrip - a series of object diagrams that illustrate what happens to the state of the system as methods are executed.

     

    Tools that I have found to be particularly effective in developing storyboards, especially those that click through a UI experience, are PowerPoint and Visio. You can really bring storyboards alive in PowerPoint using a combination of moving through slides, using its custom animation capabilities, and by mixing and matching graphics from multiple sources. And the Windows XP template in Visio is pretty useful too. I've put up an article with specific hints and tips on using PowerPoint and Visio for this purpose.

     

    Interestingly, I soon move from sketching a storyboard on paper or the whiteboard to committing it to PowerPoint, as the latter imposes a discipline that forces you to make decisions and care about the detail. It's so easy to continue hand-waving and putting off those hard decisions at the whiteboard. I don't spend a large amount of time getting the UI 'just so'. It just needs to be good enough to tease out those important decisions about what the requirements actually are.

     

    If anyone has suggestions for alternative tools that could be used for this purpose, then please add them as comments to this post.

  • stuart kent's blog

    Hints and tips for using PowerPoint and Visio for storyboarding

    • 3 Comments

    Here are a few techniques I have found useful for building storyboards or click-throughs using PowerPoint and Visio. If you have further suggestions please add them as comments to this article.

     

    Making parts appear and disappear

    Use custom animation in PowerPoint. You can change the order in which things appear/disappear, and decide whether the effect should happen on mouse click or automatically after the previous effect. Alternatively, copy the slide and add/remove the part to/from the new slide; the animation then happens through slide transitions.

     

    Don't do everything in one slide

    Otherwise the animation will become unmanageable. I tend to have one slide per step in the scenario. Use your common sense to decide on the granularity of steps.

     

    Use 'callouts' to comment on aspects of the scenario

    A callout is a text box with a line attached that can point to a particular aspect of the graphic. I tend to color my text boxes light yellow (like a yellow post-it note). Remember callouts can be made to appear and disappear too. Using callouts saves having separate slides of text notes, which avoids breaking up the flow, and the notes can be read in context.

     

    How to get a mouse pointer to move

    Paste in a bitmap of the pointer. Select the bitmap. Use SlideShow > Custom Animation > Add Effect > Motion Paths to define a path along which the pointer should move.

     

    Build/get a graphic for your application shell

    A graphic of the application shell can be used as a backdrop for all the other animation. Use Alt-PrintScreen to create a bitmap of the shell for your running application. This works if you're building a plugin for an existing application, or you have an existing application with a similar shell to the one you're storyboarding. Alternatively build a graphic of the shell using the Windows XP template in Visio.

     

  • stuart kent's blog

    On code generation from models

    • 7 Comments
    In a recent article, Dan Haywood introduced two kinds of approaches to MDA: translationist and elaborationist. In the former approach 100% of the code is generated from the model; in the latter approach some of the code is generated and then hand finished. He gives examples of tools and companies following each of these approaches.

     

    Underlying Dan's article seemed to be the assumption that models are just used as input to code generation. To be fair, the article was entirely focused on the OMG's view of model driven development, dubbed MDA, which tends to lean that way. My own belief is that there are many useful things you can use models for, other than code generation, but that's the topic of a different post. I'll just focus here on code generation.

     

    So which path to follow? Translationist or elaborationist?

     

    In the translationist approach, the model is really a programming language and the code generator a compiler. Unless you are going to debug the generated (compiled) code, this means that you'll need to develop a complete debugging and testing experience around the so-called modeling language. This, in turn, requires the language to be precisely defined, and to be rich enough to express all aspects of the target system. If the language has graphical elements, then this approach is tantamount to building a visual programming language. The construction of such a language and associated tooling is a major task that requires specialist skills. It will probably be done by a tool vendor in domains where there is enough of a market to warrant the initial investment. Indeed, one doesn't have to look far for examples. There are several companies who have built businesses on the back of this approach to MDA, especially in the domain of real-time, embedded systems. And, for obvious reasons, they have been leading efforts to define a programming language subset of UML, called Executable UML, xUML or xtUML, depending on which company you talk to.

     

    In contrast, the elaborationist approach to code generation does not require the same degree of specialist skill or upfront investment. It can start out small and grow organically. However, there are pitfalls to watch out for. Here's some that I've identified:

    • Be careful to separate generated code from handwritten code so that when you regenerate you do not overwrite the handwritten code. If that is not possible, e.g. because you have to fill in method bodies by hand, then there are mitigation strategies one can use. For example, you can use the source control system and code diff tools to forward integrate handwritten code in the previous version to the newly generated version.
    • Remember that you will be testing and debugging your handwritten code in the context of the generated code. This means that your developers can not avoid coming into contact with the generated code. So make the generated code as understandable as possible. Simple generated code that extends well factored libraries (as opposed to generated code that starts from low-level base classes) can make a big difference.
    • The code generator itself will need testing and debugging, especially in the early stages. It should be written in a form that is accessible to your developers and allows the use of testing and debugging tools.
    • Manage your models, like you manage code. Check them into the source control system and validate them as much as you can. The amount you can validate the models depends on the tools you're using to represent them. You could just choose to represent the models as plain XML, in which case the definition of your modeling language might be an XSD, so you can validate your models against the XSD. If you choose to represent your models as UML, then it is likely that you'll also be using stereotypes and tagged values to customize your modeling language (see an earlier post). In general, UML tools don't do a good job of validating whether models use stereotypes and tagged values in the intended way, so resort to inspection or build validation checks into your code generator instead.
    • Remember that 'code' is not just C# or Java. Run-time configuration files, build scripts, indeed any artifact that needs to be constructed in order to build and deploy the system, count as code.
    • Remember that the use of code generators is meant to increase productivity. So look for those cases where putting information in a model and generating code will save time and/or increase quality. Typically you'll be building your system on top of a code framework, and your code generator will be designed to take the drudgery out of completing that framework, and prevent human errors that often accompany drudgery. For example, look for cases where you can define a single piece of information in a model, that the generator then injects into many places in the underlying code. Then, instead of changing that piece of information in multiple places, you just change it once in the model and regenerate.
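    That last point is easiest to see in code. Here is a minimal sketch in Python (the model shape and template are entirely hypothetical): one fact in the model - the entity name - is injected into several places in the generated file, so a rename happens once in the model instead of in every spot it appears in code.

```python
# Hypothetical sketch of elaborationist code generation: a trivial model
# (a dict) drives a template; the generated file is clearly marked so it
# is never edited by hand.
from string import Template

TEMPLATE = Template('''\
// <auto-generated> do not edit; regenerate from the model instead
class ${name}Repository {
    ${name} FindById(int id) { /* hand-finished elsewhere */ }
    void Save(${name} item) { /* hand-finished elsewhere */ }
}
''')

def generate(model):
    """Inject the single model fact into every place it is needed."""
    return TEMPLATE.substitute(name=model["name"])

code = generate({"name": "Customer"})
print(code)
```

    Renaming the entity is then a one-line change to the model followed by regeneration, rather than a hunt through the codebase.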

    Of course, we have been talking to our customers and partners about their needs in this area. But we're always keen to receive more feedback. If you've been using code generation, then I'd like to hear from you. Has it been successful? What techniques have you been using to write the generators? To write the models? What pitfalls have you encountered? What development tools would have made the job easier?

  • stuart kent's blog

    Great article on MDA

    • 1 Comments

    Here's a great article on MDA that my colleague Steve Cook pointed me at:

    http://www.theserverside.com/articles/article.tss?l=MDA_Haywood

    A couple of highlights for me are:

    • The distinction between the elaborationist and translationist views of MDA.
    • A detailed critique of why the PIM/PSM distinction, so central to the OMG's vision of MDA, places the emphasis in the wrong place.