SharePoint Development from a Documentation Perspective

Andrew May

February, 2005

  • Andrew May's WebLog

    OneNote: An In-Depth Look at the OneNoteImporter Managed Assembly (Part 5 of 5)


    In this series of entries, we're taking an in-depth look at the OneNoteImporter managed class, which provides an object model interface for the programmability functionality added in OneNote 2003 SP 1.

    Read part one here.

    Read part two here.

    Read part three here.

    Read part four here.

    Object Model Maps

    The following figures diagram the OneNoteImporter assembly object model, including abstract classes and inheritance. The diagrams mainly document how the objects in the assembly relate to each other. In most cases, when a member takes an object as a parameter, or returns an object, that object is included on the diagram. Value types, such as string or integers, are for the most part not displayed.

    For the sake of clarity, the following object information, pertaining to methods that most of the classes have, has been left off the diagrams:

    ·         The Clone method returns the type of object from which you call it.

    ·         The Equals method takes a System.Object as a parameter.

    ·         The GetHashCode method returns a System.Int32 object suitable for use in hashing algorithms and data structures like a hash table.

    ·         Inheritance from System.Object is not shown.

    Figure 2. The Application Object (and Legend)



    Figure 3. The ImportNode Abstract Class, and Page Class

    Figure 4. The PageObject Abstract Class, and Derived Classes

    Figure 5. The OutlineContent Abstract Class, and Derived Classes

    Figure 6. The Data Abstract Class, and Derived Classes


    The OneNoteImporter managed assembly provides a convenient and multi-functional ‘wrapper’ for working with the SimpleImporter and command line functionality in OneNote 2003 SP1. Moreover, using the provided source files for the assembly, a developer can customize and extend the classes as required for a particular application.

  • Andrew May's WebLog

    OneNote: An In-Depth Look at the OneNoteImporter Managed Assembly (Part 1 of 5)


    A while back I wrote a series of entries dealing with how you can use Donovan Lange's OneNoteImporter managed class to make importing content into OneNote 2003 SP1 even easier. To recap: Donovan's OneNoteImporter managed class assembly provides an object model interface for the programmability functionality added in OneNote 2003 SP 1. Both the Send to OneNote from Outlook and Send to OneNote from Internet Explorer PowerToy add-ins actually use the OneNoteImporter class in their source code.

    You can download the OneNoteImporter class source code from Donovan's blog here.

    You can read my initial series of blog entries about the OneNoteImporter class here: part one, and part two.

    For the next week or so I'll be running a series of entries that examine the OneNoteImporter class in more detail, and really get 'under the hood' on how the class is structured and functions.


    The OneNoteImporter managed class assembly provides an object model interface for the programmability functionality added in OneNote 2003 Service Pack (SP) 1.

    The classes in this assembly enable you to:

    ·         Import content into OneNote SP1 using the SimpleImporter.Import method without having to explicitly author the XML string this method takes as a parameter. The OneNote managed assembly does this for you.

    You do not need to be familiar with the SimpleImporter class, or the XML schema used by its Import method, to use the OneNoteImporter managed assembly. However, for more information about the SimpleImporter class and its operation, see Importing Content into Microsoft Office OneNote 2003 SP1.

    ·         Navigate to a specific page in OneNote using the SimpleImporter.NavigateTo method.

    This is also covered in the article referenced above.

    ·         Use class methods to invoke some of the more popular command line switches you can use to customize how the OneNote application launches.

    For more information on customizing OneNote 2003 using command line switches, see Customizing OneNote 2003 SP 1 Using New Command Line Switches.

    This article presents a detailed discussion of the internal design and function of the OneNoteImporter classes, in case a developer wants to modify the classes, or just know more about how they operate internally. For a general discussion of how to use the public members of the OneNoteImporter managed assembly, see OneNote Import Managed Assembly: The Quick Rundown (part one and part two).

    Source files for the assembly are available from Donovan Lange's blog here. Developers are encouraged to modify the OneNoteImporter assembly as they desire and redistribute it with their application.

    Note To avoid compatibility issues with other versions of the assembly that might be loaded on the user’s computer, include the .dll in your application directory, rather than the system directory.

    There are several basic steps in using the OneNoteImporter assembly to import content into OneNote:

    ·         Create the page onto which you want to import content

    ·         Create and add the content to the page

    ·         Import the page into the desired OneNote location

    We’ll discuss the internal operation of the assembly classes during each of these steps.

    The OneNoteImporter assembly also enables you to update and delete pages or specific content on them, as long as you know their unique identifier. This is discussed in detail later in this article.

    It’s worth noting at this point that the OneNoteImporter assembly is designed to import a single OneNote page at a time. The OneNote.SimpleImporter class can take an XML string that includes content to be imported onto multiple pages. However, when using the OneNoteImporter assembly, you create a separate XML string for each page you want to import.

    Examining the Classes

    Before discussing how the classes in the assembly function internally, let’s briefly look at the abstract classes that form the basis of the assembly. There are four such classes, from which almost all of the other assembly classes derive: ImportNode, PageObject, OutlineContent, and Data.

    The ImportNode Class

    The ImportNode class is the base class for the entire assembly. Almost all the other classes inherit from it, including the other three abstract classes. As mentioned before, the SimpleImporter.Import method takes an XML string composed of elements that detail the OneNote page and contents you want to import. The ImportNode represents a single node (or element) in this XML structure, such as a page, or an object on a page.

    This class provides several important pieces of common functionality:

    ·         Serialization: The ImportNode contains an abstract method, SerializeToXml, that classes derived from it must implement. This ensures that each derived class contains the means to serialize itself into XML for the XML string passed to the SimpleImporter.Import method. We discuss how the various classes implement this method later in the article.

    ·         Selection for importing: This class also contains an internal Boolean property, CommitPending, that denotes whether or not to include this object in the XML string passed to the SimpleImporter.Import method. By default, all objects based on the ImportNode or its derived classes have their CommitPending property set to True when they are constructed. This denotes that the object has not yet been imported into OneNote. We also discuss how and when an object’s CommitPending property is changed later in this article.
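    To make those two responsibilities concrete, here is a minimal sketch in Python rather than the assembly's own .NET code. The names ImportNode, commit_pending, and serialize_to_xml mirror the members described above; the TextNode class and the Text element are made up purely for illustration.

```python
from abc import ABC, abstractmethod
import xml.etree.ElementTree as ET


class ImportNode(ABC):
    """Hypothetical Python analogue of the abstract ImportNode class."""

    def __init__(self):
        # New nodes start out pending, because they have not been imported yet.
        self.commit_pending = True

    @abstractmethod
    def serialize_to_xml(self, parent):
        """Append this node's XML to the given parent element."""


class TextNode(ImportNode):
    """Made-up concrete node, used only to show the contract."""

    def __init__(self, text):
        super().__init__()
        self.text = text

    def serialize_to_xml(self, parent):
        element = ET.SubElement(parent, "Text")
        element.text = self.text
        return element


root = ET.Element("Import")
node = TextNode("hello")
# Only nodes that are still pending are written into the import XML.
if node.commit_pending:
    node.serialize_to_xml(root)
xml_string = ET.tostring(root, encoding="unicode")
```

    Because serialize_to_xml is abstract, the base class cannot be instantiated on its own, which is the same guarantee the abstract SerializeToXml method gives in the assembly.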

    The PageObject class

    The abstract PageObject class represents an object that can be added, updated, or deleted from the specified OneNote notebook page. There are three classes derived from the PageObject class in the assembly:

    ·         ImageObject, which represents an image, such as a jpeg or gif, on a OneNote page

    ·         InkObject, which represents ink on a OneNote page

    ·         OutlineObject, which represents an outline on a OneNote page. OutlineObject objects are actually composed of other objects, such as images, ink, and text described in HTML format.

    The SerializeToXml method is implemented in the PageObject class. It serializes the properties common to all PageObject classes:

    ·         Whether to delete the object

    ·         The object’s unique identifier

    ·         The object’s position on the page

    It then calls the abstract PageObject method SerializeObjectToXml. All classes derived from PageObject must implement this method, which serializes the specific attributes of each derived class.
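    This split between a shared serializer and a class-specific hook is essentially the template method pattern. The Python sketch below shows the shape of it under some assumptions: the Object and Image element names and the guid, x, y, and path attribute names are illustrative, and the identifier is a placeholder string.

```python
from abc import ABC, abstractmethod
import xml.etree.ElementTree as ET


class PageObject(ABC):
    """Hypothetical analogue of PageObject; attribute names are illustrative."""

    def __init__(self, object_id, x, y):
        self.object_id = object_id
        self.x = x
        self.y = y
        self.delete_pending = False

    def serialize_to_xml(self, parent):
        # Common part: the identifier, then either a deletion or a position.
        element = ET.SubElement(parent, "Object", {"guid": self.object_id})
        if self.delete_pending:
            ET.SubElement(element, "Delete")
            return element
        ET.SubElement(element, "Position", {"x": str(self.x), "y": str(self.y)})
        # Class-specific part, supplied by each derived class.
        self.serialize_object_to_xml(element)
        return element

    @abstractmethod
    def serialize_object_to_xml(self, element):
        """Append the XML that is specific to the derived class."""


class ImageObject(PageObject):
    def __init__(self, object_id, x, y, path):
        super().__init__(object_id, x, y)
        self.path = path

    def serialize_object_to_xml(self, element):
        ET.SubElement(element, "Image", {"path": self.path})


def child_tags(page_object):
    """Serialize one object and list the tags generated beneath it."""
    container = ET.Element("PlaceObjects")
    element = page_object.serialize_to_xml(container)
    return [child.tag for child in element]


image = ImageObject("{hypothetical-id}", 36, 72, "C:\\temp\\logo.gif")
tags_when_placed = child_tags(image)
image.delete_pending = True
tags_when_deleted = child_tags(image)
```

    The derived class never has to repeat the identifier, deletion, or position logic; it only describes what makes it different.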

    The OutlineContent class

    The abstract OutlineContent class represents an object that is part of an outline. There are three classes derived from the OutlineContent class in the assembly:

    ·         HtmlContent, which represents text on a OneNote page, described in HTML format.

    ·         InkContent, which represents ink that is part of an outline.

    ·         ImageContent, which represents an image that is part of an outline.

    The SerializeToXml method is abstract in this class; each class derived from the OutlineContent class must implement the serialization process.

    The Data class

    The final abstract class, Data, represents the actual data of the object to be imported. For example, for an image, this would be either the path to the image file, or the base 64-encoded data of the image itself. There are three concrete derived classes for the different types of data content an object can represent:

    ·         BinaryData, which represents ink or image data that is base-64 encoded.

    ·         FileData, which represents the path to a source file.

    ·         StringData, which represents HTML content.

    The SerializeToXml method is abstract in this class; each class derived from the Data class must implement the serialization process.
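    As a rough illustration of why three separate data classes are useful, the Python sketch below gives each one its own way of writing itself into XML. The class names follow the assembly, but the Data and File element names and the path attribute are assumptions made for this example.

```python
import base64
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod


class Data(ABC):
    """Hypothetical analogue of the abstract Data class."""

    @abstractmethod
    def serialize_to_xml(self, parent):
        """Append this data to the parent element and return the new element."""


class BinaryData(Data):
    """Ink or image bytes, written out as base-64 text."""

    def __init__(self, raw):
        self.raw = raw

    def serialize_to_xml(self, parent):
        element = ET.SubElement(parent, "Data")
        element.text = base64.b64encode(self.raw).decode("ascii")
        return element


class FileData(Data):
    """A path to a source file; the file itself is not read here."""

    def __init__(self, path):
        self.path = path

    def serialize_to_xml(self, parent):
        return ET.SubElement(parent, "File", {"path": self.path})


class StringData(Data):
    """HTML content carried as a string."""

    def __init__(self, html):
        self.html = html

    def serialize_to_xml(self, parent):
        element = ET.SubElement(parent, "Data")
        element.text = self.html  # ElementTree escapes the markup on output
        return element


holder = ET.Element("Image")
binary_element = BinaryData(b"\x89PNG").serialize_to_xml(holder)
file_element = FileData("C:\\temp\\logo.gif").serialize_to_xml(holder)
string_element = StringData("<p>Hello</p>").serialize_to_xml(holder)
```

    The object that owns the data only ever calls serialize_to_xml, so it does not need to know which of the three forms it was given.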

    In the next entry, we'll look at creating objects.

    Read part two here.

  • Andrew May's WebLog

    OneNote: An In-Depth Look at the OneNoteImporter Managed Assembly (Part 3 of 5)


    In this series of entries, we're taking an in-depth look at the OneNoteImporter managed class, which provides an object model interface for the programmability functionality added in OneNote 2003 SP 1.

    Read part one here.

    Read part two here.

    Importing Objects into OneNote

    The actual creation of the XML import document, and importing the page contents, takes place when you call the Page.Commit method. This method, in turn, invokes a number of methods in other OneNoteImporter objects. Because of this method’s importance and complexity, it’s worth examining how the method functions.

    First, the code checks to see if the page has changed in any way from the last time it was imported. It does this by determining if the Page object’s CommitPending property is set to True. If it is, it calls the SimpleImporter.Import method.

    The code calls the Page.ToString method to generate the XML string it passes to the Import method. The ToString method in turn calls the Page.SerializeToXml method.

    This begins a series of recursive calls to the SerializeToXml methods of the various objects. Each object’s SerializeToXml method includes instructions to call the SerializeToXml method of any child objects, and append the resulting XML to the parent element. This in turn invokes the SerializeToXml method of any child objects the original child object might have, and so on, until the entire page structure has been serialized to XML in a single XML document.
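    The recursion is easier to see in a stripped-down form. This Python sketch uses a single generic Node class and made-up tag names; it is only meant to show how each node appends itself to its parent and then hands control to its own children.

```python
import xml.etree.ElementTree as ET


class Node:
    """Illustrative tree node; the tag names used below are made up."""

    def __init__(self, tag):
        self.tag = tag
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        return child

    def serialize_to_xml(self, parent=None):
        # Create this node's element, attached to the parent when there is one.
        if parent is None:
            element = ET.Element(self.tag)
        else:
            element = ET.SubElement(parent, self.tag)
        # Recurse: each child appends itself (and its own children) to us.
        for child in self.children:
            child.serialize_to_xml(element)
        return element


page = Node("Page")
outline = page.add_child(Node("Outline"))
outline.add_child(Node("Html"))
outline.add_child(Node("Image"))
root = page.serialize_to_xml()
document = ET.tostring(root, encoding="unicode")
```

    One call on the top-level node is enough to produce the whole document, which is why Page.ToString only needs to call Page.SerializeToXml.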

    The Page.SerializeToXml begins by creating a new XmlDocument object, and generating <Import> and <EnsurePage> elements and adding them to the document. Page object property values are used to set the various attributes of the <EnsurePage> element.

    Note that the Commit method generates import XML with both <EnsurePage> and <PlaceObjects> elements for that page. Specifying an <EnsurePage> element for a page guarantees that the page exists before OneNote attempts to import objects onto it. So if your application includes a scenario where you only want to import objects onto a page if the page already exists, you’ll need to modify this method, or use another means.

    The code then generates a <PlaceObjects> element. For each of the Page object’s children whose CommitPending property is set to True, the code calls the PageObject.SerializeToXml method.

    If the page object’s DeletePending property is set to True, the PageObject.SerializeToXml method generates a <Delete> element. If not, the method does three things:

    ·         Generates a <Position> element, whose attributes are set according to Position object property values.

    ·         Calls the SerializeObjectToXml method for the specific PageObject-derived class involved, i.e., ImageObject, InkObject, or OutlineObject.

    ·         Calls the SerializeToXml method for the specific OutlineContent-derived class involved, i.e., HtmlContent, InkContent, or ImageContent.

    Executing the SerializeToXml method for each of these content types includes a call to the SerializeToXml method for the Data-derived object they contain: BinaryData, FileData, or StringData. In this way, the entire page structure is serialized to XML in a single XML document.

    Note that the HtmlContent.SerializeToXml method includes a call to another internal method of that same object, called CleanHtml. The CleanHtml method reads through the HTML string or file data and makes sure the HTML is formatted in a way that OneNote accepts. It identifies and replaces problematic formatting with characters which OneNote can process. For example, the CleanHtml method wraps the HTML string with the appropriate <html> and <body> tags if the HTML lacks them.
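    The wrapping step is the easiest part of that cleanup to picture. The Python function below is a simplified stand-in, not the CleanHtml implementation: the name wrap_html is hypothetical, it only handles the missing-tag case, and it leaves any string that already contains an <html> tag untouched.

```python
def wrap_html(fragment):
    """Wrap an HTML fragment in <html> and <body> tags when they are missing.

    Simplified sketch: a string that already contains an <html> tag is
    assumed to be a complete document and is returned unchanged.
    """
    lowered = fragment.lower()
    if "<html" in lowered:
        return fragment
    if "<body" not in lowered:
        fragment = "<body>" + fragment + "</body>"
    return "<html>" + fragment + "</html>"


wrapped_fragment = wrap_html("<p>Meeting notes</p>")
wrapped_body = wrap_html("<body><p>Meeting notes</p></body>")
untouched = wrap_html("<HTML><BODY>Meeting notes</BODY></HTML>")
```

    The check is case-insensitive because HTML tag names are, so uppercase markup from older tools is recognized as already wrapped.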

    The serialization of the page nodes is now complete. If the Page object had no children, the <PlaceObjects> element remains empty. In such a case, the Page.SerializeToXml method does not append it to the <Import> element.

    Finally, the Page.SerializeToXml method determines the appropriate namespace designation and adds it to the <Import> element.
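    Pulling the page-level steps together, a sketch of the document assembly might look like the following Python. The element names Import, EnsurePage, and PlaceObjects come from the description above; the function name, the Object element, the title and name attributes, and the namespace value are all placeholders chosen for this example.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace; the real value is defined by the OneNote import schema.
IMPORT_NAMESPACE = "urn:example:onenote-import"


def build_import_document(page_title, pending_object_names):
    """Assemble import XML for one page (hypothetical helper)."""
    root = ET.Element("Import")
    # EnsurePage guarantees the page exists before objects are placed on it.
    ET.SubElement(root, "EnsurePage", {"title": page_title})
    # Build PlaceObjects separately so it can be left out when it is empty.
    place_objects = ET.Element("PlaceObjects")
    for name in pending_object_names:
        ET.SubElement(place_objects, "Object", {"name": name})
    if len(place_objects) > 0:
        root.append(place_objects)
    # The namespace designation is added last, as the final step.
    root.set("xmlns", IMPORT_NAMESPACE)
    return ET.tostring(root, encoding="unicode")


empty_page_xml = build_import_document("Status Report", [])
filled_page_xml = build_import_document("Status Report", ["logo", "summary"])
```

    Building the PlaceObjects element off to the side and appending it only when it has children keeps an empty element out of the XML for a page with nothing to place.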

    The ToString method then takes the XmlDocument object, saves it as a text stream, converts it to a string, and passes it back to the SimpleImporter.Import method. This Import method uses the XML string to import the specified content into OneNote.

    Now that the content has been imported into OneNote, the Commit method performs some vital housekeeping. Using the RemoveChild method, it removes any of the Page object’s children that have their DeletePending property set to True. It then sets the private committed field to True, thereby making the Date, PreviousPage, RTL, and Title properties read-only. You cannot change these attributes once you import a page into OneNote.

    Lastly, it sets the CommitPending property of the Page to False. This in turn sets the CommitPending properties of all the Page object’s remaining children to False as well.
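    Here is one way to picture the whole Commit sequence, written as a Python sketch under stated assumptions: the import call is passed in as a plain function instead of going through SimpleImporter, only the Title property is modeled, and the Child class is a made-up stand-in for any page object.

```python
class Child:
    """Made-up stand-in for an object on the page."""

    def __init__(self, name):
        self.name = name
        self.delete_pending = False
        self.commit_pending = True


class Page:
    """Hypothetical analogue of Page, reduced to the commit bookkeeping."""

    def __init__(self, title):
        self._title = title
        self._committed = False
        self.commit_pending = True
        self.children = []

    @property
    def title(self):
        return self._title

    @title.setter
    def title(self, value):
        if self._committed:
            raise AttributeError("title is read-only once the page is committed")
        self._title = value
        self.commit_pending = True

    def commit(self, import_func):
        # Nothing has changed since the last import, so there is nothing to send.
        if not self.commit_pending:
            return False
        import_func(self)
        # Housekeeping: drop deleted children, lock the page, clear the flags.
        self.children = [c for c in self.children if not c.delete_pending]
        self._committed = True
        self.commit_pending = False
        for child in self.children:
            child.commit_pending = False
        return True


imported = []
page = Page("Notes")
kept, doomed = Child("kept"), Child("doomed")
page.children.extend([kept, doomed])
doomed.delete_pending = True
first_result = page.commit(imported.append)
second_result = page.commit(imported.append)
```

    The second call returns without importing anything, which matches the behavior described above for a page that has not changed since its last import.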

    In part four, we'll examine the internal method calls of the Commit method.

    Read part four here.

  • Andrew May's WebLog

    OneNote: An In-Depth Look at the OneNoteImporter Managed Assembly (Part 2 of 5)


    In part one, we started to take an in-depth look at the OneNoteImporter managed class, which provides an object model interface for the programmability functionality added in OneNote 2003 SP 1.

    Read part one here.

    Creating Objects

    Creating the page and the object you want to import onto it is relatively straightforward. The Page, PageObject-derived sub-classes, and OutlineContent-derived sub-classes all have one or more public constructors with which you create new instances. We should, however, briefly touch on a few issues to keep in mind.

    Identifying Objects

    Both the Page and PageObject-derived sub-classes have an internal Id property, which gets or sets an ObjectId object. The ObjectId represents the globally unique identifier (GUID) OneNote requires to identify each page and object imported. The data in an ObjectId consists of a private field representing the containing object’s identifier. The ObjectId object is generated when the page or object is constructed.

    The ObjectId object cannot be accessed from outside the assembly. The OneNoteImporter assembly generates and maintains the GUIDs for the user. If you are programming for a scenario that requires you to have access to the GUIDs, there are two approaches to consider:

    ·         Override the class, and make the GUID properties publicly accessible.

    ·         Serialize and persist the Page object between sessions, and then deserialize it as necessary.

    Note that the ObjectId class stores the object GUID internally as a System.Guid data type, which does not store the GUID in registry format, that is, surrounded by curly braces ({}). OneNote only accepts GUIDs in registry format. Therefore, the ObjectId.ToString method has been overridden so that it wraps the GUID value in curly braces before returning the string.

    Similarly, if you use the ObjectId(string) overloaded constructor, the constructor code strips off the curly braces, if present, as they would be if you were passing a serialized ObjectId as an argument.
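    The brace handling works as a round trip, which a short Python sketch can show. This uses Python's uuid module in place of System.Guid, so it illustrates the idea only; it is not the assembly's code.

```python
import uuid


class ObjectId:
    """Hypothetical analogue of ObjectId, built on Python's uuid module."""

    def __init__(self, value=None):
        if value is None:
            # A fresh identifier is generated when the object is constructed.
            self._guid = uuid.uuid4()
        else:
            # Accept registry format by stripping the braces before parsing.
            self._guid = uuid.UUID(value.strip().strip("{}"))

    def __str__(self):
        # Registry format: the GUID wrapped in curly braces.
        return "{" + str(self._guid) + "}"


registry_form = "{12345678-1234-5678-1234-567812345678}"
from_registry = str(ObjectId(registry_form))
from_bare = str(ObjectId("12345678-1234-5678-1234-567812345678"))
generated = str(ObjectId())
```

    Because the constructor tolerates braces and the string form always adds them, an identifier can be written out and read back in without the caller having to care which form it is holding.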

    Committing Objects for Importing

    The CommitPending property determines whether or not an import node is included in the XML passed to the SimpleImporter.Import method. Objects with this property set to True represent one of the following:

    ·         an object that hasn’t been imported into OneNote

    ·         an object which has been changed since the last time it was imported; this includes objects marked for deletion from OneNote

    ·         an object that contains a child object that is either of the two cases above

    Only objects whose CommitPending property is set to True are serialized and included in the import XML string. By default, whenever an object is constructed from an ImportNode-derived class, this property is set to True. Let’s examine how this property is used, and when the value is changed.

    Calling the Page.Commit method imports the page and its contents into OneNote. Once you import a page, the CommitPending property is set to False for that Page object and all the objects contained on it. If you immediately called the Commit method again, neither the page nor any of the objects on it would be included in the XML for the next import.

    However, when you make a change to an object, such as changing a property value, the CommitPending property gets set back to True. Not only that, but the CommitPending property of any parent object gets set to True as well. This happens because of the way the import XML is structured. If you want to re-import a specific object, you must also include information about all the objects that contain it, such as the page it appears on. Setting the CommitPending property to True for all the parent objects ensures all the necessary information is included in the import XML string.

    Here’s how it actually happens in the code:

    You cannot access the CommitPending property from outside the OneNoteImporter class assembly, but if you change any public object properties, the CommitPending property is set to True, so that those object changes are imported into OneNote the next time you call the Page.Commit method. This includes setting the DeletePending property of an object to True.

    In addition, the CommitPending property code also sets the CommitPending property of the parent object, if there is one, to True. This in turn calls the CommitPending property code again, in case that parent object has a parent, and so on up the import node structure to the Page object.

    Conversely, when the Commit method sets a Page object’s CommitPending property to False, the property code sets the CommitPending property of any children to False as well, which in turn calls the method again, in case any of those objects have children, and so on.
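    The two directions of propagation can be sketched with a single property setter. The Python below is an illustration of the behavior just described, with hypothetical member names: setting the flag to True walks up through the parents, and setting it to False walks down through the children.

```python
class ImportNode:
    """Hypothetical analogue showing only the CommitPending propagation."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []
        self._commit_pending = True

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        self.commit_pending = True

    @property
    def commit_pending(self):
        return self._commit_pending

    @commit_pending.setter
    def commit_pending(self, value):
        self._commit_pending = value
        if value:
            # A changed node forces every ancestor to be re-imported too.
            if self.parent is not None:
                self.parent.commit_pending = True
        else:
            # Clearing a node clears everything beneath it.
            for child in self.children:
                child.commit_pending = False


page = ImportNode("page")
outline = ImportNode("outline")
html = ImportNode("html")
image = ImportNode("image")
page.add_child(outline)
outline.add_child(html)
outline.add_child(image)

page.commit_pending = False   # what Commit does after a successful import
all_clear = not any(n.commit_pending for n in (page, outline, html, image))
html.commit_pending = True    # what happens when a public property changes
```

    After the last line, the page and the outline are pending again but the untouched image is not, so only the path from the changed node up to the page is marked for the next import.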

    Adding Objects to Pages and Outlines

    You can add content directly to a page, and to an outline on a page. The public Page.AddObject and OutlineObject.AddContent methods both call the internal ImportNode.AddChild method, which adds the child object to a private ArrayList object. A convenience constructor enables you to name the object when you add it, if you want.

    OutlineContent objects are displayed in the order in which they are added to the OutlineObject object.

    To import text into a OneNote page, you add an HtmlContent object to an OutlineObject, and then add the OutlineObject to the desired Page object.

    Deleting Objects From OneNote

    In the OneNoteImporter assembly, there are two levels of deleting objects:

    ·         Marking an object for deletion the next time the page is imported into OneNote

    ·         Removing an object from the OneNoteImporter object structure; for example, removing an ImageContent object from an OutlineObject object

    To delete an object from a OneNote page, use the Page.DeleteObject method. This method does not directly remove the object from the Page object’s private children array. Rather, it sets the DeletePending property of the specified object to True. Therefore, when you execute the Page.Commit method, the object gets deleted from the page in OneNote. Only then is the ImportNode.RemoveChild method called. It is this internal method that removes the object from the Page object’s array, using the ArrayList.Remove method. The RemoveChild method also sets the Page object’s CommitPending property to True; however, the Commit method then sets it to False.

    Note You can even call the DeleteObject method for objects that have not been imported into OneNote. While this means the object is serialized and included in the XML string passed to the SimpleImporter.Import method, no error results. OneNote ignores XML elements directing it to delete objects not present on the specific page. This is by design, since the OneNote user may have manually deleted the object themselves.

    Conversely, to remove an object from an outline, use the OutlineObject.RemoveContent method, which directly calls the ImportNode.RemoveChild method. So the object is removed from the OutlineObject object’s children array at the time the RemoveContent method executes. The RemoveChild method also sets the CommitPending property of the OutlineObject object to True, so that the entire updated outline (minus the removed object) is imported into OneNote again the next time the Page.Commit method executes.
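    The difference between the two levels comes down to when the child actually leaves the list. In the Python sketch below, which uses hypothetical class and method names modeled on the ones above and omits the import call itself, deleting from a page is deferred until commit, while removing from an outline happens immediately.

```python
class Node:
    """Hypothetical base node holding a list of children."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.commit_pending = False
        self.delete_pending = False

    def remove_child(self, child):
        self.children.remove(child)
        self.commit_pending = True


class Page(Node):
    def delete_object(self, obj):
        # Only mark the object; it stays in the list until the next commit.
        obj.delete_pending = True
        obj.commit_pending = True
        self.commit_pending = True

    def commit(self):
        # (The import into OneNote would happen here.)
        for child in [c for c in self.children if c.delete_pending]:
            self.remove_child(child)
        self.commit_pending = False


class Outline(Node):
    def remove_content(self, content):
        # Removal is immediate; the outline is re-imported on the next commit.
        self.remove_child(content)


page = Page("page")
image = Node("image")
page.children.append(image)
page.delete_object(image)
still_listed_before_commit = image in page.children
page.commit()

outline = Outline("outline")
text = Node("text")
outline.children.append(text)
outline.remove_content(text)
```

    The page has to keep a deleted object around until commit because that object must still be serialized, as a Delete instruction, for OneNote to act on it.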

    Iterating through an import node’s children

    Both the Page and OutlineObject objects implement the IEnumerable interface. Their respective public GetEnumerator methods provide the ability to iterate through their private children array.
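    Python's counterpart to implementing IEnumerable is defining __iter__, so the same idea can be sketched as follows. The class is a hypothetical stand-in that keeps its children private and exposes them only through iteration.

```python
class OutlineObject:
    """Hypothetical analogue that exposes its private children by iteration."""

    def __init__(self):
        self._children = []

    def add_content(self, content):
        self._children.append(content)

    def __iter__(self):
        # Hand out an iterator instead of the list, so callers cannot modify it.
        return iter(self._children)


outline = OutlineObject()
outline.add_content("heading")
outline.add_content("paragraph")
outline.add_content("image")
contents_in_order = [content for content in outline]
```

    Iteration yields the contents in the order they were added, which is also the order in which an outline displays them.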

    In part three, we'll get to the whole point of the OneNoteImporter class: importing objects into OneNote.

    Read part three here.

  • Andrew May's WebLog

    OneNote: An In-Depth Look at the OneNoteImporter Managed Assembly (Part 4 of 5)


    In this series of entries, we're taking an in-depth look at the OneNoteImporter managed class, which provides an object model interface for the programmability functionality added in OneNote 2003 SP 1.

    Read part one here.

    Read part two here.

    Read part three here.

    Figure 1 diagrams the internal method calls of the Commit method. It shows the OneNoteImporter classes and methods called, and the Import XML elements generated at each step. The various calls to System.Xml methods are not diagrammed.


    Figure 1. The Page.Commit Method, and XML Elements Generated

    In our final entry, we'll look at some object model maps that detail how the OneNoteImporter class is structured.

    Read part five here.

  • Andrew May's WebLog

    SharePoint: Office Developer Conference, Day One


    Man, my brain is full.

    For those of you who don't know, yesterday was the first day of the Office Developer Conference, a three-day extravaganza of technical presentations covering all aspects of Office development, from client applications to server technologies to custom solutions presented by Microsoft partners.

    I have to admit, one of the things I really enjoy about working here is easy access to the people who create the software I use to do so much of my job. Got a question about how Word uses XML? Walk down the hall and ask the guy who wrote the Word SDK. So now, for three days, developers from outside companies can get that same level of access.

    And for me personally, this couldn't have come at a better time. Yesterday I sat through five presentations dealing with developing Windows SharePoint Services. More than just the technical information presented, I now feel like I've got a much better handle on how WSS is structured, why it works the way it does, and what the problems are that it was created to solve. The Q&A's were also a valuable opportunity for a writer-programmer like me to get a peek into what issues are on developers' minds.

    The Opening Keynote was given by Richard McAniff, a Corporate Vice President for Microsoft Office. The two main technologies he stressed were XML and SharePoint: XML as the mechanism by which we enable users to move their data into and out of the applications they use, and SharePoint as the tool that lets them collaborate on and store that information.

    And here are some random technical issues that seemed to come up in a variety of ways throughout the WSS presentations:

    ·         Seems nobody is a fan of Collaborative Application Markup Language (CAML), which is used to define sites and lists, including fields, views, or forms. The best practice seems to be an iterative approach: make a change in the CAML, then test that change to make sure it functions as you want. Then make another change, re-test, etc. You can't really debug CAML, or apply a schema to it. Yet CAML provides a vital function: it's the reason SharePoint can render hundreds of web sites so quickly without having to have a huge server farm. Call it a necessary evil (or at least, annoyance.)

    (Here's a pointer to more information on CAML: Introduction to CAML.)

    ·         Site definitions: Best practice seems to be, copy an existing site definition, rename it and then modify it. Try not to modify existing site definitions if you can help it. Microsoft reserves the right to make changes to those site definitions, so be aware that if you do modify the default site definitions, we might over-write your changes later.

    ·         Concerning CAML and site definitions in general, I got the impression that the SharePoint team didn't realize so many developers would want to customize them so extensively. It seems like perhaps the SharePoint team thought that Web Parts would be the main avenue by which users developed sites. Certainly the developer story for Web Parts seems a lot more stream-lined and supported. That's just my impression, however.

    ·         How does a site definition differ from a template? A template is actually just the changes a user has made to the properties of a site built based on an existing site definition. A site definition is a complete set of instructions for building a site from the ground up. The template still needs the site definition it's based on to function; remove that site definition, and the template is broken.

    ·         Customizing pages: When SharePoint renders a web page based on a page definition, it basically reads the user data from the back-end database, then plugs it into the page definition, which resides on the front-end server, and renders the page. That's one of the reasons SharePoint can render pages so quickly. The specific page the user is navigating to doesn't actually exist; it's a union of the specific data and the page definition. However, if you alter the page by editing the aspx, then SharePoint has to store more information in the database. This has three ramifications:

    ·         Database size can increase, because more information about the page, not just the data, must be stored.

    ·         Time to render can increase, since SharePoint has to load more information from the back-end database.

    ·         Inheritance is broken; in other words, if you later make changes to the page definition, those changes do not get propagated to the customized page.

    Because FrontPage alters the aspx of the page, if a user saves the page in FrontPage, that page is now customized. Not that that's necessarily undesirable; but if you let your users edit SharePoint pages in FrontPage, just be aware of what the ramifications might be.

    ·         Mike Fitzmaurice gave a nice shout out to Office Developer content on MSDN during his SharePoint presentation on customizing and branding WSS. Here are the links to articles that discuss pretty much all the technical points I listed above (and probably much more coherently):

    Branding a SharePoint Portal Server 2003 Site (Part 1 of 2)

    Customizing SharePoint Sites and Portals (Part 1 of 3)

    And with that, it's off to the InfoPath sessions…

  • Andrew May's WebLog

    Creating Object Model and XML Schema Maps


    Several people have asked what program(s) I use to create the object model maps and other technical illustrations I use here on the blog and in the articles I write.

    Unsurprisingly, I use Visio. Granted, I may be biased, having worked on the product for several years, but I haven't found any other drawing tool that lets me create, connect, and re-use custom shapes and design elements as easily as Visio does.

    Now here's the bad news: while Visio can dynamically generate class structure diagrams from source code, all the object model maps you see in my blog entries and MSDN articles were created the old-fashioned way, one shape at a time.

    I do it by hand because I want to control the focus of the illustration and what information it includes. Most of the time I only want to illustrate part of an object model or class library, and need to be very selective about the information I want to highlight. I tend to do a large amount of tweaking and massaging to the diagram, for both informational and aesthetic reasons. All in all, it's quicker for me to just start with a blank page and fill it manually, rather than have Visio generate an automated diagram and have to then delete most of it and fold, spindle, and mutilate what's left.

    To create the final gif image, I turn off the grid, rulers, and connection points in Visio, and then simply take a screen shot of the image (granted, this doesn't work if your diagram is larger than your monitor display, but MSDN has image size restrictions anyway, so that's not an issue for me.) Just hit ALT + PrintScreen; this puts a bitmap image of the active application onto the clipboard. Once I've got the screen shot, I paste it into Photoshop, crop as necessary, and save it as a gif file. I use Photoshop for two reasons:
    a) I already have it loaded on my computer, so I'm familiar with it
    b) It's got the best 'save as gif' conversion I've found. I've used a few other image editors, and when I save the bitmap image as a gif in them, the image tends to degrade: color gradients become grainy, and text loses crispness. Not with Photoshop; I just accept the default conversion settings, and I have yet to be disappointed.

    Now, using Photoshop for saving screen shots as gifs is like using an elephant gun to hunt squirrel. Since I already have Photoshop, I haven't really looked around for another application that does this well. But if you poke around the web a little, you can probably find something that works just as well for this purpose, either as shareware or at least without Photoshop's hefty price tag. Just look for something that's been optimized to create web-ready graphics.

  • Andrew May's WebLog

    Know Any Good Object Model and XML Schema Map Notations?


    Now that I've told you how I create the technical illustrations I use in my blog entries and MSDN articles, I have a question for you:

    Anybody got a decent notation for diagramming object models or XML schemas?

    I've been looking around, and I don't really see any. As for what I'm using in my articles, well…I've pretty much been making it up as I go along.

    Detailing discrete sections of a COM object model is fairly straightforward. All you really have to keep track of are objects, members, events, and enumerations. But when I was trying to diagram how Donovan's OneNote SimpleImporter classes were structured, I wanted to include inheritance, as well as both public and private members (with appropriate scope noted). So I ended up with a legend that looked like this:

    Take a look at the finished diagrams in this entry.

    For the most part, this seems to have worked. But I have to think that someone else out there has tackled this problem. Someone with a lot more experience and talent in technical illustration.

    I'm especially interested in how (or if) people are diagramming XML schemas. Here's the notation I used to diagram the OneNote SimpleImport XML schema:

    But, as you can see, it gets pretty complicated pretty quickly, even with a relatively short and straightforward schema like SimpleImport:

    Also, it doesn't show other schema information, such as the maximum/minimum times an element can/must appear. This information wasn't really important in this particular example, but I can see cases where it would be.
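
    For reference, in an XSD schema those limits are carried by the minOccurs and maxOccurs attributes on nested element declarations, and both default to 1 when omitted. Here is a minimal sketch, in Python, of pulling them out as short labels that could sit next to an element name in a diagram. The schema fragment, its element names, and the occurrence_labels helper are all made up for illustration; none of it comes from the SimpleImport schema.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment, for illustration only.
SAMPLE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Notebook">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Title" type="xs:string"/>
        <xs:element name="Section" type="xs:string"
                    minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

def occurrence_labels(xsd_text):
    """Map each nested, named element to a label such as '[0..*]'."""
    root = ET.fromstring(xsd_text)
    top_level = list(root)  # global declarations carry no occurrence limits
    labels = {}
    for el in root.iter(XS + "element"):
        name = el.get("name")
        if name is None or el in top_level:
            continue
        low = el.get("minOccurs", "1")   # XSD default is 1
        high = el.get("maxOccurs", "1")  # XSD default is 1
        if high == "unbounded":
            high = "*"
        labels[name] = f"[{low}..{high}]"
    return labels

for name, label in occurrence_labels(SAMPLE_XSD).items():
    print(name, label)
```

    A label like this is compact enough to add to a diagram without introducing another box or column.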

    I hate that everything is in a grid, which makes it hard to read. I tried an alternate version of this, with shaded blocks rather than grid lines, but the results weren't any better. In fact, now that I look at it again, because I both indented the elements and separated them in boxes, I broke a basic rule of information design: I used two design schemes to denote the same information (the element hierarchy), thereby making the diagram more graphically complex than it needed to be.

    That's the problem with producing content on a deadline. But I'm sure this won't be the last time I need to diagram an XML schema for an article.

    So, anybody got any better suggestions?

  • Andrew May's WebLog

    OneNote XML Schema Map Notation, Take Two


    I was so annoyed when I realized I had broken one of the basic rules of information design with my diagram of the OneNote 2003 SP1 SimpleImport schema that I had to take a few minutes and see if I could fix it. As I mentioned in my last entry, my diagram uses indenting and boxes to show the element hierarchy in the OneNote schema. But, because I was illustrating the same information (the element hierarchy) two different ways (indenting and boxing), the diagram contained redundant information, and was more complicated than it needed to be.

    And I hated those damn boxes anyway. Chopping a diagram up into a grid like that only ends up distracting from the information. I knew it at the time, but I was on a deadline and couldn't come up with a better solution, so…

    Below is my latest attempt. The grid lines are gone; element hierarchy is now denoted solely by the indenting of the element names. By formatting the information inside each element (such as data type and attributes) in gray, I think I've been able to keep the element names prominent enough that I don't need boxes to denote where one element ends and another starts.

    One thing that I still see as problematic is the indenting of the element information, like the data type and attributes. In this small example, I was able to keep all that information at the same left-alignment for all the elements, which again keeps that information from distracting from the element names. But, had the schema hierarchy included a few more levels, I would've had to move the element information even further over, perhaps to the point where it was so far removed from the element names in the top-level elements that it would seem disconnected.
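
    The same layout idea can be sketched as plain text: hierarchy shown by indentation alone, with each element's information pushed to one shared column. This is only a rough Python illustration under my own assumptions; the sample XML, the outline function, and its indent and info_column parameters are hypothetical, and it lists attribute names from a sample document rather than reading a real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, for illustration only.
SAMPLE_XML = """
<Notebook name="Work">
  <Section name="Meetings">
    <Page title="Monday" date="2005-02-07"/>
  </Section>
</Notebook>
"""

def outline(xml_text, indent=2, info_column=24):
    """Return one line per element: indented name, then attribute names."""
    root = ET.fromstring(xml_text)
    lines = []

    def walk(element, depth):
        name = " " * (indent * depth) + element.tag
        info = ", ".join(sorted(element.attrib))
        if info:
            # Pad to the shared column; if the indented name overruns it,
            # a single space still separates the name from its information.
            lines.append(name.ljust(info_column - 1) + " " + info)
        else:
            lines.append(name)
        for child in element:
            walk(child, depth + 1)

    walk(root, 0)
    return lines

for line in outline(SAMPLE_XML):
    print(line)
```

    With a deeper hierarchy, info_column has to grow to clear the most indented name, which is exactly the gap between top-level names and their information described above.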

    All in all, I think this diagram is quite a bit more successful than the one that currently appears in the article. Looks like it's time to file a bug and get that image swapped out…
