SharePoint Development from a Documentation Perspective

Andrew May

  • Andrew May's WebLog

    Importing Content into OneNote 2003 SP1 Preview


    The OneNote Service Pack 1 Preview is currently available for download. This Service Pack includes some cool new features that'll be of great interest to developers. We've already been seeing some questions about this new functionality in the newsgroups, so we're posting a draft of the article that will appear on MSDN once SP1 is released. Keep in mind that, while we've tried to make sure the information is as accurate and complete as possible, it is a draft document, and provided as such. Void where prohibited.


    Anyway, here's the article:

    Importing Content into OneNote 2003 SP1 Preview

    Applies To:

        Microsoft Office OneNote 2003 SP1

    Summary:    Learn about the new extensibility features available for developers in Microsoft Office OneNote 2003 SP1. The new OneNote 1.1 Type Library includes functionality that enables you to programmatically import images, ink, and HTML into OneNote.


    For the Service Pack 1 (SP1) Preview, OneNote 2003 has added extensibility functionality that enables applications to interoperate with it in an important, fundamental way: they can add content to OneNote notebooks. You can now push content to OneNote that includes HTML, images, and even ink (such as from a Tablet PC). You can even create the folder, section, or page onto which you want to place your content.

    Note These extensibility features are only available in the OneNote 2003 Service Pack 1 Preview. You can upgrade to the OneNote SP1 Preview here.

    Using the CSimpleImporterClass

    OneNote SP1 exposes the OneNote 1.1 Type Library, which consists of a single class, CSimpleImporterClass. This class enables you to programmatically add content to a OneNote notebook. You can add text in HTML form, images, and even ink from a Tablet PC. The CSimpleImporterClass enables you to specify where in the notebook you want to place the content; you can even create new folders, sections, and pages for content, and then programmatically display the desired page. OneNote’s import functionality also lets you later delete the content you import.

    The CSimpleImporterClass consists of two methods:

    ·         Import, which enables you to add, update, or delete images, ink, and HTML content on a specific page in a OneNote folder and section.

    ·         NavigateToPage, which enables you to display a specified page.

    To use the CSimpleImporterClass, you must add a reference to the OneNote 1.1 Type Library to your project. To add a reference, in Visual Studio .NET, in the Solution Explorer window, right-click References and then click Add Reference. On the COM tab, select OneNote 1.1 Type Library in the list, click Select, and then click OK.

    While this article focuses on implementing OneNote’s import functionality using .NET languages, you can also use the OneNote 1.1 Type Library with unmanaged code, such as Visual Basic 6.0 or Visual C++.

    The Data Import Schema

    The Import method has the following signature:

    Import (bstrXml as String)

    The method takes an XML string describing the content object(s) you want to import, as well as the location in the notebook where you want them placed. You can also use the Import method to delete objects you have previously placed in the notebook.

    When called, OneNote executes the Import method with minimal intrusion on the user. OneNote does open if it is not already opened, which means the OneNote splash screen displays, but OneNote does not assume focus. Nor does it change the user’s location in the notebook if OneNote is already running. To change the focus of the OneNote application to the new content, use the NavigateToPage method, discussed later.

    If the Import method fails, OneNote does not display an error to the user. However, the COM interface does return an “Unknown Error” to the application making the call.

    The figures below outline the XML schema to which the import string must adhere.


    Figure 1. XML Schema Structure of the Import Root Element



    Figure 2. Schema Structure of the PlaceObjects Element

    The OneNote data import schema can be found at The OneNote 1.1 SimpleImport XML Schema.

    Note The namespace for the Import method will be different in the final version of OneNote SP1 from what it is in the SP1 Preview.

    The current namespace for the OneNote SP1 Preview is:

    While the final namespace for OneNote SP1 will be:

    Be advised that if you’re programming against the Preview namespace, you must update your code for the new namespace in order for it to be compatible with the final OneNote SP1.

    There are two elements directly below the root <Import> element. Use the first element, <EnsurePage>, to make sure the folder, section, and page on which you want to place content exist. Use the second element, <PlaceObjects>, to actually place objects on the page or delete them from it. The schema requires that the root element contain at least one <EnsurePage> or <PlaceObjects> element. Any <EnsurePage> elements must appear before any <PlaceObjects> element.
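    The ordering rule for the root element can be sketched in a few lines. The snippet below is only an illustration, written in Python because all it does is assemble a string; the helper name build_import_xml is hypothetical, and the namespace is left as a caller-supplied parameter because it differs between the Preview and the final release.

```python
import xml.etree.ElementTree as ET


def build_import_xml(ensure_pages, place_objects, namespace=None):
    """Return an <Import> XML string with EnsurePage elements first.

    ensure_pages / place_objects: lists of already-built Elements.
    namespace: the import namespace for your OneNote build (assumption:
    supplied by the caller; it is not hard-coded here).
    """
    if not ensure_pages and not place_objects:
        # The schema requires at least one child under <Import>.
        raise ValueError("need at least one EnsurePage or PlaceObjects element")
    root = ET.Element("Import")
    if namespace:
        root.set("xmlns", namespace)
    # Every <EnsurePage> must appear before any <PlaceObjects>.
    root.extend(ensure_pages)
    root.extend(place_objects)
    return ET.tostring(root, encoding="unicode")


# Placeholder elements without attributes, just to show the ordering.
example = build_import_xml(
    ensure_pages=[ET.Element("EnsurePage")],
    place_objects=[ET.Element("PlaceObjects")],
)
print(example)
```

    A real call would pass fully populated elements; the empty placeholders here exist only to show that the helper keeps the two groups in the required order.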

    Creating Folders and Pages for Content

    Before you import content, the target pages for that content must exist in the OneNote notebook. Use the <EnsurePage> element to verify or create the target pages for your content. For each page you specify in an <EnsurePage> element, OneNote checks to determine if the page exists, and if not, creates it. You can even create new notebook folders and sections by specifying folders or sections that don’t exist.

    You are required to pass OneNote a string representing the path to the desired page, as well as a GUID for that page. If the path is not fully-qualified, OneNote assumes the string starts from the notebook root location for the user. Additionally, you can specify the title, date, reading direction, and page placement relative to other pages in the notebook.

    By default, OneNote inserts each new page at the end of the specified section. If you specify a page GUID for the insertAfter attribute, OneNote inserts the new page as a sub-page of the page whose GUID you specified. In such cases, OneNote labels the sub-page with the title and date of the page after which it’s inserted, rather than what you specify in the title and date attributes for the sub-page. If the page you specify does not exist (for example, if it was never added, or the user deleted it), then OneNote ignores the insertAfter attribute and inserts the new page at the end of the specified section, with any specified title and date values.

    Consider the following example. This <EnsurePage> element specifies a page in the OneNote section titled Negotiations, in the folder Johnson Contract, in the user’s root notebook folder. The page is titled “Budget Concerns”.

          <EnsurePage path="Johnson Contract\"


                      title="Budget Concerns"/>

    OneNote uses the optional attributes of the <EnsurePage> element if it creates the page you specify. If you specify attributes for an existing page, OneNote leaves the page unchanged. For example, if you use a GUID for an existing page, and specify a title that differs from that page’s current title, OneNote does not change the page title.

    Additionally, OneNote only searches the path you specify for the desired page GUID. If the page GUID does not exist in the specified section, OneNote creates it; it does not look for the GUID in other sections of the notebook.

    You can use multiple <EnsurePage> elements to create multiple pages within the OneNote notebook. You must verify or create the page before you can place content on it. You are not required to include an <EnsurePage> element for each page on which you want to place content. However, if you use the <PlaceObjects> element to try and place objects on a page that does not exist, the Import method fails. In some cases, this may be the desired outcome; for example, if you only wanted to update content on a page if the page still exists, and not add the content if the page has been deleted by the user.
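    As a rough sketch of the rules in this section, the following Python helper builds <EnsurePage> elements. The function names, the placeholder path, and the braced upper-case GUID formatting are assumptions made for illustration; the attribute names path, title, and insertAfter come from the text above, and guid is assumed to match the attribute used on <Object> in the samples.

```python
import uuid
import xml.etree.ElementTree as ET


def new_guid():
    """Return a fresh GUID string wrapped in braces, like the sample GUIDs."""
    return "{" + str(uuid.uuid4()).upper() + "}"


def ensure_page(path, guid, title=None, insert_after=None):
    """Build one <EnsurePage> element; only path and guid are required."""
    attrs = {"path": path, "guid": guid}
    if title is not None:
        # Only used if OneNote has to create the page.
        attrs["title"] = title
    if insert_after is not None:
        # GUID of an existing page; ignored by OneNote if that page is gone.
        attrs["insertAfter"] = insert_after
    return ET.Element("EnsurePage", attrs)


# Hypothetical section path; substitute the path your notebook uses.
section = "Example Folder\\Example Section.one"
parent_guid = new_guid()
parent = ensure_page(section, parent_guid, title="Budget Concerns")
child = ensure_page(section, new_guid(), insert_after=parent_guid)
print(ET.tostring(parent, encoding="unicode"))
print(ET.tostring(child, encoding="unicode"))
```

    An application would normally store the generated GUIDs, since it needs them again to place, update, or delete content on those pages.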

    Placing Content on Pages

    Once you’ve ensured that the pages onto which you want to import data exist in the OneNote notebook, you can start placing objects on them using the <PlaceObjects> element. You can import multiple objects to multiple pages if desired. You create a <PlaceObjects> element for each page on which you want to place content. Like the <EnsurePage> element, <PlaceObjects> has two required attributes: the path to the page, and the GUID assigned to the page. You must include at least one <Object> element in each <PlaceObjects> element.

    To create the XML string that describes the content you want to import into OneNote, follow these general steps:

    ·         If you want to make sure the page on which you place content exists, create an <EnsurePage> element to verify or create the folder, section, and page as necessary.

    ·         Create a <PlaceObjects> element for the page to which you want to add or delete content.

    ·         Create an <Object> element for the first object you want to alter (add, update, or delete) on the page.

    ·         To delete the object, add the <Delete/> element to that object.

    ·         To import the object, add a <Position> element and use its x and y attributes to specify where on the page to place the object.

    ·         Specify the type of object you’re importing to the page by using the appropriate element, <Image>, <Ink>, or <Outline>, and setting the appropriate optional attributes, if desired.

    ·         If you’re importing an outline object, specify the sub-objects the outline contains in the order in which you want them to appear in the outline. You can specify any number and order of <Image>, <Ink>, and <Html> elements. However, you are required to specify at least one <Image>, <Ink>, or <Html> element for the outline.

    ·         Repeat this procedure for all objects you want to alter on the page. Then repeat this procedure for all pages on which you want to alter content.
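    The steps above can be sketched as a small Python helper that places a single image on a page. This is an illustration under stated assumptions: the function name place_image and the placeholder path and GUID values are hypothetical, pagePath and the child element names follow the sample XML in this article, and pageGuid is an assumed name for the required page GUID attribute.

```python
import xml.etree.ElementTree as ET


def place_image(page_path, page_guid, object_guid, x, y, image_file):
    """Build a <PlaceObjects> element that positions one image on a page."""
    place = ET.Element(
        "PlaceObjects",
        # pageGuid is an assumed attribute name for the page's GUID.
        {"pagePath": page_path, "pageGuid": page_guid},
    )
    # One <Object> per item to add, update, or delete.
    obj = ET.SubElement(place, "Object", {"guid": object_guid})
    # x and y are absolute coordinates in points.
    ET.SubElement(obj, "Position", {"x": str(x), "y": str(y)})
    # The object type element; here the image is referenced by file path.
    image = ET.SubElement(obj, "Image")
    ET.SubElement(image, "File", {"path": image_file})
    return place


# Hypothetical values, for illustration only.
place = place_image(
    "Example Folder\\Example Section.one",
    "{PAGE-GUID-PLACEHOLDER}",
    "{OBJECT-GUID-PLACEHOLDER}",
    72,
    72,
    "c:\\image.png",
)
print(ET.tostring(place, encoding="unicode"))
```

    Ink and outline objects would follow the same pattern, with <Ink> or <Outline> in place of <Image>.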

    Some other technical requirements to keep in mind as you create the XML string:

    ·         OneNote positions objects based on absolute x and y coordinates, where x and y represent measurements in points. Seventy-two points equal one inch.

    ·         Ink objects must be described in Ink Serialized Format (ISF), either base64-encoded within the XML or specified by a path to a source file. If you specify a file path, the source file should be a plain file whose byte stream contains the ISF. For more information on programmatically capturing and manipulating ink, see this Ink Serialization Sample.

    ·         Image objects can be specified by a path to a source file, or base-64 encoded.

    ·         Text must be described in HTML, either within a CDATA block or specified by a path to a source file. In a CDATA block, all characters are treated as a literal part of the element’s character data, rather than as XML markup. XML and HTML use some of the same characters to designate special processing instructions. Using the CDATA block prevents OneNote from misinterpreting HTML content as XML instructions.

    ·         Although the schema does not currently support importing audio or video files, you can include links to these files within the HTML content you import.
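    A few of these requirements are easy to get wrong, so here is a minimal Python sketch of the three conversions involved: inches to points, binary data to base64, and HTML to a CDATA block. The helper names are hypothetical, and the sketch makes no claim about which schema element the encoded data belongs in.

```python
import base64

POINTS_PER_INCH = 72  # OneNote positions are measured in points.


def inches_to_points(inches):
    """Convert a measurement in inches to points."""
    return inches * POINTS_PER_INCH


def encode_binary(data):
    """Base64-encode raw image or ISF bytes for inclusion in the XML."""
    return base64.b64encode(data).decode("ascii")


def wrap_cdata(html):
    """Wrap HTML in a CDATA block so it is not parsed as XML markup."""
    if "]]>" in html:
        # "]]>" would end the CDATA block early, so reject it outright.
        raise ValueError("HTML must not contain the sequence ']]>'")
    return "<![CDATA[" + html + "]]>"


print(inches_to_points(6))
print(encode_binary(b"example bytes"))
print(wrap_cdata("<html><body><p>Sample text here.</p></body></html>"))
```

    For instance, a position six inches down the page corresponds to y="432", the value used for the outline in the sample XML string later in this article.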

    Updating Content

    To update objects that are already in a notebook, simply re-import the objects to the same page, using the same GUIDs. Be aware, however, that re-importing an object overwrites that object without notifying the user. Any changes made to the content since it was last imported are lost.

    Deleting Content

    To delete an object, place the <Delete/> element within the object element. To delete an object, you must be able to identify it by its GUID and path. In practical terms, this generally means an application can only delete objects it places in OneNote to begin with. However, if the application stores the GUID across sessions, it can delete objects it imported into OneNote in previous sessions. You cannot delete folders, sections, or pages, even those you created.
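    A deletion request is small enough to sketch in full. In the Python snippet below, the function name delete_object and the placeholder values are hypothetical, and pageGuid is again an assumed name for the page GUID attribute; the <Delete/> element inside <Object> is as described above.

```python
import xml.etree.ElementTree as ET


def delete_object(page_path, page_guid, object_guid):
    """Build a <PlaceObjects> element that deletes one previously imported object."""
    place = ET.Element(
        "PlaceObjects",
        {"pagePath": page_path, "pageGuid": page_guid},  # pageGuid: assumed name
    )
    # The object is identified by the GUID it was imported with.
    obj = ET.SubElement(place, "Object", {"guid": object_guid})
    # <Delete/> replaces the <Position> and content elements of an import.
    ET.SubElement(obj, "Delete")
    return place


# Hypothetical values; a real application would reuse the GUIDs it stored.
request = delete_object(
    "Example Folder\\Example Section.one",
    "{PAGE-GUID-PLACEHOLDER}",
    "{OBJECT-GUID-PLACEHOLDER}",
)
print(ET.tostring(request, encoding="unicode"))
```

    Because the object can only be identified by its GUID and page, an application that wants to clean up after itself needs to persist both.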

    Sample XML String

    Below is an example of what a typical XML string for the Import method might look like. This XML string describes the placement of three new objects onto an existing page, and the deletion of an object already contained on that page.

    <?xml version="1.0"?>
    <Import xmlns="">
          <EnsurePage path="MSN Instant Messenger\"

          <PlaceObjects pagePath="MSN Instant Messenger\"

                <Object guid="{5FCFD7F9-02C2-42fc-B6AF-7A8450D43C2D}">
                      <Position x="72" y="72"/>
                      <Image backgroundImage="true">
                            <File path="c:\image.png"/>
                      </Image>
                </Object>
                <Object guid="{F6FC4149-1092-48ea-806D-0067C8661A18}">
                      <Position x="72" y="72"/>
                      <Ink>
                            <File path="c:\ink.isf"/>
                      </Ink>
                </Object>
                <Object guid="{7EA551C4-F778-40ce-9181-21A3DB6D33CA}">
                      <Position x="72" y="432"/>
                      <Outline width="360">
                            <Html>
                                  <Data>
                                        <![CDATA[
                                        <html><body><p>Sample text here.</p></body></html>
                                        ]]>
                                  </Data>
                            </Html>
                      </Outline>
                </Object>
                <Object guid="{1A6648BA-D792-48f1-AC6A-43DF6E258851}">
                      <Delete/>
                </Object>
          </PlaceObjects>
    </Import>

    The following example demonstrates a basic implementation of the OneNote import functionality. The code displays a dialog that enables the user to select an XML file, and then passes the contents of that XML file as an argument for the Import method. This example assumes the XML file conforms to the OneNote data import schema. This example also assumes the project contains a reference to the OneNote 1.1 Type Library.

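      ' Requires "Imports System.IO" at the top of the file (for StreamReader).
      Dim strFileName As String
      Dim XmlFileStream As StreamReader
      Dim strImportXml As String
      Dim objOneNote As OneNote.CSimpleImporterClass

      ' Let the user pick the XML file to import; stop if the dialog is cancelled.
      OpenFileDialog1.Filter = "XML files (*.XML)|*.XML|Text files (*.TXT)|*.TXT"
      If OpenFileDialog1.ShowDialog() <> System.Windows.Forms.DialogResult.OK Then Return
      strFileName = OpenFileDialog1.FileName

      ' Read the whole file, then pass its contents to the Import method.
      objOneNote = New OneNote.CSimpleImporterClass
      XmlFileStream = New StreamReader(strFileName)
      strImportXml = XmlFileStream.ReadToEnd()
      XmlFileStream.Close()
      objOneNote.Import(strImportXml)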

    For the sake of simplicity, so as to highlight how the Import method is implemented, this example assumes that an XML file has already been created to use as the string for the Import method. In most cases, however, the application that calls the Import method will first create the XML string itself. For more information on creating XML using the .NET framework, see Well-Formed XML Creation with the XMLTextWriter.

    In addition, most applications will need to create and assign GUIDs to the pages and objects they create. Use the NewGuid method to create a new GUID, and the ToString method to get the string representation of the GUID's value, which the XML string requires. For more information, see GUID Structure in the .NET Framework Class Library.

    Displaying a Specified Page

    By design, the Import method executes without taking focus, so that when you import data, the user is not distracted by OneNote displaying data they might not want to see or, worse, navigating away from a page the user is currently using. Also, in cases where you import multiple objects to multiple pages, OneNote does not have to make assumptions about which page the user wants to see, if any.

    To display a specific page, use the NavigateToPage method. If OneNote is not open, this method opens OneNote to the specified page. If OneNote is already open, the method navigates to the specified page in the current instance of OneNote.

    To select the page to display, you must specify the path to the page, as well as the GUID for that page. If you specify a page that does not exist, OneNote returns an error.

    The NavigateToPage method has the following signature:

    NavigateToPage (bstrPath as String, bstrGuid as String)


    OneNote’s new import functionality opens up exciting possibilities for interacting with other applications. Any application that can save data (either its own or another application’s) as HTML text, images, or ISF can now push that content into OneNote and place it wherever desired. And as long as the application retains the GUIDs used, it can update or delete the content it pushed whenever necessary.


    What Are Content Types, Anyway?


    One of the major new concepts in Windows SharePoint Services V3 is content types. They're a core concept that enables a lot of the functionality in both Windows SharePoint Services and Office SharePoint Server 2007, so they seemed like a logical choice to talk about.

    A content type is a reusable collection of settings you want to apply to a certain category of content. Content types enable you to manage the metadata and behaviors of a document or item type in a centralized, reusable way. Basically, content types include the columns (or fields, if you prefer) you want to apply to a certain type of content, plus other optional settings such as a document template, and custom new, edit, and display forms to use.

    You can think of a content type as a refinement and extension of a Windows SharePoint Services 2.0 list. The list schema defined a single group of data requirements for each item on that list. So in Windows SharePoint Services 2.0, the schema of an item was inextricably bound to its location.

    With content types, you can define a schema for types of content, but that schema is no longer bound to a specific location. And to look at it in reverse, each SharePoint list is no longer limited to a single data schema to which the documents stored there must adhere. You can assign the same content type to multiple document libraries, and you can assign multiple content types to a given document library. Content types, in essence, let you:

    ·         Store the exact same type of content in multiple locations

    ·         Store multiple types of content in the same location

    Site and List Content Types: Parents and Children

    There are two different levels of content types I should mention: site content types, and list content types. Think of site content types as templates, and list content types as instances of those templates. You define site content types at the um, site level, hence the name. Since site content types aren't bound to a specific list, you can assign them to any list within the site you want. The site at which you define the site content type determines the scope of that content type.

    List content types are more like instances, or single-serving content types. When you assign a site content type to a list, a copy of that content type is copied locally onto the list. That's a list content type. You can even tweak a list content type so that it's different from its site content type 'parent'. Also, you can create a content type directly on a list, but it's a one-off. You can't then assign that list content type to other sites or lists.

    One other thing about site content types: you can base content types on other site content types. For example, Windows SharePoint Services comes with a built-in hierarchy of content types for basic SharePoint objects, such as Document, Task, Folder, etc. You can create your own site or list content types based on one of these site content types. Ultimately, all content types derive from the grandfather of them all, System.

    Also, if you make changes to a site content type, Windows SharePoint Services includes a mechanism whereby you can 'push down', or propagate those changes out to the child content types (be they site or list content types). Doesn't work the other way, though; no pushing changes to child content types up the family tree.
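    If it helps to see the parent/child relationship in code, here's a toy Python model of it. To be clear, this is not the SharePoint object model: the ContentType class, its derive and push_down methods, and the column names are all made up for illustration, and the push-down here only copies missing columns from parent to child.

```python
class ContentType:
    """Toy stand-in for a content type: a name, some columns, and children."""

    def __init__(self, name, columns=None, parent=None):
        self.name = name
        self.columns = list(columns or [])
        self.parent = parent
        self.children = []

    def derive(self, name):
        """Create a child (site or list) content type as a copy of this one."""
        child = ContentType(name, self.columns, parent=self)
        self.children.append(child)
        return child

    def push_down(self):
        """Propagate this type's columns to all descendants (one direction only)."""
        for child in self.children:
            for column in self.columns:
                if column not in child.columns:
                    child.columns.append(column)
            child.push_down()


document = ContentType("Document", ["Title"])
spec = document.derive("Spec")       # child starts as a copy of the parent
spec.columns.append("Reviewer")      # tweak the child; the parent is untouched
document.columns.append("Author")    # change the parent...
document.push_down()                 # ...and explicitly push it to the children
print(spec.columns)
print(document.columns)
```

    The point of the sketch is the direction of travel: the child's extra column never reaches the parent, and the parent's new column only reaches the child when you push it down.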

    E is for Extensible

    You can see how content types were designed to encapsulate and modularize schema settings in Windows SharePoint Services V3. One very powerful aspect of this is that you can use content types to encapsulate whatever custom data you want to include in them. The content type schema includes a <XMLDocuments> node, which you can use to store nodes of any valid XML. As far as Windows SharePoint Services is concerned, the contents of the <XMLDocuments> node are a black box. Windows SharePoint Services makes no attempt to parse any of the XML documents you store there; it simply makes sure that they are included in any child content types, such as when you assign the content type to a list.

    The <XMLDocuments> node was designed to be utilized by third-party solutions. Use it to store information pertinent to any special settings or behavior you want to specify for a certain type of content. Office SharePoint Server uses this mechanism to store information various features need, such as information policies and document information panels, among others. You can programmatically access a SharePoint item’s content type, and from there access the XML documents included in the content type.

    Find Out More

    To find out more about content types, browse the topics included in the Content Types section in the Windows SharePoint Services V3 (Beta) SDK.

    And here’s a few places content types are utilized to facilitate Office SharePoint Server functionality:

    Using content types to specify document information panel.

    Using content types to specify document to page converters.

    Using content types to specify information policy.


    Written while listening to Bob Mould : Body of Song


    Content Type Technical Posters for Download


    The title pretty much says it all. For some of our earlier, internal technical events, I created a number of large-format (11” by 17”) technical illustrations to explain the more complicated aspects of enterprise content management in a SharePoint environment. The two posters I’m offering for download today deal with content types, a core concept in Windows SharePoint Services V3 that I’ve blogged about here. If you’re planning on using content types, I think it’ll be worth your while to take a look at these.

    As you can probably tell, I created the posters with Visio. I then converted them to PDF format using Visio 2007 (Beta)’s spiffy new Publish to PDF feature. I have to say I’m fairly impressed with the results. These are pretty complex diagrams, and as far as I can tell, Visio converted them flawlessly to PDF. Well done.

    Using Columns and Content Types to Organize Your Content in Windows SharePoint Services (version 3)

    This diagram explains the relationship between site and list content types, as well as content type ‘inheritance’ and customizing/deriving content types. It also illustrates how you can use site columns in your content types to ensure data uniformity, and how content types reference site columns and list columns.

    Using Content Types in Windows SharePoint Services (version 3) and SharePoint Server 2007

    Explains what content types are, and the advantages of using them to categorize and manage your enterprise content. Illustrates the conceptual structure of the various feature information you can encapsulate in a content type, such as columns, document templates, workflows, and custom solutions information such as forms and information policy.

    Note that these diagrams were created to be used with Adobe Reader 7.0, and will undoubtedly appear best using that version. I haven’t tested to determine if the diagrams display or print accurately in earlier versions.

    If you download either (or both) of the posters, take a second and let me know what you think of them in the comments section. Were they useful in explaining the various concepts they illustrated? Would you like to see something like this rolled into the SDKs at some point? I’m very visually-oriented; most of the technical illustrations I create start with me jotting something down on a napkin, to explain a concept to myself. But I have no idea if that’s true of developers in general. Are illustrations helpful in technical documentation like SDKs? Let me know what you think.

    And special thanks to Steve and Ryan and all the good folks at Office Zealot, who graciously agreed to host these diagrams for download. I greatly appreciate it. Take a few minutes and visit their site; they’ve got tons of interesting content for the Office developer. It’s time well spent.

    And check back here next week. I hope to get some object model maps done for a few of the SharePoint namespaces, and I’ll make those available for download once I do.

    Written while listening to Isobel Campbell and Mark Lanegan : Ballad of the Broken Seas


    SharePoint Object Model Maps for Download


    The guys at have generously agreed to host a few more of the large-size diagrams I’ve created for internal usage. So I’m offering up several object model maps for download. These are poster-size (11” by 17”) diagrams that illustrate key objects and namespaces in the SharePoint environment, suitable for sticking up in your office or hanging in your home as fine art.

    Each object model map was created in Visio, but the downloads are PDFs for ease of viewing and printing.

    Just click on any or all of the links below to download the maps you want. And leave me a message in the comments section telling me what you thought of the format, design, layout, etc. of the diagrams. Thanks.

    ·         The Windows.SharePoint.Workflow Namespace

    Workflow is another area you’re going to be hearing a lot about this release. The diagram illustrates the classes and members of the Workflow namespace, which you’d use to associate, initiate, and otherwise manage the workflow templates and instances in a Windows SharePoint Services deployment.

    ·         The Windows.SharePoint.SPContentType Object

    This object model map highlights the members and child objects of the SPContentType object. Content types are a core concept for this next release of Windows SharePoint Services, as I explain here. This object model map complements the conceptual diagrams of content types I offered for download here.

    ·         The Microsoft.Office.RightsManagement.InformationPolicy Namespace

    I haven’t talked about information policy on this blog yet, but there’s plenty of material in the Office SharePoint Server 2007 (Beta 2) SDK to fill you in. (Start here.) This diagram illustrates the classes and members of the InformationPolicy namespace, which you’d use to manage the policies, policy features, and policy resources on a SharePoint Server 2007 installation.

    I’ve still got a few more diagrams to make available; check back early next week for more full-color, absolutely free goodness.

    Written while listening to Greg Dulli : Amber Headlights


    OneNote: An In-Depth Look at the OneNoteImporter Managed Assembly (Part 5 of 5)


    In this series of entries, we're taking an in-depth look at the OneNoteImporter managed class, which provides an object model interface for the programmability functionality added in OneNote 2003 SP1.

    Read part one here.

    Read part two here.

    Read part three here.

    Read part four here.

    Object Model Maps

    The following figures diagram the OneNoteImporter assembly object model, including abstract classes and inheritance. The diagrams mainly document how the objects in the assembly relate to each other. In most cases, when a member takes an object as a parameter, or returns an object, that object is included on the diagram. Value types, such as string or integers, are for the most part not displayed.

    For the sake of clarity, the following object information, pertaining to methods that most of the classes have, has been left off the diagrams:

    ·         The Clone method returns the type of object from which you call it.

    ·         The Equals method takes a System.Object as a parameter.

    ·         The GetHashCode method returns a System.Int32 object suitable for use in hashing algorithms and data structures like a hash table.

    ·         Inheritance from System.Object is not shown.

    Figure 2. The Application Object (and Legend)



    Figure 3. The ImportNode Abstract Class, and Page Class

    Figure 4. The PageObject Abstract Class, and Derived Classes

    Figure 5. The OutlineContent Abstract Class, and Derived Classes

    Figure 6. The Data Abstract Class, and Derived Classes


    The OneNoteImporter managed assembly provides a convenient and multi-functional ‘wrapper’ for working with the SimpleImporter and command line functionality in OneNote 2003 SP1. Moreover, using the provided source files for the assembly, a developer can customize and extend the classes as required for his particular application.


    OneNote: An In-Depth Look at the OneNoteImporter Managed Assembly (Part 1 of 5)


    A while back I wrote a series of entries dealing with how you can use Donovan Lange's OneNoteImporter managed class to make importing content into OneNote 2003 SP1 even easier. To recap: Donovan's OneNoteImporter managed class assembly provides an object model interface for the programmability functionality added in OneNote 2003 SP1. Both the Send to OneNote from Outlook and Send to OneNote from Internet Explorer PowerToy add-ins actually use the OneNoteImporter class in their source code.

    You can download the OneNoteImporter class source code from Donovan's blog here.

    You can read my initial series of blog entries about the OneNoteImporter class here: part one, and part two.

    For the next week or so I'll be running a series of entries that examine the OneNoteImporter class in more detail, and really get 'under the hood' on how the class is structured and functions.


    The OneNoteImporter managed class assembly provides an object model interface for the programmability functionality added in OneNote 2003 Service Pack (SP) 1.

    The classes in this assembly enable you to:

    ·         Import content into OneNote SP1 using the SimpleImporter.Import method without having to explicitly author the XML string this method takes as a parameter. The OneNoteImporter managed assembly does this for you.

    You do not need to be familiar with the SimpleImporter class, or the XML schema used by its Import method, to use the OneNoteImporter managed assembly. However, for more information about the SimpleImporter class and its operation, see Importing Content into Microsoft Office OneNote 2003 SP1.

    ·         Navigate to a specific page in OneNote using the SimpleImporter.NavigateToPage method.

    This is also covered in the article referenced above.

    ·         Use class methods to invoke some of the more popular command line switches you can use to customize how the OneNote application launches.

    For more information on customizing OneNote 2003 using command line switches, see Customizing OneNote 2003 SP 1 Using New Command Line Switches.

    This article presents a detailed discussion of the internal design and function of the OneNoteImporter classes, in case a developer wants to modify the classes, or just know more about how they operate internally. For a general discussion of how to use the public members of the OneNoteImporter managed assembly, see OneNote Import Managed Assembly: The Quick Rundown (part one and part two).

    Source files for the assembly are available from Donovan Lange's blog here. Developers are encouraged to modify the OneNoteImporter assembly as they desire and redistribute it with their application.

    Note To avoid compatibility issues with other versions of the assembly that might be loaded on the user’s computer, include the .dll in your application directory, rather than the system directory.

    There are several basic steps in using the OneNoteImporter assembly to import content into OneNote:

    ·         Create the page onto which you want to import content

    ·         Create and add the content to the page

    ·         Import the page into the desired OneNote location

    We’ll discuss the internal operation of the assembly classes during each of these steps.

    The OneNoteImporter assembly also enables you to update and delete pages or specific content on them, as long as you know their unique identifier. This is discussed in detail later in this article.

    It’s worth noting at this point that the OneNoteImporter assembly is designed to import a single OneNote page at a time. The OneNote.SimpleImporter class can take an XML string that includes content to be imported onto multiple pages. However, when using the OneNoteImporter assembly, you create a separate XML string for each page you want to import.

    Examining the Classes

    Before discussing how the classes in the assembly function internally, let’s briefly look at the abstract classes that form the basis of the assembly. There are four such classes, from which almost all of the other assembly classes derive: ImportNode, PageObject, OutlineContent, and Data.

    The ImportNode Class

    The ImportNode class is the base class for the entire assembly. Almost all the other classes inherit from it, including the other three abstract classes. As mentioned before, the SimpleImporter.Import method takes an XML string comprised of elements that detail the OneNote page and contents you want to import. The ImportNode represents a single node (or element) in this XML structure, such as a page, or an object on a page.

    This class provides several important pieces of common functionality:

    ·         Serialization: The ImportNode contains an abstract method, SerializeToXml, that classes derived from it must implement. This ensures that each derived class contains the means to serialize itself into XML for the XML string passed to the SimpleImporter.Import method. We discuss how the various classes implement this method later in the article.

    ·         Selection for importing: This class also contains an internal Boolean property, CommitPending, that denotes whether or not to include this object in the XML string passed to the SimpleImporter.Import method. By default, all objects based on the ImportNode or its derived classes have their CommitPending property set to True when they are constructed. This denotes that the object has not yet been imported into OneNote. We also discuss how and when an object’s CommitPending property is changed later in this article.

    The PageObject class

    The abstract PageObject class represents an object that can be added, updated, or deleted from the specified OneNote notebook page. There are three classes derived from the PageObject class in the assembly:

    ·         ImageObject, which represents an image, such as a jpeg or gif, on a OneNote page

    ·         InkObject, which represents ink on a OneNote page

    ·         OutlineObject, which represents an outline on a OneNote page. OutlineObject objects are actually composed of other objects, such as images, ink, and text described in html format.

    The SerializeToXml method is implemented in the PageObject class. It serializes the properties common to all PageObject classes:

    ·         Whether to delete the object

    ·         The object’s unique identifier

    ·         The object’s position on the page

    It then calls the abstract PageObject method SerializedObjectToXml. All classes derived from PageObject must implement this method, which serializes the specific attributes of each derived class.
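
    As a rough illustration of this design, here is a minimal Python sketch. The assembly itself is managed code, so this is an analogy and not its source. The class and method names mirror the ones described above; the XML element and attribute names (Object, Image, id, delete, x, y, path) are placeholders invented for the example and are not the actual import schema.

```python
from abc import ABC, abstractmethod
from xml.sax.saxutils import quoteattr


class ImportNode(ABC):
    """One node in the XML string handed to the import method."""

    def __init__(self):
        # New objects start out as "not yet imported".
        self.commit_pending = True

    @abstractmethod
    def serialize_to_xml(self) -> str:
        """Every derived class must know how to serialize itself."""


class PageObject(ImportNode):
    """Serializes what all page objects share, then defers to the subclass."""

    def __init__(self, object_id, x, y, delete=False):
        super().__init__()
        self.object_id, self.x, self.y, self.delete = object_id, x, y, delete

    def serialize_to_xml(self) -> str:
        # Common properties: delete flag, unique identifier, position.
        common = (
            f'<Object id={quoteattr(self.object_id)} '
            f'delete="{str(self.delete).lower()}" x="{self.x}" y="{self.y}">'
        )
        return common + self.serialized_object_to_xml() + "</Object>"

    @abstractmethod
    def serialized_object_to_xml(self) -> str:
        """Subclass-specific part of the serialization."""


class ImageObject(PageObject):
    def __init__(self, object_id, x, y, path):
        super().__init__(object_id, x, y)
        self.path = path

    def serialized_object_to_xml(self) -> str:
        return f"<Image path={quoteattr(self.path)}/>"


image = ImageObject("img-1", 36, 72, "C:\\pics\\logo.gif")
print(image.serialize_to_xml())
```

    The point of the split is that the shared serialization lives in one place, while each derived class only supplies the part that is specific to it.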

    The OutlineContent class

    The abstract OutlineContent class represents an object that is part of an outline. There are three classes derived from the OutlineContent class in the assembly:

    ·         HtmlContent, which represents text on a OneNote page, described in html format.

    ·         InkContent, which represents ink that is part of an outline.

    ·         ImageContent, which represents an image that is part of an outline.

    The SerializeToXml method is abstract in this class; each class derived from the OutlineContent class must implement the serialization process.

    The Data class

    The final abstract class, Data, represents the actual data of the object to be imported. For example, for an image, this would be either the path to the image file, or the base64-encoded data of the image itself. There are three concrete derived classes for the different types of data content an object can represent:

    ·         BinaryData, which represents ink or image data that is base64-encoded.

    ·         FileData, which represents the path to a source file.

    ·         StringData, which represents HTML content.

    The SerializeToXml method is abstract in this class; each class derived from the Data class must implement the serialization process.
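
    To make the three data flavors concrete, here is a small Python sketch of the same idea. It is an analogy only: the element names (Data, File, Html) are made up for the example and should not be read as the schema the real classes emit.

```python
import base64
from abc import ABC, abstractmethod
from xml.sax.saxutils import escape, quoteattr


class Data(ABC):
    """The payload of an imported object; each flavor serializes itself."""

    @abstractmethod
    def serialize_to_xml(self) -> str:
        ...


class BinaryData(Data):
    """Ink or image bytes, carried inline as base64 text."""

    def __init__(self, raw: bytes):
        self.raw = raw

    def serialize_to_xml(self) -> str:
        encoded = base64.b64encode(self.raw).decode("ascii")
        return f"<Data>{encoded}</Data>"


class FileData(Data):
    """A path to a source file; the bytes stay on disk."""

    def __init__(self, path: str):
        self.path = path

    def serialize_to_xml(self) -> str:
        return f"<File path={quoteattr(self.path)}/>"


class StringData(Data):
    """HTML content, escaped so it can travel inside the XML string."""

    def __init__(self, html: str):
        self.html = html

    def serialize_to_xml(self) -> str:
        return f"<Html>{escape(self.html)}</Html>"


for data in (BinaryData(b"ink"), FileData("C:\\pics\\logo.gif"), StringData("<b>hi</b>")):
    print(data.serialize_to_xml())
```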

    In the next entry, we'll look at creating objects.

    Read part two here.

  • Andrew May's WebLog

    XML Document Property Parsing in SharePoint (1 of 5): XML Parser Overview


    I've just finished putting together a lot of information around how document parsers work within Windows SharePoint Services V3, including how to use the built-in XML parser, and how to create your own custom parsers for custom file types. This material won't be included in the WSS SDK until the next major update, so I figured I'd give you a preview of it here. For the next five posts, I'm going to be covering how to use the built-in XML parser in WSS V3 to promote and demote document properties in your XML files, including InfoPath 2007 forms.

    So without further ado:

    WSS V3 includes a built-in XML document parser you can use to promote and demote the properties included in your XML documents. Your XML files can adhere to any schema you choose. As long as your XML file meets the requirements listed below, WSS V3 automatically invokes the built-in XML parser whenever document property promotion or demotion is required.

    (Property promotion refers to extracting document properties from a document, and writing those property values to the appropriate columns on the document library where the document is stored. Property demotion refers to taking column values from the document library where a document is stored, and writing those column values into the document itself.)

    Using the built-in XML parser for your custom XML files helps ensure that your document metadata is always up-to-date and synchronized between the document library and the document itself. Users can edit document properties in the document itself, and have the property values on the document library automatically updated to reflect their changes. Likewise, users can update property values at the document library level, and have those changes automatically written back into the document itself.

    For WSS V3 to invoke the built-in XML parser for an XML file, that XML file must meet the following requirements:

    ·         The file must have an extension of .xml.

    ·         The file must not be a WordML file. WSS V3 contains a separate built-in parser for WordML files; WSS V3 automatically invokes this parser for XML files created using WordML.

    Additionally, for the XML parser to actually promote and demote document properties, the XML file should be assigned a content type that specifies where each document property is located in the document, and which content type column that property maps to. (We'll talk about that in a later entry in this series.)

    XML Parser Processing

    The following is a brief overview of how the built-in parser operates:

    When a user uploads an XML document, WSS V3 examines the document to determine if the built-in XML parser should be invoked. If the document meets the requirements, WSS V3 invokes the parser to promote the appropriate document properties to the document library.

    Once invoked, the XML parser examines the document to determine the document content type. The parser then accesses the document's content type definition. The content type definition includes information about each column in that content type; this information can include:

    ·         The document property that maps to a given column, if there is one

    ·         The location where the document property is stored in the document itself

    Using this information, the XML parser can extract each document property from the correct location in the document, and pass these properties to WSS V3. WSS V3 then promotes the appropriate document property to the matching column included in the content type.

    Likewise, WSS V3 can also invoke the built-in XML parser to demote properties from the content type columns, on the document library, into the document itself. When WSS V3 invokes the demotion function of the parser, it passes the parser the document and the column values to be demoted into the document. Once again, the parser accesses the document's content type definition. The parser uses the content type definition to determine:

    ·         Which document properties map to the column values passed to it for demotion

    ·         The location of those document properties in the document

    Using this information, the parser writes the column values into the applicable document property locations in the document.

    Enabling Property Demotion

    For a document property to be demoted, the column to which it is mapped must be defined with its ReadOnly attribute set to "false".
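
    If it helps to see promotion and demotion side by side, here is a minimal Python sketch of the round trip. It is a simulation of the idea, not how WSS implements it: the CONTENT_TYPE dictionary, the column names, and the sample document are all hypothetical, and the paths are simple paths relative to the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical content type: column name -> where the property lives in the
# document (a path relative to the root element) and whether it is read-only.
CONTENT_TYPE = {
    "Title": {"node": "Header/Title", "read_only": False},
    "Author": {"node": "Header/Author", "read_only": True},
}


def promote(doc_root, content_type):
    """Read each mapped property out of the document (document -> columns)."""
    columns = {}
    for column, info in content_type.items():
        element = doc_root.find(info["node"])
        if element is not None:
            columns[column] = element.text
    return columns


def demote(doc_root, content_type, column_values):
    """Write column values back into the document (columns -> document)."""
    for column, value in column_values.items():
        info = content_type.get(column)
        if info is None or info["read_only"]:
            continue  # unmapped or read-only columns are not demoted
        element = doc_root.find(info["node"])
        if element is not None:
            element.text = value


doc = ET.fromstring(
    "<Report><Header><Title>Draft</Title><Author>Ann</Author></Header></Report>"
)
print(promote(doc, CONTENT_TYPE))
demote(doc, CONTENT_TYPE, {"Title": "Final", "Author": "Bob"})
print(promote(doc, CONTENT_TYPE))  # Title changes; the read-only Author does not
```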

    In my next post, we'll discuss how to use content types to specify XML document properties. Stay tuned.

  • Andrew May's WebLog

    What are Content Type IDs?


    So, a reader emailed me the other day, asking for more information on content type IDs: what they are, how to create your own, etc. Because he asked nicely, and because I like to keep both my readers happy, I decided to write up a quick overview of content type IDs.

    The truth is, this is something I had hoped to get into the Beta 2 SDK, but the clock ran out. So his question gave me the perfect excuse to write the material up for the next refresh of the SDK. Consider this a preview then.

    Because I’m planning on incorporating this material into the SDK, you’ll forgive me if I slip into my formal, developer documentation writing style.

    <Authoritative SDK Voice>

    Content type IDs uniquely identify the content type. Content type IDs are designed to be recursive. The content type ID encapsulates that content type’s “lineage”, or the line of parent content types from which the content type inherits. Each content type ID contains the ID of the parent content type, which in turn contains the ID of that content type’s parent, and so on, ultimately back to and including the System content type ID. By parsing the content type ID, you can determine which content types the content type inherits, and how two content types are related.

    Windows SharePoint Services V3 uses this information to determine the relationship between content types, and for push down operations.

    You can construct a valid content type ID using one of two conventions:

    ·         Parent content type ID + two hexadecimal values

    ·         Parent content type ID + “00” + hexadecimal GUID

    There is one special case, that of the System content type, which has the content type ID of “0x”. The System content type is the sealed content type from which all other content types ultimately inherit.

    For all other content types, you must use one of the above methods for constructing a valid content type ID.

    Note that if you use the first method, the two hexadecimal values cannot be “00”.

    A content type ID must be unique within a site collection.

    Let’s examine each of these conventions in turn.

    Windows SharePoint Services V3 uses the first method for generating content type IDs for the default content types that come included with the platform. For example, the content type ID of the Item content type, one of the most basic content types, is 0x01. This denotes that the Item content type is a direct child of System. The content type ID of the Document content type is 0x0101, and the Folder content type has a content type ID of 0x0120. By parsing these content type IDs, we can determine that both Document and Folder are direct children of Item, which in turn inherits directly from System:

    In this way you can determine not only what content types a content type inherits from, but at which point two content types have common ancestors.

    The figure below maps out the relationship of the four content types discussed above. In each, the unique portion of the content type ID is represented by blue text.

    Windows SharePoint Services V3 employs the second content type ID generation convention when creating content type IDs for:

    ·         Site content types you create based on other content types.

    ·         List content types, which are copied to a list when you add a site content type to that list.

    For example, if you have a content type with a content type ID of “0x010100D5C2F139-516B-419D-801A-C6C18942554D”, you would know that the content type was either:

    ·         A site content type that is a direct child of the Document content type, or

    ·         A list content type created when the Document site content type was added to a list.
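
    The manual parsing described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the platform's own parser: the function names lineage and common_ancestor are invented for the example, and any hyphens in the GUID portion are dropped before parsing so that a GUID always counts as 32 hexadecimal digits.

```python
def lineage(content_type_id: str) -> list[str]:
    """Return the chain of IDs from System ("0x") down to the given ID."""
    if not content_type_id.lower().startswith("0x"):
        raise ValueError("content type IDs start with 0x")
    digits = content_type_id[2:].replace("-", "")
    chain, pos = ["0x"], 0
    while pos < len(digits):
        # "00" marks the GUID convention: 00 + 32 hexadecimal GUID digits.
        # Anything else is the two-hexadecimal-digit convention.
        step = 34 if digits[pos:pos + 2] == "00" else 2
        if pos + step > len(digits):
            raise ValueError("truncated content type ID")
        pos += step
        chain.append("0x" + digits[:pos])
    return chain


def common_ancestor(first_id: str, second_id: str) -> str:
    """Return the closest content type ID that both IDs inherit from."""
    shared = [
        a for a, b in zip(lineage(first_id), lineage(second_id))
        if a.upper() == b.upper()
    ]
    return shared[-1]


print(lineage("0x0101"))                      # System, Item, Document
print(common_ancestor("0x0101", "0x0120"))    # Document and Folder meet at Item
print(lineage("0x010100D5C2F139-516B-419D-801A-C6C18942554D"))
```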

    In general, the first content type ID generation technique emphasizes brevity, in that it only takes two hexadecimal digits to denote a new content type. The second approach emphasizes uniqueness, as it includes a GUID to denote the new content type. Each approach is best in certain situations.

    We recommend you use the GUID approach to identify any content types that are direct children of content types you did not create. In other words, use the GUID approach if the parent content type is:

    ·         A default content type included in Windows SharePoint Services V3, such as Document.

    ·         A content type developed by a third party.

    That way, you are guaranteed that the content type ID is unique and will not be duplicated later by the developer of the parent content type.

    Once you’ve uniquely identified a content type in this manner, however, you can use the first method to identify any children of that content type. In essence, the GUID used in your content type can act as a de facto namespace for your content type. Any children based on that content type can be identified by just two hexadecimal digits. Because the maximum length of a content type ID is finite, this approach maximizes the number of content type “generations” allowable.

    Content type IDs have a maximum length of 512 bytes. Because two hexadecimal characters can fit in each byte, this gives each content type ID an effective maximum length of 1024 characters.

    For example, suppose you wanted to create a new content type, myDocument, based on the default Windows SharePoint Services V3 content type Document. For the myDocument content type ID, you start with the Document content type ID, 0x0101, and append 00 and a GUID. This uniquely identifies the myDocument content type, guaranteeing Windows SharePoint Services won’t later add another default content type with the same content type ID (which would be possible, if you had only appended two hexadecimal digits). To generate content type IDs for any content types you derive from myDocument, however, you can simply append two hexadecimal digits to the myDocument content type ID. This keeps the length of the content type ID to a minimum, thereby maximizing the number of content type “generations” allowable.

    The figure below illustrates this scenario. Again, the unique portion of each content type ID is represented by blue text.
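
    The myDocument scenario can also be worked through in code. The Python sketch below is illustrative only: the helper names are invented, the GUID is written without hyphens, and the length arithmetic assumes the "0x" prefix does not count toward the 1024-character limit, which the text above does not state either way.

```python
import uuid

MAX_HEX_DIGITS = 512 * 2  # 512 bytes, two hexadecimal digits per byte


def child_with_guid(parent_id: str) -> str:
    """Parent ID + "00" + GUID: for parents you did not create."""
    return f"{parent_id}00{uuid.uuid4().hex.upper()}"


def child_with_suffix(parent_id: str, suffix: int) -> str:
    """Parent ID + two hexadecimal digits (01 through FF; 00 is not allowed)."""
    if not 1 <= suffix <= 0xFF:
        raise ValueError("suffix must be between 0x01 and 0xFF")
    return f"{parent_id}{suffix:02X}"


def generations_left(content_type_id: str, digits_per_generation: int) -> int:
    """How many more descendants fit, given the digits each one adds."""
    used = len(content_type_id) - 2  # assumption: the "0x" prefix is not counted
    return (MAX_HEX_DIGITS - used) // digits_per_generation


my_document = child_with_guid("0x0101")        # Document + 00 + GUID
my_memo = child_with_suffix(my_document, 1)    # myDocument + 01
print(my_document)
print(my_memo)
print(generations_left(my_document, 2))    # two digits per generation
print(generations_left(my_document, 34))   # 00 + GUID per generation
```

    Under those assumptions, short suffixes leave room for hundreds of further generations below myDocument, while repeating the GUID convention at every level leaves room for only a few dozen.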

    </Authoritative SDK Voice>

    Now, the above information is probably most useful to developers working with the XML definition of content types. This way, if you’re looking at a content type ID in an XML file, or need to generate one for a content type definition file you’re writing, you’ll understand how to construct and parse them manually.

    The SharePoint object model, on the other hand, includes methods to parse and compare content type IDs. Specifically, you can use the SPContentTypeID.Parent method to find the parent of a content type without having to parse the content type ID yourself. The SPContentTypeID object also contains several methods that enable you to compare content types by ID, and identify a common ancestor.

    One last thing that might be of interest. If you want to take a look at actual content type IDs in WSS, here’s what you can do: navigate to the Content Type Gallery for a site. When you click on a content type, the URL to that content type contains a parameter, ctype, which is in fact the content type ID for that content type.


    Written while listening to: The Replacements : Let It Be

  • Andrew May's WebLog

    XML Document Property Parsing in SharePoint (2 of 5): Using Content Types to Specify XML Document Properties


    This is the second in a five-part series on how to use the built-in XML parser in WSS V3 to promote and demote document properties in your XML files, including InfoPath 2007 forms.

    Read part one here.

    When WSS V3 invokes the built-in XML parser to parse XML files, the parser uses the document's content type to determine which document properties map to which content type columns, and where those document properties are stored in the document itself. Therefore, to have WSS V3 use the built-in XML parser with your XML files, you must:

    ·         Create a content type that includes the necessary parsing information. For each document property you want promoted and demoted, include a field definition that includes the name of the document property that maps to the column the field definition represents, and where the document property is stored in the document.

    ·         Make sure that the content type ID is a document property that is demoted into the document itself. This ensures that the built-in XML parser can identify and access the correct content type for the document. (We’ll talk about this more in a later post.)

    Content Type Information for XML Parsing

    Document properties are promoted to and demoted from columns on the document library in which the document is stored. If the document is assigned a content type, these columns are specified in the content type definition. In the content type definition XML, each column included in the content type is represented by a FieldRef element.

    Note   Field elements represent columns that are defined for a site or list. FieldRef elements represent references to columns that are included in the content type. The FieldRef element contains column attributes that you can override for the column as it appears in this specific content type, such as the display name of the column, and whether it is hidden, or required on the content type. This information also includes the location of the document property to map to this column as it appears in the content type. This enables you to specify different locations for the document property that maps to the column in different content types.

    Because of this, to specify the information the built-in XML parser needs to promote and demote a document property, you must edit the FieldRef element that represents the document property's corresponding column in the content type definition.

    The figure below illustrates the actions the parser takes when an XML file is checked in to a document library. WSS V3 invokes the parser, which looks at the content type ID column to determine where in the document the document's content type ID is stored. The parser then looks inside the document for its content type at this location. The parser then examines the content type, to determine which FieldRef elements contain document property information. For each FieldRef element mapped to a document property, the parser looks for the document property at the location in the document specified in the matching FieldRef element. If the parser finds the document property at the specified location, it promotes that value to the matching column.

    When an XML document is first uploaded to a document library, the built-in XML parser must first determine the content type of the document, and whether that content type is associated with the document library.

    There are several attributes you can edit in a Field or FieldRef element to map that element to a document property and specify the location of the property in the document.

    First, the Field or FieldRef element must contain an ID attribute that specifies the ID of the column in the document library. For example:




    Next, add additional attributes to the Field or FieldRef element that specify the location of the document property in the document. Document properties can be stored in either:

    ·         The XML content of the document, or

    ·         The processing instructions of the document.

    The attributes you add to the Field or FieldRef element to specify the property location depend on whether the property is stored as XML content or processing instructions. These attributes are mutually exclusive; if you add an attribute that specifies a location in the XML content, you cannot also add attributes that specify a location in the processing instructions.

    To edit a column’s field definition schema programmatically, use the SPField.SchemaXML property.

    Specifying Properties in Document XML Content

    If you store the document property in the document as XML content, you specify an XPath expression that represents the location of the property within the document. Add a Node attribute to the Field or FieldRef element, and set it equal to the XPath expression. For example:





    Document Property Value Collections

    If you specify an XPath expression that returns a collection of values, you can also include an aggregation attribute in the Field or FieldRef element. The aggregation attribute specifies the action to take on the value set returned. This action can be either an aggregation function, or an indication of the particular element within the collection.

    Possible values include the following:

    ·         sum

    ·         count

    ·         average

    ·         min

    ·         max

    ·         merge

    ·         plaintext   Converts node text content into plain text.

    ·         first   Specifies that property promotion and demotion be applied to the first element in the collection.

    ·         last   Specifies that property promotion and demotion be applied to the last element in the collection.
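
    To show what these values do with a collection, here is a small Python simulation. It is a sketch of the concept rather than the parser's actual behavior: the sample document and path are hypothetical, the numeric functions assume the node text parses as a number, and merge and plaintext are left out because their exact output format is not described here.

```python
import xml.etree.ElementTree as ET
from statistics import mean


def _numbers(values):
    return [float(value) for value in values]


AGGREGATIONS = {
    "sum": lambda values: sum(_numbers(values)),
    "count": len,
    "average": lambda values: mean(_numbers(values)),
    "min": lambda values: min(_numbers(values)),
    "max": lambda values: max(_numbers(values)),
    "first": lambda values: values[0],
    "last": lambda values: values[-1],
}


def promote_collection(doc_root, node, aggregation):
    """Collect the text of every matching node, then reduce it to one value."""
    values = [element.text for element in doc_root.findall(node)]
    return AGGREGATIONS[aggregation](values)


doc = ET.fromstring(
    "<Order>"
    "<Item><Price>4.50</Price></Item>"
    "<Item><Price>10.00</Price></Item>"
    "<Item><Price>0.50</Price></Item>"
    "</Order>"
)
for name in AGGREGATIONS:
    print(name, promote_collection(doc, "Item/Price", name))
```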

    For example:






    Specifying Properties in Document Processing Instructions

    Because processing instructions need not be just XML, XPath expressions are insufficient to identify document properties stored in processing instructions. Instead, you must add a pair of attributes to the Field or FieldRef element that specify the processing instruction and processing instruction attribute you want to use as a document property:

    ·         Add a PITarget attribute to specify the processing instruction in which the document property is stored in the document.

    ·         Add a PIAttribute attribute to specify the attribute to use as the document property.

    For example:






    These attributes would instruct the parser to examine the following processing instruction and attribute for the document property value:

    <?mydocumenttype propertyAttribute="value"?>

    You can also add another pair of attributes, PrimaryPITarget and PrimaryPIAttribute. This attribute pair is optional. Like PITarget and PIAttribute, they work in unison to identify the location of the document property. However, if they are present, the built-in XML parser looks for the document property in the location they specify first. If there is a value at that location, the parser uses that value and ignores the PITarget and PIAttribute attributes. Only if the location specified by the PrimaryPITarget and PrimaryPIAttribute attributes returns a null value does the parser then look for the document property at the location specified by the PITarget and PIAttribute attribute pair.

    If you specify the PrimaryPITarget and PrimaryPIAttribute attributes, you must also specify PITarget and PIAttribute attributes. The parser only uses the PITarget and PIAttribute attributes if the processing instruction attribute specified by the PrimaryPITarget and PrimaryPIAttribute pair does not exist in the document, not if that attribute exists but is null or empty.
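
    The lookup order (primary location first, then the PITarget and PIAttribute pair) can be sketched as follows. This Python simulation makes a few assumptions worth calling out: the field dictionary stands in for the attributes on a Field or FieldRef element, the otherdocumenttype target and otherAttribute name are invented, only processing instructions at the top level of the document are examined, and "not found" means the processing instruction or the attribute is absent.

```python
import re
from xml.dom import minidom

# Processing instruction content is free text; by convention it often
# carries name="value" pairs, which this pattern picks out.
PSEUDO_ATTRIBUTE = re.compile(r'([\w:.-]+)\s*=\s*"([^"]*)"')


def read_pi_attribute(document, target, attribute):
    """Return the attribute value, or None if the PI or attribute is absent."""
    for node in document.childNodes:
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE and node.target == target:
            attributes = dict(PSEUDO_ATTRIBUTE.findall(node.data))
            if attribute in attributes:
                return attributes[attribute]
    return None


def read_property(document, field):
    """Try the primary location first, then fall back to the regular one."""
    if "PrimaryPITarget" in field:
        value = read_pi_attribute(
            document, field["PrimaryPITarget"], field["PrimaryPIAttribute"]
        )
        if value is not None:
            return value
    return read_pi_attribute(document, field["PITarget"], field["PIAttribute"])


field = {
    "PITarget": "mydocumenttype",
    "PIAttribute": "propertyAttribute",
    "PrimaryPITarget": "otherdocumenttype",   # hypothetical
    "PrimaryPIAttribute": "otherAttribute",   # hypothetical
}
only_regular = minidom.parseString(
    '<?mydocumenttype propertyAttribute="value"?><root/>'
)
both = minidom.parseString(
    '<?otherdocumenttype otherAttribute="preferred"?>'
    '<?mydocumenttype propertyAttribute="value"?><root/>'
)
print(read_property(only_regular, field))
print(read_property(both, field))
```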

    In my next post, we’ll discuss how the XML parser determines a document’s content type in the first place.

  • Andrew May's WebLog

    InfoPath Forms in Office SharePoint Server 2007


    I’m sure it comes as no surprise that Office InfoPath 2007 is the forms designer of choice for Office SharePoint Server 2007. But the average SharePoint developer, used to reaching for ASP.NET when he needs to create a form for SharePoint, might be surprised at all the places you can employ InfoPath to quickly create forms for enterprise management functions.

    The main reason for this is the fact that in InfoPath 2007 you can create what the InfoPath guys are calling symmetrical forms. Symmetrical forms are InfoPath forms that can be hosted both in Office client applications like Word, PowerPoint, and Excel, and in the SharePoint Server browser interface. So you can create a single form, and know it’ll look and work the same whether the user sees it in an Office application or in the browser interface.

    (How is this possible? Short version: SharePoint Server 2007 uses Office Forms Services, a server-based run-time environment for InfoPath 2007 forms, to host the forms in the browser. Office Forms Services consumes the forms you create in the InfoPath client and renders them in an ASP.NET framework, which acts as a run-time environment for the form. This environment presents a form editing experience that matches the InfoPath 2007 client application. The Office client applications, on the other hand, include the ability to host the native InfoPath forms. )

    So naturally, symmetrical forms are very handy for enterprise content management areas, where the user might be working with documents in their native client application, or online through the SharePoint Server browser interface.

    Here’re a couple of the places where you can employ InfoPath forms for enterprise content management in SharePoint Server 2007:

    ·         Document Information Panels

    A document information panel is a form that is displayed within the client application, and which contains fields for the document metadata. Document information panels enable users to enter important metadata about a file anytime they want, without having to leave the Microsoft Office system client application. For files stored in document libraries, the document information is actually the columns of the content type assigned to that file. The document information panel displays a field for each content type property, or column, the user can edit.

    You can create document information panels either from within SharePoint Server, or directly from InfoPath 2007.

    How to: Create or Edit a Custom Document Information Panel from within Office SharePoint Server 2007

    How to: Create a Custom Document Information Panel from InfoPath 

    ·         Workflow Forms

    You can create InfoPath forms for use with workflows in SharePoint Server 2007. That way, the user can interact with the workflow form from within the Office client application, and not just through the browser.

    SharePoint Server 2007 uses Office Forms Services to display workflow forms, be they association, initiation, modification, or edit task forms. The only difference is that there’s a different .aspx page hosting the Office Forms Services control for each type of workflow form. Initiation forms are hosted by a different .aspx page than modification forms, for example. Each different hosting page knows how to submit the information from its type of form to SharePoint Server (and hence, to the workflow engine).

    The .aspx pages that contain the Office Forms Services web part are included as part of SharePoint Server, of course.

    How to: Design an InfoPath Form for a Workflow in Office SharePoint Server 2007

    So if you’re putting together an enterprise content management solution in Office SharePoint Server, and you’d like your user to be able to interact with your custom forms in the client application and the browser, it might be worth your while to take a look at InfoPath 2007 and Office Forms Services.


    Written while listening to Johnny Cash : Personal File

  • Andrew May's WebLog

    Document Parsers in SharePoint (1 of 4): Overview


    Now that I’ve talked about the built-in XML parser, and how you can use it to promote and demote document properties for XML files, you might be thinking: what about custom file types that aren’t XML? What if I’ve got proprietary binary file types from which I want to promote and demote properties to the SharePoint list?

    We’ve got you covered there as well.

    For the next four entries, I’m going to go over in detail how to construct and register a custom parser that enables you to promote and demote properties between your custom file types and Windows SharePoint Services.

    This information will get rolled into the next update of the WSS SDK, so consider this a preview in case you want to work with the parser framework right now.

    Custom Document Parser Overview

    Managing the metadata associated with your document is one of the most powerful advantages of storing your enterprise content in WSS. However, keeping that information in synch between the document library level and in the document itself is a challenge. WSS provides the document parser infrastructure, which enables you to create and install custom document parsers that can parse your custom file types and update a document for changes made at the document library level, and vice versa. Using a document parser for your custom file types helps ensure that your document metadata is always up-to-date and synchronized between the document library and the document itself.

    A document parser is a custom COM assembly that, by implementing the WSS document parser interface, does the following when invoked by WSS:

    ·         Extracts document property values from a document of a certain file type, and passes those property values to WSS for promotion to the document library property columns.

    ·         Receives document properties, and demotes those property values into the document itself.

    This enables users to edit document properties in the document itself, and have the property values on the document library automatically updated to reflect their changes. Likewise, users can update property values at the document library level, and have those changes automatically written back into the document.

    I’ll talk about how WSS invokes document parsers, and how those parsers promote and demote document metadata, in my next entry.

    Parser Requirements

    For WSS to use a custom document parser, the document parser must meet the following conditions:

    ·         The document parser must be a COM assembly that implements the document parser interface.

    I’ll go over the details of the IParser interface in a later entry.

    ·         The document parser assembly must be installed and registered on each front-end Web server in the WSS installation.

    ·         You must add an entry for the document parser in DOCPARSE.XML, the file that contains the list of document parsers and the file types with which each is associated.

    And I’ll give you the specifics of the document parser definition schema in a later entry as well. All in good time.

    Parser Association

    WSS selects the document parser to invoke based on the file type of the document to be parsed. A given document parser can be associated with multiple file types, but you can associate a given file type with only one parser.

    To specify the file type or types that a custom document parser can parse, you add a node to the DOCPARSE.XML file. Each node in this file identifies a document parser assembly, and specifies the file type for which it is to be used. You can specify a file type by either file extension or program ID.

    If you specify multiple document parsers for the same file type, WSS invokes the first document parser in the list associated with the file type.
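
    To make the selection rule concrete, here is a minimal Python sketch of the lookup described above. It is only a model: the registry list, the parser names, and the select_parser function are all hypothetical, and it does not reflect the actual DOCPARSE.XML schema or how WSS implements the lookup.

```python
from typing import Optional

# Hypothetical registry, in the order the entries would appear in the parser list.
# Each entry pairs a made-up parser name with the file extensions it claims.
REGISTRY = [
    ("Contoso.CadParser", ["dwg", "dxf"]),
    ("Contoso.LegacyParser", ["dxf"]),  # duplicate claim on "dxf"; never selected
]


def select_parser(file_name: str, registry=REGISTRY) -> Optional[str]:
    """Return the first registered parser that claims the file's extension."""
    extension = file_name.rsplit(".", 1)[-1].lower() if "." in file_name else ""
    for parser_name, extensions in registry:
        if extension in extensions:
            return parser_name  # first match wins; later entries are ignored
    return None  # no parser is associated with this file type
```

    In this model, the second entry's claim on "dxf" has no effect, which is why registering two parsers for the same file type is best avoided.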

    WSS includes built-in document parsers for the following file types:

    ·         OLE: includes DOC, XLS, PPT, MSG, and PUB file formats

    ·         Office 2007 XML formats: includes DOCX, DOCM, PPTX, PPTM, XLSX and XLSM file formats

    ·         XML

    ·         HTM: includes HTM, HTML, MHT, MHTM, and ASPX file formats

    You cannot create a custom document parser for these file types. With the XML parser, you can use content types to specify which document properties you want to map to which content type columns, and where the document properties reside in your XML documents.

    Parser Deployment

    To guarantee that WSS is able to invoke a given parser whenever necessary, you must install each parser assembly on each front-end Web server in your WSS installation. Because of this, you can specify only one parser for a given file type across a WSS installation.

    The document parser framework does not include the ability to package and deploy a custom document parser as part of a SharePoint feature.

    In my next post, I’ll discuss how the document parser actually parses documents and interacts with WSS.

  • Andrew May's WebLog

    OneNote: An In-Depth Look at the OneNoteImporter Managed Assembly (Part 3 of 5)


    In this series of entries, we're taking an in-depth look at the OneNoteImporter managed class, which provides an object model interface for the programmability functionality added in OneNote 2003 SP1.

    Read part one here.

    Read part two here.

    Importing Objects into OneNote

    The actual creation of the XML import document, and importing the page contents, takes place when you call the Page.Commit method. This method, in turn, invokes a number of methods in other OneNoteImporter objects. Because of this method’s importance and complexity, it’s worth examining how the method functions.

    First, the code checks to see if the page has changed in any way from the last time it was imported. It does this by determining if the Page object’s CommitPending property is set to True. If it is, it calls the SimpleImporter.Import method.

    The code calls the Page.ToString method to generate the XML string it passes to the Import method. The ToString method in turn calls the Page.SerializeToXml method.

    This begins a series of recursive calls to the SerializeToXml methods of the various objects. Each object’s SerializeToXml method includes instructions to call the SerializeToXml method of any child objects, and append the resulting XML to the parent element. This in turn invokes the SerializeToXml method of any child objects the original child object might have, and so on, until the entire page structure has been serialized to XML in a single XML document.
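
    The recursive pattern is easier to see in a few lines of code. The following Python sketch is an illustration only: the Node class and the element names are hypothetical stand-ins, not the OneNoteImporter classes or the real import schema. It shows each object creating its own element and then asking its children to append theirs.

```python
import xml.etree.ElementTree as ET


class Node:
    """Hypothetical stand-in for a OneNoteImporter object that can have children."""

    def __init__(self, tag, **attributes):
        self.tag = tag
        self.attributes = attributes
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

    def serialize_to_xml(self, parent=None):
        # Create this object's element, attached to the parent element if there is one.
        if parent is None:
            element = ET.Element(self.tag, self.attributes)
        else:
            element = ET.SubElement(parent, self.tag, self.attributes)
        # Recurse: each child appends its own XML (and its children's) to this element.
        for child in self.children:
            child.serialize_to_xml(element)
        return element


# Build a small tree and serialize it with a single call on the root.
page = Node("Page", title="Notes")
outline = page.add(Node("Outline"))
outline.add(Node("Html"))
outline.add(Node("Image"))
xml_text = ET.tostring(page.serialize_to_xml(), encoding="unicode")
```

    One call on the root object is enough; the nesting of the resulting XML mirrors the nesting of the objects.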

    The Page.SerializeToXml method begins by creating a new XmlDocument object, and generating <Import> and <EnsurePage> elements and adding them to the document. Page object property values are used to set the various attributes of the <EnsurePage> element.

    Note that the Commit method generates import XML with both <EnsurePage> and <PlaceObjects> elements for that page. Specifying an <EnsurePage> element for a page guarantees that the page exists before OneNote attempts to import objects onto it. So if your application includes a scenario where you only want to import objects onto a page if the page already exists, you’ll need to modify this method, or use another means.

    The code then generates a <PlaceObjects> element. For each of the Page object’s children whose CommitPending property is set to True, the code calls the PageObject.SerializeToXml method.

    If the page object’s DeletePending property is set to True, the PageObject.SerializeToXml method generates a <Delete> element. If not, the method does three things:

    ·         Generates a <Position> element, whose attributes are set according to Position object property values.

    ·         Calls the SerializeObjectToXml method for the specific PageObject-derived class involved, i.e., ImageObject, InkObject, or OutlineObject.

    ·         Calls the SerializeToXml method for the specific OutlineContent-derived class involved, i.e., HtmlContent, InkContent, or ImageContent.

    Executing the SerializeToXml method for each of these content types includes a call to the SerializeToXml method for the Data-derived object they contain: BinaryData, FileData, or StringData. In this way, the entire page structure is serialized to XML in a single XML document.

    Note that the HtmlContent.SerializeToXml method includes a call to another internal method of that same object, called CleanHtml. The CleanHtml method reads through the HTML string or file data and makes sure the HTML is formatted in a way that OneNote accepts. It identifies and replaces problematic formatting with characters that OneNote can process. For example, the CleanHtml method wraps the HTML string with the appropriate <html> and <body> tags if the HTML lacks them.
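
    As a rough illustration of the wrapping step only, here is a small Python sketch. It is not the CleanHtml implementation: the wrap_html function is hypothetical, it ignores the other clean-up work the method does, and it assumes that any text already containing an <html> tag is a complete document.

```python
import re


def wrap_html(fragment: str) -> str:
    """Wrap an HTML fragment in <body> and <html> tags when they are missing."""
    text = fragment.strip()
    has_html = re.search(r"<html[\s>]", text, re.IGNORECASE)
    has_body = re.search(r"<body[\s>]", text, re.IGNORECASE)
    if has_html:
        return text  # simplifying assumption: already a complete document
    if not has_body:
        text = f"<body>{text}</body>"
    return f"<html>{text}</html>"
```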

    The serialization of the page nodes is now complete. If the Page object had no children, the <PlaceObjects> element remains empty. In such a case, the Page.SerializeToXml method does not append it to the <Import> element.

    Finally, the Page.SerializeToXml method determines the appropriate namespace designation and adds it to the <Import> element.

    The ToString method then takes the XmlDocument object, saves it as a text stream, converts it to a string, and passes it back to the SimpleImporter.Import method. This Import method uses the XML string to import the specified content into OneNote.

    Now that the content has been imported into OneNote, the Commit method performs some vital housekeeping. Using the RemoveChild method, it removes any of the Page object’s children that have their DeletePending property set to True. It then sets the private committed field to True, thereby making the Date, PreviousPage, RTL, and Title properties read-only. You cannot change these attributes once you import a page into OneNote.

    Lastly, it sets the CommitPending property of the Page to False. This in turn sets the CommitPending properties of all the Page object’s remaining children to False as well.
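
    To summarize the whole flow, here is a compact Python model of what Commit does. It is a sketch under stated assumptions, not the OneNoteImporter source: the class and member names are Python-style stand-ins, import_xml is a hypothetical callable playing the role of SimpleImporter.Import, and to_string returns placeholder text rather than real import XML.

```python
class PageObject:
    """Stand-in for an object placed on a page."""

    def __init__(self, name):
        self.name = name
        self.commit_pending = True
        self.delete_pending = False


class Page:
    """Stand-in for the Page object, reduced to the flags Commit works with."""

    def __init__(self, title):
        self.title = title
        self.children = []
        self.commit_pending = True
        self.committed = False

    def to_string(self):
        # Placeholder text; the real method serializes the page to import XML.
        pending = [child.name for child in self.children if child.commit_pending]
        return f"{self.title}: {', '.join(pending)}"

    def commit(self, import_xml):
        if not self.commit_pending:
            return False  # nothing changed since the last import
        import_xml(self.to_string())
        # Housekeeping: drop deleted children, lock the page, clear pending flags.
        self.children = [c for c in self.children if not c.delete_pending]
        self.committed = True
        self.commit_pending = False
        for child in self.children:
            child.commit_pending = False
        return True
```

    Calling commit a second time without further changes returns immediately, which mirrors the CommitPending check at the start of the real method.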

    In part four, we'll examine the internal method calls of the Commit method.

    Read part four here.

  • Andrew May's WebLog

    The OneNote 1.1 SimpleImport XML Schema


    The Office 2003: XML Reference Schemas download is now available. The download includes a copy of the OneNote import schema, as well as the complete element, type, and attribute documentation we posted online a while back. It also includes an introduction detailing how to use the schema to import content into OneNote.

    To view the OneNote schema reference help online, see the Microsoft Office OneNote 2003 Software Development Kit (SDK) on the Office Developer Center. The SDK contains reference documentation for the OneNote SimpleImport XML Schema, including descriptions of elements and types.

  • Andrew May's WebLog

    Creating Personalized Thank You Cards with Publisher 2003 (Part 1 of 4)


    Well, it’s spring time again, and of course that means one thing: Tech Ed. But rather than focus on the definitive Microsoft conference for building, deploying, securing, and managing connected solutions, I’d like to take a few entries and address another industry that gears up every spring.

    That’s right, it’s wedding season.

    So over the next few entries I’d like to discuss something very cool my wife and I came up with for our wedding: using Publisher to create personalized Thank You cards for wedding gifts. We created an individual Thank You card for everyone who sent us a gift. Printed inside each card was a picture of us holding that person's particular gift, along with printed comments specific to that person.

    We personalized the Thank You because we wanted people to know how important they were to us and how much their gifts meant. Also, since our registry was online, and we had family and friends all over the country, a lot of the gifts got ordered online and shipped directly from the store. In many cases, the people buying the gifts had never really seen what they were sending us.

    Pretty cool, huh? Trust me, people loved these cards.

    And the best thing was, using Publisher’s catalog merge functionality, it was easy. We did all this through the user interface; I didn’t have to write a single line of code to customize or extend Publisher’s functionality. (So consider this a warning: the following entries don’t contain a single line of code.)

    Unnecessary Backstory

    Anyway, about six months ago, I got married. (That in itself is a success story, but not one in which Microsoft products played a large part. I assume. Maybe she created a mathematically-weighted list of my pros and cons in Excel and the balance sheet came out in my favor. I don't know. But if that's the case, she should probably check her math.) No, the cool thing we did with Publisher was to create personalized Thank You cards.

    Once we decided to be married, we actually realized that we’d have to go through the process of getting married. For some unknown reason, we decided not to elope, and instead opted for a simple wedding. ‘Simple wedding’, of course, being equivalent to ‘military intelligence’ on the oxymoron rating scale.

    Simple as our wedding was, most of the planning fell to my fiancée. Once I realized the incredible stress and pressure she was under from dealing with the marital-industrial complex, I resolved to help and support her in any way humanly possible that didn't involve actually dealing with any aspects of the planning process that didn't interest me. In short, I began looking for those activities that would let me continue to dink on the computer while giving the appearance of being an active participant in planning my own wedding. I spent long hard hours on such tasks as generating facility expense comparisons, keeping our invitation address database current, maintaining our website, and much more. I did this incidentally to move our plans for the blessed day forward, but primarily just to avoid being dragged into interminable discussions of floral arrangement plans and linen design options.

    (And trust me, the wedding was much better for my lack of involvement. One of the few times I actually rendered an opinion, it nearly doomed the whole damn thing. Just remember: at weddings, people want and expect cake. Any cake. People do not want wonderful and expensive lemon torte with fresh-picked organic berries--unless it is served on cake. If you tell people ahead of time there will be no cake, they will revolt, and threaten to bring their own cake and serve it tail-gate style out in the reception parking lot. Seriously. Cake.)

    Publisher 2003 was great for stuff like this. I created our website in Publisher; I designed our Save the Date cards and the wedding program in Publisher. I even used Publisher's mail merge feature to address the invitations. Those all turned out looking professional and pleased my fiancée immensely. But the one thing I'm particularly proud of was our Thank You cards.

    We used Publisher's catalog merge functionality. It was easy. Here are the basic steps, each of which I discuss in more detail later:

    ·         Take digital photos of the happy couple with each gift

    ·         Design the outside of your Thank You card

    ·         Create the data source for the catalog merge

    ·         Design the inside of the card as a catalog merge area

    ·         Perform the catalog merge

    ·         Print the cards

    In my entry tomorrow I’ll cover the first two of these steps.

  • Andrew May's WebLog

    Creating Personalized Thank You Cards with Publisher 2003 (Part 2 of 4)


    Now that we’ve covered the basics of the project, let’s jump into the details:


    Here are the household items you’ll need for this project:

    ·         Elmer’s glue (or paste)

    ·         Safety scissors

    ·         A 6-inch length of string

    ·         Glitter

    Oh wait, that’s a different arts and crafts project. The tools for this one are a little more upscale:

    ·         Microsoft Office Publisher 2003

    ·         A digital camera capable of downloading images to a computer

    ·         A color printer capable of producing photo-quality prints (or access to one at work)

    ·         A computer (preferably a laptop or Tablet PC)

    ·         A paper cutter (you can substitute safety scissors if you must)

    Now, if you don’t already have these basic household items, don’t be afraid to include them in your wedding budget, especially if her father’s picking up the tab. This works best if you don’t call them out in the budget as individual line items. Group them with similar expenses. For example, go ahead and roll the cost of the digital camera into the “Wedding party corsages” line item, or expense the photo-quality color printer under “Table decorations (misc.).”

    Bribe a Friend

    Actually, I forgot an additional item you’ll need for this project: a friend willing (or willing to be bribed) to take the pictures of the happy couple with each gift. If they can work a digital camera, even better. Alcohol, chocolate, and/or food make excellent inducements, depending on the friend. One piece of advice, though: don’t be too free with the alcohol until all the pictures are taken. Also, things go most smoothly when you unwrap and sort the presents before the camera person gets there. A laptop’s great for entering the gift information in the spreadsheet as you open each gift.

    One thing we hadn’t anticipated was how many people would be giving us the universal gift: cash. Or gift cards/certificates to the stores included on our registry. Since we didn’t feel that a picture of us fondling a gift check or rolling in a pile of low-denomination bills was appropriate, we included a picture of us on our wedding day for these generous souls.

    Now granted, you could take the pictures with a conventional camera, and scan them into the computer later. But the great thing about digital is there’s no wasted film, you can be sure you’ve got the photo you want as soon as you take it, and you don’t have to get the photos developed. And in the end, isn’t your wedding worth it?

    (If she buys that argument, my work here is done.)

    Once you’ve got the pictures taken, download them to the computer.

    Design the Card Exterior

    You’ll use Publisher for the rest of the steps. First, design the outside (front and back) of the card. In our case, we made the cards 5.5 by 8.5 (5.5 by 4.25 when folded) so that two would fit on a regular 8.5 by 11 inch sheet. So the basic layout looked like this:

    Remember that anything on the back of the card needs to be upside-down in your layout, so that it’s right-side up once the card has been cut and folded. To flip shapes, including text boxes, select the shape and then, from the Arrange menu, click Rotate or Flip, and then click Flip Vertical.

    Then print as many as you’ll need.

    The ruler guides mark quarter-inch borders around the actual faces of each card. Once the cards were done, we cut them on the center vertical ruler, and folded them along the center horizontal guide. Just don’t cut yours yet, because we still need to print the personalized inside of the card.

    Which we’ll cover tomorrow.

  • Andrew May's WebLog

    Creating Personalized Thank You Cards with Publisher 2003 (Part 3 of 4)


    Now that we’ve got the card exterior designed, we’re ready to work on the catalog merge.

    Create the Data Source

    Next, create the data source for the cards. We used the same one we used for addressing the invitations. We just added several fields to the spreadsheet:

    ·         Salutation: The informal name you use for the person, like “Uncle Jim and Aunt Jane”, as opposed to the formal name you’d use on the wedding invitation.

    ·         Gift?: This is just a simple yes/no field. You’ll use it to filter the records you’ll use to perform the catalog merge.

    ·         Gift: Informal description of whatever the gift actually was.

    ·         General comment: Whatever heart-felt sentiment you want to say. Because Publisher only prints the first 256 characters in a field, I added two comment fields next to each other. Each field is roughly enough for two or three short sentences.

    ·         General comment 2: See above.

    ·         Picture: The path to the picture of you with this gift.

    ·         Record merged?: If you’re not going to merge and print the cards all at once, you might want to include a data field to designate whether or not the card has been included in a merge. Just remember to change the value from False to True once you create the merge and print it. That way you can sort out the printed cards, so they won’t be included in any subsequent merge.

    Fill in the data source records, and you’re ready to create the card interior itself.
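
    If you keep the guest list in something you can script, the two fiddly parts of this data source (splitting a long comment across the two comment fields, and picking out the records that still need a card) can be sketched in a few lines of Python. This is only an illustration: the function names are made up, the records are plain dictionaries keyed by the field names above, and the 256-character limit is taken from the description above rather than checked against Publisher.

```python
MAX_FIELD_CHARS = 256  # per-field limit mentioned above


def split_comment(comment, limit=MAX_FIELD_CHARS):
    """Split a comment into two fields of at most `limit` characters, breaking at a space."""
    if len(comment) <= limit:
        return comment, ""
    # Last space that keeps the first part within the limit.
    cut = comment.rfind(" ", 0, limit + 1)
    if cut <= 0:
        cut = limit  # no usable space; fall back to a hard cut
    first, second = comment[:cut], comment[cut:].lstrip()
    if len(second) > limit:
        raise ValueError("comment is too long for two fields")
    return first, second


def records_to_merge(records):
    """Keep records that have a gift and have not been merged yet."""
    return [r for r in records if r["Gift?"] and not r["Record merged?"]]
```

    The first function fills "General comment" and "General comment 2"; the second applies the same filter you would otherwise set up by hand before the merge.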

    Design the Card Interior

    Below is the design I came up with for the inside of our Thank You cards. Notice that the inside of the card is one large catalog merge area, sized so that it repeats twice per page. I placed the picture of us with the gift on the inner side of the card front, with the personal thank you comments underneath. But that’s hardly the only way to do it. Play around, see what works for you.

    I’ve marked the merge fields blue in the screen shot below to emphasize them, so you can see what they look like before the merge happens. The merge fields don’t appear in blue normally.


    For the purpose of this blog, I’m assuming you know how to create and perform a catalog merge. If you don’t, take a look at these Office Online resources:

    Create a catalog merge

    Demo: Catalog merge turns data into designs

    Create catalogs or directories

    Or, for you programming types, I just happen to have written a few articles on the topic:

    Create Catalogs with Publisher 2003

    Sort and Filter Your Data for Mail or Catalog Merges in Publisher 2003

    Tomorrow, we bring it all together.

  • Andrew May's WebLog

    Animating Shapes in PowerPoint 2000 and 97 (Part 1)


    So, while I was out of town, my first article dealing with PowerPoint animation made its debut:

    Comparing Ways to Control Animation in PowerPoint 2002 and 2003

    As I’ve mentioned in previous blog entries, there are actually two ways to programmatically animate shapes on a slide: one that works best for PowerPoint 2002 and 2003, and one that’s been retained for compatibility with PowerPoint 2000 and 97. This article examines the differences between the two, and discusses when using each is appropriate.

    Slated for publication next week is the first of a two-part article that covers how to use the revised and greatly expanded animation functionality included in PowerPoint 2002 and 2003.

    Now, while my articles naturally focus on the advantages of using the 2002 and 2003 functionality, in order to write these articles I had to learn how to use the 2000 and 97 functionality as well. And since there’s a large number of developers out there still working with those versions, it seems worthwhile to take a few minutes and present that information here. So for the next few days, I’ll be covering how to animate shapes using the AnimationSettings object.

    Applies to:

        Microsoft PowerPoint 2000

        Microsoft PowerPoint 97


    In PowerPoint, the term animations refers to special visual or sound effects you add to text or an object. Animation effects give motion to text, pictures, and other content on your slides. Besides adding action, they help you guide audience focus, emphasize important points, transition between slides, and maximize slide space by moving things on and off.

    These effects can include how the shape (or its component parts) enters the slide, what the shape does once it appears on the slide, and how it appears once the animation sequence moves to the next shape. You can set the animation sequence to advance to the next animation effect when the user clicks the slide, or according to preset timing settings.

    Important If you are programming for PowerPoint versions 2002 or 2003, you should be using the TimeLine object model for dealing with animation effects; the AnimationSettings object model should only be used for PowerPoint versions 2000 and 97. The two object models are not compatible. Using the AnimationSettings object model for programming PowerPoint 2002 or 2003 is not recommended, as it can have unexpected and undesirable results for your animation sequences. For more information see Comparing Ways to Control Animation in PowerPoint 2002 and 2003.

    Animations in PowerPoint 2000 and 97

    In PowerPoint 2000 and 97, each Shape object on a slide can have a single animation effect, represented by that Shape object’s AnimationSettings object. You create and customize animations using the AnimationSettings object’s members and child objects. While each Shape object can have only one animation effect, in the case of charts or text, the animation effect may be a build effect, in which sub-objects of the shape are animated sequentially so that they combine, or build, to display the complete shape. We discuss build effect animations in greater detail later.

    Shapes that are not animated appear on the slide when it first loads.

    The individual animation effects for each shape collectively make up a slide’s animation sequence. There is one animation sequence per slide, and it starts when the slide loads.

    Setting the Type, Timing, and Order of a Slide’s Animation Sequence

    When you decide to add animation effects to a slide, the general questions you need to answer concern the sequence and timing of the shapes you want to animate:

    ·         Which shapes do you want to animate?

    ·         What kind of animation do you want PowerPoint to perform on each shape?

    ·         In what order do you want PowerPoint to animate the shapes?

    ·         How do you want to initiate each shape’s animation effect?

    Setting Which Shapes Get Animated

    For PowerPoint to animate a shape, you set the Animate property of its AnimationSettings object to msoTrue. PowerPoint automatically sets the Animate property to msoTrue in either of the following instances:

    ·         You set the TextLevelEffect property to a value other than ppAnimateLevelNone.

    ·         You set the EntryEffect property to a value other than ppEffectNone.

    The converse is also true. PowerPoint automatically sets the Animate property to msoFalse if you set the TextLevelEffect property to ppAnimateLevelNone, or set the EntryEffect property to ppEffectNone.

    Even if you have set other properties of the AnimationSettings object, PowerPoint disregards them and does not animate the shape unless the Animate property is set to msoTrue.
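
    Because the interplay between these properties is easy to get backwards, here is a toy Python model of the rules just described. It is not the PowerPoint object model: the class is hypothetical, the constants are plain strings, and it captures only the coupling between the effect properties and Animate.

```python
class AnimationSettingsModel:
    """Toy model of the property coupling described above; not the PowerPoint API."""

    def __init__(self):
        self.animate = False
        self._entry_effect = "ppEffectNone"
        self._text_level_effect = "ppAnimateLevelNone"

    @property
    def entry_effect(self):
        return self._entry_effect

    @entry_effect.setter
    def entry_effect(self, value):
        self._entry_effect = value
        # Setting an effect switches Animate on; setting "none" switches it off.
        self.animate = value != "ppEffectNone"

    @property
    def text_level_effect(self):
        return self._text_level_effect

    @text_level_effect.setter
    def text_level_effect(self, value):
        self._text_level_effect = value
        self.animate = value != "ppAnimateLevelNone"

    def will_animate(self):
        # Every other setting is ignored unless Animate is true.
        return self.animate
```

    The order of assignments matters in this model: setting an effect to its "none" value clears Animate, so code that wants both has to set Animate afterward.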

    Shapes with an EntryEffect property of ppEffectNone are visible when the slide loads.

    There are instances when you would want to set the EntryEffect property to ppEffectNone. For example, you could have a media object you want visible on the slide when it loads, even if you do not want to play the file until the third animation in the sequence. The following code does just that. Note that the code explicitly sets the Animate property to msoTrue after it specifies the EntryEffect property as ppEffectNone.

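    Dim objShape As Shape

    With ActivePresentation.Slides(1).Shapes

        ' Animate the title first.
        Set objShape = .Item("Title")
        With objShape.AnimationSettings
            .EntryEffect = ppEffectBlindsHorizontal
            .AnimationOrder = 1
        End With

        ' Animate the second shape next.
        Set objShape = .Item("Shape1")
        With objShape.AnimationSettings
            .EntryEffect = ppEffectFlyFromLeft
            .AnimationOrder = 2
        End With

        ' Keep the movie visible from the start, but still include it in the sequence.
        ' Setting EntryEffect to ppEffectNone turns Animate off, so turn it back on.
        Set objShape = .Item("MovieClip")
        With objShape.AnimationSettings
            .EntryEffect = ppEffectNone
            .Animate = msoTrue
            .AnimationOrder = 3
        End With

    End With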

    Specifying the Animation Effect PowerPoint Performs on a Shape

    Use the EntryEffect property to specify which animation effect you want PowerPoint to perform on the selected shape. All animation effects created using the AnimationSettings object are entrance effects. Entrance effects are animations that control how the shape becomes visible on the slide. For example, this could involve having the shape appear to move onto the slide from outside the slide boundaries, such as flying in from the right edge of the slide; or having the shape become visible in place in a particular manner, such as dissolving into visibility.

    By default, if you do not set an animation effect for a shape, PowerPoint uses the ‘Appear’ effect, ppEffectAppear. The code example in the following section demonstrates this.

    Tomorrow, we’ll tackle setting the animation order of the shapes on a slide, and how to trigger a shape’s animation effect.

  • Andrew May's WebLog

    Transferring Publisher custom color schemes between users and computers


    I’ve spent some time the last few weeks helping my wife come up with a consistent corporate identity for her business: letterhead, business cards, and all that. (Of course, I’m the one that suggested the overhaul in the first place; that way, I get to play on the computer, and get points for being a supportive spouse. How can you beat that?)

    For a big part of this, of course, I’m working in Publisher. I wanted to take advantage of Publisher’s color schemes feature to create a set of custom colors for her business. And that’s when I discovered a few interesting things about transferring custom color schemes between users on the same computer, or different computers.

    The custom color schemes you create are not stored within the Publisher publication itself. Instead, they’re stored in a separate file, named custcols.scm, in your user directory. For example:

    C:\Documents and Settings\username\Application Data\Microsoft\Publisher\custcols.scm

    All the custom color schemes you create as a user are stored in this one file.

    Which leads to two limitations of color schemes, if you work on multiple computers, or have multiple users working with the same custom color scheme on the same machine:

    ·         If you open the publication on a different computer, the custom color scheme does not automatically get added to the color schemes available in Publisher on that computer.

    ·         If another user logs onto the same computer and opens Publisher, the color schemes you created are not available to them. Likewise, any color schemes they create won’t be available to you.

    Now, you can transfer custom color schemes by simply copying that custcols.scm file from one user directory to another, or even one computer to another. So once I’ve created the custom color scheme for my wife’s business, I can simply copy it into her user directory, and she’s ready to go. Likewise, she can copy it to her user directory on her machine at her business.

    A few important caveats apply, though:

    Publisher stores all the custom color schemes you’ve created in this one file. You can’t pick and choose which ones to transfer. Also, overwriting the custcols.scm file in this way will nuke any custom color schemes the user might already have in their existing custcols.scm file.

    Also, remember that the files aren’t linked in any way, so there’s no dynamic updating. If I change a custom color scheme while working in Publisher, I’d have to overwrite the custcols.scm file in my wife’s user directory again in order for her to use the updated scheme.
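
    If you find yourself doing this copy more than once, a small script can at least protect the schemes that are about to be overwritten. The Python sketch below is one way to do it, not a Publisher feature: the function name and the .bak backup file are my own conventions, and you supply the real paths to the two user directories yourself.

```python
import shutil
from pathlib import Path


def transfer_color_schemes(source_file, target_dir):
    """Copy custcols.scm into target_dir, backing up any existing file first."""
    source_file, target_dir = Path(source_file), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target_file = target_dir / "custcols.scm"
    backup_file = None
    if target_file.exists():
        # Keep the schemes that are about to be overwritten.
        backup_file = target_dir / "custcols.scm.bak"
        shutil.copy2(target_file, backup_file)
    shutil.copy2(source_file, target_file)
    return backup_file  # None when there was nothing to back up
```

    The backup does not merge anything; it only means the other user's old schemes can be restored by copying the .bak file back.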

    A slightly more labor-intensive, but more precise, way to transfer a custom color scheme is to use a Publisher file to transfer an individual color scheme. When you apply a custom color scheme to a publication, and then open that file using a different user and/or computer, the color scheme isn’t automatically added to the schemes available. However, the custom color scheme information still resides in the publication, and can be saved to the custcols.scm file.

    Here’s how you do it:

    1.      Define the custom color scheme and apply it to the publication.

    2.      Save the publication.

    3.      Open the publication on a different computer, or open it as a different user on the same computer.

    4.      In the Publisher task pane, click on Custom color scheme.

    5.      In the Color Scheme dialog box, on the Custom tab, the various colors of the custom color scheme are shown in the Current column. Without making any changes, click Save Scheme.

    6.      Enter the name of the custom color scheme, and click OK twice.

    Only the current scheme applied to the publication can be transferred in this way. However, since the custom color scheme is added to those currently saved in the user’s custcols.scm file, no previously-defined custom schemes are overwritten.

    Standard disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights.

  • Andrew May's WebLog

    Identifying wizard shapes in a publication


    I recently finished an article on how to generate publications based on pre-designed wizards in Publisher:

    Here’s something I didn’t cover in the article, because it was a little more advanced than the basic overview-type article I was writing.

    Not only do publications and pages have wizard properties; shapes do as well. Any shape that Publisher adds to the default appearance of a publication based on a wizard design has two properties that uniquely identify it. Publisher uses these properties to keep track of shapes that ‘belong’ to the wizard design, as opposed to any custom shapes the user might add to the publication later. If the user selects a different wizard design for the publication, Publisher ‘morphs’ the publication by changing the wizard shapes (adding, deleting, and changing shape properties) to the default appearance of the selected wizard design. Any shapes the user has added to the publication are left unchanged.

    WizardTag and WizardTagInstance are the two shape properties; together, they comprise a unique identifier for each wizard shape in the publication. WizardTag refers to the type and style of shape (a textbox of a certain style, a photo of a certain type, etc), while WizardTagInstance is the instance of that shape type in the publication. Every shape, whether generated by a wizard or not, has a WizardTag and WizardTagInstance property. For non-wizard shapes (that is, shapes that the user inserted into the publication), the value of each property is 0.
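
    To picture how the two properties work together as an identifier, here is a small Python model. It does not call Publisher at all: ShapeInfo is a hypothetical record holding the two property values, and the tag and instance numbers used with it are invented for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class ShapeInfo(NamedTuple):
    """Hypothetical record of a shape's wizard properties."""
    name: str
    page: int
    wizard_tag: int
    wizard_tag_instance: int


def is_wizard_shape(shape):
    """Shapes the user inserted have 0 for both properties."""
    return shape.wizard_tag != 0


def group_by_wizard_id(shapes):
    """Map each (WizardTag, WizardTagInstance) pair to the names of the shapes that carry it."""
    groups = defaultdict(list)
    for shape in shapes:
        if is_wizard_shape(shape):
            groups[(shape.wizard_tag, shape.wizard_tag_instance)].append(shape.name)
    return dict(groups)
```

    Grouping by the pair rather than by the tag alone keeps separate instances of the same shape type apart, and shapes with a tag of 0 drop out as user-added.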

    There are several interesting aspects of wizard tags to keep in mind:

    ·          The instance numbering is not always consecutive, but it is predictable. If you’ve got five instances of a certain shape type in your publication, don’t assume their WizardTagInstance values run from 1 to 5, because they probably don’t. From what I can tell, numbering appears to be consecutive within a given publication page, but not from page to page. For example, on photo gallery wizard pages, the first photo shape on each page gets an instance value that is a multiple of 64. Play with it before making any assumptions on how the instances are numbered.

    ·          While the WizardTag and WizardTagInstance property values are assigned by Publisher, and used to keep track of the shapes generated by the wizard design, the properties themselves are not read-only. You can get and set them at will. Also, Publisher does not check that a given WizardTag-WizardTagInstance combination is assigned to only one shape per publication. So you can easily set several shapes to have the same wizard tag and instance combination. In fact, if you wanted to treat these shapes as a subset of the wizard tag type, you might want to do this on purpose.

    ·          You can use the FindShapeByWizardTag method (pass it a tag type and, optionally, an instance number) to return a specific instance, or even all instances of that tag. The method returns a ShapeRange collection. This is one of the few cases where you can get a collection of shapes that spans publication pages. If you wanted to apply formatting to all shapes of a certain wizard type (article textboxes in a newsletter, for instance), regardless of the page they appear on, this is how you’d do it.

    ·          Changing a shape’s WizardTag property to 0 excludes it from being updated the next time you change wizard designs. The most that will happen is that the shape’s colors may change. This is because the SchemeColor property for each ColorFormat object in the shape is still set; when Publisher changes the color scheme based on the wizard design, each color in the shape updates to the new color specified for that scheme color (Main, Accent1, etc.).
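
    To make the points above concrete, here is a toy model of wizard tags. It is plain Python rather than the Publisher object model, so the Shape class, the find_by_wizard_tag helper, and the tag value 7 are all made up for illustration. It only mirrors the behavior described above: a tag of 0 marks a non-wizard shape, tag-instance combinations are not forced to be unique, and a lookup by tag can span pages.

```python
from dataclasses import dataclass


@dataclass
class Shape:
    name: str
    page: int
    wizard_tag: int = 0            # 0 means "not a wizard shape"
    wizard_tag_instance: int = 0


def find_by_wizard_tag(shapes, tag, instance=None):
    """Return every shape with the given tag, optionally narrowed to one instance."""
    return [
        s for s in shapes
        if s.wizard_tag == tag
        and (instance is None or s.wizard_tag_instance == instance)
    ]


ARTICLE_TEXTBOX = 7  # hypothetical tag value, not a real Publisher constant

shapes = [
    Shape("Story 1", page=1, wizard_tag=ARTICLE_TEXTBOX, wizard_tag_instance=1),
    Shape("Story 2", page=2, wizard_tag=ARTICLE_TEXTBOX, wizard_tag_instance=64),
    # Nothing stops two shapes from sharing a tag and instance combination.
    Shape("Story 2 copy", page=3, wizard_tag=ARTICLE_TEXTBOX, wizard_tag_instance=64),
    Shape("My clip art", page=2),  # user-added shape, both properties stay 0
]

all_articles = find_by_wizard_tag(shapes, ARTICLE_TEXTBOX)   # spans pages 1 to 3
duplicates = find_by_wizard_tag(shapes, ARTICLE_TEXTBOX, 64)  # two shapes, same identifier
```

    In Publisher itself you would loop over the returned ShapeRange and apply your formatting to each shape in the same way.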

    Wizard properties for shapes should not be confused with group wizard shapes, which are a different beast altogether. I hope to write up a little something about group wizard shapes later in the week, as well as discuss the Tag property of Shapes. I should probably mention AutoShapes as well, just to make sure everyone’s totally confused.

    Until then…

  • Andrew May's WebLog

    Document Parsers in SharePoint (2 of 4): How Parsers Process Documents


    Read part one here.

    In my last entry, I gave you a brief overview of what document parsers are in Windows SharePoint Services V3, and a high-level look at what you need to do to build a custom document parser for your own custom file types. Today we’re going to start digging a little deeper, and examine how a parser interacts with WSS in detail.

    Document Parser Processing

    When a file is uploaded, moved, or copied to a document library, WSS determines whether a parser is associated with the document's file type. If one is, WSS invokes the parser, passing it the document to be parsed and a property bag object. The parser extracts the properties and matching property values from the document, and adds them to the property bag object. The parser extracts all the properties from the document; selecting the ones that match columns is left to WSS.

    WSS accesses the document property bag and determines which properties match the columns for the document. It then promotes those properties, or writes the document property value to the matching document library column. WSS only promotes the properties that match columns applicable to the document. The columns applicable to a document are specified by:

    ·         The document's content type, if it has been assigned a content type.

    ·         The columns in the document library, if the document does not have a content type.

    WSS also stores the entire document property collection in a hash table; this hash table can be accessed programmatically by using the SPFile.Properties property. There is no user interface to access the document properties hash table.

    The following figure illustrates the document parsing process. In it, the parser extracts document properties from the document and writes them to the property bag. Of the four document properties, three are included in the document's content type. WSS promotes these properties to the document library; that is, writes their property values to the appropriate columns. WSS does not promote the fourth document property, Status, even though the document library includes a matching column. This is because the document's content type does not include the Status column. WSS also writes all four document properties to a hash table that is stored with the document on the document library.
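
    If it helps, the promotion step can be sketched in a few lines. This is a plain Python model, not WSS code: the promote function and the sample property names other than Status are assumptions made up for the example. It shows the rule described above, where only properties matching the applicable columns are promoted, while every property lands in the hash table.

```python
def promote(property_bag, content_type_columns, library_columns):
    """Return (promoted column values, full property hash table)."""
    # The content type's columns apply if the document has a content type;
    # otherwise the columns on the document library apply.
    if content_type_columns is not None:
        applicable = content_type_columns
    else:
        applicable = library_columns
    promoted = {
        name: value
        for name, value in property_bag.items()
        if name in applicable
    }
    # Every extracted property is kept, whether it was promoted or not.
    return promoted, dict(property_bag)


# What a (hypothetical) parser extracted from the document.
property_bag = {
    "Title": "Q3 report",
    "Author": "A. May",
    "Subject": "Sales",
    "Status": "Draft",
}
library_columns = {"Title", "Author", "Subject", "Status"}
content_type_columns = {"Title", "Author", "Subject"}  # no Status column

promoted, stored = promote(property_bag, content_type_columns, library_columns)
```

    Status is left out of promoted even though the library has a Status column, because the content type does not include it. All four properties still end up in stored.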

    WSS can also invoke the parser to demote properties, or write a column value into the matching property in the document itself. When WSS invokes the demotion function of the parser, it again passes the parser the document, and a property bag object. In this case, the property bag object contains the properties that WSS expects the parser to demote into the document. The parser demotes the specified properties, and WSS saves the updated document back to the document library.

    The figure below illustrates the document property demotion process. To update two document properties, WSS invokes the parser, passing it the document to be updated, and a property bag object containing the two document properties. The parser reads the property values from the property bag, and updates the properties in the document. When the parser finishes updating the document, it passes a parameter back to WSS that indicates that it has changed the document. WSS then saves the updated document to the document library.
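
    Demotion can be modeled the same way. Again, this is only a sketch in plain Python under my own assumptions: the document is reduced to a dictionary of properties, and the demote function stands in for the parser's demotion routine. The returned flag plays the role of the parameter the parser passes back to tell WSS the document changed.

```python
def demote(document_properties, property_bag):
    """Write property bag values into the document; return True if anything changed."""
    changed = False
    for name, value in property_bag.items():
        if document_properties.get(name) != value:
            document_properties[name] = value
            changed = True
    return changed


# Properties currently stored inside the (hypothetical) document.
document = {"Title": "Draft", "Author": "A. May", "Status": "Draft"}

# WSS hands the parser the two properties it wants written back.
changed = demote(document, {"Title": "Final", "Status": "Approved"})

# Demoting a value the document already has reports no change,
# so in this model there would be nothing to save.
unchanged = demote(document, {"Title": "Final"})
```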

    Mapping Document Properties to Columns

    Once the document parser writes the document properties to the property bag object, WSS promotes the document properties that match columns on the document library. To do this, WSS compares the document property name with the internal names of the columns in the document’s content type, or on the document library itself if the document doesn’t have a content type. When WSS finds a column whose internal name matches the document property, it promotes the document property value into that column for the document.

    However, WSS also enables you to explicitly map a document property to a specific column. You specify this mapping information in the column’s field definition schema.

    Mapping document properties to columns in the column’s field definition enables you to map document properties to columns that may or may not be named the same. For example, you can map the document property ‘Author’ to the ‘Creator’ column of a content type or document library.

    To specify this mapping, add a ParserRef element to the field definition schema, as shown in the example below:

    <Field Type="Text" Name="Creator" … >

        <ParserRefs>

            <ParserRef Name="Author" Assembly="myParser.Assembly" />

        </ParserRefs>

    </Field>

    The following elements are used to define a document property mapping:


    ParserRefs

    Optional. Represents a list of document parser references for this column.


    ParserRef

    Optional. Represents a single document parser reference. This element contains the following attributes:

    ·         Name   Required String. The name of the document property to be mapped to this column.

    ·         Assembly   Required String. The name of the document parser used.

    A column’s field definition might contain multiple parser references, each specifying a different document parser.
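
    As a rough illustration of how such a mapping could be read, the sketch below parses field definitions with Python's standard ElementTree module and builds a lookup from document property name to column name for one parser. The build_property_map function is hypothetical, and the elided attributes from the example above are simply left out. It searches for ParserRef elements at any depth, so it does not depend on exactly how they are nested inside the Field element.

```python
import xml.etree.ElementTree as ET

FIELD_XML = """
<Field Type="Text" Name="Creator">
  <ParserRefs>
    <ParserRef Name="Author" Assembly="myParser.Assembly"/>
  </ParserRefs>
</Field>
"""


def build_property_map(field_definitions, parser_assembly):
    """Map document property name -> column name for a single parser."""
    mapping = {}
    for xml_text in field_definitions:
        field = ET.fromstring(xml_text)
        # iter() finds ParserRef elements at any depth below the Field element.
        for ref in field.iter("ParserRef"):
            if ref.get("Assembly") == parser_assembly:
                mapping[ref.get("Name")] = field.get("Name")
    return mapping


mapping = build_property_map([FIELD_XML], "myParser.Assembly")
other = build_property_map([FIELD_XML], "someOther.Assembly")
```

    With this mapping, the ‘Author’ property produced by myParser.Assembly goes to the ‘Creator’ column, while a different parser gets no explicit mapping from this field definition.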

    In addition, if you are working with files in multiple languages, you can use parser references to manage the mapping of document properties to the appropriate columns in multiple languages, rather than having to build that functionality into the parser itself. The parser can generate a single document property, while you use multiple parser references to make sure the property is mapped to the correct column for that language. For example, suppose a parser extracts a document property named ‘Singer’. You could then map that property to a column named ‘Cantador’, as in the example below:

    <Field Type="Text" Name="Cantador" … >

        <ParserRefs>

            <ParserRef Name="Singer" CLSID="MyParser" />

            <ParserRef Name="Artist" Assembly="MP3Parser.Assembly" />

        </ParserRefs>

    </Field>



    To edit a column’s field definition schema programmatically, use the SPField.SchemaXml property. There is no equivalent user interface for specifying parsers for a column.
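
    The kind of edit you would make to that schema string can be sketched outside of SharePoint. The example below uses Python's ElementTree on a plain XML string. The add_parser_ref helper and the assembly names are hypothetical, and it assumes the parser references live inside a ParserRefs wrapper element. In real code you would read the string from the field, modify it, and assign it back.

```python
import xml.etree.ElementTree as ET


def add_parser_ref(schema_xml, property_name, assembly):
    """Return schema_xml with a ParserRef added, unless it is already there."""
    field = ET.fromstring(schema_xml)
    refs = field.find("ParserRefs")
    if refs is None:
        # Assumption: parser references are grouped under a ParserRefs element.
        refs = ET.SubElement(field, "ParserRefs")
    already_present = any(
        ref.get("Name") == property_name and ref.get("Assembly") == assembly
        for ref in refs.findall("ParserRef")
    )
    if not already_present:
        ET.SubElement(refs, "ParserRef", {"Name": property_name, "Assembly": assembly})
    return ET.tostring(field, encoding="unicode")


schema = '<Field Type="Text" Name="Cantador" />'
schema = add_parser_ref(schema, "Singer", "MyParser.Assembly")
schema = add_parser_ref(schema, "Artist", "MP3Parser.Assembly")
# Adding the same reference twice leaves the schema unchanged.
schema = add_parser_ref(schema, "Artist", "MP3Parser.Assembly")
```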

    In the next entry, we'll discuss how WSS processes documents that contain their content type definition.

  • Andrew May's WebLog

    April update to WSS 3.0 SDK online now live!


    I’m happy to announce that the Windows SharePoint Services 3.0 SDK has been updated online on MSDN!


    What’s new in the April update of the WSS 3.0 Online SDK:


    New conceptual sections:

    Content Migration

    Change Log and Synchronizing Applications

    Creating Declarative, No-Code Workflow Editors


    Procedural topics:

    How to: Create a Custom Field Type and Field Control

    How to: Extend the STSADM Utility


    Schema reference topics for the following schemas:

    Content Migration XML Schema Reference: Contains over 180 element topics that detail the eight migration schemas

    Workflow Configuration Schema


    Greatly expanded reference material for over 300 types in the following namespaces:

    ·         Microsoft.SharePoint

    ·         Microsoft.SharePoint.Administration

    ·         Microsoft.SharePoint.Deployment

    ·         Microsoft.SharePoint.EmailIntegration

    ·         Microsoft.SharePoint.Navigation

    ·         Microsoft.SharePoint.MobileControls

    ·         Microsoft.SharePoint.StsAdmin

    ·         Microsoft.SharePoint.WebPartPages

    ·         Microsoft.SharePoint.WebControls

    ·         Microsoft.SharePoint.Workflow


    As well as expanded reference material for the following Web Services:

    ·         Authentication Web Service

    ·         Copy Web Service


    This update also includes numerous updates and revisions to existing SDK content.


    Note: The WSS 3.0 SDK download will be updated with the next online update, currently scheduled for the end of June.


    I want to thank everyone who contributed their time and effort to making this update possible. We’ve added an immense amount of detailed technical content, with more to come. Check it out!

  • Andrew May's WebLog

    SharePoint Developer Documentation Team Blog Now Live


    Now that I’m managing a documentation team, I just don’t get to blog about SharePoint developer issues nearly as often as I’d like. In part to help remedy that, we’ve launched the SharePoint Developer Documentation Team blog. In the coming months, the programmer-writers for Windows SharePoint Services and Office SharePoint Server will be posting on a range of developer issues, information, and best practices. Our goals with the new blog are several:

    ·         Present “draft” versions of SDK content before it’s available in the SDKs themselves. These drafts are tech-reviewed content that will end up in the SDKs; this is just our way of getting the information out to you as quickly as we can.

    ·         Promote content that we publish on MSDN. This includes giving you the latest news on new technical articles, SDK updates, code samples or tools downloads, and anything else new and noteworthy we think you’ll want to take a look at.

    ·         Blog about the developer documentation we produce, and why. Ever wonder why we document what we do, in the way we do? In future posts we’ll explore the decisions we make when planning SharePoint developer documentation, and the data we base those decisions on.

    ·         Engage with users, and find out just how developers are using (or would like to use) the documentation we produce.

    ·         Experiment with presenting developer documentation in new and different formats.

    Already, in the few weeks it’s been live, we’ve posted extensive information on extending workflow actions in SharePoint Designer, as well as the basics of content deployment. And in the coming weeks, we’ll be adding even more draft versions of developer documentation as soon as they’re ready.

    So go check it out. Subscribe to our RSS feed. Even better, drop us a line and tell us what you’d like to see on the blog, or in our developer documentation.

  • Andrew May's WebLog

    SharePoint Terminology: Column vs. Field vs. Property


    Today I want to address a piece of SharePoint terminology that confused me no end when I first started working on the products. So, as a service to beginning SharePoint programmers the world over, here’s a quick (and hopefully precise, if not definitive) overview of three terms that tend to be used to mean the same thing: column, field, and document property.

    Columns on Sites and Lists

    Basically, all three terms refer to a specific piece of metadata on a SharePoint item.

    In other words, this:

    In the UI, these are called ‘columns’ because, well, that’s what they are displayed as: columns. Each piece of metadata you’re collecting for a list is represented as a column on that list, whether or not the column is actually displayed on the list. (Columns can be hidden.)

    However, if you take a look under the hood, either in the SharePoint schemas or object model, you’ll find they’re called Fields. This is what they were called in V2, and for compatibility’s sake, that’s what they remain in V3. (Database columns tend to be called ‘fields’, so that might be where the term originally crept in.)

    So far, that’s all pretty straightforward. But V3 adds some complexity (and correspondingly, much flexibility and power) with the addition of content types, and site columns. I’ve talked some about content types here. Site columns can be thought of as templates; you create the site column at the site level, then you can apply it to the various lists and sub-sites as you wish. Site columns are represented as <Field> elements in the site schema, and SPField objects in the object model.

    When you add a site column to a list, the column definition is copied locally to the list, as a list column. So the list column is now represented by a <Field> element in the list schema. In the object model, it’s represented by an SPField object.

    Another important point to mention: when you add a site column to a list, the resulting list column has the same field ID as the site column. SharePoint uses this to track which list columns are “children” of a given site column. This enables you to make changes to a site column and propagate those changes out to all the list columns that are children of the site column.

    You can also create columns at the list level. These columns only apply to the list on which they are created. You can think of them as one-offs. You can add list columns to content types on that list, but that's it. List columns are also represented as <Field> elements in the list schema, and SPField objects in the object model. Because they were created from scratch, though, they do not have a parent/child relationship with any other column.

    Columns in Content Types

    Here’s where it gets interesting:

    If there’s certain item metadata you want to track in a content type, you include a column that represents that metadata. However, you cannot create a column in a content type; rather, you have to create the column, and then reference that column in the content type definition. Because of this, when you add a column to a content type, the content type schema doesn’t contain a <Field> element, it contains a <FieldRef> element. This is true of both site and list columns you add to content types.

    The <FieldRef> element is a reference to a column defined elsewhere, either at the site or list level. In the field reference, you can override a subset of the column properties, such as:

    ·         Display name

    ·         XML promotion and demotion attributes

    ·         Whether the field is read-only, required, or hidden

    Changing these properties in the field reference only changes them as they apply to the specific content type that contains the field reference, of course.

    Field references retain the same field ID as the column they reference.

    If you create a content type based on a parent content type, by default all the columns in the parent are included in the child content type, as <FieldRef> elements.

    Now, when you add a content type to a list, the columns referenced in that content type get copied locally onto the list as list columns. In other words, the columns referenced by the various <FieldRef> elements in the content type schema are copied onto the list schema as <Field> elements—again, with the child/parent relationship to the site column.

    As mentioned earlier, when you add a list column to a list content type, it's added as a <FieldRef> in the list content type schema.

    Therefore, columns are always represented by <Field> elements in site and list schemas, but always represented by <FieldRef> elements in content type schemas.
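
    Here is a small model of that relationship, in case seeing it as data helps. It is plain Python, not the SharePoint object model: the Field, ContentType, and ListModel classes, the add_content_type function, and the "id-author" identifier are all invented for the example. The content type holds only field IDs (its field references), and adding it to a list copies the referenced column definitions onto the list under the same IDs.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Field:
    id: str
    name: str


@dataclass
class ContentType:
    name: str
    field_refs: list = field(default_factory=list)  # field IDs only, no definitions


@dataclass
class ListModel:
    fields: dict = field(default_factory=dict)          # field ID -> Field
    content_types: list = field(default_factory=list)


def add_content_type(lst, content_type, site_columns):
    """Copy referenced columns onto the list, then attach a list content type."""
    for field_id in content_type.field_refs:
        if field_id not in lst.fields:
            # A local copy of the definition that keeps the site column's ID.
            lst.fields[field_id] = replace(site_columns[field_id])
    lst.content_types.append(
        ContentType(content_type.name, list(content_type.field_refs))
    )


site_columns = {"id-author": Field("id-author", "Author")}
article = ContentType("Article", ["id-author"])
library = ListModel()
add_content_type(library, article, site_columns)
```

    After the call, the list has its own Author field definition with the same ID as the site column, and the list content type still refers to the column by ID only.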

    Field references in content types are represented by the SPFieldLink object in the SharePoint object model.

    Document Properties

    The term 'document property' is most often used in the context of talking about a particular piece of metadata you're interested in tracking for a document. That is, the particular column value for that document. The last modified date, for example. When you upload a bunch of documents and display the last modified date for each, you get a column of last modified date values, as in the screen shot above.

    The document property might be something you're tracking solely at the document library level, or it might also be included in the document itself.

    (In which case, WSS V3 includes functionality that enables you to promote and demote the document property, so the value in the document is always in synch with the value in the column.

    Property promotion refers to extracting document properties from a document, and writing those property values to the appropriate columns on the document library where the document is stored. Property demotion refers to taking column values from the document library where a document is stored, and writing those column values into the document itself.

    WSS includes several built-in document parsers that automatically promote and demote document properties for well-known file types. In addition, you can use the built-in XML parser to promote and demote properties from your custom XML files. Finally, WSS also includes a document parser interface, which enables you to build custom parsers that can promote and demote document properties from your custom file types.)

    Hopefully, the figure below illustrates this relationship. You add the site column Author to a content type; in the content type schema, the column is represented by a <FieldRef> element. When you add the content type to a list, WSS adds the Author column as a <Field> element. Both elements have the same field ID as the Author site column. When you add the list column ItemNo to the list content type, WSS adds it as a <FieldRef> element, with the same field ID. For Document C, the actual values for those two columns are stored in the document itself, and also displayed in the columns on the list.

    The Bottom Line

    So, to review:

    What are called columns in the user interface are referred to as fields in the schema and object model.

    You can create columns at two levels: the site, and list levels. These columns are represented as <Field> elements in the site and list schema, and SPField objects in the object model. List columns created when you add a site column to a list retain a child/parent relationship with the site column, and retain the same field ID as the site column.

    You cannot create a column in a content type. When you add a column to a content type, it's added as a <FieldRef> in the content type schema. When you add a content type to a list, the columns referenced by the <FieldRef> elements in that content type schema are added to the list as <Field> elements.

    Therefore, columns are always represented by <Field> elements in site and list schemas, and always represented by <FieldRef> elements in content type schemas.

    Document properties usually just refer to a field as it applies to a specific document. The document property might be something you're tracking solely at the document library level, or it might also be included in the document itself.


    Postscript: A Short Digression Concerning the SPContentType Object

    In the SharePoint object model, the SPContentType object contains both an SPFieldLinkCollection and an SPFieldCollection object. But if columns in content types are represented by field references, how can you have a collection of fields in the content type? Because it's one of the ways we're making your life easier, that's why.

    The SPFieldCollection in the SPContentType object gives developers an easy way to get a 'merged view' of a column's attributes, as they are in that content type. By merged view, I mean all the attributes of a field, merged with those attributes that have been overridden in the field reference. When you access a field through SPContentType.Fields["myField"], WSS merges the field definition with the field reference and returns the resulting SPField object to you. That way, you don't have to look up a field definition, then look up all the attributes in the field definition overridden by the field reference for that content type. We do it for you.

    Because of this, there's a 1-to-1 correlation between the items in the SPFieldLinkCollection and SPFieldCollection objects. For each SPFieldLink you add to a content type, we add an SPField object that represents the merged view of that column as it's defined in the content type. You cannot directly add or delete items from an SPFieldCollection object in an SPContentType object; trying to do so throws an error.
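
    The merge itself is easy to picture. The sketch below is plain Python with made-up attribute values, and the merged_view function is hypothetical rather than anything in the object model. It starts from the full field definition and lets the attributes set on the field reference win.

```python
def merged_view(field_definition, field_link):
    """Field definition with the field reference's overrides applied on top."""
    merged = dict(field_definition)  # copy, so the definition itself is untouched
    merged.update(
        {key: value for key, value in field_link.items() if key != "ID"}
    )
    return merged


# The column as defined at the site or list level.
field_definition = {
    "ID": "id-author",
    "Name": "Author",
    "DisplayName": "Author",
    "Type": "Text",
    "Required": False,
}

# The field reference in one content type overrides two attributes.
field_link = {"ID": "id-author", "DisplayName": "Written by", "Required": True}

merged = merged_view(field_definition, field_link)
```

    The merged result keeps the Type from the definition, takes the DisplayName and Required values from the reference, and leaves the original definition unchanged for every other content type that uses it.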


    Written while listening to Nick Cave : Henry's Dream

  • Andrew May's WebLog

    XML Document Property Parsing in SharePoint (3 of 5): Determining Document Content Type for XML Parsing


    This is the third in a five-part series on how to use the built-in XML parser in WSS V3 to promote and demote document properties in your XML files, including InfoPath 2007 forms.

    Read part one here.

    Read part two here.

    In order for the built-in XML parser to determine the document’s content type, and thereby access the content type definition, the document itself must contain the content type as a document property. The parser looks for a special processing instruction in your XML documents to identify the document’s content type. You can include processing instructions that identify the document’s content type by content type ID, by document template, or both.

    How the Parser Determines Document Content Type

    When a user uploads an XML document to a document library, WSS V3 invokes the built-in XML parser. Before the parser can promote document properties, it must first determine the content type, if any, of the document itself.

    The parser first looks at the Field element in the document library schema that represents the content type ID column on the document library. The parser examines the Field element for the location in the document where the content type ID should be stored. The parser then determines if the content type ID is indeed stored in the document at this location. If no content type ID is specified at that location, the parser assigns the default content type to the document. The parser then uploads the document and promotes any document properties accordingly.

    If the document does contain a content type ID at the specified location, the parser determines if the content type with that ID is also associated with the document library. If it is, the parser uploads the document and promotes any document properties accordingly.

    If the parser doesn't find an exact match, it examines the IDs of the content types on the document library to determine if one or more are children of the document content type. If so, the parser assigns the closest child content type to the document. The parser then uploads the document and promotes any document properties accordingly.

    Note   The parser examines the list for content types that are children of the document content type because, in most cases, the document will have been assigned a site content type. In such cases, the matching list content type will be a child of the site content type.

    If the parser finds no content type match at all, it then looks at the Field element in the document library schema that represents the document template column on the document library, if such a column is present. If the document library does contain a document template column, the parser examines the Field element for the location in the document where the document template should be stored. The parser then determines if the document template is indeed stored in the document at this location.

    If the document does contain a document template, the parser compares the template with the document templates specified in each content type on the document library. If the parser finds a content type with the same document template as the document, the parser assigns that content type to the document. If there are multiple content types with the same document template as the document, the parser simply assigns the first such content type it finds. The parser then uploads the document and promotes any document properties accordingly.

    Finally, if the parser cannot find a content type match, the parser assigns the default content type to the document. The parser then uploads the document and promotes any document properties accordingly.
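
    The sequence of checks above can be sketched as a single function. This is a plain Python model under a few assumptions of mine: the resolve_content_type function and the short content type IDs are invented for the example, a child content type is recognized by its ID starting with the parent's ID, and "closest" is taken to mean the shortest such ID.

```python
def resolve_content_type(doc_ct_id, doc_template, library_content_types, default_id):
    """library_content_types: list of (content_type_id, template), in library order."""
    if not doc_ct_id:
        # No content type ID stored in the document: use the default.
        return default_id

    ids = [ct_id for ct_id, _ in library_content_types]

    # 1. Exact match on the content type ID.
    if doc_ct_id in ids:
        return doc_ct_id

    # 2. Children of the document's content type.
    #    Assumption: a child's ID starts with its parent's ID, and the
    #    closest child is the one with the shortest ID.
    children = [i for i in ids if i.startswith(doc_ct_id) and i != doc_ct_id]
    if children:
        return min(children, key=len)

    # 3. Match on document template; the first content type found wins.
    if doc_template:
        for ct_id, template in library_content_types:
            if template == doc_template:
                return ct_id

    # 4. Nothing matched: fall back to the default content type.
    return default_id


# Hypothetical content types on a document library.
library = [
    ("0x0101", None),
    ("0x0101009B01", "report.xsn"),
    ("0x0101009B0102", "report.xsn"),
    ("0x0101007C01", "invoice.xsn"),
]
DEFAULT = "0x0101"

exact = resolve_content_type("0x0101009B01", None, library, DEFAULT)
child = resolve_content_type("0x0101009B", None, library, DEFAULT)
by_template = resolve_content_type("0x0120", "invoice.xsn", library, DEFAULT)
no_id = resolve_content_type(None, "invoice.xsn", library, DEFAULT)
fallback = resolve_content_type("0x0120", "other.xsn", library, DEFAULT)
```

    Note that in this model, as in the description above, the document template is only consulted when the document carries a content type ID that matches nothing on the library.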

    The flow chart below illustrates the checks the parser performs to determine a document's content type.

    For more information on how the parser promotes and demotes specific document properties, see part two of this series.

    One aspect of the parser's operation that bears emphasis is the fact the parser looks to the document library's content type and document template columns to determine where in the XML file to look for those matching document properties. Therefore, in order for promotion and demotion to work correctly, all content types on a given document library must contain content type and document template column definitions that specify the same location for those document properties as the document library columns. Otherwise, the parser will be looking in the wrong location within the document for those properties.

    In the next entry, we’ll take a look at how you actually specify the content type of the document, inside that document itself.

  • Andrew May's WebLog

    Visio SDK available for download


    We've just put the finishing touches on the Visio 2003 SDK and pushed it out the door:


    I mention this for two reasons:

    1. The SDK is one-stop shopping for developer reference material about the best version of Visio yet.
    2. I wrote or revised several of the technical articles included in the kit.

    So consider this my shameless plug for the day.
