SharePoint Development from a Documentation Perspective

Andrew May

  • Andrew May's WebLog

    PowerPoint: A Codeless One-Slide Timer (Part 2 of 2)


    Read part one here.

    Next, we need to create custom animation that controls the order in which PowerPoint displays the shapes on the slide.

    1.      From the Slide Show menu, click Custom Animation.

    2.      Select the shape that displays the second-highest time (in our case, the shape displaying 0:50). In the Custom Animation task pane, click Add Effect, select Entrance, and then click on Appear.

    3.      The animation should now be listed in the animation list in the task pane.

    4.      Select the animation, and in the Start drop-down, select After Previous.

    5.      Click on the drop-down arrow next to the animation name, and then click Timing. In the animation dialog box, on the Timing tab, for Delay enter 10.

    Repeat this for each shape, in descending order of the time each shape displays. So now you've got the shapes appearing at 10-second intervals, in order from most time displayed (1:00) to least time (0:00). Note that you don't assign an animation to the first shape, because you want that shape to be visible as soon as PowerPoint displays the slide.
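    If you want to sanity-check the timing before you build all those shapes, here's a minimal Python sketch. It has nothing to do with PowerPoint itself, and the function name timer_schedule is made up for illustration. It assumes that each After Previous animation with a 10-second delay starts 10 seconds after the one before it, so the delays accumulate.

```python
def timer_schedule(total_seconds=60, step_seconds=10):
    """Return (label, appears_at_seconds) pairs, one per timer shape."""
    schedule = []
    remaining = total_seconds
    elapsed = 0
    while remaining >= 0:
        minutes, seconds = divmod(remaining, 60)
        schedule.append((f"{minutes}:{seconds:02d}", elapsed))
        remaining -= step_seconds
        elapsed += step_seconds
    return schedule


# The first shape has no animation, so it "appears" at 0 seconds.
for label, appears_at in timer_schedule():
    print(f"{label} appears {appears_at} seconds after the slide is displayed")
```

    Changing the two arguments shows how quickly the shape count grows: a five-minute timer updated every second would need 301 shapes.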

    Now stack all the shapes on top of each other. That way, as each shape appears, it covers the previous shape. Because all the shapes look the same, except for the time displayed on each, during a slide show they'll appear as a single shape counting down the seconds left on the timer.

    So our timer counts down the seconds correctly, but we also want to move to the next slide when the timer is done. For this, we'll use a slide transition, just as Geetesh does in his timer. We'll set the slide transition to occur one second after the timer finishes its count.

    1.      On the Slide Show menu, click Slide Transition.

    2.      In the Slide Transition task pane, select the type of transition you want.

    3.      Under Advance slide, check Automatically after and enter '01:01' in the text box. Make sure On mouse click is not checked.

    We're done. When PowerPoint displays the slide, the animation sequence starts, displaying each shape at ten second intervals. At the same time, the slide transition is counting down. PowerPoint displays the final counter shape (0:00) one second before it executes the transition to the next slide.

    As you can see, if you wanted a longer timer, say five or ten minutes, or wanted to count off one-second intervals, creating the timer could get tedious real fast. The main advantage to this approach, as well as Geetesh's, is that it doesn't rely on anything other than PowerPoint's native functionality, which means you can move the presentation to another machine and not have to worry about stuff like whether the security settings will allow code to run.

    The disadvantages, of course, are that it's somewhat laborious to set up, and it's not very flexible. Because it relies on slide transition functionality, the timer has to start as soon as PowerPoint displays the slide. I couldn't think of a way to make the slide transition dependent on user interaction, which would have enabled me to let the user start the timer whenever they wanted. But for short timers where you don't want to (or can't) use VBA code or add-ins, it could come in handy.

    As I mentioned in the last entry, the PowerPoint FAQ lists several timers available as VBA code or add-ins.

    Speaking of time, it's probably past time I got back to my SharePoint research…

  • Andrew May's WebLog

    InfoPath Forms in Office SharePoint Server 2007


    I’m sure it comes as no surprise that Office InfoPath 2007 is the forms designer of choice for Office SharePoint Server 2007. But the average SharePoint developer, used to reaching for ASP.NET when he needs to create a form for SharePoint, might be surprised at all the places you can employ InfoPath to quickly create forms for enterprise management functions.

    The main reason for this is the fact that in InfoPath 2007 you can create what the InfoPath guys are calling symmetrical forms. Symmetrical forms are InfoPath forms that can be hosted either in the Office client applications like Word, PowerPoint, and Excel, or in the SharePoint Server browser interface. So you can create a single form, and know it’ll look and work the same whether the user sees it in an Office application or in the browser interface.

    (How is this possible? Short version: SharePoint Server 2007 uses Office Forms Services, a server-based run-time environment for InfoPath 2007 forms, to host the forms in the browser. Office Forms Services consumes the forms you create in the InfoPath client and renders them in an ASP.NET framework, which acts as a run-time environment for the form. This environment presents a form editing experience that matches the InfoPath 2007 client application. The Office client applications, on the other hand, include the ability to host the native InfoPath forms. )

    So naturally, symmetrical forms are very handy for enterprise content management areas, where the user might be working with documents in their native client application, or online through the SharePoint Server browser interface.

    Here’re a couple of the places where you can employ InfoPath forms for enterprise content management in SharePoint Server 2007:

    ·         Document Information Panels

    A document information panel is a form that is displayed within the client application, and which contains fields for the document metadata. Document information panels enable users to enter important metadata about a file anytime they want, without having to leave the Microsoft Office system client application. For files stored in document libraries, the document information is actually the columns of the content type assigned to that file. The document information panel displays a field for each content type property, or column, the user can edit.

    You can create document information panels either from within SharePoint Server, or directly from InfoPath 2007.

    How to: Create or Edit a Custom Document Information Panel from within Office SharePoint Server 2007

    How to: Create a Custom Document Information Panel from InfoPath 

    ·         Workflow Forms

    You can create InfoPath forms for use with workflows in SharePoint Server 2007. That way, the user can interact with the workflow form from within the Office client application, and not just through the browser.

    SharePoint Server 2007 uses Office Forms Services to display workflow forms, be they association, initiation, modification, or edit task forms. The only difference is that there’s a different .aspx page hosting the Office Forms Services control for each type of workflow form. Initiation forms are hosted by a different .aspx page than modification forms, for example. Each different hosting page knows how to submit the information from its type of form to SharePoint Server (and hence, to the workflow engine).

    The .aspx pages that contain the Office Forms Services web part are included as part of SharePoint Server, of course.

    How to: Design an InfoPath Form for a Workflow in Office SharePoint Server 2007

    So if you’re putting together an enterprise content management solution in Office SharePoint Server, and you’d like your user to be able to interact with your custom forms in the client application and the browser, it might be worth your while to take a look at InfoPath 2007 and Office Forms Services.


    Written while listening to Johnny Cash : Personal File

  • Andrew May's WebLog

    PowerPoint: A Codeless One-Slide Timer (Part 1 of 2)


    Yesterday, a co-worker asked a few of us in PowerPoint user assistance how to create a timer to use during a presentation break. The only stipulation was that the timer had to use PowerPoint's native capabilities: she wanted something she could give to other people without having to include instructions you'd need if the timer was an ActiveX control or relied on VBA code.

    One of the other team members referred her to the Clocks and Timers section of the PowerPoint FAQ. Sure enough, there was an example of a timer using slide transitions, courtesy of Geetesh Bajaj.

    And it got me thinking: Geetesh uses multiple slides to create a timer, but couldn't you create a timer using shape animation and a single slide? Creating a single-slide timer was primarily an aesthetic preference. The timer functions the same either way. I just preferred to have it on a single slide, so that when I'm in Normal or Slide Sorter view, the timer doesn't appear as a long series of slides. Having everything on a single slide also makes moving the timer, or pasting it into another presentation, easier.

    (Creating it was also a nice distraction from all the SharePoint studying I've been doing. But that's neither here nor there.)

    Anyway, here's how you can create a single-slide, codeless timer:

    As an example, let's say you wanted to create a one-minute timer, with the visual display updated every ten seconds. First, add a new slide to the presentation. Then, create a shape that represents your timer, like so:


    Next, copy and paste the shape back onto the same slide. Change the text so it displays ten seconds less. Repeat this process until you have shapes that count down to zero:

    Now, let's make sure the z-order of the shapes is correct. The z-order basically refers to the order in which shapes are layered on the slide. We want the shapes stacked in order of increasing time, with the one-minute shape on the bottom. Think of it this way:


    (Tip: Don’t actually stack the shapes on top of each other yet. That'll make the shapes too hard to work with.)

    Here's how we can explicitly set the z-order:

    Right-click the shape with the most time displaying (in our case, that's the one-minute shape), point to Order, and click Bring to Front. This moves that shape to the top of the z-order stack. Repeat this process with the shape displaying the next most time, and so on.
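    To see why that sequence of Bring to Front commands gives the stacking we want, here's a small Python sketch that models the z-order as a list. This is purely an illustration: the bring_to_front function and the starting order are made up, and index 0 stands for the bottom of the stack.

```python
def bring_to_front(stack, shape):
    """Move a shape to the end of the list, which represents the top of the z-order."""
    stack.remove(shape)
    stack.append(shape)


# An arbitrary starting order; index 0 is the bottom of the stack.
z_order = ["0:30", "1:00", "0:00", "0:50", "0:10", "0:40", "0:20"]

# Bring each shape to the front, from the most time displayed to the least.
for label in ["1:00", "0:50", "0:40", "0:30", "0:20", "0:10", "0:00"]:
    bring_to_front(z_order, label)

print(z_order)  # bottom to top
```

    Whatever order the shapes start in, each later shape lands on top of the earlier ones, so the one-minute shape ends up on the bottom and the 0:00 shape on top.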

    So now we have our shapes in the proper z-order. Next, we need to animate them so that PowerPoint displays them in the proper sequence, and at the proper intervals. Then we need to add a slide transition for when the timer hits zero. All of which we’ll do in the next entry.

  • Andrew May's WebLog

    What Are Content Types, Anyway?


    One of the major new concepts in Windows SharePoint Services V3 is content types. They're a core concept that enables a lot of the functionality in both Windows SharePoint Services and Office SharePoint Server 2007, so they seemed like a logical choice to talk about.

    A content type is a reusable collection of settings you want to apply to a certain category of content. Content types enable you to manage the metadata and behaviors of a document or item type in a centralized, reusable way. Basically, content types include the columns (or fields, if you prefer) you want to apply to a certain type of content, plus other optional settings such as a document template, and custom new, edit, and display forms to use.

    You can think of a content type as a refinement and extension of a Windows SharePoint Services 2.0 list. The list schema defined a single group of data requirements for each item on that list. So in Windows SharePoint Services 2.0, the schema of an item was inextricably bound to its location.

    With content types, you can define a schema for types of content, but that schema is no longer bound to a specific location. And to look at it in reverse, each SharePoint list is no longer limited to a single data schema to which the documents stored there must adhere. You can assign the same content type to multiple document libraries, and you can assign multiple content types to a given document library. Content types, in essence, let you:

    ·         Store the exact same type of content in multiple locations

    ·         Store multiple types of content in the same location

    Site and List Content Types: Parents and Children

    There are two different levels of content types I should mention: site content types, and list content types. Think of site content types as templates, and list content types as instances of those templates. You define site content types at the um, site level, hence the name. Since site content types aren't bound to a specific list, you can assign them to any list within the site you want. The site at which you define the site content type determines the scope of that content type.

    List content types are more like instances, or single-serving content types. When you assign a site content type to a list, that content type is copied locally onto the list. That copy is a list content type. You can even tweak a list content type so that it's different from its site content type 'parent'. Also, you can create a content type directly on a list, but it's a one-off. You can't then assign that list content type to other sites or lists.

    One other thing about site content types: you can base content types on other site content types. For example, Windows SharePoint Services comes with a built-in hierarchy of content types for basic SharePoint objects, such as Document, Task, Folder, etc. You can create your own site or list content types based on one of these site content types. Ultimately, all content types derive from the grandfather of them all, System.

    Also, if you make changes to a site content type, Windows SharePoint Services includes a mechanism whereby you can 'push down', or propagate those changes out to the child content types (be they site or list content types). Doesn't work the other way, though; no pushing changes to child content types up the family tree.
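    If it helps to see the parent/child relationship in code, here's a minimal Python sketch. It is a toy model, not the SharePoint object model: the ContentType class, its derive and add_column methods, and the push_down flag are all names I've made up, and the sketch assumes a push down simply adds the new column to every descendant.

```python
class ContentType:
    """Toy model of a site or list content type (hypothetical, not the real API)."""

    def __init__(self, name, columns, parent=None):
        self.name = name
        self.columns = list(columns)
        self.parent = parent
        self.children = []

    def derive(self, name):
        """Create a child content type that starts as a copy of this one."""
        child = ContentType(name, self.columns, parent=self)
        self.children.append(child)
        return child

    def add_column(self, column, push_down=False):
        """Add a column here; optionally propagate it to every descendant."""
        if column not in self.columns:
            self.columns.append(column)
        if push_down:
            for child in self.children:
                child.add_column(column, push_down=True)


document = ContentType("Document", ["Title"])
spec = document.derive("Specification")        # a site content type
spec_on_list = spec.derive("Specification")    # the local copy on a list

spec_on_list.add_column("Reviewer")            # tweak only the list copy
spec.add_column("Status", push_down=True)      # change the parent, push down

print(spec.columns)
print(spec_on_list.columns)
print(document.columns)
```

    The list copy picks up Status from its parent, but Reviewer never travels up to the site content type, and nothing reaches Document.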

    E is for Extensible

    You can see how content types were designed to encapsulate and modularize schema settings in Windows SharePoint Services V3. One very powerful aspect of this is that you can use content types to encapsulate whatever custom data you want to include in them. The content type schema includes a <XMLDocuments> node, which you can use to store nodes of any valid XML. As far as Windows SharePoint Services is concerned, the contents of the <XMLDocuments> node are a black box. Windows SharePoint Services makes no attempt to parse any of the XML documents you store there; it simply makes sure that they are included in any children content types, such as when you assign the content type to a list.

    The <XMLDocuments> node was designed to be utilized by third-party solutions. Use it to store information pertinent to any special settings or behavior you want to specify for a certain type of content. Office SharePoint Server uses this mechanism to store information various features need, such as information policies and document information panels, among others. You can programmatically access a SharePoint item’s content type, and from there access the XML documents included in the content type.
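    Here's a rough Python sketch of the 'black box' idea: a solution stores its own XML in the content type and reads it back later. Only the <XMLDocuments> node name comes from the discussion above. The <XMLDocument> wrapper, the RetentionSettings document, its namespace, and the custom_documents function are all assumptions for illustration, and the sketch uses Python's standard XML library rather than the SharePoint object model.

```python
import xml.etree.ElementTree as ET

# Hypothetical content type definition; the custom document inside is made up.
CONTENT_TYPE_XML = """
<ContentType Name="Contract">
  <XMLDocuments>
    <XMLDocument>
      <RetentionSettings xmlns="urn:example:retention" Years="7" />
    </XMLDocument>
  </XMLDocuments>
</ContentType>
"""


def custom_documents(content_type_xml):
    """Return the root element of each XML document stored in the content type."""
    root = ET.fromstring(content_type_xml)
    wrappers = root.findall("./XMLDocuments/XMLDocument")
    return [wrapper[0] for wrapper in wrappers if len(wrapper) > 0]


docs = custom_documents(CONTENT_TYPE_XML)
for doc in docs:
    # The host never interprets this XML; only the owning solution does.
    print(doc.tag, doc.attrib)
```

    The point of the sketch is the division of labor: the container just carries the XML along, and the code that put it there is the only code that needs to understand it.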

    Find Out More

    To learn more about content types, browse the topics included in the Content Types section in the Windows SharePoint Services V3 (Beta) SDK.

    And here are a few places content types are utilized to facilitate Office SharePoint Server functionality:

    Using content types to specify document information panel.

    Using content types to specify document to page converters.

    Using content types to specify information policy.


    Written while listening to Bob Mould : Body of Song

  • Andrew May's WebLog

    SharePoint Terminology: Column vs. Field vs. Property


    Today I want to address a piece of SharePoint terminology that confused me no end when I first started working on the products. So, as a service to beginning SharePoint programmers the world over, here’s a quick (and hopefully precise, if not definitive) overview of three terms that tend to be used to mean the same thing: column, field, and document property.

    Columns on Sites and Lists

    Basically, all three terms refer to a specific piece of metadata on a SharePoint item.

    In other words, this:

    In the UI, these are called ‘columns’ because, well, that’s what they are displayed as: columns. Each piece of metadata you’re collecting for a list is represented as a column on that list, whether the column is actually displayed on the list or not. (Columns can be hidden.)

    However, if you take a look under the hood, either in the SharePoint schemas or object model, you’ll find they’re called Fields. This is what they were called in V2, and for compatibility’s sake, that’s what they remain in V3. (Database columns tend to be called ‘fields’, so that might be where the term originally crept in.)

    So far, that’s all pretty straightforward. But V3 adds some complexity (and correspondingly, much flexibility and power) with the addition of content types, and site columns. I’ve talked some about content types here. Site columns can be thought of as templates; you create the site column at the site level, then you can apply it to the various lists and sub-sites as you wish. Site columns are represented as <Field> elements in the site schema, and SPField objects in the object model.

    When you add a site column to a list, the column definition is copied locally to the list, as a list column. So the list column is now represented by a <Field> element in the list schema. In the object model, it’s represented by an SPField object.

    Another important point to mention: when you add a site column to a list, the resulting list column has the same field ID as the site column. SharePoint uses this to track which list columns are “children” of a given site column. This enables you to make changes to a site column and propagate those changes out to all the list columns that are children of the site column.

    You can also create columns at the list level. These columns only apply to the list on which they are created. You can think of them as one-offs. You can add list columns to content types on that list, but that's it. List columns are also represented as <Field> elements in the list schema, and SPField objects in the object model. Because they were created from scratch, though, they do not have a parent/child relationship with any other column.

    Columns in Content Types

    Here’s where it gets interesting:

    If there’s certain item metadata you want to track in a content type, you include a column that represents that metadata. However, you cannot create a column in a content type; rather, you have to create the column, and then reference that column in the content type definition. Because of this, when you add a column to a content type, the content type schema doesn’t contain a <Field> element, it contains a <FieldRef> element. This is true of both site and list columns you add to content types.

    The <FieldRef> element is a reference to a column defined elsewhere, either at the site or list level. In the field reference, you can override a subset of the column properties, such as:

    ·         Display name

    ·         XML promotion and demotion attributes

    ·         Whether the field is read-only, required, or hidden

    Changing these properties in the field reference only changes them as they apply to the specific content type that contains the field reference, of course.

    Field references retain the same field ID as the column they reference.

    If you create a content type based on a parent content type, by default all the columns in the parent are included in the child content type, as <FieldRef> elements.

    Now, when you add a content type to a list, the columns referenced in that content type get copied locally onto the list as list columns. In other words, the columns referenced by the various <FieldRef> elements in the content type schema are copied onto the list schema as <Field> elements—again, with the child/parent relationship to the site column.

    As mentioned earlier, when you add a list column to a list content type, it's added as a <FieldRef> in the list content type schema.

    Therefore, columns are always represented by <Field> elements in site and list schemas, but always represented by <FieldRef> elements in content type schemas.

    Field references in content types are represented by the SPFieldLink object in the SharePoint object model.
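    Here's a small Python sketch of the bookkeeping described in this section. It's a toy model only: the Field and FieldRef classes and the two helper functions are made-up names that mirror the schema elements, not the SharePoint object model. What it tries to show is that the field ID is the thread that ties the site column, the field reference, and the list column together.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """A column definition, as found in a site or list schema."""
    id: uuid.UUID
    name: str


@dataclass(frozen=True)
class FieldRef:
    """A reference to a column, as found in a content type schema."""
    id: uuid.UUID
    name: str


def reference(column):
    """Adding a column to a content type stores a reference with the same ID."""
    return FieldRef(column.id, column.name)


def add_content_type_to_list(field_refs, site_columns):
    """Copy each referenced column onto the list as a Field with the same ID."""
    by_id = {column.id: column for column in site_columns}
    return [Field(ref.id, by_id[ref.id].name) for ref in field_refs]


author = Field(uuid.uuid4(), "Author")        # site column
content_type = [reference(author)]            # content type schema holds a FieldRef
list_columns = add_content_type_to_list(content_type, [author])

print(type(content_type[0]).__name__, type(list_columns[0]).__name__)
print(list_columns[0].id == author.id)
```

    The content type never owns a column definition of its own; it only points at one, and the list gets a real definition that can still be traced back to the site column by ID.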

    Document Properties

    The term 'document property' is most often used in the context of talking about a particular piece of metadata you're interested in tracking for a document. That is, the particular column value for that document. The last modified date, for example. When you upload a bunch of documents and display the last modified date for each, you get a column of last modified date values, as in the screen shot above.

    The document property might be something you're tracking solely at the document library level, or it might also be included in the document itself.

    (In which case, WSS V3 includes functionality that enables you to promote and demote the document property, so the value in the document is always in synch with the value in the column.

    Property promotion refers to extracting document properties from a document, and writing those property values to the appropriate columns on the document library where the document is stored. Property demotion refers to taking column values from the document library where a document is stored, and writing those column values into the document itself.

    WSS includes several built-in document parsers that automatically promote and demote document properties for well-known file types. In addition, you can use the built-in XML parser to promote and demote properties from your custom XML files. Finally, WSS also includes a document parser interface, which enables you to build custom parsers that can promote and demote document properties from your custom file types.)
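    Stripped of the parsing details, promotion and demotion are just copies in opposite directions. Here's a minimal Python sketch using plain dictionaries to stand in for the document and the library columns. The promote and demote functions and the sample property names are hypothetical, and a real parser would of course be reading and writing an actual file format.

```python
def promote(document_properties, library_columns):
    """Copy property values out of the document into the matching library columns."""
    return {
        name: document_properties.get(name, current)
        for name, current in library_columns.items()
    }


def demote(library_columns, document_properties):
    """Write library column values back into the document's properties."""
    updated = dict(document_properties)
    updated.update(library_columns)
    return updated


document = {"Author": "Pat", "Status": "Draft", "PageCount": 12}
columns = {"Author": None, "Status": None}

columns = promote(document, columns)    # document values flow out to the columns
columns["Status"] = "Final"             # someone edits the column in the library
document = demote(columns, document)    # column values flow back into the document

print(columns)
print(document)
```

    Note that PageCount is untouched in both directions: only properties that map to a column on the library take part.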

    Hopefully, the figure below illustrates this relationship. You add the site column Author to a content type; in the content type schema, the column is represented by a <FieldRef> element. When you add the content type to a list, WSS adds the Author column as a <Field> element. Both elements have the same field ID as the Author site column. When you add the list column ItemNo to the list content type, WSS adds it as a <FieldRef> element, with the same field ID. For Document C, the actual values for those two columns are stored in the document itself, and also displayed in the columns on the list.

    The Bottom Line

    So, to review:

    What are called columns in the user interface are referred to as fields in the schema and object model.

    You can create columns at two levels: the site, and list levels. These columns are represented as <Field> elements in the site and list schema, and SPField objects in the object model. List columns created when you add a site column to a list retain a child/parent relationship with the site column, and retain the same field ID as the site column.

    You cannot create a column in a content type. When you add a column to a content type, it's added as a <FieldRef> in the content type schema. When you add a content type to a list, the columns referenced by the <FieldRef> elements in that content type schema are added to the list as <Field> elements.

    Therefore, columns are always represented by <Field> elements in site and list schemas, and always represented by <FieldRef> elements in content type schemas.

    Document properties usually just refer to a field as it applies to a specific document. The document property might be something you're tracking solely at the document library level, or it might also be included in the document itself.


    Postscript: A Short Digression Concerning the SPContentType Object

    In the SharePoint object model, the SPContentType object contains both an SPFieldLinkCollection and an SPFieldCollection object. But if columns in content types are represented by field references, how can you have a collection of fields in the content type? Because it's one of the ways we're making your life easier, that's why.

    The SPFieldCollection in the SPContentType object gives developers an easy way to get a 'merged view' of a column's attributes, as they are in that content type. By merged view, I mean all the attributes of a field, merged with those attributes that have been overridden in the field reference. When you access a field through the content type's field collection (for example, SPContentType.Fields["myField"]), WSS merges the field definition with the field reference and returns the resulting SPField object to you. That way, you don't have to look up a field definition, then look up all the attributes in the field definition overridden by the field reference for that content type. We do it for you.

    Because of this, there's a 1-to-1 correlation between the items in the SPFieldLinkCollection and SPFieldCollection objects. For each SPFieldLink you add to a content type, we add an SPField object that represents the merged view of that column as it's defined in the content type. You cannot directly add or delete items from an SPFieldCollection object in an SPContentType object; trying to do so throws an error.
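    The merge itself is easy to picture. Here's a minimal Python sketch that uses dictionaries of attributes in place of the real objects. The attribute values and the merged_view function are made up for illustration; this is the idea behind the merged view, not how WSS implements it.

```python
# The column as defined at the site or list level (sample values).
FIELD_DEFINITION = {
    "Name": "Author",
    "DisplayName": "Author",
    "Type": "Text",
    "Required": False,
}

# The attributes this content type's field reference overrides (sample values).
FIELD_REFERENCE = {
    "Name": "Author",
    "DisplayName": "Written By",
    "Required": True,
}


def merged_view(definition, reference):
    """Return the definition's attributes with the reference's overrides applied."""
    merged = dict(definition)
    merged.update(reference)
    return merged


print(merged_view(FIELD_DEFINITION, FIELD_REFERENCE))
```

    Attributes the field reference doesn't mention, like Type here, come straight through from the definition, and neither input is changed by the merge.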


    Written while listening to Nick Cave : Henry's Dream

  • Andrew May's WebLog

    SharePoint Object Model Maps for Download


    The guys at have generously agreed to host a few more of the large-size diagrams I’ve created for internal usage. So I’m offering up several object model maps for download. These are poster-size (11” by 17”) diagrams that illustrate key objects and namespaces in the SharePoint environment, suitable for sticking up in your office or hanging in your home as fine art.

    Each object model map was created in Visio, but the downloads are PDFs for ease of viewing and printing.

    Just click on any or all of the links below to download the maps you want. And leave me a message in the comments section telling me what you thought of the format, design, layout, etc. of the diagrams. Thanks.

    ·         The Microsoft.SharePoint.Workflow Namespace

    Workflow is another area you’re going to be hearing a lot about this release. The diagram illustrates the classes and members of the Workflow namespace, which you’d use to associate, initiate, and otherwise manage the workflow templates and instances in a Windows SharePoint Services deployment.

    ·         The Microsoft.SharePoint.SPContentType Object

    This object model map highlights the members and child objects of the SPContentType object. Content types are a core concept for this next release of Windows SharePoint Services, as I explain here. This object model map complements the conceptual diagrams of content types I offered for download here.

    ·         The Microsoft.Office.RecordsManagement.InformationPolicy Namespace

    I haven’t talked about information policy on this blog yet, but there’s plenty of material in the Office SharePoint Server 2007 (Beta 2) SDK to fill you in. (Start here.) This diagram illustrates the classes and members of the InformationPolicy namespace, which you’d use to manage the policies, policy features, and policy resources on a SharePoint Server 2007 installation.

    I’ve still got a few more diagrams to make available; check back early next week for more full-color, absolutely free goodness.

    Written while listening to Greg Dulli : Amber Headlights

  • Andrew May's WebLog

    InfoPath Forms Management in Windows SharePoint Services V3


    Today I'd like to talk about how to store and manage XML forms (InfoPath and otherwise) in WSS V3.

    In general, forms have three special areas of functionality:

    ·         Property Promotion and Demotion   This refers to promoting and demoting document data to and from columns in a SharePoint library.

    ·         Link Management   This refers to how WSS keeps the link to the form template in the XML file up to date, if any site, sub-site, or library is renamed.

    ·         Merge Forms   Sends multiple XML files that have the same schema to a client application to be merged.

    In Windows SharePoint Services Version 2, this special functionality was part of the Form Library site template. Now, in WSS V3, it's encapsulated in the Form content type instead. Instead of creating a new form library, you can create a new content type, inheriting from the Form content type. Your new form becomes the template for the content type.

    This new approach offers several major advantages over the form libraries:

    ·         Central management of forms   Because you're creating a new content type, you gain all the advantages content types offer in terms of centralized management. You can control the form and its metadata from a single location, regardless of how many libraries you've added the content type to. You can also specify enterprise content management features for your content type, such as property promotion and demotion, workflow, or (if you have Office SharePoint Server 2007) information policy. And you don't have to republish the form if you make changes to the content type for which it's the form template.

    ·         Add multiple forms to the same library   Again, because each type of form is a separate content type, you can add multiple form types to the same library.

    ·         Add forms to the same libraries as documents   You can add form content types to the same libraries that contain document content types. In a very real sense, there's no distinction between form and document libraries in WSS V3; there're just libraries. Any special functionality is encapsulated in the content types themselves.

    (However, the form library site template has been retained in WSS V3 for purposes of backward compatibility with XML-based forms editors that do not support publishing forms to content types, such as InfoPath 2003. But if you are using an XML-based forms editor that supports content types, such as InfoPath 2007, we strongly recommend you use the Form content type instead of the form library template. And with all the obvious advantages, why wouldn't you?)

    Publishing InfoPath 2007 Forms to Content Types

    So let's talk specifically about InfoPath 2007 forms for WSS V3. InfoPath 2007 includes the ability to publish forms directly into new or existing content types on a WSS installation. This includes setting up property promotion and demotion: you're able to choose which data fields on your form you want to map to which SharePoint columns. You get link management and merge forms functionality for free, because the content type to which you publish your form is a child of the Form content type.

    When you publish an InfoPath 2007 form, you have the option to publish it to a SharePoint site. When you do, you have the further option of publishing the form into a new or existing content type.

    (You still have the option of publishing your form to a document library, just as you did with InfoPath 2003. In InfoPath 2007, though, you can update both document libraries and form libraries, as long as that library’s default content type is a child of the Form content type. If the library doesn’t have multiple content types, then the single library content type must be a child of Form in order for you to update it. In addition, document property demotion now works in document libraries.)

    When you choose to create a new content type, you select the content type from which you want your new content type to inherit. By default, the new content type is a direct child of the Form content type, but you can specify any descendant of Form to use as your new content type's parent.

    Next, you can choose which data fields you want to promote and demote to and from SharePoint site columns. To do this, you specify the columns to which you want to map the data fields in your form. You can map the data fields to existing site columns, or create new columns.

    If you map a data field to an existing site column, InfoPath adds a <FieldRef> element to the content type for that site column. The <FieldRef> element contains a set of attributes you can override for the site column, as it applies to items assigned this content type. These attributes include those that specify where in an XML form the data field information is located. Because you’re mapping a data field (i.e., the data in your XML) to the column, InfoPath uses the Node attribute to specify an XPath expression for where the data resides in the form.

    For more information on how the built-in XML parser uses the Node attribute for data promotion and demotion, see this earlier post.

    If you create a new site column, InfoPath adds a <Field> element that represents that site column to the site, and then adds a <FieldRef> element to the content type for that site column.

    For more information on how <Field> and <FieldRef> elements differ, and how <FieldRef> elements work in a content type, see my earlier post here.

    If you choose to update the existing content type itself, you can stick with the existing columns for that content type, or add new ones. You can also remove columns for data fields you've deleted or don't want promoted or demoted anymore. If you do delete a column, InfoPath removes the <FieldRef> element from the content type definition, but leaves the site column intact (that is, the <Field> element for that site column remains in the site definition). You cannot remove any columns the content type inherits from its parent content type.

    Once you've published your form to a content type, you can use WSS to further customize the content type, such as adding columns, workflows, and policies. Also, remember that this is a site content type; in order to use the content type in a library, you have to add that site content type to the library through WSS.

    Additional Considerations

    Here are a few other things to keep in mind about InfoPath 2007 forms and form content types:

    Determining the Content Type of InfoPath Forms

    InfoPath forms themselves do not contain the content type ID of the content type they’ve been assigned. Instead, WSS uses the form template to determine the content type of the form. The form template URL is included in the <?xml> processing instruction of the form, and points to the .xsn on which the form is based.

    The built-in XML parser included in WSS first attempts to find the content type ID in the XML file, and, failing that, attempts to determine the content type based on the form template. So, for InfoPath forms, the built-in XML parser always fails its first check, and has to use the form template to determine content type.

    For more information on how the XML parser determines content type, see this earlier post.

    Setting the Product Used to Edit the XML File

    The program ID for InfoPath 2007 forms is included in the forms in the progId2 attribute of the <?mso-infopathdocument> processing instruction. In content type definitions, InfoPath maps this to the ProgID column using the PrimaryPITarget and PrimaryPIAttribute attributes.

    So, for content types for InfoPath 2007 forms, the attributes for the ProgID <FieldRef> element are set to the following values:

    PITarget = mso-infopathdocument

    PIAttribute = InfoPath.Document

    PrimaryPITarget = mso-infopathdocument

    PrimaryPIAttribute = InfoPath.Document.2

    Because WSS looks at the PrimaryPITarget and PrimaryPIAttribute pair first, the InfoPath.Document.2 attribute value is promoted to the ProgID column. The ProgID column determines which application WSS calls to open a selected file.


    Written while listening to The Twilight Singers : Twilight as Played by the Twilight Singers

  • Andrew May's WebLog

    What are Content Type IDs?


    So, a reader emailed me the other day, asking for more information on content type IDs: what they are, how to create your own, etc. Because he asked nicely, and because I like to keep both my readers happy, I decided to write up a quick overview of content type IDs.

    The truth is, this is something I had hoped to get into the Beta 2 SDK, but the clock ran out. So his question gave me the perfect excuse to write the material up for the next refresh of the SDK. Consider this a preview then.

    Because I’m planning on incorporating this material into the SDK, you’ll forgive me if I slip into my formal, developer documentation writing style.

    <Authoritative SDK Voice>

    Content type IDs uniquely identify the content type. Content type IDs are designed to be recursive. The content type ID encapsulates that content type’s “lineage”, or the line of parent content types from which the content type inherits. Each content type ID contains the ID of the parent content type, which in turn contains the ID of that content type’s parent, and so on, ultimately back to and including the System content type ID. By parsing the content type ID, you can determine which content types the content type inherits, and how two content types are related.

    Windows SharePoint Services V3 uses this information to determine the relationship between content types, and for push down operations.

    You can construct a valid content type ID using one of two conventions:

    ·         Parent content type ID + two hexadecimal values

    ·         Parent content type ID + “00” + hexadecimal GUID

    There is one special case, that of the System content type, which has the content type ID of “0x”. The System content type is the sealed content type from which all other content types ultimately inherit.

    For all other content types, you must use one of the above methods for constructing a valid content type ID.

    Note that if you use the first method, the two hexadecimal values cannot be “00”.

    A content type ID must be unique within a site collection.

    Let’s examine each of these conventions in turn.

    Windows SharePoint Services V3 uses the first method for generating content type IDs for the default content types that come included with the platform. For example, the content type ID of the Item content type, one of the most basic content types, is 0x01. This denotes that the Item content type is a direct child of System. The content type ID of the Document content type is 0x0101, and the Folder content type has a content type ID of 0x0120. By parsing these content type IDs, we can determine that both Document and Folder are direct children of Item, which in turn inherits directly from System:

    In this way you can determine not only what content types a content type inherits from, but at which point two content types have common ancestors.

    The figure below maps out the relationship of the four content types discussed above. In each, the unique portion of the content type ID is represented by blue text.

    Windows SharePoint Services V3 employs the second content type ID generation convention when creating content type IDs for:

    ·         Site content types you create based on other content types.

    ·         List content types, which are copied to a list when you add a site content type to that list.

    For example, if you have a content type with a content type ID of “0x010100D5C2F139-516B-419D-801A-C6C18942554D”, you would know that the content type was either:

    ·         A site content type that is a direct child of the Document content type, or

    ·         A list content type created when the Document site content type was added to a list.
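
    To make the two conventions concrete, here is a minimal Python sketch (not part of Windows SharePoint Services) that splits a content type ID into its lineage. The function names are hypothetical, and the sketch assumes that hyphens inside a GUID are cosmetic and can be dropped before parsing.

```python
import re

GUID_HEX_LEN = 32  # a GUID written as hexadecimal digits, hyphens removed


def lineage(content_type_id):
    """Return the chain of IDs from System ("0x") down to the given ID."""
    if not content_type_id.startswith("0x"):
        raise ValueError("content type IDs start with '0x'")
    # Assumption: hyphens inside a GUID carry no meaning and can be removed.
    body = content_type_id[2:].replace("-", "").upper()
    if not re.fullmatch(r"[0-9A-F]*", body):
        raise ValueError("non-hexadecimal characters in content type ID")

    chain = ["0x"]
    pos = 0
    while pos < len(body):
        # "00" marks the GUID convention; anything else is a two-digit suffix.
        step = 2 + GUID_HEX_LEN if body[pos:pos + 2] == "00" else 2
        if pos + step > len(body):
            raise ValueError("ID ends in the middle of a segment")
        pos += step
        chain.append("0x" + body[:pos])
    return chain


def common_ancestor(first_id, second_id):
    """Return the closest content type ID that both IDs inherit from."""
    shared = "0x"
    for a, b in zip(lineage(first_id), lineage(second_id)):
        if a != b:
            break
        shared = a
    return shared
```

    For instance, calling lineage on the example ID above yields four entries, the last-but-one being the Document ID 0x0101, and common_ancestor("0x0101", "0x0120") points back to Item.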

    In general, the first content type ID generation technique emphasizes brevity, in that it only takes two hexadecimal digits to denote a new content type. The second approach emphasizes uniqueness, as it includes a GUID to denote the new content type. Each approach is best in certain situations.

    We recommend you use the GUID approach to identify any content types that are direct children of content types you did not create. In other words, use the GUID approach if the parent content type is:

    ·         A default content type included in Windows SharePoint Services V3, such as Document.

    ·         A content type developed by a third party.

    That way, you are guaranteed that the content type ID is unique and will not be duplicated later by the developer of the parent content type.

    Once you’ve uniquely identified a content type in this manner, however, you can use the first method to identify any children of that content type. In essence, the GUID used in your content type can act as a de facto namespace for your content type. Any children based on that content type can be identified by just two hexadecimal digits. Because the maximum length of a content type ID is finite, this approach maximizes the number of content type “generations” allowable.

    Content type IDs have a maximum length of 512 bytes. Because two hexadecimal characters can fit in each byte, this gives each content type ID an effective maximum length of 1024 characters.

    For example, suppose you wanted to create a new content type, myDocument, based on the default Windows SharePoint Services V3 content type Document. For the myDocument content type ID, you start with the Document content type ID, 0x0101, and append 00 and a GUID. This uniquely identifies the myDocument content type, guaranteeing Windows SharePoint Services won’t later add another default content type with the same content type ID (which would be possible, if you had only appended two hexadecimal digits). To generate content type IDs for any content types you derive from myDocument, however, you can simply append two hexadecimal digits to the myDocument content type ID. This keeps the length of the content type ID to a minimum, thereby maximizing the number of content type “generations” allowable.
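
    The myDocument scenario can be sketched in Python as follows. This is only an illustration of the naming rules, not a SharePoint API: the helper names are hypothetical, and the sketch assumes the 1024-character limit counts the hexadecimal digits after the "0x" prefix.

```python
import string
import uuid

MAX_ID_CHARS = 1024  # 512 bytes, two hexadecimal characters per byte


def _checked(content_type_id):
    # Assumption: the limit applies to the digits that follow "0x".
    if len(content_type_id) - len("0x") > MAX_ID_CHARS:
        raise ValueError("content type ID exceeds the maximum length")
    return content_type_id


def child_by_guid(parent_id, guid=None):
    """Parent content type ID + "00" + hexadecimal GUID."""
    guid = guid or uuid.uuid4()
    return _checked(parent_id + "00" + guid.hex.upper())


def child_by_suffix(parent_id, suffix):
    """Parent content type ID + two hexadecimal digits (never "00")."""
    if len(suffix) != 2 or any(c not in string.hexdigits for c in suffix):
        raise ValueError("suffix must be exactly two hexadecimal digits")
    if suffix == "00":
        raise ValueError('"00" is reserved for the GUID convention')
    return _checked(parent_id + suffix.upper())


# myDocument derives from Document (0x0101) using the GUID convention...
my_document = child_by_guid("0x0101")
# ...and its own children only need two more digits each.
my_document_child = child_by_suffix(my_document, "01")
```

    Using the GUID once, at the point where you derive from someone else's content type, and two-digit suffixes below that point keeps each further generation only two characters longer than its parent.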

    The figure below illustrates this scenario. Again, the unique portion of each content type ID is represented by blue text.

    </Authoritative SDK Voice>

    Now, the above information is probably most useful to developers working with the XML definition of content types. This way, if you’re looking at a content type ID in an XML file, or need to generate one for a content type definition file you’re writing, you’ll understand how to construct and parse them manually.

    The SharePoint object model, on the other hand, includes members to parse and compare content type IDs. Specifically, you can use the SPContentTypeId.Parent property to find the parent of a content type without having to parse the content type ID yourself. The SPContentTypeId object also contains several methods that enable you to compare content types by ID, and identify a common ancestor.

    One last thing that might be of interest. If you want to take a look at actual content type IDs in WSS, here’s what you can do: navigate to the Content Type Gallery for a site. When you click on a content type, the URL to that content type contains a parameter, ctype, which is in fact the content type ID for that content type.


    Written while listening to: The Replacements : Let It Be

  • Andrew May's WebLog

    Content Type Technical Posters for Download


    The title pretty much says it all. For some of our earlier, internal technical events, I created a number of large-format (11” by 17”) technical illustrations to explain the more complicated aspects of enterprise content management in a SharePoint environment. The two posters I’m offering for download today deal with content types, a core concept in Windows SharePoint Services V3 that I’ve blogged about here. If you’re planning on using content types, I think it’ll be worth your while to take a look at these.

    As you can probably tell, I created the posters with Visio. I then converted them to PDF format using Visio 2007 (Beta)’s spiffy new Publish to PDF feature. I have to say I’m fairly impressed with the results. These are pretty complex diagrams, and as far as I can tell, Visio converted them flawlessly to PDF. Well done.

    Using Columns and Content Types to Organize Your Content in Windows SharePoint Services (version 3)

    This diagram explains the relationship between site and list content types, as well as content type ‘inheritance’ and customizing/deriving content types. It also illustrates how you can use site columns in your content types to ensure data uniformity, and how content types reference site columns and list columns.

    Using Content Types in Windows SharePoint Services (version 3) and SharePoint Server 2007

    Explains what content types are, and the advantages of using them to categorize and manage your enterprise content. Illustrates the conceptual structure of the various feature information you can encapsulate in a content type, such as columns, document templates, workflows, and custom solutions information such as forms and information policy.

    Note that these diagrams were created to be used with Adobe Reader 7.0, and will undoubtedly appear best using that version. I haven’t tested to determine if the diagrams display or print accurately in earlier versions.

    If you download either (or both) of the posters, take a second and let me know what you think of them in the comments section. Were they useful in explaining the various concepts they illustrated? Would you like to see something like this rolled into the SDKs at some point? I’m very visually-oriented; most of the technical illustrations I create start with me jotting something down on a napkin, to explain a concept to myself. But I have no idea if that’s true of developers in general. Are illustrations helpful in technical documentation like SDKs? Let me know what you think.

    And special thanks to Steve and Ryan and all the good folks at Office Zealot, who graciously agreed to host these diagrams for download. I greatly appreciate it. Take a few minutes and visit their site; they’ve got tons of interesting content for the Office developer. It’s time well spent.

    And check back here next week. I hope to get some object model maps done for a few of the SharePoint namespaces, and I’ll make those available for download once I do.

    Written while listening to Isobel Campbell and Mark Lanegan : Ballad of the Broken Seas

  • Andrew May's WebLog

    Bulk Editing Workflow Tasks in Office SharePoint Services 2007


    OK, we're back. Since it's now officially 2007, it seemed appropriate to focus my first entry of the New Year on Office SharePoint Server 2007, and Office InfoPath 2007.

    Shortly before the holidays, someone asked me if I had any guidance for customers who wanted to set up workflow tasks to be bulk editable. To which I promptly replied:

    You can do what?

    I swear, there's so much functionality crammed in this product that sometimes I think it's going to take me the next two product cycles just to cover it all properly. I had no idea you even could edit workflow tasks as a group.

    So, without further ado, here's the scoop:

    In Office SharePoint Server 2007, you can define workflow tasks as bulk editable, and provide a custom InfoPath form view to enable users to edit workflow tasks as a group.

    Bulk editable tasks are workflow tasks that can be edited as a group. All the tasks must be of the same task type, and from the same workflow association. For example, a user might want to mark all tasks of a certain type as 'complete' for a given workflow association.

    In addition, by using an InfoPath form as the workflow task edit form, you can include logic in the InfoPath form to switch form views depending on whether the user is editing an individual task or bulk editing tasks.

    Editing Workflow Tasks

    Workflow tasks are displayed in task lists, just as any other tasks. When the user navigates to the task list, the tasks are displayed based on the View selected, and the access control list rights (ACLs) of the user. A workflow may have several different types of tasks, but all the tasks from a given workflow association are contained on the same task list. However, a given task list can contain the tasks for multiple workflow templates and associations.

    Developers can include elements in the workflow template definition that define task types as bulk editable.

    Users can choose to process workflow tasks in bulk. Office SharePoint Server then displays a list of the workflow task types on that list that are bulk editable, sorted by workflow template. When the user selects a task type from a given workflow template, Office SharePoint Server displays the task edit form for that type. Developers can include logic in their InfoPath task edit forms that changes the form view based on whether the user is editing tasks individually, or in bulk.

    The user can then edit the tasks as a group. The workflow task edit form need not contain any special logic to bulk edit the tasks; this is completely handled by Office SharePoint Server. When the user submits the task edit form, Office SharePoint Server writes the submitted data to each task of that task type for that workflow template for all associations of that template, as a timer job.

    For information on specifying InfoPath workflow task forms, see Specifying InfoPath 2007 Forms for Workflows.

    So, here's how it looks in the user interface, from the user's perspective:

    1.      Navigate to the task list that contains the workflow tasks you want to bulk edit.

    The workflow tasks are displayed based on the View selected, and your ACLs.

    2.      From the Actions menu, select Process all tasks.

    Office SharePoint Server displays the Bulk Task Selection page. This page lists each task type on the list that is bulk editable. Task types are sorted by workflow template. Workflow associations are listed in parentheses after the task type name.

    3.      Select the task type for the workflow template you want to bulk edit, and click OK.

    Office SharePoint Server displays the task edit form. If you've included the proper logic in your form, MOSS displays the bulk edit view of the task edit form.

    4.      Enter your edits, and click Submit.

    Office SharePoint Server writes the submitted data to each task of that task type for that workflow template, as a timer job.

    Note   Because the timer job updates each task, any custom code you have included in an OnTaskChanged event for that task type will run on each task as well.

    Defining a Workflow Task as Bulk Editable

    To define a task type in a given workflow as bulk editable, you add two elements to the workflow template definition.

    How to Define a Workflow Task Type as Bulk Editable

    Add the following two elements to the MetaData element in the workflow template definition, where N represents the task type number:

    ·         TaskN_IsBulkActionable   Optional Boolean. TRUE to define the task type as a task type that can be edited in bulk.

    This element is optional. Office SharePoint Server treats task types that are not explicitly defined as bulk actionable as not bulk actionable.

    This element defines the task type as bulk editable for this specific workflow template only. Therefore, you can use the same task type in multiple workflows, and decide on a per-workflow basis whether the task type is bulk editable.

    All tasks of a given type are either bulk actionable or not, within a given workflow template.

    ·         TaskN_BulkActionableFormName   Optional Text. The task name to display for this task type in the Office SharePoint Server 2007 user interface. Users click this name to display the workflow bulk edit form.

    For more information on workflow template definitions, see Workflow Definition Schema.

    The following example has been edited for clarity.

    <?xml version="1.0" encoding="utf-8" ?>

    <!-- Copyright (c) Microsoft Corporation. All rights reserved. -->

    <Elements xmlns="">


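    As a rough illustration of the two elements described above, the following Python sketch generates a MetaData fragment for a given task type number. It is not taken from a real workflow template: the function name, the form name "Approve all", and the lowercase "true" value are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def bulk_edit_metadata(task_number, form_name):
    """Build a MetaData element that marks one task type as bulk editable."""
    meta = ET.Element("MetaData")

    # TaskN_IsBulkActionable: optional Boolean; omitted means not bulk editable.
    flag = ET.SubElement(meta, f"Task{task_number}_IsBulkActionable")
    flag.text = "true"  # assumed casing for the Boolean value

    # TaskN_BulkActionableFormName: the name users click to open the form.
    name = ET.SubElement(meta, f"Task{task_number}_BulkActionableFormName")
    name.text = form_name
    return meta


# Hypothetical task type 0 with a made-up display name.
fragment = ET.tostring(bulk_edit_metadata(0, "Approve all"), encoding="unicode")
print(fragment)
```

    In a real workflow template definition, the MetaData element would sit alongside the other settings for that workflow; see Workflow Definition Schema for the surrounding structure.
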
    Displaying a Custom Bulk Edit Task Form View

    Using InfoPath forms, you can provide a custom form view to enable users to edit workflow tasks as a group. When the user selects the task type they want to bulk edit, Office SharePoint Server displays the task edit form for that workflow task type, passing context data to the form. This context data includes an attribute, isBulkMode, which designates whether the task form is being called for bulk edit operations. Developers can include logic in their forms to specify different views based on whether the isBulkMode attribute is set to TRUE or FALSE.

    We recommend that the task edit form for each workflow task type you define as bulk editable contain two views:

    ·         One for individual task editing.

    ·         One for bulk editing of tasks.

    The bulk edit view should not load data from any particular task upon load, since Office SharePoint Server applies the data the user submits to every task of that type for the selected workflow association.

    How to Display a Custom Bulk Edit View of the Workflow Task Form

    ·         Include logic in the form that parses the context data passed to the form on load. This logic should include having the form display the bulk edit view if the isBulkMode attribute of the context data is set to TRUE.

    Note   To add the workflow context data to the form, you must first create an XML file that represents the context schema, and then import that file into the form as a secondary data source. For more information, see the Adding Workflow Context Data as a Secondary Data Source section of How to: Configure a Contact Selector Control on Your InfoPath Workflow Form.
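
    The view-switching logic itself is small. The Python sketch below only illustrates the decision; in an actual form you would express it with InfoPath rules or form code. It assumes the context data arrives as an XML element that carries isBulkMode as an attribute, and the element name and view names are hypothetical.

```python
import xml.etree.ElementTree as ET


def pick_view(context_xml, single_view="EditTask", bulk_view="BulkEdit"):
    """Choose the form view from the isBulkMode attribute of the context data."""
    root = ET.fromstring(context_xml)
    # A missing attribute is treated as individual editing (an assumption).
    value = root.get("isBulkMode", "false")
    is_bulk = value.strip().lower() == "true"
    return bulk_view if is_bulk else single_view


# Hypothetical context data for a bulk edit and for a single-task edit.
bulk = pick_view('<Context isBulkMode="TRUE" />')
single = pick_view('<Context isBulkMode="FALSE" />')
```

    Comparing the value case-insensitively keeps the check working whether the attribute is written as TRUE or true.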

    For more information on using InfoPath forms with Office SharePoint Server workflows in general, see InfoPath Forms for Workflows.

  • Andrew May's WebLog

    Working with Publisher Wizards and Templates


    Note:   This is the second in a series of entries that aim to introduce experienced programmers to the Publisher object model. The first entry covered creating Web pages programmatically. You can read it here.

    Publisher has a different concept of templates and wizards than other Office programs, such as Word. In Publisher, both terms refer to publication types on which you can base your publications, with important differences:

    ·         Publication wizards are pre-defined publications that come bundled with Publisher. These publication wizards contain text boxes and other design elements that you can customize, and to which you can add your content, in the publications you create using them. Publication wizards also contain design automation options that enable you to quickly change the layout and design of your publication.

    ·         Templates are user-created publications that you can save to use as the basis for creating other publications. If you save the template to a specific location, Publisher then makes it available as a template on the New Publication task pane in the application interface.

    Let’s examine each of these in detail.

    Using Wizards in Publisher 2003

    Publication wizards are one of the most powerful and versatile features in Microsoft® Office Publisher 2003. As mentioned above, wizards are pre-defined publication templates included as part of Publisher. Wizards enable you to quickly generate professional-looking publications in a wide range of formats, from invitations to flyers to catalogs to websites.

    When you create a publication using a wizard, Publisher populates the new publication with design elements based on the wizard type and design you choose. You can add your content to these elements, as well as customize the elements as you desire. You can change the appearance of the publication later by simply specifying a different design available for that wizard. The wizard then morphs the publication to adhere to that design scheme.

    In Publisher, morphing refers to an object’s ability (be it a shape, page, or entire publication) to change its appearance based on the user’s choice of design. Choose a different design, and Publisher automatically updates the object according to the design chosen.

    Some wizards generate multi-page publications, with different content and design options depending on the page. For example, a newsletter may include different design elements present on the front cover than on the interior pages.

    Unlike templates in applications such as Word, you cannot access or alter publication wizards themselves. However, you can create a user-defined template based on a publication wizard. In such a case, you can alter the template in any way you like, such as adding VBA code to the publication, and the template retains the morphing and other functionality of the wizard on which it is based. For more information, see “Working with Templates.”

    The Publisher object model lets you extend the design flexibility in publication wizards even further by automating the generation and customization of publications based on wizards. Any publication created based on a publication wizard template has a Wizard object as the child of its Document object. If you create a publication based on a multi-page wizard template, the individual pages in the publication may have wizard properties that apply to only that page. In such a case, each Page object in the publication contains its own Wizard object as well. For more information on programmatically working with publication wizards, see Creating and Customizing Wizard Publications in Publisher 2003.

    Publications and pages are not the only objects with wizard properties; shapes have them as well. Any shape that Publisher adds to the default appearance of a publication based on a wizard design has properties that uniquely identify it. Publisher uses these properties to keep track of shapes that ‘belong’ to the wizard design, as opposed to any custom shapes the user might add to the publication later. For more information, see Identifying wizard shapes in a publication.

    Note   Publisher also includes group wizard shapes, which are related to, but independent from, the publication and page-level wizards. Group wizard shapes are pre-defined group shapes, such as calendars, coupons or Web navigation bars, that contain design automation options. Unlike wizard publications, you can add group wizard shapes to any publication, whether or not it’s based on a publication wizard. Also, you set the design of each group wizard shape individually. For more information, see Working with group wizard shapes.

    Creating and Using Templates in Publisher 2003

    You can also create templates in Publisher. Templates are especially useful if you create certain publications, such as newsletters, flyers, or postcards, over and over again. Templates enable you to design master publications that reflect your company’s brand and identity; you can then use that template to create new publications, adding only the information that is unique to each publication.

    Publisher templates are simply publications saved to a specific user directory. Each time you launch a new instance of Publisher, the application makes the publications in that directory available as templates on the New Publication task pane.

    To create a template, save your publication to the following user directory:

    Drive:\Documents and Settings\userName\Application Data\Microsoft\Templates

    Where Drive represents the computer drive letter, and userName represents the name of the user to whom you want to make this publication available as a template. If you want to make the template available to multiple users on the same computer, you must save the template to the above location in each user’s directory.

    Publisher displays available templates under Templates in the New Publication task pane. Because Publisher loads the available templates on launch, you need to open a new instance of Publisher for the new template to be available. Even then, the new template is not displayed in any instances of Publisher launched before the template was saved.

    If you do not have any templates saved, the Templates folder does not appear on the New Publication task pane.

    To save a publication as a template programmatically, use the Document.SaveAs method, specifying the file location above in the Filename parameter. This is directly equivalent to selecting Save as type: Publisher Template in the Save As dialog box.

    The following function saves the specified document as a template:

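    Function SaveAsTemplate(pub As Document, fileName As String)
      ' Saves the publication to the Templates folder so that Publisher
      ' lists it as a template the next time the application starts.
      ' The folder is hard-coded for a user named "user"; change it to suit.
      With pub
       .SaveAs fileName:= _
       "C:\Documents and Settings\user\Application Data\Microsoft\Templates\" _
          & fileName, _
          Format:=pbFilePublication
      End With
    End Function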

    You can specify custom categories to group your templates. The custom categories are listed under Templates in the New Publication task pane. To do this, specify a category for the publication before you save it as a template. From the File menu, select Properties. On the Summary tab, enter a value for Category. If you do not specify a category for your template, Publisher displays it in a category named My Templates by default.

    Note   There is no way to programmatically set a publication's properties, such as Category, using the Publisher object model.

    Any Visual Basic for Applications (VBA) code contained in the template gets copied into any publications you later create based on the template.

    If you create a template based on a publication wizard, any publications you create based on the template retain the design automation functionality of the wizard.

    Publications based on a template retain no link to that template. Any changes you later make in the template are not propagated to any publications previously created from that template.

    Programmatically, work with templates as you would with any other Publisher files. There are no object model objects or properties specific to templates. For example, you cannot create a publication based on a template using the NewDocument method. In such a case, use the Application.Open method to open the template file directly, and then use the Document.SaveAs method to create a new publication based on the template.

    Making Macros Available in Publisher Templates

    As mentioned above, you cannot edit any of the publication wizards included in Publisher. You can, however, create a template based on a publication wizard. This template could include any code you wanted to make available for publications based on that publication wizard.

    However, because users employ a number of publication wizards to create a wide range of publication types, in many cases it’s not practical to make your macro code available through creating templates. If you had particular macro functionality you wanted to make available for all publication types, you would have to create a separate template with that code for each publication wizard. In such cases, it’s best to just create an add-in for Publisher and deploy your code that way.

  • Andrew May's WebLog

    PowerPoint: Pause a Sound File During Slide Show


    Here’s one of my rare posts concerning things you can do without using code.

    I got an email the other day from a user who had read my blog entries about inserting sound files using code. He had an issue he was looking for help with. He didn’t seem all that comfortable with programming, but was willing to learn some coding if it solved his problem.

    Here’s what he wanted to do: He had recorded narration for a slide as an mp3 file. He knew how to insert the file, and set it to play automatically, but he also wanted to have a Pause/Play button on the slide itself, so he could pause the narration at will during the slide show.

    I knew how I’d set that up in code, but it seemed like a reasonable thing that users would want to be able to do without resorting to automation. So I went to our ever-helpful end-user writers, and sure enough, you can do what he wanted to without writing a single line of code. After they showed me how, I wrote up instructions and mailed them to him. Then I figured I might as well post them on my blog, in case anyone else was trying to do something similar. So here they are.

    When you break it down, here’s what we actually want to do:

    ·        Insert narration that plays automatically when the slide loads.

    ·        Add a button that lets you pause the narration during the slide show.

    So here are the four general tasks we need to complete to set this up:

    1.      Insert the sound file on the slide, and set it to play automatically.

    2.      Add a ‘Pause’ button to the slide.

    3.      Add a custom animation to the slide that pauses the sound file.

    4.      Set the custom animation so that it’s triggered when you click the Pause button.

    First, let’s insert the sound file:

    1.      From the Insert menu, select Movies and Sounds, and then select the option for the sound you want to insert (Sound From File, etc.).

    2.      We want the narration to start when the slide loads, so click Automatically when prompted.

    Next, add the action button:

    1.      From the Slide Show menu, select Action Buttons, and then click the button you want to use.

    2.      Draw the button on your slide.

    3.      When the Action Settings dialog box appears, under Action on Click, select None, and then click OK.

    Next, add a custom animation to your slide:

    1.      From the Slide Show menu, click Custom Animation.

    The Custom Animation pane appears.

    2.      In the Custom Animation pane, select Add Effect, then Sound Actions, and then Pause. This inserts a Pause animation into your animation sequence.

    Finally, set the Pause animation to be triggered when the user clicks your ‘Pause’ action button:

    1.      Select the Pause animation in the Custom Animation pane.

    2.      Right-click and select Timing.

    3.      On the Timing tab, click Triggers. Then click Start effect on click of, and select the action button from the pull-down list. Click OK.

    That should do it. Now, when the sound file plays, clicking the action button will pause the file playback. Clicking the button again starts the sound file from where it was paused.

    The end-user writers were also kind enough to suggest some online articles and trainings that cover these issues in more detail.

    Here’s an article that’s about triggers:

    Use triggers to create an interactive slide show in PowerPoint

    This course covers the new sound options available in PowerPoint 2003:

    Playing Sound

    While the next link is to a course focused on video, its second lesson tells how to set up the button panel to play the video (sound, in this case).

    Playing movies

    And, as always, you can also check out the public PowerPoint newsgroup. It’s a great place to get quick answers for stuff like this.

    PowerPoint General Questions

  • Andrew May's WebLog

    Publisher 2003: Creating and Managing Linked Text Boxes


    (Note:   This is the third in a series of entries that aim to introduce experienced programmers to the Publisher object model.

    The first entry covered creating Web pages programmatically. You can read it here.

    The second entry covered working with wizards and templates. You can read it here.)

    While the way you work with text in the Publisher object model is very similar to how you work with text in word processing applications such as Word, it does differ in one very important respect: because Publisher is a desktop design and publishing application, it gives you the ability to include multiple text flows in a single publication, and to programmatically control how those flows are laid out and formatted.

    For example, suppose you needed to create a newsletter. You would probably want to include several different articles, each with their own distinct content and formatting. In addition, a given article might continue from one page to another; those pages might not be contiguous. The final result might look something like the three-page newsletter in Figure 1.


    Figure 1. Layout of a sample newsletter.

    In Publisher, each distinct flow of text is referred to as a story. A given story may span one or more textboxes, on one or more pages. Those pages need not be contiguous. Consider the sample newsletter in Figure 1. The first story is contained in a single text box on the first page. The second story starts in a text box on page one, then concludes in another textbox on page three. Story three is contained in three text boxes: two on page two, and a single text box on page three.

    Publisher lets you link multiple text boxes together to contain a story, and automatically manages how text flows from one text box to the next. If there isn't room to display all the story text in the first text box, the text flows into the second, and so on. If there's too much text to display in the final text box, Publisher stores the remaining text in an overflow buffer. Publisher automatically adjusts the amount of text contained in each text box as you resize the textboxes, format the text, etc.

    Using the Story Object to Manage Text

    When working with story text through the Publisher object model, it is important to distinguish between the story itself, and the individual text boxes that contain it. Each story in a given publication is represented by a Story object contained in the Document.Stories collection. Operations performed on a Story object affect the entire story, regardless of the textboxes in which it is contained. For example, the following code sets the font size for a story to 12 points, for all the text boxes that contain that story.
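    As a minimal sketch of that story-level operation, assuming the active publication contains at least one story (the procedure name and the story index 1 are illustrative only):

```vba
Sub SetStoryFontSize()
  ' Hypothetical example: the story index (1) is an assumption.
  If ActiveDocument.Stories.Count >= 1 Then
    ' Affects the whole story, whichever text boxes hold it.
    ActiveDocument.Stories(1).TextRange.Font.Size = 12
  End If
End Sub
```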


    The following example, on the other hand, sets the font size for only the story text contained in the specified text box:

    ActiveDocument.Pages(1).Shapes("Story2TextBox1") _
      .TextFrame.TextRange.Font.Size = 12

    You cannot create or delete stories through the Stories collection. Rather, when you add a text box to the publication, you are also adding a new story to the publication. For example, the following code adds a text box, and therefore a story, to the active publication, even though no text has been specified for the text box.

    ActiveDocument.Pages(1).Shapes.AddTextbox _
      Orientation:=pbTextOrientationHorizontal, _
      Left:=72, Top:=72, Width:=200, Height:=200

    If you then deleted the text box, or linked it to an existing text box, the number of stories in the publication would decrease by one. Use the Stories.Count property to return the number of stories in a publication at a given time.

    To delete a story, you must delete each text box that contains the story text.

    Use the TextFrame property to access the text frame of the first text box of a story. Use the TextRange property to return the full text of the story.

    A story may also be placed within a table, in which case the story would be contained in one or more Cell objects rather than TextFrame objects. In such a case, use the Cell.TextRange property to access the text in a specific table cell. Use the HasTextFrame property to determine if a story is contained in one or more TextFrame objects.
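    To see how these properties fit together, here is a small sketch that walks the Stories collection and reports, for each story, whether it lives in text frames and how many characters it holds. The procedure name is made up for illustration, and the sketch assumes it runs as a macro inside Publisher and that TextRange.Length gives the story's character count.

```vba
Sub ListStories()
  Dim i As Long
  Dim st As Story
  Dim summary As String

  For i = 1 To ActiveDocument.Stories.Count
    Set st = ActiveDocument.Stories(i)
    summary = summary & "Story " & i & ": "
    If st.HasTextFrame Then
      ' The story is held in one or more text frames (not table cells).
      summary = summary & "text frame(s), "
    Else
      summary = summary & "no text frame, "
    End If
    summary = summary & st.TextRange.Length & " characters" & vbCrLf
  Next i

  MsgBox summary, , "Stories"
End Sub
```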

    The only objects to which you can link a text box are:

    ·         An empty text box that is not already part of a chain of connected text boxes.

    ·         A drawing shape that has a text frame. To determine if a shape has a text frame, use the Shape.HasTextFrame property.

    Using the TextFrame Object to Manage Story Content

    Each text box contains a TextFrame object, which contains the text in the shape, as well as the properties that control the margins and orientation of the text frame. The TextFrame object includes several members that enable you to determine if a text box is part of a linked story, and to set or sever the connections between text boxes.

    To determine if a text box is connected to a preceding or following text box, use the HasPreviousLink and HasNextLink properties, respectively. To access the text frames of those connected shapes, use the PreviousLinkedTextFrame and NextLinkedTextFrame properties. To connect one text box to the text box you want to follow it, use the NextLinkedTextFrame property as well.
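    As a rough sketch of how those properties can be combined, the following macro starts from the selected text box, walks back to the first text frame in its chain, and then counts forward. The procedure name is hypothetical, and the code assumes the first selected shape is a text box.

```vba
Sub CountLinkedTextBoxes()
  Dim tf As TextFrame
  Dim boxCount As Long

  ' Assumes the first selected shape is a text box.
  Set tf = Selection.ShapeRange(1).TextFrame

  ' Walk back to the first text frame in the chain.
  Do While tf.HasPreviousLink
    Set tf = tf.PreviousLinkedTextFrame
  Loop

  ' Walk forward, counting each text frame in the chain.
  boxCount = 1
  Do While tf.HasNextLink
    Set tf = tf.NextLinkedTextFrame
    boxCount = boxCount + 1
  Loop

  MsgBox "This story spans " & boxCount & " text box(es).", , _
      "Linked Text Boxes"
End Sub
```

    An unlinked text box simply reports a count of one, because neither loop runs.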

    To break the forward link for a specified text frame, use the BreakForwardLink method. Applying this method to a shape in the middle of a chain of shapes with linked text frames will break the chain, leaving two sets of linked shapes. All of the text, however, will remain in the first series of linked shapes.

    The following example illustrates the relationship between a story and the connected text boxes that contain it. The example adds three text boxes to the active publication, and adds text to the first text box. At this point, with the text boxes not connected, three stories have been added to the publication's Stories collection. Next, the example links the three text boxes together by setting the NextLinkedTextFrame property of the first two text boxes. By doing this, two stories have been removed from the Stories collection. Note that the code calls the ValidLinkTarget method to determine if each text frame is a legitimate target to which to link.

    Finally, the third text box is disconnected by calling the BreakForwardLink method for the second text box. The story text is now stored only in the first two text boxes, and the overflow buffer if necessary. In addition, text box three now represents a new, empty story.

    Sub BreakTextLink()
      Dim tb1 As Shape
      Dim tb2 As Shape
      Dim tb3 As Shape

      ' Add three unlinked text boxes; each one starts its own story.
      Set tb1 = ActiveDocument.Pages(1).Shapes.AddTextbox _
          (Orientation:=pbTextOrientationHorizontal, _
          Left:=72, Top:=36, Width:=72, Height:=36)
      tb1.TextFrame.TextRange.Text = "This is some text. " _
          & "This is some more text. This is even more text. " _
          & "And this is some more text and even more text."

      Set tb2 = ActiveDocument.Pages(1).Shapes.AddTextbox _
          (Orientation:=pbTextOrientationHorizontal, _
          Left:=72, Top:=108, Width:=72, Height:=36)

      Set tb3 = ActiveDocument.Pages(1).Shapes.AddTextbox _
          (Orientation:=pbTextOrientationHorizontal, _
          Left:=72, Top:=180, Width:=72, Height:=36)

      ShowStoryCount

      ' Link the three text boxes into a single chain (one story).
      If tb1.TextFrame.ValidLinkTarget(tb2) Then
        tb1.TextFrame.NextLinkedTextFrame = tb2.TextFrame
      End If

      If tb2.TextFrame.ValidLinkTarget(tb3) Then
        tb2.TextFrame.NextLinkedTextFrame = tb3.TextFrame
      End If

      ShowStoryCount

      ' Disconnect the third text box; it becomes a new, empty story.
      tb2.TextFrame.BreakForwardLink

      ShowStoryCount
    End Sub

    Function ShowStoryCount()
      MsgBox "There are currently " & _
        ActiveDocument.Stories.Count & _
        " stories in this publication.", , "Story Count"
    End Function

    There is no object in the Publisher object model that represents the overflow buffer for a specific story. Each TextFrame has an Overflowing property, however, that indicates whether text is overflowing from that text box into the overflow buffer. For linked text frames, only the final text frame can be overflowing.

    For a given story, the text in the overflow buffer is the difference between the End property of the final linked TextFrame in the story, and the End property of the Story object itself.

    For example, the following procedure retrieves the text in the overflow buffer. First, the code determines if the selected text box has overflow text. If it does, the procedure retrieves the text contained in the overflow buffer by using the Characters method. This method returns a TextRange object that contains the characters from the end of the text box TextRange object to the end of the Story object. The code then uses the Text property of the resulting TextRange object to return a string representing the overflow text, which it then displays in a message box.

    Sub GetOverflowText()

      Dim ot As String

      With Selection.ShapeRange(1).TextFrame

        If .Overflowing Then

          With .TextRange

            ot = .Characters(.End, .Story.TextRange.End).Text

            MsgBox prompt:=ot, Title:="Overflow Text"

          End With

        End If

      End With

    End Sub

    Each TextFrame object also has an AutoFitText property, which lets you set how you want Publisher to deal with overflowing text:

    ·         Allow the text to overflow

    ·         Reduce the text size so that it fits in the text frame

    ·         Reduce or enlarge the text size so it fills the text frame

    The following text boxes cannot be part of a chain of connected text boxes: headers or footers, navigation bars, inline objects, personal information text boxes, text boxes already containing text, or text boxes set to automatically reduce text size.
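    Here is a hedged sketch of setting the auto-fit behavior on the selected text box. The procedure name is made up, and the PbTextAutoFitType constant names are given as I recall them from the Publisher type library, so check them in the Object Browser. Leaving linked text boxes untouched is a cautious choice on my part, given the restriction above, not documented behavior.

```vba
Sub ShrinkTextOnOverflow()
  ' Assumes the first selected shape is a text box.
  With Selection.ShapeRange(1).TextFrame
    If .HasNextLink Or .HasPreviousLink Then
      MsgBox "This text box is linked, so auto-fit is left unchanged."
    Else
      ' The other PbTextAutoFitType values are pbTextAutoFitNone
      ' (allow overflow) and pbTextAutoFitBestFit (shrink or grow to fill).
      .AutoFitText = pbTextAutoFitShrinkOnOverflow
    End If
  End With
End Sub
```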

    Finally, each TextFrame and TextRange object has a Story property, which enables you to access the Story object associated with a specific text frame or text range.

    The Publisher Story Object Model

    Figure 2 illustrates the structure of the Publisher object model surrounding the Story object. It also includes the TextFrame properties concerned with manipulating the text frames of a specific story.


    Figure 2. The Story Object Model Structure

  • Andrew May's WebLog

    Importing Content into OneNote 2003 SP1 Preview


    The OneNote Service Pack 1 Preview is currently available for download. This Service Pack includes some cool new features that'll be of great interest to developers. We've already been seeing some questions about this new functionality in the newsgroups, so we're posting a draft of the article that will appear on MSDN once SP1 is released. Keep in mind that, while we've tried to make sure the information is as accurate and complete as possible, it is a draft document, and provided as such. Void where prohibited.


    Anyway, here's the article:

    Importing Content into OneNote 2003 SP1 Preview

    Applies To:

        Microsoft Office OneNote 2003 SP1

    Summary:    Learn about the new extensibility features available for developers in Microsoft Office OneNote 2003 SP1. The new OneNote 1.1 Type Library includes functionality which enables you to programmatically import images, ink, and HTML into OneNote.


    For the Service Pack 1 (SP1) Preview, OneNote 2003 has added extensibility functionality that enables applications to interoperate with it in an important, fundamental way—they can add content to OneNote notebooks. You can now push content to OneNote that includes html, images, and even ink (such as from a Tablet PC). You can even create the folder, section, or page onto which you want to place your content.

    Note These extensibility features are only available in the OneNote 2003 Service Pack 1 Preview. You can upgrade to the OneNote SP1 Preview here.

    Using the CSimpleImporterClass

    OneNote SP1 exposes the OneNote 1.1 Type Library, which consists of a single class, CSimpleImporterClass, which enables you to programmatically add content to a OneNote notebook. You can add text in html form, images, and even ink from a Tablet PC. The CSimpleImporterClass enables you to specify where in the notebook you want to place the content; you can even create new folders, sections, and pages for content, and then programmatically display the desired page. OneNote’s import functionality also lets you later delete the content you import.

    The CSimpleImporterClass consists of two methods:

    ·         Import, which enables you to add, update, or delete images, ink, and html content to a specific page in a OneNote folder and section.

    ·         NavigateToPage, which enables you to display a specified page.

    To use the CSimpleImporterClass, you must add a reference to the OneNote 1.1 Type Library to your project. To add a reference in Visual Studio .NET, in the Solution Explorer window, right-click References and then click Add Reference. On the COM tab, select OneNote 1.1 Type Library in the list, click Select, and then click OK.

    While this article focuses on implementing OneNote’s import functionality using .NET languages, you can also use the OneNote 1.1 Type Library with unmanaged code, such as Visual Basic 6.0 or Visual C++.

    The Data Import Schema

    The Import method has the following signature:

    Import (bstrXml as String)

    The method takes an xml string describing the content object(s) you want to import, as well as the location in the notebook where you want them placed. You can also use the import command to delete objects you have previously placed in the notebook.

    When called, OneNote executes the Import method with minimal intrusion on the user. OneNote does open if it is not already opened, which means the OneNote splash screen displays, but OneNote does not assume focus. Nor does it change the user’s location in the notebook if OneNote is already running. To change the focus of the OneNote application to the new content, use the NavigateToPage method, discussed later.

    If the Import method fails, OneNote does not display an error to the user. However, the COM interface does return an “Unknown Error” to the application making the call.

    The figure below outlines the xml schema to which the import file must adhere.


    Figure 1. XML Schema Structure of the Import Root Element



    Figure 2. Schema Structure of the PlaceObjects Element

    The OneNote data import schema can be found at The OneNote 1.1 SimpleImport XML Schema.

    Note The namespace for the Import method will be different in the final version of OneNote SP1 from what it is in the SP1 Preview.

    The current namespace for the OneNote SP1 Preview is:

    While the final namespace for OneNote SP1 will be:

    Be advised that if you’re programming against the Preview namespace, you must update your code for the new namespace in order for it to be compatible with the final OneNote SP1.

    There are two elements directly below the root <Import> element. Use the first element, <EnsurePage>, to make sure the folder, section, and page on which you want to place content exist. Use the second element, <PlaceObjects>, to actually place objects on, or delete objects from, the page. The schema requires that the root element contain at least one <EnsurePage> or <PlaceObjects> element. Any <EnsurePage> elements must appear before any <PlaceObjects> element.

    Creating Folders and Pages for Content

    Before you import content, the target pages for that content must exist in the OneNote notebook. Use the <EnsurePage> element to verify or create the target pages for your content. For each page you specify in an <EnsurePage> element, OneNote checks to determine if the page exists, and if not, creates it. You can even create new notebook folders and sections by specifying folders or sections that don’t exist.

    You are required to pass OneNote a string representing the path to the desired page, as well as a GUID for that page. If the path is not fully-qualified, OneNote assumes the string starts from the notebook root location for the user. Additionally, you can specify the title, date, reading direction, and page placement relative to other pages in the notebook.

    By default, OneNote inserts each new page at the end of the specified section. If you specify a page GUID for the insertAfter attribute, OneNote inserts the new page as a sub-page of the page whose GUID you specified. In such cases, OneNote labels the sub-page with the title and date of the page after which it’s inserted, rather than what you specify in the title and date attributes for the sub-page. If the page you specify does not exist (for example, if it was never added, or the user deleted it), then OneNote ignores the insertAfter attribute and inserts the new page at the end of the specified section, with any specified title and date values.

    Consider the following example. This <EnsurePage> element specifies a page in the OneNote section titled Negotiations, in the folder Johnson Contract, in the user’s root notebook folder. The page is titled “Budget Concerns”.

          <EnsurePage path="Johnson Contract\"


                      title="Budget Concerns"/>

    OneNote uses the optional attributes of the <EnsurePage> element if it creates the page you specify. If you specify attributes for an existing page, OneNote leaves the page unchanged. For example, if you use a GUID for an existing page, and specify a title that differs from that page’s current title, OneNote does not change the page title.

    Additionally, OneNote only searches the path you specify for the desired page GUID. If the page GUID does not exist in the specified section, OneNote creates it; it does not look for the GUID in other sections of the notebook.

    You can use multiple <EnsurePage> elements to create multiple pages within the OneNote notebook. You must verify or create the page before you can place content on it. You are not required to include an <EnsurePage> element for each page on which you want to place content. However, if you use the <PlaceObjects> element to try and place objects on a page that does not exist, the Import method fails. In some cases, this may be the desired outcome; for example, if you only wanted to update content on a page if the page still exists, and not add the content if the page has been deleted by the user.

    Placing Content on Pages

    Once you’ve ensured that the pages onto which you want to import data exist in the OneNote notebook, you can start placing objects on them using the <PlaceObjects> element. You can import multiple objects to multiple pages if desired. Create a <PlaceObjects> element for each page on which you want to place content. Like the <EnsurePage> element, <PlaceObjects> has two required attributes: the path to the page, and the GUID assigned to the page. You must include at least one <Object> element in each <PlaceObjects> element.

    To create the xml string that describes the content you want to import into OneNote, follow these general steps:

    ·         If you want to make sure the page exists to place content onto, create an <EnsurePage> to verify or create the folder, section, and page as necessary.

    ·         Create a <PlaceObjects> element for the page to which you want to add or delete content.

    ·         Create an <Object> element for the first object you want to alter (add, update, or delete) on the page.

    ·         To delete the object, add the <Delete/> element to that object.

    ·         To import the object, add a <Position> element and use its x and y attributes to specify where on the page to place the object.

    ·         Specify the type of object you’re importing to the page by using the appropriate element, <Image>, <Ink>, or <Outline>, and setting the appropriate optional attributes, if desired.

    ·         If you’re importing an outline object, specify the sub-objects the outline contains in the order in which you want them to appear in the outline. You can specify any number and order of <Image>, <Ink>, and <Html> elements. However, you are required to specify at least one <Image>, <Ink>, or <Html> element for the outline.

    ·         Repeat this procedure for all objects you want to alter on the page. Then repeat this procedure for all pages on which you want to alter content.
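    The steps above can be sketched as a small string-building function. This is only an illustration, not the schema reference: the function name is hypothetical, the namespace is passed in as a parameter instead of being hard-coded, and the guid and pageGuid attribute names and the <Data> wrapper inside <Html> are assumptions that you should check against the published schema.

```vba
' Builds a minimal Import string: one page, one outline holding one
' block of HTML. Names marked (assumed) are not shown in the text above.
Function BuildImportXml(ByVal ns As String, ByVal pagePath As String, _
    ByVal pageGuid As String, ByVal objectGuid As String, _
    ByVal htmlText As String) As String

  Dim q As String
  Dim xml As String
  q = Chr$(34)   ' double-quote character

  xml = "<?xml version=" & q & "1.0" & q & "?>" & vbCrLf
  xml = xml & "<Import xmlns=" & q & ns & q & ">" & vbCrLf

  ' Make sure the target page exists before placing anything on it.
  xml = xml & "  <EnsurePage path=" & q & pagePath & q & _
      " guid=" & q & pageGuid & q & "/>" & vbCrLf      ' guid (assumed)

  ' Place one outline object one inch from the top-left of the page.
  xml = xml & "  <PlaceObjects pagePath=" & q & pagePath & q & _
      " pageGuid=" & q & pageGuid & q & ">" & vbCrLf   ' pageGuid (assumed)
  xml = xml & "    <Object guid=" & q & objectGuid & q & ">" & vbCrLf
  xml = xml & "      <Position x=" & q & "72" & q & _
      " y=" & q & "72" & q & "/>" & vbCrLf
  xml = xml & "      <Outline>" & vbCrLf
  ' <Data> wrapper (assumed); the HTML itself goes in a CDATA block.
  xml = xml & "        <Html><Data><![CDATA[" & htmlText & _
      "]]></Data></Html>" & vbCrLf
  xml = xml & "      </Outline>" & vbCrLf
  xml = xml & "    </Object>" & vbCrLf
  xml = xml & "  </PlaceObjects>" & vbCrLf
  xml = xml & "</Import>"

  BuildImportXml = xml
End Function
```

    Note that the sketch does not escape special characters in attribute values, so a path or GUID containing a quote, ampersand, or angle bracket would need to be escaped first.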

    Some other technical requirements to keep in mind as you create the xml string:

    ·         OneNote positions objects based on absolute x and y coordinates, where x and y represent measurements in points. Seventy-two points equal one inch.

    ·         Ink objects must be described in Ink Serialized Format (ISF), either base-64 encoded or specified by a path to the source file. If you specify a file path, the source file should be a plain file with the byte stream containing the ISF. If you include the ink as data in the XML, it should be base-64 encoded. For more information on programmatically capturing and manipulating ink, see this Ink Serialization Sample.

    ·         Image objects can be specified by a path to a source file, or base-64 encoded.

    ·         Text must be described in html, within a CDATA block, or specified by a path to a source file. In a CDATA block, all characters are treated as a literal part of the element’s character data, rather than as XML markup. XML and HTML use some of the same characters to designate special processing instructions. Using the CDATA block prevents OneNote from misinterpreting HTML content as XML instructions.

    ·         Although the schema does not currently support importing audio or video files, you can include links to these files within the HTML content you import.
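    Two of these requirements are easy to get wrong when you build the string by hand, so here is a sketch of a pair of helper functions. Both names are hypothetical. The first converts inches to points for the x and y attributes. The second wraps HTML in a CDATA block and guards against the one sequence, "]]>", that would otherwise close the block early.

```vba
' Converts inches to points (72 points = 1 inch).
Function PointsFromInches(ByVal inches As Double) As Double
  PointsFromInches = inches * 72
End Function

' Wraps HTML in a CDATA block. A literal "]]>" inside the HTML would end
' the block early, so it is split across two adjacent CDATA blocks.
Function WrapInCData(ByVal html As String) As String
  WrapInCData = "<![CDATA[" & _
      Replace(html, "]]>", "]]]]><![CDATA[>") & "]]>"
End Function
```

    For example, PointsFromInches(1.5) returns 108, and WrapInCData("<p>Hi</p>") returns <![CDATA[<p>Hi</p>]]>.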

    Updating Content

    To update objects that are already in a notebook, simply re-import the objects to the same page, using the same GUIDs. Be aware, however, that re-importing an object overwrites that object without notifying the user. Any changes made to the content since it was last imported are lost.

    Deleting Content

    To delete an object, place the <Delete/> element within the object element. To delete an object, you must be able to identify it by its GUID and path. In practical terms, this generally means an application can only delete objects it places in OneNote to begin with. However, if the application stores the GUID across sessions, it can delete objects it imported into OneNote in previous sessions. You cannot delete folders, sections, or pages, even those you created.
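    A delete request can be sketched the same way. Again, the function name is hypothetical, the namespace is supplied by the caller, and the pageGuid attribute name is an assumption to verify against the schema. The sketch deliberately omits an <EnsurePage> element, so the import fails instead of recreating a page the user has already deleted.

```vba
' Builds an Import string that deletes one previously imported object.
Function BuildDeleteXml(ByVal ns As String, ByVal pagePath As String, _
    ByVal pageGuid As String, ByVal objectGuid As String) As String

  Dim q As String
  q = Chr$(34)   ' double-quote character

  BuildDeleteXml = _
      "<?xml version=" & q & "1.0" & q & "?>" & vbCrLf & _
      "<Import xmlns=" & q & ns & q & ">" & vbCrLf & _
      "  <PlaceObjects pagePath=" & q & pagePath & q & _
      " pageGuid=" & q & pageGuid & q & ">" & vbCrLf & _
      "    <Object guid=" & q & objectGuid & q & ">" & vbCrLf & _
      "      <Delete/>" & vbCrLf & _
      "    </Object>" & vbCrLf & _
      "  </PlaceObjects>" & vbCrLf & _
      "</Import>"
End Function
```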

    Sample XML String

    Below is an example of what a typical xml string for the Import method might resemble. This xml file describes the placement of three new objects onto an existing page, and the deletion of an object already contained on that page.

    <?xml version="1.0"?>

    <Import xmlns="">


          <EnsurePage path="MSN Instant Messenger\"






          <PlaceObjects pagePath="MSN Instant Messenger\"



                <Object guid="{5FCFD7F9-02C2-42fc-B6AF-7A8450D43C2D}">

                      <Position x="72" y="72"/>

                      <Image backgroundImage="true">

                            <File path="c:\image.png"/>




                <Object guid="{F6FC4149-1092-48ea-806D-0067C8661A18}">

                      <Position x="72" y="72"/>


                            <File path="c:\ink.isf"/>




                <Object guid="{7EA551C4-F778-40ce-9181-21A3DB6D33CA}">

                      <Position x="72" y="432"/>

                      <Outline width="360">




                                              <html><body><p>Sample text here.</p></body></html>







                <Object guid="{1A6648BA-D792-48f1-AC6A-43DF6E258851}">







    The following example demonstrates a basic implementation of the OneNote import functionality. The code displays a dialog that enables the user to select an xml file, and then passes the contents of that xml file as an argument for the Import method. This example assumes the xml file conforms to the OneNote data import schema. This example also assumes the project contains a reference to the OneNote 1.1 Type Library.

      ' Requires Imports System.IO at the top of the code file.
      Dim strFileName As String
      Dim XmlFileStream As StreamReader
      Dim strImportXml As String
      Dim objOneNote As OneNote.CSimpleImporterClass

      OpenFileDialog1.Filter = "XML files (*.XML)|*.XML|Text files (*.TXT)|*.TXT"

      ' Let the user pick the file; do nothing if the dialog is cancelled.
      If OpenFileDialog1.ShowDialog() = System.Windows.Forms.DialogResult.OK Then
        strFileName = OpenFileDialog1.FileName

        objOneNote = New OneNote.CSimpleImporterClass
        XmlFileStream = New StreamReader(strFileName)
        strImportXml = XmlFileStream.ReadToEnd()
        XmlFileStream.Close()

        ' Pass the contents of the XML file to OneNote.
        objOneNote.Import(strImportXml)
      End If

    For the sake of simplicity, so as to highlight how the Import method is implemented, this example assumes that an XML file has already been created to use as the string for the Import method. In most cases, however, the application that calls the Import method will first create the XML string itself. For more information on creating XML using the .NET framework, see Well-Formed XML Creation with the XMLTextWriter.

    In addition, most applications will need to create and assign GUIDs to the pages and objects they create. Use the NewGuid method to create a new GUID, and the ToString method to get the string representation of the value of GUID, which the XML string requires. For more information, see GUID Structure in the .NET Framework Class Library.

    Displaying a Specified Page

    By design, the Import method executes with minimal focus, so that when you import data, the user is not distracted by OneNote displaying data they might not want to see, or worse, navigating away from the OneNote page the user is currently using. Also, in cases where you import multiple objects to multiple pages, OneNote does not have to make assumptions about which page the user wants to see, if any.

    To display a specific page, use the NavigateToPage method. If OneNote is not open, this method opens OneNote to the specified page. If OneNote is already open, the method navigates to the specified page in the current instance of OneNote.

    To select the page to display, you must specify the path to the page, as well as the GUID for that page. If you specify a page that does not exist, OneNote returns an error.

    The NavigateToPage method has the following signature:

    NavigateToPage(bstrPath as String, bstrGuid as string)


    OneNote’s new import functionality opens up exciting possibilities for interacting with other applications. Any application that can save data (either its own or another application’s) as html text, images, or ISF can now push that content into OneNote and place it wherever is desired. And as long as the application retains the GUIDs used, it can update or delete the content it pushed whenever necessary.

  • Andrew May's WebLog

    Document Parsers in SharePoint (1 of 4): Overview


    Now that I’ve talked about the built-in XML parser, and how you can use it to promote and demote document properties for XML files, you might be thinking: what about custom files types that aren’t XML? What if I’ve got proprietary binary file types from which I want to promote and demote properties to the SharePoint list?

    We’ve got you covered there as well.

    For the next four entries, I’m going to go over in detail how to construct and register a custom parser that enables you to promote and demote properties between your custom file types and Windows SharePoint Services.

    This information will get rolled into the next update of the WSS SDK, so consider this a preview in case you want to work with the parser framework right now.

    Custom Document Parser Overview

    Managing the metadata associated with your document is one of the most powerful advantages of storing your enterprise content in WSS. However, keeping that information in synch between the document library level and in the document itself is a challenge. WSS provides the document parser infrastructure, which enables you to create and install custom document parsers that can parse your custom file types and update a document for changes made at the document library level, and vice versa. Using a document parser for your custom file types helps ensure that your document metadata is always up-to-date and synchronized between the document library and the document itself.

    A document parser is a custom COM assembly that, by implementing the WSS document parser interface, does the following when invoked by WSS:

    ·         Extracts document property values from a document of a certain file type, and passes those property values to WSS for promotion to the document library property columns.

    ·         Receives document properties, and demotes those property values into the document itself.

    This enables users to edit document properties in the document itself, and have the property values on the document library automatically updated to reflect their changes. Likewise, users can update property values at the document library level, and have those changes automatically written back into the document.

    I’ll talk about how WSS invokes document parsers, and how those parsers promote and demote document metadata, in my next entry.

    Parser Requirements

    For WSS to use a custom document parser, the document parser must meet the following conditions:

    ·         The document parser must be a COM assembly that implements the document parser interface.

    I’ll go over the details of the IParser interface in a later entry.

    ·         The document parser assembly must be installed and registered on each front-end Web server in the WSS installation.

    ·         You must add an entry for the document parser in DOCPARSE.XML, the file that contains the list of document parsers and the file types with which each is associated.

    And I’ll give you the specifics of the document parser definition schema in a later entry as well. All in good time.

    Parser Association

    WSS selects the document parser to invoke based on the file type of the document to be parsed. A given document parser can be associated with multiple file types, but you can associate a given file type with only one parser.

    To specify the file type or types that a custom document parser can parse, you add a node to the DOCPARSE.XML file. Each node in this document identifies a document parser assembly, and specifies the file type for which it is to be used. You can specify a file type by either file extension or program ID.

    If you specify multiple document parsers for the same file type, WSS invokes the first document parser in the list associated with the file type.

    WSS includes built-in document parsers for the following file types:

    ·         OLE: includes DOC, XLS, PPT, MSG, and PUB file formats

    ·         Office 2007 XML formats: includes DOCX, DOCM, PPTX, PPTM, XLSX and XLSM file formats

    ·         XML

    ·         HTM: includes HTM, HTML, MHT, MHTM, and ASPX file formats

    You cannot create a custom document parser for these file types. With the XML parser, you can use content types to specify which document properties you want to map to which content type columns, and where the document properties reside in your XML documents.
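    To make the association rules concrete, here is a minimal Python sketch of the selection logic described above. It is a conceptual model only, not WSS code: the registry list, the function names, the case-insensitive matching, and the second parser ID (Contoso.OtherParser) are all assumptions made for illustration.

```python
# Conceptual sketch only; this is not how WSS is implemented.
# File types that the post says are reserved for the built-in parsers.
BUILT_IN_TYPES = {
    "doc", "xls", "ppt", "msg", "pub",
    "docx", "docm", "pptx", "pptm", "xlsx", "xlsm",
    "xml", "htm", "html", "mht", "mhtm", "aspx",
}


def register_parser(registry, file_type, parser_prog_id):
    """Append a (file type, parser) entry unless the type is reserved."""
    key = file_type.lower()  # case-insensitive matching is an assumption
    if key in BUILT_IN_TYPES:
        raise ValueError(f"{file_type} is handled by a built-in parser")
    registry.append((key, parser_prog_id))


def select_parser(registry, file_type):
    """Return the first parser listed for the file type, or None."""
    key = file_type.lower()
    for entry_type, prog_id in registry:
        if entry_type == key:
            return prog_id
    return None


registry = []
register_parser(registry, "abc", "AdventureWorks.AWDocumentParser.ABCParser")
# A second entry for the same file type is never selected (first one wins).
register_parser(registry, "abc", "Contoso.OtherParser")
selected = select_parser(registry, "abc")
```

    In this sketch, selected ends up holding the AdventureWorks parser ID, because the first entry in the list wins, and registering a parser for "docx" raises an error.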

    Parser Deployment

    To guarantee that WSS is able to invoke a given parser whenever necessary, you must install each parser assembly on each front-end Web server in your WSS installation. Because of this, you can specify only one parser for a given file type across a WSS installation.

    The document parser framework does not include the ability to package and deploy a custom document parser as part of a SharePoint feature.

    In my next post, I’ll discuss how the document parser actually parses documents and interacts with WSS.

  • Andrew May's WebLog

    OneNote: An In-Depth Look at the OneNoteImporter Managed Assembly (Part 5 of 5)


    In this series of entries, we're taking an in-depth look at the OneNoteImporter managed class, which provides an object model interface for the programmability functionality added in OneNote 2003 SP 1.

    Read part one here.

    Read part two here.

    Read part three here.

    Read part four here.

    Object Model Maps

    The following figures diagram the OneNoteImporter assembly object model, including abstract classes and inheritance. The diagrams mainly document how the objects in the assembly relate to each other. In most cases, when a member takes an object as a parameter, or returns an object, that object is included on the diagram. Value types, such as strings or integers, are for the most part not displayed.

    For the sake of clarity, the following object information, pertaining to methods that most of the classes have, has been left off the diagrams:

    ·         The Clone method returns the type of object from which you call it.

    ·         The Equals method takes a System.Object as a parameter.

    ·         The GetHashCode method returns a System.Int32 object suitable for use in hashing algorithms and data structures like a hash table.

    ·         Inheritance from System.Object is not shown.

    Figure 2. The Application Object (and Legend)



    Figure 3. The ImportNode Abstract Class, and Page Class

    Figure 4. The PageObject Abstract Class, and Derived Classes

    Figure 5. The OutlineContent Abstract Class, and Derived Classes

    Figure 6. The Data Abstract Class, and Derived Classes


    The OneNoteImporter managed assembly provides a convenient and multi-functional ‘wrapper’ for working with the SimpleImporter and command line functionality in OneNote 2003 SP1. Moreover, using the provided source files for the assembly, a developer can customize and extend the classes as required for his particular application.

  • Andrew May's WebLog

    XML Document Property Parsing in SharePoint (1 of 5): XML Parser Overview


    I've just finished putting together a lot of information around how document parsers work within Windows SharePoint Services V3, including how to use the built-in XML parser, and how to create your own custom parsers for custom file types. This material won't be included in the WSS SDK until the next major update, so I figured I'd give you a preview of it here. For the next five posts, I'm going to be covering how to use the built-in XML parser in WSS V3 to promote and demote document properties in your XML files, including InfoPath 2007 forms.

    So without further ado:

    WSS V3 includes a built-in XML document parser you can use to promote and demote the properties included in your XML documents. Your XML files can adhere to any schema you choose. As long as your XML file meets the requirements listed below, WSS V3 automatically invokes the built-in XML parser whenever document property promotion or demotion is required.

    (Property promotion refers to extracting document properties from a document, and writing those property values to the appropriate columns on the document library where the document is stored. Property demotion refers to taking column values from the document library where a document is stored, and writing those column values into the document itself.)

    Using the built-in XML parser for your custom XML files helps ensure that your document metadata is always up-to-date and synchronized between the document library and the document itself. Users can edit document properties in the document itself, and have the property values on the document library automatically updated to reflect their changes. Likewise, users can update property values at the document library level, and have those changes automatically written back into the document itself.

    For WSS V3 to invoke the built-in XML parser for an XML file, that XML file must meet the following requirements:

    ·         The file must have an extension of .xml.

    ·         The file must not be a WordML file. WSS V3 contains a separate built-in parser for WordML files; WSS V3 automatically invokes this parser for XML files created using WordML.

    Additionally, for the XML parser to actually promote and demote document properties, the XML file should be assigned a content type that specifies where each document property is located in the document, and which content type column that property maps to. (We'll talk about that in a later entry in this series.)

    XML Parser Processing

    The following is a brief overview of how the built-in parser operates:

    When a user uploads an XML document, WSS V3 examines the document to determine if the built-in XML parser should be invoked. If the document meets the requirements, WSS V3 invokes the parser to promote the appropriate document properties to the document library.

    Once invoked, the XML parser examines the document to determine the document content type. The parser then accesses the document's content type definition. The content type definition includes information about each column in that content type; this information can include:

    ·         The document property that maps to a given column, if there is one

    ·         The location where the document property is stored in the document itself

    Using this information, the XML parser can extract each document property from the correct location in the document, and pass these properties to WSS V3. WSS V3 then promotes the appropriate document property to the matching column included in the content type.

    Likewise, WSS V3 can also invoke the built-in XML parser to demote properties from the content type columns, on the document library, into the document itself. When WSS V3 invokes the demotion function of the parser, it passes the parser the document and the column values to be demoted into the document. Once again, the parser accesses the document's content type definition. The parser uses the content type definition to determine:

    ·         Which document properties map to the column values passed to it for demotion

    ·         The location of those document properties in the document

    Using this information, the parser writes the column values into the applicable document property locations in the document.

    Enabling Property Demotion

    For a document property to be demoted, the column to which it is mapped must be defined with its ReadOnly attribute set to "false".
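    The promotion and demotion round trip can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the WSS parser: the CONTENT_TYPE dictionary, the element paths, the sample document, and the function names are all hypothetical, and a real content type definition stores this mapping in its own schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical content type definition:
# column name -> (location of the property in the XML file, read-only flag)
CONTENT_TYPE = {
    "Title": ("metadata/title", False),
    "Author": ("metadata/author", True),
}


def promote(xml_text, content_type):
    """Read each mapped property from the document and return column values."""
    root = ET.fromstring(xml_text)
    columns = {}
    for column, (path, _read_only) in content_type.items():
        node = root.find(path)
        if node is not None:
            columns[column] = node.text
    return columns


def demote(xml_text, content_type, column_values):
    """Write column values back into the document, skipping read-only columns."""
    root = ET.fromstring(xml_text)
    for column, value in column_values.items():
        path, read_only = content_type[column]
        if read_only:
            continue  # only columns with ReadOnly set to false are demoted
        node = root.find(path)
        if node is not None:
            node.text = value
    return ET.tostring(root, encoding="unicode")


doc = "<report><metadata><title>Draft</title><author>Kim</author></metadata></report>"
columns = promote(doc, CONTENT_TYPE)
updated = demote(doc, CONTENT_TYPE, {"Title": "Final", "Author": "Lee"})
```

    Here promote returns both column values, while demote rewrites only the title, because the Author column in this sketch is marked read-only.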

    In my next post, we'll discuss how to use content types to specify XML document properties. Stay tuned.

  • Andrew May's WebLog

    Windows Server 2003: Unable to Connect to the Internet


    A few months back, when I first moved over to working on Windows SharePoint Services, I decided to turn one of my work computers into a WSS box. I later decided to install a Virtual PC profile on the machine instead. But my new manager forwarded me a few tips concerning issues he'd run into when he initially installed Windows Server 2003. These were things that the Help Desk was, well, less than helpful in fixing.

    So I thought I'd put them out here, in case someone with the same problems runs across them.

    Here's the first:

    Once he had Windows Server 2003 installed, he couldn't connect to the Internet at all. What's more, anytime he browsed to an intranet site, he was prompted for his username and password for each site the master page was linking to.

    Turns out he needed to adjust his internet security configuration. Here's how:

    1.      On the Start menu, point to Settings and click Control Panel.

    2.      Double-click Add or Remove Programs.

    3.      Click Add/Remove Windows Components.

    4.      In the Windows Components Wizard, de-select Internet Explorer Enhanced Security Configuration and then click Next.

  • Andrew May's WebLog

    Information Rights Management in Windows SharePoint Services Poster Available for Download


    Now that we’ve managed to get the Windows SharePoint Services and Office SharePoint Server SDKs out the door, I’ve had a chance to catch my breath and do a couple of one-off side projects. One of which is (big surprise) another poster highlighting the developer functionality in WSS 3.0.

    This time around I’ve focused on the Information Rights Management framework in WSS. In WSS 3.0, administrators can now apply IRM to document libraries and item attachments. IRM lets you create a persistent set of access controls that live with the content, helping you control access to files even after a user downloads them. And if you’ve got custom file types, you can create IRM protectors: custom assemblies that plug into the IRM framework and control the conversion of those file types to and from their encrypted, rights-managed formats.

    WSS gives you the choice of creating two types of custom protectors:

    ·         Integrated protectors, which use Windows SharePoint Services to access the Windows Rights Management Services (RMS) platform in order to generate protected versions of files, and to remove protection from rights-managed files.

    ·         Autonomous protectors, on the other hand, configure and execute the entire rights-management process by themselves.

    The poster illustrates how the two types of protectors fit into the IRM framework in WSS 3.0, and how WSS uses each to rights-manage documents during the file upload and download processes.

    To download the poster, just click on the attachment at the end of this post.

    You can read all about the IRM framework in WSS starting here.

  • Andrew May's WebLog

    Publisher 2003: Using the NewDocument and DocumentOpen Events


    I got an email the other day from a user who asked if the NewDocument event in Publisher actually worked. It does, but it is a little more complicated than you'd think. Read on.

    To enable event handling within a typical application, you create a new class module, and declare an Application object with events:

    Public WithEvents App As Publisher.Application

    Then, write your event handlers:

    Private Sub App_NewDocument(ByVal Doc As Document)

      MsgBox "You've created a new publication", , "New Pub Created"

    End Sub

    Finally, initialize your object with the Application object:

    Dim pub As New Class1

    Sub Register_Event_Handler()

      Set pub.App = Publisher.Application

    End Sub

    Which is exactly what the user was doing. But whenever he created a new document, either through the user interface or programmatically, his NewDocument event handler was never invoked. The NewDocument event didn't seem to be firing.

    Actually, it was; he just hadn't hooked up his event handler to the right Publisher.Application object.

    Publisher is a single document interface (SDI) application; each document you have open resides in a separate instance of Publisher. Which means if you have an instance of Publisher that already contains an open document, and you open another document, you're actually launching another instance of Publisher first. That second instance of Publisher then opens the document, even though it looks like you're launching the document directly from the first instance.

    That's why his NewDocument event handler never fired. His code was watching the first instance of Publisher; but with a document already open in the first instance, any commands to create a new document actually launched another instance of Publisher, which is where the NewDocument event was raised.

    The same is true of the DocumentOpen event; if the instance of Publisher already contains an open document, then an additional instance launches, and the DocumentOpen event is raised in that second instance.

    The only time either event would occur in an existing instance of Publisher is if Publisher was open, but didn't have a document open. Which means you can't write VBA code in a document to have it watch its own application instance for those events. By the simple fact that the document containing the VBA code was open, additional instances of Publisher would be launched for any open or new document commands.

    You can, of course, use the DocumentOpen and NewDocument events from:

    ·         Another instance of Publisher

    ·         A Publisher add-in

    ·         Another application

    So let's create an example for the first bullet point, another instance of Publisher. The example below creates a new instance of Publisher, and initializes code to respond to events raised within that instance.

    First, create a new class module, and declare an Application object with events, as you would normally:

    Public WithEvents App As Publisher.Application

    Then, write handler procedures for all the application events to which you want to respond:

    Private Sub App_DocumentBeforeClose(ByVal Doc As Document, Cancel As Boolean)

      MsgBox "You are about to close: " & Doc.Name

    End Sub


    Private Sub App_DocumentOpen(ByVal Doc As Document)

      MsgBox "You have opened: " & Doc.Name, , "Document Open"

    End Sub


    Private Sub App_NewDocument(ByVal Doc As Document)

      MsgBox "You've created a new publication", , "New Pub Created"

    End Sub


    Private Sub App_WindowPageChange(ByVal Vw As View)

      MsgBox "Hey", , "Window Page Change Event"

    End Sub

    However, in the procedure that initializes your application object, set your variable to the application instance launched when Publisher opens a new document, not the application instance that contains the document with the VBA project.

    For example, the following code creates a new document, and initializes the pub.App variable with the resulting new instance of Publisher. The event handlers now fire when events are raised in that application instance. Calling the NewDocument method, as the next line of code does, actually results in two events: the DocumentBeforeClose event for the new document opened in the previous line, and then the NewDocument event for the new document that replaces it.

    Dim pub As New Class1

    Sub InitializeNewPubInstance()

      ' Creating a document launches a new Publisher instance; watch that one
      Set pub.App = NewDocument.Application

      pub.App.NewDocument
    End Sub

    You could also create a new instance of Publisher using the New VB keyword. However, instances of Publisher launched this way are not visible by default; this enables you to launch Publisher and automate it without displaying it to the user. So you have to explicitly specify the application be visible, like so:

    Sub NewDocTest()

      Set pub.App = New Publisher.Application

      pub.App.ActiveWindow.Visible = True


    End Sub

    Now your event handlers can respond to events the user raises in that Publisher instance, including DocumentOpen and NewDocument events. The event handlers I wrote simply pop up message boxes, but you get the idea. Just remember, those message boxes will appear in the Publisher instance running the code, not the Publisher instance actually raising the events.

    One last piece of house-keeping. We want to release the pub.App variable if and when the user closes that instance of Publisher. So let's create a procedure that does that, and place it in the ThisDocument project:

    Public Sub FreeApp()

      Set pub.App = Nothing

    End Sub

    And then call that procedure from the Quit event handler:

    Private Sub App_Quit()

      ThisDocument.FreeApp
    End Sub

    On a related note, did you know it's impossible to programmatically arrive at a Publisher instance that doesn't have a document open? The New keyword launches a new instance of Publisher, with a new blank publication open. The Document.Close method closes the current publication, but then opens a new blank publication in its place. There is no way to launch Publisher without also opening a publication. Likewise, there is no way to programmatically close the current publication without either also closing Publisher, or getting a new publication in its place. To get an instance of Publisher without an open publication, you have to go through the user interface.

  • Andrew May's WebLog

    Document Parsers in SharePoint (4 of 4): Parser Schema and Interface


    For these four entries, I’m going to go over in detail how to construct and register a custom parser that enables you to promote and demote properties between your custom file types and Windows SharePoint Services.

    Read part one here.

    Read part two here.

    Read part three here.

    Today, I’ll round out the document parser information I’m presenting by talking about how to register your custom parser with WSS. I’ll also give you a quick overview of the ISPDocumentParser interface, which your parser needs to implement to communicate with WSS.

    Document Parser Definition Schema Overview

    To register a custom document parser with WSS, you must add a node to the document parser definition file that identifies your parser and the file type or types it can parse.

    You can specify the file type or types a document parser can parse either by file extension, or file type program ID.

    WSS stores the document parser definition file, DOCPARSE.XML, at the following location:

    Web Server Extensions\12\CONFIG\DOCPARSE.XML

    The document parser definition schema is as follows:




    Following is a list of the elements in the document parser definition schema.


    Required. Represents the root element of the document parser definition schema.


    Required. Each docParser element represents a document parser and its associated file type. This element contains the following attributes:

    ·         Name   Required string. The file type associated with the parser. For docParser elements within the ByExtension element, set the Name attribute to the file extension. For docParser elements within the ByProgId element, set the Name attribute to the program ID of the file type. To associate a parser with multiple file types, add a docParser element for each file type.

    ·         ProgId   Required string. The program ID of the parser. This represents the ‘friendly name’ of the parser. This enables you to upgrade a parser without having to edit its document parser definition entry in the DOCPARSE.XML file. However, this prevents you from installing different versions of the same parser side-by-side.

    Document Parser Definition Example

    Below is an example of a document parser definition file.


        <docParser name="abc" ProgId="AdventureWorks.AWDocumentParser.ABCParser"/>

        <docParser name="AWApplication.Document" ProgId="AdventureWorks.AWDocumentParser.ABCParser"/>
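
    As a rough illustration of how entries like these could be read, the Python sketch below loads a definition file into two lookup tables, one keyed by file extension and one by program ID. The wrapper element names come from the element descriptions above, but the root element name (docParsers), the exact casing, and the load_parser_map helper are assumptions for this example rather than the confirmed schema.

```python
import xml.etree.ElementTree as ET

# Assumed layout: the root element name and the casing are guesses for this sketch.
SAMPLE = """
<docParsers>
  <ByExtension>
    <docParser name="abc" ProgId="AdventureWorks.AWDocumentParser.ABCParser"/>
  </ByExtension>
  <ByProgId>
    <docParser name="AWApplication.Document" ProgId="AdventureWorks.AWDocumentParser.ABCParser"/>
  </ByProgId>
</docParsers>
"""


def load_parser_map(xml_text):
    """Build {group: {file type: parser ProgId}} from a definition document."""
    root = ET.fromstring(xml_text)
    result = {"ByExtension": {}, "ByProgId": {}}
    for group in result:
        for entry in root.findall(f"{group}/docParser"):
            # setdefault keeps the first entry listed for a given file type
            result[group].setdefault(entry.get("name"), entry.get("ProgId"))
    return result


parser_map = load_parser_map(SAMPLE)
```

    With the sample above, both parser_map["ByExtension"]["abc"] and parser_map["ByProgId"]["AWApplication.Document"] resolve to the same AdventureWorks parser, which matches the point that one parser can serve several file types.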


    Document Parser Interface Overview

    In order for a custom document parser to perform document property promotion and demotion in WSS, it must implement the following document parser interfaces. These interfaces enable the document parser to be invoked by WSS, and send and receive document properties when so invoked.

    ·         ISPDocumentParser

    Represents a custom document parser. This interface includes the methods WSS uses to invoke the document parser.

    ·         IParserPropertyBag

    Represents a property bag object used to transmit document properties between the document parser and WSS. Includes methods that enable the document parser to access the content type and document library schemas for the specified document.

    ·         IEnumParserProperties

    Represents an enumerable document property collection. Includes methods the document parser can use to enumerate through the document property collection.

    ·         IParserProperty

    Represents a single document property. Includes methods for the document parser to get and set the document property value and data type.

  • Andrew May's WebLog

    OneNote: An In-Depth Look at the OneNoteImporter Managed Assembly (Part 3 of 5)


    In this series of entries, we're taking an in-depth look at the OneNoteImporter managed class, which provides an object model interface for the programmability functionality added in OneNote 2003 SP 1.

    Read part one here.

    Read part two here.

    Importing Objects into OneNote

    The actual creation of the XML import document, and importing the page contents, takes place when you call the Page.Commit method. This method, in turn, invokes a number of methods in other OneNoteImporter objects. Because of this method’s importance and complexity, it’s worth examining how the method functions.

    First, the code checks to see if the page has changed in any way from the last time it was imported. It does this by determining if the Page object’s CommitPending property is set to True. If it is, it calls the SimpleImporter.Import method.

    The code calls the Page.ToString method to generate the XML string it passes to the Import method. The ToString method in turn calls the Page.SerializeToXml method.

    This begins a series of recursive calls to the SerializeToXml methods of the various objects. Each object’s SerializeToXml method includes instructions to call the SerializeToXml method of any child objects, and append the resulting XML to the parent element. This in turn invokes the SerializeToXml method of any child objects the original child object might have, and so on, until the entire page structure has been serialized to XML in a single XML document.

    The Page.SerializeToXml method begins by creating a new XmlDocument object, and generating <Import> and <EnsurePage> elements and adding them to the document. Page object property values are used to set the various attributes of the <EnsurePage> element.

    Note that the Commit method generates import XML with both <EnsurePage> and <PlaceObjects> elements for that page. Specifying an <EnsurePage> element for a page guarantees that the page exists before OneNote attempts to import objects onto it. So if your application includes a scenario where you only want to import objects onto a page if the page already exists, you’ll need to modify this method, or use another means.

    The code then generates a <PlaceObjects> element. For each of the Page object’s children whose CommitPending property is set to True, the code calls the PageObject.SerializeToXml method.

    If the page object’s DeletePending property is set to True, the PageObject.SerializeToXml method generates a <Delete> element. If not, the method does three things:

    ·         Generates a <Position> element, whose attributes are set according to Position object property values.

    ·         Calls the SerializeObjectToXml method for the specific PageObject-derived class involved, i.e., ImageObject, InkObject, or OutlineObject.

    ·         Calls the SerializeToXml method for the specific OutlineContent-derived class involved, i.e., HtmlContent, InkContent, or ImageContent.

    Executing the SerializeToXml method for each of these content types includes a call to the SerializeToXml method for the Data-derived object they contain: BinaryData, FileData, or StringData. In this way, the entire page structure is serialized to XML in a single XML document.

    Note that the HtmlContent.SerializeToXml method includes a call to another internal method of that same object, called CleanHtml. The CleanHtml method reads through the HTML string or file data and makes sure the HTML is formatted in a way that OneNote accepts. It identifies and replaces problematic formatting with characters which OneNote can process. For example, the CleanHtml method wraps the HTML string with the appropriate <html> and <body> tags if the HTML lacks them.
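
    The wrapping step mentioned in that example can be approximated as follows. This Python sketch is not the CleanHtml source; the function name and the simple substring checks are assumptions, and it covers only the tag-wrapping behavior, not the other formatting fixes.

```python
def ensure_html_wrapper(fragment):
    """Wrap an HTML fragment in <html> and <body> tags when they are missing."""
    lowered = fragment.lower()
    if "<html" in lowered:
        # Already a full document; this sketch leaves it untouched.
        return fragment
    if "<body" not in lowered:
        fragment = f"<body>{fragment}</body>"
    return f"<html>{fragment}</html>"


wrapped = ensure_html_wrapper("<p>Meeting notes</p>")
```

    A bare fragment gets both wrappers, a fragment that already has a <body> element gets only the outer <html> element, and a complete document passes through unchanged.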

    The serialization of the page nodes is now complete. If the Page object had no children, the <PlaceObjects> element remains empty. In such a case, the Page.SerializeToXml method does not append it to the <Import> element.

    Finally, the Page.SerializeToXml method determines the appropriate namespace designation and adds it to the <Import> element.

    The ToString method then takes the XmlDocument object, saves it as a text stream, converts it to a string, and passes it back to the SimpleImporter.Import method. This Import method uses the XML string to import the specified content into OneNote.

    Now that the content has been imported into OneNote, the Commit method performs some vital housekeeping. Using the RemoveChild method, it removes any of the Page object’s children that have their DeletePending property set to True. It then sets the private committed field to True, thereby making the Date, PreviousPage, RTL, and Title properties read-only. You cannot change these attributes once you import a page into OneNote.

    Lastly, it sets the CommitPending property of the Page to False. This in turn sets the CommitPending properties of all the Page object’s remaining children to False as well.
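
    The overall flow of Commit can be modeled in a short Python sketch. It is a simplified stand-in, not the OneNoteImporter code (which is a managed assembly): the Node and Page classes, the snake_case member names, the element contents, and the importer callback are assumptions, and details such as <Position> elements and namespaces are left out.

```python
import xml.etree.ElementTree as ET


class Node:
    """Simplified stand-in for a page object and its nested content."""

    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = children or []
        self.commit_pending = True
        self.delete_pending = False

    def serialize_to_xml(self, parent):
        if self.delete_pending:
            ET.SubElement(parent, "Delete")
            return
        element = ET.SubElement(parent, self.tag)
        for child in self.children:  # recursive descent through the children
            child.serialize_to_xml(element)


class Page:
    def __init__(self, title, children):
        self.title = title
        self.children = children
        self.commit_pending = True

    def to_string(self):
        root = ET.Element("Import")
        ET.SubElement(root, "EnsurePage", {"title": self.title})
        place = ET.Element("PlaceObjects")
        for child in self.children:
            if child.commit_pending:
                child.serialize_to_xml(place)
        if len(place):  # an empty PlaceObjects element is not appended
            root.append(place)
        return ET.tostring(root, encoding="unicode")

    def commit(self, importer):
        if not self.commit_pending:
            return  # nothing has changed since the last import
        importer(self.to_string())
        # Housekeeping: drop deleted children, then clear the pending flags.
        self.children = [c for c in self.children if not c.delete_pending]
        self.commit_pending = False
        for child in self.children:
            child.commit_pending = False


sent = []  # stands in for SimpleImporter.Import; collects the XML strings
page = Page("Notes", [Node("Outline", [Node("Html")]), Node("Image")])
page.children[1].delete_pending = True
page.commit(sent.append)
```

    After the call, sent holds one XML string, the child marked for deletion is gone from page.children, and a second call to commit sends nothing because no changes are pending.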

    In part four, we'll examine the internal method calls of the Commit method.

    Read part four here.

  • Andrew May's WebLog

    OneNote: An In-Depth Look at the OneNoteImporter Managed Assembly (Part 4 of 5)


    In this series of entries, we're taking an in-depth look at the OneNoteImporter managed class, which provides an object model interface for the programmability functionality added in OneNote 2003 SP 1.

    Read part one here.

    Read part two here.

    Read part three here.

    Figure 1 diagrams the internal method calls of the Commit method. It shows the OneNoteImporter classes and methods called, and the Import XML elements generated at each step. The various calls to System.Xml methods are not diagrammed.


    Figure 1. The Page.Commit Method, and XML Elements Generated

    In our final entry, we'll look at some object model maps that detail how the OneNoteImporter class is structured.

    Read part five here.

  • Andrew May's WebLog

    OneNote XML Schema Map Notation, Take Two


    So, I was so annoyed when I realized I had broken one of the basic rules of information design with my diagram of the OneNote 2003 SP1 SimpleImport schema, I had to take a few minutes and see if I could fix it. As I mentioned in my last entry, my diagram uses indenting and boxes to show the element hierarchy in the OneNote schema. But, because I was illustrating the same information (the element hierarchy) two different ways (indenting and boxing), the diagram contained redundant information, and was more complicated than it needed to be.

    And I hated those damn boxes anyway. Chopping a diagram up into a grid like that only ends up distracting from the information. I knew it at the time, but I was on a deadline and couldn't come up with a better solution, so…

    Below is my latest attempt. The grid lines are gone; element hierarchy is denoted now solely by the indenting of the element names. By formatting the information inside each element (such as data type and attributes) gray, I think I've been able to keep the element names prominent enough I don’t need boxes to denote where one element ends and another starts.

    One thing that I still see as problematic is the indenting of the element information, like the data type and attributes. In this small example, I was able to keep all that information at the same left-alignment for all the elements, which again keeps that information from distracting from the element names. But, had the schema hierarchy included a few more levels, I would've had to move the element information even further over, perhaps to the point where it was so far removed from the element names in the top-level elements that it would seem disconnected.

    All in all, I think this diagram is quite a bit more successful than the one that currently appears in the article. Looks like it's time to file a bug and get that image swapped out…

  • Andrew May's WebLog

    Document Parsers in SharePoint (2 of 4): How Parsers Process Documents


    Read part one here.

    In my last entry, I gave you a brief overview of what document parsers are in Windows SharePoint Services V3, and a high-level look at what you need to do to build a custom document parser for your own custom file types. Today we’re going to start digging a little deeper, and examine how a parser interacts with WSS in detail.

    Document Parser Processing

    When a file is uploaded, or moved or copied to a document library, WSS determines if a parser is associated with the document's file type. If one is, WSS invokes the parser, passing it the document to be parsed, and a property bag object. The parser extracts the properties and matching property values from the document, and adds them to the property bag object. The parser extracts all the properties from the document.

    WSS accesses the document property bag and determines which properties match the columns for the document. It then promotes those properties, or writes the document property value to the matching document library column. WSS only promotes the properties that match columns applicable to the document. The columns applicable to a document are specified by:

    ·         The document's content type, if it has been assigned a content type.

    ·         The columns in the document library, if the document does not have a content type.

    WSS also stores the entire document property collection in a hash table; you can access this hash table programmatically by using the SPFile.Properties property. There is no user interface to access the document properties hash table.

    The following figure illustrates the document parsing process. In it, the parser extracts document properties from the document and writes them to the property bag. Of the four document properties, three are included in the document's content type. WSS promotes these properties to the document library; that is, it writes their property values to the appropriate columns. WSS does not promote the fourth document property, Status, even though the document library includes a matching column. This is because the document's content type does not include the Status column. WSS also writes all four document properties to a hash table that is stored with the document in the document library.
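    The promotion rules described above can be modeled in a few lines. This is only an illustrative Python sketch, not the WSS API: the function name `promote` and the property names other than Status are hypothetical, and plain dictionaries and sets stand in for the property bag and the column lists.

```python
def promote(property_bag, library_columns, content_type_columns=None):
    """Model of promotion: return (promoted column values, stored hash table)."""
    # Applicable columns come from the content type if the document has one;
    # otherwise they are the columns of the document library itself.
    if content_type_columns is not None:
        applicable = content_type_columns
    else:
        applicable = library_columns

    # Only properties whose names match an applicable column are promoted.
    promoted = {name: value for name, value in property_bag.items()
                if name in applicable}

    # Every extracted property is kept in the hash table, promoted or not.
    return promoted, dict(property_bag)


# Four extracted properties; the content type covers three of them.
bag = {"Title": "Q3 Report", "Author": "A. May",
       "Subject": "Sales", "Status": "Draft"}
library_columns = {"Title", "Author", "Subject", "Status"}
content_type_columns = {"Title", "Author", "Subject"}

promoted, stored = promote(bag, library_columns, content_type_columns)
print(sorted(promoted))
print(sorted(stored))
```

    In this model, Status stays out of the promoted values because the content type does not list it, even though the library has a matching column, but it still lands in the stored hash table.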

    WSS can also invoke the parser to demote properties, or write a column value into the matching property in the document itself. When WSS invokes the demotion function of the parser, it again passes the parser the document, and a property bag object. In this case, the property bag object contains the properties that WSS expects the parser to demote into the document. The parser demotes the specified properties, and WSS saves the updated document back to the document library.

    The figure below illustrates the document property demotion process. To update two document properties, WSS invokes the parser, passing it the document to be updated, and a property bag object containing the two document properties. The parser reads the property values from the property bag, and updates the properties in the document. When the parser finishes updating the document, it passes a parameter back to WSS that indicates that it has changed the document. WSS then saves the updated document to the document library.
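    The demotion handshake can be sketched the same way. Again, this is a hypothetical Python model rather than the real parser interface: `demote` and `save_if_changed` are invented names, the "changed" indicator is modeled as a boolean return value, and the sketch assumes WSS saves the document only when that indicator is set.

```python
def demote(document_properties, property_bag):
    """Model parser: write bag values into the document, report if it changed."""
    changed = False
    for name, value in property_bag.items():
        if document_properties.get(name) != value:
            document_properties[name] = value
            changed = True
    return changed


def save_if_changed(library, doc_name, document_properties, property_bag):
    """Model WSS: invoke the parser, then save the document if it was updated."""
    if demote(document_properties, property_bag):
        library[doc_name] = dict(document_properties)
        return True
    return False


# Hypothetical document with three properties; WSS asks for two to be updated.
doc = {"Title": "Draft", "Author": "A. May", "Status": "Draft"}
library = {}
saved = save_if_changed(library, "report.xyz", doc,
                        {"Title": "Q3 Report", "Status": "Final"})
print(saved, library)
```

    Calling `save_if_changed` a second time with the same property bag returns False in this model, because the parser reports that nothing in the document changed.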

    Mapping Document Properties to Columns

    Once the document parser writes the document properties to the property bag object, WSS promotes the document properties that match columns on the document library. To do this, WSS compares the document property name with the internal names of the columns in the document’s content type, or on the document library itself if the document doesn’t have a content type. When WSS finds a column whose internal name matches the document property, it promotes the document property value into that column for the document.

    However, WSS also enables you to explicitly map a document property to a specific column. You specify this mapping information in the column’s field definition schema.

    Specifying the mapping in the column’s field definition lets you map a document property to a column that has a different name. For example, you can map the document property ‘Author’ to the ‘Creator’ column of a content type or document library.

    To specify this mapping, add a ParserRef element to the field definition schema, as shown in the example below:

    <Field Type="Text" Name="Creator" … >
      <ParserRefs>
        <ParserRef Name="Author" Assembly="myParser.Assembly" />
      </ParserRefs>
    </Field>

    The following elements are used to define a document property mapping:

    ParserRefs

    Optional. Represents a list of document parser references for this column.

    ParserRef

    Optional. Represents a single document parser reference. This element contains the following attributes:

    ·         Name   Required String. The name of the document property to be mapped to this column.

    ·         Assembly   Required String. The name of the document parser used.

    A column’s field definition might contain multiple parser references, each specifying a different document parser.
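    To make the lookup concrete, here is a small Python sketch of how a property name could be resolved to a column from field definitions like the one above. It is a model, not WSS code: the `column_for` helper is hypothetical, the `ParserRefs` wrapper element and the second `Title` field are assumptions added for illustration, and so is the choice to check explicit parser references before falling back to an internal-name match.

```python
import xml.etree.ElementTree as ET

# Field definition with an explicit parser reference (the elided attributes
# from the example above are simply left out here).
CREATOR_FIELD = """
<Field Type="Text" Name="Creator">
  <ParserRefs>
    <ParserRef Name="Author" Assembly="myParser.Assembly" />
  </ParserRefs>
</Field>
"""

fields = [
    ET.fromstring(CREATOR_FIELD),
    ET.fromstring('<Field Type="Text" Name="Title" />'),
]


def column_for(property_name, parser_assembly, field_elements):
    """Return the internal name of the column a property maps to, or None."""
    # 1. Explicit mapping: a ParserRef for this parser names the property.
    for field in field_elements:
        for ref in field.iter("ParserRef"):
            if (ref.get("Name") == property_name
                    and ref.get("Assembly") == parser_assembly):
                return field.get("Name")
    # 2. Fallback: the property name matches a column's internal name.
    for field in field_elements:
        if field.get("Name") == property_name:
            return field.get("Name")
    return None


print(column_for("Author", "myParser.Assembly", fields))
print(column_for("Title", "myParser.Assembly", fields))
```

    Because each reference carries its own Assembly value, the same column can list one ParserRef per parser, and a property produced by a different parser is not matched by another parser's reference.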

    In addition, if you are working with files in multiple languages, you can use parser references to manage the mapping of document properties to the appropriate columns in each language, rather than having to build that functionality into the parser itself. The parser can generate a single document property, while you use multiple parser references to make sure the property is mapped to the correct column for that language. For example, suppose a parser extracts a document property named ‘Singer’. You could then map that property to a column named ‘Cantador’, as in the example below:

    <Field Type="Text" Name="Cantador" … >
      <ParserRefs>
        <ParserRef Name="Singer" CLSID="MyParser" />
        <ParserRef Name="Artist" Assembly="MP3Parser.Assembly" />
      </ParserRefs>
    </Field>

    To edit a column’s field definition schema programmatically, use the SPField.SchemaXml property. There is no equivalent user interface for specifying parsers for a column.

    In the next entry, we'll discuss how WSS processes documents that contain their content type definition.
