I'm really happy to announce the release of the third CTP of the Open XML SDK 2.0 for Microsoft Office! So what did we do in this CTP? We made three main improvements to the SDK:
Let's go back to the Open XML SDK architecture diagram I showed you when we first announced the Open XML SDK:
As mentioned in a previous post, the April 2009 CTP of the Open XML SDK added schema-level validation support for Office 2007 Open XML files. In the August 2009 CTP, one of the big things we added is semantic-level validation support for Office 2007 Open XML files:
Semantic-level validation goes beyond the restrictions and rules defined by schemas. It allows developers to validate files against restrictions defined in the prose of the Open XML documentation. These restrictions cannot be expressed in XSD.
Let's look at an example of a semantic-level restriction. Specifically, let's look at the endnote element (Section 17.11.2 of Part 1 of the ISO/IEC 29500 specification). The standard states that the id attribute of endnote "specifies a unique ID which shall be used to match the contents of a footnote or endnote to the associated footnote/endnote reference mark … If more than one footnote shares the same ID, then this document shall be considered non-conformant. If more than one endnote shares the same ID, then this document shall be considered non-conformant." As you can see, having more than one endnote with the same id value results in a non-conformant document. A consuming application, like Word, may not interpret such a document properly.
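To make the rule concrete, here is a minimal Python sketch of the kind of check a schema cannot express: it scans an endnotes part for id values that appear more than once. This is only an illustration using Python's standard library, not the SDK's validation API. The sample markup and the function name find_duplicate_endnote_ids are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import List

# WordprocessingML main namespace used by Office 2007 documents.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical endnotes part: each endnote is schema-valid on its own,
# but two of them share the same w:id value.
SAMPLE_ENDNOTES = f"""<w:endnotes xmlns:w="{W_NS}">
  <w:endnote w:id="2"><w:p/></w:endnote>
  <w:endnote w:id="3"><w:p/></w:endnote>
  <w:endnote w:id="2"><w:p/></w:endnote>
</w:endnotes>"""


def find_duplicate_endnote_ids(endnotes_xml: str) -> List[str]:
    """Return the w:id values used by more than one w:endnote element."""
    root = ET.fromstring(endnotes_xml)
    ids = [e.get(f"{{{W_NS}}}id") for e in root.iter(f"{{{W_NS}}}endnote")]
    counts = Counter(i for i in ids if i is not None)
    return sorted(i for i, n in counts.items() if n > 1)


print(find_duplicate_endnote_ids(SAMPLE_ENDNOTES))  # ['2']
```

An XSD can require that every endnote has an id, but it cannot describe the uniqueness rule as the standard words it, so a check like this has to run as a separate pass over the whole part.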
The Open XML SDK can now help you find these types of problems and will report the error to you by giving you the following information:
We hope this information helps you find and fix problems more easily. I will devote at least one future blog post to the details of the validation functionality.
As defined by the ISO/IEC 29500 specification, there are several ways to extend markup within the Open XML formats. Some of the extension mechanisms, like ignorable content and alternate content blocks, may result in differences within the XML tree structure of a document. Here is an example of markup that contains an alternate content block:
In the example above, the expected child of the run element differs depending on the chosen alternate content choice. The fallback choice is what one would expect from a document created in Office 2007, while the choice requiring the wps namespace is from a document created in Office 2010. Imagine you are a solution developer working with Open XML who has deployed a solution that works perfectly on top of Office 2007 Open XML files. How would your solution work with files coming in from Office 2010? Specifically, would your solution work with documents that contain these types of extension mechanisms?
As part of the August 2009 CTP, we have added functionality that lets developers abstract away some of the difficulty intrinsic to markup compatibility and extensibility. This feature allows you to preprocess the content of Open XML files based on a specific Office version. Using the example above, if we use the August CTP to open the document targeting Office 2007, we will only see the following XML markup:
If your solution expected a pict element as a child of a run element, then your solution would work perfectly with this file. In other words, using this feature, solutions won't break when future versions of Office introduce new markup into the format.
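The idea behind this preprocessing can be sketched in a few lines of Python. The sketch below is not how the SDK is implemented and does not use its API. It only shows the selection rule: for each alternate content block, keep the first choice whose required namespaces the consumer understands, otherwise keep the fallback. The sample markup, the function names, and the shortcut of matching Requires by prefix (a real processor resolves each prefix to its namespace URI) are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

MC = "http://schemas.openxmlformats.org/markup-compatibility/2006"
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
WPS = "http://schemas.microsoft.com/office/word/2010/wordprocessingShape"

# Hypothetical run containing an alternate content block.
SAMPLE_RUN = f"""<w:r xmlns:w="{W}" xmlns:mc="{MC}" xmlns:wps="{WPS}">
  <mc:AlternateContent>
    <mc:Choice Requires="wps"><w:drawing/></mc:Choice>
    <mc:Fallback><w:pict/></mc:Fallback>
  </mc:AlternateContent>
</w:r>"""


def resolve_alternate_content(parent, understood_prefixes):
    """Replace each mc:AlternateContent under parent with one branch's children."""
    index = 0
    while index < len(parent):
        child = parent[index]
        if child.tag != f"{{{MC}}}AlternateContent":
            resolve_alternate_content(child, understood_prefixes)
            index += 1
            continue
        # Pick the first Choice whose required prefixes are all understood.
        chosen = None
        for choice in child.findall(f"{{{MC}}}Choice"):
            required = choice.get("Requires", "").split()
            if required and all(p in understood_prefixes for p in required):
                chosen = choice
                break
        if chosen is None:
            chosen = child.find(f"{{{MC}}}Fallback")
        replacement = list(chosen) if chosen is not None else []
        del parent[index]
        for offset, node in enumerate(replacement):
            parent.insert(index + offset, node)
        # Do not advance: the spliced-in nodes are examined on the next pass.


def child_tags_for(understood_prefixes):
    """Resolve the sample run and return the local names of its children."""
    run = ET.fromstring(SAMPLE_RUN)
    resolve_alternate_content(run, understood_prefixes)
    return [child.tag.split("}")[1] for child in run]


print(child_tags_for(set()))    # Office 2007-style consumer: ['pict']
print(child_tags_for({"wps"}))  # consumer that understands wps: ['drawing']
```

A solution written against the 2007 markup only ever sees the pict branch, which is the effect described above.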
First off, we want to thank everyone for their feedback and suggestions! Based on your feedback we made the following big changes to the SDK:
Our next task for the SDK is to add support for the Office 2010 Open XML formats. Expect to see another CTP with this functionality in the next several months. Our goal is to finish the Open XML SDK 2.0 around the same time as Office 2010 ships (date not public yet).
Please continue to send us your feedback, either on this blog or at our Microsoft Connect site for the Open XML SDK: https://connect.microsoft.com/site/sitehome.aspx?SiteID=589. We look forward to hearing from you.
Zeyad Rajabi