Today we've published another set of document-format implementation notes, this time for the ECMA-376 1st Edition implementation in Office 2007 SP2. As with the ODF 1.1 implementation notes we published in December, the goal of publishing these notes is to help other implementers improve interoperability with Office, by transparently documenting the details of our implementation.
To get to the ECMA-376 implementer notes, go to the DII home page and click on Reference and then select ECMA-376 1st Edition from the dropdown list. You'll then see a treeview control in the panel on the left, which contains the entire structure of the ECMA-376 spec.
You can drill down into any node to see that part of the spec. For example, here's a screenshot of the treeview expanded to see the nodes under Part 4, section 2.4, Tables:
Note the small red "N" next to most of the sub-sections under 2.4. Those markers indicate which sections have implementer notes, and in this example all of the sections have notes except for 2.4.12.
After you navigate down to a specific section, you'll see the full text of that part of the spec. This is handy for browsing the spec itself, and in the top right corner you'll also see these three buttons:
Let's look at an example of the types of information you'll find in the implementer notes. As readers of this blog know, I'm a big fan of Open XML's support for custom XML markup, so I've chosen Part 4, section 184.108.40.206, which covers the customXmlMoveToRangeStart element in WordprocessingML. This element is at the heart of some fairly complicated interaction between two different concepts: custom XML markup and tracked changes. As it says in the first two paragraphs of section 220.127.116.11:
This element specifies the start of a region within which all custom XML markup was moved to this location in the document and this move was tracked as a revision. The id attribute on this element shall be used to link this element with the corresponding custom XML move destination end marker in the document. Providing a physical representation of the start and end tags of custom XML markup results in regions which can be inserted and deleted independently, but cannot be encapsulated by a single revision element, since their representation in WordprocessingML is the start or end XML tag for the custom XML markup which it represents. Therefore, the start/end "cross structure" annotation format surrounds the WordprocessingML region to which this move destination applies.
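To make the id-linking described in that passage concrete, here's a minimal Python sketch that pairs each customXmlMoveToRangeStart marker with the customXmlMoveToRangeEnd marker that carries the same id. The sample fragment is hand-written and simplified for illustration (the author, date, and element name values are made up, and it isn't meant to be a complete or schema-valid document), and the function name pair_move_to_markers is my own invention, not part of any Office or Open XML API.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, used for both elements and attributes.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
START = f"{{{W_NS}}}customXmlMoveToRangeStart"
END = f"{{{W_NS}}}customXmlMoveToRangeEnd"
ID = f"{{{W_NS}}}id"

# Hypothetical, simplified fragment: range 0 has both markers,
# range 1 has a start marker but no end marker.
SAMPLE = f"""<w:body xmlns:w="{W_NS}">
  <w:customXmlMoveToRangeStart w:id="0" w:author="Example Author"
                               w:date="2009-01-01T00:00:00Z"/>
  <w:customXml w:element="sample">
    <w:p><w:r><w:t>Moved text</w:t></w:r></w:p>
  </w:customXml>
  <w:customXmlMoveToRangeEnd w:id="0"/>
  <w:customXmlMoveToRangeStart w:id="1" w:author="Example Author"
                               w:date="2009-01-01T00:00:00Z"/>
</w:body>"""


def pair_move_to_markers(root):
    """Return (paired ids, start ids with no end, end ids with no start)."""
    start_ids = set()
    end_ids = set()
    for el in root.iter():
        if el.tag == START:
            start_ids.add(el.get(ID))
        elif el.tag == END:
            end_ids.add(el.get(ID))
    paired = sorted(start_ids & end_ids)
    unmatched_starts = sorted(start_ids - end_ids)
    unmatched_ends = sorted(end_ids - start_ids)
    return paired, unmatched_starts, unmatched_ends


if __name__ == "__main__":
    print(pair_move_to_markers(ET.fromstring(SAMPLE)))
```

For the sample fragment, id 0 is reported as paired and id 1 as a start marker with no matching end. A real consumer would also need to handle the markers for the other revision types, which this sketch ignores.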
Under the implementer notes for this section, you'll first find some references to other sections of the spec. For example, the first implementer note says "Click here to view additional notes in 18.104.22.168 ins (Inserted Run Content)" and if you follow the link you'll see this note:
So we're saying that the spec is slightly ambiguous in this area (sort of like the chart-series issue I talked about in my last post), and we're clarifying exactly what we've done in Word 2007's implementation. This helps other implementers understand Word's behavior, and they can use that information to improve interoperability.
There are many cross-links like this between various sections of the spec, because of the many relationships between different elements and attributes in the spec. There are also some full-text notes under the customXmlMoveToRangeStart element, such as these:
I picked these particular notes because they're a good example of the variety found in the implementer notes. The first note above tells you that Word also applies customXmlMoveToRangeStart to structured document tags (or "content controls"); the second tells you that although the spec allows for overlapping ranges, Word doesn't support that; and the third acknowledges that Word doesn't predictably handle custom XML move tracking in some situations involving equations in oMathPara elements.
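If you're generating documents that Word needs to open, the second of those notes suggests a simple pre-flight check. Here's a rough Python sketch, under the assumption that "overlapping" means a second customXmlMoveToRangeStart appears before an earlier range has been closed by its customXmlMoveToRangeEnd. That reading is mine, so check the note itself for the precise condition. The function name and both sample fragments are hypothetical.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
START = f"{{{W_NS}}}customXmlMoveToRangeStart"
END = f"{{{W_NS}}}customXmlMoveToRangeEnd"
ID = f"{{{W_NS}}}id"

# Hypothetical fragment: range 1 opens while range 0 is still open.
OVERLAPPING = f"""<w:body xmlns:w="{W_NS}">
  <w:customXmlMoveToRangeStart w:id="0"/>
  <w:customXmlMoveToRangeStart w:id="1"/>
  <w:customXmlMoveToRangeEnd w:id="0"/>
  <w:customXmlMoveToRangeEnd w:id="1"/>
</w:body>"""

# Hypothetical fragment: range 0 closes before range 1 opens.
SEQUENTIAL = f"""<w:body xmlns:w="{W_NS}">
  <w:customXmlMoveToRangeStart w:id="0"/>
  <w:customXmlMoveToRangeEnd w:id="0"/>
  <w:customXmlMoveToRangeStart w:id="1"/>
  <w:customXmlMoveToRangeEnd w:id="1"/>
</w:body>"""


def find_overlapping_ranges(root):
    """Return (open_id, new_id) pairs where new_id starts while open_id is open."""
    open_ids = []   # ids of ranges started but not yet ended, in document order
    overlaps = []
    for el in root.iter():  # iter() visits elements in document order
        if el.tag == START:
            new_id = el.get(ID)
            overlaps.extend((open_id, new_id) for open_id in open_ids)
            open_ids.append(new_id)
        elif el.tag == END:
            end_id = el.get(ID)
            if end_id in open_ids:
                open_ids.remove(end_id)
    return overlaps


if __name__ == "__main__":
    print(find_overlapping_ranges(ET.fromstring(OVERLAPPING)))
    print(find_overlapping_ranges(ET.fromstring(SEQUENTIAL)))
```

The first fragment yields one overlapping pair (ids 0 and 1) and the second yields none. The check only looks at marker order, so it says nothing about how Word will actually render or repair such a document.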
To document our implementation to this level of detail, we had to carefully read every section of the ECMA-376 specification. In doing so, we found some errors. For example, in Part 4, section 12.3.20, we found that the root namespace for the styles part includes a typo: an extra "s" at the end. In that case, we searched the published ISO/IEC IS29500 spec for the same error and found that it's still there, so we've submitted that to Ecma TC45 as a defect report, which TC45 will submit to SC 34 WG4 to be corrected in the maintenance of IS29500.
Any complex technical specification, Open XML included, is going to have some typographical errors like that one, as well as substantive errors; this is true of every document format specification. Similarly, any application implementing these specifications will have implementation-related issues that it needs to identify and work through over time.
We understand that users want to see interoperability between document format implementations in the marketplace, and we are taking the steps we believe any responsible vendor should take. This includes actively participating in the maintenance of the standards we support, identifying and addressing implementation issues going forward, and working collaboratively with other vendors to improve interop between products over time. People are welcome to point out where we may have issues as part of this overall effort between vendors (and customers). I think in the end our customers, along with the broader interoperability community of implementers, users and standards participants, will be better off as a result of the things we're doing.
For a closer look at a specific implementer note that is useful to developers, check out the first post on Stephen Peront's blog: Implementer Notes Just Make Good Sense. Stephen's name may be familiar to those who followed the DIS29500 process closely, because he was a member of INCITS V1, the US technical committee that reviewed the specification. Stephen joined Microsoft just last week, and he's working with me on the Office Interoperability team. He's a great asset to our team, and you're going to see a lot more developer-oriented interoperability content on his blog going forward.
I'd like to thank all of the people here at Microsoft who have worked so hard to roll out these implementer notes. The coolest part of my job is working with so many talented and energetic people, and there are too many who played key roles in this project for me to dare to try to name them all. I'm looking forward to seeing the creative things developers will do with this information.
There has been quite a bit of discussion lately in the blogosphere about various approaches to document