As promised last month, the documentation for the Office binary formats (.doc, .xls, .ppt) is now live. In addition, the project to create an open source translator (binary -> Open XML) has now been set up on SourceForge, and the development roadmap has been published. For more background, see my earlier post: http://blogs.msdn.com/brian_jones/archive/2008/01/16/mapping-documents-in-the-binary-format-doc-xls-ppt-to-the-open-xml-format.aspx

Here's an overview of what's now available:

Office Binary (doc, xls, ppt) Translator to Open XML

The "Office Binary (doc, xls, ppt) Translator to Open XML" project is now live on SourceForge: http://b2xtranslator.sourceforge.net/

As you may remember, this was a request from a number of national bodies. While Ecma TC45 believed it was outside the scope of DIS 29500, they did talk with Microsoft and reached this agreement:

Nonetheless, Ecma International discussed this subject with Microsoft Corporation, the author of the Binary Formats.  To make it even easier for third party conversion of Binary Format-to-DIS 29500, Microsoft agreed to:

  • Initiate a Binary Format-to-ISO/IEC JTC 1 DIS 29500 Translator Project on the open source software development web site SourceForge (http://sourceforge.net/ ) in collaboration with independent software vendors.  The Translator Project will create software tools, plus guidance, showing how a document written using the Binary Formats can be translated to DIS 29500.  The Translator will be available under the open source Berkeley Software Distribution (BSD) license, and anyone can use the mapping, submit bugs and feedback, or contribute to the Project.  The Translator Project will start on February 15, 2008. 
  • Make it even easier to get access to the Binary Formats documentation by posting it and making it available for a direct download on the Microsoft web site no later than February 15, 2008.  The Binary Formats have been under a covenant not to sue and Microsoft will also make them available under its Open Specification Promise (see www.microsoft.com/interop/osp) by the time they are posted.

We will modify DIS 29500 to include an informative reference to the SourceForge project.

While the project is still in its infancy, you can already see the planned project roadmap, as well as an early draft of a mapping table between the Word binary format (.doc) and the Open XML format (.docx).

Microsoft Office Binary (doc, xls, ppt) File Formats

The binary documentation itself is available here: http://www.microsoft.com/interop/docs/OfficeBinaryFormats.mspx

  • Word 97-2007 Binary File Format (.doc) Specification PDF | XPS
  • PowerPoint 97-2007 Binary File Format (.ppt) Specification PDF | XPS
  • Excel 97-2007 Binary File Format (.xls) Specification PDF | XPS
  • Office Drawing 97-2007 Binary Format Specification PDF | XPS

It's all covered under the Open Specification Promise.

Another Surprise

Another great surprise in all of this is that we've also made the documentation for a few supporting technologies available, since it may be useful to folks implementing the binary formats. You can find it here: http://www.microsoft.com/interop/docs/supportingtechnologies.mspx

The technologies included are:

  • Windows Compound Binary File Format Specification PDF | XPS
  • Windows Metafile Format (.wmf) Specification PDF | XPS
  • Ink Serialized Format (ISF) Specification PDF | XPS

These technologies are also all available under the Open Specification Promise.

Have a great weekend, everyone!

-Brian