Anyone else been following the latest blog posts from IBM and Sun discussing the Office Open XML formats? It looks like they're stepping up their push to try to make ODF the only choice in file formats. I read Tim Bray's post yesterday, but there have actually been a number of other posts folks have pointed out to me as well. Everyone knows that Sun and IBM have a lot riding on ODF financially (they're large corporations, not philanthropies <g/>). It's clear that their plan is to somehow persuade governments to mandate just ODF, removing any choice between the two formats.

Thankfully, what you're actually seeing in most places is that governments are asking for 'open formats' in general, not just ODF (contrary to what is usually written in the headlines). Most of those governments understand that Office Open XML is on the verge of becoming an international standard as well, and that it serves a very important purpose that ODF doesn't. This has set off alarm bells for IBM and Sun though, and that's why we see the latest smear campaign kicking into gear. It could be that this is more innocent and that there is just a lack of technical knowledge. Based on the strong reputations of the folks involved in this campaign, though, it seems more malicious. I'm saying this after reading their claims that the spec is too complex and therefore not interoperable, which is just ridiculous. Too much information? Every developer I've talked to (even those working for companies that compete directly with Microsoft) is extremely grateful for the amount of information the spec has provided. Look at the 600 developers up on the site building all kinds of powerful solutions across a number of different platforms (Linux, Mac, Windows).

I think it's pretty ignorant for folks to call this effort a rubber stamp. Talk to the people from Apple, Novell, the British Library, the Library of Congress, Intel, BP, StatOil, Toshiba, Essilor, NextPage, and Microsoft who spent over 200 hours in group discussions around the formats. Look at the results of all the hours put in between the weekly group meetings by the smaller groups tasked with solving particular problems and by those working on the actual documentation. The schemas themselves changed significantly and the spec went from 2000 to 6000 pages. Rubber stamp? You must be joking. <g/>

Another thing I've seen is an IBM employee trying to get more technical by combing through the Office Open XML standard for minor nits and then attempting to turn them into big issues. That's fine, and everyone is entitled to their own opinion. It's kind of funny though that many of the issues he raises are even worse in the ODF spec.

Why would IBM and Sun push for a more limited format?

There is this false claim from some high profile IBM and Sun employees that the Office Open XML spec is not interoperable because it's too big. These statements really help to paint a picture of their strategic interest in ODF. What's the easiest way to compete with another product that has a richer set of features? Get governments to mandate a file format that doesn't support that richer set of features. This way, if the other product (Microsoft Office in this case) has to use the format that was designed for your product, you've just brought them down to your level. It's a brilliant approach, and that shows why there are IBM vice-presidents flying around talking to governments about the need to mandate ODF. It also shows why they want to discredit the Office Open XML format… IBM and Sun feel they have a lot to lose if Office Open XML is standardized, and that's why they've been fighting so strongly in opposition.

Now, contrast that with the Microsoft position, where we've never opposed ODF. We didn't plan on supporting it, but we had no problem with other people using it. The only opposition we've ever had is to policies mandating ODF and blocking Office Open XML. We want choice; IBM and Sun on the other hand absolutely want to block choice. The spin they try to put on this is that by blocking choice in formats they are providing freedom to choose your application… what they don't say though is that we're doing that to an even greater degree. We're sponsoring a free, open source project for translating between the two formats, which gives everyone the freedom to choose both the application and the format. Microsoft's view has been that open formats are really important and that there is nothing wrong with having both ODF and Open XML. IBM and Sun on the other hand want one specific open format (ODF), and that's it.

Now, if you look at it technically, there is no reason to complain about the size of the spec unless you are trying to limit the features supported by the spec. There are plenty of large specifications out there (look at the Java spec) that are completely interoperable. As an implementer of the Office Open XML specification, you are free to decide what pieces you want to implement.

Let's think about this complaint that the specification is too large, though. What are the ways in which you could fix that?

  1. Less documentation and explanation??? - I can't imagine anyone wanting this. Remember, the standard isn't a novel you're supposed to read end to end. It's a detailed description of every piece of the Office Open XML file formats and how it all works. More documentation is an important thing in this case.
  2. Fewer features??? - Who gains from this? Any implementer has the freedom to pick which parts of the spec they want to support. Only applications that want to compete by bringing everyone down to their level would actually want features removed.

There are a lot of features between the three main schemas (WordprocessingML, PresentationML, and SpreadsheetML), and as a result the specification is very large. The ODF spec most likely would have been bigger if they had done a more thorough job documenting it, but even then it still wouldn't compare in terms of functionality. One of the other justifications I've heard for the ODF spec being so much smaller is that it reuses other standards. That may account for some of the difference, but it still doesn't get you all the way (not even close).

We also looked at reusing other standards where it makes sense (Dublin Core, ZIP, XML), but there are plenty of places where that didn't make sense (MathML). Take the example of MathML. It wasn't specifically designed for representing math in a wordprocessing document, but instead math in general. It's a good spec, and it does do a decent job in a wordprocessing document, but it's not able to handle everything that our customers would expect. It doesn't allow for the rich type of formatting and edit history that most customers of a wordprocessing application would want (see Murray's post for more details). Even more interesting though, to date there aren't any ODF wordprocessing applications out there that even support all of MathML. I think that Office 2007 actually has better MathML support with our import/export functionality. Another example given is the use of XSL-FO. It's a nice spec to reuse, but it doesn't fully define how international numbering should be done, so as a result OpenOffice has already extended the format in their own proprietary way.
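
For anyone who wants to see what that kind of reuse looks like in practice, here's a rough Python sketch (an illustration only, not something taken from the spec). It builds a tiny ZIP archive in memory, stores one XML part in it, and reads back the Dublin Core fields. The part name, the root element, and the sample values are all made up for the example; the only real identifier is the published Dublin Core namespace URI.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Dublin Core element namespace (a published URI, reused rather than reinvented).
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical part name and root element, chosen for this sketch only.
PART_NAME = "docProps/core.xml"
CORE_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    f'<coreProperties xmlns:dc="{DC_NS}">'
    "<dc:title>Quarterly report</dc:title>"
    "<dc:creator>Example Author</dc:creator>"
    "</coreProperties>"
)


def build_package() -> bytes:
    """Write a single XML part into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(PART_NAME, CORE_XML)
    return buffer.getvalue()


def read_dublin_core(package: bytes) -> dict:
    """Open the ZIP, parse the XML part, and collect the Dublin Core fields."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        root = ET.fromstring(archive.read(PART_NAME))
    prefix = "{" + DC_NS + "}"
    # Keep only child elements in the Dublin Core namespace, keyed by local name.
    return {
        child.tag[len(prefix):]: child.text
        for child in root
        if child.tag.startswith(prefix)
    }
```

Calling `read_dublin_core(build_package())` gives you the title and creator back as a plain dictionary. Nothing in there needed a custom container or a custom metadata vocabulary, which is the whole point of reusing ZIP, XML, and Dublin Core where they fit.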

XML itself has only been a standard for about 8 years. To assume that all the great thinking has been done and all the tough problems in the Office document space have already been solved in that time is ridiculous.