
Everybody's blogging about Massachusetts

I really don't have anything to do with Office XML formats, so I can't contribute much of substance to the debate over Massachusetts' draft Enterprise Technical Reference Model v3.5, which mandates the OASIS Open Document Format. The draft has generated a lot of weblog posts, mostly from open source advocates or employees of Microsoft competitors fulsomely praising it and hoping that this political decision will give their preferred technologies more economic clout.

A couple of more independent assessments raise issues that I find more interesting. For example, Stephen O'Grady acknowledges that Microsoft (or third parties) could easily support the ODF format if there is a business need to do so. That gets to the question of whether Massachusetts' very real needs for document interoperability and longevity are best met by the XML family of technologies in general, or by one specific format in particular. MS Office and OpenOffice / StarOffice have taken different paths here. Office 2003 can handle custom XML schemas and stylesheets quite dynamically, whereas OO has been hard-coded to handle a specific schema, and the latest beta versions have evolved to support the OASIS ODF. It's not at all clear to me why supporting specific schema(s), whether or not they are endorsed by a standards body, is considered more "open" or "standards based" than support for the more fundamental XML, XML Schema, and XSLT standards that Office 2003 implements.

One technical note: one can't simply configure Office 2003 to handle OASIS ODF documents directly, because the OASIS Technical Committee chose to define that format using the RELAX NG schema language (endorsed by OASIS and ISO) rather than W3C XML Schema, the W3C Recommendation that Office 2003 supports. There are some plausible technical reasons for this: RELAX NG is simpler, rests on a more solid formal underpinning, and is somewhat better suited for defining complex textual document formats than W3C XML Schema is. Unfortunately, that advantage does not carry over into tool support (few mainstream XML editors currently support RELAX NG validation) or support for structured data within the text. For example, tools that support the popular data-oriented XML programming technique known as data binding (which allows XML to be parsed easily into instances of application-level programming objects rather than abstract node trees) almost all require W3C schemas as input.
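To make the contrast between node trees and data binding concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `invoice` fragment, the `Invoice` class, and the `bind_invoice` helper are hypothetical, and the binding is written by hand. Real data-binding tools typically generate the class and the mapping code from a schema, which is the step that usually expects a W3C XML Schema as input.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical structured data embedded in a document; the element and
# attribute names are invented for this example.
SAMPLE = """<invoice>
  <customer>Commonwealth of Massachusetts</customer>
  <total currency="USD">1250.00</total>
</invoice>"""


@dataclass
class Invoice:
    """Application-level object that the XML is bound to."""
    customer: str
    total: float
    currency: str


def bind_invoice(xml_text: str) -> Invoice:
    """Hand-written stand-in for the code a binding tool would generate."""
    root = ET.fromstring(xml_text)
    total = root.find("total")
    return Invoice(
        customer=root.findtext("customer"),
        total=float(total.text),
        currency=total.get("currency"),
    )


# Node-tree style: the caller navigates generic elements and converts
# text to the right type at every point of use.
root = ET.fromstring(SAMPLE)
tree_total = float(root.find("total").text)

# Data-binding style: the caller works with typed fields on an object.
invoice = bind_invoice(SAMPLE)
print(invoice.customer, invoice.total, invoice.currency)
```

The point of the sketch is only the difference in what the calling code sees: generic nodes in the first case, typed fields in the second.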

This gets to a second issue, raised by Joe Wilcox (and something I debated at length in previous O'Grady postings):

Considering the OpenDocument format is only truly supported by OpenOffice 2.0, which isn't even available yet, I'm at a loss to see how the XML-based format meets the Commonwealth's goals for openness or backward compatibility. Nobody's really using the format yet, right? How, uh, open is that?

My other problem is one of definition. Looks like the Commonwealth considers Adobe's PDF as open, because the spec is openly published. OK, I'm scratching my head, because if you download Corel's WordPerfect SDK the WPD specification is right there. As for Microsoft, while I'm grumbly about the company's liberal use of open, I have to say if PDF meets the Commonwealth's standard so should Office formats; at the least the XML-based formats coming with Office 12. Microsoft does publish its XML schemas and license them on a royalty-free basis.

There have been all sorts of inconclusive debates about the real meaning of terms such as "standard" and "open", and I don't want to go there. It's important, however, to appreciate that there is a very real distinction between a specification ratified by a committee or standards organization, and a "standard" that enables real-world interoperability. MS Word's binary format and PDF are usually considered "de facto standards" in the sense that one can reasonably expect a random correspondent to be able to read a document in one of those formats. Obviously we can do better, and almost all concerned believe that moving to some sort of openly documented XML format is a better way to achieve short-term interoperability and long-term usability of documents. In general, the more eyes that have helped debug a spec and the more organizations that have endorsed it, the more of a real standard it will be. But there are plenty of examples of standards organizations producing specifications that have not led to real-world interoperability of any significance. W3C XLink, OASIS WS-Reliability, and ISO HyTime are clear examples of this in the SGML/XML world.

The important thing to remember is that industries, not industry standards committees, are the ones who produce industry standards. Knowing many of the people who helped produce OASIS ODF, I expect it to be a carefully crafted and good-quality specification, but it has to prove itself capable of solving real-world problems before it can legitimately be called an industry standard. For example, its reliance on RELAX NG makes it pleasing to XML geekdom, but greatly complicates the processing task for most actual developers. Will the mainstream be diverted by the need to support ODF documents, or will ODF remain in a backwater? Good intentions don't often make for successful policies.

Bismarck supposedly said "People who enjoy eating sausage and obey the law should not watch either being made". That applies to industry standards, which we enjoy once they've been "cooked", but which get produced by a messy process at best. It's easy to sympathize with Massachusetts' desire to buy its document standard sausage only from nice clean kitchens that use wholesome cruelty-free ingredients, but I've spent too much time in the XML standards sausage factory to believe it until I taste it.

Tofu bratwurst for the Labor Day picnic, anyone? 
Published Friday, September 02, 2005 4:49 PM by mikechampion

Comments

# re: Everybody's blogging about Massachusetts

Friday, September 02, 2005 8:07 PM by Noory
This is hilarious.

Of course it was a political decision!

Very few people would dispute that Office XML would be a better 'open' format than OpenDocument on technical reasons, if for no other reason than maximum backcompat.

But Office XML isn't 'open':
1) it is patent encumbered
2) it might be royalty free but it requires each user to relicense
3) it comes with no guarantees about future openness
4) it is controlled by a company with a history of bad practices who must look after its shareholders before the people of MA
5) Unlike GPL2/LGPL which have been around for over a decade the licensing implications aren't clear to software developers, for both Free Software or non-Free developers.
6) Despite being asked repeatedly, Brian Jones still hasn't answered simple questions about licensing. IANAL is easy to say, but it doesn't inspire confidence.
7) Archival issues are very different to day-to-day document management issues, a lot of the points you bring up simply aren't valid in an archival setting.

# re: Everybody's blogging about Massachusetts

Friday, September 02, 2005 9:24 PM by Bruce
Your point about RNG is a red herring and smacks of FUD. There is plenty of solid tool support for the language, and even if your tools don't support it directly, it's trivial in most cases to generate an XSD version of an RNG schema via Trang.

Indeed, that was one of the reasons why the OD TC chose RNG. The same with DocBook and TEI, each of which are now authored in RNG but provide alternate representations via Trang (and each of which have more history as document formats than either WordML or OD).

I do take your point that it's a little strange to mandate specific formats, rather than to simply insist they be (fully) open.

# re: Everybody's blogging about Massachusetts

Saturday, September 03, 2005 12:36 PM by Stefan Tilkov
Mike, I'm sure you are aware of the tools available for easily transforming RELAX NG grammars into XML Schema. I'm also very curious what would happen if you put the XSD for any useful text document format into any of the data binding tools available. My bet is that not a single one will be able to process them.

# [Outside Voice] 'Oh yum, I love Tofu' [Inside Voice] 'Oohhhh... Labor Day... Shoot, I'm, umm, busy working that day (isn't laboring what you do on Labor Day?)' [via Mike Champion]

Monday, September 05, 2005 8:24 PM by <XSLT:Blog/> Quote of the Day
mikechampion's weblog : Everybody's blogging about Massachusetts Based on the following paragraph we get the quote of the day which immediately follows it via the above link... --- Bismarck supposedly said "People who enjoy eating sausage and obey the law...

# re: Everybody's blogging about Massachusetts

Tuesday, September 06, 2005 5:32 PM by Bilbo
You all have to get over it. Open formats are more important to the customer than any feature microsoft can offer. Reading a document in 50 years is more important than inserting video in a transient document today.

Your little schema patent and continual attempts to shut out competitors with changing interfaces put an end to any possibility that the large amount of work done on XML by Microsoft would be useful in the long term.

The industry has moved on. It's not "gee whiz" any more; it's building infrastructure for long-term use.

I like microsoft products, but if you can't/won't offer what I need then I can't use them.

CrazyEnginner

# Massachusetts / Open Document Format Followup: Q&A

Tuesday, September 06, 2005 6:17 PM by tecosystems
Although I promised you a summary of what I discussed with Microsoft last Friday, much of it's already been told - if not discussed. Like CRN's Paula Rooney, I got the chance to connect with Microsoft's Alan Yates (GM of...

# Reply waiting for you at my blog

Friday, September 09, 2005 9:20 PM by Simon Phipps
I'm not sure I like being dismissed so easily as wrong because I am a partisan competitor, Mike, I am making points that I actually believe in (like I hope you are) and anyway, like IBM, I thought we were partners :-) Anyway, I just left a monster reply to your monster comment over on my blog at http://blogs.sun.com/roller/comments/webmink/Weblog/coursey_is_wrong_on_massachusetts#comment4

# "Standardizing on XML" is far from useless!

Sunday, September 11, 2005 3:13 AM by Microsoft XML Team's WebLog
The war of words over Massachusetts' proposal to standardize on the OASIS Open Document Format continues...