Standards, Patents, and the OSP
24 November 09 07:24 PM

Arthur C. Clarke famously wrote that “any sufficiently advanced technology is indistinguishable from magic.”  Another characteristic of advanced technologies is that they often contain intellectual property (IP) that may be covered by patents.

When such a technology becomes standardized, though, implementers need to feel comfortable that they have access to the technology on a non-discriminatory basis. This is typically dealt with in one of three ways: the technology owner agrees to license necessary patents on a RAND (reasonable and non-discriminatory) basis, the technology owner agrees to license necessary patents on a RANDz (RAND with zero royalty) basis, or the technology owner agrees not to assert necessary patents.  The reason international standards exist is to enable and facilitate interoperability between multiple implementations, and it would surely limit adoption if implementers had to worry about whether they could obtain access to IP that was necessary to implement the standards.

Some examples of companies’ agreements not to assert necessary patents are IBM’s Interoperability Specifications Pledge, Sun’s OpenDocument Patent Statement, and Microsoft’s Open Specification Promise (OSP).

In the case of the OSP, we’ve published a list of covered specifications, and we’ve made a public commitment that we will not assert patents that we own (if any) which are needed to implement those specifications.  The list includes ISO/IEC standards such as ISO/IEC 29500, standards from consortia such as Ecma and OASIS, and also documentation that Microsoft has published for various protocols and formats that we use in our products.  The list of covered specifications is constantly growing, and includes not only the Open XML spec itself (both ECMA-376 and ISO/IEC 29500), but also many closely related technologies including the Office binary formats, the Office 2010 extensions, and the published implementer notes.

It’s important to emphasize that the protection of the OSP applies to implementers regardless of whether they perform any other action.  You don’t need to file a request, provide notice, sign an agreement, or do anything else.  You can simply get to work implementing any of the covered specifications, and know that you have free global use of any Microsoft patent necessary to implement those specifications.  You also don’t need to consider the details of any specific patents, because the OSP’s protection applies to patents which Microsoft currently owns, patents that may be issued to Microsoft in the future, and patents that Microsoft may acquire in the future.

I’ve summarized the OSP here because I’ve been asked a few questions about it lately, and I thought it would be good to provide a simple explanation in layman’s terms of what the OSP is all about and how it works.  That said, I should add the obvious caveat that I am not a lawyer.  So if you’re looking to dig into the details of how IP issues are addressed in standards development, I’m no expert.  But that’s the point, really: you don’t need to be a legal expert, or do any kind of legal analysis, to know that you can freely implement Open XML (and many other standards) without worrying about whether a particular Microsoft patent applies or not.

Posted by dmahugh | 1 Comment
Open XML support in new tools, apps, and custom solutions
18 November 09 03:38 PM

One of the more interesting aspects of my job is meeting people who are developing software that interoperates with Office through the various formats that we support.  It’s exciting to hear their plans, work with them on the details of how our products can collaborate or share data, and then see the final solutions, whether they’re released in shrink-wrapped packages, available for download on the web, or delivered as a custom solution for a specific organization.

Three of the more innovative firms I’ve had the pleasure of meeting and working with over the last few years are Altova, Datawatch, and PSC.  Here’s an overview of the latest work each of them has been doing with the Open XML formats.

Altova StyleVision

Altova

Altova’s MissionKit is a powerful suite of XML tools for developers and XML architects.  I know them as an Open XML implementer, and also as a tool vendor.  We use XMLSpy, DiffDog, and other Altova tools for all kinds of things on our team, and we’ve even used them for Ecma TC45’s work on maintenance of the Open XML specification itself.

Altova first rolled out Open XML support in April 2007, when they added to XMLSpy the ability to create DOCX, XLSX, and PPTX documents from scratch, as well as the ability to edit Open XML documents with context-sensitive help, auto-completion, and all the other productivity-enhancing features of XMLSpy.  The latest release of their product line, Version 2010, was rolled out last month with even more Open XML support options, as well as a host of other new features.

StyleVision 2010, the stylesheet designer in the Altova lineup, now offers an entirely new design paradigm for creating XML forms that can be saved in DOCX and many other formats.  The key breakthrough in this release is that StyleVision is now a true electronic form design tool in every sense of the phrase.  Users can precisely position form elements on a canvas and specify templates within configurable layout containers, and they can design the form first, then add data-source connections later.  This provides many benefits, not least of which is that a graphic designer or similar person can create a great-looking form and then a developer or XML expert can add the necessary technical plumbing later.  If you’ve spent much time around e-forms, you know that that approach works much better than the other way around. :-)  StyleVision still offers free-flow (HTML-style) form creation as well as the new absolute-positioning functionality, for scenarios where a free-flowing form is more appropriate.

Monarch Context in action

Datawatch

Datawatch’s flagship product Monarch has been helping people create reports since 1991, and has supported the Open XML formats since February 2007, just two months after the publication of ECMA-376.  The latest release (Monarch Version 10.5) includes several changes related to their Open XML support.

Monarch is a report mining and analysis tool that allows users to non-programmatically retrieve data from various information sources (usually reports in PDF, HTML, text or other formats) and create consolidated reports from that data.  It is both a consumer and a producer of the XLSX/XLSM formats, and it is used by hundreds of thousands of users to generate reports from data sources that would otherwise be very difficult to consolidate or work with.

Version 10.5 includes support for digital signatures in XLSX/XLSM documents, improved PDF import, and other features, but the really interesting feature from an Open XML point of view is Monarch Context, a free add-in for Excel.  Monarch 10.5 has the ability to take advantage of the flexibility of OPC (the Open Packaging Convention) to store a full-fidelity XML representation of the original source report in the output XLSX document.  It also includes metadata on each row of the spreadsheet that ties back to the source of that row’s data in the original report.  With Monarch Context (see screenshot above), you can then navigate to a row in the generated spreadsheet, click on Display Source, and see the source data for that row.

This is a powerful example of the creative possibilities that developers have when working with OPC.  The generated report includes its own audit trail and source data, and you can even ask Monarch to sign the XLSX/XLSM file, to assure future viewers of the integrity of the report for compliance and auditing purposes.
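
To make the OPC idea a little more concrete, here is a minimal Python sketch of how a package can carry an extra part alongside the usual workbook parts, and how a consumer can discover it.  It is not Monarch's actual layout: the part name `/customSource/report.xml` and the content type `application/vnd.example.source-report+xml` are made up for illustration, and the demo package is only a ZIP with a `[Content_Types].xml` stream, not a valid workbook.  A real producer would normally also add a relationship pointing at the custom part.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the OPC [Content_Types].xml stream.
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"


def build_demo_package() -> bytes:
    """Build a tiny OPC-style ZIP in memory (illustrative, not a valid workbook)."""
    content_types = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<Types xmlns="{CT_NS}">'
        '<Default Extension="xml" ContentType="application/xml"/>'
        # Hypothetical custom part holding the original source report.
        '<Override PartName="/customSource/report.xml" '
        'ContentType="application/vnd.example.source-report+xml"/>'
        '</Types>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("[Content_Types].xml", content_types)
        z.writestr("xl/workbook.xml", "<workbook/>")
        z.writestr("customSource/report.xml",
                   '<report><row id="1">42</row></report>')
    return buf.getvalue()


def list_parts(package: bytes) -> dict:
    """Map each part name to its content type using [Content_Types].xml."""
    with zipfile.ZipFile(io.BytesIO(package)) as z:
        types = ET.fromstring(z.read("[Content_Types].xml"))
        defaults = {d.get("Extension").lower(): d.get("ContentType")
                    for d in types.findall(f"{{{CT_NS}}}Default")}
        overrides = {o.get("PartName"): o.get("ContentType")
                     for o in types.findall(f"{{{CT_NS}}}Override")}
        parts = {}
        for name in z.namelist():
            if name == "[Content_Types].xml":
                continue  # the content-types stream is not itself a part
            part_name = "/" + name
            ext = name.rsplit(".", 1)[-1].lower()
            # An Override for the exact part name wins over the extension Default.
            parts[part_name] = overrides.get(part_name,
                                             defaults.get(ext, "unknown"))
        return parts


if __name__ == "__main__":
    for part, content_type in list_parts(build_demo_package()).items():
        print(part, "->", content_type)
```

A consumer that doesn't know about the custom part can simply leave it alone, which is what makes this kind of embedding practical.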

PPTX report generated by PSC's custom system for RDI

PSC

PSC Group LLC is a Chicago-area professional services consulting firm, a Microsoft Gold Partner, an IBM partner, and active in a wide range of technology services and projects.  I’ve gotten to know John Head, PSC’s Director of Enterprise Collaboration, fairly well through DII workshops and other activities, where he is a perennial force to be reckoned with.  John’s a hands-on expert and is definitely not the kind of guy who ever leaves you wondering what he really thinks -- it’s great to get that kind of direct informed feedback when we participate in DII events.

One of PSC’s clients is RDI (Research Director, Inc.), a Maryland-based radio-research consulting firm that analyzes, interprets and presents Arbitron audience research data for over 200 radio stations.  RDI came to PSC with a problem: they had been using the same custom system for 10 years, and their needs had grown so much that it was taking three weeks to build the presentations they needed to generate each quarter.  As Marc Greenspan of RDI explains, “Taking three weeks to process the data was no longer an option.  By the time we’d be done, the next set of data would be coming in, and the presentations would be close to useless to our customers. We needed a scalable, sustainable production system, and our motivation was literally business survival.”

PSC developed a custom Open XML solution for RDI that now delivers these same reports in just three days, enabling RDI to offer its analyses 13 times per year instead of 4 times per year, and dramatically reducing the cost of generating their analysis.  John Head explains that “We were able to improve the application performance so drastically by not having to automate PowerPoint, because it’s not involved until the user actually opens the file.  We couldn’t have done that without Open XML. We couldn’t support document generation on the server with binary formats. It was too hard and it didn’t always work. Open XML changed that.”
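
To illustrate what John means by generating documents without automating an Office application, here is a minimal Python sketch that writes an Open XML file using nothing but a ZIP library and some XML strings.  This is not PSC's system, and it produces a bare-bones DOCX rather than a PPTX because a word-processing document is the smallest package to show; a presentation follows the same packaging pattern with more parts.  The function name `build_docx` is just an illustrative choice.

```python
import io
import zipfile
from xml.sax.saxutils import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Content types for the three parts in this minimal package.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Package-level relationship that identifies the main document part.
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)


def build_docx(paragraphs) -> bytes:
    """Return the bytes of a minimal DOCX with one run of text per paragraph."""
    body = "".join(
        f'<w:p><w:r><w:t xml:space="preserve">{escape(text)}</w:t></w:r></w:p>'
        for text in paragraphs
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("[Content_Types].xml", CONTENT_TYPES)
        z.writestr("_rels/.rels", RELS)
        z.writestr("word/document.xml", document)
    return buf.getvalue()


if __name__ == "__main__":
    with open("report.docx", "wb") as f:
        f.write(build_docx(["Quarterly summary", "Generated on the server."]))
```

Because nothing here needs a desktop application, a server can produce many such files in a loop, and the Office application only gets involved when a user opens the result.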

There’s a new case study on our web site about the system that John’s team has built for RDI.  It’s a good example of how the world of automated document assembly is rapidly changing with the advent of standardized XML-based document formats.  I had the opportunity to learn about this system early on, when the PSC team gave us a demo in Redmond last year, and it’s great to see it rolled out and in production.

Microsoft

And finally, for those who haven’t seen the news yet … we have a new Open XML implementation available, too.  More on that one later. ;-)

Posted by dmahugh | 1 Comment
DII workshop, Brussels
17 November 09 04:17 AM

DII workshop

Last week I participated in the DII workshop that took place in Brussels.  Attendees included a variety of document-format experts from the ODF and Open XML worlds, including members of SC34 working groups, the OASIS ODF and OIC TCs, ODF and Open XML implementers, public-sector experts in interoperability and archiving, and others.

The roundtable discussions at this event covered some interesting issues, including various approaches to round-tripping content through different formats, extensibility strategies and technical considerations, and future possibilities for the Strict and Transitional conformance classes of ISO/IEC 29500.

The workshop presentations are now available for download here.

[Photos: Belgian beer, parliament, Brussels, Antwerp]

Posted by dmahugh | 0 Comments
ODF plugfest and OOoCon, Orvieto
07 November 09 01:42 PM

I’ve spent the last week in the city of Orvieto, perched atop a hill in Umbria, Italy.  Monday and Tuesday I participated in the second ODF Plugfest, and then Wednesday through Friday I attended OOoCon, the annual OpenOffice.org conference.  I gave a presentation on Wednesday about Office’s approach to interoperability with OpenOffice.org, which you can find on the OOoCon presentation page, and you can find the presentations from the plugfest, as well as the test scenarios we went through, on the plugfest web site.

It was great to see everyone I had met at the last plugfest, and I also had the opportunity to finally meet in person many people I’ve only known via email and the ODF TC calls, including Svante Schubert, Charles Schulz, Louis Suarez-Potts, Eike Rathke and others.  Everyone was great, and made me feel very welcome.

I was planning to do some sightseeing in Rome this weekend, but there is a train strike that begins at 21:00 today (Saturday), so I’m going to stay right here in Orvieto until Monday, when I’ll fly to Brussels for meetings and preparations for the upcoming DII workshop on Thursday, November 12.  If you’d like to see the photos I’ve taken in Orvieto this week, you can find them on Flickr, and I’ve also included thumbnails of a few favorites below.

And now, after a long day of photographing the sights of Orvieto, it’s time to get out and enjoy some local cuisine.  Buon appetito!

[Photo thumbnails from Orvieto]

Posted by dmahugh | 1 Comment
DII workshop – Brussels, November 12
13 October 09 10:38 PM

The next DII (Document Interoperability Initiative) workshop will take place in Brussels on November 12.  As always, the goal of the DII workshops is to share information with the developer community and solicit feedback on how we can work together to improve interoperability.

Much has changed since the last DII workshop in Brussels, when we discussed Office’s future plans for ODF support and the pending rollout of the implementer notes, as well as the need for a validator and document test library to improve ISO/IEC 29500 interoperability.  Now there are two versions of Office that offer built-in ODF support (Office 2007 SP2 and the Office 2010 Technical Preview), implementer notes have been published for ODF, ECMA-376, and ISO/IEC 29500, and validator and test-library projects are underway.

This workshop will include presentations on a variety of document interoperability topics.  I'll blog the details of the agenda after it's finalized, but I wanted to let everyone know the date so those who are interested can make plans to attend.  In the meantime, here are a few of the presentations that are already being planned:

  • I'll be covering the Office 2010 extensions (as the Office program managers did at last month's DII workshop in Redmond).  I will also present the latest news on how we’re working to improve ODF interoperability between Office and other popular applications, and talk about our plans for the future.
  • Alex Brown will be covering present and future plans for the Office-o-tron validator project.
  • Klaus-Peter Eckert of Fraunhofer FOKUS will present the latest status of the document test library project and other work Fraunhofer is doing to improve interoperability.

Like all DII workshops, this event is open to anyone and everyone, and there is no cost to attend.  If you're interested in attending, let us know by sending an email to diievents@microsoft.com, and we’ll follow up with information about the venue, agenda, and other details.

On a more personal note, I know that many people in the document formats community (myself included) are photo geeks, and cameras may have outnumbered laptops at some of the standards meetings and interop events I’ve attended.  I’ve been to Brussels twice before, but never had time to snap more than a few photos on the way between meetings, like the snapshots below.  This time, however, I’ll be in Brussels a day early and am planning to get out and take a bunch of pictures on the Armistice Day holiday, November 11.  So if you’re coming to the workshop and would like to go on a photo outing the day before, let me know.  The more the merrier.

Brussels

See you in Brussels!

Posted by dmahugh | 0 Comments
DII Workshop on Office 2010 Extensions
23 September 09 08:11 PM

presentations at the DII workshop, 9/18/2009

If you’re using the Office 2010 Technical Preview and you’re the type of person who likes to look closely at the markup of documents that you’ve created (as most readers of this blog are), you may have noticed some new namespaces that Word, Excel, and PowerPoint use for new functionality such as sparklines in spreadsheets or presentation sections.

Last Friday, at a DII workshop in Redmond, program managers from the Office product groups explained how these new namespaces are used in Office 2010.  This event was scheduled for the day after the SC34 Plenary which took place in nearby Bellevue, so that interested SC34 members could easily attend.  In addition to the SC34 attendees, several US-based ISVs were also present, which led to a good roundtable discussion in the afternoon with a variety of perspectives represented.

Office 2010 takes advantage of two types of extensibility mechanisms in ISO/IEC 29500: extension lists and ACBs (alternate content blocks).  These extension points, which are documented in the text of the standard (see Part 3), provide implementers with a standardized way to innovate and add new functionality while maintaining conformance to the standard itself.  The core concept is that an implementer can provide more than one representation for an object (a shape on a slide, say), and then consumers of that document can render the version that they understand.
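
Here is a rough Python sketch of that core concept for alternate content blocks: a consumer walks the `mc:Choice` elements in order, takes the first one whose `Requires` namespaces it understands, and otherwise uses `mc:Fallback`.  The markup-compatibility namespace is the real one from Part 3, but the sample document, the `urn:example:*` namespaces, and the function names are invented for illustration, and the sketch assumes each prefix is declared only once in the document.

```python
import io
import xml.etree.ElementTree as ET

MC_NS = "http://schemas.openxmlformats.org/markup-compatibility/2006"

# Hypothetical markup: one shape offered in a "v2" form with a plain fallback.
SAMPLE = b"""<doc xmlns="urn:example:base"
     xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
     xmlns:v2="urn:example:v2">
  <mc:AlternateContent>
    <mc:Choice Requires="v2"><v2:fancyShape/></mc:Choice>
    <mc:Fallback><shape/></mc:Fallback>
  </mc:AlternateContent>
</doc>"""


def prefix_map(xml_bytes):
    """Collect prefix -> namespace URI (assumes no prefix is redeclared)."""
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns",))
    return {prefix: uri for _, (prefix, uri) in events}


def select_branch(alt, prefixes, understood):
    """Pick the first usable mc:Choice, else mc:Fallback (which may be None)."""
    for choice in alt.findall(f"{{{MC_NS}}}Choice"):
        required = choice.get("Requires", "").split()
        if required and all(prefixes.get(p) in understood for p in required):
            return choice
    return alt.find(f"{{{MC_NS}}}Fallback")


def rendered_tags(xml_bytes, understood):
    """List the element tags a consumer with this namespace set would render."""
    prefixes = prefix_map(xml_bytes)
    root = ET.fromstring(xml_bytes)
    tags = []
    for alt in root.iter(f"{{{MC_NS}}}AlternateContent"):
        branch = select_branch(alt, prefixes, understood)
        if branch is not None:
            tags.extend(child.tag for child in branch)
    return tags


if __name__ == "__main__":
    print(rendered_tags(SAMPLE, {"urn:example:v2"}))  # newer consumer
    print(rendered_tags(SAMPLE, set()))               # baseline consumer
```

The same document therefore yields the richer shape for a consumer that knows the newer namespace and the plain shape for everyone else.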

The workshop featured four presentations on how Office 2010 uses ACBs and extension lists.  Word/Excel/PowerPoint program managers (Zeyad Rajabi, Chris Rae, and Ric Bretschneider, respectively) explained examples from each of those products, and Nick Chiang of the graphics team covered some general concepts about how graphics are handled in Office 2010.  The presentations included demos of various new types of functionality in Office, followed by deep-dive explanations of the markup used to serialize this new functionality.

All of the presentations are now available for download here, and documentation of the specific extensions used by Office 2010 can be found in the Microsoft Office File Formats section of the Open Specifications Developer Center on MSDN.

After the presentations on Friday, we had a roundtable discussion of the topics covered during the day, and a variety of related topics.  Here are some of the points of discussion that I found most interesting:

  • Standardizing implementer notes.  Should there be a standardized approach for documenting the details of a specific implementation of a standard?  If a standardized schema were used by all implementers, it would be possible to build tools to work with these notes and search them or mine the data to identify possible improvements to the standard and trends among implementations.  There was consensus that this is a topic worth further discussion and consideration.
  • Should the Office 2010 extensions be standardized?  I was interested to see the range of opinions on this question.  Some people felt they should, others felt that they should not, and others felt that further investigation and study is needed to determine what subset of these extensions may be appropriate or worthwhile for de jure standardization.
  • Test suites and profiles.  We discussed various approaches to creating document test suites and standards conformance profiles that would enable better interoperability.  Participants in this discussion included members of the OASIS OIC TC, SC34 WG4 and SC34 WG5, as well as software developers who (as John Head put it) “don’t care about standards, we just want everyone to implement the same thing in the same way.”
  • Should all implementations have published implementer notes?  This topic was raised by a participant who asked about implementer notes for Mac Office, which currently don’t exist.  Some felt that such documentation is needed from all implementers, and one person even suggested that publication of implementer notes could be a requirement for claiming conformance to a standard.  Others felt that such a requirement might be an obstacle to adoption of the standard, given the magnitude of the effort needed to create comprehensive documentation for an implementation.
  • Can some things be standardized in a way that allows for their re-use in both ODF and Open XML?  One of the attendees suggested that slide transitions could be standardized independent of the underlying document format, to allow them to be re-used consistently in more than one format.  This would certainly simplify some aspects of translation between formats, but as others pointed out, it may be more effective to focus on translation between the existing standards rather than modifying each of them to support a new approach, which could be technically and politically challenging.
  • Do standards slow the pace of innovation?  I hadn’t expected this topic to come up, but it prompted an interesting discussion of whether forward progress in document formats is limited by the fact that the major alternatives are all published international standards that must adhere to well-defined processes for their maintenance and evolution.  The conclusion on this topic seemed to be that this is not an issue (i.e., standards don’t restrict innovation), but more could be done to educate implementers and users about how innovation and standards can peacefully co-exist.
  • What’s the best way to handle schema validation for documents that use MCE?  MCE (Markup Compatibility and Extensibility – Part 3 of ISO/IEC 29500) requires a consumer to pre-process the XML markup before doing schema validation.  We discussed how implementers can best address this requirement.  Some felt NVDL might be the best tool for the job, and others suggested that there is a need for more examples of how to work with MCE, and tools that support MCE, across all platforms.  This is another area we agreed needs further discussion and consideration.
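
As a rough illustration of the pre-processing step mentioned in the last bullet, here is a Python sketch of one piece of it: removing elements and attributes from namespaces marked `mc:Ignorable` that the consumer doesn't understand, so that what remains can be handed to a schema validator.  It is deliberately simplified and is not a conformant MCE processor: it treats every `mc:Ignorable` declaration as applying to the whole document rather than scoping it to the declaring element, it assumes prefixes are not redeclared, and it does not handle `mc:AlternateContent`, `mc:ProcessContent`, or `mc:MustUnderstand`.  The sample markup and the `urn:example:*` namespaces are made up.

```python
import io
import xml.etree.ElementTree as ET

MC_NS = "http://schemas.openxmlformats.org/markup-compatibility/2006"
IGNORABLE = f"{{{MC_NS}}}Ignorable"

# Hypothetical markup with one ignorable extension namespace ("v2").
SAMPLE = b"""<doc xmlns="urn:example:base"
     xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
     xmlns:v2="urn:example:v2" mc:Ignorable="v2">
  <item v2:glow="1">text</item>
  <v2:extra/>
</doc>"""


def namespace_of(name):
    """Return the namespace URI of an ElementTree tag or attribute name."""
    return name[1:].split("}", 1)[0] if name.startswith("{") else ""


def strip_ignorable(xml_bytes, understood=frozenset()):
    """Return the parsed tree with non-understood ignorable markup removed."""
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns",))
    prefixes = {prefix: uri for _, (prefix, uri) in events}
    root = ET.fromstring(xml_bytes)

    # Simplification: gather every mc:Ignorable declaration document-wide,
    # dropping the mc:Ignorable attribute itself along the way.
    ignorable = set()
    for element in root.iter():
        for prefix in element.attrib.pop(IGNORABLE, "").split():
            uri = prefixes.get(prefix)
            if uri and uri not in understood:
                ignorable.add(uri)

    def prune(parent):
        for child in list(parent):
            if namespace_of(child.tag) in ignorable:
                parent.remove(child)  # drop the element and its content
            else:
                prune(child)
        for name in [a for a in parent.attrib if namespace_of(a) in ignorable]:
            del parent.attrib[name]

    prune(root)
    return root


if __name__ == "__main__":
    print(ET.tostring(strip_ignorable(SAMPLE), encoding="unicode"))
```

A full implementation would need to respect the scoping rules in Part 3, which is part of why better tools and examples came up in the discussion.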

It’s great to have so many experts in one room, sharing their thoughts on these topics, and I learned a lot from the wide variety of opinions expressed.  What do you think?  Do you have a strong opinion on any of these topics?  Let me know in the comments below.

Posted by dmahugh | 0 Comments
SC34 WG4 Meeting, Bellevue
15 September 09 04:33 PM

WG4 meeting, Sunday 09/13/2009

WG4 has been meeting in Bellevue, Washington for the last three days, as part of a full week of activities around the upcoming SC34 plenary here this Thursday.  We wrapped up the meeting with a half-day this morning (Tuesday), after meeting all day on Sunday and Monday.

WG4 covered a variety of topics in this week’s meetings.  I won’t get into great detail on those, because some of these topics are still open and WG4 deliberations will continue in future calls and meetings, but here are a few of the more interesting topics we covered: 

  • The past, present and future of Strict and Transitional; WG4 is working toward consensus on the long-term intent of this distinction, to help guide future work on new functionality and other maintenance activities.
  • The W3C “Widgets 1.0: Packaging and Configuration” candidate recommendation, and how it compares/contrasts to OPC.
  • Possible use of Assembla for tracking defect reports and maintaining the IS29500 schemas.
  • Open defect reports on media types, fonts, and custom XML; we discussed proposed resolutions and closed most of the DRs we covered.
  • Issues around ISO 8601 dates, including how to define a profile of the 8601 standard for use in IS29500, and the question of whether ISO 8601 dates should be removed from Transitional.
  • Various options for addressing the unqualified attributes issue that has been raised by the Czech Republic.
  • How future reprints and revisions to the standard will be handled, and how the upcoming changes to the ISO/IEC Directives will affect that work.

In addition to the 20-25 people in the room each day, we had a few people call in to contribute and participate.  Especially noteworthy, in my opinion, were the phone contributions from Japan’s Toshiya Suzuki (who was on the phone from roughly 2:00AM to 9:00AM his time) and Denmark’s Jesper Lund Stocholm, who juggled LiveMeeting, a phone call, and a 2-month-old baby with aplomb.  Thanks for calling in, guys.

We also had two presentations from invited Microsoft experts.  Zeyad Rajabi covered custom XML, and Jeff Chen covered Markup Compatibility and Extensibility.  This gave WG4 members an opportunity to ask questions of subject-matter experts in these areas, to help inform future maintenance activities.

Other SC34 working groups are also meeting here this week prior to the Thursday plenary, including WG5 (ODF-OXML translation), WG3 (topic maps), and WG1 (validation).  This afternoon and tomorrow, I’m attending AHG3, the ad-hoc group looking into ODF maintenance and how SC34 and OASIS can work together on maintenance going forward.  I’m also going to try to spend some time with WG5 (which is meeting at the same time as AHG3) if I can, and then Thursday is the SC34 plenary itself, and Friday we have a DII workshop over in Redmond on the Microsoft campus.  Busy week.

I’ll cover some of these topics in more detail after decisions are made in WG4 and/or relevant resolutions are approved in the SC34 plenary.

Lake Washington and Olympic Mountains, from Kirkland 09/14/2009

Posted by dmahugh | 0 Comments
DII Workshop: MCE Deep Dive, Redmond
21 July 09 12:49 PM

We’ll be hosting another DII workshop soon, and this one will be of special interest to those who want to understand the inner workings of MCE (Markup Compatibility and Extensibility) as defined in Part 3 of ISO/IEC 29500.  We’ve used MCE for new functionality in documents created by Office 2010, and members of SC34 WG4 have expressed interest in understanding the details of our implementation.  So we’re planning a workshop in Redmond on Friday, September 18, the day after the upcoming SC34 plenary in nearby Bellevue, to do a deep-dive review of how Office 2010 uses MCE.

For those who aren’t familiar with MCE, here’s how it is described in the Scope clause of the standard (ISO/IEC 29500-3):

This Part of ISO/IEC 29500 describes a set of conventions that are used by Office Open XML documents to clearly mark elements and attributes introduced by future versions or extensions of Office Open XML documents, while providing a method by which consumers can obtain a baseline version of the Office Open XML document (a version without extensions) for interoperability.

By using MCE for the new functionality in Office 2010, we can deliver innovations like sparklines in Excel 2010 or new slide transitions in PowerPoint 2010, while maintaining compatibility with the Open XML standard.  At this one-day event, members of the Word, Excel, PowerPoint and graphics teams will demonstrate these new capabilities and show how they are stored in MCE alternate content blocks and extension lists.
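
For readers who haven't looked at extension lists in markup, here is a small Python sketch of how a consumer might treat them: each `ext` child of an `extLst` element carries a `uri` attribute identifying the extension, and a consumer processes the ones it recognizes and skips the rest.  The sample worksheet fragment is invented, the `urn:example:ext:*` identifiers and the `urn:example:x` namespace are placeholders rather than real Office 2010 values, and the function name is just illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical worksheet fragment with two extensions in its extension list.
SAMPLE = """<worksheet
    xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData/>
  <extLst>
    <ext uri="urn:example:ext:sparklines" xmlns:x="urn:example:x">
      <x:sparklineGroups/>
    </ext>
    <ext uri="urn:example:ext:other">
      <somethingElse xmlns="urn:example:y"/>
    </ext>
  </extLst>
</worksheet>"""


def local_name(tag):
    """Strip the {namespace} part from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def partition_extensions(xml_text, known_uris):
    """Split extension URIs into (understood, skipped) for this consumer."""
    root = ET.fromstring(xml_text)
    understood, skipped = [], []
    for element in root.iter():
        if local_name(element.tag) != "extLst":
            continue
        for ext in element:
            if local_name(ext.tag) != "ext":
                continue
            uri = ext.get("uri")
            # Unknown extensions are left alone rather than treated as errors.
            (understood if uri in known_uris else skipped).append(uri)
    return understood, skipped


if __name__ == "__main__":
    print(partition_extensions(SAMPLE, {"urn:example:ext:sparklines"}))
```

The rest of the worksheet stays readable to a consumer that recognizes none of the extensions, which is the point of keeping them in a dedicated list.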

The date for this workshop will be Friday, September 18, and the location will be Microsoft’s Redmond campus.  If you’re interested in attending, please contact my colleague Amruta Gulanikar via this form to get on the list.  Amruta can also provide information about travel and hotel options.

I hope to see you there!

Posted by dmahugh | 3 Comments
Open XML developers: where to get answers
09 July 09 06:10 PM

I recently returned from a long business trip, and while working through my email backlog I’ve come across several questions from developers who are working with the Open XML formats.  I’ve responded to each of them with some tips on how to best get such questions answered, and I thought I’d summarize that information here for others who may find it useful.

Your first stop for most Open XML development questions should be the forums section of the Open XML Developer web site.  You can post a question there and it will be seen by the people who manage that site, and also by the broader Open XML developer community.  Over 4,000 comments have been posted to those forums, so you can also learn quite a bit by reviewing the existing threads.  And the library section of the site has dozens of articles with Open XML code samples in many programming languages, including Java, C#, C++, PHP, Ruby, Python, and others.

Another great one-stop shop for Open XML development topics is the Open XML Developer Portal.  There you can find a huge number of code samples, articles, whitepapers, how-to videos, and free downloads, as well as links to many other resources.  This site is focused on the needs of developers who are working with Microsoft’s tools, including the Open XML SDK and the System.IO.Packaging API.

For developers working with the Open XML SDK, be sure to check out Erika Ehrli’s recent summary of “Getting Started Best Practices” for SDK developers.  You can also find a rapidly growing collection of great in-depth blog posts about Open XML SDK development over on Brian Jones’s blog, where Zeyad Rajabi has been covering a wide variety of common scenarios for Open XML developers.

Speaking of Microsoft blogs for Open XML developers, any serious Open XML developer should also include Eric White’s blog in their RSS feeds.  Eric is a leading proponent of the use of functional programming techniques based on LINQ to XML technology, and he covers the Open XML SDK and SharePoint topics as well as more general XML development concepts.  Another Microsoft blogger to follow is Stephen Peront, who works with me on the Office Interoperability team and recently covered the File Format Converter API that we released in Office 2007 SP2.

If you’re looking for detailed information about Office’s implementation of Open XML, check out the implementer notes published on the DII web site.  There you can find information about our support for ECMA-376, and soon you’ll find a similarly detailed set of notes for our implementation of ISO/IEC 29500 in Office 2010.  This information can be very useful for maximizing interoperability with Office.

You can find related information about Office’s support for file formats (including other formats in addition to Open XML) at the Open Specifications Developer Center on MSDN.  That site hosts thousands of pages of documentation for protocols and formats that are supported by Office, and the forums section of the site is a good place to get specific questions answered.  The MSDN forums also include a very active forum for Open XML SDK developers.

One final detail worth mentioning is where to get the Open XML specification itself.  You can download ECMA-376 from the Ecma International web site, and the home page of the Open XML Developer web site has a handy set of links for that purpose.  And you can download the ISO/IEC 29500 specification from the “Freely Available Standards” page on the ISO web site.

As you can see, there are many good resources available for Open XML developers.  If you have questions about how to do something specific, or are looking for general advice on working with Open XML, the links above are the best places to get help.  Feel free to contact me as well, through this blog, if you’re looking for something you can’t find at these sites, and I’ll try to help you track down the best place to get the information you need.

OK, time to get back to working through that email backlog …


Posted by dmahugh | 2 Comments
WG4/WG5 Meetings
29 June 09 07:32 AM

WG4 meeting, Copenhagen

I returned last night from a week in Copenhagen, where I attended the SC34 WG4/WG5 meetings that were hosted by Danish Standards.  As usual, it was several days of non-stop document format discussions, in the meetings as well as over breakfast, lunch, dinner, and Carlsbergs.  A typical comment from one of the delegates Wednesday afternoon: “let’s take a break from sitting down and continue this debate standing up for a while.”

Other attendees have posted some thoughts about the meetings already, and I expect we’ll see more discussion of the details on participants’ blogs going forward.  See Alex Brown and Jesper Lund Stocholm for information about some of the topics we discussed, including boolean values, ISO 8601 dates, and other aspects of IS29500 maintenance.

WG4 (IS29500 Maintenance)

The main focus of WG4’s work, as always, was processing of DRs (defect reports) that have been submitted by member bodies.  As of the end of the meeting, we had closed 205 of the 284 defect reports that have been submitted to date; watch the WG4 statistics page for an update in the next few days that will reflect the latest status.  The biggest submitter of DRs to date has been the UK, although I see that Japan plans to take the lead soon, according to a comment from WG4 convenor Murata Makoto on Alex Brown’s Flickr stream.

In addition to discussing proposed solutions and closing DRs, we discussed at some length two topics Alex raised in presentations to the group: the intent of IS29500’s division into Transitional and Strict at the BRM last year (including how they’re related and the long-term maintenance implied by this structure), and various approaches to conformance testing for IS29500 (and also IS26300).  We also reviewed the planned schedule for COR1 (the first set of technical corrigenda for IS29500) and AM1 (the first set of amendments).  Project editor Rex Jaeschke is already working on these documents, and WG4 hopes to be ready to approve them on the July 23 conference call, or on a July 30 call if needed.  After that, they’ll proceed to SC34 and JTC1 for balloting.  One other interesting topic we discussed was how we can implement a public email archive for all WG4 correspondence.

I’ll have more to say on all of these topics as we move forward with them in WG4.  For the next few weeks, however, the main focus of WG4’s work will be to get COR1 and AM1 ready for approval and publication, so that the IS29500 standard can be updated to reflect all of the work WG4 has done to date.

WG5 (ODF-OXML Translation)

After WG4 met on June 22-24, WG5 met on June 25.  WG5 is the working group on translation between the ODF and Open XML formats, and the main work item in WG5 right now is a TR (technical report) that will provide guidance on various details of the translation process.

Klaus-Peter Eckert (Fraunhofer FOKUS), the editor of the TR, provided an overview of the current status and outlined some of the scenarios the report is based upon.

Alex Brown presented on conformance and validation, covering the same topics he had covered for WG4 earlier in the week.  We discussed possible collaboration between WG5 members, Fraunhofer FOKUS, and others, to create a community-developed validator that would benefit from the expertise of WG1, WG4, and WG5.  These discussions are ongoing.

I presented a recap of the ODF Plugfest I had attended in The Hague, and explained how members of the OIC TC will be contributing to the plugfest wiki.  We discussed whether a similar wiki-based approach might be appropriate for testing translation scenarios as defined by WG5.  We also went through the schema that OIC TC chair Bart Hanssens has created for managing interoperability test cases, and looked at how that approach might be useful to WG5.  Although WG5 and the OIC TC have different missions – the OIC TC is all about ODF interoperability, whereas WG5 is about ODF-OXML translation – there is quite a bit of conceptual overlap in their work.  We decided that it would be good to keep WG5 informed of developments in the OIC TC going forward, and I’ll be playing that role for WG5.

As usual at these meetings, I got out and took some pictures around town in the evenings.  The fact that it was light out until after 23:00 every evening certainly extended photo-taking hours, and we were fortunate to be in town during the week of the solstice celebration (the witch-burning to which one attendee alluded earlier in the week), as well as the week of a crazy graduation ritual in which truckloads of young Danes in white hats cruise around the city yelling through bullhorns, sounding sirens, drinking beer, and generally carrying on.

celebrating graduation

Posted by dmahugh | 3 Comments
ODF Plugfest, The Hague
16 June 09 11:09 PM

Over the last two days I’ve been attending the ODF Interoperability Workshop, a fascinating event that brought together ODF implementers from many countries to talk about the issues and collaborate on interoperability testing.  The workshop web site covers the details of the agenda, provides a variety of related content (including the presentations), and lists the objectives of the event:

The aim is to provide a low-level hands-on interoperability testing environment in which
vendors and community members can fine tune the interoperability capabilities of their ODF implementations and make test cases, recommendations and create best practices for implementors.

The ultimate goal is to achieve full seamless interoperability for the entire feature set of ODF across all suppliers, platforms and supported technologies.

The workshop is meant for people who write and architect the code to handle the actual ODF in
applications - desktop editors and viewers, online apps, mobile, etc. Participants should represent every major team behind the various competing ODF products, their direct (technical) management and community leaders, as well as the members of the ODF OIC committee.

It was a productive two days, both in terms of what was accomplished in the official activities of the event and also in terms of the networking opportunities it provided.

The first day started with a speech by Frank Heemskerk, the Minister of Foreign Trade for the Netherlands.  He discussed the Dutch government’s policy on the use of open standards, and made a direct appeal to the attendees to “go beyond compliancy and help achieve broad-based open standards.”

Mr. Heemskerk was followed by Ineke Schop, Program Manager for Netherlands in Open Connection.  Ms. Schop described her view of the goals and aspirations of open standards in general, as well as some of the specific steps being taken by her organization to deliver on those objectives.

After that we got into the details of working through ODF interoperability issues.  There were a variety of sessions by implementers and members of the ODF TC and OIC TC, and you can find all the details in the agenda posted on the workshop web site.  Note that the presentations are also included in the online agenda – several have already been posted, and the rest will be available soon.  Video interviews were recorded with many of the attendees, and those should be available soon as well.

It was great to see some old friends again, and I also met many people I knew before only through their voices on ODF TC calls or their online presence in the ODF community, including Oliver-Rainier Wittman (Sun), Mingfei Jai (IBM), Marc Maurer (AbiWord), Zaheda Bhorat (Google), and many others.  My colleague Peter Amstein, the chief architect of our ODF support, was also in attendance, and it was an opportunity for him to get to know the people behind many other ODF implementations.

During the afternoon of each day, we did interoperability testing and had many informal discussions about specific technical issues.  Some of the tests were based on specific issues that people already knew about, and at other times we worked through specific scenarios that OIC TC chair Bart Hanssens had defined, as well as scenarios that attendees created.  This testing resulted in identification of varying interpretations of the spec, bugs, and other issues that can now be resolved to improve the overall state of interoperability.  I’ll not be talking about specific details of those tests, because we were asked to conduct ourselves in accordance with the Chatham House Rule and not name specific products in post-event blogging or reporting.  This policy was in place to assure that the event could be productive and results-oriented, and I’d say this worked very well – all of the implementers were open and pragmatic about working through issues that came up in testing.

There were many bloggers and Twitterers in attendance, so I expect others will post their thoughts on the event after everyone gets back home; I noticed that Floschi already has a nice summary posted.

It was a very useful event, and I’d like to give special thanks to Fabrice Mous and Michiel Leenaars, who worked tirelessly to provide a great experience for the attendees.


Posted by dmahugh | 4 Comments
Testing Office’s ODF Implementation
14 June 09 09:32 PM

In this blog post, I’m going to cover some of the details of how we approached the challenges of testing our ODF 1.1 implementation that was released in Office 2007 SP2.

Adding support for a new document format such as ODF to Office is a large and complex project.  Office has a very broad range of functionality, and we had to map that functionality to the structures defined in ODF.  This mapping then needed to be rigorously tested, in isolation and also in rich documents that reflect typical usage of various combinations of features, to assure that our generated documents are conformant to the specification and to maximize interoperability with other implementations.

High-Level Planning

When we began work on our ODF 1.1 implementation, we started by developing a set of high-level guiding principles that we would follow.  I covered those in a blog post last year, as well as a recent post that explained how we see the relationship between standards and interoperability.

After we had reached agreement on these principles, the various feature teams began designing the details.  A “feature team” here at Microsoft is made up of three groups of people: program managers (PMs), developers, and testers.  In broad, simple terms, PMs are responsible for writing down the specifications, developers are responsible for implementing those specifications, and testers are responsible for verifying that everything works as intended.  Since there was a specification for ODF in hand already, the main job of the feature team was to write down the details of how we would implement it.  In this post I’ll be focusing on the work of the testers, although inevitably that will include some discussion of the work of the PMs and developers, because the three disciplines work very closely together in an iterative manner.

Most of the people who planned and executed our ODF implementation are members of the same teams that are responsible for other aspects of the design, development and testing of the Office clients.  We created an “ODF virtual-team” that included specific individuals from each of the relevant product teams – Word, Excel, PowerPoint, and graphics, primarily – and the v-team approached the project with the same management structure and business processes that we use for other work on Office.  Attendees of the DII workshop in Redmond last summer had a chance to meet several key members of the ODF v-team, who gave presentations and participated in the roundtable discussions at that event.

In addition to these people in Redmond, we have other teams that we can call on for projects like this one, and for the testing work on our ODF implementation we pulled in people from the Office group in four countries, as well as people who worked on Office years ago but have moved on to other roles (for their expertise in older features that we wanted to verify are supported correctly in our ODF implementation).

Mapping Between ODF and Open XML

Office’s internal representation of documents is very closely aligned with the Open XML formats, so one of the first steps in planning our ODF implementation was to do detailed mapping between the Open XML structures that Office already supported, and the ODF structures that we would be saving and loading to/from in ODF 1.1 documents.

The PMs had primary responsibility for this, and they created sets of spreadsheets to capture the mappings between every ODF and Open XML element and attribute.  This mapping needed to be defined in both directions: OXML->ODF for File/Save operations, and ODF->OXML for File/Open operations.

As a simple example of how that worked, here is part of the spreadsheet for the concept of bold text, as mapped from OXML to ODF:

image

This excerpt is just a subset of what was captured in the mapping; the PMs also identified required/optional status, default values, and other information.

And here’s the converse mapping for bold text, going from ODF to OXML:

image

I’ve used a very simple example here, and yet as you can see there are many details involved.  There were thousands of details like this in the mapping spreadsheets, and collectively these spreadsheets served two roles:

  • they were the spec for the developers
  • they defined the scope of the test plan for the testers
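The spreadsheets themselves aren't reproduced in this post, so the following is only a rough sketch of how one row in each direction might be captured as data.  The w:b element and the fo:font-weight attribute come from the Open XML and ODF specifications; MappingRow, its fields, and the unmapped helper are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingRow:
    """One hypothetical row of a mapping spreadsheet."""
    source: str     # construct in the format being read
    target: str     # construct written in the other format
    note: str = ""  # free-form remarks (defaults, required/optional, ...)


# OXML -> ODF, used for File/Save.
OXML_TO_ODF = {
    "w:b": MappingRow("w:b", 'fo:font-weight="bold"', "on style:text-properties"),
}

# ODF -> OXML, used for File/Open.
ODF_TO_OXML = {
    'fo:font-weight="bold"': MappingRow('fo:font-weight="bold"', "w:b"),
    'fo:font-weight="normal"': MappingRow('fo:font-weight="normal"', 'w:b w:val="0"'),
}


def unmapped(constructs, table):
    """Return the constructs that have no row yet, i.e. gaps in the spec and test plan."""
    return sorted(c for c in constructs if c not in table)
```

Keeping the two directions in separate tables reflects the point above that the mapping is not symmetric: a helper like unmapped(["w:b", "w:i"], OXML_TO_ODF) reports which constructs still need a row before developers or testers can act on them.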

The process of creating the mapping spreadsheets is interesting unto itself, due to the many places where ODF and Open XML had different approaches or different capabilities.  I’ll cover the mapping spreadsheets in more detail in a future blog post.

Test Tools and Test Documents

Like any professional test team, the Office testers have a wide variety of tools they’ve built to help automate their work.  Here are a few examples of the tools that were used to test Office’s ODF implementation:

  • Verifying conformance to the schemas in the standard was a high priority, and we used Jing (called by an internal tool we call ODE) to validate against ODF’s RNG schemas.
  • The Excel team used an internal tool named Trippy to automate round-tripping.  They ran this tool against a test library of over 700,000 test documents, each of which was saved as an ODS file and then validated against the reference schemas.
  • The Word team used a tool called OHarness, which can be used to run the same operation on each one of a batch of files.  They used a library of over 100,000 documents, saving each one as an ODT file, logging bugs for the developers, and repeating the tests until they drove the number of non-conformant documents to zero.
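The internal tools (ODE, Trippy, OHarness) aren't public, but the schema-validation step can be approximated with Jing alone.  Here is a minimal sketch, assuming Jing is available as a jar file and that the ODF RNG schema has been downloaded locally; the function names and the list of parts are my own choices for illustration.

```python
import zipfile
from pathlib import Path

# Parts of an ODF package whose XML is described by the main ODF schema.
# (META-INF/manifest.xml is covered by a separate manifest schema.)
XML_PARTS = ("content.xml", "styles.xml", "meta.xml", "settings.xml")


def extract_xml_parts(odf_path, dest_dir):
    """Extract the schema-governed XML parts that are present in the package."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(odf_path) as package:
        names = set(package.namelist())
        for part in XML_PARTS:
            if part in names:
                target = dest / part
                target.write_bytes(package.read(part))
                extracted.append(target)
    return extracted


def jing_command(jing_jar, schema_rng, xml_files):
    """Build the Jing command line: java -jar jing.jar schema.rng file..."""
    return ["java", "-jar", str(jing_jar), str(schema_rng), *map(str, xml_files)]
```

The resulting list can be handed to subprocess.run, and any messages Jing prints identify the parts that did not validate.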

These tools, and others developed by the test teams, all work against large collections of documents.  These test documents came from a variety of sources:

  • Test documents that have been used in the past for the binary formats documentation and other purposes.
  • Real-world documents which have been given to us by customers for the purpose of helping us see how they use our products and seeing the problems they have run into.
  • Documents from test libraries created by other organizations, such as the test documents from the University of Central Florida atomic test suite and the test documents that Dialogika has created based on their work in developing the European Commission’s corporate style package for official and legislative documents.
  • Documents manually created by the testers to cover every element, attribute and attribute value defined in the ODF schemas.
  • Public documents collected from the internet.

Our libraries of test documents are dynamic and constantly growing.  As a recent example, we found that the latest Committee Draft of the ODF 1.2 specification uses styles in a way that exposed a bug in Word’s implementation.  (Rick Jelliffe has blogged about this bug.)  So we’ve added that document to our test library going forward.  (We’ve also fixed that bug and tested the fix, which will appear in a future update.)

Verifying Mapping

After the developers had written code to handle the mappings as defined in the spreadsheets (which were essentially the specs for their work), the testers got to work testing this code.

One aspect of testing was the small documents for verifying specific elements and attributes.  These were handled in an automated manner using tools such as Trippy and OHarness, as mentioned above.

Another aspect of this testing was the creation of complex “real-world documents” that contained combinations of functionality to test various scenarios that we’ve found typically occur in actual use of Word, Excel, or PowerPoint.

For example, many Excel users create spreadsheet documents that contain a large worksheet of raw data like this one:

image

… and that data is often summarized in pivot tables and/or formatted reports like these:

image

The test team would create documents like this one, then manually verify that the document could be saved as either an ODS or XLSX file without change in appearance or functionality.  In this particular case, the test team verified that a variety of details were handled the same in Open XML and ODF, including:

  • Formatting of cell content, including conditional formatting
  • Data with Autofilter on data sheet
  • PivotTable in Pivot sheet based on above data
  • Results of formula calculations
  • Data validation

Verifying Conformance

As I mentioned earlier, the product teams each have a large corpus of test documents that are used for automated testing of conformance.  Binary documents and Open XML documents are opened and then saved as ODF, and each of these documents is validated against the ODF schemas.  By analyzing the results of these tests, the testers can identify problems that need to be corrected, and then the tests are re-run.
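The shape of that loop is simple enough to sketch.  This is not the internal harness, just an illustration of the idea: validate is a hypothetical callable that saves a document as ODF, validates the result, and raises an exception when the output is non-conformant.

```python
def run_batch(paths, validate):
    """Apply validate(path) to every document; return {path: reason} for each failure."""
    failures = {}
    for path in paths:
        try:
            validate(path)
        except Exception as exc:  # every failure becomes a bug to investigate
            failures[str(path)] = f"{type(exc).__name__}: {exc}"
    return failures


def summarize(total, failures):
    """One-line status for a test run."""
    return f"{len(failures)} of {total} documents non-conformant"
```

Because the failures are keyed by path, the same batch can be re-run after each fix and the two result sets compared to confirm that the count is moving toward zero.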

The goal of this process is simple: to drive the number of non-conformant documents to zero.  We reached that goal for the Office 2007 SP2 implementation of ODF, and as of this writing I don’t know of a way to make Word, Excel or PowerPoint write a non-conformant ODF document.  It may theoretically be possible to do so – and if anyone happens to come across such a scenario please let me know – but we have verified that the hundreds of thousands of documents in our test libraries can be saved as fully conformant ODF 1.1 files from Office 2007 SP2.  By conformant, I mean fully schema-compliant and also conformant with our reading of the text of the ODF 1.1 spec.

Security Testing

When we add support for a new format, one area that requires intensive testing is security.  Does our implementation of the new format create any new security risks that need to be mitigated?  Is there any way that an ODF document can be corrupted (deliberately or accidentally) that could cause a security problem?  The test teams were responsible for answering these questions.

The key tool used for this aspect of the test plan was Distributed File Fuzzing (DFF).  The basic concept is that thousands of documents are corrupted in random ways, and these documents are opened on large numbers of PCs in a distributed environment.  Data is collected on the ways in which these corrupted files fail to open, and this data is used to verify that there are not security problems caused by bad error handlers, buffer overruns, integer overflow, or other issues.
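To make the basic concept concrete, here is a small single-machine sketch of random corruption.  It is not DFF: Python's zipfile module stands in for the application that opens the file, and the function names are made up for this example.

```python
import io
import random
import zipfile


def corrupt(data: bytes, flips: int, seed: int) -> bytes:
    """Return a copy of data with up to `flips` randomly chosen bytes overwritten."""
    if not data:
        return data
    rng = random.Random(seed)  # seeded, so any failure can be reproduced exactly
    mutated = bytearray(data)
    for _ in range(flips):
        mutated[rng.randrange(len(mutated))] = rng.randrange(256)
    return bytes(mutated)


def try_open(data: bytes) -> str:
    """Classify what happens when the stand-in parser reads the bytes."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as package:
            bad = package.testzip()  # reads every member and checks its CRC
        return "ok" if bad is None else "rejected: bad CRC"
    except Exception as exc:
        return f"rejected: {type(exc).__name__}"
```

A clean rejection is the desired outcome for a corrupted file.  The results that matter in real fuzzing are the ones this sketch cannot show, such as crashes and hangs in the application under test, which is why the seed is recorded for every mutated file.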

When issues are found in security testing, the process is the same as in the other types of testing: the testers log bugs, and the developers determine whether each problem lies in the design or the implementation.  Based on those findings we either modify the design and re-code, or correct the code.  The tests are then repeated, and this process continues until the number of open security issues reaches zero.

Testing Interoperability

The final piece of the testing puzzle is interoperability testing: verifying that documents created in Office can be opened in other implementations, and vice versa.

This type of testing is nothing new for the test teams, because we do it every time we add a feature to Office.  In the past, we focused primarily on interoperability between various versions of Office, but now that test matrix has been expanded to include the latest versions of major ODF implementations.

To verify interoperability with other ODF implementations, the test teams created documents from scratch in OpenOffice.org and Symphony, and then opened those documents in Office.  They also created documents in Office and opened them in the other implementations.

In addition to these types of simple tests, we also wanted to verify that our implementation was not dependent on details of other implementations that aren’t actually standardized in the specification.

A good example of this sort of issue is the question of how parts are named and where they’re stored in the ZIP package that comprises an ODF document.  I’ve blogged in the past about this same issue in Open XML – an implementation of the Open XML standard shouldn’t assume that the document start part is word/document.xml, just because Word happens to use that name and location.
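For the Open XML case, a consumer can locate the start part by following the package-level relationships instead of hard-coding a path.  A minimal sketch, assuming the package is an ordinary ZIP file readable with Python's standard library; find_start_part is a name I've made up for this example.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# The relationship type differs between Transitional and Strict documents.
OFFICE_DOCUMENT_TYPES = {
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument",
    "http://purl.oclc.org/ooxml/officeDocument/relationships/officeDocument",
}


def find_start_part(package: zipfile.ZipFile) -> str:
    """Return the start part's name by following the package-level relationship."""
    root = ET.fromstring(package.read("_rels/.rels"))
    for rel in root.findall(f"{{{RELS_NS}}}Relationship"):
        if rel.get("Type") in OFFICE_DOCUMENT_TYPES:
            # Targets in _rels/.rels are resolved against the package root.
            return posixpath.normpath(rel.get("Target").lstrip("/"))
    raise ValueError("no officeDocument relationship found")
```

For a document written by Word this typically returns word/document.xml, but the code makes no assumption about that name and works equally well for a producer that stores the start part somewhere else.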

In ODF, some of those details are standardized – the start part is always named content.xml, for example – but others are not.  So the testers used ODE to manually modify documents that had been created by OpenOffice.org, to change certain details such as the name of the folder containing embedded images.  They then opened these documents in Office, to verify that our implementation will be able to interoperate with implementations that have made different design decisions within the range of options that the ODF standard allows.
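A consumer can avoid depending on a particular folder name by reading the package manifest rather than looking in a fixed location.  The sketch below assumes the producer records an image media type for each image in META-INF/manifest.xml; a more robust reader would also follow the references in content.xml.  The function name is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def image_parts(package: zipfile.ZipFile) -> list:
    """List image parts by media type, wherever the producer chose to store them."""
    root = ET.fromstring(package.read("META-INF/manifest.xml"))
    found = []
    for entry in root.iter(f"{{{MANIFEST_NS}}}file-entry"):
        media_type = entry.get(f"{{{MANIFEST_NS}}}media-type", "")
        if media_type.startswith("image/"):
            found.append(entry.get(f"{{{MANIFEST_NS}}}full-path"))
    return found
```

Renaming the image folder in a test document, as the testers did, changes the paths this function returns but not whether the images are found.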

Summary

As you can see, there are many things to consider when creating and executing a test plan for support of a new document format in Office.  At an abstract level, it’s just another test plan – we design, then code, then test, with ongoing revisions to all three as needed to reach our design goals.  But the specifics of the ODF implementation test plan were geared toward the details of the ODF standard, as outlined above.

Due to the work our test teams did on the ODF 1.1 implementation in Office 2007 SP2, we are very confident that the implementation we produced adheres to the details of the design we had created, as documented on the implementer notes web site.  I realize that some people may disagree with some of the design decisions we made in our implementation, and we welcome constructive debate of those details.

I’m posting this from The Hague, where I will be attending the ODF plugfest today and tomorrow.  My colleague Peter Amstein – who led the technical work on our ODF implementation – is also here, and we’re looking forward to learning about how other implementers approach document format interoperability testing, and discussing how we can all work together on ODF interoperability going forward.

Parliament building, The Hague

Posted by dmahugh | 16 Comments
Standards-Based Interoperability
05 June 09 07:02 AM

There has been quite a bit of discussion lately in the blogosphere about various approaches to document format interoperability.  It’s great to see all of the interest in this topic, and in this post I’d like to outline how we look at interoperability and standards on the Office team.  Our approach is based on a few simple concepts:

  • Interoperability is best enabled by a multi-pronged approach based on open standards, proactive maintenance of standards, transparency of implementation, and a collaborative approach to interoperability testing.
  • Standards conformance is an important starting point, because when implementations deviate from the standard they erode its long-term value
  • Once implementers agree on the need  for conformance to the standard, interoperability can be improved through supporting activities such as shared stewardship of the standard, community engagement, transparency, and collaborative testing

It’s easy to get bogged down in the details when you start thinking through interoperability issues, so for this post I’m going to focus on a few simple diagrams that illustrate the basics of interoperability.  (These diagrams were inspired by a recent blog post by Wouter Van Vugt.)

Interoperability without Standards

First, let’s consider how software interoperability works when it is not standards-based.

Consider the various ways that four applications can share data, as shown in the diagram to the right.  There are six connections between these four applications, and each connection can be traversed in either direction, so there are 12 total types of interoperability involved.  (For example, Application A can consume a data file produced by Application B, or vice versa.)

As the number of applications increases, this complexity grows rapidly.  Double the number of applications to 8 total, and there will be 56 types of interoperability between them:

image
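The counts above are simply the number of ordered (producer, consumer) pairs, which is n × (n − 1) for n applications.  A short sketch that enumerates them, using single letters as stand-in application names:

```python
from itertools import permutations


def interop_scenarios(apps):
    """Every ordered (producer, consumer) pair of distinct applications."""
    return list(permutations(apps, 2))


four = interop_scenarios("ABCD")
eight = interop_scenarios("ABCDEFGH")
print(len(four), len(eight))  # 12 56
```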

Let’s go back to the simple case of 4 applications that need to interoperate with one another, and take a look at another factor: software bugs.  All complex software has bugs, and some bugs can present significant challenges to interoperability.  Let’s consider the case that 3 of the 4 applications have bugs that affect interoperability, as shown in the diagram to the right.

The bugs will need to be addressed when data moves between these applications.  Some bugs can present unsolvable roadblocks to interoperability, but for purposes of this discussion let’s assume that every one of these bugs has a workaround.  That is, application A can take into account the known bug in application B and either implement the same buggy behavior itself, or try to fix up the problem when working with files that it knows came from application B.

Here’s where those workarounds will need to be implemented:

image

Note the complexity of this diagram.  There are 6 connections between these 4 applications, and every one of them has a different set of workarounds for bugs along the path.  Furthermore, any given connection may have different issues when data moves in different directions, leading to 12 interoperability scenarios, every one of which presents unique challenges.  And what happens if one of the implementers fixes one of their bugs in a new release?  That effectively adds yet another node to the diagram, increasing the complexity of the overall problem.

In the real world, interoperability is almost never achieved in this way.  Standards-based interoperability is a much better approach for everyone involved, whether that standard is an open one such as ODF (IS26300) or Open XML (IS29500), or a de-facto standard set by one popular implementation.

Standards-Based Interoperability

In the world of de-facto standards, one vendor ends up becoming the “reference implementation” that everyone else works to interoperate with.  In actual practice, this de-facto standard may or may not even be written down – engineers can often achieve a high degree of interoperability simply by observing the reference implementation and working to follow it.

De-facto standards often (but not always) get written down to become public standards eventually. One simple example of this is the “Edison base” standard for screw-in light bulbs and sockets, which started as a proprietary approach but has long since been standardized by the IEC.   In fact this is a much more common way for standards to become successful than the “green field” approach in which the standard is written down first before there are any implementations.

Once a standard becomes open and public, the process for maintaining it and the way that implementers achieve interoperability with one another changes a little.

The core premise of open standards-based interoperability is this: each application implements the published standard as written, and this provides a baseline for delivering interoperability.  Standards don’t address all interoperability challenges, but the existence of a standard addresses many of the issues involved, and the other issues can be addressed through standards maintenance, transparency of implementation details, and collaborative interoperability testing.

In the standards-based scenario, the standard itself is the central mechanism for enabling interoperability between implementations:

image

This diagram is much simpler than the earlier diagram that showed 56 possible interoperability paths between 8 implementations.  With the standard at the center there are only 8 connections, and each connection only has to deal with the bugs in a single implementation.
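The difference in how the two models scale can be sketched in a few lines.  The function names are my own, and the model is deliberately simplified: it counts paths and connections only, not the effort each one requires.

```python
def pairwise_paths(n: int) -> int:
    """Directed producer/consumer paths when every application targets every other."""
    return n * (n - 1)


def standard_connections(n: int) -> int:
    """One connection per application when each targets the shared standard."""
    return n


for n in (4, 8, 16):
    print(n, pairwise_paths(n), standard_connections(n))
```

The pairwise figure grows roughly with the square of the number of applications, while the standards-based figure grows linearly.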

How this all applies to Office 2007 SP2

I covered last summer the set of guiding principles that we followed in our work to support ODF in Office 2007 SP2.  These principles were applied in a specific order, and I’d like to revisit the top two guiding principles to explain how they support the view of interoperability that I’ve covered above.

Guiding Principle #1: Adhere to the ODF 1.1 Standard

In order to achieve the level of simplicity shown in the diagram above, the standard itself must be carefully written and implementers need to agree on the importance of adhering to the published version of the standard.  That’s why we made “Adhere to the ODF 1.1 Standard” our #1 guiding principle.  This is the starting point for enabling interoperability.

Recent independent tests have found that our implementation does in fact adhere to the ODF 1.1 standard, and I hope others will continue to conduct such tests and publish the results.

Guiding Principle #2: Be Predictable

The second guiding principle we followed in our ODF implementation was “Be Predictable.”  I’ve described this concept in the past as “doing what an informed user would likely expect,” but I’d like to explain this concept in a little more detail here, because it’s a very important aspect of our approach to interoperability in general.

Being predictable is also known as the principle of least astonishment.  The basic concept is that users don’t want to be surprised by inconsistencies and quirks in the software they use, and software designers should strive to minimize or eliminate any such surprises.

There are many ways that this concept comes into play when implementing a document format such as ODF or Open XML.  One general category is mapping one set of options to a different set of options, and I used an example of this in the blog post mentioned above:

When OOXML is a superset of ODF, we usually map the OOXML-only constructs to a default ODF value. For example, ODF does not support OOXML’s doubleWave border style, so when we save as ODF we map that style to the default border style.

Our other option in this case would have been to turn the text box and the border into a picture.  That would have made the border look nearly identical when the user opened the file again, but we felt that users would have been astonished (in a bad way) when they discovered that they could no longer edit the text after saving and reopening the file.
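The "map to a default" choice amounts to a lookup with a fallback.  The table below is a hypothetical subset written for illustration, not the mapping Office actually uses, and the choice of "solid" as the default value is an assumption.

```python
# Hypothetical subset of a save-time mapping from OOXML border styles to ODF values.
OOXML_TO_ODF_BORDER = {
    "single": "solid",
    "double": "double",
    "dotted": "dotted",
    "dashed": "dashed",
}
DEFAULT_ODF_BORDER = "solid"  # assumed default, for illustration only


def map_border(ooxml_style: str) -> str:
    """Map an OOXML border style to an ODF one, falling back to the default."""
    return OOXML_TO_ODF_BORDER.get(ooxml_style, DEFAULT_ODF_BORDER)
```

With this approach map_border("doubleWave") yields the default style: the border looks different after a round trip, but the text box stays a text box and remains editable.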

What about Bugs and Deviations?


Of course, the existence of a published standard doesn’t prevent interoperability bugs from occurring.  These bugs may include deviations from the requirements of the standard.  In addition, they may include different interpretations of ambiguous sections of the standard.

The first step in addressing these sorts of real-world issues is transparency.  It’s hard to work around bugs and deviations if you’re not sure what they are, or if you have to resort to guesswork and reverse engineering to locate them.

Our approach to the transparency issue has been to document the details of our implementation through published implementer notes.  We’ve done that for our implementations of ODF 1.1 and ECMA-376, and going forward we’ll be doing the same for IS29500 and future versions of ODF when we support them.

Interoperability Testing

The final piece of the puzzle is hands-on testing, to identify areas where implementations need to be adjusted to enable reliable interoperability.

This is where the de-facto standard approach meets the public standard.  If the written standard is unclear or allows for multiple approaches to something, but all of the leading implementations have already chosen one particular approach, then it is easy for a new entrant to the field to see how to be interoperable.  If other implementers have already chosen diverging approaches, however, then it is not so clear what to do.  Standards maintainers can help a great deal in this situation by clarifying and improving the written standard, and new implementers may want to wait on implementing that particular feature of the standard until the common approach settles out.

We did a great deal of interoperability testing for our ODF implementation before  we released it, both internally and through community events such as the DII workshops.  We’ve also worked with other implementers in a 1-on-1 manner, and going forward we’ll be participating in a variety of interoperability events.  These are necessary steps in achieving the level of interoperability and predictability that customers expect these days.

In my next post, I’ll cover our testing strategy and methodology in more detail.  What else would you like to know about how Office approaches document format interoperability?

Posted by dmahugh | 20 Comments
Tracked Changes
13 May 09 09:52 PM

When I blogged about the release of SP2 with ODF support two weeks ago, I mentioned that I was planning to blog about a few of the tough decisions we faced in our SP2 implementation of ODF, such as the decision not to support tracked changes.  I’ve spent some time since then covering our approach to formulas in ODF, and now I’d like to move on to answering the question of why we aren’t supporting ODF tracked changes.

For those who just want the summary, here’s a high-level recap of what I’ll cover in more detail below:

  • Tracked changes is a very complex aspect of document format functionality; for example, the ECMA-376 specification devotes over 100 pages to describing tracked changes
  • Microsoft Word has a long history of supporting tracked changes, and this functionality is used by a large number of Word users
  • Due to its role in collaborative processes, tracked changes is often used for documents with legal, financial or technical implications that are reviewed and edited by multiple people; in such scenarios, accuracy and reliability are critical
  • ODF 1.1 has a very limited description of tracked changes, covered in only 4 pages of the specification.  ODF 1.1 does not explain how to implement change tracking for many of Word’s commonly used features, and in some cases it is not even clear if the ODF mechanism makes it possible at all.
  • As a result of these differences, we found that it is not possible to implement robust and reliable tracked changes with ODF; even very simple concepts, such as deleting a row from a table, are not supported by any existing ODF implementation of tracked changes
  • There is almost no interoperability among the various non-Microsoft implementations of ODF when it comes to tracked changes.
  • To protect our customers from losing data when using tracked changes, and to avoid making an interoperability promise that would turn out to be hollow, we made the difficult decision to not support tracked changes at all in ODF

The rest of this post will cover the details of the points summarized above.  This is a long post, and it gets a little technical in places, because change tracking is inherently a complex topic.

State of Tracked Changes Interoperability

SP2 is a new implementation of ODF, but there are many existing implementations of ODF that are already in wide use.  I’ve done an informal review of them to try to understand existing practices around the use of tracked changes in ODF documents.

Here’s what I’ve found:

If anyone knows of additional information on these implementations, or any other ODF implementation that supports tracked changes, especially if you know of one which is not derived from the OpenOffice.org source code, please let me know and I’ll update that list.

To test interoperability between current ODF implementations of tracked changes, I created a simple document with some tracked changes, saved it in ODF, and then looked at what happened when I opened that document in other ODF implementations.

So the first step is to create a test document.  Using Symphony 1.2, I followed these steps:

  • Click on “Create a new Document”
  • Insert a table (Create/Table), and put some text in each cell to identify the rows
  • Add a paragraph of text, below the table, containing two sentences
  • Add a numbered list of four items, below the paragraph

The starting point for my document looks like this:

image

Then I added some change-tracking, as follows:

  • Turn on change tracking (Edit/Revisions/Record)
  • Delete the second row from the table (right-click, Row/Delete)
  • Highlight the last sentence of the paragraph and the first two items of the numbered list, up through the (DELETE) on the second item, and delete that region

My document now looks like this in Symphony:

image

One thing you’ll notice here is that the row I deleted from the table is simply gone, with no change tracking recorded.  This is due to an inherent limitation in ODF’s approach to change tracking, which does not allow table changes to be tracked in a standardized manner.

More on that later, but first let’s see what happens when I save this document as ODF 1.1.  After I click Save, here’s what I  see:

image

Take a close look at the numbering of the list items, and you’ll see that the second list item has no numbering any longer.  Very strange.  And if I reject all changes in the document, the numbering of that item doesn’t come back – it disappeared somehow, the instant I saved my document as ODF 1.1.

I suppose some people might be tempted to suggest that I should use the latest OpenOffice.org release for this test, which came out a couple weeks ago.  I tried that, and I get similar – but not identical – strange behavior by following the steps above.

Speaking of OpenOffice.org 3.1, let’s open this saved document in that implementation of ODF.  When I do, here’s what I see:

image

At first glance, it looks like all of the changes were accepted.  But in fact, the changes are still in the document, and you must go into Edit/Changes/Show to make the tracked changes appear.

In Google Docs, we see essentially the same thing that OpenOffice.org displayed by default:

image

Google Docs automatically accepts tracked changes in ODF documents, and then uses its own entirely different approach for managing change tracking.  Google Docs uses a Revision History feature to track changes to documents; for example, here’s what I see when I click on Tools, Revision History when viewing this document in Google Docs:

image

It appears that Google Docs is pretty committed to this approach to change tracking, based on this recent exchange on the Google Docs Help Center site:

Jcuesta: We need Track Changes.  When?

Gill (Google Docs Guru): Who knows?  Given that we already have Revisions, quite possibly never.

Moving on to another ODF 1.1 implementation, AbiWord 2.6.8 (which does not support tracked changes), here’s how my test document appears:

image

AbiWord doesn’t support tracked changes, so I would have expected to either see the document with no changes at all, or with all changes accepted.  Instead, I see what appears to be a random re-arrangement of the document content.  On closer inspection, I think this is due to ODF’s approach to handling deletions, which requires that deleted content be stored at a location separate from where it was deleted.  I’ll explain that in more detail below.

So far, we have two applications that seem to agree on how to display this document (OpenOffice.org 3.1 and Google Docs), and two others that each have a different way of displaying the document.  Sounds messy, but it gets even worse if you start varying which application creates the document in the first place.

For example, I followed the same steps outlined above, but started from OpenOffice.org 3.1 instead of Symphony 1.2.  Here’s the result:

image

But if I load this OO.o-created document in Google Docs, I see something quite different from what I saw when I loaded the Symphony-created document in Google Docs.  Instead of all tracked changes being accepted, and the deleted text gone, now I see all tracked changes being ignored, and the deleted text (except for the deleted table row) is present, although the list numbering skips over the second item:

image 

So we’ve seen that none of these implementations track changes to tables, and the behavior when loading tracked-changes documents into applications other than OpenOffice.org or Symphony varies between several possibilities, including accepting changes, ignoring changes, and restoring deleted content to a different position in the document.  Furthermore, this is only a simple test that includes nothing but deletions.  If you start combining deletions and insertions in the ways that people typically do while collaborating on documents, you’ll find even more surprising behavior when those documents are opened in applications other than the one that created  them.  This is the state of ODF tracked-changes interoperability today.

The Cause of the Problem

The problems above are not just caused by bugs in these implementations.  Rather, they are the result of inadequate specification of change-tracking functionality in ODF 1.1, combined with a peculiar design decision in ODF’s approach to tracking deletions.

To get a feel for how thoroughly ODF specifies change tracking, it’s instructive to compare the size of the relevant sections of the ODF 1.1 and ECMA-376 specifications.  ECMA-376, which supports 100% of the change-tracking functionality that Word uses, devotes 121 pages to change tracking in Part 4, Section 2.13.5.  ODF 1.1, by comparison, devotes only 4 pages to change tracking, in Section 4.6.

There are many areas where we found that ODF 1.1’s approach to tracked changes couldn’t provide the functionality and reliability that our customers have come to expect.

Where to put deleted content?

When you delete content with tracked changes on, the content remains in the document, marked as deleted by a particular user on a particular date/time.  But where in the document?  The answer is different for Open XML and ODF.

Let’s look at a simple example, and see how the two formats handle the deleted text.  Here’s the example we’ll use, a single sentence with a word deleted from it:

image

First let’s look at how Open XML handles this deletion.  Here’s the ECMA-376 markup that Word 2007 writes out for this sentence:

image

You can see that the deleted text is inline, right where it was before it was deleted, surrounded by a delText tag.
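Since the markup above survives only as a screen shot, here is a minimal Python sketch of the same idea.  The XML fragment is a hand-written illustration, not the markup from the image: the sentence, author and date are placeholders, and only the w:del and w:delText elements come from ECMA-376.  Because the deleted run stays in place, accepting or rejecting the change comes down to whether a reader skips or keeps the delText content.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hand-written illustrative fragment (placeholder text, author and date).
SAMPLE = f"""
<w:p xmlns:w="{W}">
  <w:r><w:t xml:space="preserve">This sentence has a </w:t></w:r>
  <w:del w:id="1" w:author="Example Author" w:date="2009-05-13T00:00:00Z">
    <w:r><w:delText xml:space="preserve">deleted </w:delText></w:r>
  </w:del>
  <w:r><w:t>word in it.</w:t></w:r>
</w:p>
"""


def paragraph_text(p, accept_changes):
    """Return the paragraph text with tracked deletions accepted or rejected."""
    wanted = {f"{{{W}}}t"}
    if not accept_changes:
        # Rejecting a deletion just means reading delText where it already sits.
        wanted.add(f"{{{W}}}delText")
    return "".join(el.text or "" for el in p.iter() if el.tag in wanted)


para = ET.fromstring(SAMPLE)
accepted = paragraph_text(para, accept_changes=True)
rejected = paragraph_text(para, accept_changes=False)
print(accepted)
print(rejected)
```

No content has to be moved or unwrapped in either direction, which is the property the rest of this section relies on.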

Now let’s look at the ODF markup that OpenOffice.org 3.1 writes for this deletion:

image

In this case, the deleted word does not appear inline.  Rather, there is a text:change element inline, with an ID of ct205721376.  Within the text:tracked-changes element (which occurs earlier in the body of the document), you can see where ID ct205721376 is defined as being a deletion by Doug Mahugh, containing the word deletion inside a text:p element.

There are two problems with this approach: one problem for implementations that don’t support tracked changes, and one problem for implementations that do support tracked changes.

To see the problem for implementations that don’t support tracked changes, refer above to the AbiWord screen shot.  AbiWord doesn’t know about tracked changes, but it does know about paragraphs (text:p elements), so it displays every paragraph it finds in the document, in the order that it finds them.  Since the deleted “paragraphs” appear first in the markup, they appear first in the displayed document.
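The effect can be reproduced with a few lines of Python.  This is a sketch under stated assumptions: the fragment is a hand-written simplification of the ODF structure described above (placeholder ID, author and date, not the markup from the screen shot), and naive_paragraphs is a hypothetical stand-in for any reader that understands text:p but not change tracking.  It is not AbiWord's actual code.

```python
import xml.etree.ElementTree as ET

TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
DC = "http://purl.org/dc/elements/1.1/"

# Hand-written illustrative fragment; the deleted word lives in
# text:tracked-changes, ahead of the paragraph it was deleted from.
SAMPLE = f"""
<office:text xmlns:office="{OFFICE}" xmlns:text="{TEXT}" xmlns:dc="{DC}">
  <text:tracked-changes>
    <text:changed-region text:id="ct1">
      <text:deletion>
        <office:change-info>
          <dc:creator>Example Author</dc:creator>
          <dc:date>2009-05-13T00:00:00</dc:date>
        </office:change-info>
        <text:p>deleted </text:p>
      </text:deletion>
    </text:changed-region>
  </text:tracked-changes>
  <text:p>This sentence has a <text:change text:change-id="ct1"/>word in it.</text:p>
</office:text>
"""


def naive_paragraphs(root):
    """Collect every text:p in document order, ignoring change tracking."""
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT}}}p")]


paragraphs = naive_paragraphs(ET.fromstring(SAMPLE))
for line in paragraphs:
    print(repr(line))
```

The deleted word comes out as its own "paragraph" ahead of the sentence it was removed from, which matches the kind of re-arrangement seen in the screen shot.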

I put paragraphs in quotes there for a reason: in the simple example we’re looking at here, I did not delete a paragraph, I deleted a word from inside a paragraph.  So why is the deleted text wrapped inside a paragraph element?

The answer is that the ODF spec requires deleted content (as contained in a text:deletion element) to be schema-compliant, regardless of whether the deleted region was a well-formed element or (as in this case) merely a fragment within some other structure, such as a word within a paragraph.

This is the source of the problem I alluded to above, for implementers who choose to support ODF tracked changes.  Each implementer must decide how to synthesize markup to make each piece of deleted content into well-formed XML, and then later – when it comes time to accept or reject the change – each implementer must make decisions about how to distinguish between the synthesized packaging and the deleted content itself.

Unfortunately, the ODF specification doesn’t provide much guidance on this complex topic.  Here’s the guidance provided in ODF 1.1 (Section 4.6.4 Deletion):

To reconstruct the text before the deletion took place, do:

  • If the change mark is inside a paragraph, insert the text content of the <text:deletion> element as if the beginning <text:p> and final </text:p> tags were missing.
  • If the change mark is inside a header, proceed as above, except adapt the end tags to match their new counterparts.
  • Otherwise, simply copy the text content of the <text:deletion> element in place of the change mark.

This guidance works for very simple cases, but does not allow for complex situations such as deleting part of a table, as described below.  A specific implementer may come up with an approach that works within their application, but since the spec doesn’t say how to synthesize the markup for the shim, what shows up as a deletion in one application might show up as a different deletion, or not deleted at all, in a different application.
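To make the first rule concrete, here is a minimal Python sketch of it, assuming the simplest possible case: a text-only deletion whose change mark sits inside a top-level paragraph.  The fragment and the reject_deletions helper are hypothetical illustrations, not code from any ODF implementation, and the sketch makes no attempt to handle headers, lists or tables.

```python
import xml.etree.ElementTree as ET

TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
DC = "http://purl.org/dc/elements/1.1/"

# Hand-written illustrative fragment (placeholder ID, author and date).
SAMPLE = f"""
<office:text xmlns:office="{OFFICE}" xmlns:text="{TEXT}" xmlns:dc="{DC}">
  <text:tracked-changes>
    <text:changed-region text:id="ct1">
      <text:deletion>
        <office:change-info>
          <dc:creator>Example Author</dc:creator>
          <dc:date>2009-05-13T00:00:00</dc:date>
        </office:change-info>
        <text:p>deleted </text:p>
      </text:deletion>
    </text:changed-region>
  </text:tracked-changes>
  <text:p>This sentence has a <text:change text:change-id="ct1"/>word in it.</text:p>
</office:text>
"""


def reject_deletions(root):
    """Rebuild each top-level paragraph as it read before the deletions."""
    regions = {
        region.get(f"{{{TEXT}}}id"): region
        for region in root.iter(f"{{{TEXT}}}changed-region")
    }
    restored = []
    for p in root.findall(f"{{{TEXT}}}p"):
        parts = [p.text or ""]
        for child in p:
            if child.tag == f"{{{TEXT}}}change":
                region = regions[child.get(f"{{{TEXT}}}change-id")]
                deletion = region.find(f"{{{TEXT}}}deletion")
                # Rule 1: splice in the content as if the wrapping
                # <text:p> ... </text:p> tags were missing.
                for wrapper in deletion.findall(f"{{{TEXT}}}p"):
                    parts.append("".join(wrapper.itertext()))
            else:
                parts.append("".join(child.itertext()))
            parts.append(child.tail or "")
        restored.append("".join(parts))
    return restored


restored = reject_deletions(ET.fromstring(SAMPLE))
print(restored)
```

Even in this tiny case the reader has to assume that the text:p wrapper is packaging rather than content.  Once the deleted region spans several paragraphs, list items or table cells, that assumption no longer has an obvious answer.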

The approach used by ECMA-376, as shown in the example above, keeps the deleted text inline where it was deleted, thus eliminating all of these issues.  There is no extra synthesized markup added when a deletion is saved, and therefore implementers don’t need to make decisions about how or whether to remove that markup when it comes time to accept or reject the changes.

Changes to Tables

The ODF 1.1 specification says (in section 8.11) that “Change tracking of tables is not supported for text documents.”

And indeed, no existing ODF implementation that I’m aware of attempts to track changes to tables, such as adding or deleting rows or cells, modifying table properties or grid layout, and so on.  Looking at Section 4.6, it’s easy to see why this is so: there is no information provided about how to track table changes, and it’s not at all obvious how one would do so within the current mechanism.

Deleted sections of tables would be especially problematic in ODF, because of the need to create a shim to make the relocated deleted content schema-valid.  The ODF spec provides some guidance on how to revert deleted paragraph content (as quoted above), but for tables, there is no such guidance.

So if a row of a table is deleted, what should an implementer do?  Store in <text:tracked-changes> a table with one row inside the deleted-content section?  And how would another implementation know whether that indicates a deleted row of a table, or a deleted one-row table?

In the ECMA-376 specification, on the other hand, there are defined mechanisms for tracking changes to tables.  As one example, consider the simple act of deleting a row from a table while change-tracking is turned on.  In ODF, that row is simply gone, and reverting your tracked changes later will not recover the row.  But in Open XML, the <del> element can be applied to a table row, and as stated in Section 2.13.15.4, “This element specifies that the parent table row shall be treated as a deleted row whose deletion has been tracked as a revision. This setting shall not imply any revision state about the table cells in this row or their contents (which must be revision marked independently), and shall only affect the table row itself.”
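As a rough illustration, here is a Python sketch that reads a three-row table in which the second row has been deleted with tracking on.  The fragment is hand-written with placeholder text, author and dates.  Placing the w:del marker inside the row's w:trPr properties is my reading of the schema and should be checked against the specification.  The row's text is marked as deleted separately, in line with the quoted requirement that cell contents be revision marked independently.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hand-written illustrative table (placeholder text, author and dates).
SAMPLE = f"""
<w:tbl xmlns:w="{W}">
  <w:tr>
    <w:tc><w:p><w:r><w:t>Row 1</w:t></w:r></w:p></w:tc>
  </w:tr>
  <w:tr>
    <w:trPr>
      <w:del w:id="2" w:author="Example Author" w:date="2009-05-13T00:00:00Z"/>
    </w:trPr>
    <w:tc><w:p>
      <w:del w:id="3" w:author="Example Author" w:date="2009-05-13T00:00:00Z">
        <w:r><w:delText>Row 2</w:delText></w:r>
      </w:del>
    </w:p></w:tc>
  </w:tr>
  <w:tr>
    <w:tc><w:p><w:r><w:t>Row 3</w:t></w:r></w:p></w:tc>
  </w:tr>
</w:tbl>
"""


def row_is_deleted(row):
    """True if the row itself carries a tracked-deletion marker."""
    return row.find("w:trPr/w:del", NS) is not None


def row_label(row):
    """Concatenate live and deleted text so every row can be identified."""
    text_tags = (f"{{{W}}}t", f"{{{W}}}delText")
    return "".join(el.text or "" for el in row.iter() if el.tag in text_tags)


table = ET.fromstring(SAMPLE)
rows = [(row_label(r), row_is_deleted(r)) for r in table.findall("w:tr", NS)]
print(rows)
```

The deleted row is still in the table, in its original position, so rejecting the change only requires removing the markers.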

Format Changes

Tracking changes also entails tracking changes to document formatting properties.

ECMA-376 has many elements dedicated to tracking formatting changes, including pPrChange, rPrChange, sectPrChange, tblPrChange, tblPrExChange, tcPrChange, and trPrChange.  These elements are described over 17 pages (pages 1015-1032 of Part 4).

ODF 1.1, on the other hand, has a single format-change element, which is documented as follows in Section 4.6.5, Format Change:

A format change element represents any change in formatting attributes. The region where the change took place is marked by a change start and a change end element.

Note: A format change element does not contain the actual changes that took place.

Much was made during the IS29500 standards process of the difference in the size of the ODF and Open XML specifications.  This is a good example of where that difference comes from: in this case, a concept glossed over in three vague sentences of the ODF spec gets 17 pages of documentation in the Open XML spec.

Summary

This has been a long blog post, but I wanted to make sure that people understand why we made the difficult decision to not support tracked changes in our Office 2007 SP2 implementation of ODF.

When you load an ODF document containing tracked changes into Word 2007 SP2, all existing changes will be accepted, and you will not be able to save any further tracked changes in the document unless you save as DOCX.  This is an inconvenience, but a necessary one to protect users from unexpected surprises in the various scenarios outlined above.  Keep in mind that you can still use Word’s document compare feature to compare a previous version of an ODT file to a newer version, in order to see what changed.

Finally, there are a few questions that I anticipate some people may ask, so I’d like to address those here …

Couldn’t you have at least supported tracked changes for simple cases, as OpenOffice.org does?

Change tracking that handles “some” or even “most” of the changes a user makes would be extremely risky to use, because the user may be surprised to discover later that certain types of changes were not being tracked.  We’ve learned through clear feedback we get from our customers that a feature which works “most of the time” can be worse than no feature at all.  Users count on accurate, reliable change tracking for managing updates to their critical business documents.

We really wanted to make change tracking work for our ODF implementation in Office 2007 SP2. I’ve spoken to some of the developers on the Word team, who wrote a lot of code for this and really tried to solve the problems. But ultimately our test team pointed out that the feature was just not “ship quality” and there was no good way to make it better without extending ODF - which our first principle of Adhere to the ODF 1.1 standard told us not to do.

Will change tracking be improved in ODF 1.2?

Unfortunately, it doesn’t look like it.  The current draft of ODF 1.2 contains no additions to Section 4.6 of ODF 1.1 (which is Section 4.5 in ODF 1.2 due to renumbering).  The only change is that the examples have been removed from the section.

Why didn’t Microsoft work to get this fixed in the ODF TC?

We joined the OASIS ODF TC last June, and we started slowly because some people have stated concerns about Microsoft having too much influence on ODF’s direction.  The first proposal we made was a very simple proposal to add two optional attributes to indicate maximum grid size for spreadsheet applications, which would have addressed a specific real-world interoperability problem we encountered with a major ODF implementation.  Other TC members argued against this proposal, and after several such exchanges we decided not to push the matter.

We then continued submitting proposed solutions to specific interoperability issues, and by the time proposals for ODF 1.2 were cut off in December, we had submitted 15 proposals for consideration.  The TC voted on what to include in version 1.2, and none of the proposals we had submitted made it into ODF 1.2.

We look forward  to contributing more to the ODF TC in the future, and we would welcome the opportunity to work with other TC members to improve ODF’s ability to handle tracked changes.

Posted by dmahugh | 33 Comments
1 + 2 = 1?
09 May 09 11:26 PM

Does 1 plus 2 equal 3?   After last week’s sometimes acrimonious discussion about formulas in ODF, you may be glad to hear that IBM and Microsoft appear to agree on the answer to this simple question.  But OpenOffice.org is not so certain – maybe the answer is just 1 sometimes – and the question itself turns out not to be so simple after all.  Let me explain.

The State of ODF Formula Interoperability Today

What is the current reality of ODF formula interoperability?  Understanding the status of the ODF ecosystem will help clarify the set of issues and options that we faced when making the tough decisions we had to make about how to best support formulas in ODF spreadsheets.

For this example, I’ll use the latest released versions of two well-known ODF implementations: IBM Lotus Symphony (version 1.2, download here) and OpenOffice.org (version 3.1, download here).  I want to talk about current reality, so I’m not using any outdated versions of software (the OO build I’m using, for example, was released in the last week).  I also stayed away from unreleased or private beta versions that might become available sometime in the future, and I used the default settings for each application.

First, I fired up Symphony 1.2, and followed these steps:

  • Enter a numeric value of 1 in cell A1.
  • Format cell A2 as text, right-justified, then enter a 2 in that cell.
  • In cell A3, enter the formula =A1+A2.

In Symphony 1.2, here’s what I see:

image

After saving this spreadsheet as an ODS file, I open it in OpenOffice.org 3.1 and see this:

image

Clearly this is a problem.  The exact same data, in the exact same spreadsheet, when operated on with the exact same formula, provides different results.

Some might be tempted to say that formatting a cell as text and then using it in a calculation is dumb.  And I’d agree that there are few people who ever do such a thing intentionally.  But in a large complex spreadsheet, with thousands of cells involved in complex calculations, it’s easy to make mistakes like this.  In fact, if you’ve spent any amount of time at all creating complex spreadsheets, I’ll bet that on more than one occasion you’ve wasted a bunch of time trying to debug a problem that turned out to be caused by such mistakes; I know I sure have.

Similar issues arise with boolean values – what does it mean to “sum” a column of cells that includes both numeric values and boolean values?  Not all spreadsheet implementations agree on the answer to that question, either. This can create interactions between formatting and calculating – change the format of some cells, and the totals change in your spreadsheet.  Most users find such behavior very confusing, to say the least.

One of the most interesting things I found in my testing of these two implementations was that although they write different markup for formulas, the exact same interoperability problem occurs regardless of which application is used to create the spreadsheet.

If you create the spreadsheet in Symphony 1.2, as I did, the table:table-cell element has a table:formula attribute with a value of "=[.A1]+[.A2]".  And this formula will yield a result of 3 in Symphony and 1 in OpenOffice.org, as described above.

If instead you create the same spreadsheet in OpenOffice.org 3.1, when you open it in Symphony 1.2 you'll see of:=A1+A2 in cell A3.  But after you manually correct the formula, this spreadsheet, too, will yield a result of 3 in Symphony and 1 in OpenOffice.org.

So these two ODF implementations do not have predictable formula interoperability, regardless of where you start.  And these are not obscure implementations – they are the latest released versions of the implementations from IBM and Sun, the two companies that together chair the ODF TC.  Even if both companies released fixes tomorrow, there will still be many copies of the current non-interoperable versions of these applications in use for a long time to come.  This is the state of formula interoperability among ODF spreadsheets today.

Fixing the Problem

This difference in behavior is a well-known issue among those who work with spreadsheet formulas.  As Rob Weir said three years ago “Automatic string conversions considered dangerous. They are the GOTO statements of spreadsheets.”  (One of the ODF TC members even has that line in his email auto-signature.)

How to manage string conversions is far from the only problem with spreadsheet interoperability across vendors (and even across versions of the same product in some cases). The current draft OpenFormula specification contains 254 notes (by my count) about other issues similar to this one.

The OpenFormula sub-committee of the ODF TC has worked hard to address this.  Here is an excerpt from the draft OpenFormula specification (emphasis added):

6.2.4 Conversion to Number

If the expected type is Number, then if value is of type:

  • Number, return it.
  • Logical, return 0 if FALSE, 1 if TRUE.
  • Text: The specific conversion is implementation-defined; an application may return 0, an error value, or the results of its attempt to convert the text value to a number (and fall back to 0 or error if it fails to do so). Applications may apply VALUE() or some other function to do this conversion, should they choose to do so. Conversion depends on the actual locale the application runs in, especially if group or decimal separators are involved. Note that portable spreadsheet files cannot depend on any particular conversion, and shall avoid implicit conversions from text to number.
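The practical effect of that “implementation-defined” clause can be sketched in a few lines of Python.  The policy names ("convert", "zero", "error") and the to_number and add_cells helpers are hypothetical labels for the three options the draft allows, and locale handling is ignored entirely.  The "zero" policy is simply one behavior that would yield the 1 seen in OpenOffice.org 3.1; I have not confirmed that this is how it arrives at that result.

```python
def to_number(value, text_policy="convert"):
    """Convert a cell value to a number under a chosen text policy."""
    # Logical is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return 1 if value else 0
    if isinstance(value, (int, float)):
        return value
    if isinstance(value, str):
        if text_policy == "zero":
            return 0
        if text_policy == "convert":
            try:
                return float(value)
            except ValueError:
                return 0  # fall back to 0 when the text is not numeric
        if text_policy == "error":
            raise ValueError(f"text value {value!r} used as a number")
        raise ValueError(f"unknown text policy: {text_policy}")
    raise TypeError(f"unsupported cell type: {type(value).__name__}")


def add_cells(a, b, text_policy):
    """Evaluate =A1+A2 for two cell values under one conversion policy."""
    return to_number(a, text_policy) + to_number(b, text_policy)


a1 = 1      # numeric cell
a2 = "2"    # cell formatted as text
converted = add_cells(a1, a2, "convert")
zeroed = add_cells(a1, a2, "zero")
print(converted, zeroed)
```

All three policies are allowed by the quoted text, which is why the draft goes on to say that portable spreadsheet files shall avoid implicit text-to-number conversions altogether.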

After OpenFormula is approved and published, this approach, with its explicitly defined concept of “portable spreadsheet files,” will allow more predictable and consistent interoperability for ODF spreadsheet users.

But in the current environment, with no standardization of formula markup across major ODF implementations, users who want to avoid interoperability problems need to stick to a very conservative strategy.  As Burton Group analyst Guy Creese said last week:

“… this in-between time (between the OpenOffice.org de facto standard and the wait for the officially approved 1.2 standard) means there isn't one way to handle this problem. The vendors would like you to believe that there is (their way), but in reality there isn't. Ultimately, this will resolve itself over time. ODF 1.2 will be approved, and there will finally be an approved standard that everyone--IBM, Microsoft, Sun (Sun/Oracle)--can follow.

Until then, if an enterprise does want to use ODF, the best strategy is to stick with one productivity suite as a way to avoid these interoperability problems. That way, even if formula support is idiosyncratic, it at least will be consistent within the enterprise.”

How Excel 2007 SP2 Handles ODF Formulas

The question of how to handle formulas in SP2’s ODF implementation was one of the tough decisions we faced in our ODF implementation.  We had made conformance to the ODF 1.1 specification a top priority, and yet the spec doesn’t specify a formula language. 

It seemed clear to us that we couldn’t simply omit the  namespace, as the current version of Symphony does.  That would be in violation of Section 8.1.3 of the ODF specification, where it says “Every formula should begin with a namespace prefix specifying the syntax and semantics used within the formula.”

What about using the same of: namespace that OpenOffice.org 3.1 uses?  We saw a couple of pretty serious problems with that approach as well:

  • It would not be interoperable with some existing implementations, such as the widely  used current version of IBM Lotus Symphony.
  • It is based on a draft specification that has not been finalized or approved as a standard, and therefore could still change.

What about using the oooc: namespace that OpenOffice.org 3.1 writes when you choose its ODF 1.1 compatibility mode? That syntax is on its way out for everyone, and we saw no point creating yet another new implementation of something that is clearly going to be deprecated soon.  And it doesn’t solve the problem: OpenOffice.org 3.1 writes the oooc: namespace prefix in its ODF 1.0/1.1 compatibility mode, and those spreadsheets still can yield different results in OpenOffice.org and Symphony.

After a robust internal debate on the topic, it became clear what we needed to do to apply the first two of our five prioritized guiding principles for Office’s ODF implementation:

  • Adhere to the ODF 1.1 standard
  • Be Predictable
  • Preserve User Intent
  • Preserve Editability
  • Preserve Visual Fidelity

As we discussed in several DII workshops starting back in July of 2008 (with multiple ODF implementers and multiple ODF TC members in attendance), these guiding principles are in priority order. When we could not achieve them all, we chose the top ones first.

To adhere to the ODF 1.1 standard, we begin formulas with “a namespace prefix specifying the syntax and semantics used within the formula.”  Excel 2007 SP2 uses an msoxl prefix and writes the formula attribute like this:

table:formula="msoxl:=A1+A2"

That fulfills our goal of adhering to the standard since ISO/IEC 29500 defines both the syntax and semantics of this namespace.  Then, to provide a predictable user experience across all spreadsheets, we elected to support this namespace, and only this namespace.

If I move my spreadsheet from one application to another, and then discover I can’t recalculate it any longer, that is certainly disappointing.  But the behavior is predictable: nothing recalculates, and no erroneous results are created.

But what if I move my spreadsheet and everything looks fine at first, and I can recalculate my totals, but only much later do I discover that the results are completely different than the results I got in the first application?

That will most definitely not be a predictable experience.  In fact, that sort of variation in spreadsheet behavior can have serious consequences for some users.  Our customers expect and require accurate, predictable results, and so do we. That’s why we put so much time, money and effort into working through these difficult issues.

What Does Excel 2007 SP2 Do With the Example Above?

The answer is that we agree with IBM: 1 + 2 = 3.

Excel does the same thing Symphony 1.2 does, converting the text “2” to a numeric 2 and using that value in the calculation, so that the total is 3.  Excel does this because this type of automatic conversion – which has been a popular Excel feature for a very long time – is allowed by the semantics of the formula markup language Excel uses.

The formula markup that Excel uses is based on the formula language defined in ECMA-376 and ISO/IEC 29500, and here’s what it says about type conversion in Section 18.17.2.6 (Types and Values) of Part 1 of IS29500:

An implementation is permitted to provide an implicit conversion from string-constant to number. However, the rules by which such conversions take place are implementation-defined. [Example: An implementation might choose to accept "123"+10 by converting the string "123" to the number 123. Such conversions might be locale-specific in that a string-constant such as "10,56" might be converted to 10.56 in some locales, but not in others, depending on the radix point character. end example]

Excel’s approach to formulas in ODF, as well as our approach to other difficult issues, is completely public and fully documented in the implementer notes for SP2.  As the note for this issue explains:

The standard defines the attribute table:formula, contained within the element <table:table-cell>, contained within the parent element <office:spreadsheet table:table-row>

This attribute is supported in core Excel 2007.

1. When saving the Table:Formula attribute, Excel precedes its formula syntax with the "msoxl" namespace.
2. When loading the attribute Table:formula, Excel first looks at the namespace. If the namespace is "msoxl", Excel will load the value of Table:formula as a formula in Excel.
3. When loading the Table:formula attribute, if the namespace is missing or unknown, the Table:formula attribute is not loaded, and the value "Office:value" is used instead. If the result of the formula is an error, the element <text:p> will be loaded and mapped to an Error data type in Excel. Error types not supported by Excel are mapped to #VALUE!
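The load rules in that note can be sketched in Python as follows.  This is only an illustration of the described behavior, not Excel's code: split_formula and load_cell are hypothetical helpers, the sketch assumes that a prefix is whatever precedes the first ":=" at the start of the attribute value, and the error-mapping part of the note is left out.

```python
import re

# Assumption: a namespace prefix is a name followed by ":" and then "=".
_PREFIX = re.compile(r"^([A-Za-z_][\w.-]*):(?==)")


def split_formula(attr):
    """Split a table:formula value into (prefix, formula).

    The prefix is None when the value has no namespace prefix.
    """
    match = _PREFIX.match(attr)
    if match:
        return match.group(1), attr[match.end():]
    return None, attr


def load_cell(formula_attr, cached_value):
    """Keep the formula only for the msoxl namespace, else the cached value."""
    prefix, formula = split_formula(formula_attr)
    if prefix == "msoxl":
        return ("formula", formula)
    # Missing or unknown namespace: ignore the formula, keep office:value.
    return ("value", cached_value)


excel_cell = load_cell("msoxl:=A1+A2", 3)
ooo_cell = load_cell("of:=A1+A2", 3)
symphony_cell = load_cell("=[.A1]+[.A2]", 3)
print(excel_cell, ooo_cell, symphony_cell)
```

The three sample attribute values are the forms mentioned earlier in this post: the msoxl prefix written by Excel 2007 SP2, the of: prefix written by OpenOffice.org 3.1, and the unprefixed form written by Symphony 1.2.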

The Question of Syntax

I’d like to also address the issue of cell reference syntax in the ODF 1.1 specification, since that was also a topic of much discussion on several blogs last week.  I’ll start with some quick background for those who don’t wallow in standards documents for a living.

The English language is inherently ambiguous, and great literature sometimes uses that ambiguity to good effect.  Words can have more than one meaning, and verb phrases might be intended to go with one noun or with another, as in famously ambiguous job references like “You will be very fortunate to get this person to work for you.”

Writers of technical standards like to use rules and procedures that are designed to avoid this sort of problem.  These rules, which place requirements on the  use of words like should, shall, must and may,  tend to result in a stilted writing style which gets tedious fast, but reduces the need to agree on what is “obvious” or “implied” when interpreting the meaning of the text later.

A standards document is said to contain both normative language and informative language.   The things you must do to comply with a standard are supposed to be in the normative part,  and things like examples and introductions are informative.   So that everyone can be  sure about which parts are which,  the normative parts use specific phrases like “shall” and “shall not” to clearly label the things the standard actually requires you to do.

So the debate about Excel 2007 SP2’s cell reference syntax comes down to whether the few sentences in the ODF 1.1 spec which cover this were meant to be informative or normative.  The section of ODF 1.1  in question does not use the words shall or must.   It introduces the topic with the phrases “typically” and “can include”.   In our reading of it,  this language makes that part of the specification informative, stating no requirements for implementers.

The ODF 1.1 spec is casual about applying the rules of normative language, and as a result ODF 1.1 has more than its share of ambiguity.  The ODF 1.2 draft, however, is already much improved in this regard, mainly through the great work of ODF editor Patrick Durusau.  The OpenFormula draft specification is extremely careful in its use of normative language, and that will help implementers a great deal when they sit down to write their software.

When Will Office Support OpenFormula?

This question has come up on some blogs, so I’d like to address it here as well.

The real question is “when will Office support ODF 1.2,” since OpenFormula is simply a part of the ODF 1.2 specification.  And the answer is that we don’t know yet, because nobody knows yet when ODF 1.2 will be published as an OASIS or ISO standard.  As I said in the previous post, “we will look closely at Open Formula when it becomes a standard and make a decision then about how to best proceed.”  (It looks like IBM has committed to supporting ODF 1.2 and OpenFormula in late 2010.)

In the meantime, if you want to use Excel 2007 SP2 to edit documents that contain formulas from OpenOffice.org or Symphony, and preserve those formulas through editing sessions, and you understand the risk that the results might not be the same, you have a couple of free options.

The Open XML / ODF Translator Add-Ins for Office can be used with Office 2007 SP2 and, as covered on the translator team blog, support a variety of formula namespaces.

The Sun ODF Plugin provides yet another option, and apparently works with SP2.

Posted by dmahugh | 58 Comments