I was honored to be invited to present a short 45-minute talk about interoperability at the OpenOffice.org conference in Budapest, Hungary on September 2nd 2010. This gave me the opportunity to catch a good number of other sessions during the course of the event, and I would like to share some general impressions:

Interoperability for OpenOffice.org appears to have a strong focus on the Open Document file format (ODF). I heard fewer mentions of the ISO 29500 standard and there were no specific sessions on Office Open XML that I participated in (unless you consider mine to be a partial contribution).

Although the conference was called the OpenOffice.org conference, other office implementers participated as well, e.g. KOffice and its offspring FreOffice (and of course, as I already mentioned, Microsoft). Having a broader spectrum of technologies represented at the conference made things so much more interesting for me (from an interoperability perspective) and I applaud the organizers for “opening up” and inviting these additional parties to participate in the conference.

Session highlights and notes (in chronological order):

Day 1 - September 1st

ODF Development - Adopting the train model, Michael Brauer

Michael discussed the “big picture” of how ODF could evolve using an analogy of trains, taking different stops on sometimes diverging but eventually converging tracks (merging, not crashing the trains as a result, of course …). Trust me, it’s all clear and simple if you don’t try to reconstruct it in your head using this description but download and take a look at Michael’s presentation. It was very interesting to consider how we might have to deal with more frequent “minor” releases of ODF in the future and how to reconcile those with less frequent ISO publications. One question that came up towards the end was about the possibility to start on newer OASIS versions (“train leaving the station”) before final publication of a pending ISO submission. Michael indicated that he thought that this had to be a sequential process, but based on the question/comment from the audience I didn’t understand if this already is the final word on the matter (of course the implications for Microsoft and our commitment to ODF are significant, as we plan for support of newer ISO standards in our office products).

ODFDOM – the next generation ODF library, Svante Schubert

Fantastic presentation by Svante Schubert, who is aiming to provide through ODFDOM a “low level” DOM as well as a higher level automation approach for programmatic document creation and manipulation that eventually could be used as an abstraction layer to, among other things, further interoperability goals (the idea being that if you take a dependency on a more abstract method call, that this method could then be mapped in turn to different underlying document models and formats, e.g. ODF, HTML or Open XML). Svante is a great presenter and has a lot of experience in this area, so his concept looked quite polished to me (on slides, at least).  During the Q&A, Rob Weir made an insightful remark about the need to consider a tool like TidyHTML for ODF to clean up broken markup from unknown and possibly malformed sources. Using ODFDOM as a filter might be one way to transform these documents and make them standards compliant. Svante acknowledged that currently in ODFDOM not all cases of non-compliant documents are being caught (in my opinion quite understandable especially in case of low-level direct DOM manipulations) but he mentioned that measures to ensure compliance as per the standards specification are being worked on. Great session, very informative and good to keep an eye on the future development of ODFDOM.

OOo Accessibility – past, present and future, Malte Timmermann, Bing Yin

This session was about cross-platform accessibility technologies for OpenOffice.org, with a special focus on IBM’s IAccessible 2 implementation for Lotus Symphony that IBM intends to contribute back to the main OpenOffice.org project. Clearly, any improvement in the area of accessibility is at least as important as any other aspect of implementing frequently used applications such as an office suite, so I enjoyed this session a lot (if this had been a conference with evaluation sheets, I would have given the session good scores). One aspect that wasn’t discussed in the session was the relationship between IAccessible 2 (IA2) and Microsoft’s newer UI Automation technology; this was initially puzzling to me, as there was some (rightful) criticism of the older MSAA standard (which dates back to 1997) that was chosen to serve as the underlying foundation for IA2 (effectively giving people one more reason to continue using MSAA rather than moving away from it). Ming Fei Jia of IBM was kind enough to later explain to me that he sees one of the main differences between IA2 and UI Automation in the way the document object model of an application like an office editor or a web browser can be exposed through IA2. So IA2 has merit, but I need to do my homework to better understand why IA2 prefers MSAA over UI Automation and to compare the high-level architecture of UI Automation and IA2 to understand differences and synergies. Maybe this could become another future opportunity for key industry technology providers to work together on unifying our approaches and thereby extending the reach and quality of assistive technologies (= everybody wins!). Great topic and great presenters!

lpOD: The coming of age of the smart platform for ODF documents, Charles H. Schulz, Luis Belmar

This presentation demonstrated (through many live demos) a package for document automation called lpOD, with examples like concatenation of presentation files, document creation from Wiki pages and bulk stylesheet transformations. lpOD has support for multiple scripting languages including Python and Ruby. My first thought was about trying to duplicate some of the examples using PowerShell and the Open XML SDK :-) The presenters were very passionate (Luis is a script wizard) and this is clearly built with real world examples in mind. It was deservedly featured in multiple sessions and promotes server-side document manipulation over old-school macro-based office programmability; I absolutely agree with this premise. Everybody was having a good time, as far as I could tell. Great progress for such a relatively young project!

ODF Backward Compatibility Analysis and Development in Symphony 3.0, Ming Fei Jia

Ming Fei is an amazingly insightful and open person! He did a fantastic analysis of the challenges and opportunities in evolving ODF based document editors, using the example of IBM Lotus Symphony. Among the examples discussed were

·         Different computational formulas for list indentation

·         No default values, affecting e.g. the rendering of

o   Text wrap in custom shape

o   Fill color in custom shape

o   Spacing to contents in text box

For the default values issue, the equally simple and effective strategy proposed by Ming Fei is to make all defaults explicit on export – I was wondering why that hadn’t always been the case, but better late than never and glad that this will take care of the issue going forward (I hope that other implementers, including us (if we aren’t doing it already?) will take note and follow this example).
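To make the idea concrete for myself, here is a minimal sketch of what "make all defaults explicit on export" could look like. It is not Symphony's implementation: the element and attribute names and the default values in `ASSUMED_DEFAULTS` are hypothetical placeholders, not taken from the ODF specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults table: the element names, attribute names and values
# are illustrative only and are NOT taken from the ODF specification.
ASSUMED_DEFAULTS = {
    "shape": {"wrap": "none", "fill-color": "#ffffff"},
    "text-box": {"padding": "0cm"},
}

def make_defaults_explicit(root):
    """Write every assumed default onto elements that omit it."""
    for element in root.iter():
        for name, value in ASSUMED_DEFAULTS.get(element.tag, {}).items():
            # Values the author set explicitly are never overwritten.
            if name not in element.attrib:
                element.set(name, value)
    return root

doc = ET.fromstring('<doc><shape wrap="parallel"/><text-box/></doc>')
make_defaults_explicit(doc)
exported = ET.tostring(doc, encoding="unicode")
print(exported)
```

A consumer reading the exported markup then no longer has to guess which default the producer had in mind, which is the whole point of the strategy.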

What I liked in particular was the technical credibility of the presentation, i.e. Ming Fei acknowledged and gave specific examples for backwards compatibility issues between ODF 1.2 and 1.1 (to quote from his session abstract “ODF 1.2 is not a simple superset of ODF 1.1”).

ODF Modularization: A simple Step with its tremendous Benefits, Svante Schubert

Svante Schubert strikes again! Svante presented some ideas about developing a methodology for automatic test generation based on principles that include an approach that attempts to determine a “minimal feature graph” for a test case (what constitutes the test case could also be determined heuristically). To achieve test coverage according to spec, Svante recommends using context and range constraints to create test cases that cover all contextual permutations and boundary conditions. Again, this work would be useful beyond ODF and could be applied to other document markup languages. I have a feeling that if this succeeds, you might have to order a new test server farm or utilize cloud computing to deal with the explosion of test cases. Maybe we have to go back and, as a next step, develop smart heuristics for selective testing, with the full test suite being used less frequently. Svante’s suggestion of building minimal test cases of course already helps to reduce complexity. Overall, this session had some great ideas and could be a foundation that we can all build on.
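As a rough illustration of how context and range constraints multiply into test cases, here is a small sketch. It is my own toy example, not Svante's methodology: the contexts, the property names and the ranges are all hypothetical.

```python
from itertools import product

# Hypothetical constraints: contexts and numeric ranges are illustrative only.
CONTEXTS = ["paragraph", "table-cell", "list-item"]
RANGES = {"indent_level": (1, 10), "column_span": (1, 3)}

def boundary_values(low, high):
    """Return the values at and just outside both ends of an inclusive range."""
    return [low - 1, low, high, high + 1]

def generate_cases(contexts, ranges):
    """Combine every context with every combination of boundary values."""
    names = sorted(ranges)
    value_lists = [boundary_values(*ranges[name]) for name in names]
    cases = []
    for context in contexts:
        for combo in product(*value_lists):
            values = dict(zip(names, combo))
            # A case is expected to be valid only if every value is in range.
            valid = all(ranges[n][0] <= values[n] <= ranges[n][1] for n in names)
            cases.append({"context": context, "values": values, "expect_valid": valid})
    return cases

cases = generate_cases(CONTEXTS, RANGES)
print(len(cases))
```

Even with three contexts and two constrained properties the count grows multiplicatively, which is why I expect selective testing heuristics to matter once this is applied to a full schema.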

Day 2 - September 2nd

ODF 1.2 interop demo

A lot of work clearly went into this session, where a significant number of implementers showcased their products that have ODF 1.2 support. We did not quite have parity between presenters and audience, but it got pretty crowded on both sides of the table. This must have been difficult to put together, and I would like to express my gratitude to everybody involved for working hard to deliver a coherent session.

An (incomplete) list of the demos:

·         Inge from KOffice demoed their UI experience for RDF tags, which shows again the value of broadening the scope of the conference beyond its namesake (OpenOffice.org doesn’t have a UI for RDF entities, yet).

·         Rob Weir showed MathML support by replacing placeholders in an ODF package with MathML documents, using simple file manipulation (with my favorite computer algebra system, Mathematica, as the source for the MathML files).

·         The lpOD examples were familiar from the dedicated lpOD session (including the Wiki and spreadsheet/forms example), nothing wrong with reusing them here as they contributed nicely to the variety of use cases.

·         Open framework systems as, a company from Norway, showed a document editing solution to selectively assign document sections to different authors and perform a subsequent document assembly from individual fragments. In my experience, use cases like this are among the applications where both ODF and Open XML are much superior to the old binary file formats; I like to see the new XML file formats being used like this in genuinely new and useful ways.

·         FreOffice (based on KOffice) is taking ODF mobile. This new office product is being developed by Nokia and was appropriately demoed on a cell phone that was plugged into the projector. Digital signatures were shown on mobile, as well as calling a contact based on phone number information stored in an RDF entity. The user experience looked quite good given the constraints of the mobile form factor (resources, screen size, different UI paradigms). OpenGL slide transitions in presentations worked on the mobile device, making presentations directly from the smart phone more practical.
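Since an ODF package is a zip archive, the kind of "simple file manipulation" shown in the MathML demo can be sketched in a few lines. This is only my reconstruction of the general idea, not the code used in the demo: the member path `Object 1/content.xml`, the placeholder content and the helper names are assumptions, and the toy package built here is not a complete ODF document (it has no manifest, for example).

```python
import io
import zipfile

MATHML = '<math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math>'

def build_demo_package():
    """Build a toy zip standing in for an ODF package (hypothetical layout)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text",
                         compress_type=zipfile.ZIP_STORED)
        package.writestr("Object 1/content.xml", "<placeholder/>",
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()

def replace_member(package_bytes, member_name, new_content):
    """Return a copy of a zip package with one member's content replaced."""
    source = zipfile.ZipFile(io.BytesIO(package_bytes))
    output = io.BytesIO()
    with zipfile.ZipFile(output, "w") as target:
        for info in source.infolist():
            if info.filename == member_name:
                data = new_content
            else:
                data = source.read(info.filename)
            # Preserve member order and compression: ODF expects "mimetype"
            # to be the first entry and to be stored uncompressed.
            target.writestr(info.filename, data, compress_type=info.compress_type)
    return output.getvalue()

result = replace_member(build_demo_package(), "Object 1/content.xml", MATHML)
```

No office application is involved at any point, which is what made the demo appealing to me.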

One roundtrip demo concluded the demo showcase. It consisted essentially of an ODP presentation file with one image per slide and some bullet points. The slides had been automatically generated by lpOD from a database, based on a real world solution implemented at the Louvre in Paris (introduced earlier in the session). The ODP file was passed from lpOD to KOffice and from lpOD to Novell OpenOffice. lpOD again showed the power of the command line by changing document styles without opening a client application.

One thing that I would have liked to see more of (caveat: without being familiar with the OASIS rules and objectives for this kind of interop demo) would have been round-tripping of documents with new ODF 1.2 features between different applications, not to mention interop between 1.1 and 1.2 in areas with known changes (it would be interesting to get an impression of the robustness of both the implementations and the standard). Perhaps this was unrealistic to hope for due to time constraints (I wouldn’t have minded staying longer!)? The last demo, which did include multiple applications, did show inter-application interoperability but did not contain obvious new 1.2 features like digital signatures, tables in presentations, accessibility or RDF entities.

Considering the different applications of ODF 1.2 that were demonstrated across diverse implementations (script, PC, mobile), I concur with Rob Weir who promised at the beginning that (I am paraphrasing) this session would serve as an overview of ODF capabilities.

Finally, I was asked why Microsoft didn’t participate. Easy answer: We would have loved to, but we don’t have an ODF 1.2 implementation in Office (yet) – our current plan of record is to implement ODF 1.2 after the final spec is published by ISO.

Beyond [ODF] 1.2

This panel session was a forum to articulate ideas and wishes for the next version of ODF beyond 1.2. Here are some of the ideas according to my notes:

·         Jos did a comparison of ODF and HTML and wants to take ODF to the web, including collaborative protocols (e.g. like Microsoft OneNote, co-authoring in Office 2010 or Google Wave)

·         Thorsten would like to see an extensibility concept that sounds a bit like alternate content blocks in Open XML (not quite sure if that would meet all of his requirements)

·         Svante would like, among other things, to get RESTful access to document data like spreadsheets (… cf. Excel Server 2010).

·         Marco would simply like to see ODF “on every desktop” and also see the ideas behind the Universal Business Language translated in ODF terms (Jos commented that he believes that ODF 1.2 can already do that today)

·         The previous presenter from Open framework systems as (I somehow seem not to have noted his name, apologies for that) had a very interesting real world example where the national archive in Norway has a requirement that (in my terms, as far as I understood it) would be similar to having ACLs on RDF entities. Svante mentioned work in ISO TC5 to create a higher abstraction layer that might be useful to support this. My immediate idea was to consider using something like an extension list (in Open XML) to model the requirement in the file format (then the interpretation of the ACL contained in the metadata would be left to the implementation, so on that level further work might be required to e.g. define a standard way of encrypting protected information).

I have several key personal takeaways from this session:

1.       It is reassuring to see that Microsoft’s work in several areas of open standards as well as implementation level features is validated as being relevant through ODF.next feature requests :-)

2.       Beyond our ongoing contributions to ODF 1.2, there’s a huge opportunity for Microsoft to further share additional best practices as suggestions with the ODF TC and other implementers (the latter in cases where the proposed solutions are out of scope for ODF but could be implemented using complementary protocols). This could help ODF and at the same time reduce the complexity of translations to enable interoperability with standards like ISO 29500.

Building Bridges, (my own session)

I’ll do a summary of my own session in a separate blog post to further elaborate on some key points and incorporate feedback that I received at the conference.

On one point I owe some source pointers to the audience, so I would like to use this post as a vehicle to already provide that information right here:

In my opinion, ODF and Open XML have more in common than what sometimes gets acknowledged. To illustrate one of the reasons why that would be a realistic assumption to me, one might consider the origins of both file formats, tracing back the history of how they came into being:

In the case of ODF, the Open Office XML file format was donated to OASIS by Sun Microsystems to serve as the starting point for what then became ODF. The name was changed from Open Office XML to Open Document Format relatively late in the game before the standardization of ODF 1.0:

·         Open Office XML Format TC Meeting Agenda 10 January 2005

o   Topic: discussion/voting of TC name

I found it very interesting, considering this obvious relationship, to read the mission statement of the OpenOffice.org XML File Format and highlight similarities to aspects of Open XML, such as the ability to "be non-lossy and support (at least) the full capability of a StarOffice/OpenOffice document".

For what it’s worth, Wikipedia seems to agree on the pedigree of ODF:

·         OpenDocument format (ISO/IEC 26300:2006) is based on OpenOffice.org XML and these formats are very similar in many technical areas. However, OpenDocument is not the same as the older OpenOffice.org XML format and these formats are not directly compatible.

As any observer knows, KOffice elected to also use ODF as their native file format which influenced some aspects of ODF beyond a 1:1 feature mapping with OpenOffice.org. Fast forward to 2010, things have progressed now to a point where OpenOffice.org has outgrown the ODF standard and is using a flavor of what is called “ODF 1.2 extended” in the user interface. This illustrates that ODF today (and as part of what OASIS is concerned with) is no longer a strict subset or superset (or identical to) whatever format OpenOffice.org is using this month or the next; in fact, one of the sessions talked about "ODF's influence on OOo's feature development" (which would indicate a reversal of the relationship over time) - one example for how OpenOffice.org has to consider how to catch up with new ODF 1.2 features would be RDF support (as I already mentioned).