• Sign in
 
  •  
  • MSDN Blogs
  • Microsoft Blog Images
  • More ...
Search
Tags
  • .NET
  • Altova
  • blogging
  • code samples
  • Codeplex
  • Custom XML
  • DII
  • DIS29500
  • ECMA-376
  • IBM
  • Java
  • Monarch
  • ODF
  • Office 2007
  • OpenXMLDeveloper.org
  • PHP
  • Redmond
  • SharePoint
  • System.IO.Packaging
  • TechEd
  • UOF
  • VSTO
  • Windows
  • WordprocessingML
  • workshops
Archives
Archives
  • January 2012 (1)
  • October 2011 (1)
  • July 2011 (2)
  • April 2011 (1)
  • March 2011 (3)
  • December 2010 (1)
  • August 2010 (1)
  • June 2010 (1)
  • May 2010 (1)
  • April 2010 (3)
  • March 2010 (1)
  • November 2009 (4)
  • October 2009 (1)
  • September 2009 (2)
  • July 2009 (2)
  • June 2009 (4)
  • May 2009 (5)
  • April 2009 (4)
  • March 2009 (4)
  • February 2009 (2)
  • January 2009 (4)
  • December 2008 (4)
  • November 2008 (3)
  • October 2008 (4)
  • September 2008 (3)
  • August 2008 (2)
  • July 2008 (5)
  • June 2008 (7)
  • May 2008 (5)
  • April 2008 (8)
  • March 2008 (14)
  • February 2008 (15)
  • January 2008 (13)
  • December 2007 (12)
  • November 2007 (5)
  • October 2007 (9)
  • September 2007 (6)
  • August 2007 (10)
  • July 2007 (9)
  • June 2007 (8)
  • May 2007 (12)
  • April 2007 (14)
  • March 2007 (12)
  • February 2007 (10)
  • January 2007 (17)
  • December 2006 (14)
  • November 2006 (10)
  • October 2006 (11)
  • September 2006 (12)
  • August 2006 (12)
  • July 2006 (12)
  • June 2006 (23)
  • May 2006 (14)
Common Tasks
  • Blog Home
  • Email Blog Author
  • About
  • RSS for comments
  • RSS for posts

1 + 2 = 1?

Doug Mahugh - Office Interoperability
MSDN Blogs > Doug Mahugh > 1 + 2 = 1?

1 + 2 = 1?

Doug Mahugh
10 May 2009 2:26 AM
  • Comments 58

Does 1 plus 2 equal 3?   After last week’s sometimes acrimonious discussion about formulas in ODF, you may be glad to hear that IBM and Microsoft appear to agree on that answer to this simple question.  But OpenOffice.org is not so certain – maybe the answer is just 1 sometimes – and the question itself turns out not to be so simple after all.  Let me explain.

The State of ODF Formula Interoperability Today

What is the current reality of ODF formula interoperability?  Understanding the status of the ODF ecosystem will help clarify the set of issues and options that we faced when making the tough decisions we had to make about how to best support formulas in ODF spreadsheets.

For this example, I’ll use the latest released versions of two well-known ODF implementations: IBM Lotus Symphony (version 1.2, download here) and OpenOffice.org (version 3.1, download here).  I want to talk about current reality, so I’m not using any outdated versions of software (the OO build I’m using, for example, was released in the last week).  I also stayed away from unreleased or private beta versions that might become available sometime in the future, and I used the default settings for each application.

First, I fired up Symphony 1.2, and followed these steps:

  • Enter a numeric value of 1 in cell A1.
  • Format cell A2 as text, right-justified, then enter a 2 in that cell.
  • In cell A3, enter the formula =A1+A2.

In Symphony 1.2, here’s what I see:

image

After saving this spreadsheet as an ODS file, I open it in OpenOffice.org 3.1 and see this:

image

Clearly this is a problem.  The exact same data, in the exact same spreadsheet, when operated on with the exact same formula, provides different results.

Some might be tempted to say that formatting a cell as text and then using it in a calculation is dumb.  And I’d agree that there are few people who ever do such a thing intentionally.  But in a large complex spreadsheet, with thousands of cells involved in complex calculations, it’s easy to make mistakes like this.  In fact, if you’ve spent any amount of time at all creating complex spreadsheets, I’ll bet that on more than one occasion you’ve wasted a bunch of time trying to debug a problem that turned out to be caused by such mistakes; I know I sure have.

Similar issues arise with boolean values – what does it mean to “sum” a column of cells that includes both numeric values and boolean values?  Not all spreadsheet implementations agree on the answer to that question, either. This can create interactions between formatting and calculating – change the format of some cells, and the totals change in your spreadsheet.  Most users find such behavior very confusing, to say the least.

One of the most interesting things I found in my testing of these two implementations was that although they write different markup for formulas, the exact same interoperability problem occurs regardless of which application is used to create the spreadsheet.

If you create the spreadsheet in Symphony 1.2, as I did, the table:table-cell element has a table:formula attribute with a value of "=[.A1]+[.A2]".  And this formula will yield a result of 3 in Symphony and 1 in OpenOffice.org, as described above.

If instead you create the same spreadsheet in OpenOffice.org 3.1, when you open it in Symphony 1.2 you'll see of:=A1+A2 in cell A3.  But after you manually correct the formula, this spreadsheet, too, will yield a result of 3 in Symphony and 1 in OpenOffice.org.

So these two ODF implementations do not have predictable formula interoperability, regardless of where you start.  And these are not obscure implementations – they are the latest released versions of the implementations from IBM and Sun, the two companies that together chair the ODF TC.  Even if both companies released fixes tomorrow, there will still be many copies of the current non-interoperable versions of these applications in use for a long time to come.  This is the state of formula interoperability among ODF spreadsheets today.

Fixing the Problem

This difference in behavior is a well-known issue among those who work with spreadsheet formulas.  As Rob Weir said three years ago “Automatic string conversions considered dangerous. They are the GOTO statements of spreadsheets.”  (One of the ODF TC members even has that line in his email auto-signature.)

How to manage string conversions is far from the only problem with spreadsheet interoperability across vendors (and even across versions of the same product in some cases). The current draft OpenFormula specification contains 254 notes (by my count) about other issues similar to this one.

The OpenFormula sub-committee of the ODF TC has worked hard to address this.  Here is an excerpt from the draft OpenFormula specification (emphasis added):

6.2.4 Conversion to Number

If the expected type is Number, then if value is of type:

  • Number, return it.
  • Logical, return 0 if FALSE, 1 if TRUE.
  • Text: The specific conversion is implementation-defined; an application may return 0, an error value, or the results of its attempt to convert the text value to a number (and fall back to 0 or error if it fails to do so). Applications may apply VALUE() or some other function to do this conversion, should they choose to do so. Conversion depends on the actual locale the application runs in, especially if group or decimal separators are involved. Note that portable spreadsheet files cannot depend on any particular conversion, and shall avoid implicit conversions from text to number.

After OpenFormula is approved and published, this approach, with its explicitly defined concept of “portable spreadsheet files,” will allow more predictable and consistent interoperability for ODF spreadsheet users.

But in the current environment, with no standardization of formula markup across major ODF implementations, users who want to avoid interoperability problems need to stick to a very conservative strategy.  As Burton Group analyst Guy Creese said last week:

“… this in-between time (between the OpenOffice.org de facto standard and the wait for the officially approved 1.2 standard) means there isn't one way to handle this problem. The vendors would like you to believe that there is (their way), but in reality there isn't. Ultimately, this will resolve itself over time. ODF 1.2 will be approved, and there will finally be an approved standard that everyone--IBM, Microsoft, Sun (Sun/Oracle)--can follow.

Until then, if an enterprise does want to use ODF, the best strategy is to stick with one productivity suite as a way to avoid these interoperability problems. That way, even if formula support is idiosyncratic, it at least will be consistent within the enterprise.”

 How Excel 2007 SP2 Handles ODF Formulas

The question of how to handle formulas in SP2’s ODF implementation was one of the tough decisions we faced in our ODF implementation.  We had made conformance to the ODF 1.1 specification a top priority, and yet the spec doesn’t specify a formula language. 

It seemed clear to us that we couldn’t simply omit the  namespace, as the current version of Symphony does.  That would be in violation of Section 8.1.3 of the ODF specification, where it says “Every formula should begin with a namespace prefix specifying the syntax and semantics used within the formula.”

What about using the same of: namespace that OpenOffice.org 3.1 uses?  We saw a couple of pretty serious problems with that approach as well:

  • It would not be interoperable with some existing implementations, such as the widely  used current version of IBM Lotus Symphony.
  • It is based on a draft specification that has not been finalized or approved as a standard, and therefore could still change.

What about using the oooc: namespace that OpenOffice.org 3.1 writes when you choose its ODF 1.1 compatibilty mode? That syntax is on its way out for everyone, and we saw no point creating yet another new implementation of something that is clearly going to be deprecated soon.  And it doesn’t solve the problem: OpenOffice.org 3.1 writes the oooc: namespace prefix in its ODF 1.0/1.1 compatibility mode, and those spreadsheets still can yield different results in OpenOffice.org and Symphony.

After a robust internal debate on the topic, it became clear what we needed to do to apply the first two of our five prioritized guiding principles for Office’s ODF implementation:

  • Adhere to the ODF 1.1 standard
  • Be Predictable
  • Preserve User Intent
  • Preserve Editability
  • Preserve Visual Fidelity

As we discussed in several DII workshops starting back in July of 2008 (with multiple ODF implementers and multiple ODF TC members in attendance), these guiding principles are in priority order. When we could not achieve them all, we choose the top ones first.

To adhere to the ODF 1.1 standard, we begin formulas with “a namespace prefix specifying the syntax and semantics used within the formula.”  Excel 2007 SP2 uses an msoxl prefix and write the formula attribute like this:

table:formula="msoxl:=A1+A2"

That fulfills our goal of adhering to the standard since ISO/IEC 29500 defines both the syntax and semantics of this namespace.  Then, to provide a predictable user experience across all spreadsheets, we elected to support this namespace, and only this namespace.

If I move my spreadsheet from one application to another, and then discover I can’t recalculate it any longer, that is certainly disappointing.  But the behavior is predictable: nothing recalculates, and no erroneous results are created.

But what if I move my spreadsheet and everything looks fine at first, and I can recalculate my totals, but only much later do I discover that the results are completely different than the results I got in the first application?

That will most definitely not be a predictable experience.  And in actual fact, the unpredictable consequences of that sort of variation in spreadsheet behavior can be very consequential for some users.  Our customers expect and require accurate, predictable results, and so do we. That’s why we put so much time, money and effort into working through these difficult issues.

What Does Excel 2007 SP2 Do With the Example Above?

The answer is that we agree with IBM: 1 + 2 = 3.

Excel does the same thing Symphony 1.2 does, converting the text “2” to a numeric 2 and using that value in the calculation, so that the total is 3.  Excel does this because this type of automatic conversion – which has been a popular Excel feature for a very long time – is allowed by the semantics of the formula markup language Excel uses.

The formula markup that Excel uses is based on the formula language defined in ECMA-376 and ISO/IEC 29500, and here’s what it says about type conversion in Section 18.17.2.6 (Types and Values) of Part 1 of IS29500:

An implementation is permitted to provide an implicit conversion from string-constant to number. However, the rules by which such conversions take place are implementation-defined. [Example: An implementation might choose to accept "123"+10 by converting the string "123" to the number 123. Such conversions might be locale-specific in that a string-constant such as "10,56" might be converted to 10.56 in some locales, but not in others, depending on the radix point character. end example]

Excel’s approach to formulas in ODF, as well as our approach to other difficult issues, is completely public and fully documented in the implementer notes for SP2.  As the note for this issue explains:

The standard defines the attribute table:formula, contained within the element <able:table-cell>, contained within the parent element <office:spreadsheet table:table-row>

This attribute is supported in core Excel 2007. This attribute is supported in core Excel 1. When saving the Table:Formula attribute, Excel precedes its formula syntax with the "msoxl" namespace. 2. When loading the attribute Table:formula, Excel first looks at the namespace. If the namespace is "msoxl", Excel will load the value of Table:formula as a formula in Excel. 3. When loading the Table:formula attribute, if the namespace is missing or unknown, the Table:formula attribute is not loaded, and the value "Office:value" is used instead. If the result of the formula is an error, the element <text:p> will be loaded and mapped to an Error data type in Excel. Error types not supported by Excel are mapped to #VALUE!

The Question of Syntax

I’d like to also address the issue of cell reference syntax in the ODF 1.1 specification, since that was also a topic of much discussion on several blogs last week.  I’ll start with some quick background for those who don’t wallow in standards documents for a living.

The English language is inherently an ambiguous thing,  and great literature sometimes uses the ambiguity to good effect.  Words can have more than one meaning, and verb phrases might be intended to go with one noun or with another, as in famously ambiguous job references like “You will be very fortunate to get this person to work for you."

Writers of technical standards like to use rules and procedures that are designed to avoid this sort of problem.  These rules, which place requirements on the  use of words like should, shall, must and may,  tend to result in a stilted writing style which gets tedious fast, but reduces the need to agree on what is “obvious” or “implied” when interpreting the meaning of the text later.

A standards document is said to contain both normative language and informative language.   The things you must do to comply with a standard are supposed to be in the normative part,  and things like examples and introductions are informative.   So that everyone can be  sure about which parts are which,  the normative parts use specific phrases like “shall” and “shall not” to clearly label the things the standard actually requires you to do.

So the debate about Excel 2007 SP2’s cell reference syntax comes down to whether the few sentences in the ODF 1.1 spec which cover this were meant to be informative or normative.  The section of ODF 1.1  in question does not use the words shall or must.   It introduces the topic with the phrases “typically” and “can include”.   In our reading of it,  this language makes that part of the specification informative, stating no requirements for implementers.

The ODF 1.1 spec is casual about applying the rules of normative language, and as a result ODF 1.1 has more than its share of ambiguity.  The ODF 1.2 draft, however, is already much improved in this regard, mainly through the great work of ODF editor Patrick Durusau.  The OpenForumla draft specification is extremely careful in its use of normative language, and that will help implementers a great deal when they sit down to write their software.

When Will Office Support OpenFormula?

This question has come up on some blogs, so I’d like to address it here as well.

The real question is “when will Office support ODF 1.2,” since OpenFormula is simply a part of the ODF 1.2 specification.  And the answer is that we don’t know yet, because nobody knows yet when ODF 1.2 will be published as an OASIS or ISO standard.  As I said in the previous post, “we will look closely at Open Formula when it becomes a standard and make a decision then about how to best proceed.”  (It looks like IBM has committed to supporting ODF 1.2 and OpenFormula in late 2010.)

In the meantime, if you want to use Excel 2007 SP2 to edit documents that contain formulas from OpenOffice.org or Symphony, and preserve those formulas through editing sessions, and you understand the risk that the results might not be the same, you have a couple of free options.

The Open XML / ODF Translator Add-Ins for Office can be used with Office 2007 SP2, and as covered on the translator team blog, supports a variety of formula namespaces.

The Sun ODF Plugin provides yet another option, and apparently works with SP2.

  • 58 Comments
  • 1 + 2 = 1? | Microsoft Share Point
    10 May 2009 2:39 AM

    PingBack from http://microsoft-sharepoint.simplynetdev.com/1-2-1/

  • TheUndeadable entwickelt
    10 May 2009 6:23 AM

    Nachdem einige bemängelt haben, dass Microsoft nicht standardisierte Dinge so umgesetzt hat, wie es der Standard vorgibt (mit eigenen Xml-Namensräumen), hat Microsoft einen Kurzkommentar zu Formeln und ODF (Das OpenOffice-Format) abgegeben: http://blogs

  • Rick Jelliffe
    10 May 2009 8:04 AM

    I wonder whether an "adaptive standard" approach is better. Adaptive standards are Ken Kerchmer's observation that in a world with ubiquitous and reliable  web presence, many fewer things may need to be standardized.

    The poster case for this is probably CODECs. If your data can give enough information for your application to download the appropriate CODEC free, there is much less need for standards for particular CODECs: they become more disposable. (Of course, long-term data needs to transcoded to open standard formats for archiving still.)

    If every implementation of formula provided some kind of script to convert to a common format (OOXML formula, Open Formula, even both. or even some more active notation like .NET or Java etc.) in some transformation that also finessed the idiosyncracies (such as ECMA OOXML's CEILING() function) appropriately, then there would not be as strong a need for one true exchange format.

    In fact, this would seem to fit in with Microsoft's Model Driven Development project: what is that language called? It is very UNIX c.1975, or Omnimark c.1995. You parse a special notation (e.g. a formula language) into interpretable data structures.  You open source and expose these scripts to the public so that system integrators and FOSS people are never stymied.  And it takes the scripts out of the long-term maintenance millstone.

    Of course, for this kind of scenario, knowing exactly which dialect was being used is necessary. So not re-using an existing namespace prefix for your own dialect would indeed be appropriate for that case.

    Look at the way that IE treats RSS feeds. The RSS feeds from MSDN have a link to a styleheet, but IE ignores them and uses more elaborate built-in ones. Perhaps that is the way the adaptive standards will need to go: provide draft-quality tasters for auto-download by other applications, but build your own higher quality transformers into your product.

  • Where is there an end of it?
    10 May 2009 8:59 AM

    Notes on Document Conformance and Portability #3

  • Jomar Silva
    10 May 2009 10:49 AM

    Doug,

    It is the second time on the past days that you talk about discussion with "multiple ODF TC members" on workshops about your formula implementation, as if the ODF TC approves the mess you guys did.

    As a TC member, I'm offended with this tentative to confuse the audience, and I would like o state that I never accepted any Microsoft invitation to attend interoperability workshops, because I knew you guys could use my participation on those events as if it was my approval of your stupid decisions, and it seems that I was right.

    To clarify things, and to be fair with othe TC members, I would like to ask you to name all the TC members that attended those workshops, to allow us to verify if they approve (as you're trying to say) Microsoft's decisions.

    If you can't (or don't want) to do that, please stop talking about the ODF TC this way, and don't involve TC members with Microsoft business decisions (on this episode, yours 'throw mud on everyone' strategy is going far from the acceptable limits).

    This also raise additional suspicious about your real intention on the ODF TC (at least to me).

  • Rob Weir
    10 May 2009 12:36 PM

    Doug, this argument falls apart with the Microsoft/Sun ODF Add-in and the Sun Plugin for Office.  Both solutions use Excel as the underlying calculation engine, so they will give exactly the same answer as SP2 would for the same formula.

    So why is Excel trashing these documents, causing ODF spreadsheet interoperability to fail for Microsoft's own paying customers?  Saying to use the Add-in rather than SP2 is not a satisfactory answer, since a user may exchange a document with another user who has SP2 and their document will be silently corrupted.  How is that predictable or expressing the users intent?

    In any case, this is something that would have been so easy to fix.  You just need to send a note to the ODF TC, or to the vendors directly and say, "Hey, we have a problem with adding text and numbers consistently, how about we agree to do X.?  We'll all implement it now to help improve interoperability and at the same time make sure that OpenFormula reflects the agreed on convention".  This would have been very quick and easy.  But you didn't do that.  You took the nuclear option, stripping out all formulas, from all vendor spreadsheets, whether they had this problem or not, even from your own customer's documents using your own add-in -- a combination with demonstrably does not have this problem.  That's the thing that is causing you all the trouble, that you took the nuclear option without exploring other options with the ODF TC or with the vendors directly, options which would could have easily improved interoperability here and now.  

    I hope you still consider doing this, and slip stream a new version of the ODF support in SP2 before it goes into automatic download mode.

  • fmerletti
    10 May 2009 2:22 PM

    Citing Alex Brown:

    "When [OpenFormula] is published vendors will cease to have an excuse to create non-interoperable implementations, at least in this area. "

    All this Mahugh "essay" with screenshots! ( boy, it is really hard to justify why didn't you put a square bracket in a formula uh? ) can be represented with just one word:

    *Excuse*.

    Keep playing the standards game Microsoft, it is funny , isn't it?

  • Doug Mahugh
    10 May 2009 3:41 PM

    Rick – I believe M is the modeling language you're thinking of.  The adaptive-standard concept interesting.  The target of the transformation would need to be standardized for archival purposes as you mentioned, so right now the ECMA-376/IS29500 formula markup is the only option, and OpenFormula will present another option when it's done.

    Jomar – I certainly did not say, or even mean to imply,  that the ODF TC officially approved of our implementation.   So for the record:  they did not,  nor is it the TC's role to do something like that.    My point was that members of the TC were given a chance to give us feedback on our decisions very early in the process,  when there was still plenty of time for us to do something about it if they thought we were making a mistake.    I’m truly sorry you and Rob did not accept any of our invitations to attend the DII events where we first explained our plans for ODF support.  It would have been very useful to hear your opinions about our decisions back then instead of now,  when it is too late for us to change them.

    As for the people who did attend and give us feedback, I’m in a tough spot.    Some of them don’t want me to publically identify them because they know the vehemently anti-Microsoft crowd out there will heap vitriol upon them – as Rob Weir has done with me and Gray – and they don’t need the aggravation.   So I’m going to let them come out themselves if they are willing,  but I’ll respect their desire that I don’t “out” them.

    Rob – I’m simply going to paraphrase a response you just gave to Jesper on Alex Brown’s blog … I think all of your questions have already been clearly answered in my posting above.  Please go back and read it more carefully.

    As for the “nuclear option” – it seems to me that you are the one who is pulling out the heavy weaponry here,  not me.  As a result of your own initial and inflammatory post on  the topic and the subsequent discussion,  the world is once again reminded that spreadsheet interoperability in ODF 1.1 has been a myth from the beginning.   If you are upset about that inconvenient truth,  I’m sorry.

    Today is Mother’s day in the United States and many other countries.   So   I'm going to spend some time with my wife and mother, and take a break from moderating the comment stream. I’d recommend everyone else enjoy Sunday as well.  We can get back to this discussion later.

  • Luc Bollen
    10 May 2009 4:58 PM

    Doug, you identified an interoperability problem between Symphony and OpenOffice.org.  OK, it is well known that this is not the only one: this is why OASIS created the "ODF Interoperability and Conformance" TC, in order to identify and solve the interoperability issues.

    But I really cannot see how an existing interoperability issue justifies that Microsoft creates much more serious interoperability problems.

    It looks like your goal is in fact to create as big as possible a mess, surely not to achieve any interoperability goal.

  • orcmid
    10 May 2009 5:29 PM

    Although not an OASIS member at the time, I was at that July 30, 2008 DII meeting in Redmond where the principles were first discussed and there was discussion of how they were looking at handling ODF table:formula values for Excel 2007 SP2.  My personal observations on that are posted at <http://orcmid.com/blog/2008/08/microsoft-odf-interoperability-workshop.asp>.  

    I'm not sure anyone there flinched about what Excel would be doing, although I see that I was hoping for more that what was described.  

    I'm not so sure it matters that there were any ODF TC members present at this or the previous Redmond DII, and I don't know that they would have seen a basis for objecting.  

    Although we can argue all we want about whether or not a cell-reference scheme is mandated, the proposed accomodation that Rob Weir insists on is at the level of practical, partial interoperability, not ODF 1.1 conformance.  And to do that you'd have had to introduce yet-another namespace or else only accept the other .ods ones and not produce any of them, something I somehow doubt Microsoft wants to be seen doing.  

    As painful as the current solution is, I don't see a mutual way out absent a standard OpenFormula that all can start to converge on for increasing interoperable spreadsheet usage with ODF.  The harmonization of OOXML SpreadsheetML formulas and OpenFormula then becomes a meaningful opportunity as well.  I am impressed by the degree to which OpenFormula work seems poised for that.

  • Rob Weir
    10 May 2009 7:29 PM

    Sorry Doug, I must have missed it.  Can you point out to me where exactly you addressed why it was necessary to break compatibility with documents created by the Sun ODF Plugin for Microsoft Office and documents created by the Microosft/CleverAge ODF Add-in?  It seems to me that these all use Excel as the calculation engine, so formulas could be interchanged and would give the same results.  So why the nuclear option there?

    Obviously every one really wants their documents to be compatible with Microsoft Office.  You do after all dominate the market.  But by throwing out this nuclear option like this you put every other ODF implementation in a tough position.  Do they change their code to put formulas in a new namespace so that Excel does not strip them out?  But if an application does this they will break compatibility with every other ODF spreadsheet that they were previously compatible with.  The effect, whether intentional or not, is a strong probability that you have just forked the standard.

    Since Microsoft has consistently argued the need to preserve "legacy compatibility" as the rationale to for every wart in OOXML, I wonder whether when ODF 1.2 comes out we'll hear a similar appeal to "legacy compatibility" of these SP2 ODF documents to justify why Office does not add support for OpenFormula.  With standards you are either converging or diverging.  The other vendors are converging on OpenFormula.  We had an interoperability workshop in Beijing -- I invited you but you did not attend -- where we worked on spreadsheet formula interop.  As of ODF 1.2 CD02, ODF requires the use of OpenFormula spreadsheet formulas in spreadsheets.  So the direction the standard and the other implementations  are going is clear.  You are either converging or diverging.   Which will it be?

  • fmerletti
    10 May 2009 7:45 PM

    People, please don't organize more DII workshops. Judging for the output of such events ( incredibly bad implementation of state-of-the-art formula handling in ODF implementations ) it seems that you only have drinks and snacks in this events ;-)

    It seems that the "interoperability" word will be bastardized too ( following the steps of the word "standard" )

  • A Nonymous
    10 May 2009 9:55 PM

    @Doug Mahugh:

    The crux of the problem can easily be summarized as follows:

    The whole point of the namespace prefix on a formula, is that a spreadsheet can consist of cells, whose individual formulas are from a heterogenous namespace (or set of namespaces).

    The whole point is, that an ODF document is flexible enough to encode multiple application-dependent behaviors, so that they are preserved across edits by multiple application.

    Yes, individual cells whose namespace is foreign to a particular application MIGHT be incorrectly (or inconsistently) calculated. However, native namespace formulas, by definition, will not.

    Having an ODF spreadsheet with data cells A1, A2, and A3, whose contents are arbitrary values, and cells B1 and B2, whose formulas are of:=A1+A2, and msoxl:=A1+A2, should not only be legal, but should survive unaltered, across being opened, edited (in an effectively trivial fashion), and saved, by ANY conforming ODF application.

    Yes, the value of B1 might be wrong, or B2, or even both, in some third party application. But back in their native application, the respective values WILL be correct.

    Having Application X incorrectly calculate the behavior for Namespace Y, is a problem for Application X to address, at least for the most popular values of Y. This can be seen as a differentiator between applications.

    Ultimately, this should be resolved by OpenFormula, and by extension ODF 1.2, but in the meantime is something that can be gleaned by taking a "primer" spreadsheet of base values/types and formulas, and comparing the expected versus observed values for each formula in each namespace.

    By encoding the anomalies and backwards engineering the "problems", it is possible to universally compute the behavior of foreign namespaces - at which point perfect encode/decode/calculate is trivially possible for each such namespace.

    An internal structure consisting of pairs (foreign namespace formula, converted formula) would allow for efficient internal operation of the calculation engine, and preserve formulas on export.

    If I can write this up in ten minutes, how is it that a small army of developers can't or won't do this at Microsoft?

    The number of extant namespaces is quite small, and the necessary logic to identify only the "exceptions" cannot be that large. If 124 is the number of footnotes where there is a potential problem, there can be at most N x 124 entries in a mapping table - where N is certainly not ridiculously large.

    Any comments on this suggestion/observation?

    Name Withheld

  • Doug Mahugh
    10 May 2009 11:14 PM

    Rob - There is no reliable interoperability for spreadsheet formulas between any of the major implementations of OASIS ODF 1.1.  People like James Clark, Tim Bray and David Wheeler were warning of this outcome years ago, when you and others decided to leave formulas out of the standard.  Is the formula handing in the current releases of Symphony and OpenOffice.org, as demonstrated above, the sort of "compatibility" you're asking us to participate in?  That may be good enough for you and Maya, but in my experience, most users expect the same data and the same formula to return the same result consistently.

    Fmerletti - I'm surprised to hear you refer to the current ODF formula interoperability between Symphony and OpenOffice.org, which I've demonstrated above, as "state-of-the-art formula handling in ODF implementations."  I take it, then, that you agree with Rob that such inconsistencies meet the "good enough" bar for most users?

    Anonymous - your thoughts on how to apply reverse-engineering techniques to ODF formulas to infer the behavior of foreign namespaces is a great example of the root problem here: ODF formula interoperability doesn't exist, and requires reverse-engineering of every implementation by every other implementation.  Standards are supposed to prevent that situation, not require it.

  • hAl
    11 May 2009 4:09 AM

    As someone working in organisations working with spreadsheet I can easily confirm that is is better to have spreadsheet that show correct values then to have  spreadsheet capable of calculations but without correct values.

    So MS Office implementation choice might not be ideal but suggested alternatives preferring formulas to be left intact even though this might confront users with incorrect data are totally unacceptable.

Page 1 of 4 (58 items) 1234
  • © 2013 Microsoft Corporation.
  • Terms of Use
  • Trademarks
  • Privacy & Cookies
  • Report Abuse
  • 5.6.426.415