Imagine you are using OpenOffice.org. Or let's say not you, somebody else, hypothetically, once was. Let's call that person "Mr. K.".

Mr. K. received a spreadsheet in the OpenDocument Format. Or maybe he found that floppy disc under a pile of paper on which he saved a document two months ago. It doesn't really matter. What matters is the document format, the file extension ".ods".

Now, here's what Mr. K.'s screen looked like when he last was looking at the document (at the time he was still a Mac user, but has since started to enjoy Windows 7, as you will see on the later screenshots):

"It is a summation of sorts", said Mr. K. "I don't think it's summed correctly, but I know that it has been working like that for 8 years. Also it was classified as a feature, not a bug". 

Now, after having updated to OpenOffice 3.2 (to fix some security vulnerabilities), Mr. K. found that unexpected changes had occured to his document:

Mr. K didn't quite know what to make of it. He was uncertain how to ever establish what the spreadsheet had looked like when he, or rather his tax consultant, as this was part of calculating his W-4 personal tax allowances, had last checked its validity.

Frustrated, Mr. K. turned off his PC.

On the next day, after getting back to his work from Starbucks where he had put in a custom order for a 4 shot Espresso, Mr. K. reluctantly returned to the matter of the confused spreadsheet. Then he had a brilliant idea. He remembered how some people had vocally objected to Microsoft's ODF implementation, supposedly because it preserved the original values rather than taking pot shots at unspecified (as part of ODF) formula semantics. He silently thanked them for making so much noise, this was helpful in remembering a possible resolution to his quandary.

Using Microsoft Excel to open the spreadsheet, this is what Mr. K. saw:

Apparently Microsoft had provided him with the time machine that he needed to deal with OpenOffice.org version upgrades. But what about the future? There was no way Mr. K. would be able to tell which OpenOffice.org version had been used by others to author documents handed over to him. So it was very important that in any case the numbers should "add up". To determine if Excel could solve this problem as well, Mr. K. saved the spreadsheet using OpenOffice.org 3.2, deliberately overwriting the original results with those dictated by the newer formula regime.

Regardless of the underlying changes, again Excel preserved the same interpretation of formulas that the spreadsheet author had assumed to be valid:

Mr. K. concluded: "The thinking man should reconsider using document format implementations where basic calculation semantics change arbitrarily."

 

 

Technical postscriptum:

  • cell A2 contains the number 1 quoted as text, like this: '1
  • As you can see, Excel flags the possible mistake by putting a marker in the cell that has a contextual menu to automatically change the data to the more appropriate "number" type.
  • OpenOffice.org 3.2 has joined the industry consensus (e.g. GNOME Office Gnumeric, SoftMaker Office, Excel, Quattro Pro, Lotus 1-2-3, ...) in interpreting numerical values in direct summations as numbers, even if they are formatted as strings (e.g. as a side effect of a .csv data import).
    • By doing so, a degree of semantic compatibility with any prior version of OpenOffice.org (or derivatives like IBM Lotus Symphony) is lost.
  • The OpenDocument Format doesn't help because (in all versions that are final today) it doesn't contain a specification for formula semantics.
  • Interestingly, OpenOffice 3.1.1 is supposed to be implementing a draft version of ODF 1.2. OpenOffice 3.2 is supposedly doing the same. Apparently this doesn't help sufficiently in consistenly evaluating A1+A2=A3 (?)