MSDN Blogs

Beyond the basics

It's interesting to see the varying emotions people have had in reaction to the standardization of the Office Open XML formats. It was obviously an immense amount of work, and it resulted in a 6,000-page standard. As you all know by now, the reason the specification contains so much information is that we absolutely had to ensure that everything people currently had in their existing Office binary documents could be ported over into the new format with no loss.

In addition to that, we wanted to do our best to ensure that anyone trying to build support for the OpenXML formats into their solution had all the information they would need for consuming and producing those documents. This meant we had to go beyond the basics and document every piece of the format we could. This is where you start to see the differences between the approach taken with the ODF standard and the OpenXML standard. In the ODF standard, the decision was made to leave a lot of the pieces out of the standard and to have them be application defined instead. This means you have a simpler standard, but it also means that interoperability is much more difficult to achieve. Let me explain:

As you start to document a specific feature in a file format, you have a few choices:

  • Fully document – This means you define the syntax for representing the feature, and all the information people would need to interpret it.
    • Pros: Everyone can build support for the feature using the specification.
    • Cons: None really (other than it makes the spec bigger)
  • Partially document – This means you define the syntax for representing the feature, but don't provide all the details for implementation.
    • Pros: If two applications already have the behavior, they get a prescribed method for persisting it in the file format. So while the spec doesn't tell you how to implement the behavior, it does help you interoperate with others who already know the behavior.
    • Cons: It's not as interoperable as when you fully document something, but it still allows for more interoperability than the next approach.
  • Application defined – This means you allow the specification to be extended by an implementer to represent the feature as they choose. Usually you provide a way in which they can use their own namespace for identifying it as not being part of the standard but instead an extension.
    • Pros: This is the only option to account for things not yet in use when the standard was written.
    • Cons: There is no way to interoperate using the spec. You need to look at the implementation for any applications you will be interacting with and see how they extended the spec.
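
To make the "application defined" case concrete, here is a minimal Python sketch of what a consumer faces when a setting lives in an implementer's own namespace. Everything in it is hypothetical and invented purely for illustration: the document snippet, both namespaces (urn:example:standard and urn:example:vendor-a), and the setting names are not taken from ODF or Open XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces, for illustration only.
STANDARD_NS = "urn:example:standard"
VENDOR_NS = "urn:example:vendor-a"

# Hypothetical settings document mixing a standard setting with a vendor extension.
DOC = f"""<settings xmlns="{STANDARD_NS}" xmlns:va="{VENDOR_NS}">
  <checkSpelling>false</checkSpelling>
  <va:alignTablesRowByRow>true</va:alignTablesRowByRow>
</settings>"""

# Settings whose meaning the (hypothetical) standard spells out.
DOCUMENTED = {"checkSpelling"}


def classify_settings(xml_text):
    """Split settings into ones the spec explains and vendor extensions."""
    documented, extensions = {}, {}
    for child in ET.fromstring(xml_text):
        # ElementTree writes qualified tags as "{namespace}localname".
        namespace, _, local = child.tag[1:].partition("}")
        if namespace == STANDARD_NS and local in DOCUMENTED:
            documented[local] = child.text
        else:
            # The spec says nothing here; only the vendor knows the semantics.
            extensions[(namespace, local)] = child.text
    return documented, extensions


documented, extensions = classify_settings(DOC)
print("documented:", documented)
print("needs vendor knowledge:", extensions)
```

The consumer can read both settings, but only the first can be implemented from the specification alone. For the second, the namespace tells you whose extension it is and nothing about what it means.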

One of the sections people have been wondering about in the Open XML spec is section 2.15 in part 4 (Markup Language Reference). That section defines a couple hundred document-level settings. Some are for application behaviors like "don't check spelling." Others are for display behaviors like "show gridlines on tables." And others are for layout behaviors like "align tables row by row."
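
As a rough illustration of what consuming such settings looks like, the Python sketch below reads a small hand-written WordprocessingML settings fragment. The fragment is an assumption made for this example, not output from Word: it uses the main WordprocessingML namespace and two setting elements (hideSpellingErrors and defaultTabStop) that I believe are defined in the settings section of the spec, so check the names against the spec before relying on them.

```python
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hand-written fragment for illustration; a real settings part is much longer.
SETTINGS_XML = f"""<w:settings xmlns:w="{W_NS}">
  <w:hideSpellingErrors/>
  <w:defaultTabStop w:val="720"/>
</w:settings>"""


def read_settings(xml_text):
    """Map each top-level setting name to its w:val attribute (None if absent)."""
    result = {}
    for element in ET.fromstring(xml_text):
        name = element.tag.split("}", 1)[1]  # drop the "{namespace}" prefix
        result[name] = element.get(f"{{{W_NS}}}val")
    return result


settings = read_settings(SETTINGS_XML)
for name, value in settings.items():
    print(name, "=", value)
```

Because each of these settings is named and described in the standard, a consumer can look up what a given element means rather than reverse-engineering another application's behavior.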

The important thing for people to realize is that many of these settings are common to a number of word processing applications. KOffice, OpenOffice, and Microsoft Office all have a similar set of document settings. In ODF, they took the "application defined" approach for the settings, rather than either partially or fully documenting them (similar to the original approach with spreadsheet formulas). So, if you save out a blank text document or spreadsheet from OpenOffice you'll get the following set of extended properties, and not a single one of these appears in the ODF standard (many of them directly impact the interoperability of the document in terms of layout and display):

OpenOffice extensions to ODF standard using the http://openoffice.org/2004/office namespace

Name | Type | Primary Use | Interoperability | Section in Spec
ActiveTable | string | App Behavior | Application Defined | N/A
AddExternalLeading | boolean | View | Application Defined | N/A
AddFrameOffsets | boolean | View | Application Defined | N/A
AddParaSpacingToTableCells | boolean | View | Application Defined | N/A
AddParaTableSpacing | boolean | Layout | Application Defined | N/A
AddParaTableSpacingAtStart | boolean | Layout | Application Defined | N/A
AlignTabStopPosition | boolean | Layout | Application Defined | N/A
AllowPrintJobCancel | boolean | App Behavior | Application Defined | N/A
ApplyUserData | boolean | View | Application Defined | N/A
AutoCalculate | boolean | Display | Application Defined | N/A
CharacterCompressionType | short | Layout | Application Defined | N/A
ChartAutoUpdate | boolean | Display | Application Defined | N/A
ClipAsCharacterAnchoredWriterFlyFrames | boolean | ??? | Application Defined | N/A
ConsiderTextWrapOnObjPos | boolean | Layout | Application Defined | N/A
CurrentDatabaseCommand | string | Data | Application Defined | N/A
CurrentDatabaseCommandType | int | Data | Application Defined | N/A
CurrentDatabaseDataSource | string | Data | Application Defined | N/A
DoNotCaptureDrawObjsOnPage | boolean | Display | Application Defined | N/A
DoNotJustifyLinesWithManualBreak | boolean | View | Application Defined | N/A
DoNotResetParaAttrsForNumFont | boolean | Display | Application Defined | N/A
FieldAutoUpdate | boolean | Display | Application Defined | N/A
GridColor | long | Display | Application Defined | N/A
HasColumnRowHeaders | boolean | Display | Application Defined | N/A
HasSheetTabs | boolean | App Behavior | Application Defined | N/A
HorizontalScrollbarWidth | int | App Behavior | Application Defined | N/A
IgnoreFirstLineIndentInNumbering | boolean | Display | Application Defined | N/A
IgnoreTabsAndBlanksForLineCalculation | boolean | Display | Application Defined | N/A
InBrowseMode | boolean | App Behavior | Application Defined | N/A
IsKernAsianPunctuation | boolean | Display | Application Defined | N/A
IsLabelDocument | boolean | App Behavior | Application Defined | N/A
IsOutlineSymbolsSet | boolean | Display | Application Defined | N/A
IsRasterAxisSynchronized | boolean | Display | Application Defined | N/A
IsSelectedFrame | boolean | App Behavior | Application Defined | N/A
IsSnapToRaster | boolean | Display | Application Defined | N/A
LinkUpdateMode | short | App Behavior | Application Defined | N/A
LoadReadonly | boolean | App Behavior | Application Defined | N/A
OutlineLevelYieldsNumbering | boolean | Display | Application Defined | N/A
PageViewZoomValue | int | View | Application Defined | N/A
PrintAnnotationMode | short | Printing | Application Defined | N/A
PrintBlackFonts | boolean | Printing | Application Defined | N/A
PrintControls | boolean | Printing | Application Defined | N/A
PrintDrawings | boolean | Printing | Application Defined | N/A
PrintEmptyPages | boolean | Printing | Application Defined | N/A
PrinterIndependentLayout | string | Printing | Application Defined | N/A
PrinterName | string | Printing | Application Defined | N/A
PrinterSetup | base64Binary | Printing | Application Defined | N/A
PrintFaxName | string | Printing | Application Defined | N/A
PrintGraphics | boolean | Printing | Application Defined | N/A
PrintLeftPages | boolean | Printing | Application Defined | N/A
PrintPageBackground | boolean | Printing | Application Defined | N/A
PrintPaperFromSetup | boolean | Printing | Application Defined | N/A
PrintProspect | boolean | Printing | Application Defined | N/A
PrintReversed | boolean | Printing | Application Defined | N/A
PrintRightPages | boolean | Printing | Application Defined | N/A
PrintSingleJobs | boolean | Printing | Application Defined | N/A
PrintTables | boolean | Printing | Application Defined | N/A
RasterIsVisible | boolean | Display | Application Defined | N/A
RasterResolutionX | int | Display | Application Defined | N/A
RasterResolutionY | int | Display | Application Defined | N/A
RasterSubdivisionX | int | Display | Application Defined | N/A
RasterSubdivisionY | int | Display | Application Defined | N/A
RedlineProtectionKey | base64Binary | ??? | Application Defined | N/A
SaveGlobalDocumentLinks | boolean | Data | Application Defined | N/A
SaveVersionOnClose | boolean | Data | Application Defined | N/A