Open XML WordprocessingML Style Inheritance (Post #4)

When working with WordprocessingML, nearly all of the information that we need to render paragraphs, tables, and numbered items is contained in styles, which are stored in the WordprocessingML Style Definitions part.  Styles are somewhat complicated because they can inherit: one style can be based on another style.  The rendering of text that has the derived style then depends on the derived style, its base style, that base style's base style, and so on.  The Open XML specification refers to this list of styles that are derived from other styles as the 'style chain', which accurately describes the abstraction.

This is one in a series of posts on transforming Open XML WordprocessingML to XHtml.  You can find the complete list of posts here.

When determining the set of properties for rendering a paragraph or table, the first job is to 'roll up' all styles in the style chain, creating a single set of properties that we can apply to the paragraph or table.  This process of 'rolling up' styles is made somewhat more complicated because there are four variations of semantics that we must apply to elements in the rolling-up process.

However, it's not too complicated, and after carefully defining the semantics of 'rolling-up' styles in the style chain, we can write a small bit of generalized code to do this – probably less than 100 lines of code.
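
To make the 'style chain' concrete, here is a minimal Python sketch that follows w:basedOn references from a style back to its root.  It is an illustration only, not the code from this series: the function name style_chain and the three-style sample are made up for the example, and it assumes the styles part has already been parsed into an ElementTree element.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def style_chain(styles_root, style_id):
    """Return the style elements from style_id up to its root base style."""
    by_id = {s.get(f"{{{W}}}styleId"): s for s in styles_root.iter(f"{{{W}}}style")}
    chain, seen = [], set()
    current = by_id.get(style_id)
    # Stop at a style with no w:basedOn, an unknown id, or a cycle.
    while current is not None and current.get(f"{{{W}}}styleId") not in seen:
        seen.add(current.get(f"{{{W}}}styleId"))
        chain.append(current)
        based_on = current.find(f"{{{W}}}basedOn")
        parent_id = based_on.get(f"{{{W}}}val") if based_on is not None else None
        current = by_id.get(parent_id)
    return chain

# Hypothetical, stripped-down styles part used only for this example.
styles = ET.fromstring(f"""
<w:styles xmlns:w="{W}">
  <w:style w:type="paragraph" w:styleId="Normal"/>
  <w:style w:type="paragraph" w:styleId="SpaceBefore">
    <w:basedOn w:val="Normal"/>
  </w:style>
  <w:style w:type="paragraph" w:styleId="SpaceBeforeAndAfter">
    <w:basedOn w:val="SpaceBefore"/>
  </w:style>
</w:styles>
""")

chain_ids = [s.get(f"{{{W}}}styleId") for s in style_chain(styles, "SpaceBeforeAndAfter")]
print(chain_ids)  # ['SpaceBeforeAndAfter', 'SpaceBefore', 'Normal']
```

The chain comes back ordered from most derived to least derived, so rolling up would walk it in reverse, applying base styles first.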

You'll notice something about the semantics of style inheritance: when rolling up the styles, by far the most common operation is to replace any element in a base style with the corresponding element in a derived style.  In the code that I'm going to write to roll up styles, if the inheritance semantics are anything other than merging attributes or merging child elements, the default behavior will be element replacement.  This will make the code as small and robust as possible.

This post probably isn't of very much interest to most people, but to the folks who are interested, it will be very important.  I'm in the process of writing a fairly compact conversion of Open XML to XHtml, and needed to work out the exact behavior of style inheritance.  After working it out, it made good sense to blog it to make life easier for others who need to work with rendering issues of WordprocessingML.

Merging Attributes

In some cases, we must iterate through the attributes of a particular element, and if the element in the derived style has an attribute, we must apply that attribute, overriding the attribute in the base style.  In many cases, the base style may not define that particular attribute, and then we must simply add the attribute to the element in the rolled-up style.  For example, we may have a style, SpaceBefore, that puts space before the paragraph, but no space after:

<w:style w:type="paragraph"
         w:customStyle="1"
         w:styleId="SpaceBefore">
  <w:name w:val="SpaceBefore"/>
  <w:basedOn w:val="Normal"/>
  <w:qFormat/>
  <w:rsid w:val="00A670C6"/>
  <w:pPr>
    <w:spacing w:before="200"
               w:after="0"/>
  </w:pPr>
</w:style>

We may then have a derived style, SpaceBeforeAndAfter, based on SpaceBefore, which defines the w:spacing element with only a w:after attribute, like this:

<w:style w:type="paragraph"
         w:customStyle="1"
         w:styleId="SpaceBeforeAndAfter">
  <w:name w:val="SpaceBeforeAndAfter"/>
  <w:basedOn w:val="SpaceBefore"/>
  <w:qFormat/>
  <w:rsid w:val="00A670C6"/>
  <w:pPr>
    <w:spacing w:after="200"/>
  </w:pPr>
</w:style>

After 'rolling-up' the style chain, the style that we must apply to a paragraph that has the SpaceBeforeAndAfter style would look like this:

<w:style w:type="paragraph"
         w:customStyle="1"
         w:styleId="SpaceBeforeAndAfter">
  <w:name w:val="SpaceBeforeAndAfter"/>
  <w:basedOn w:val="SpaceBefore"/>
  <w:qFormat/>
  <w:rsid w:val="00A670C6"/>
  <w:pPr>
    <w:spacing w:before="200"
               w:after="200"/>
  </w:pPr>
</w:style>
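
The attribute merge above can be sketched in a few lines of Python.  This is only an illustration under simple assumptions: the helper name merge_attributes is invented for the example, and the two w:spacing elements are copied from the SpaceBefore and SpaceBeforeAndAfter styles.

```python
import copy
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def merge_attributes(base, derived):
    """Return a copy of base whose attributes are overridden or extended by derived."""
    merged = copy.deepcopy(base)
    merged.attrib.update(derived.attrib)  # derived wins; missing attributes are added
    return merged

base = ET.fromstring(f'<w:spacing xmlns:w="{W}" w:before="200" w:after="0"/>')
derived = ET.fromstring(f'<w:spacing xmlns:w="{W}" w:after="200"/>')

merged = merge_attributes(base, derived)
# Strip the namespace from the attribute names for readable output.
spacing = {name.split("}")[1]: value for name, value in merged.attrib.items()}
print(spacing)  # {'before': '200', 'after': '200'}
```

The w:before value survives from the base style, while w:after takes the derived style's value, matching the rolled-up markup above.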

Merging Child Elements

In some cases, we must merge child elements.  We must iterate through all child elements of an element in the derived style, and if the base style doesn't contain a particular element, we must add that element to the 'rolled-up' style.  If the base style does contain the element of interest, then we must either merge attributes or replace the child elements, based on the semantics defined for that child element.  The w:pPr and w:rPr elements are examples of elements that require this type of inheritance.

Consider the style NotIndented, which defines paragraph properties (w:pPr) as follows:

<w:style w:type="paragraph"
         w:customStyle="1"
         w:styleId="NotIndented">
  <w:name w:val="NotIndented"/>
  <w:basedOn w:val="Normal"/>
  <w:qFormat/>
  <w:rsid w:val="00082E03"/>
  <w:pPr>
    <w:spacing w:after="0"/>
  </w:pPr>
</w:style>

The following style, Indented, derives from NotIndented:

<w:style w:type="paragraph"
         w:customStyle="1"
         w:styleId="Indented">
  <w:name w:val="Indented"/>
  <w:basedOn w:val="NotIndented"/>
  <w:qFormat/>
  <w:rsid w:val="00082E03"/>
  <w:pPr>
    <w:ind w:left="720"/>
  </w:pPr>
</w:style>

After rolling up all styles in the style chain, the style that we should apply to text styled as Indented would be defined as follows:

<w:style w:type="paragraph"
         w:customStyle="1"
         w:styleId="Indented">
  <w:name w:val="Indented"/>
  <w:basedOn w:val="NotIndented"/>
  <w:qFormat/>
  <w:rsid w:val="00082E03"/>
  <w:pPr>
    <w:spacing w:after="0"/>
    <w:ind w:left="720"/>
  </w:pPr>
</w:style>

Note that both the w:spacing and w:ind elements require that their attributes be merged.  In most cases, per the table at the end of this post, elements are replaced (as opposed to having their attributes merged).
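
Here is a rough Python sketch of merging the child elements of w:pPr, using the NotIndented and Indented paragraph properties from above.  The names merge_ppr and PPR_MERGE_ATTRIBUTES are invented for the example, and the sketch is deliberately simplified: it knows only the two merge-attributes children of w:pPr, and it does not recurse into children such as w:pBdr, w:tabs, or w:rPr that themselves need their child elements merged.

```python
import copy
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Children of w:pPr whose attributes are merged; anything else is replaced here.
PPR_MERGE_ATTRIBUTES = {f"{{{W}}}spacing", f"{{{W}}}ind"}

def merge_ppr(base_ppr, derived_ppr):
    """Roll the derived w:pPr onto a copy of the base w:pPr."""
    merged = copy.deepcopy(base_ppr)
    for child in derived_ppr:
        existing = merged.find(child.tag)
        if existing is None:
            merged.append(copy.deepcopy(child))      # only in derived: add it
        elif child.tag in PPR_MERGE_ATTRIBUTES:
            existing.attrib.update(child.attrib)     # merge attributes
        else:
            index = list(merged).index(existing)     # default: replace wholesale
            merged[index] = copy.deepcopy(child)
    return merged

not_indented = ET.fromstring(f'<w:pPr xmlns:w="{W}"><w:spacing w:after="0"/></w:pPr>')
indented = ET.fromstring(f'<w:pPr xmlns:w="{W}"><w:ind w:left="720"/></w:pPr>')

rolled_up = merge_ppr(not_indented, indented)
child_names = [child.tag.split("}")[1] for child in rolled_up]
print(child_names)  # ['spacing', 'ind']
```

Falling through to replacement in the final branch mirrors the default behavior described earlier: only the elements that need special handling have to be listed.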

Replacing Elements

In some cases, while rolling up styles, we must replace an element and its attributes wholesale.  We don't iterate through the attributes, replacing them individually.  The w:top (Paragraph Border Above Identical Paragraphs) element has these semantics.  Consider the following style that defines a single line, with a size of 4 eighths of a point, and with a color of red (FF0000 in hex):

<w:style w:type="paragraph"
         w:customStyle="1"
         w:styleId="TopBorder1">
  <w:name w:val="TopBorder1"/>
  <w:basedOn w:val="Normal"/>
  <w:qFormat/>
  <w:rsid w:val="007850D3"/>
  <w:pPr>
    <w:pBdr>
      <w:top w:val="single"
             w:sz="4"
             w:space="1"
             w:color="FF0000"/>
    </w:pBdr>
  </w:pPr>
</w:style>

Here is a derived style, TopBorder2, which defines a top border with a size of 18 eighths of a point, and no color defined:

<w:style w:type="paragraph"
         w:customStyle="1"
         w:styleId="TopBorder2">
  <w:name w:val="TopBorder2"/>
  <w:basedOn w:val="TopBorder1"/>
  <w:qFormat/>
  <w:rsid w:val="00315108"/>
  <w:pPr>
    <w:pBdr>
      <w:top w:val="single"
             w:sz="18"
             w:space="1"/>
    </w:pBdr>
  </w:pPr>
</w:style>

After rolling up the styles in the style chain, the resulting style that should be applied to a paragraph styled TopBorder2 should be like this:

<w:style w:type="paragraph"
         w:customStyle="1"
         w:styleId="TopBorder2">
  <w:name w:val="TopBorder2"/>
  <w:basedOn w:val="TopBorder1"/>
  <w:qFormat/>
  <w:rsid w:val="00315108"/>
  <w:pPr>
    <w:pBdr>
      <w:top w:val="single"
             w:sz="18"
             w:space="1"/>
    </w:pBdr>
  </w:pPr>
</w:style>

Notice that the w:color attribute was not inherited from TopBorder1.  The w:top element, along with its attributes, was replaced wholesale.
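
For contrast with attribute merging, the following Python sketch shows wholesale replacement on the w:top border from TopBorder1 and TopBorder2.  The helper name replace_child is hypothetical and the code is an illustration rather than the implementation from this series.

```python
import copy
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def replace_child(base_parent, derived_child):
    """Return a copy of base_parent with the same-named child swapped for derived_child."""
    result = copy.deepcopy(base_parent)
    for i, child in enumerate(result):
        if child.tag == derived_child.tag:
            result[i] = copy.deepcopy(derived_child)  # base attributes are discarded
            break
    else:
        result.append(copy.deepcopy(derived_child))   # base had no such child
    return result

base_pbdr = ET.fromstring(
    f'<w:pBdr xmlns:w="{W}">'
    '<w:top w:val="single" w:sz="4" w:space="1" w:color="FF0000"/>'
    '</w:pBdr>'
)
derived_top = ET.fromstring(f'<w:top xmlns:w="{W}" w:val="single" w:sz="18" w:space="1"/>')

rolled_up = replace_child(base_pbdr, derived_top)
top_attrs = {k.split("}")[1]: v for k, v in rolled_up[0].attrib.items()}
print(top_attrs)  # {'val': 'single', 'sz': '18', 'space': '1'}
```

Because the whole element is swapped, the red w:color from the base style does not appear in the result.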

(Update December 13, 2009 - I've written a bit of code to show how to implement XML inheritance.)

Style Conditional Table Formatting Properties

There is one special case where merging semantics are slightly more complicated.  Table styles have a very powerful feature called conditional table formatting.  This feature allows us to specify a special set of formatting properties for the top row, the first column, the bottom row, banded columns, banded rows, cells at the top left, top right, etc.  Conditional table formatting is defined in the w:tblStylePr element.  The following table style (markup has been simplified) contains a w:tblStylePr element for the first row, and a w:tblStylePr element for the first column:

<w:style w:type="table"
         w:customStyle="1"
         w:styleId="LightListRedHeader">
  <w:name w:val="Light List Red Header"/>
  <w:basedOn w:val="LightList"/>
  <w:tblStylePr w:type="firstRow">
    <w:pPr>
      <w:spacing w:before="0"
                 w:after="0"
                 w:line="240"
                 w:lineRule="auto"/>
    </w:pPr>
    <w:rPr>
      <w:b/>
      <w:bCs/>
      <w:color w:val="FFFFFF"
               w:themeColor="background1"/>
    </w:rPr>
    <w:tblPr/>
    <w:tcPr>
      <w:shd w:val="clear"
             w:color="auto"
             w:fill="FF0000"/>
    </w:tcPr>
  </w:tblStylePr>
  <w:tblStylePr w:type="firstCol">
    <w:rPr>
      <w:b/>
      <w:bCs/>
    </w:rPr>
  </w:tblStylePr>
</w:style>

A table style definition most often will have several w:tblStylePr elements.  We can't simply merge child elements for the w:tblStylePr element.  We must first match the w:type attribute, and then merge child elements.
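
The match-then-merge step can be sketched in Python as follows.  Everything here is an assumption made for illustration: the function names, the contents of the base LightList style (which this post does not show), and the simplification that every property inside w:pPr, w:rPr, w:tcPr and so on is replaced rather than attribute-merged.

```python
import copy
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def overlay(target, source):
    """Add each child of source to target, replacing any same-named child."""
    for child in source:
        existing = target.find(child.tag)
        if existing is not None:
            target.remove(existing)
        target.append(copy.deepcopy(child))

def merge_conditional_formatting(base_style, derived_style):
    """Return {w:type value: merged w:tblStylePr} for a two-style chain."""
    merged = {}
    for style in (base_style, derived_style):       # base first, derived last
        for cond in style.findall(f"{{{W}}}tblStylePr"):
            key = cond.get(f"{{{W}}}type")          # match on w:type first
            if key not in merged:
                merged[key] = copy.deepcopy(cond)
                continue
            for container in cond:                  # w:pPr, w:rPr, w:tcPr, ...
                existing = merged[key].find(container.tag)
                if existing is None:
                    merged[key].append(copy.deepcopy(container))
                else:
                    overlay(existing, container)
    return merged

# Invented base style; the real LightList definition is not shown in this post.
base = ET.fromstring(f"""
<w:style xmlns:w="{W}" w:type="table" w:styleId="LightList">
  <w:tblStylePr w:type="firstRow">
    <w:rPr><w:b/></w:rPr>
    <w:tcPr><w:shd w:val="clear" w:color="auto" w:fill="000000"/></w:tcPr>
  </w:tblStylePr>
  <w:tblStylePr w:type="lastRow">
    <w:rPr><w:b/></w:rPr>
  </w:tblStylePr>
</w:style>
""")
derived = ET.fromstring(f"""
<w:style xmlns:w="{W}" w:type="table" w:styleId="LightListRedHeader">
  <w:basedOn w:val="LightList"/>
  <w:tblStylePr w:type="firstRow">
    <w:tcPr><w:shd w:val="clear" w:color="auto" w:fill="FF0000"/></w:tcPr>
  </w:tblStylePr>
  <w:tblStylePr w:type="firstCol">
    <w:rPr><w:b/><w:bCs/></w:rPr>
  </w:tblStylePr>
</w:style>
""")

merged = merge_conditional_formatting(base, derived)
first_row_fill = merged["firstRow"].find(f"{{{W}}}tcPr/{{{W}}}shd").get(f"{{{W}}}fill")
print(sorted(merged), first_row_fill)  # ['firstCol', 'firstRow', 'lastRow'] FF0000
```

Note that overlay appends a replaced child at the end, so child order is not preserved; that is fine for looking up properties but would matter if the merged markup had to be schema-valid.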

Summary of Style Inheritance Semantics

The table at the end of this post summarizes the semantics that we must apply when 'rolling-up' styles.

A fair number of elements in the style hierarchy exist solely for the user interface or other purposes.  We are only interested in rolling up those elements that impact presentation, so I'm eliminating elements that don't apply.  A few elements (name and basedOn) are used in the rolling-up process, so I am listing those.

Note that this is only part of the story around putting together the style information for a cell in a table.  After rolling up styles in a style chain into a single set of properties for a table, we must also roll up character formatting information, which involves rolling up run formatting information for the table, for paragraph styles, and for run styles.  Before rolling any of this up, we need to take the global run properties into consideration.  And when rolling up this information over the hierarchy (table, paragraph, run), we need to handle something called toggle properties.  Finally, where appropriate, we must retrieve information from the theme of the document.  Stay tuned…


Element                   Ecma376     Semantics
------------------------  ----------  ------------------------------------------------
style                     2.7.3.17    Merge child elements
  name                    2.7.3.9     Used when assembling inheritance information
  basedOn                 2.7.3.3     Used when assembling inheritance information
  pPr                     2.7.7.2     Merge child elements
  rPr                     2.7.8.1     Merge child elements
  tblPr                   2.7.5.4     Merge child elements
  tblStylePr              2.7.5.6     Merge child elements (Conditional Table Formatting Properties).  See the note about this element above.
  tcPr                    2.7.5.9     Merge child elements
  trPr                    2.7.5.11    Merge child elements

pPr                       17.7.8.2
  adjustRightInd          2.3.1.1     Replace element
  autoSpaceDE             2.3.1.2     Replace element
  autoSpaceDN             2.3.1.3     Replace element
  bidi                    2.3.1.6     Replace element
  cnfStyle                2.3.1.8     Replace element
  contextualSpacing       2.3.1.9     Replace element
  framePr                 2.3.1.11    Replace element
  ind                     2.3.1.12    Merge attributes
  jc                      2.3.1.13    Replace element
  keepLines               2.3.1.14    Replace element
  keepNext                2.3.1.15    Replace element
  kinsoku                 2.3.1.16    Replace element
  mirrorIndents           2.3.1.18    Replace element
  numPr                   2.3.1.19    Replace element
  outlineLvl              2.3.1.20    Replace element
  overflowPunct           2.3.1.21    Replace element
  pageBreakBefore         2.3.1.23    Replace element
  pBdr                    2.3.1.24    Merge child elements
  rPr                     2.3.1.29    Merge child elements
  shd                     2.3.1.31    Replace element
  snapToGrid              2.3.1.32    Replace element
  spacing                 2.3.1.33    Merge attributes
  suppressAutoHyphens     2.3.1.34    Replace element
  suppressLineNumbers     2.3.1.35    Replace element
  suppressOverlap         2.3.1.36    Replace element
  tabs                    2.3.1.38    Merge child elements
  textAlignment           2.3.1.39    Replace element
  textboxTightWrap        2.3.1.40    Replace element
  textDirection           2.3.1.41    Replace element
  topLinePunct            2.3.1.43    Replace element
  widowControl            2.3.1.44    Replace element
  wordWrap                2.3.1.45    Replace element

rPr                       2.7.8.1
  b                       2.3.2.1     Replace element
  bCs                     2.3.2.2     Replace element
  bdr                     2.3.2.3     Replace element
  caps                    2.3.2.4     Replace element
  color                   2.3.2.5     Replace element
  cs                      2.3.2.6     Replace element
  dstrike                 2.3.2.7     Replace element
  eastAsianLayout         2.3.2.8     Replace element
  effect                  2.3.2.9     Replace element
  em                      2.3.2.10    Replace element
  emboss                  2.3.2.11    Replace element
  fitText                 2.3.2.12    Replace element
  highlight               2.3.2.13    Replace element
  i                       2.3.2.14    Replace element
  iCs                     2.3.2.15    Replace element
  imprint                 2.3.2.16    Replace element
  kern                    2.3.2.17    Replace element
  lang                    2.3.2.18    Merge attributes
  oMath                   2.3.2.20    Replace element
  outline                 2.3.2.21    Replace element
  position                2.3.2.22    Replace element
  rFonts                  2.3.2.24    Replace element
  rtl                     2.3.2.28    Replace element
  shadow                  2.3.2.29    Replace element
  shd                     2.3.2.30    Replace element
  smallCaps               2.3.2.31    Replace element
  snapToGrid              2.3.2.32    Replace element
  spacing                 2.3.2.33    Replace element
  specVanish              2.3.2.34    Replace element
  strike                  2.3.2.35    Replace element
  sz                      2.3.2.36    Replace element
  szCs                    2.3.2.37    Replace element
  u                       2.3.2.38    Replace element
  vanish                  2.3.2.39    Replace element
  vertAlign               2.3.2.40    Replace element
  w                       2.3.2.41    Replace element
  webHidden               2.3.2.42    Replace element

tblPr
  bidiVisual              2.4.1       Replace element
  jc                      2.4.23      Replace element
  shd                     2.4.35      Replace element
  tblBorders              2.4.38      Merge child elements
  tblCellMar              2.4.39      Merge child elements
  tblCellSpacing          2.4.43      Replace element
  tblInd                  2.4.48      Replace element
  tblLayout               2.4.49      Replace element
  tblLook                 2.4.51      Replace element
  tblOverlap              2.4.53      Replace element
  tblpPr                  2.4.54      Replace element
  tblStyleColBandSize     2.7.5.5     Replace element
  tblStyleRowBandSize     2.7.5.7     Replace element
  tblW                    2.4.61      Replace element

tblStylePr
  pPr                     2.7.5.1     Merge child elements
  rPr                     2.7.5.2     Merge child elements
  tblPr                   2.7.5.3     Merge child elements
  tcPr                    2.7.5.9     Merge child elements
  trPr                    2.7.5.10    Merge child elements

tcPr
  hideMark                2.4.15      Replace element
  noWrap                  2.4.28      Replace element
  shd                     2.4.33      Replace element
  tcBorders               2.4.63      Merge child elements
  tcFitText               2.4.64      Replace element
  tcMar                   2.4.65      Merge child elements
  tcW                     2.4.68      Replace element
  textDirection           2.4.69      Replace element
  vAlign                  2.4.80      Replace element

trPr
  cantSplit               2.4.6       Replace element
  gridAfter               2.4.10      Replace element
  gridBefore              2.4.11      Replace element
  hidden                  2.4.14      Replace element
  jc                      2.4.22      Replace element
  tblCellSpacing          2.4.42      Replace element
  tblHeader               2.4.46      Replace element
  trHeight                2.4.77      Replace element
  wAfter                  2.4.82      Replace element
  wBefore                 2.4.83      Replace element

Comments
  • Eric, great article. I recently had this exact need - take a base element and an inherited element and merge them together. Your article was the only one available in every search I did that covered the exact issue I need to address. I was a bit saddened that you didn't have code examples, but I'm sure that's forthcoming. Great job though!

  • Hi Otaku, I've put together a code sample that shows how to do this.  Take a look at Implementing 'Inheritance' in XML.

    -Eric

  • Do you know if there is documentation about how this pipeline of inheritance happens? Or is it just trying to connect the dots in the Ecma docs by reading and logging on your own in a sheet?

  • Hi Todd, It's all in the spec.  Sections 17, 17.1, and 17.2 of IS29500 are of particular importance.  It is possible to infer the semantics of inheritance from the element type.  Some aspects could also be considered implementation/representation details, which aren't necessarily within the scope of the spec.  Having worked this out, I wanted to make it easier for others who also need to process styling information in the spec.  Reading the spec and presenting that information in a 'narrative' form is one of the most enjoyable aspects of my job.

    -Eric

  • Thanks Eric. Okay, that's kind of what I was thinking: that I would have to go through the ISO/ECMA docs and infer the pipeline of inheritance. Appreciate you taking the time to do part of this for us! Now how about Excel and PowerPoint? :) Just kidding...

  • Hi Todd, I *believe* that in this series of blog posts, I've covered all of the semantics of style inheritance (with the exception of some aspects that are super-easy to get from the spec).  I'm just about to write the code to assemble actual styling information for a specific paragraph/table/cell, so if I've missed some aspect, it will become apparent soon enough.  But if you find some aspect of styling information that isn't clear in these blog posts, feel free to email me / leave questions as comments.  As time permits, I hopefully will write an MSDN article that summarizes the assembly of styling information.

    As for Excel, it's in my sights...

    -Eric

  • Please let me know how to merge the borders of the w:tblStylePr. I have a lot of issues with this portion. Please help me.....
