This is one in a series of posts on transforming Open XML WordprocessingML to XHtml. You can find the complete list of posts here.
When we want to render a paragraph and its runs inside a cell, we need to assemble the paragraph and run properties from a number of places. In a previous post, I explained how style inheritance works, and how you 'roll up' styles from the style chain. That is only part of the story. This post details how we assemble styling information from table styles, paragraph styles, character styles, directly applied formatting, and the global (document default) properties.
In the process of assembling paragraph and run properties, we also need to correctly handle something called 'Toggle Properties'.
Note: next, I'm going to tackle the semantics of numbering styles, and then I believe I'm ready to start coding in earnest. I've made a decision in this project to first implement a transform to XHtml without styling information. The resulting XHtml will contain just the content of the document. I decided to do this because it is useful in its own right, and we need it for another project. Having code to extract the contents of a document in the most succinct form possible has a lot of uses. Of course, this XHtml still can be rendered in a browser, and in some cases, it will be useful to do this. Then, after publishing that code, I'll start implementing styling behavior.
A very powerful and cool feature of Word is that when you are applying a table style you pick and choose which aspects of the style you want to apply. For consistency, you can apply the same style to all tables in your document, but some tables may have a total row, and other tables may not:
You can apply the same style to both, then pick and choose which aspects of the table style to apply. When you are applying a table style, this is what the Ribbon in Word 2007 looks like:
You can see the range of check boxes in the Table Style Options section of the Ribbon, and how you can pick and choose which aspects of the style you want to apply. This has ramifications for us when we are assembling styling information for a cell. We know the table style for the table, and we also know the values of those check boxes, so we have to apply the various aspects of the table style per the user's preferences from those check boxes.
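In the markup, the values of those check boxes are stored on the table itself. As a minimal sketch, assuming the Ecma-376 version 1 encoding in which the w:tblLook element carries a hexadecimal bitmask in its w:val attribute, the following Python decodes that bitmask into named options. The function name, the dictionary keys, and the bit values as listed here are my own reading for illustration; check them against the specification before relying on them.

```python
# Bit values for w:tblLook/@w:val as I understand them from Ecma-376 version 1.
# These are an assumption for illustration, not copied from this post.
TBL_LOOK_BITS = {
    "first_row": 0x0020,          # apply first row (header row) formatting
    "last_row": 0x0040,           # apply last row (total row) formatting
    "first_column": 0x0080,       # apply first column formatting
    "last_column": 0x0100,        # apply last column formatting
    "no_row_banding": 0x0200,     # do NOT apply banded row formatting
    "no_column_banding": 0x0400,  # do NOT apply banded column formatting
}


def decode_tbl_look(val: str) -> dict:
    """Turn a hex string such as '04A0' into a dict of booleans."""
    mask = int(val, 16)
    return {name: bool(mask & bit) for name, bit in TBL_LOOK_BITS.items()}


options = decode_tbl_look("04A0")
print(options["first_row"], options["last_row"])  # True False
```

Note that the two banding bits are negative flags: the Banded Rows check box corresponds to the no_row_banding bit being clear.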
Before we dive into table styles in depth, we need to cover toggle properties, which play a part in how table styles work.
Toggle properties are a set of run properties that have a little twist in their semantics when we assemble formatting information in preparation for rendering paragraphs. The w:b element (which styles a run as bold) is a good example of a toggle property.
Here's how toggle properties work:
Toggle properties only have their toggle behavior when associated with table styles, paragraph styles, and character styles. If a run has been made bold per the table style, and the user applies a paragraph style that also has the w:b element, the net result is that the originally bold text is no longer bold. And if some portion of that paragraph has a bold character style applied to it, that portion is made bold again.
This makes sense. The table style designer designated that a cell be bold. The paragraph style designer had the intention of making text in the paragraph stand out. But the text is already bold, so that intent won't be satisfied, so to make it stand out, we reverse the boldness of the text. The same reasoning also applies to a character style that has the w:b element.
It's just these three types of styles (table, paragraph, and character) that we need to process in this fashion. If the user subsequently selects that text and presses the bold button on the toolbar, which sets the property on the run itself (not on a style), we honor his or her intention, regardless of the boldness of the table, paragraph, or character styles. Also, the global run properties completely override the toggling behavior (but not directly applied formatting). If the w:b element is set in the global run properties, effectively making the entire document bold, the entire document remains bold unless formatting is set directly on a run.
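The rules above can be sketched in a few lines of Python. This is only an illustration of the semantics as described here, not code from this project: the function name and its parameters are hypothetical, and each argument is assumed to be the already rolled-up value for that level (True, False, or None when the property is not specified).

```python
def resolve_toggle(doc_default, table_style, paragraph_style,
                   character_style, direct):
    """Return the effective on/off state of one toggle property (e.g. w:b)."""
    # Directly applied formatting on the run always wins.
    if direct is not None:
        return direct
    # The global run properties override the toggling behavior entirely.
    if doc_default:
        return True
    # Otherwise each style level that turns the property on flips the result.
    result = False
    for level in (table_style, paragraph_style, character_style):
        if level:
            result = not result
    return result


# Table style bold + paragraph style bold: the text is no longer bold.
print(resolve_toggle(None, True, True, None, None))   # False
# Add a bold character style: bold again.
print(resolve_toggle(None, True, True, True, None))   # True
# Bold in the global run properties: stays bold despite the styles.
print(resolve_toggle(True, True, True, None, None))   # True
# Direct formatting turns bold off regardless of everything else.
print(resolve_toggle(True, True, None, None, False))  # False
```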
The set of toggle properties is: w:b (Bold), w:bCs (Complex Script Bold), w:caps (Display All Characters as Capital Letters), w:emboss (Embossing), w:i (Italics), w:iCs (Complex Script Italics), w:imprint (Imprinting), w:outline (Display Character Outline), w:shadow (Shadow), w:smallCaps (Small Caps), w:strike (Single Strikethrough), and w:vanish (Hidden Text). These are all defined among the run properties in Ecma-376 version 1.
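To make the list concrete, here is a small Python sketch that picks the toggle properties out of a w:rPr element. The element local names (b, bCs, caps, and so on) are the WordprocessingML names for the properties listed above; the function name is hypothetical, and the handling of w:val assumes that a missing attribute means "on" and that "0", "false", and "off" mean "off".

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Local names of the twelve toggle properties, in the order listed above.
TOGGLE_PROPERTIES = (
    "b", "bCs", "caps", "emboss", "i", "iCs",
    "imprint", "outline", "shadow", "smallCaps", "strike", "vanish",
)


def read_toggles(rpr):
    """Map each toggle property present in a w:rPr element to True or False."""
    toggles = {}
    for name in TOGGLE_PROPERTIES:
        element = rpr.find(f"{{{W}}}{name}")
        if element is None:
            continue  # not specified at this level
        val = element.get(f"{{{W}}}val")
        # No w:val attribute means the property is on.
        toggles[name] = val not in ("0", "false", "off")
    return toggles


rpr = ET.fromstring(
    f'<w:rPr xmlns:w="{W}"><w:b/><w:i w:val="0"/>'
    f'<w:color w:val="FF0000"/></w:rPr>'
)
print(read_toggles(rpr))  # {'b': True, 'i': False}
```

Properties that are not toggle properties, such as w:color in the example, are simply ignored by this function and would be rolled up in the ordinary way.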
Due to the richness of table styles, as shown above, table, row, cell, paragraph and run properties can be stored in multiple places in a table style. Determining the properties for a table style involves rolling up those styles, in the exact same fashion as I described for rolling up style properties in the previous blog post. While rolling up that information, we need to either merge attributes, merge child elements, or replace elements.
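As a rough illustration of those three strategies, the following Python sketch rolls one property element up onto another. It is a simplification under stated assumptions: the function name is hypothetical, and the two sets that decide which elements merge attributes and which merge child elements are examples I picked for the sketch, not a complete or authoritative list. It also ignores toggle properties, which need the special handling described earlier.

```python
import copy
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(name):
    """Build a namespace-qualified tag name for ElementTree."""
    return f"{{{W}}}{name}"


# Illustrative assumptions only: which elements use which strategy.
MERGE_ATTRIBUTES = {w("rFonts"), w("spacing"), w("ind")}
MERGE_CHILDREN = {w("tcBorders"), w("tblBorders")}


def roll_up(lower, higher):
    """Return a new element: `lower` with the children of `higher` applied."""
    result = copy.deepcopy(lower)
    for child in higher:
        existing = result.find(child.tag)
        if existing is None:
            result.append(copy.deepcopy(child))       # nothing to merge with
        elif child.tag in MERGE_ATTRIBUTES:
            existing.attrib.update(child.attrib)      # merge attributes
        else:
            index = list(result).index(existing)
            if child.tag in MERGE_CHILDREN:
                result[index] = roll_up(existing, child)  # merge children
            else:
                result[index] = copy.deepcopy(child)      # replace element
    return result


lower = ET.fromstring(
    f'<w:rPr xmlns:w="{W}"><w:rFonts w:ascii="Arial" w:hAnsi="Arial"/>'
    f'<w:sz w:val="20"/></w:rPr>'
)
higher = ET.fromstring(
    f'<w:rPr xmlns:w="{W}"><w:rFonts w:ascii="Calibri"/>'
    f'<w:sz w:val="28"/></w:rPr>'
)
merged = roll_up(lower, higher)
print(merged.find(w("rFonts")).attrib)  # ascii is Calibri, hAnsi stays Arial
print(merged.find(w("sz")).get(w("val")))  # 28
```

Because the function copies its inputs, the same rolled-up style element can be reused for many cells without being modified.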
Shading of the table cells comes from the table cell properties (w:tcPr). Formatting of the text in table cells comes from the paragraph properties (w:pPr) and run properties (w:rPr). Other properties needed for rendering come from the table properties (w:tblPr) and table row properties (w:trPr). The process for assembling the correct table styling information for a cell is the same for each of these. In the following section, I describe the process of assembling styling information for runs in a table per the table style, but the same approach applies to the other aspects of a table style (table, row, cell, and paragraph properties). When I write the code, of course, I'm going to write only one set of methods to assemble styling information, and parameterize those methods so that I can use them for all aspects of conditional table formatting properties.
To determine the run properties from a style for a cell in a table, we do the following, in order:
Note that this only gets the run properties for a table style. Once we have rolled up the run properties for the table style, we assemble the following, in order:
We then roll these three up, implementing the toggling behavior for toggle properties that I described earlier. Once we have done this process, we assemble the following, in order:
We roll these up, and we finally have the run properties that we can apply to the run.
When we're assembling the paragraph properties for a table style, we follow a similar procedure. Once we have that rolled-up property, we need to assemble a new list of paragraph properties, in the following order:
We then roll up these three sets of paragraph properties, and we have the paragraph properties that we can apply to the paragraph in the cell.
This seems harder than it actually is. While this is a bit involved, this is what enables the very cool table styling capabilities that we see in Word. I just have to say, this is one of those cases where I really appreciate LINQ to XML. I personally really would not want to write old-style imperative code to do this.
One more point about this: in an earlier post, I mentioned an approach of adding paragraph and run properties, with ordering applied, to every paragraph and run in the document. I still think that this approach will work best. It means that I can assemble the style paragraph properties for a cell, then add them to every paragraph in the cell. Likewise, I can assemble the style run properties for a cell, then add them to every run. This means that I'll only need to compute the style paragraph properties for a particular cell once, not for every paragraph in the cell. The same holds true for runs.
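A minimal sketch of that compute-once idea, in Python: the two compute callbacks stand in for the roll-up process described in this post and are hypothetical, as is the function name. For simplicity the sketch records the results in dictionaries keyed by element rather than writing them into the XML tree.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(name):
    """Build a namespace-qualified tag name for ElementTree."""
    return f"{{{W}}}{name}"


def annotate_cell(cell, compute_ppr, compute_rpr):
    """Compute style properties once per cell, then share them.

    compute_ppr and compute_rpr are hypothetical callbacks that take the
    w:tc element and return the rolled-up style properties for that cell.
    """
    style_ppr = compute_ppr(cell)  # computed once, not once per paragraph
    style_rpr = compute_rpr(cell)  # computed once, not once per run
    paragraph_props, run_props = {}, {}
    # Only the cell's own paragraphs; nested tables have their own cells.
    for paragraph in cell.findall(w("p")):
        paragraph_props[paragraph] = style_ppr
        # iter() also finds runs nested in containers such as hyperlinks.
        for run in paragraph.iter(w("r")):
            run_props[run] = style_rpr
    return paragraph_props, run_props


cell = ET.fromstring(
    f'<w:tc xmlns:w="{W}">'
    f'<w:p><w:r><w:t>a</w:t></w:r><w:r><w:t>b</w:t></w:r></w:p>'
    f'<w:p><w:r><w:t>c</w:t></w:r></w:p>'
    f'</w:tc>'
)
paragraph_props, run_props = annotate_cell(
    cell,
    lambda tc: ET.Element(w("pPr")),
    lambda tc: ET.Element(w("rPr")),
)
print(len(paragraph_props), len(run_props))  # 2 3
```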