This blog is inactive. New blog: EricWhite.com/blog
This is one in a series of posts on transforming Open XML WordprocessingML to XHtml. You can find the complete list of posts here.
Some XML vocabularies implement a powerful XML pattern that is analogous to inheritance in programming language type systems. Open XML WordprocessingML has these semantics around styles. When you define a new style, you can base this style on another style, and if the new style doesn't explicitly define some aspect of the style, the new style 'inherits' this aspect from its base style. This post presents some code that uses one approach for implementing inheritance semantics.
Consider the following XML:
    <Root>
      <!-- Style: Merge child elements -->
      <Style StyleId="RootStyle">
        <!-- Font and descendants: Replace -->
        <Font>
          <Family Val="Courier"/>
          <Size Val="12"/>
        </Font>
        <!-- VisualProps: Merge attributes -->
        <VisualProps ForeColor="Black"/>
        <!-- Positioning: Merge child elements -->
        <Positioning>
          <SpaceAfter Val="12"/>
        </Positioning>
      </Style>
      <Style StyleId="CompanyThemeHeading1" BaseStyleId="RootStyle">
        <Font>
          <Family Val="Cambria"/>
          <Size Val="14"/>
        </Font>
        <VisualProps ForeColor="Blue" BackColor="OffWhite"/>
        <Positioning>
          <SpaceAfter Val="10"/>
        </Positioning>
      </Style>
      <Style StyleId="Heading1" BaseStyleId="CompanyThemeHeading1">
        <VisualProps Bold="true"/>
        <Positioning>
          <Indent Val="10"/>
        </Positioning>
      </Style>
    </Root>
The Heading1 style is based on the CompanyThemeHeading1 style, which is itself based on RootStyle. The programming task is to 'roll up' these styles, and assemble a new Style element that contains all inherited child elements as appropriate. Note that the order of the Style elements in the XML document is not significant. The method to 'roll up' these styles should work properly regardless of the document order of the Style elements.
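In the case of Open XML styles, there are three varieties of inheritance semantics, each of which is labeled in the comments of the sample XML above: replace (Font and its descendants), merge attributes (VisualProps), and merge child elements (Style and Positioning).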
We should note that this doesn't cover all possibilities of inheritance semantics. Other possibilities:
For a detailed examination of the Open XML inheritance semantics, see Open XML WordprocessingML Style Inheritance.
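Of the semantics labeled in the sample XML, 'merge attributes' is the easiest to show in isolation. The following is a minimal sketch, not the code developed in the rest of this post: the class name MergeAttributesSketch and the method name MergeAttributes are hypothetical, and it assumes that an attribute on the derived element always wins over an attribute of the same name on the base element.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class MergeAttributesSketch
{
    // Hypothetical helper for illustration only.
    // Returns a new element with every attribute of the derived element,
    // plus any attribute of the base element that the derived element lacks.
    public static XElement MergeAttributes(XElement baseElement, XElement derivedElement)
    {
        return new XElement(derivedElement.Name,
            derivedElement.Attributes(),
            baseElement.Attributes()
                .Where(b => derivedElement.Attribute(b.Name) == null));
    }
}
```

Called with the VisualProps element from RootStyle as the base and the one from CompanyThemeHeading1 as the derived element, this should produce a VisualProps element with ForeColor="Blue" and BackColor="OffWhite". LINQ to XML copies attributes that already belong to another element, so neither input is modified.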
When implementing this in code, the first task is to write an iterator that returns a collection of styles. We pass the StyleId of the most derived style, and the method follows the style chain, yielding each style in turn, from the most derived style back to its root base style.
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml.Linq;

    class Program
    {
        // Yields the named style, then each style it is based on, most derived first.
        static IEnumerable<XElement> StyleChainReverseOrder(XElement styles, string styleId)
        {
            string current = styleId;
            while (current != null)
            {
                XElement style = styles.Elements("Style")
                    .FirstOrDefault(s => (string)s.Attribute("StyleId") == current);
                // Stop if the chain refers to a style that is not defined.
                if (style == null)
                    yield break;
                yield return style;
                current = (string)style.Attribute("BaseStyleId");
            }
        }

        static void Main(string[] args)
        {
            XElement styles = XElement.Load("Styles.xml");
            foreach (var style in StyleChainReverseOrder(styles, "Heading1"))
            {
                Console.WriteLine(style);
                Console.WriteLine();
            }
        }
    }
If you are not familiar with writing iterators, see The Yield Contextual Keyword.
When you run this example, you see:
    <Style StyleId="Heading1" BaseStyleId="CompanyThemeHeading1">
      <VisualProps Bold="true" />
      <Positioning>
        <Indent Val="10" />
      </Positioning>
    </Style>

    <Style StyleId="CompanyThemeHeading1" BaseStyleId="RootStyle">
      <Font>
        <Family Val="Cambria" />
        <Size Val="14" />
      </Font>
      <VisualProps ForeColor="Blue" BackColor="OffWhite" />
      <Positioning>
        <SpaceAfter Val="10" />
      </Positioning>
    </Style>

    <Style StyleId="RootStyle">
      <!-- Font and descendants: Replace -->
      <Font>
        <Family Val="Courier" />
        <Size Val="12" />
      </Font>
      <!-- VisualProps: Merge attributes -->
      <VisualProps ForeColor="Black" />
      <!-- Positioning: Merge child elements -->
      <Positioning>
        <SpaceAfter Val="12" />
      </Positioning>
    </Style>
We've retrieved the collection of relevant styles (Heading1, CompanyThemeHeading1, and RootStyle), but they are ordered from most derived to least derived, the reverse of the order in which we want to process them. That is easy enough to fix with a method that returns the collection in reverse:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml.Linq;

    class Program
    {
        // Yields the named style, then each style it is based on, most derived first.
        static IEnumerable<XElement> StyleChainReverseOrder(XElement styles, string styleId)
        {
            string current = styleId;
            while (current != null)
            {
                XElement style = styles.Elements("Style")
                    .FirstOrDefault(s => (string)s.Attribute("StyleId") == current);
                // Stop if the chain refers to a style that is not defined.
                if (style == null)
                    yield break;
                yield return style;
                current = (string)style.Attribute("BaseStyleId");
            }
        }

        // Returns the style chain with the base-most style first.
        static IEnumerable<XElement> StyleChain(XElement styles, string styleId)
        {
            return StyleChainReverseOrder(styles, styleId).Reverse();
        }

        static void Main(string[] args)
        {
            XElement styles = XElement.Load("Styles.xml");
            foreach (var style in StyleChain(styles, "Heading1"))
            {
                Console.WriteLine(style);
                Console.WriteLine();
            }
        }
    }
Now we can write a method to merge two styles. The Style element has 'merge child semantics', so we'll write a method to do that:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml.Linq;

    class Program
    {
        // Yields the named style, then each style it is based on, most derived first.
        static IEnumerable<XElement> StyleChainReverseOrder(XElement styles, string styleId)
        {
            string current = styleId;
            while (current != null)
            {
                XElement style = styles.Elements("Style")
                    .FirstOrDefault(s => (string)s.Attribute("StyleId") == current);
                // Stop if the chain refers to a style that is not defined.
                if (style == null)
                    yield break;
                yield return style;
                current = (string)style.Attribute("BaseStyleId");
            }
        }

        // Returns the style chain with the base-most style first.
        static IEnumerable<XElement> StyleChain(XElement styles, string styleId)
        {
            return StyleChainReverseOrder(styles, styleId).Reverse();
        }

        // mergedElement holds what has been inherited so far; element is the more
        // derived element whose content takes precedence.
        static XElement MergeChildElements(XElement mergedElement, XElement element)
        {
            // If, when in the process of merging, the source element doesn't have a
            // corresponding element in the merged element, then include the source element
            // in the merged element.
            if (mergedElement == null)
                return element;

            XElement newMergedElement = new XElement(element.Name,
                element.Attributes(),
                element.Elements().Select(e =>
                {
                    // Replace
                    if (e.Name == "Font"
                        // || e.Name == "OtherElementWithReplaceSemantics"
                        )
                        return e;

                    // Merge attributes
                    if (e.Name == "VisualProps")
                        return new XElement(e.Name,
                            e.Attributes(),
                            mergedElement.Elements(e.Name).Attributes()
                                .Where(a => !(e.Attributes().Any(z => z.Name == a.Name))));

                    // All other elements have merge child elements
                    XElement correspondingElement = mergedElement.Element(e.Name);
                    if (correspondingElement == null)
                        return e;
                    return new XElement(e.Name,
                        e.Attributes(),
                        e.Elements().Select(c =>
                            MergeChildElements(correspondingElement.Element(c.Name), c)),
                        correspondingElement.Elements().Where(m => e.Element(m.Name) == null)
                    );
                }),
                // Inherit child elements that the derived element does not define at all.
                mergedElement.Elements()
                    .Where(m => !element.Elements(m.Name).Any()));
            return newMergedElement;
        }

        static void Main(string[] args)
        {
            XElement styles = XElement.Load("Styles.xml");
            var styleChain = StyleChain(styles, "Heading1").ToArray();
            XElement s1 = MergeChildElements(
                styleChain[0],
                styleChain[1]);
            Console.WriteLine(s1);
        }
    }
This is a recursive method: when merging child elements, we need to apply the appropriate semantics to each child element in turn. To test the method, we merge the first two styles in the style chain, RootStyle and CompanyThemeHeading1. When we run this on our sample XML document, we see:
    <Style StyleId="CompanyThemeHeading1" BaseStyleId="RootStyle">
      <Font>
        <Family Val="Cambria" />
        <Size Val="14" />
      </Font>
      <VisualProps ForeColor="Blue" BackColor="OffWhite" />
      <Positioning>
        <SpaceAfter Val="10" />
      </Positioning>
    </Style>
This is what we expected. We see the Font element from the CompanyThemeHeading1 style. The attributes of the VisualProps element are merged, with the ForeColor value from CompanyThemeHeading1 taking precedence over the one from RootStyle. Positioning, which has 'merge child elements' semantics, contains the SpaceAfter element from CompanyThemeHeading1, which replaced the SpaceAfter element in RootStyle.
Now we can write a one-statement method that uses the Aggregate extension method to roll up all styles, starting from an empty Style element and merging each style in the chain into the result:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml.Linq;

    class Program
    {
        // Yields the named style, then each style it is based on, most derived first.
        static IEnumerable<XElement> StyleChainReverseOrder(XElement styles, string styleId)
        {
            string current = styleId;
            while (current != null)
            {
                XElement style = styles.Elements("Style")
                    .FirstOrDefault(s => (string)s.Attribute("StyleId") == current);
                // Stop if the chain refers to a style that is not defined.
                if (style == null)
                    yield break;
                yield return style;
                current = (string)style.Attribute("BaseStyleId");
            }
        }

        // Returns the style chain with the base-most style first.
        static IEnumerable<XElement> StyleChain(XElement styles, string styleId)
        {
            return StyleChainReverseOrder(styles, styleId).Reverse();
        }

        // mergedElement holds what has been inherited so far; element is the more
        // derived element whose content takes precedence.
        static XElement MergeChildElements(XElement mergedElement, XElement element)
        {
            // If, when in the process of merging, the source element doesn't have a
            // corresponding element in the merged element, then include the source element
            // in the merged element.
            if (mergedElement == null)
                return element;

            XElement newMergedElement = new XElement(element.Name,
                element.Attributes(),
                element.Elements().Select(e =>
                {
                    // Replace
                    if (e.Name == "Font"
                        // || e.Name == "OtherElementWithReplaceSemantics"
                        )
                        return e;

                    // Merge attributes
                    if (e.Name == "VisualProps")
                        return new XElement(e.Name,
                            e.Attributes(),
                            mergedElement.Elements(e.Name).Attributes()
                                .Where(a => !(e.Attributes().Any(z => z.Name == a.Name))));

                    // All other elements have merge child elements
                    XElement correspondingElement = mergedElement.Element(e.Name);
                    if (correspondingElement == null)
                        return e;
                    return new XElement(e.Name,
                        e.Attributes(),
                        e.Elements().Select(c =>
                            MergeChildElements(correspondingElement.Element(c.Name), c)),
                        correspondingElement.Elements().Where(m => e.Element(m.Name) == null)
                    );
                }),
                // Inherit child elements that the derived element does not define at all.
                mergedElement.Elements()
                    .Where(m => !element.Elements(m.Name).Any()));
            return newMergedElement;
        }

        // Folds the style chain, base-most style first, into a single merged Style element.
        static XElement AssembleDerivedStyle(XElement styles, string styleId)
        {
            return StyleChain(styles, styleId)
                .Aggregate(
                    new XElement("Style"),
                    (mergedStyle, style) => MergeChildElements(mergedStyle, style));
        }

        static void Main(string[] args)
        {
            XElement styles = XElement.Load("Styles.xml");
            Console.WriteLine(AssembleDerivedStyle(styles, "Heading1"));
        }
    }
When you run this on the sample data, you see:
    <Style StyleId="Heading1" BaseStyleId="CompanyThemeHeading1">
      <VisualProps Bold="true" ForeColor="Blue" BackColor="OffWhite" />
      <Positioning>
        <Indent Val="10" />
        <SpaceAfter Val="10" />
      </Positioning>
      <Font>
        <Family Val="Cambria" />
        <Size Val="14" />
      </Font>
    </Style>
This is what we expected. VisualProps has merged attributes, Font is inherited from CompanyThemeHeading1, and Positioning has the SpaceAfter element from the CompanyThemeHeading1 style, and the Indent element from the Heading1 style.
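The style chain iterator above scans every Style element at each step and assumes that the chain is well formed. For larger style sets, or for documents you don't control, one possible variation is to index the styles by StyleId once and to guard against missing and circular base styles. The following is only a sketch under those assumptions: the names StyleChainSketch and StyleChainChecked are hypothetical, and throwing InvalidOperationException is one choice among several for handling a broken chain.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class StyleChainSketch
{
    // Hypothetical helper (not part of the code above).
    // Returns the style chain for styleId with the base-most style first.
    public static List<XElement> StyleChainChecked(XElement styles, string styleId)
    {
        // Index the styles once; if a StyleId is repeated, the first one wins,
        // which matches the FirstOrDefault lookup used earlier.
        Dictionary<string, XElement> byId = styles.Elements("Style")
            .Where(s => s.Attribute("StyleId") != null)
            .GroupBy(s => (string)s.Attribute("StyleId"))
            .ToDictionary(g => g.Key, g => g.First());

        var chain = new List<XElement>();
        var visited = new HashSet<string>();
        string current = styleId;
        while (current != null)
        {
            // A StyleId seen twice means the BaseStyleId links form a loop.
            if (!visited.Add(current))
                throw new InvalidOperationException(
                    "Circular style chain at '" + current + "'.");

            XElement style;
            if (!byId.TryGetValue(current, out style))
                throw new InvalidOperationException(
                    "Style '" + current + "' not found.");

            chain.Add(style);
            current = (string)style.Attribute("BaseStyleId");
        }

        chain.Reverse(); // walked derived-to-base, so flip to base-first
        return chain;
    }
}
```

Because the result is already ordered base-first, it could be passed to Aggregate in place of the StyleChain method. Document order still doesn't matter, since every lookup goes through the dictionary.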