Retrieving the Default Style Name from the Styles Part


The example presented in the previous topic has a problem: it sets the Style property of the anonymous type to null if there is no style on the paragraph.  This is incorrect; a paragraph without an explicit style uses the default paragraph style, so we should use another query to find the default style in the styles part.

Here is the query:

string defaultStyle =
    (string)styleDoc.Root
                    .Elements(w + "style")
                    .Where(style =>
                        (string)style.Attribute(w + "type") == "paragraph" &&
                        (string)style.Attribute(w + "default") == "1")
                    .First()
                    .Attribute(w + "styleId");
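
To try the query in isolation, here is a minimal sketch that runs it against a small hand-written styles document rather than a real package.  The inline XML, the class name DefaultStyleSketch, and the helper GetDefaultParagraphStyle are assumptions made for illustration.  The sketch also swaps First for Attributes followed by FirstOrDefault, so that a styles part with no default paragraph style yields null instead of throwing; that is a variation, not the code used in this topic.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class DefaultStyleSketch
{
    static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: returns the styleId of the default paragraph
    // style, or null if no paragraph style is marked as the default.
    public static string GetDefaultParagraphStyle(XDocument styleDoc)
    {
        return (string)styleDoc.Root
            .Elements(w + "style")
            .Where(style =>
                (string)style.Attribute(w + "type") == "paragraph" &&
                (string)style.Attribute(w + "default") == "1")
            .Attributes(w + "styleId")
            .FirstOrDefault();
    }

    static void Main()
    {
        // Hand-written stand-in for the styles part (an assumption).
        XDocument styleDoc = new XDocument(
            new XElement(w + "styles",
                new XElement(w + "style",
                    new XAttribute(w + "type", "character"),
                    new XAttribute(w + "default", "1"),
                    new XAttribute(w + "styleId", "DefaultParagraphFont")),
                new XElement(w + "style",
                    new XAttribute(w + "type", "paragraph"),
                    new XAttribute(w + "styleId", "Heading1")),
                new XElement(w + "style",
                    new XAttribute(w + "type", "paragraph"),
                    new XAttribute(w + "default", "1"),
                    new XAttribute(w + "styleId", "Normal"))));

        // The character style is skipped because its type is not "paragraph".
        Console.WriteLine(GetDefaultParagraphStyle(styleDoc));  // prints "Normal"
    }
}
```

The comparison with "1" mirrors the query above.  As far as I know, the schema's on/off type also permits values such as "true", which this comparison would not match, so treat the check as sufficient for the sample document rather than for every document.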
 

We can then use this variable in the query that extracts the paragraphs.  Notice that I switched to a statement lambda expression, so that GetParagraphStyle is called only once per paragraph:

var paragraphs =
    mainPartDoc.Root
               .Element(w + "body")
               .Descendants(w + "p")
    .Select(p =>
    {
        string style = GetParagraphStyle(p);
        string styleName = style == null ? defaultStyle : style;
        return new
        {
            ParagraphNode = p,
            Style = styleName
        };
    }
    );
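
For comparison, a query expression can get the same "call it once" effect with a let clause.  The sketch below is an alternative formulation, not the approach taken in this topic: the inline body element, the fixed default style "Normal", and the method name ResolveStyles are assumptions for illustration.  It also uses the ?? operator in place of the conditional expression.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

class LetClauseSketch
{
    static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Same helper as in the listing: the paragraph's style id, or null.
    static string GetParagraphStyle(XElement para)
    {
        return (string)para.Elements(w + "pPr")
                           .Elements(w + "pStyle")
                           .Attributes(w + "val")
                           .FirstOrDefault();
    }

    // Hypothetical helper: resolves the style name of every paragraph.
    public static List<string> ResolveStyles(XElement body, string defaultStyle)
    {
        var paragraphs =
            from p in body.Descendants(w + "p")
            let style = GetParagraphStyle(p)   // evaluated once per paragraph
            select new
            {
                ParagraphNode = p,
                Style = style ?? defaultStyle  // fall back when style is null
            };

        return paragraphs.Select(p => p.Style).ToList();
    }

    static void Main()
    {
        // Hand-written stand-in for the document body (an assumption):
        // one paragraph with an explicit style, one without.
        XElement body = new XElement(w + "body",
            new XElement(w + "p",
                new XElement(w + "pPr",
                    new XElement(w + "pStyle",
                        new XAttribute(w + "val", "Heading1")))),
            new XElement(w + "p"));

        foreach (string style in ResolveStyles(body, "Normal"))
            Console.WriteLine(style);   // prints "Heading1", then "Normal"
    }
}
```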
 

We can write part of the query that retrieves the default style using a query expression.  However, there is no way to express the First call in a query expression, so we must surround the query expression with parentheses, and then dot into the First method:

string defaultStyle =
    (string)(
        from style in styleDoc.Root.Elements(w + "style")
        where (string)style.Attribute(w + "type") == "paragraph" &&
              (string)style.Attribute(w + "default") == "1"
        select style
    ).First().Attribute(w + "styleId");
 

My personal preference is to use method syntax in this situation.

One more point about this assignment:  the query that finds the paragraphs is deferred, so it does nothing until we iterate through it using a foreach statement.  The First extension method is different: it iterates its source as soon as it is called, so the query executes immediately and the string defaultStyle variable is set before the next statement runs.
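
The difference between deferred and immediate execution is easy to observe with a projection that records each call.  The following is a small sketch on an integer array rather than on the document; the class name DeferredExecutionSketch, the method Run, and the log list are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class DeferredExecutionSketch
{
    // Returns: calls logged after building the query, the value returned
    // by First(), calls logged after First(), calls logged after foreach.
    public static int[] Run()
    {
        var log = new List<string>();
        int[] source = { 1, 2, 3 };

        // Deferred: defining the query does not run the lambda.
        IEnumerable<int> query = source.Select(n =>
        {
            log.Add("projected " + n);
            return n * 10;
        });
        int afterBuild = log.Count;

        // Immediate: First() pulls one element from the query right away.
        int first = query.First();
        int afterFirst = log.Count;

        // Iterating the query with foreach runs the projection for every
        // element, including the first one again.
        foreach (int n in query) { }
        int afterForeach = log.Count;

        return new[] { afterBuild, first, afterFirst, afterForeach };
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", Run()));  // prints "0 10 1 4"
    }
}
```

The same reasoning applies to the listing in this topic: defaultStyle already holds its value by the time the paragraphs query is defined, whereas the paragraphs themselves are examined only inside the foreach loop.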

Now, when we run the program, we see:

Heading1     document/body/p/
Normal       document/body/p/
Code         document/body/p/
Code         document/body/p/
Code         document/body/p/
Code         document/body/p/
Code         document/body/p/
Code         document/body/p/
Code         document/body/p/
Code         document/body/p/
Normal       document/body/p/
Code         document/body/p/
 

This is what we wanted from this transformation.

The code is attached to this page, and the entire listing follows.  Note that we had to read the styles part into an XDocument.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;
 
public static class LocalExtensions
{
    public static string GetPath(this XElement el)
    {
        return
            el
            .AncestorsAndSelf()
            .Aggregate("", (seed, i) => i.Name.LocalName + "/" + seed);
    }
}
 
class Program
{
    readonly static XNamespace w =
      "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
 
    public static XDocument LoadXDocument(OpenXmlPart part)
    {
        XDocument xdoc;
        using (StreamReader streamReader = new StreamReader(part.GetStream()))
            xdoc = XDocument.Load(XmlReader.Create(streamReader));
        return xdoc;
    }
 
    public static string GetParagraphStyle(XElement para)
    {
        return (string)para.Elements(w + "pPr")
                           .Elements(w + "pStyle")
                           .Attributes(w + "val")
                           .FirstOrDefault();
    }
 
    static void Main(string[] args)
    {
        const string filename = "SampleDoc.docx";
 
        // Open read-only (isEditable: false); this program never modifies the package.
        using (WordprocessingDocument wordDoc =
            WordprocessingDocument.Open(filename, false))
        {
            MainDocumentPart mainPart = wordDoc.MainDocumentPart;
            StyleDefinitionsPart stylePart = mainPart.StyleDefinitionsPart;
            XDocument mainPartDoc = LoadXDocument(mainPart);
            XDocument styleDoc = LoadXDocument(stylePart);
 
            string defaultStyle =
                (string)styleDoc.Root
                                .Elements(w + "style")
                                .Where(style =>
                                    (string)style.Attribute(w + "type") == "paragraph" &&
                                    (string)style.Attribute(w + "default") == "1")
                                .First()
                                .Attribute(w + "styleId");
 
            var paragraphs =
                mainPartDoc.Root
                           .Element(w + "body")
                           .Descendants(w + "p")
                .Select(p =>
                {
                    string style = GetParagraphStyle(p);
                    string styleName = style == null ? defaultStyle : style;
                    return new
                    {
                        ParagraphNode = p,
                        Style = styleName
                    };
                }
                );
 
            foreach (var p in paragraphs)
                Console.WriteLine("{0} {1}",
                    p.Style != null ?
                    p.Style.PadRight(12) :
                    "".PadRight(12),
                    p.ParagraphNode.GetPath());
        }
    }
}
 

Attachment: RetrievingTheDefaultStyleNameFromTheStylesPart.cs