Refactoring using a Pure Function


It would be useful to refactor this example to clean up the code that determines the style of the paragraph. We can write a function with no side effects (a pure function) that returns the paragraph's style name, or null if the paragraph has no style:

public static string GetParagraphStyle(XElement para)
{
    return (string)para.Elements(w + "pPr")
                       .Elements(w + "pStyle")
                       .Attributes(w + "val")
                       .FirstOrDefault();
}
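As a quick check of how this function behaves, here is a minimal sketch that calls it on two paragraphs built in memory. The class name ParagraphStyleDemo, the sample elements, and the style name "Heading1" are assumptions made up for the illustration; they do not come from the sample document.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ParagraphStyleDemo
{
    static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Same body as the function shown above.
    public static string GetParagraphStyle(XElement para)
    {
        return (string)para.Elements(w + "pPr")
                           .Elements(w + "pStyle")
                           .Attributes(w + "val")
                           .FirstOrDefault();
    }

    // Hypothetical paragraph that carries a style.
    public static XElement StyledParagraph()
    {
        return new XElement(w + "p",
            new XElement(w + "pPr",
                new XElement(w + "pStyle",
                    new XAttribute(w + "val", "Heading1"))),
            new XElement(w + "r", new XElement(w + "t", "Title")));
    }

    // Hypothetical paragraph with no w:pPr child at all.
    public static XElement PlainParagraph()
    {
        return new XElement(w + "p",
            new XElement(w + "r", new XElement(w + "t", "Body text")));
    }

    public static void Main()
    {
        Console.WriteLine(GetParagraphStyle(StyledParagraph()));        // Heading1
        Console.WriteLine(GetParagraphStyle(PlainParagraph()) == null); // True
    }
}
```

The null result for the unstyled paragraph comes from casting a null XAttribute to string, which yields null rather than throwing. This is why the function needs no explicit null checks.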

 

Now, the query is as follows:

var paragraphs =
    mainPartDoc.Root
               .Element(w + "body")
               .Descendants(w + "p")
    .Select(p =>
        new
        {
            ParagraphNode = p,
            Style = GetParagraphStyle(p)
        }
    );

 

We can also rewrite the version that uses a query expression so that it calls the new function:

var paragraphs =
    from p in mainPartDoc.Root.Element(w + "body").Descendants(w + "p")
    let style = GetParagraphStyle(p)
    select new
    {
        ParagraphNode = p,
        Style = style,
    };
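To confirm that the method-syntax and query-expression forms return the same styles, the sketch below runs both against a small document built in memory. The class QueryFormsDemo, its helper methods, and the two sample paragraphs are hypothetical stand-ins for the real main document part.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class QueryFormsDemo
{
    static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static string GetParagraphStyle(XElement para)
    {
        return (string)para.Elements(w + "pPr")
                           .Elements(w + "pStyle")
                           .Attributes(w + "val")
                           .FirstOrDefault();
    }

    // Hypothetical stand-in for mainPartDoc: one styled and one unstyled paragraph.
    public static XDocument BuildSample()
    {
        return new XDocument(
            new XElement(w + "document",
                new XElement(w + "body",
                    new XElement(w + "p",
                        new XElement(w + "pPr",
                            new XElement(w + "pStyle",
                                new XAttribute(w + "val", "Heading1")))),
                    new XElement(w + "p"))));
    }

    // Method syntax, as in the first query above.
    public static string[] StylesViaMethodSyntax(XDocument doc)
    {
        var paragraphs =
            doc.Root
               .Element(w + "body")
               .Descendants(w + "p")
               .Select(p => new { ParagraphNode = p, Style = GetParagraphStyle(p) });
        return paragraphs.Select(x => x.Style).ToArray();
    }

    // Query expression, as in the second query above.
    public static string[] StylesViaQueryExpression(XDocument doc)
    {
        var paragraphs =
            from p in doc.Root.Element(w + "body").Descendants(w + "p")
            let style = GetParagraphStyle(p)
            select new { ParagraphNode = p, Style = style };
        return paragraphs.Select(x => x.Style).ToArray();
    }

    public static void Main()
    {
        XDocument doc = BuildSample();
        string[] a = StylesViaMethodSyntax(doc);
        string[] b = StylesViaQueryExpression(doc);
        Console.WriteLine(a.SequenceEqual(b)); // True
        Console.WriteLine(a.Length);           // 2
    }
}
```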

 

The entire example follows.  The code is attached to this page.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;

 

public static class LocalExtensions
{
    public static string GetPath(this XElement el)
    {
        return
            el
            .AncestorsAndSelf()
            .Aggregate("", (seed, i) => i.Name.LocalName + "/" + seed);
    }
}

 

class Program
{
    readonly static XNamespace w =
      "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static XDocument LoadXDocument(OpenXmlPart part)
    {
        XDocument xdoc;
        using (StreamReader streamReader = new StreamReader(part.GetStream()))
            xdoc = XDocument.Load(XmlReader.Create(streamReader));
        return xdoc;
    }

    public static string GetParagraphStyle(XElement para)
    {
        return (string)para.Elements(w + "pPr")
                           .Elements(w + "pStyle")
                           .Attributes(w + "val")
                           .FirstOrDefault();
    }

    static void Main(string[] args)
    {
        const string filename = "SampleDoc.docx";

        using (WordprocessingDocument wordDoc =
            WordprocessingDocument.Open(filename, true))
        {
            MainDocumentPart mainPart = wordDoc.MainDocumentPart;
            XDocument mainPartDoc = LoadXDocument(mainPart);

            var paragraphs =
                mainPartDoc.Root
                           .Element(w + "body")
                           .Descendants(w + "p")
                .Select(p =>
                    new
                    {
                        ParagraphNode = p,
                        Style = GetParagraphStyle(p)
                    }
                );

            foreach (var p in paragraphs)
                Console.WriteLine("{0} {1}",
                    p.Style != null ?
                    p.Style.PadRight(12) :
                    "".PadRight(12),
                    p.ParagraphNode.GetPath());
        }
    }
}

 

The refactored query is easier to read, because the logic for finding the style now lives in one named function.

Because we wrote the GetParagraphStyle function without side effects, we were free to use it without worrying about how it would impact the execution of our query.
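One way to see why this matters: LINQ queries are evaluated lazily, so a function used inside a query runs every time the query is enumerated. The sketch below contrasts a pure function with a deliberately impure one. The class PurityDemo and both functions are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PurityDemo
{
    static int calls = 0;

    // Pure: the result depends only on the argument.
    public static string Pure(string style)
    {
        return style.PadRight(12);
    }

    // Impure (hypothetical counter-example): reads and mutates shared state.
    public static string Impure(string style)
    {
        calls++;
        return style + "#" + calls;
    }

    public static void Main()
    {
        string[] styles = { "Heading1", "Normal" };

        // Neither query has run yet; Select only describes the work.
        IEnumerable<string> pureQuery = styles.Select(s => Pure(s));
        IEnumerable<string> impureQuery = styles.Select(s => Impure(s));

        // Enumerating the pure query twice gives identical results.
        Console.WriteLine(pureQuery.ToArray().SequenceEqual(pureQuery.ToArray())); // True

        // Enumerating the impure query twice gives different results.
        Console.WriteLine(string.Join(",", impureQuery.ToArray())); // Heading1#1,Normal#2
        Console.WriteLine(string.Join(",", impureQuery.ToArray())); // Heading1#3,Normal#4
    }
}
```

With the impure version, the output depends on how many times and in what order the query is enumerated, which is exactly the kind of concern a side-effect-free GetParagraphStyle avoids.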


 

Posted: Wednesday, October 04, 2006 5:15 AM by EricWhite

Attachment(s): RefactoringUsingAPureFunction.cs
