The Final Results

We need one more query to retrieve the comments:

var groupedCodeWithComments =
    groupedCodeParagraphs.Select(g =>
        {
            var id =
                (string)g.Select(p => p.ParagraphNode)
                         .Elements(w + "commentRangeStart")
                         .First()
                         .Attribute(w + "id");
 
            var commentNode =
                commentsDoc.Root
                           .Elements(w + "comment")
                           .Where(c => (string)c.Attribute(w + "id") == id)
                           .First();
 
            var comment =
                commentNode.Elements(w + "p")
                           .StringConcatenate(node =>
                               node.Descendants(w + "t")
                                   .Select(t => (string)t)
                                   .StringConcatenate()
                               + "\n");
 
            return new
            {
                ParagraphGroup = g,
                Comment = comment
            };
        }
    );
 

Notice that this query uses a statement lambda (a lambda whose body is a block of statements) to get the comment.
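For readers who have not met the two lambda forms side by side, here is a minimal sketch of the difference. The class and method names are made up for this illustration and are not part of the listing:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, used only to contrast the two lambda forms.
public static class LambdaForms
{
    // Expression lambda: the body is a single expression, and its value is the result.
    public static List<int> LengthsWithExpressionLambda(IEnumerable<string> words)
    {
        return words.Select(s => s.Trim().Length).ToList();
    }

    // Statement lambda: the body is a block, so it can declare locals
    // and must return its result explicitly.
    public static List<int> LengthsWithStatementLambda(IEnumerable<string> words)
    {
        return words.Select(s =>
            {
                var trimmed = s.Trim();
                return trimmed.Length;
            }).ToList();
    }
}
```

Both methods return the same list for the same input. The statement form earns its keep when, as in our query, several intermediate values need names before the result is built.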

The first query projects every paragraph in the group to its paragraph node, uses the Elements extension method to find the child elements named commentRangeStart, takes the first one, and reads its w:id attribute:

var id =
    (string)g.Select(p => p.ParagraphNode)
             .Elements(w + "commentRangeStart")
             .First()
             .Attribute(w + "id");
 

This might look like a lot of work, but because we use the First extension method, evaluation stops as soon as the first commentRangeStart element is found.  LINQ queries are evaluated lazily, so paragraphs after the first match are never examined.  For all practical purposes, this query just follows a few links in a linked list.  Setting up the query costs a little extra, but that cost is the same every time and is small.
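To see the short-circuiting for yourself, the following sketch counts how many paragraphs the query pulls before First returns. The data is invented for the example, the class name is hypothetical, and w is assumed to be the WordprocessingML main namespace, as elsewhere in this tutorial:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical demo class; not part of the listing.
public static class ShortCircuitDemo
{
    // Returns how many paragraphs were visited before First() found a match.
    public static int CountParagraphsVisited(out string id)
    {
        // Assumption: w is the WordprocessingML main namespace.
        XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

        // Invented data: three paragraphs, each with a commentRangeStart child.
        var paragraphs = new[]
        {
            new XElement(w + "p",
                new XElement(w + "commentRangeStart", new XAttribute(w + "id", "0"))),
            new XElement(w + "p",
                new XElement(w + "commentRangeStart", new XAttribute(w + "id", "1"))),
            new XElement(w + "p",
                new XElement(w + "commentRangeStart", new XAttribute(w + "id", "2")))
        };

        int visited = 0;
        id = (string)paragraphs
            .Select(p => { visited++; return p; })   // count each paragraph pulled through the query
            .Elements(w + "commentRangeStart")
            .First()
            .Attribute(w + "id");

        return visited;
    }
}
```

With this data the counter ends at 1 and the id is "0": the second and third paragraphs are never touched.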

Once we have the id of the comment, we can write the query to find the comment node in the comments part.  This is a straightforward query; you have seen everything in it before.

var commentNode =
    commentsDoc.Root
               .Elements(w + "comment")
               .Where(c => (string)c.Attribute(w + "id") == id)
               .First();
 

Finally, once we have the comment node, the following query assembles the text of the comment.  Text in comments is represented in the XML in much the same way as text in the main document part.  Under the comment node are paragraph elements (w:p), under each paragraph are run elements (w:r), and under the run elements are text elements (w:t):

var comment =
    commentNode.Elements(w + "p")
               .StringConcatenate(node =>
                   node.Descendants(w + "t")
                       .Select(t => (string)t)
                       .StringConcatenate()
                   + "\n");
 

The inner StringConcatenate joins the text nodes of one paragraph into a single line of the comment.  The outer StringConcatenate appends a newline to each of those lines and joins them into the full comment text.  It's a little more complicated than it needs to be, but this is how I naturally wrote it.
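The StringConcatenate extension methods are not defined in this topic.  As a reminder of roughly what they do, here is a sketch of two overloads that would make the query above compile.  Treat it as an assumption about their shape, not as the code in the listing; the class name and the JoinLines helper are invented for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch of the two overloads the query relies on.
public static class StringConcatenateSketch
{
    // Joins a sequence of strings with no separator.
    public static string StringConcatenate(this IEnumerable<string> source)
    {
        var sb = new StringBuilder();
        foreach (var s in source)
            sb.Append(s);
        return sb.ToString();
    }

    // Projects each item to a string, then joins the results with no separator.
    public static string StringConcatenate<T>(
        this IEnumerable<T> source, Func<T, string> projection)
    {
        var sb = new StringBuilder();
        foreach (var item in source)
            sb.Append(projection(item));
        return sb.ToString();
    }

    // Mirrors the nested use in the comment query: the inner call builds
    // one line from its pieces, the outer call appends "\n" to each line.
    public static string JoinLines(IEnumerable<string[]> lines)
    {
        return lines.StringConcatenate(parts => parts.StringConcatenate() + "\n");
    }
}
```

Passing two lines, the first split into the pieces "Hello" and " World" and the second the single piece "Bye", to JoinLines gives "Hello World\nBye\n".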

When we run this program, we see:

Code Block
==========
using System;
 
class Program {
    public static void Main(string[] args) {
        Console.WriteLine("Hello World");
    }
}
 
 
Meta Data
=========
<Test SnipId="0001">
  <Output SnipId="0002"/>
</Test>
 
 
Code Block
==========
Hello World
 
Meta Data
=========
<Test SnipId="0002"/>
 

This accomplishes the goal of our exercise.  However, that last query is a bit complex.  We can simplify it by factoring out the code that gets the comment text from the comments part.  The GetCommentText method looks like this:

public static string GetCommentText(XDocument commentsDoc, string id)
{
    var commentNode =
        commentsDoc.Root
                   .Elements(w + "comment")
                   .Where(c => (string)c.Attribute(w + "id") == id)
                   .First();
 
    var comment =
        commentNode.Elements(w + "p")
                   .StringConcatenate(node =>
                       node.Descendants(w + "t")
                           .Select(t => (string)t)
                           .StringConcatenate()
                       + "\n");
    return comment;
}
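One benefit of factoring the method out is that it can be tried on its own against a small, hand-built comments part.  The sketch below does that.  The sample XML is invented, the class name is hypothetical, and string.Concat stands in for StringConcatenate so that the block compiles by itself:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-alone harness; not part of the listing.
public static class CommentTextDemo
{
    // Assumption: w is the WordprocessingML main namespace.
    static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Same logic as GetCommentText, with string.Concat in place of StringConcatenate.
    public static string GetCommentText(XDocument commentsDoc, string id)
    {
        var commentNode =
            commentsDoc.Root
                       .Elements(w + "comment")
                       .First(c => (string)c.Attribute(w + "id") == id);

        // One line per paragraph: join the paragraph's text nodes, then add "\n".
        return string.Concat(
            commentNode.Elements(w + "p")
                       .Select(p => string.Concat(
                           p.Descendants(w + "t").Select(t => (string)t)) + "\n"));
    }

    // Invented comments part: one comment whose single paragraph
    // has its text split across two runs.
    public static XDocument BuildSampleComments()
    {
        return new XDocument(
            new XElement(w + "comments",
                new XAttribute(XNamespace.Xmlns + "w", w.NamespaceName),
                new XElement(w + "comment",
                    new XAttribute(w + "id", "0"),
                    new XElement(w + "p",
                        new XElement(w + "r", new XElement(w + "t", "<Test ")),
                        new XElement(w + "r", new XElement(w + "t", "SnipId=\"0002\"/>"))))));
    }
}
```

Calling CommentTextDemo.GetCommentText(CommentTextDemo.BuildSampleComments(), "0") returns the text of the two runs joined into one line, followed by a newline.  Note that First throws an InvalidOperationException if no comment has the requested id.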
 

And the modified query looks like this:

var groupedCodeWithComments =
    groupedCodeParagraphs.Select(g =>
        {
            var id =
                (string)g.Select(p => p.ParagraphNode)
                         .Elements(w + "commentRangeStart")
                         .First()
                         .Attribute(w + "id");
            return new
            {
                ParagraphGroup = g,
                Comment = GetCommentText(commentsDoc, id)
            };
        }
    );
 

The complete listing follows in the next topic.  It is about 230 lines long, but if we treat the extension methods as part of a reusable library and count only the queries, the query code is only about 100 lines, with a fair amount of white space.  That is quite a bit of work for a little example.

Attachment: ParseWordML.cs