Processing all Content Parts in an Open XML WordprocessingML Document

In Open XML WordprocessingML documents, there are five types of parts that can contain content such as paragraphs (with or without tracked revisions), tables, rows, cells, and any of a variety of content controls:

  • Main document part
  • Header parts (there can be zero or more)
  • Footer parts (there can be zero or more)
  • Endnotes part (there can be zero or one)
  • Footnotes part (there can be zero or one)

There are certain Open XML programming scenarios where you need to process all varieties of parts that contain content:

  • You need to search for specific words in a document, regardless of where those words occur.
  • You need to accept tracked changes anywhere they appear in the document.
  • You need to process content controls anywhere they occur in the document, perhaps to bind them to XML in a custom XML part.
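
As a rough illustration of the first scenario, here is a minimal sketch of a word search over the XML of a single part. The class and method names (TextSearch, FindParagraphsContaining) are hypothetical and not part of any library. The sketch assumes it is enough to concatenate the w:t text of each paragraph, so it ignores deleted text (w:delText) and may report a paragraph twice if paragraphs are nested, for example inside a text box.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TextSearch
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: returns the text of every paragraph in the
    // given part XML that contains the search term (case-insensitive).
    public static IEnumerable<string> FindParagraphsContaining(XDocument partXml, string term)
    {
        return partXml
            .Descendants(W + "p")
            // A word can be split across several runs, so join all w:t
            // elements of the paragraph before searching.
            .Select(p => string.Concat(p.Descendants(W + "t").Select(t => (string)t)))
            .Where(text => text.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0);
    }
}
```

To cover the whole document, you would call this once for each content part, passing the XDocument loaded from that part.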

The following example shows how to find all content controls in a document, regardless of whether they are in the main document part, in the headers or footers, or in the endnotes or footnotes. This example uses LINQ to XML. If you are using the strongly-typed object model of the Open XML SDK, the code that visits each part would be identical; only the code that actually processes the content controls would differ.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;

public static class Extensions
{
    // Loads the part's XML into an XDocument and caches it as an annotation
    // on the part, so repeated calls for the same part reuse the same instance.
    public static XDocument GetXDocument(this OpenXmlPart part)
    {
        XDocument partXDocument = part.Annotation<XDocument>();
        if (partXDocument != null)
            return partXDocument;
        using (Stream partStream = part.GetStream())
        using (XmlReader partXmlReader = XmlReader.Create(partStream))
            partXDocument = XDocument.Load(partXmlReader);
        part.AddAnnotation(partXDocument);
        return partXDocument;
    }
}

class Program
{
    private static void IterateContentControlsForPart(OpenXmlPart part)
    {
        XNamespace w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
        XDocument doc = part.GetXDocument();
        foreach (var sdt in doc.Descendants(w + "sdt"))
        {
            Console.WriteLine("Found content control");
            Console.WriteLine("=====================");
            Console.WriteLine(sdt.ToString());
            Console.WriteLine();
        }
    }

    public static void IterateContentControls(WordprocessingDocument doc)
    {
        IterateContentControlsForPart(doc.MainDocumentPart);
        foreach (var part in doc.MainDocumentPart.HeaderParts)
            IterateContentControlsForPart(part);
        foreach (var part in doc.MainDocumentPart.FooterParts)
            IterateContentControlsForPart(part);
        if (doc.MainDocumentPart.EndnotesPart != null)
            IterateContentControlsForPart(doc.MainDocumentPart.EndnotesPart);
        if (doc.MainDocumentPart.FootnotesPart != null)
            IterateContentControlsForPart(doc.MainDocumentPart.FootnotesPart);
    }

    static void Main(string[] args)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open("Test.docx", false))
            IterateContentControls(doc);
    }
}
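
Because every scenario above visits the same set of parts, it can be convenient to factor the enumeration into one place. The following is a sketch only: ContentParts is a hypothetical extension method, not part of the Open XML SDK, and the program assumes the strongly-typed object model of Open XML SDK 2.0 or later, where SdtElement is the common base class of the content control elements. It counts content controls per part rather than printing them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class ContentPartExtensions
{
    // Hypothetical helper (not an SDK method): yields the main document
    // part, then headers, footers, endnotes, and footnotes if present.
    public static IEnumerable<OpenXmlPart> ContentParts(this WordprocessingDocument doc)
    {
        MainDocumentPart main = doc.MainDocumentPart;
        if (main == null)
            yield break;
        yield return main;
        foreach (HeaderPart header in main.HeaderParts)
            yield return header;
        foreach (FooterPart footer in main.FooterParts)
            yield return footer;
        if (main.EndnotesPart != null)
            yield return main.EndnotesPart;
        if (main.FootnotesPart != null)
            yield return main.FootnotesPart;
    }
}

class StronglyTypedExample
{
    static void Main(string[] args)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open("Test.docx", false))
        {
            foreach (OpenXmlPart part in doc.ContentParts())
            {
                // RootElement is the typed root of the part (Document, Header, ...).
                if (part.RootElement == null)
                    continue;
                int count = part.RootElement.Descendants<SdtElement>().Count();
                Console.WriteLine("{0}: {1} content control(s)", part.Uri, count);
            }
        }
    }
}
```

With a helper like this, the part-visiting loop stays the same across scenarios, and only the body of the loop changes.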

Comments
  • Eric,

    I'm curious why you did not include the following items:

    doc.MainDocumentPart.CustomXmlParts

    doc.MainDocumentPart.WordprocessingCommentsPart

    -John
