Creating Data-Bound Content Controls using the Open XML SDK and LINQ to XML

Data-bound content controls are a powerful and convenient way to separate the semantic business data from the markup of an Open XML document.  After binding content controls to custom XML, you can query the document for the business data by looking in the custom XML part rather than examining the markup, and querying custom XML is much simpler than querying the document body.  Creating data-bound content controls by hand is a little bit involved (but only a little bit).  There is a trick, though: we can take a document that has unbound content controls, generate a custom XML part automatically (inferring the elements of the custom XML from the content controls), and then bind the content controls to that custom XML part.

(Update March 10, 2009 - modified code to work with latest Open XML SDK.) 

This approach has two benefits.  First, it is a convenient way to create a document with data-bound content controls.  Second, it demonstrates exactly what you must do to create data-bound content controls.

This example uses the following approach:

  • Using Word 2007, you create a document with any number of content controls in it.
  • When creating each content control, you set the Tag of the content control to the desired XML element name in the custom XML.
  • You then run this example code on the document, which creates the custom XML part, creates the custom XML properties part, and then adds the markup to the body of the document that binds each content control to the custom XML.

This example uses the Open XML SDK V1 and LINQ to XML.

Data-Bound Content Controls

A document that contains properly set-up data-bound content controls has the following characteristics:

  • The main document part has a relationship to the custom XML part.
  • The custom XML part has a relationship to a custom XML properties part.
  • The custom XML properties part contains a GUID in an attribute (ds:itemID).  This GUID is used to associate the data binding elements in the main document part to the relevant custom XML part.
  • Within the content control markup in the main document part, the data binding element (w:dataBinding) defines the data binding.  This element has an attribute (w:storeItemID) that contains the same GUID as in the custom XML properties part.  In addition, the element has an attribute (w:xpath) that contains the XPath expression to the relevant node in the custom XML.
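
The GUID association described in the list above can be checked mechanically.  The following is a minimal sketch, not part of the original example: BindingCheck and FindMismatchedBindings are hypothetical names, and the sketch assumes you have already loaded the root elements of the main document part and the custom XML properties part into LINQ to XML.  It compares the GUID strings exactly, which is a conservative assumption about casing.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (not from the original post).
public static class BindingCheck
{
    static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static readonly XNamespace ds =
        "http://schemas.openxmlformats.org/officeDocument/2006/customXml";

    // Returns the w:xpath of every w:dataBinding whose w:storeItemID
    // differs from the ds:itemID of the given ds:datastoreItem element.
    public static List<string> FindMismatchedBindings(
        XElement mainDocumentRoot, XElement datastoreItem)
    {
        string itemId = (string)datastoreItem.Attribute(ds + "itemID");
        return mainDocumentRoot
            .Descendants(w + "dataBinding")
            .Where(b => (string)b.Attribute(w + "storeItemID") != itemId)
            .Select(b => (string)b.Attribute(w + "xpath"))
            .ToList();
    }
}
```

An empty result means every binding in the main document part points at this custom XML part; any XPath that comes back belongs to a content control bound to some other store item (or to none that matches).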

The following screen clipping shows the Word document with content controls in the cells of a table:

To set the properties of the content control, click on the Content Controls Properties button (on the Developer tab of the ribbon):

In this example, the element name in the custom XML part comes from the Tag field in the content control properties window:

The following screen clipping (using the Open XML Package Editor, which comes with Visual Studio Power Tools) shows that there is a relationship from the main document part (document.xml) to the custom XML part (../customXml/item1.xml):

The following shows the relationship from the custom XML part to the custom XML properties part (itemProps1.xml):

The custom XML for the example included with this post looks like this:

<?xml version="1.0" encoding="utf-8"?>
<Root>
  <Name>Eric White</Name>
  <Company>Microsoft Corporation</Company>
  <Address>One Microsoft Way</Address>
  <City>Redmond</City>
  <State>WA</State>
  <Country>USA</Country>
  <PostalCode>98052</PostalCode>
</Root>

This custom XML is automatically generated by this example.

The custom XML properties part looks like this:

<?xml version="1.0" encoding="utf-8" standalone="no"?>
<ds:datastoreItem
    ds:itemID="{F351E99C-3283-4B75-927A-A56C9FD3BFFC}"
    xmlns:ds="http://schemas.openxmlformats.org/officeDocument/2006/customXml">
  <ds:schemaRefs/>
</ds:datastoreItem>

The GUID in the ds:itemID attribute is generated when the example is run.

The content control with properly set-up data binding looks like this:

<w:sdt>
  <w:sdtPr>
    <w:alias w:val="Name"/>
    <w:tag w:val="Name"/>
    <w:id w:val="13264407"/>
    <w:placeholder>
      <w:docPart w:val="DefaultPlaceholder_22675703"/>
    </w:placeholder>
    <w:dataBinding
      w:xpath="/Root/Name"
      w:storeItemID="{F351E99C-3283-4B75-927A-A56C9FD3BFFC}"/>
    <w:text/>
  </w:sdtPr>
  <w:sdtContent>
    <w:tc>
      <w:tcPr>
        <w:tcW w:w="4410"
               w:type="dxa"/>
      </w:tcPr>
      <w:p w:rsidR="00E850CC"
           w:rsidRDefault="00FF4549"
           w:rsidP="00FF4549">
        <w:r>
          <w:t>Eric White</w:t>
        </w:r>
      </w:p>
    </w:tc>
  </w:sdtContent>
</w:sdt>

The GUID in the w:storeItemID attribute is the same as in the custom XML properties part.  This creates the association between the data-bound content control and its custom XML part.

If you edit the document that has bound content controls, and change the contents in one of them, the custom XML is modified to reflect the changed content.  For instance, if you edit the document and change the name to Tai Yee, then the custom XML will be:

<?xml version="1.0" encoding="utf-8"?>
<Root>
  <Name>Tai Yee</Name>
  <Company>Microsoft Corporation</Company>
  <Address>One Microsoft Way</Address>
  <City>Redmond</City>
  <State>WA</State>
  <Country>USA</Country>
  <PostalCode>98052</PostalCode>
</Root>
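
Reading the business data back out is then a matter of querying this small XML tree rather than the document body.  The following is a minimal sketch under stated assumptions: CustomXmlQuery and ReadFields are hypothetical names, the custom XML has the flat Root/child shape shown above, and every child element name is unique (ToDictionary throws on duplicate keys).

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (not from the original post).
public static class CustomXmlQuery
{
    // Flattens <Root><Name>...</Name>...</Root> into name/value pairs.
    // Assumes each child element name occurs only once.
    public static Dictionary<string, string> ReadFields(XElement root)
    {
        return root.Elements()
            .ToDictionary(e => e.Name.LocalName, e => (string)e);
    }
}
```

With the Open XML SDK, the root element could come from something like document.MainDocumentPart.CustomXmlParts.First().GetXDocument().Root, using the GetXDocument extension method from the example in this post, and assuming the first custom XML part is the one the content controls are bound to.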

Because the GUID that creates the association is in the custom XML properties part and not in the custom XML itself, the custom XML can have any schema you desire.  You can take XML from any source, with any schema, and place it, unmodified, in a custom XML part, and create the appropriate data-binding to content controls.

Example using the Open XML SDK V1 and LINQ to XML

The example first copies Template.docx to Test.docx.  It opens Test.docx using the Open XML SDK, creates the custom XML part, creates the custom XML properties part, and then adds the data binding elements to the content controls in the main document part.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;

public static class LocalExtensions
{
    public static string StringConcatenate<T>(this IEnumerable<T> source,
            Func<T, string> func)
    {
        StringBuilder sb = new StringBuilder();
        foreach (T item in source)
            sb.Append(func(item));
        return sb.ToString();
    }

    public static string StringConcatenate(this IEnumerable<string> source)
    {
        StringBuilder sb = new StringBuilder();
        foreach (string item in source)
            sb.Append(item);
        return sb.ToString();
    }

    public static XDocument GetXDocument(this OpenXmlPart part)
    {
        XDocument xdoc = part.Annotation<XDocument>();
        if (xdoc != null)
            return xdoc;
        using (Stream str = part.GetStream())
        using (StreamReader streamReader = new StreamReader(str))
        using (XmlReader xr = XmlReader.Create(streamReader))
            xdoc = XDocument.Load(xr);
        part.AddAnnotation(xdoc);
        return xdoc;
    }
}

class Program
{
    private static XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    private static XName r = w + "r";
    private static XName ins = w + "ins";
    private static XNamespace ds =
        "http://schemas.openxmlformats.org/officeDocument/2006/customXml";

    static string GetTextFromContentControl(XElement contentControlNode)
    {
        return contentControlNode.Descendants(w + "p")
            .Select(
                p => p.Elements()
                      .Where(z => z.Name == r || z.Name == ins)
                      .Descendants(w + "t")
                      .StringConcatenate(element =>
                          (string)element) + Environment.NewLine
            ).StringConcatenate();
    }

    static void Main(string[] args)
    {
        File.Delete("Test.docx");
        File.Copy("Template.docx", "Test.docx");

        // Open the Open XML doc as a word processing doc
        using (WordprocessingDocument document =
            WordprocessingDocument.Open("Test.docx", true))
        {
            // Create the contents of the custom XML part
            XElement customXml = new XElement("Root",
                document
                .MainDocumentPart
                .GetXDocument()
                .Descendants(w + "sdt")
                .Select(sdt =>
                    new XElement(
                        sdt.Element(w + "sdtPr")
                            .Element(w + "tag")
                            .Attribute(w + "val").Value,
                        GetTextFromContentControl(sdt).Trim())
                )
            );

            // Create a new custom XML part
            CustomXmlPart customXmlPart =
                document.MainDocumentPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);
            using (Stream str = customXmlPart.GetStream(
                FileMode.Create, FileAccess.ReadWrite))
            using (XmlWriter xw = XmlWriter.Create(str))
                customXml.Save(xw);

            Guid idGuid = Guid.NewGuid();

            // Create the contents of the properties part
            XDocument propertyPartXDoc = new XDocument(
                new XElement(ds + "datastoreItem",
                    new XAttribute(ds + "itemID",
                        "{" + idGuid.ToString().ToUpper() + "}"),
                    new XAttribute(XNamespace.Xmlns + "ds",
                        ds.NamespaceName),
                    new XElement(ds + "schemaRefs")
                )
            );

            // Add the custom XML properties part
            CustomXmlPropertiesPart customXmlPropertyPart =
                customXmlPart.AddNewPart<CustomXmlPropertiesPart>();
            using (Stream str = customXmlPropertyPart.GetStream(
                FileMode.Create, FileAccess.ReadWrite))
            using (XmlWriter xw = XmlWriter.Create(str))
                propertyPartXDoc.Save(xw);

            // Load the main document part into an XDocument
            XDocument mainDocumentXDoc;
            using (Stream str = document.MainDocumentPart.GetStream())
            using (XmlReader xr = XmlReader.Create(str))
                mainDocumentXDoc = XDocument.Load(xr);

            // Add the data binding elements to the main document
            foreach (XElement sdt in mainDocumentXDoc.Descendants(w + "sdt"))
                sdt.Element(w + "sdtPr")
                    .Element(w + "placeholder")
                    .AddAfterSelf(
                        new XElement(w + "dataBinding",
                            new XAttribute(w + "xpath",
                                "/Root/" + sdt.Element(w + "sdtPr")
                                    .Element(w + "tag")
                                    .Attribute(w + "val").Value),
                            new XAttribute(w + "storeItemID",
                                "{" + idGuid.ToString().ToUpper() + "}")
                        )
                    );

            // Serialize the XDocument back into the part
            using (Stream str = document.MainDocumentPart.GetStream(
                FileMode.Create, FileAccess.Write))
            using (XmlWriter xw = XmlWriter.Create(str))
                mainDocumentXDoc.Save(xw);
        }
    }
}
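
The example above assumes that every content control has a w:tag and a w:placeholder, and that its text lives inside w:p elements.  A more defensive variant might look like the sketch below.  It is not the code in the attachment: DefensiveBinding, GetText, and AddBindings are hypothetical names, the sketch uses the C# 6 null-conditional operator, and skipping controls that lack a tag or a placeholder (rather than binding them some other way) is a design choice made for illustration.  Unlike GetTextFromContentControl, GetText collects every w:t descendant instead of looking only inside w:r and w:ins.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (not from the original post).
public static class DefensiveBinding
{
    static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // One line per w:p when the control contains paragraphs; otherwise
    // (for example, a run-level control) the concatenated w:t text.
    public static string GetText(XElement sdt)
    {
        var paragraphs = sdt.Descendants(w + "p").ToList();
        if (paragraphs.Count == 0)
            return string.Concat(
                sdt.Descendants(w + "t").Select(t => (string)t));
        return string.Join(Environment.NewLine,
            paragraphs.Select(p => string.Concat(
                p.Descendants(w + "t").Select(t => (string)t))));
    }

    // Adds w:dataBinding after w:placeholder for each control that has
    // both a non-empty w:tag and a w:placeholder.  Returns the number of
    // controls that were skipped because one of them was missing.
    public static int AddBindings(XElement root, string storeItemId)
    {
        int skipped = 0;
        foreach (XElement sdt in root.Descendants(w + "sdt").ToList())
        {
            XElement sdtPr = sdt.Element(w + "sdtPr");
            string tag =
                (string)sdtPr?.Element(w + "tag")?.Attribute(w + "val");
            XElement placeholder = sdtPr?.Element(w + "placeholder");
            if (string.IsNullOrEmpty(tag) || placeholder == null)
            {
                skipped++;
                continue;
            }
            placeholder.AddAfterSelf(
                new XElement(w + "dataBinding",
                    new XAttribute(w + "xpath", "/Root/" + tag),
                    new XAttribute(w + "storeItemID", storeItemId)));
        }
        return skipped;
    }
}
```

A nonzero return value tells you that some content controls were left unbound, which is easier to diagnose than a NullReferenceException part-way through the loop.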

Code is attached.

Attachment: DataBoundContentControls.zip

Comments

  • Question regarding the GetTextFromContentControl method in your example. This looks for "p" elements, and there are normally (as far as I've seen) no "p" tags within the "sdt" element that is passed into the method.

    Looking at some of my own Open XML documents, it looks like the following example would be more correct. However, this example does not support placeholders that allow carriage returns.

    e.Element(w + "sdtContent").Element(w + "r").Element(w + "t").Value.Trim()

    Additionally, the code will fail whenever there are placeholders that do not have a tag specified. To avoid this, you can add a check in the foreach loops, something like:

    if (sdt.Element(w + "sdtPr").Element(w + "tag") != null)

    Thanks for a great example!

  • I just read Brian Jones' post, "Taking Advantage of Bound Content Controls" (http://blogs.msdn.com/brian_jones/archive/2009/01/05/taking-advantage-of-bound-content-controls.aspx), where he completely swaps out the custom XML part. The code appears much more concise, but does it fall short in properly reconstructing the custom XML properties part?

  • Hi Eric,

    Can we do the custom binding for content controls that are in header and footer parts?

  • Is there any way I can toggle the content control bordering and highlighting?

    I have some content controls that are very close together and they exhibit some really strange behavior.

  • Hi Eric

    Great blog, and good info on content controls here, but I'm using Word 2007, and when I create a docx with one content control nested within another, save it, and then try to reload that document, Word throws an error and offers to "correct the corrupted document".

    When I click YES, the doc loads, but the nested content control has been stripped and converted to text.

    This is in a completely fresh doc on a system with a fresh install of office 2007, so I'm a bit stumped. Are nested content controls +really+ supported?

    Thanks

  • @Engr_Muneer, have you taken a look at "design mode" for content controls?  It can really help with how you interact with them.  Take a look at this post:

    http://blogs.msdn.com/ericwhite/archive/2010/03/02/using-nested-content-controls-for-data-and-content-extraction.aspx

    @Darin,

    I tried creating nested content controls using Word 2007, and it worked just fine for me.  I tried on multiple installs.  Can you try on some other Word 2007 installs, see if it works elsewhere?

    -Eric

  • @satchi, yes, you can link content controls in headers/footers to custom XML.  The XPath expression refers to elements/attributes in the custom XML part that is related to the main document part.

    -Eric

  • Very strange.

    I'm running Word 12.0.4518.1000 (i.e., the original shipping version, from what I can tell).

    It definitely says that the doc has been corrupted once it's saved and reloaded.

    I went to a colleague's desk. He's running 12.0.6500.5000 (it says it's SP2), and his version works completely differently. No matter what we do at his desk, we can't get it to insert a nested content control at all. The ribbon buttons for controls on the Developer tab are greyed out when the cursor is in a content control.

    Strange

  • I'm running Win Update on this image now. Just have to see if maybe the Sp has something to do with it.

  • Aha! Finally figured out what's up with this.

    Just fyi for anyone else that might come across this page,

    You can't insert a content control into a "Plain text" content control. But it appears that you CAN insert a content control into a "Rich text" content control.

    I suppose that makes a certain amount of sense, but it sure wasn't clear (and the older Word 2007 definitely would LET me do it, even though it appears that it shouldn't have).

    And the controls do appear to save and reload properly, without Word stripping them out.

  • Thank you Darin for figuring this out.  This was one of those assumptions that was so ingrained in my mind that I forgot to mention.  I'm going to update the nested content control blog post to tell this.

    -Eric

  • Since the custom XML part is removed from Word as of January 10... Does anyone know how to achieve content control/custom XML mapping in Word 2010? In other words, how will it be done in Word 2010? We are using the method described in this article to fill content controls from custom XML (Word 2007, before January 10), but how will we achieve that in Word 2010? I'm thinking about the future...
