Using the Open XML SDK


Open XML Packages

To follow this tutorial, you don't need to delve into all of the details of working with packages.  This topic presents a small piece of boilerplate code that opens a Word document and retrieves the main document part, the style definitions part, and the comments part.  It uses LINQ to XML to count the XML nodes in each of the three parts, and prints the counts to the console.

The boilerplate code uses the Open XML SDK, a set of managed classes for .NET that provides more convenient access to Open XML documents.  Using the SDK, you can get the main part of the document and navigate to related parts more easily, which shortens your code considerably.  A separate blog post summarizes the differences between the classes in System.IO.Packaging and the classes in the Open XML SDK.  This example uses the April 2008 CTP of the Open XML SDK.  Another blog post gives lots of information about the Open XML SDK, including where to download it.

Before attempting to compile, don't forget to:

· Add a reference to the WindowsBase assembly.
· Download and install the Open XML SDK.
· Add a reference to the Microsoft.Office.DocumentFormat.OpenXml assembly.

For the interested:

Just a few points about packages.  Various parts in the package are related.  You never rely on absolute paths to retrieve a part, even if you know the path.  Instead, you start from the main part, and use relationships to navigate to the other parts.  As mentioned, many of these parts are XML documents, including files that specify the relationships between parts.  You can access the parts and the relationship files using any conformant XML parser and a library that can open and read from ZIP files.  However, the classes in the namespace System.IO.Packaging (in the WindowsBase assembly) allow you to work with packages in a more convenient way.  You can see a quick summary of how to use relationships to navigate from part to part here.
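
To make the relationship navigation concrete, here is a minimal sketch that uses only System.IO.Packaging (no SDK) to walk from the package to the main document part, and from there to the style part.  It assumes a file named SampleDoc.docx in the working directory and assumes the document uses the transitional Open XML relationship type URIs shown in the constants; a strict-conformance document would use different URIs.  The class name is made up for illustration.

```csharp
using System;
using System.IO;
using System.IO.Packaging;   // in the WindowsBase assembly
using System.Linq;

class RelationshipNavigationSketch
{
    // Relationship types used by transitional Open XML documents (assumption: not a strict document).
    const string OfficeDocumentRelType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";
    const string StylesRelType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles";

    static void Main()
    {
        // Assumption: SampleDoc.docx exists in the working directory.
        using (Package package = Package.Open("SampleDoc.docx", FileMode.Open, FileAccess.Read))
        {
            // Find the main document part through a package-level relationship,
            // rather than assuming it lives at /word/document.xml.
            PackageRelationship docRel =
                package.GetRelationshipsByType(OfficeDocumentRelType).FirstOrDefault();
            if (docRel == null)
            {
                Console.WriteLine("No main document relationship found.");
                return;
            }
            Uri mainUri = PackUriHelper.ResolvePartUri(
                new Uri("/", UriKind.Relative), docRel.TargetUri);
            PackagePart mainPart = package.GetPart(mainUri);
            Console.WriteLine("Main part: {0}", mainPart.Uri);

            // From the main part, follow a part-level relationship to the style part.
            PackageRelationship styleRel =
                mainPart.GetRelationshipsByType(StylesRelType).FirstOrDefault();
            if (styleRel == null)
            {
                Console.WriteLine("The document has no style part.");
                return;
            }
            Uri styleUri = PackUriHelper.ResolvePartUri(mainPart.Uri, styleRel.TargetUri);
            PackagePart stylePart = package.GetPart(styleUri);
            Console.WriteLine("Style part: {0}", stylePart.Uri);
        }
    }
}
```

With the SDK, the same navigation reduces to reading the MainDocumentPart and StyleDefinitionsPart properties, which is the main reason the boilerplate code uses it.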

The code is attached to this page.  Here is the boilerplate code:

using System;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
// Namespace for the April 2008 CTP.  Later SDK releases use DocumentFormat.OpenXml.Packaging.
using Microsoft.Office.DocumentFormat.OpenXml.Packaging;

class Program
{
    // Loads the XML of a part into an XDocument.
    // Returns null if the part is absent (for example, a document with no comments).
    public static XDocument LoadXDocument(OpenXmlPart part)
    {
        if (part == null)
            return null;
        using (StreamReader streamReader = new StreamReader(part.GetStream()))
        using (XmlReader xmlReader = XmlReader.Create(streamReader))
            return XDocument.Load(xmlReader);
    }

    // Counts all descendant nodes; a missing part counts as zero nodes.
    static int CountNodes(XDocument doc)
    {
        return doc == null ? 0 : doc.DescendantNodes().Count();
    }

    static void Main(string[] args)
    {
        const string filename = "SampleDoc.docx";

        // Open read-only (second argument false): this program never modifies the document.
        using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(filename, false))
        {
            // Start from the main part and navigate to the related parts.
            MainDocumentPart mainPart = wordDoc.MainDocumentPart;
            StyleDefinitionsPart styleDefinitionsPart = mainPart.StyleDefinitionsPart;
            WordprocessingCommentsPart commentsPart = mainPart.CommentsPart;

            XDocument xDoc = LoadXDocument(mainPart);
            XDocument styleDoc = LoadXDocument(styleDefinitionsPart);
            XDocument commentsDoc = LoadXDocument(commentsPart);

            Console.WriteLine("The main document part has {0} nodes.", CountNodes(xDoc));
            Console.WriteLine("The style part has {0} nodes.", CountNodes(styleDoc));
            Console.WriteLine("The comments part has {0} nodes.", CountNodes(commentsDoc));
        }
    }
}
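
The node counts printed by the boilerplate come from DescendantNodes(), which counts every node (elements, text, and comments), not just elements.  The following small sketch shows the difference on an inline XML string; the class name and sample XML are made up for illustration and are not part of the attached code.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class NodeCountSketch
{
    static void Main()
    {
        // Hypothetical sample: three elements, one text node, one comment.
        XDocument doc = XDocument.Parse("<root><a>text</a><!-- note --><b/></root>");

        // DescendantNodes() includes elements, text nodes, and comments.
        Console.WriteLine(doc.DescendantNodes().Count());   // prints 5

        // Descendants() includes elements only.
        Console.WriteLine(doc.Descendants().Count());       // prints 3
    }
}
```

If you want element counts rather than node counts for the three parts, swapping DescendantNodes() for Descendants() in the boilerplate is enough.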

 


Posted: Tuesday, April 22, 2008 11:42 PM by EricWhite

Attachment(s): UsingTheOpenXmlSdk.cs

Comments

Joris Kalz said:

I had to change CommentsPart to WordprocessingCommentsPart, else it would not find CommentsPart in Apr08 version of the SDK.

# May 21, 2008 5:51 PM

EricWhite said:

You're right.  This type's name changed with the Apr08 CTP.  I've updated the code in the post.  Thanks.

# May 21, 2008 6:17 PM

Eric White's Blog said:

An easy way to see the XML in the parts of a WordprocessingML document

# June 19, 2008 3:20 AM

Marv Schwartz said:

After adding a reference to the OpenXML SDK, I had to change the using statement to:

using DocumentFormat.OpenXml.Packaging;

Also, the code in this section has not yet been updated to WordProcessingCommentsPart. Am I studying a copy that has been superseded?

# July 10, 2008 11:07 AM