Working with In-Memory Open XML Documents

Sometimes you want to work with Open XML documents in memory.  There are two scenarios that I know of:

  • When working with document libraries in SharePoint, you retrieve a document from the document library as a byte array.  You can then modify it as necessary, and then put it back into the document library, either as a new document, or replacing the original.  This post shows how to do this.
  • In a web application, you may want to fabricate Open XML documents on the fly and serve them up to remote users.  You don’t want to serialize such temporary documents to the file system.  After creating them, you want to send them directly to the end user of the web application.

This blog post presents a bit of code that shows how to work with in-memory documents as a MemoryStream.  The code works with either Open XML SDK V1 or CTP1 of the Open XML SDK V2.

There is one important point to make about using the Open XML SDK with MemoryStream objects.  There is a MemoryStream constructor that takes a byte array as an argument.  However, we can’t use that constructor because it creates a non-resizable instance of the MemoryStream class, and the Open XML SDK needs a resizable memory stream, as parts may change in size when serialized back into the Open XML package.  Instead, we use the constructor that takes no parameters.  This creates a resizable MemoryStream.  We can then write the byte array to the MemoryStream, and then open the Open XML package from the MemoryStream (using the WordprocessingDocument class in this example).
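
To make the difference between the two constructors concrete, here is a minimal sketch that uses only System.IO. The class and method names (ResizableStreamDemo, ToResizableStream, CanGrow) are made up for this illustration and are not part of the Open XML SDK. It probes whether a stream can grow by one byte, which is roughly what happens when a part gets larger on save.

```csharp
using System;
using System.IO;

// Hypothetical helper class, for illustration only.
public static class ResizableStreamDemo
{
    // Copies the bytes into a MemoryStream created with the parameterless
    // constructor, so the stream can grow later.
    public static MemoryStream ToResizableStream(byte[] bytes)
    {
        MemoryStream mem = new MemoryStream();
        mem.Write(bytes, 0, bytes.Length);
        mem.Position = 0;
        return mem;
    }

    // Returns true if the stream accepts a length one byte larger than
    // its current length; a fixed-size stream throws NotSupportedException.
    public static bool CanGrow(MemoryStream stream)
    {
        try
        {
            stream.SetLength(stream.Length + 1);
            return true;
        }
        catch (NotSupportedException)
        {
            return false;
        }
    }
}
```

With this sketch, CanGrow(new MemoryStream(bytes)) should return false for a stream wrapped around an existing array, while CanGrow(ToResizableStream(bytes)) should return true.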

After opening the WordprocessingDocument, we can work with the document as normal using the Open XML SDK.  After leaving the scope of the ‘using’ statement that opens the document, the memory stream will contain the new, modified document.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;

public static class LocalExtensions
{
    // Returns the part's XML as an XDocument. The XDocument is cached as an
    // annotation on the part, so later calls return the same instance.
    public static XDocument GetXDocument(this OpenXmlPart part)
    {
        XDocument xdoc = part.Annotation<XDocument>();
        if (xdoc != null)
            return xdoc;
        using (StreamReader sr = new StreamReader(part.GetStream()))
        using (XmlReader xr = XmlReader.Create(sr))
            xdoc = XDocument.Load(xr);
        part.AddAnnotation(xdoc);
        return xdoc;
    }

    public static void PutXDocument(this OpenXmlPart part)
    {
        XDocument xdoc = part.GetXDocument();
        if (xdoc != null)
        {
            // Serialize the XDocument object back to the package.
            using (XmlWriter xw =
                XmlWriter.Create(part.GetStream(FileMode.Create, FileAccess.Write)))
            {
                xdoc.Save(xw);
            }
        }
    }

    // Concatenates a sequence of strings into a single string.
    public static string StringConcatenate(this IEnumerable<string> source)
    {
        return source.Aggregate(
            new StringBuilder(),
            (s, i) => s.Append(i),
            s => s.ToString());
    }
}

class Program
{
    static void Main(string[] args)
    {
        byte[] byteArray = File.ReadAllBytes("Test.docx");
        using (MemoryStream mem = new MemoryStream())
        {
            mem.Write(byteArray, 0, byteArray.Length);
            using (WordprocessingDocument wordDoc =
                WordprocessingDocument.Open(mem, true))
            {
                XNamespace w =
                    "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

                // modify the document as necessary
                // for this example, we'll convert the first paragraph to upper case
                XDocument doc = wordDoc.MainDocumentPart.GetXDocument();
                XElement firstParagraph = doc
                    .Element(w + "document")
                    .Element(w + "body")
                    .Element(w + "p");
                if (firstParagraph != null)
                {
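                    // Collect the paragraph text. Text inside tracked insertions
                    // (w:ins) is already held in descendant w:t elements, so
                    // selecting w:ins as well would include that text twice.
                    string text = firstParagraph
                        .Descendants(w + "t")
                        .Select(n => (string)n)
                        .StringConcatenate();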
                    firstParagraph.ReplaceWith(
                        new XElement(w + "p",
                            new XElement(w + "r",
                                new XElement(w + "t", text.ToUpper()))));
                    // write the XDocument back into the Open XML document
                    wordDoc.MainDocumentPart.PutXDocument();
                }
            }
            // at this point, the MemoryStream contains the modified document.
            // We could write it back to a SharePoint document library or serve
            // it from a web server.

            // in this example, we'll serialize back to the file system to verify
            // that the code worked properly.
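            // FileMode.Create overwrites Test2.docx if it already exists;
            // FileMode.CreateNew would throw an IOException on a second run.
            using (FileStream fileStream = new FileStream("Test2.docx",
                FileMode.Create))
            {
                mem.WriteTo(fileStream);
            }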
        }
    }
}
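
For the SharePoint scenario, where the document arrives as a byte array and goes back as a byte array, the same pattern can be wrapped in a small helper. This is only a sketch: InMemoryDocument.Modify is a hypothetical name, not an Open XML SDK API, and the callback is where you would open the package with WordprocessingDocument.Open(mem, true) inside a 'using' statement, as in the listing above.

```csharp
using System;
using System.IO;

// Hypothetical helper class, for illustration only.
public static class InMemoryDocument
{
    // Copies the original bytes into a resizable MemoryStream, lets the
    // caller modify the stream, and returns the resulting bytes.
    public static byte[] Modify(byte[] original, Action<MemoryStream> modify)
    {
        using (MemoryStream mem = new MemoryStream())
        {
            mem.Write(original, 0, original.Length);
            mem.Position = 0;

            // The caller is assumed to open and close the Open XML package
            // inside this callback, so the stream is complete afterwards.
            modify(mem);

            // ToArray returns the whole contents regardless of Position.
            return mem.ToArray();
        }
    }
}
```

The returned array can then be written back to the document library or sent to the client. The package must be closed before the callback returns; otherwise the stream may not yet contain the saved parts.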

 

Code is attached.

Posted: Wednesday, December 10, 2008 8:13 PM by EricWhite
Attachment(s): Program.cs

Comments

Julien Chable said:

Among the technical posts not to be missed: How to assemble Word 2007 documents (using

# December 12, 2008 1:20 AM

Bryan S. said:

Thanks a million for your insight on this subject. Using your logic I got my asp.net app working like a champ now.

# December 22, 2008 2:07 PM

Doug Mahugh said:

Man, it's already the second week of 2009. Where does the time go? Here are a few links to posts and

# January 12, 2009 11:56 PM

Jason West said:

I've been banging my head against the wall trying to figure out how to copy a template into a new file. Finally, something that works!! This is AWESOME! Thx!

# September 11, 2009 11:33 AM

EricWhite said:

Thanks Jason!  I'm happy it worked for you!

# September 11, 2009 1:53 PM

Karen said:

hi,

i have a problem using the memory stream, maybe you can help..

when working on the localhost the code works great!

but when i copy the project to the main server i receive : http 404 file or directory not found.

i don't understand why. i am able to read into a filestream the file. and then send the memory stream as in your example.

but when i try : wordprocessingdocument.open (memstream,true)

i get the error. how can this be? i am working on the memorystream and still get the error of file (404)?

please help!!!!

# December 24, 2009 1:58 AM

EricWhite said:

Hi Karen,  I would guess that opening the memory stream is not the cause of the issue.  Opening the memory stream doesn't cause an http 404 error.  The WordprocessingDocument.Open method throws exceptions (ArgumentNullException and OpenXmlPackageException).  I suspect that you have a security configuration issue on your web server.  I am not an expert in those areas, but that is where I would start looking.

-Eric

# December 24, 2009 9:10 AM