Working with In-Memory Open XML Documents

Sometimes you want to work with Open XML documents in memory.  There are two scenarios that I know of:

  • When working with document libraries in SharePoint, you retrieve a document from the document library as a byte array.  You can then modify it as necessary, and then put it back into the document library, either as a new document, or replacing the original.  This post shows how to do this.
  • In a web application, you may want to fabricate Open XML documents on the fly and serve them up to remote users.  You don’t want to serialize such temporary documents to the file system.  After creating them, you want to send them directly to the end user of the web application.

This blog post presents a bit of code that shows how to work with in-memory documents as a MemoryStream.  The code works with either Open XML SDK V1 or CTP1 of the Open XML SDK V2.

There is one important point to make about using the Open XML SDK with MemoryStream objects.  There is a MemoryStream constructor that takes a byte array as an argument.  However, we can’t use that constructor because it creates a non-resizable instance of the MemoryStream class, and the Open XML SDK needs a resizable memory stream, as parts may change in size when serialized back into the Open XML package.  Instead, we use the constructor that takes no parameters.  This creates a resizable MemoryStream.  We can then write the byte array to the MemoryStream, and then open the Open XML package from the MemoryStream (using the WordprocessingDocument class in this example).
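
The difference between the two constructors is easy to see without the Open XML SDK at all.  The following is a minimal sketch, using only System.IO; the class name MemoryStreamDemo and the helper CanGrow are made up for this illustration.  It tries to append one byte to each kind of stream and reports whether the stream was able to expand.

```csharp
using System;
using System.IO;

public static class MemoryStreamDemo
{
    // Hypothetical helper: returns true if one more byte can be appended
    // past the current end of the stream, false if the stream cannot expand.
    public static bool CanGrow(MemoryStream stream)
    {
        try
        {
            stream.Seek(0, SeekOrigin.End);
            stream.WriteByte(0xFF);
            return true;
        }
        catch (NotSupportedException)
        {
            // Thrown when the stream wraps a fixed-size byte array.
            return false;
        }
    }

    public static void Main()
    {
        byte[] bytes = { 1, 2, 3, 4 };

        // Wraps the array directly: writable, but fixed in size.
        MemoryStream fixedSize = new MemoryStream(bytes);

        // Starts empty and expands as data is written to it.
        MemoryStream resizable = new MemoryStream();
        resizable.Write(bytes, 0, bytes.Length);

        Console.WriteLine(CanGrow(fixedSize));  // False
        Console.WriteLine(CanGrow(resizable));  // True
    }
}
```

This is why the code in this post always starts from an empty MemoryStream and copies the byte array into it.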

After opening the WordprocessingDocument, we can work with the document as normal using the Open XML SDK.  After leaving the scope of the ‘using’ statement that opens the document, the memory stream will contain the new, modified document.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;
 
public static class LocalExtensions
{
    public static XDocument GetXDocument(this OpenXmlPart part)
    {
        XDocument xdoc = part.Annotation<XDocument>();
        if (xdoc != null)
            return xdoc;
        using (StreamReader sr = new StreamReader(part.GetStream()))
        using (XmlReader xr = XmlReader.Create(sr))
            xdoc = XDocument.Load(xr);
        part.AddAnnotation(xdoc);
        return xdoc;
    }
 
    public static void PutXDocument(this OpenXmlPart part)
    {
        XDocument xdoc = part.GetXDocument();
        if (xdoc != null)
        {
            // Serialize the XDocument object back to the package.
            using (XmlWriter xw = XmlWriter.Create(
                part.GetStream(FileMode.Create, FileAccess.Write)))
            {
                xdoc.Save(xw);
            }
        }
    }
 
    public static string StringConcatenate(
        this IEnumerable<string> source)
    {
        return source.Aggregate(
            new StringBuilder(),
            (s, i) => s.Append(i),
            s => s.ToString());
    }
}
 
class Program
{
    static void Main(string[] args)
    {
        byte[] byteArray = File.ReadAllBytes("Test.docx");
        using (MemoryStream mem = new MemoryStream())
        {
            mem.Write(byteArray, 0, byteArray.Length);
            using (WordprocessingDocument wordDoc =
                WordprocessingDocument.Open(mem, true))
            {
                XNamespace w =
                    "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
 
                // modify the document as necessary
                // for this example, we'll convert the first paragraph to upper case
                XDocument doc = wordDoc.MainDocumentPart.GetXDocument();
                XElement firstParagraph = doc
                    .Element(w + "document")
                    .Element(w + "body")
                    .Element(w + "p");
                if (firstParagraph != null)
                {
                    string text = firstParagraph
                        .Descendants()
                        // match w:t only: inserted text sits in w:t elements nested
                        // inside w:ins, so also matching w:ins would count it twice
                        .Where(n => n.Name == w + "t")
                        .Select(n => (string)n)
                        .StringConcatenate();
                    firstParagraph.ReplaceWith(
                        new XElement(w + "p",
                            new XElement(w + "r",
                                new XElement(w + "t", text.ToUpper()))));
                    // write the XDocument back into the Open XML document
                    wordDoc.MainDocumentPart.PutXDocument();
                }
            }
            // at this point, the MemoryStream contains the modified document.
            // We could write it back to a SharePoint document library or serve
            // it from a web server.
 
            // in this example, we'll serialize back to the file system to verify
            // that the code worked properly.
            // FileMode.Create overwrites Test2.docx, so the example can be run repeatedly
            using (FileStream fileStream = new FileStream("Test2.docx",
                System.IO.FileMode.Create))
            {
                mem.WriteTo(fileStream);
            }
        }
    }
}
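
If you need the modified document as a byte array again (for example, to hand it back to a SharePoint document library), one option is MemoryStream.ToArray.  The sketch below uses only System.IO; the class StreamBytes and the helper ToDocumentBytes are hypothetical names for this illustration, and the four-byte array stands in for real document contents.  It also shows why GetBuffer is a poor fit here: it returns the stream's internal buffer, which can be larger than the data written to it.

```csharp
using System;
using System.IO;

public static class StreamBytes
{
    // Hypothetical helper (not part of the Open XML SDK): returns a copy of
    // exactly the bytes written to the stream, without unused buffer capacity.
    public static byte[] ToDocumentBytes(MemoryStream mem)
    {
        return mem.ToArray();
    }

    public static void Main()
    {
        byte[] original = { 1, 2, 3, 4 };  // stand-in for real .docx bytes
        using (MemoryStream mem = new MemoryStream())
        {
            mem.Write(original, 0, original.Length);
            // ... open the package from mem and modify it here ...

            byte[] exact = ToDocumentBytes(mem);
            byte[] raw = mem.GetBuffer();

            Console.WriteLine(exact.Length == mem.Length);  // True
            Console.WriteLine(raw.Length >= exact.Length);  // True
        }
    }
}
```

Any padding that GetBuffer includes after the real data would end up in the stored file, so ToArray (or WriteTo, as used above) is the safer choice.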
 

Code is attached.

Attachment: Program.cs

Comments

  • Among the technical posts not to be missed: How to assemble Word 2007 documents (using

  • Thanks a million for your insight on this subject. Using your logic I got my asp.net app working like a champ now.

  • Man, it's already the second week of 2009. Where does the time go? Here are a few links to posts and

  • I've been banging my head against the wall trying to figure out how to copy a template into a new file. Finally, something that works!! This is AWESOME! Thx!

  • Thanks Jason!  I'm happy it worked for you!

  • hi,

    i have a problem using the memory stream, maybe you can help..

    when working on the localhost the code works great!

    but when i copy the project to the main server i receive : http 404 file or directory not found.

    i don't understand why. i am able to read into a filestream the file. and then send the memory stream as in your example.

    but when i try : wordprocessingdocument.open (memstream,true)

    i get the error. how can this be? i am working on the memorystream and still get the error of file (404)?

    please help!!!!

  • Hi Karen,  I would guess that opening the memory stream is not the cause of the issue.  Opening the memory stream doesn't cause an http 404 error.  The WordprocessingDocument.Open method throws exceptions (ArgumentNullException and OpenXmlPackageException).  I suspect that you have a security configuration issue on your web server.  I am not an expert in those areas, but that is where I would start looking.

    -Eric

  • How can i perform a find and replace in this document?

    ex:

    I have a string to replace "<<NAME>>" and i want to replace with another value.

  • Hi Eric,

    How can I serve the file to the client to download from a web application?

    Thanks

  • Hi Chieri, I'm not an asp.net expert, but found some links:

    http://dotnetslackers.com/Community/blogs/haissam/archive/2007/04/03/Downloading-Files-C_2300_.aspx

    http://msdn.microsoft.com/en-us/library/aa478985.aspx

    http://www.west-wind.com/weblog/posts/76293.aspx

    -Eric

  • Thank you Eric, I spent an embarrassing amount of time getting this to work. I don't think that it would have been possible to come up with PutXDocument() without this blog. One other issue that I had though was in returning a byte[] to my database. I couldn't find a way to get the correct contents out of the memorystream so I wrote to another one. I suspect that there is a better way...

    so this: return ms.GetBuffer()  Did not work

    but this:

               MemoryStream output = new MemoryStream();

               ms.WriteTo(output);

               return output.GetBuffer();
