Erika Ehrli - Adventures with Office Products & Technologies
MSDN & TechNet: Releasing Office, SharePoint, Exchange & Lync Centers and content for developers and IT professionals.

Data-driven document generation with Word 2007 and the Office XML File Formats: Part 2

Data-driven document generation with Word 2007 and the Office XML File Formats: Part 2

  • Comments 49

So you are looking for a component/technology/solution to help you generate Office files (documents, workbooks, and presentations) using a server-side application, and of course, using managed code.

I can tell you that this need has been brought to my attention at some conferences, questions in DLs, customer feedback, and as a common customer requirement when I was working as a consultant. You know that using automation and interop is definitely not an option. However, the Office XML File Formats are a great option.

You can build a server-side application using Visual Studio to generate data-rich documents using the Office XML File Formats and  the .NET Framework 3.0 (aka Microsoft WinFX ). Here's how...

In my previous blog entry I demonstrated how to build and Word 2007 template and connect an item in the data store. You built a customer letter document template with content controls that map to an XML file. In this blog entry, I will show you how to create a server-side data-driven document generation application that will allow you to select a company name to generate a custom letter. The application will retrieve customer data from a SQL Server database and use the customer letter document template to build a new document that displays customer data based on a user selection. The document will display the following information:

  • Company Name
  • Contact Name
  • Contact Title
  • Phone

To create a server-side application that pulls data from a SQL Server database and generates a new Word 2007 document

 

You can create a Web-based application that enables users to select a company name and generate a custom letter. The Web-based application retrieves customer data from a Microsoft SQL Server database, opens the customer letter document template, and creates a new document that displays customer data based on a user selection. This application does not require the use of Word 2007 or VBA. You can use your favorite managed code (Microsoft Visual Basic .NET or C#) language to build this application. To build this application, do the following:

  1. Open Microsoft Visual Studio 2005 or Microsoft Visual Web Developer 2005.
  2. Create an ASP.NET Web site and name it SqlServerSample.
  3. Connect the ASP.NET Web site to a Microsoft SQL Server database.
  4. Add a connection string to the Web.config file as follows:
    <connectionStrings>
     
    <add name="NorthwindConnectionString"
         connectionString
    ="data source=(local);database=Northwind; integrated security=SSPI;persist security info=false;"
         providerName
    ="System.Data.SqlClient" />
    </
    connectionStrings>
  5. Add the CustomerLetterTemplate.docx to the App_Data folder.
  6. Download and install the Microsoft .NET Framework 3.0 (formerly Microsoft WinFX).
  7. Configure the assembly in the Web.config file as follows:
    <compilation debug="false">
          
    <assemblies>
            
    <add assembly="WindowsBase, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35"/>
       </
    assemblies>
     
    </compilation>
  8. Create a Web Form and replace the default.aspx code with the following sample code.

The following sample shows how to bind to a Microsoft SQL Server database to retrieve data based on a customer selection and create a new document based on the CustomerLetterTemplate.docx.

[VB.NET]

<%@ Page Language="VB" AutoEventWireup="true" %>
<%@ Import 
Namespace="System.Data" %>
<%@ Import 
Namespace="System.Data.SqlClient" %>
<%@ Import 
Namespace="System.IO" %>
<%@ Import 
Namespace="System.IO.Packaging" %>
<%@ Import 
Namespace="System.Xml" %>
<!DOCTYPE html 
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns
="http://www.w3.org/1999/xhtml">
<head id
="Head1" runat="server">
    <title>Data-Driven Document Generation - SQL Server Sample</title>
</head>

<script runat
='server'>
    
    
Private Const strRelRoot As String = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
    
    
Private Sub CreateDocument()
        
' Get the template file and create a stream from it
        
Const TemplateFile As String = "~/App_Data/CustomerTemplate.docx"
        
        
' Read the file into memory
        
Dim buffer() As Byte = File.ReadAllBytes(Server.MapPath(TemplateFile))
        
Dim memoryStream As MemoryStream = New MemoryStream(buffer, True)
        buffer 
Nothing
        
        
' Open the document in the stream and replace the custom XML part
        
Dim pkgFile As Package Package.Open(memoryStream, FileMode.Open, FileAccess.ReadWrite)
        
Dim pkgrcOfficeDocument As PackageRelationshipCollection pkgFile.GetRelationshipsByType(strRelRoot)
        
For Each pkgr As PackageRelationship In pkgrcOfficeDocument
            
If (pkgr.SourceUri.OriginalString "/"Then
                                
               
                
' Add a custom XML part to the package
                
Dim uriData As Uri = New Uri("/customXML/item1.xml", UriKind.Relative)
                
If pkgFile.PartExists(uriData) Then
                
                    
' Delete template "/customXML/item1.xml" part
                    
pkgFile.DeletePart(uriData)
                
End If
                
                
' Load the custom XML data
                
Dim pkgprtData As PackagePart pkgFile.CreatePart(uriData, "application/xml")
                GetDataFromSQLServer(pkgprtData.GetStream, ddlCustomer.SelectedValue)
            
End If
        Next
        
        
' Close the file
        
pkgFile.Close()
        
        
' Return the result
        
Response.ClearContent()
        Response.ClearHeaders()
        Response.AddHeader(
"content-disposition""attachment; filename=document.docx")
        Response.ContentEncoding 
System.Text.Encoding.UTF8
        memoryStream.WriteTo(Response.OutputStream)
        memoryStream.Close()
        Response.
End()
    
End Sub
    
    Private Sub 
GetDataFromSQLServer(ByVal stream As Stream, ByVal customerID As String)
        
        
'Connect to a Microsoft SQL Server database and get data
        
Dim source As String = ConfigurationManager.ConnectionStrings("NorthwindConnectionString").ConnectionString
        
Const SqlStatement As String = "SELECT CompanyName, ContactName, ContactTitle, Phone FROM Customers WHERE CustomerID=@customerID"
        
Dim conn As SqlConnection = New SqlConnection(source)
        conn.Open()
        
Dim cmd As SqlCommand = New SqlCommand(SqlStatement, conn)
        cmd.Parameters.AddWithValue(
"@customerID", customerID)
        
Dim dr As SqlDataReader cmd.ExecuteReader
        
If dr.Read Then
            Dim 
writer As XmlWriter XmlWriter.Create(stream)
            writer.WriteStartElement(
"Customer")
            writer.WriteElementString(
"CompanyName"CType(dr("CompanyName"), String))
            writer.WriteElementString(
"ContactName"CType(dr("ContactName"), String))
            writer.WriteElementString(
"ContactTitle"CType(dr("ContactTitle"), String))
            writer.WriteElementString(
"Phone"CType(dr("Phone"), String))
            writer.WriteEndElement()
            writer.Close()
        
End If
        
dr.Close()
        conn.Close()
    
End Sub
    
    Protected Sub 
SubmitBtn_Click(ByVal sender As ObjectByVal As EventArgs)
        CreateDocument()
    
End Sub
</script>

<body>
    <form id
="form1" runat="server">
        <div>
            <h1>
                Customer Letter Generator</h1>
            <table border
="0" cellpadding="0" cellspacing="0" style="width: 100%; height: 12%">
                <tr>
                    <td>
                        
Choose a customer:</td>
                    <td>
                        <asp:DropDownList ID
="ddlCustomer" runat="server" AutoPostBack="True" DataSourceID="CustomerData"
                            
DataTextField="CompanyName" DataValueField="CustomerID" Width="301px">
                        </asp:DropDownList>
                        <asp:SqlDataSource ID
="CustomerData" runat="server" ConnectionString="<%$ ConnectionStrings:NorthwindConnectionString %>"
                            
SelectCommand="SELECT [CustomerID], [CompanyName] FROM [Customers]" ProviderName="<%$ ConnectionStrings:NorthwindConnectionString.ProviderName %>">
                        </asp:SqlDataSource>
                    </td>
                </tr>
            </table>
        </div>
        <br />
        <asp:Button ID
="Button1" runat="server" OnClick="SubmitBtn_Click" Text="Create Letter"
            
Width="123px" />
    </form>
</body>
</html>
 
 

[C#]

<%@ Page Language="C#" AutoEventWireup="true"%>
<%@ Import Namespace
="System.Data" %>
<%@ Import Namespace
="System.Data.SqlClient" %>    
<%@ Import Namespace
="System.IO" %>
<%@ Import Namespace
="System.IO.Packaging" %>
<%@ Import Namespace
="System.Xml" %>

<!DOCTYPE html PUBLIC 
"-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns
="http://www.w3.org/1999/xhtml" >
<head id
="Head1" runat="server">
    <title>Data-Driven Document Generation - SQL Server Sample</title>
</head>
    <script language
="C#" runat="server">
        
        
private const string strRelRoot "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

        private void 
CreateDocument() {
            
// Get the template file and create a stream from it
            
const string TemplateFile @"~/App_Data/CustomerTemplate.docx";

            
// Read the file into memory
            
byte[] buffer File.ReadAllBytes(Server.MapPath(TemplateFile));
            
MemoryStream memoryStream = new MemoryStream(buffer, true);
            
buffer = null;

            
// Open the document in the stream and replace the custom XML part
            
Package pkgFile Package.Open(memoryStream, FileMode.Open, FileAccess.ReadWrite);
            
PackageRelationshipCollection pkgrcOfficeDocument pkgFile.GetRelationshipsByType(strRelRoot);
            foreach 
(PackageRelationship pkgr in pkgrcOfficeDocument) {
                
if (pkgr.SourceUri.OriginalString == "/") {
                  
                    // Add a custom XML part to the package
                    
Uri uriData = new Uri("/customXML/item1.xml", UriKind.Relative);

                    if 
(pkgFile.PartExists(uriData)) {
                        
// Delete template "/customXML/item1.xml" part
                        
pkgFile.DeletePart(uriData);
                    
}
                    
// Load the custom XML data
                    
PackagePart pkgprtData pkgFile.CreatePart(uriData, "application/xml");
                    
GetDataFromSQLServer(pkgprtData.GetStream(), ddlCustomer.SelectedValue);
                
}
            }

            
// Close the file
            
pkgFile.Close();

            
// Return the result
            
Response.ClearContent();
            
Response.ClearHeaders();
            
Response.AddHeader("content-disposition""attachment; filename=document.docx");
            
Response.ContentEncoding System.Text.Encoding.UTF8;

            
memoryStream.WriteTo(Response.OutputStream);

            
memoryStream.Close();

            
Response.End();
        
}

        
private void GetDataFromSQLServer(Stream stream, string customerID) {
            
//Connect to a Microsoft SQL Server database and get data
            
String source ConfigurationManager.ConnectionStrings["NorthwindConnectionString"].ConnectionString;
            const string 
SqlStatement =
                
"SELECT CompanyName, ContactName, ContactTitle, Phone FROM Customers WHERE CustomerID=@customerID";

            using 
(SqlConnection conn = new SqlConnection(source)) {
                conn.Open()
;
                
SqlCommand cmd = new SqlCommand(SqlStatement, conn);
                
cmd.Parameters.AddWithValue("@customerID", customerID);
                
SqlDataReader dr cmd.ExecuteReader();

                if 
(dr.Read()) {
                    XmlWriter writer 
XmlWriter.Create(stream);
                    
writer.WriteStartElement("Customer");
                    
writer.WriteElementString("CompanyName", (string)dr["CompanyName"]);
                    
writer.WriteElementString("ContactName", (string)dr["ContactName"]);
                    
writer.WriteElementString("ContactTitle", (string)dr["ContactTitle"]);
                    
writer.WriteElementString("Phone", (string)dr["Phone"]);
                    
writer.WriteEndElement();
                    
writer.Close();
                
}
                dr.Close()
;
                
conn.Close();
            
}
        }

        
protected void SubmitBtn_Click(object sender, EventArgs e) {
            CreateDocument()
;
        
}
        
    </script>

<body>
    <form id
="form1" runat="server">
    <div>
            <h1>Customer Letter Generator</h1>
            <table border
="0" cellpadding="0" cellspacing="0" style="width: 100%; height: 12%">
                <tr>
                    <td>
                        Choose a customer:</td>
                    <td>
                        <asp:DropDownList 
                           ID
="ddlCustomer"
                           
runat="server"
                           
AutoPostBack="True"
                           
DataSourceID="CustomerData"
                           
DataTextField="CompanyName"
                           
DataValueField="CustomerID" 
                           
Width="301px">
                        </asp:DropDownList>
                        <asp:SqlDataSource
                          ID
="CustomerData"
                          
runat="server"
                          
ConnectionString="<%$ ConnectionStrings:NorthwindConnectionString %>"
                          
SelectCommand="SELECT [CustomerID], [CompanyName] FROM [Customers]" ProviderName="<%$ ConnectionStrings:NorthwindConnectionString.ProviderName %>">
                        </asp:SqlDataSource>
                    </td>
                </tr>
          </table>
        </div>
        <br />
        <asp:Button
          ID
="Button1"
          
runat="server"
          
OnClick="SubmitBtn_Click" 
          
Text="Create Letter"
          
Width="123px" />
    </form>
</body>
</html>


If you build and run this application, you will see something like this:
 
As you can see, the code is simple and optimized for performance and memory. You can start building data-driven document solutions using this code. If your data source is an Access database, Web service, Oracle database, text file, or any other data source, it doesn't matter. Bottom-line, you only need to change the data access layer code. What matters the most is that using this technology you will be able to build server-side applications that generate Office documents using managed-code. I just love the potential that the Office XML File Formats provide and many thanks to Tristan Davis for sharing his knowledge on how to do this :). 
 
Finally, I found out today that Microsoft published a virtual lab related with the Office XML File Formats and I wanted to share this as well:
 
Programmatic Manipulation of the Microsoft Office Open XML Formats

Learn how to manipulate Microsoft Office system documents using the Microsoft Office Open XML Formats without the 2007 release. Work through scenarios involving programmatically manipulating documents using the Microsoft Office Open XML Formats.


Enjoy!
~Erika
Attachment: SqlServerSample.zip
Leave a Comment
  • Please add 5 and 8 and type the answer here:
  • Post
  • Thanks Eri, Both posts are very useful.

    If anybody is trying to build windows application instead of web-based application, then add reference DLL “Windowsbase.dll” from "C:\Program Files\Reference Assemblies\Microsoft\Framework\v3.0" folder for “System.IO.Packaging” namespace.
    (If you have installed .NET Framework 3.0)

    -Imtiyaz
  • Hi Imi!

    It's always nice to see your comments :).
  • I've been planning to start pulling together some real world examples around the ways in which people...
  • Hello,

    I tried to execute your sample code but it fires an exception when it closes the XmlWriter at rutine 'GetDataFromSQLServer'. The exception says something like 'Sequence couldn't be expanded'.

    Could you orient to me?

    Thanks a lot.

    ---- The stack at this point is: ----

    NotSupportedException: No se puede expandir la secuencia de memoria.]
      System.IO.__Error.MemoryStreamNotExpandable() +54
      System.IO.MemoryStream.set_Capacity(Int32 value) +33
      System.IO.MemoryStream.EnsureCapacity(Int32 value) +1986396
      System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count) +1986476
      System.IO.BinaryWriter.Write(UInt16 value) +54
      MS.Internal.IO.Zip.ZipIOCentralDirectoryFileHeader.Save(BinaryWriter writer) +47
      MS.Internal.IO.Zip.ZipIOCentralDirectoryBlock.Save() +748
      MS.Internal.IO.Zip.ZipIOBlockManager.SaveContainer(Boolean closingFlag) +659
      MS.Internal.IO.Zip.ZipIOBlockManager.SaveStream(ZipIOLocalFileBlock blockRequestingFlush, Boolean closingFlag) +100
      MS.Internal.IO.Zip.ZipIOFileItemStream.Flush() +45
      MS.Internal.IO.Zip.ProgressiveCrcCalculatingStream.Flush() +31
      MS.Internal.IO.Zip.ZipIOModeEnforcingStream.Flush() +31
      System.Xml.XmlUtf8RawTextWriter.Flush() +48
      System.Xml.XmlWellFormedWriter.Close() +57
      ASP.default_aspx.GetDataFromSQLServer(Stream stream, String customerID) in D:\Mis Documentos\Visual Studio 2005\Projects\Tests\TestOOXml\Default.aspx:81
      ASP.default_aspx.CreateDocument() in D:\Mis Documentos\Visual Studio 2005\Projects\Tests\TestOOXml\Default.aspx:46
      ASP.default_aspx.SubmitBtn_Click(Object sender, EventArgs e) in D:\Mis Documentos\Visual Studio 2005\Projects\Tests\TestOOXml\Default.aspx:88
      System.Web.UI.WebControls.Button.OnClick(EventArgs e) +96
      System.Web.UI.WebControls.Button.RaisePostBackEvent(String eventArgument) +116
      System.Web.UI.WebControls.Button.System.Web.UI.IPostBackEventHandler.RaisePostBackEvent(String eventArgument) +31
      System.Web.UI.Page.RaisePostBackEvent(IPostBackEventHandler sourceControl, String eventArgument) +32
      System.Web.UI.Page.RaisePostBackEvent(NameValueCollection postData) +72
      System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +3840

    Jesus:

    I compiled my solution with B2TR and it works. I attached the code to the blog entry (SQLServerSample.zip). The only thing you need to change is the connection string.

     

    I hope this helps,

    -Erika

  • I accidentally deleted a comment while cleaning spam. But I rememeber that someone hinted me that I didn't need this line of code:

    // Get the root part

                       PackagePart pkgpRoot = pkgFile.GetPart(new Uri("/" + pkgr.TargetUri.ToString(), UriKind.Relative));

    You are absolutely right, I was following a different approach before I decided to replace the CustomXMLPart and forgot to delete that line.

    Great catch! Thanks a lot :).

    I updated the samples and attached the source files.

  • Hi Erika,

    I have to tell you that I have the same issue as Jesus getting an exception at the same line ("'Sequence couldn't be expanded'"). I am using the Beta2TR and have WinFx 3.0

    Please let me know.

    Thanks.

  • Hi Erika.

    How do I open a Word 2007 document (template) add some data (to a contentplaceholder) and print the document, from serverside code, without automating word, like in the bad old days?

  • Karl,

    You can download the attached project, this code uses System.IO.Packaging and the File Formats to open aWord 2007 document template, replace data in content controls with server-side code. No automation!

    http://blogs.msdn.com/erikaehrli/attachment/703088.ashx

    You should read this article:

    Server-Side Generation of Word 2007 Docs by Ted Pattison

    http://msdn.microsoft.com/msdnmag/issues/06/11/BasicInstincts/default.aspx

    It's quite helpful.

    -----------------------------------

    Jesus and Alberto.

    I run the sample using C#, but didn't test the VB.NET sample with B2TR, are you using VB.NET? I can't reproduce, but seems to me by reading the error trace that the problem has something to do with the memory stream and the size of the document. The problem happens at the moment of creating the document.

  • Hi Erika.

    Thank you for the answer. I am quite aware of the posibilites for manipulating a word document (great), and I have also read the material you refer to. But, I am still puzzled regarding the actual printing process. It is not a problem to print a fixed document XPS file. The question is, how do I do exactly the same with a words Office Open XML document, without involving Word itself. Whenever I try to print (using XpsDocumentWriter) I get an error because the document is not a fixed document? Mqyb I am missing the obvious?

    Could you show a code sample?

  • Karl!

    Seems I forgot the "printing" part of your question.

    Printing server-side is an interesting problem so I will have split my answer since there's two things to consider.

    Software - hardware?

    Proposing a server-side printing software solution goes hand-in-hand with the hardware infrastructure that the company has. Printing is probably the slowest thing to do in computer land. Say you have a top end printer, you will be able to print at most 50 pages/minute. If you were to print 1,000 documents of 10 pages each, the operation would take you 200 minutes. The least thing that should worry you (considering the times and resources involved to print) is using COM automation. By the time the printer prints the documents, Word automation will be done. Some banks have printers connected in a network to do load balancing, problem is print spoolers have a limited memory, so you would also need some program to enqueue (some kind of hard disk spooler program) documents. I know some banks use special hardware and software to handle massive printing. Some of this programs either print text files (use a driver to print in certifiers) or translate documents and generate a graphic (in memory) and send to the printer (most solutions involving pdf files use this approach). This programs also take care of the document queueing process.

    Printing Word documents...

    To print Word documents you need to run Word. Only Word understands it's formatting tags. Every single format (bold, underline, colors, spaces, fonts) that you see on the screen are rendered by Word and the same happens with a printer. So the only program that will print you a WYSIWYG Word document is Word. I know it's not recommended to use server-side automation, but at this point, it's your best shot.

    Now, there's an option. It IS possible to go the XPS way with the only problem that you would need to write some kind of parser that reads the WordML formatting elements and transform documents to XPS format (it's also XML). I don't know of any code sample to do this, but if you were to build this solution, probably the Windows SDK documentation can help:

    http://windowssdk.msdn.microsoft.com/en-us/library/ms771525.aspx

    I know there are some code samples (including printing XPS documents) that you can download here:

    http://www.microsoft.com/downloads/details.aspx?FamilyID=ea97a39d-6812-4904-8226-85f3da99996b&displaylang=en

    You will find information of every single element for WordML here:

    http://www.ecma-international.org/news/TC45_current_work/TC45-2006-50_final_draft.htm

    There are some translator applications that convert WordML to something else, for example:

    http://openxmldeveloper.org/articles/OpenXMLtoXHTMLinJava.aspx

    Finally, there are more great samples for manipulating WordML here:

    http://openxmldeveloper.org/archive/category/1003.aspx

    I hope this helps,

    -Erika

  • that is a great example, I'm currently working on a live system that used code like yours.

    but have you found a way to bind a part of the word document to a repeating element to construct a table?

    EG you have a list of items in an order and the output document requires a table to show them in (a item per row).

    i've not seen anyway to due this so I have started looking at directly manipulation the word XML for the table. is there a simple way using custom XML?

  • You know, you could save yourself a lot of trouble and use the FOR XML clause in SQL Server to build your XML document for you quite painlessly.

    -Dan

  • Is it possible to insert a HTMl table as the vale of Content Control

  • HELP!!!!!!

    Will this kind of setup work with the Business Contact Manager database on a server....

    basically, the Office development team kind of forgot to put support in word mailmerge for the userdefined fields that you can create in BCM.... and all i wont to do is create a mail merge letter template to include some of the user-defined fields...

    im not a happy bunny about this, as we have just spent 4k on 23 office 2007's with BCM to replace Act... its like taking several steps back in development...

  • Very useful post and comments Erika. Thanks a lot. Continuing from Karl's question, it was the case with Word 2003 that Microsoft did not recommend nor support the running of Word on a server. Which is not to say that it couldn't be done of course.

    Do you know if this has changed with Office 2007 ??

Page 1 of 4 (49 items) 1234