Mail Merging with Word, LINQ and XML Literals in VB

Mail Merging with Word, LINQ and XML Literals in VB

  • Comments 25

I've mentioned before that you can use XML literals in VB 9 to do text merging so I figured it should be pretty easy to do some mail merging in Word the same way. Word 2007 (and Excel, etc) now uses a standard XML format called Office Open XML to describe documents so I thought this would be the perfect opportunity to try out how to do this with LINQ and XML literals in Visual Basic.

What I wanted to do is take all the Customers in the Northwind database who had some orders shipping today and send them thank you letters. Of course, I had to create some new orders first since every order in Northwind was placed before the turn of the century! Anyway, to get started I created a new Word 2007 document with the letter text and some placeholder field names to indicate where I want the data:

Then I just saved the Doc1.docx file to my project folder. If we rename this file with a .zip extension we can see that it's just a zip file with a bunch of XML documents inside it. This is how the Office Open XML works. Basically it's a zip package with a bunch of related XML files inside. You can read more about Office Open XML in Microsoft Office here. If we drill down through the zip file we'll see that the text we just typed is located in the \word\document.xml file.

 

If I open this document.xml file in a text editor like notepad, copy all the text into the clipboard, and then paste it into a Visual Basic program it will be inferred as an XDocument object and we can see our text and placeholders in the XML (I collapsed a couple sections at the top and bottom for readability).

So what we want to do in our mail merge program is write a query that creates this XDocument for all our customers that have shipping orders and then pops it back into a zip file that Word 2007 can open. There's a namespace called System.IO.Packaging that will help us extract and replace the document.xml with the XDocument we create. But first, let's create the query by connecting our Server Explorer to our Northwind database and then adding a new item to our project and selecting the "LINQ to SQL Classes" template. Then just drag-drop the Customers and Orders tables onto the designer. (I show how to do this in this video).

What we want to end up with is a collection of XDocuments above, with our data merged into it. Instead of just creating a collection of XDocuments I also want to capture the CustomerID so that we can use it to create unique file names for our Word documents. So let's create a simple class called Letter that has two properties, CustomerID As String and Document As XDocument.

Public Class Letter

    Private m_doc As XDocument
    Public Property Document() As XDocument
        Get
            Return m_doc
        End Get
        Set(ByVal value As XDocument)
            m_doc = value
        End Set
    End Property


    Private m_ID As String
    Public Property CustomerID() As String
        Get
            Return m_ID
        End Get
        Set(ByVal value As String)
            m_ID = value
        End Set
    End Property
End Class

Now we can write our query to create a collection of these classes.

Dim db As New NorthwindDataContext
Dim letters = _
        From Order In db.Orders _
        Where Order.OrderDate IsNot Nothing AndAlso _
              Order.ShippedDate IsNot Nothing AndAlso _
              Order.ShippedDate.Value.Date = Date.Today _
          Let OrdDate = Order.OrderDate.Value.ToShortDateString _
          Let ShipDate = Order.ShippedDate.Value.ToShortDateString _
        Select New Letter With { _
               .CustomerID = Order.Customer.CustomerID, _
               .Document = <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
rest of doc omitted for clarity ...}

We're selecting the CustomerID and the XML literal with our embedded expressions in place of the placeholder fields I created in the document. I omitted the rest of the XML literal in the query above for clarity. All we need to do is run through the literal and replace all the placeholder fields with embedded expressions from our query. For instance, here's a snapshot of the body of our letter :

...
<w:r w:rsidR="007A5236"> <w:t xml:space="preserve">Dear </w:t> </w:r> <w:proofErr w:type="spellStart"/> <w:r> <w:t><%= Order.Customer.ContactName %></w:t> </w:r> <w:proofErr w:type="spellEnd"/> <w:r w:rsidR="00AA2EC6"> <w:t>,</w:t> </w:r> </w:p> <w:p w:rsidR="00AA2EC6" w:rsidRDefault="00AA2EC6" w:rsidP="00AA2EC6">
<w:r> <w:t xml:space="preserve">We'd like to inform you that the order you placed on </w:t> </w:r> <w:proofErr w:type="spellStart"/> <w:r w:rsidR="00112228"> <w:t><%= OrdDate %></w:t> </w:r> <w:proofErr w:type="spellEnd"/> <w:r w:rsidR="00806521"> <w:t xml:space="preserve"> has shipped on </w:t> </w:r> <w:proofErr w:type="spellStart"/> <w:r w:rsidR="00112228"> <w:t><%= ShipDate %></w:t> </w:r>
...

Now that we have a collection of our Letter objects, we need to create the Word document by making a copy of our original document (which remember is just a .zip file) and replacing the \word\document.xml file in that package. You can also create packages from scratch but then you have to create a relationship file that describes the relation of all the files in the package and it can get tricky. It's easier to just make a copy of the original Word docx file in our case and just replace the part we modified. The key to writing the document back properly is getting the content type right when we call CreatePart on the Package object. The Package object lives in the System.IO.Packaging namespace.

Dim uri As New Uri("/word/document.xml", UriKind.Relative)
Dim contentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"
Dim sourceFile = CurDir() & "\Doc1.docx"
Dim letterDir = CurDir() & "\Letters\"
My.Computer.FileSystem.CreateDirectory(letterDir)

For Each letter In letters
    Dim customerFile = letterDir & letter.CustomerID & ".docx"
    Console.WriteLine(String.Format("Creating letter {0}", letter.CustomerID))
    My.Computer.FileSystem.CopyFile(sourceFile, customerFile, True)

    Using p As Package = Package.Open(customerFile)
        'Delete the current document.xml file
        p.DeletePart(uri)

'Replace that part with our XDocument Dim replace As PackagePart = p.CreatePart(uri, contentType) Using sw As New StreamWriter(replace.GetStream()) letter.Document.Save(sw) sw.Close() End Using p.Close() End Using Next

This creates a directory of our letters all nicely mail merged with the customer and order data that met our criteria in the Where clause of the letters query.

 

If we open one of the documents we'll see the data properly merged into the letter.

 

I've attached the complete example to this post. Have fun exploring the Office Open XML format because there's just so many things you can do with it especially combined with LINQ, XML literals and Visual Basic!

Enjoy!

Attachment: WordMailMerge.zip
Leave a Comment
  • Please add 1 and 7 and type the answer here:
  • Post
  • Nice sample!

    Did you know you could've used ContentControls and custom XML?

    It would have saved you the trouble of recreating the whole document.xml part by just having to generate the custom XML data.

    But anyway, you've shown a great way to make use of the LINQ to XML features in VB9.

    I found this article to be very interesting back in february:

    http://msdn.microsoft.com/msdnmag/issues/07/02/OfficeSpace/default.aspx#S4

    Have a nice day!

  • This is a great example, thank you for sharing it with us!  This new open office XML format certainly opens up a lot of possibilities.

    Ooooh, a poyple background :)

  • Pingback from http://oakleafblog.blogspot.com/2007/12/entity-framework-beta-3-available-for.html

  • Hi DMurillo,

    It really was no trouble at all to create the entire document, I just pasted it into the editor and plopped my embedded expressions into it, but I also like the approach of just stuffing plain old xml into the package too, that would have worked just fine for my scenario above. The thing I like better about generating the entire letter is that it gives you an easier (IMHO) way to completely change the text of the letter for each customer if you want. Thanks for the link, I'm just starting to dig into this stuff!

    Cheers,

    -B

  • I'll be speaking at the Victoria Code Camp on January 26th so if you're in the area and you have nothing

  • Hi Beth,

    Great as usual!

    How do we manage a document that has an image, simply replacing the document.xml does not make the image available.

    Casey

  • Hi Beth,

    Though I might use a customXML part and modify you code but ....

    1. When I run the code below i always get  "a corrupted file" message when I open the doc,in word, although I can see the changes have been made sucessfuly after pressing "continue"

    2. Stepping thru the code I can see that the file is corrupted when the part is deleted, and the _rels folder and contents have also been deleted

    What to do ? Casey

    Code-

    Dim contentType = "application/vnd.openxmlformats-officedocument.customXmlProperties+xml"

    Dim uri As New Uri("/customXml/item1.xml", UriKind.Relative)

    Using p As Package = Package.Open(customerFile)

       'Delete the current customxml.xml file

       p.DeletePart(uri)

      'Replace that part with our XDocument

      Dim replace As PackagePart = p.CreatePart(uri, contentType)

      Using sw As New StreamWriter(replace.GetStream())

        ' sw.Flush()

        letter.Element.Save(sw)

        sw.Close()

     End Using

    p.Close()

    End Using

  • Hi Casey,

    I've never tried it but I think you need to add the image to the media subfolder and then add it to the document.xml.rels file. Here's a good document to read: http://msdn2.microsoft.com/en-us/library/bb266220.aspx

    HTH,

    -B

  • H Beth,

    As per the previous posts replacing the whole document.xml meant that images and formatting where lost.

    Using a custom xml part means a smaller chunk of xml to handle and the rest of the doc is undisturbed. Finally found some (simple) code that does the trick, courtesy pf Steve Hansen;-

    Casey.

    For Each letter In letters

               Dim customerFile = letterDir & letter.SupplierID & ".docx"

               'Console.WriteLine(String.Format("Creating letter {0}", letter.SupplierID))

               My.Computer.FileSystem.CopyFile(sourceFile, customerFile, True)

               Dim uri As Uri = New Uri("/customXml/item1.xml", UriKind.RelativeOrAbsolute)

               Dim doc As Package = Package.Open(customerFile)

               Dim part As PackagePart = doc.GetPart(uri)

               Dim outputStream As Stream = part.GetStream(FileMode.Create, FileAccess.ReadWrite)

               Dim sw As StreamWriter = New StreamWriter(outputStream)

               sw.Write(letter.Document.ToString)

               sw.Flush()

               sw.Close()

               doc.Close()

           Next

  • Before I post about DevTeach Day 3 I thought I'd report back how the LINQ session I did yesterday evening.

  • Great! I was looking for this capability to automate Word in VB for sometime now. This was very, very easy in VFP.

  • hi Beth

    I tried this sample but I have problem.

    my microsoft office word font and allignment basic settings dont change.

    so I have to change these settings after word document written.

  • do you have other examples inculing more tables in a dataset?

  • Last month when I was in Redmond I mentioned to a colleague that I really liked the Open XML format that

  • About a month ago the OpenXML SDK 1.0 (June 08 update) was released . The SDK provides strongly typed

Page 1 of 2 (25 items) 12