• Beth Massi - Sharing the goodness

    Comparing Two Open XML Packages with Open XML Diff

    Yesterday I received an email from another Microsoft employee, Pranav Wagh, asking me to check out a tool he's working on called Open XML Diff. He just released another drop of the tool, which you can read about on his blog.

    You can use Open XML Diff when you're writing code to generate an Open XML document and you're not quite sure what XML you need to write. Say you know how you want the document to render in Word but are unsure which element or attribute to tweak. You can save a copy of a doc, modify it, save it again, and compare the two to figure out what XML you need.
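
    To make the save-modify-compare idea concrete, here's a minimal sketch that reports which parts differ between two packages. It is not how Open XML Diff works internally (that tool diffs the XML itself); this just compares raw bytes with System.IO.Packaging. The file names Before.docx and After.docx are placeholders, and the project is assumed to reference WindowsBase.

    ```vb
    Imports System
    Imports System.IO
    Imports System.IO.Packaging
    Imports System.Linq

    Module PackageCompareSketch
        ' Reads an entire part into a byte array
        Function ReadPart(ByVal part As PackagePart) As Byte()
            Using source As Stream = part.GetStream(FileMode.Open, FileAccess.Read)
                Using buffer As New MemoryStream()
                    Dim chunk(4095) As Byte
                    Dim count = source.Read(chunk, 0, chunk.Length)
                    While count > 0
                        buffer.Write(chunk, 0, count)
                        count = source.Read(chunk, 0, chunk.Length)
                    End While
                    Return buffer.ToArray()
                End Using
            End Using
        End Function

        Sub Main()
            ' Placeholder file names: a document saved before and after a change
            Using original As Package = Package.Open("Before.docx", FileMode.Open, FileAccess.Read)
                Using modified As Package = Package.Open("After.docx", FileMode.Open, FileAccess.Read)
                    For Each part As PackagePart In original.GetParts()
                        If Not modified.PartExists(part.Uri) Then
                            Console.WriteLine("Removed: {0}", part.Uri)
                        ElseIf Not ReadPart(part).SequenceEqual(ReadPart(modified.GetPart(part.Uri))) Then
                            Console.WriteLine("Changed: {0}", part.Uri)
                        End If
                    Next

                    For Each part As PackagePart In modified.GetParts()
                        If Not original.PartExists(part.Uri) Then
                            Console.WriteLine("Added: {0}", part.Uri)
                        End If
                    Next
                End Using
            End Using
        End Sub
    End Module
    ```

    A byte comparison only tells you which parts to look at; you still need an XML-level diff to see what actually changed inside them.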

    I downloaded the source but had a couple of trust issues on my Vista box. The projects look for xmldiffpatch.dll, which the instructions said to install first from here, but that setup places the assembly under C:\Program Files\ and it's a NO NO in Vista to reference assemblies there. Then I noticed that Pranav was already supplying xmldiffpatch.dll in the OpenXMLDiff\bin\Release folder. So I created a single solution for the three projects and set a project reference from the OpenXMLComparer Winforms client to the OpenXMLDiff library and another project reference from the OpenXMLDiff library to the ViewRenderer project. Then the only binary reference you need to set up is to xmldiffpatch.dll. Finally I set OpenXMLComparer as the startup project and rebuilt the whole solution without problems.

    So to test the tool I took the MyDocument.docx Word document from my last post, which just contains the text "This is my document.", saved a copy called MyDocument1.docx, and made a simple change: I made the entire sentence bold. Then I compared the two documents. The report found three XML files in the package that changed, the important one being the MainDocument part, document.xml. Here's a snapshot of part of the report where you can see the differences:
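
    For a rough idea of what a change like this looks like in document.xml, here's a minimal hand-written sketch. It assumes the usual WordprocessingML pattern where a bold run gets a w:rPr (run properties) element containing w:b. It is not the tool's actual output, and Word may write additional markup.

    ```vb
    Imports System
    Imports System.Linq
    Imports System.Xml.Linq
    Imports <xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">

    Module BoldRunSketch
        Sub Main()
            ' Illustrative run before the change (hand-written, not copied from the report)
            Dim plainRun = <w:r>
                               <w:t>This is my document.</w:t>
                           </w:r>

            ' The same run after bolding: run properties (w:rPr) carrying a bold flag (w:b)
            Dim boldRun = <w:r>
                              <w:rPr>
                                  <w:b/>
                              </w:rPr>
                              <w:t>This is my document.</w:t>
                          </w:r>

            ' Check each run for a w:b element under w:rPr
            Console.WriteLine("Plain run is bold: {0}", plainRun.<w:rPr>.<w:b>.Any())
            Console.WriteLine("Bold run is bold:  {0}", boldRun.<w:rPr>.<w:b>.Any())
        End Sub
    End Module
    ```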

    Pretty slick! It really helps me understand what changes are applied to the Open XML package based on changes I make through Office. Much more intuitive than reading Office specs ;-)

    Make sure to provide feedback on the Open XML Diff site.

    Enjoy!

    Accessing Open XML Document Parts with the Open XML SDK

    About a month ago the Open XML SDK 1.0 (June 08 update) was released after being a CTP for a while. The SDK provides strongly typed document part access to Word 2007, Excel 2007 and PowerPoint 2007 documents. I installed this baby last week, started playing around with it, and found it really easy to use after briefly looking at the documentation. The How Do I section is a great place to start.

    Upgrading the Letter Generator

    I decided to upgrade my Word 2007 letter generator program to use the SDK to manipulate the packages. Remember that Office 2007 documents are really just archive files, so if you rename them to .ZIP you can take a look at the contents of the package. The Open XML Package spec defines a set of XML files that contain the content and define the relationships for all of the document parts stored in a single package. To programmatically manipulate them you can use the raw System.IO.Packaging namespace, but the SDK's DocumentFormat.OpenXml.Packaging namespace is much easier to work with. 
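
    To get a feel for what's inside a package without renaming it to .ZIP, here's a minimal sketch that lists each part and its content type using the raw System.IO.Packaging API. It assumes a reference to WindowsBase and a file named MyDocument.docx in the current directory.

    ```vb
    Imports System
    Imports System.IO
    Imports System.IO.Packaging

    Module ListPartsSketch
        Sub Main()
            ' Assumed location of the document
            Dim docFile = Path.Combine(Directory.GetCurrentDirectory(), "MyDocument.docx")

            ' Open read-only so the package is never modified
            Using p As Package = Package.Open(docFile, FileMode.Open, FileAccess.Read)
                For Each part As PackagePart In p.GetParts()
                    Console.WriteLine("{0}  ({1})", part.Uri, part.ContentType)
                Next
            End Using
        End Sub
    End Module
    ```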

    My mail merge program uses XML literals to construct XML for the document part of a Word 2007 file based on data in the Northwind database. The LINQ query was a piece of cake compared to figuring out how to manipulate the .docx package in order to replace the document.xml (called the MainDocument) part. Not that the final code is particularly long, it was just a pain to figure it out. The SDK not only saved me a few lines of code, it made the code much more readable and took only a few minutes to write. (I updated the code for the WordMailMerge program on Code Gallery).

    Getting Started with the Open XML SDK

    Let's take another simple example that constructs a MainDocument part using XML literals and then replaces it in a .docx package using the SDK. This time I'll focus on the code that manipulates the Open XML package with the SDK, not on the particulars of XML literals. The first thing I recommend is to install the VSTO Power Tools so you can open Office 2007 documents and manipulate the parts directly in the Visual Studio IDE using the Open XML Package Editor, like I showed in my last post.

    Of course you'll also need to install the SDK, which places the DocumentFormat.OpenXml.dll assembly into your GAC. Add a reference to this assembly in your project. As an aside, when x-copy deploying to a machine that already has the .NET Framework on it, just make sure you deploy the DocumentFormat.OpenXml.dll assembly alongside your application so you don't have to install the SDK on the target machine. The easiest thing to do is select "Show All Files" in the Solution Explorer, expand the References, and on the Properties for the DocumentFormat.OpenXml reference set "Copy Local" = True. This will place a private copy of the assembly next to your application when it's built.

    Now create a new Word 2007 document with some simple text in it, for instance, type: "This is my document" then save it and add the .docx file to your Visual Basic project. Double-click on it and that opens the Open XML Package Editor:

    We can manipulate the parts through this editor if we want to, but what I really want to do is replace document.xml with our own version that we create using XML literals and embedded expressions. Double-click on document.xml to open the MainDocument part in the XML Editor. (If the XML editor opens with all the XML on one line, just select all the contents, cut, and paste them back into the editor, and it will put the proper line breaks in for you: Ctrl + A, X, V.)

    For this simple example, let's place the executing user's name into the document. Create the XML Literal and an embedded expression by pasting the document.xml into the VB Editor and adding an expression to print out the executing user's name:

    Dim myDoc = <?xml version="1.0" encoding="utf-8" standalone="yes"?>
                <w:document xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006"
                   xmlns:o="urn:schemas-microsoft-com:office:office"
                   xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
                   xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
                   xmlns:v="urn:schemas-microsoft-com:vml"
                   xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
                   xmlns:w10="urn:schemas-microsoft-com:office:word"
                   xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
                   xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml">
                   <w:body>
                       <w:p w:rsidR="00DD17EB" w:rsidRDefault="00361264">
                           <w:r>
                               <w:t>This is <%= Environment.UserName %>'s document</w:t>
                           </w:r>
                       </w:p>
                       <w:sectPr w:rsidR="00DD17EB" w:rsidSect="00DD17EB">
                           <w:pgSz w:w="12240" w:h="15840"/>
                           <w:pgMar w:top="1440" w:right="1440" w:bottom="1440"
                               w:left="1440" w:header="720" w:footer="720" w:gutter="0"/>
                           <w:cols w:space="720"/>
                           <w:docGrid w:linePitch="360"/>
                       </w:sectPr>
                   </w:body>
               </w:document>

    Replacing the MainDocument Part

    Before the SDK, to replace the MainDocument part in the package we had to figure out the right content type and write code that deleted the old part and then added the new one. We also needed to add a reference to WindowsBase (a 3.0 assembly) in order to access the System.IO.Packaging namespace.

    Imports System.IO.Packaging
    Imports System.IO
    ...
    '**** Without OpenXML SDK
    Dim uri As New Uri("/word/document.xml", UriKind.Relative)
    Dim contentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"
    Dim docFile = CurDir() & "\MyDocument.docx"
    
    Using p As Package = Package.Open(docFile)
        'Delete the current document.xml file
        p.DeletePart(uri)
    
        'Replace that part with our XDocument
        Dim replace As PackagePart = p.CreatePart(uri, contentType)
        Using sw As New StreamWriter(replace.GetStream())
            myDoc.Document.Save(sw)
        End Using
    End Using

    For this example it's pretty easy. However, if you add or remove parts it's up to you to update the relationships in the package, and that isn't an easy task with this raw API. Enter the Open XML SDK. Now we don't need a reference to WindowsBase, only to DocumentFormat.OpenXml, and we import the Packaging namespace contained within. Then our code can access the parts of the document in a strongly-typed way:

    Imports DocumentFormat.OpenXml.Packaging
    Imports System.IO
    ...
    '***** Use the OpenXML SDK for easier access to parts
    Dim docFile = CurDir() & "\MyDocument.docx"
    
    Dim wordDoc = WordprocessingDocument.Open(docFile, True)
    Using wordDoc
        'Replace the document part with our XML
        Using sw As New StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create))
            myDoc.Document.Save(sw)
        End Using
    End Using

    After we run this code you'll see that the MainDocument part now has the user name in the document body as described by our XML literal.

    Using LINQ with the Open XML SDK

    Using the SDK we can also write LINQ queries over the part collections. For instance if we want to select all the top level parts and any of their sub-parts, we can write a query like so:

    Using wordDoc
      Dim parts = From part In wordDoc.Parts _
                Select part.OpenXmlPart, _
                       part.RelationshipId, _
                       part.OpenXmlPart.RelationshipType, _
                       SubParts = _
                       ( _
                        From subPart In part.OpenXmlPart.Parts _
                        Select subPart.OpenXmlPart, _
                               subPart.RelationshipId, _
                               subPart.OpenXmlPart.RelationshipType _
                       ).ToList
    End Using

    This query returns similar information to what you get with the Open XML Package Editor if we look at the same document. If we display the query results in two related DataGridViews we'll see that the MainDocument part contains additional parts for things like themes, styles and settings.
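
    If you just want to eyeball the same information without building a UI, a sketch like this writes each top-level part and its sub-parts to the console. It uses only the properties from the query above and assumes the same MyDocument.docx file in the current directory, plus a reference to the SDK assembly.

    ```vb
    Imports System
    Imports System.IO
    Imports DocumentFormat.OpenXml.Packaging

    Module ListSdkPartsSketch
        Sub Main()
            ' Assumed location of the document
            Dim docFile = Path.Combine(Directory.GetCurrentDirectory(), "MyDocument.docx")

            ' False = open read-only, since we are only inspecting the package
            Using wordDoc As WordprocessingDocument = WordprocessingDocument.Open(docFile, False)
                For Each part In wordDoc.Parts
                    Console.WriteLine("{0}  {1}", part.RelationshipId, part.OpenXmlPart.RelationshipType)

                    ' Sub-parts hang off their parent part (themes, styles, settings and so on)
                    For Each subPart In part.OpenXmlPart.Parts
                        Console.WriteLine("    {0}  {1}", subPart.RelationshipId, subPart.OpenXmlPart.RelationshipType)
                    Next
                Next
            End Using
        End Sub
    End Module
    ```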

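    If we want to access the actual XML content of each OpenXmlPart, we can call the GetStream method on the part and wrap the returned stream in a StreamReader, which we can then use to load an XDocument object. (This only works for parts that actually contain XML; XDocument.Load will throw on a binary part such as an image.)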

    Using wordDoc
    
     Dim parts = From part In wordDoc.Parts _
                 Select Doc = XDocument.Load(New StreamReader(part.OpenXmlPart.GetStream())), _
                        part.OpenXmlPart, _
                        part.RelationshipId, _
                        part.OpenXmlPart.RelationshipType, _
                        SubParts = _
                        ( _
                         From subPart In part.OpenXmlPart.Parts _
                         Select Doc = XDocument.Load(New StreamReader(subPart.OpenXmlPart.GetStream())), _
                                subPart.OpenXmlPart, _
                                subPart.RelationshipId, _
                                subPart.OpenXmlPart.RelationshipType _
                        ).ToList
    End Using

    Loading and Querying the XDocument from the Package

    Let's say we have a case where we can't use XML literals and embedded expressions, and instead we want to pull out the MainDocument part and find and replace text inside it. We can do this using XML axis properties. This can get pretty tricky because there may be a lot of formatting information in the document. An easier way may be to use content controls, which you can alias so that they're easier to query. For this example, though, it's a pretty simple query to find our body text and replace the word "my" with the user name.

    Imports <xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
    ...
    Dim docFile = CurDir() & "\MyDocument.docx"
    Dim wordDoc = WordprocessingDocument.Open(docFile, True)
    Dim myDoc As XDocument
    
    Using wordDoc
        Using xr As New StreamReader(wordDoc.MainDocumentPart.GetStream())
            'Load the MainDocument part's XML
            myDoc = XDocument.Load(xr)
        End Using
    
        'Find the only line of text in this document
        Dim element = (From item In myDoc...<w:t>)(0)
    
        'Replace the value of the element
        element.Value = <s>This is <%= Environment.UserName %>'s document</s>.Value
    
        Using sw As New StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create))
            'Save the modified XML back to the MainDocument part
            myDoc.Save(sw)
        End Using
    
    End Using
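
    For the content control approach mentioned above, here's a minimal sketch of what the query might look like. The w:body fragment is hand-written for illustration rather than loaded from a package, and the alias "CustomerName" is a made-up name. It assumes the standard WordprocessingML markup for a content control: a w:sdt element whose w:sdtPr holds a w:alias and whose w:sdtContent holds the text.

    ```vb
    Imports System
    Imports System.Linq
    Imports System.Xml.Linq
    Imports <xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">

    Module ContentControlSketch
        Sub Main()
            ' Hypothetical body containing one aliased content control
            Dim body = <w:body>
                           <w:sdt>
                               <w:sdtPr>
                                   <w:alias w:val="CustomerName"/>
                               </w:sdtPr>
                               <w:sdtContent>
                                   <w:p>
                                       <w:r>
                                           <w:t>placeholder</w:t>
                                       </w:r>
                                   </w:p>
                               </w:sdtContent>
                           </w:sdt>
                       </w:body>

            ' Find the content control by its alias instead of walking the formatting markup
            Dim control = (From sdt In body...<w:sdt> _
                           Where sdt.<w:sdtPr>.<w:alias>.@w:val = "CustomerName").FirstOrDefault()

            If control IsNot Nothing Then
                ' Replace the first text element inside the control's content
                Dim textElement = control.<w:sdtContent>...<w:t>.FirstOrDefault()
                If textElement IsNot Nothing Then
                    textElement.Value = Environment.UserName
                End If
            End If

            Console.WriteLine(body)
        End Sub
    End Module
    ```

    In a real document you would load the XDocument from the MainDocument part and save it back, the same way the code above does.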

    One of the cool things about using the Open XML SDK is that you don't have to have Office installed to run any of this code, so it's a great alternative to using slow COM automation to manipulate documents.

    As I explore Open XML in Office 2007 more and more I'll post more realistic business examples using LINQ to XML and Visual Basic. For now, you may want to sink your teeth into Ken Getz's Advanced Basics March 2008 article in MSDN Magazine: Office 2007 Files and LINQ. This article also shows off some important XML namespace features of Visual Basic.

    Enjoy!

    Handy Visual Studio Add-In to View Office 2007 Files

    Last month when I was in Redmond I mentioned to a colleague that I really liked the Open XML format that Office 2007 was using now and how I had been playing with the format and LINQ to XML. He pointed me to the VSTO Power Tools which contains a Visual Studio Add-In called the Open XML Package Editor. This tool allows you to view Office 2007 files from Word, PowerPoint and Excel in a neat little treeview that lets you navigate the Open XML file and manipulate all the parts individually. After installing the VSTO Power Tools, just double-click on the Office 2007 file directly from your Visual Studio project to open the tool.

    <From the VSTO Power Tools documentation:>

    Open XML Package Editor

    This is a Visual Studio 2008 add-in to allow parsing and editing of Open Packaging Conventions files, including Word, Excel and PowerPoint documents. Features include:

    • Open any Office 2007 Open XML Package file or XPS Package file directly in Visual Studio.
    • Intuitive, browsable tree view of the Package file.
    • Open any XML part directly in Visual Studio's rich XML editor.
    • Easy to use user interface for adding and removing parts and relationships.
    • Import and export part contents to and from files.
    • Detects when a Package file that is opened in Visual Studio is modified externally. Prompts user to reload without having to close any open XML part editors.
    • Create new Office Packages from a set of templates using Visual Studio's File > New dialog.

    Figure 1 shows the treeview that is provided when you open an Open XML Package file in Visual Studio:

    From the treeview, if you double-click on any XML part in the file, that part will be opened in the standard Visual Studio XML editor, as shown in Figure 2:

     

    <end docs>

    One thing I noticed: if the XML editor opens with all the XML on one line, just select all the contents, cut, and paste them back into the editor, and it will put the proper line breaks in for you (Ctrl + A, X, V).

    There are nine other tools included in the VSTO Power Tools that I highly recommend trying out if you are doing any kind of Office development with Visual Studio. Also check out the VSTO Team Blog and VSTO Developer Portal for more information on Office development with Visual Studio.

    Enjoy!

    Channel 9 Interview: Jared Parsons on the P-Invoke Interop Assistant

    I just posted an interview screencast on Channel 9. In this interview, Jared Parsons, a Developer on the Visual Basic IDE, shows us the P/Invoke Interop Assistant available on CodePlex. The tool helps with converting unmanaged C code to managed P/Invoke signatures and vice versa. Say goodbye to digging through random header files or MSDN documentation to find the right constants, structures and signatures. The P/Invoke Interop Assistant does a smarter translation for you using SAL (Source Code Annotation Language). 
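
    To show the kind of translation involved, here's a hand-written sketch (not output from the tool) of a managed P/Invoke signature for a simple Win32 function. It assumes Windows, since it calls into kernel32.dll.

    ```vb
    Imports System
    Imports System.Runtime.InteropServices

    Module PInvokeSketch
        ' C prototype from the Windows headers: DWORD GetTickCount(void);
        ' Hand-written managed equivalent; DWORD maps to a 32-bit unsigned integer.
        <DllImport("kernel32.dll")> _
        Private Function GetTickCount() As UInteger
        End Function

        Sub Main()
            Console.WriteLine("Milliseconds since the system started: {0}", GetTickCount())
        End Sub
    End Module
    ```

    Signatures with structures, pointers and string buffers are where hand translation gets error-prone, which is the case a tool like this is meant to help with.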

    Enjoy!

    WPF Forms over Data: 2 More Videos!

    I couldn't wait for the content pages to prop onto MSDN to tell you that I just uploaded two more WPF Forms over Data videos and code samples onto Code Gallery: http://code.msdn.microsoft.com/wpfdatavideos, videos 3 and 4.

    Video #3 shows how to create a dropdown combobox (pick-list) that pulls data from a lookup table in a database. #4 shows a technique for customizing the display of validation messages on controls (for more details on that you can read this post too.)

    Enjoy!

    Channel9 Interview: Visual Basic Language Design Meeting

    I sat down with the VB Language design team and asked them about their design process, favorite features, their thoughts on other languages, as well as what the Visual Basic language strategy really is. It was a fun and enlightening interview with a group of really smart people led by Paul Vick. You can find most of the team members writing on the Visual Basic Team Blog.

    Stay tuned for more interviews like these... 

    Enjoy!

    Keeping the Dev Centers Fresh with Social Bookmarking

    Last week Chris Slemp from MSDN/TechNet interviewed me and 3 other top bookmarkers about how we use the Social Bookmarking Preview and how we'd like to see it evolve. In this post he also has some interesting stats and analysis on how people are using social bookmarking (scroll down for the interviews).

    To be honest, I don't really think about social bookmarking all that hard; it's become part of my routine (and I'm hoping others in the community start to feel that way as well). I use it for organizing all the great content I produce and manage, so it's pretty easy for me to be in the habit of tagging things.

    It's also made the Dev Centers a lot easier to maintain and keep fresh so I'm loving it. Give it a try, the more you tag things "Visual Basic" the more great content will show up on the Visual Basic Dev Center Community Page through this feed. This allows everyone in the community to see what Visual Basic content is popular anywhere on the web. Pretty cool. Of course there are still features to add and bugs to fix but I think it's a great preview so far.

    Enjoy!

    Community Article: Using Windows Communication Foundation with Windows Workflow Foundation – Part 1

    This week on the VB Dev Center we're featuring another community submitted article by Maurice de Beijer (VB MVP) on Using Windows Communication Foundation with Windows Workflow Foundation. Maurice has a great wiki as well that you should check out if you're doing Windows Workflow development.

    In this article he shows you how to use SendActivity, which enables you to call WCF as well as standard web services in your workflows. Stay tuned for the next article, which will show you how to use ReceiveActivity, which enables you to publish a workflow as a WCF service.

    Enjoy!

    Silicon Valley Code Camp Registration Open

    Peter Kellner (ASP.NET MVP) and crew are at it again this year organizing the 3rd annual Silicon Valley Code Camp.
    Once again, Foothill College has stepped up and is providing free use of their facility on Saturday and Sunday November 8th and 9th this year.  The event is again free and ready for registration at the following URL:

    http://www.siliconvalley-codecamp.com/

    I'll be speaking on LINQ of course, but I'll also be showing some Office development and Open XML as well. There were 800 people registered last year and there is a huge breadth of technology that it covers, not just Microsoft or .NET technologies, so it's a great way to get exposure to a lot of cool stuff. We are fortunate to live in the Bay Area where there is a lot of cool technology and great speakers. This is also your chance to share your knowledge with other developers. Registration is FREE and anyone can submit talks, so if you think you have something interesting to teach people, go for it!

    Hope to see you there!

    Auto-Reply: Out of Office on Vacation

    Yes, even Beth Massi gets a vacation. This year I have the pleasure of attending Nick Landry's (Mobile MVP) wedding in Jamaica. It's my first time in the Caribbean. This is what I call community baby! ;-)

    I'm actually writing this before I leave and queuing it up to publish later. Yes that's right I'm probably on the beach right now drinking Red Stripe. Ya mon. Maybe I'll work on my dreadlocks too.

    I'll be back to blogging when I get back next week. Don't miss me too much. :-)

    Later!
