Document assembly seems to be a hot topic these days, especially when combined with the power of SharePoint. Today, I want to show you a fairly rich document assembly solution that takes multiple Word, Excel, and PowerPoint documents and merges them into a final Word document. I showed this solution at both PDC and SPC, so I am going to take the opportunity in this post to discuss some of the details behind it.
If you want to jump straight into the code, feel free to download the solution here.
Imagine a scenario where I work for a company that analyzes stocks and generates a report for every company/stock analyzed. These reports are typically quite rich and usually involve more than one person contributing to the content. Content is separated into multiple Word, Excel, and PowerPoint documents, and each document is assigned to an individual. Once all the content has been written, it is assembled into a final report as a Word document. My company has asked me to write a solution that merges all these documents programmatically.
Before I get into the details of my solution, I want to talk about a brand new feature of SharePoint 2010, called Document Sets, which I will leverage to help solve this problem. Document Sets give users a new way to manage a collection of documents as a single object. Think of this feature as a binder of related content.
In the case of this solution, I have defined a custom Document Set as having a set of files (six in my case) that correspond to the various components of the final analysis report. Here is a screenshot of a Document Set for a company called Contoso:
Using Document Sets, and given the scenario described above, we need to take the following actions:
Setting up the right template makes all the difference when creating Office document solutions. In my solution I have a Document Set with six files, one of which is my template file. All these documents are empty except for the template document. As in many of my previous posts, the template document represents the look of the final report. I will leverage content controls within my template to mark the semantic regions where content is merged in. The title of each content control represents the type of content to be merged: a Word document, a chart from a spreadsheet, a table from a spreadsheet, or a SmartArt graphic from a presentation. The content of each content control represents the name of the file that contains the content to be merged. For example, here is a screenshot of my template document highlighting one of my content controls labeled "Word:Document" with content set to "Introduction":
This content control represents the region where I will merge the Introduction Word document into my template file. For the sake of completeness, the other content controls are labeled "Spreadsheet:Chart", "Spreadsheet:Table", and "Presentation:SmartArt".
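To make the labeling convention concrete, here is a small sketch of how a content control's title and content could be turned into a merge instruction. It is written in Python purely for illustration (the solution itself is a SharePoint Web Part), and the names `MergeTarget` and `parse_content_control` are hypothetical, not part of the downloadable code.

```python
from typing import NamedTuple

# The four labels named in this post.
KNOWN_LABELS = {
    "Word:Document",
    "Spreadsheet:Chart",
    "Spreadsheet:Table",
    "Presentation:SmartArt",
}


class MergeTarget(NamedTuple):
    """Hypothetical record describing one region of the template."""
    source_app: str    # "Word", "Spreadsheet", or "Presentation"
    content_kind: str  # "Document", "Chart", "Table", or "SmartArt"
    file_name: str     # text inside the content control, e.g. "Introduction"


def parse_content_control(title: str, content: str) -> MergeTarget:
    """Split a content control title such as "Word:Document" into its parts."""
    if title not in KNOWN_LABELS:
        raise ValueError(f"Unrecognized content control title: {title!r}")
    source_app, content_kind = title.split(":", 1)
    return MergeTarget(source_app, content_kind, content.strip())


if __name__ == "__main__":
    print(parse_content_control("Word:Document", "Introduction"))
```

With this shape, the merge step can branch on `content_kind` to pick the right import routine for each file in the Document Set.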
You can find the files that represent my Document Set here.
Perform the following steps to enable Document Sets on your SharePoint 2010 site:
At this point you should have a library set up with a Document Set content type.
In order to make this solution usable, we need to add a command within our document library that allows users to merge documents together. The easiest way to accomplish this task is to create a Web Part within Visual Studio 2010. Once you've created the Web Part, modify the CreateChildControls method as follows:
This method adds a new button control where we can hook in the logic that merges the documents contained within a given Document Set. The merge code is called from the button's click handler, void btnSubmit_Click(object sender, EventArgs e).
Once we've created this web part, the next step is to add the button to our Document Set. The easiest way to accomplish this task is to use SharePoint Designer 2010. Perform the following steps to add your custom web part to the Document Set library:
At this point you should see an "Assemble Documents" command show up for any created Document Set:
Finding content controls within a document involves the following steps:
The following code accomplishes the steps outlined above:
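As a rough, language-neutral illustration of this lookup (not the C# sample from the download), here is a sketch in Python using only the standard library. It assumes the main document part lives at `word/document.xml`, which is typical but is strictly determined by the package relationships, and the function name `find_content_controls` is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List, Tuple

# WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def find_content_controls(docx_bytes: bytes) -> List[Tuple[str, str]]:
    """Return (title, text) for every titled content control in a .docx."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        # Assumption: the main document part is stored at this path.
        root = ET.fromstring(package.read("word/document.xml"))

    controls = []
    for sdt in root.iter(f"{{{W}}}sdt"):
        # The title shown in Word is stored in w:sdtPr/w:alias/@w:val.
        alias = sdt.find("w:sdtPr/w:alias", NS)
        if alias is None:
            continue
        title = alias.get(f"{{{W}}}val", "")
        content = sdt.find("w:sdtContent", NS)
        text = ""
        if content is not None:
            # Concatenate every text run inside the control.
            text = "".join(t.text or "" for t in content.iter(f"{{{W}}}t"))
        controls.append((title, text))
    return controls
```

Each returned pair gives the content type (the title) and the file name to look up in the Document Set (the text).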
In this solution, there are four types of content to be merged:
Let's talk about each of these content types.
By far the easiest way to assemble Word documents together is to take advantage of altChunks, which I have already blogged about in the past. In any case, here is the code necessary to merge documents together on SharePoint:
Pretty easy stuff!
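To show what an altChunk amounts to inside the package, here is a minimal Python sketch (standard library only, not the C# sample from the download). It swaps a block-level content control for a `w:altChunk` element and builds the matching relationship entry. The function names and the `afchunk1.docx` target are hypothetical, and the sketch assumes the control is matched by the file name it contains.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL = "http://schemas.openxmlformats.org/package/2006/relationships"
# Relationship type that marks a part as an alternative format chunk.
AF_CHUNK_TYPE = R + "/aFChunk"


def sdt_text(sdt: ET.Element) -> str:
    """Concatenate all text runs inside a content control."""
    return "".join(t.text or "" for t in sdt.iter(f"{{{W}}}t"))


def swap_control_for_alt_chunk(root: ET.Element, file_name: str, rel_id: str) -> bool:
    """Replace the first content control whose text equals file_name."""
    for parent in root.iter():
        for index, child in enumerate(list(parent)):
            if child.tag == f"{{{W}}}sdt" and sdt_text(child).strip() == file_name:
                chunk = ET.Element(f"{{{W}}}altChunk", {f"{{{R}}}id": rel_id})
                parent.remove(child)
                parent.insert(index, chunk)
                return True  # stop right after mutating the tree
    return False


def alt_chunk_relationship(rel_id: str, target: str) -> ET.Element:
    """Build the relationship entry that points rel_id at the chunk part."""
    return ET.Element(
        f"{{{PKG_REL}}}Relationship",
        {"Id": rel_id, "Type": AF_CHUNK_TYPE, "Target": target},
    )
```

A complete merge also has to store the source document as a part in the package and declare its content type. Note that re-serializing a real document with ElementTree can rewrite namespace prefixes, which is one reason the Open XML SDK is the more practical tool for production code.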
Again, I've already blogged about importing SmartArt from PowerPoint to Word. Here is the code necessary to accomplish this task:
The only difference between the code above and the code I showed you in my previous post is that the code above works with files on SharePoint.
Again, I am going to leverage the same code I showed you in a previous post called importing charts from spreadsheets to Word documents. Here is the same code modified to work with files that exist within SharePoint:
In a previous post I showed you how to import a table from Word to Excel. Today, I will show you how to do the reverse. Here are the steps necessary to accomplish this task:
To help with the tasks listed above I am going to take advantage of some of the Open XML SDK code snippets that were published. Here is the code necessary to accomplish the tasks outlined above:
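For readers who want to see the shape of the data involved, here is a simplified Python sketch (standard library only, not the SDK-based C# from the download). It assumes every populated row of the worksheet belongs to the table, ignores styling and sparse cells, and uses hypothetical function names `read_rows` and `build_word_table`.

```python
import xml.etree.ElementTree as ET
from typing import List

S = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_rows(sheet_xml: str, shared_strings_xml: str) -> List[List[str]]:
    """Read cell values row by row, resolving shared-string indexes."""
    shared = [
        "".join(t.text or "" for t in si.iter(f"{{{S}}}t"))
        for si in ET.fromstring(shared_strings_xml).iter(f"{{{S}}}si")
    ]
    rows = []
    for row in ET.fromstring(sheet_xml).iter(f"{{{S}}}row"):
        values = []
        for cell in row.iter(f"{{{S}}}c"):
            v = cell.find(f"{{{S}}}v")
            raw = v.text if v is not None and v.text else ""
            # t="s" means the value is an index into the shared string table.
            values.append(shared[int(raw)] if cell.get("t") == "s" else raw)
        rows.append(values)
    return rows


def build_word_table(rows: List[List[str]]) -> ET.Element:
    """Build a bare w:tbl element with one paragraph per cell."""
    tbl = ET.Element(f"{{{W}}}tbl")
    ET.SubElement(tbl, f"{{{W}}}tblPr")
    grid = ET.SubElement(tbl, f"{{{W}}}tblGrid")
    for _ in range(max((len(r) for r in rows), default=0)):
        ET.SubElement(grid, f"{{{W}}}gridCol")
    for values in rows:
        tr = ET.SubElement(tbl, f"{{{W}}}tr")
        for value in values:
            tc = ET.SubElement(tr, f"{{{W}}}tc")
            run = ET.SubElement(ET.SubElement(tc, f"{{{W}}}p"), f"{{{W}}}r")
            ET.SubElement(run, f"{{{W}}}t").text = value
    return tbl
```

The resulting `w:tbl` element would then replace the "Spreadsheet:Table" content control in the template. A production version would also carry over column widths, borders, and number formats.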
The last step is to offer the assembled document to the user with an Open/Save/Cancel dialog as follows:
Here is the code snippet to create this dialog based on a document that is in memory:
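The dialog itself is driven by HTTP response headers rather than by anything Word-specific. As a hedged illustration (in Python, with a hypothetical helper name, rather than the ASP.NET code from the download), these are the headers a server would typically send with an in-memory .docx:

```python
from typing import Dict

# Registered media type for Word (.docx) documents.
DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def attachment_headers(file_name: str, payload: bytes) -> Dict[str, str]:
    """Build response headers that ask the browser to offer a download."""
    safe_name = file_name.replace('"', "")  # keep the quoted value well-formed
    return {
        "Content-Type": DOCX_MIME,
        # "attachment" is what prompts Open/Save instead of inline rendering.
        "Content-Disposition": f'attachment; filename="{safe_name}"',
        "Content-Length": str(len(payload)),
    }
```

In a Web Part the same headers would be set on the HTTP response before the document bytes are written to the output stream.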
Putting everything together and running the code, we end up with a document in which all the content from our library is merged into a final report:
Pretty cool stuff!