Information, tips, news and announcements about SQL Server Analysis Services and PowerPivot directly from the product team.
This article is a follow-up on a previous blog post titled “How to Build a VSTO-Based PowerPivot Workbook,” which discussed some of the advantages and disadvantages of building PowerPivot workbooks by using VSTO. A key VSTO advantage is that you can bring data from virtually any source into PowerPivot even if there is no suitable data provider, but a significant disadvantage is that Excel Services in SharePoint doesn’t run the VSTO code. Among other things, this means that you cannot keep VSTO-based workbooks automatically updated by using the Scheduled Data Refresh feature of PowerPivot for SharePoint. Moreover, Excel Services displays warnings about unsupported and disabled features when viewing the workbook in the browser, such as for buttons, text boxes, and other such objects, as illustrated in the following screenshot, which shows the VSTO-based Analytics for Twitter workbook in Internet Explorer. Removing the VSTO customizations could help to improve the user experience, but how do you bring custom data into a PowerPivot workbook without VSTO if there is no direct data provider?
Converting a VSTO-based PowerPivot workbook into a non-VSTO version—such as to provide seamless browser access and to take advantage of Scheduled Data Refresh in SharePoint—requires the following high-level steps:
Let’s take a closer look at these steps by converting the Analytics for Twitter solution. It is an excellent example that highlights many important PowerPivot elements and their dependencies.
Step 1: Importing data without using VSTO
VSTO-based PowerPivot workbooks rely on VSTO customizations primarily to pull data into Excel tables. Linked to PowerPivot tables, these Excel tables then simply act as PowerPivot data sources. Here’s the relevant figure from the previous blog post, which shows the data flow for the Twitter Analytics workbook.
Given the objective to eliminate the VSTO code on the client, we need to replace the Excel tables with a server-based repository that PowerPivot directly supports as a data source, such as a SQL Server database, text files on a file server, or SharePoint lists. The new data import solution can then periodically pull data from Twitter into the chosen server-based repository and PowerPivot can import the data from this repository without the need for VSTO.
The following figure illustrates the import method I chose for this article, which is based on a SharePoint Timer job and a collection of SharePoint lists. I chose this approach because it is relatively straightforward to create a Timer job to retrieve data at a configurable interval from the source and put it into one or multiple lists. PowerPivot can then import these lists through a data feed. For details about how this works, I recommend reading Uday Unni’s excellent white paper “Using SharePoint List Data in PowerPivot” available on MSDN at http://go.microsoft.com/fwlink/?LinkId=221005.
The new data flow is very similar to the original VSTO-based approach. My Timer job just takes on the role of the original VSTO code and the SharePoint lists simply replace the original Excel tables, as already explained. That’s all there is to it. In fact, I asked Analytics for Twitter author Aaron Meyers for permissions to reuse his original code. The source code is included in the Twitter workbook on the Sample Code worksheet. Reusing this code gave me a great head start. Thanks again for sharing, Aaron!
Without going into too much detail about SharePoint programming, here’s how I created my Timer job:
Note: You can find the resulting Visual Studio project called TwitterImportTimerJob in the attachments to this article. The solution is not optimized for performance. It is for demonstration purposes only and not to be used in a production environment.
You can deploy the solution on a SharePoint developer workstation directly in Visual Studio 2010. By default, the Twitter Import Timer job is configured to run daily between midnight and one o’clock in the morning, but it’s also possible to run this job on demand at any time. In SharePoint Central Administration, click on Monitoring, click on Review Job Definitions, then click on Twitter Import Timer Job, and then in the Timer Job configuration click on Run Now. The following figure shows the results for a run in my test environment.
When running the Twitter Import Timer job for the first time, it creates all necessary SharePoint lists automatically, as summarized in the following table.
Message boxes that the Excel client displays during the interactive import process
The Twitter Import Timer job runs unattended and can’t display message boxes, so it reports processing status and error messages in the TwitterImportStatus list instead.
Text box for comma-separated search queries on the Topics, People, Tone, and Details worksheets
The Twitter Import Timer job automatically creates the default search terms in the SearchTerms list. You can add and remove search terms to customize the Twitter import process. Each search term item also has an Include property that you can set to No to exclude the term from the searches without deleting it.
tblToneDict table on the Tone Dictionary worksheet
The Twitter Import Timer job initializes the ToneDictList with the same entries found in the tblToneDict table.
tblTweets table on the hidden TweetData worksheet
The TweetsList columns are almost identical to the tblTweets table, except that TweetGUID replaces the GUID column and TweetAuthor replaces the Author column.
tblMentions table on the hidden TweetData worksheet
The MentionsList has the same columns as the tblMentions table in the workbook.
tblHashtags table on the hidden TweetData worksheet
The HashtagsList has the same columns as the tblHashtags table in the workbook.
tblUtility table on the hidden TweetData worksheet
The UtilityList has the same columns as the tblUtility table in the workbook.
Step 2: Updating the PowerPivot model
Ultimately, this step is about replacing the original tables in the workbook’s underlying PowerPivot model with new versions that connect to the corresponding SharePoint lists as their data sources. However, if you simply remove the existing tables, you lose their associated column definitions and measures. It is therefore a good idea to first document the existing PowerPivot model (see TwitterAnalyticsModel.xslx in the attachment to this article for a completed example). It is also a good idea to keep a copy of the original PowerPivot workbook so that you can refer back to the original model if necessary.
There are several important PowerPivot elements in the Twitter Analytics workbook, which the model documentation should cover:
Having thoroughly documented the PowerPivot model, you are ready for the ultimate task, which is to replace the existing PowerPivot tables with new versions. First, remove the existing tables, but leave the tblDimDate and tbleDimHour tables in place because these are linked to static Excel tables that don’t have VSTO dependencies. The VSTO code doesn’t update tblDimDate or tblDimHour. Next, follow this procedure to create new tables that use SharePoint lists as their data sources:
The final step is to reapply the original columns and measures to the new tables as previously documented to ensure that all columns are identical to their original versions, that the calculations work, and that all the required table relationships and measures exist. The following procedure yields the desired result:
Step 3: Updating the Workbook and Removing the VSTO code
At this point, you have completed the PowerPivot work so you can close the PowerPivot window. What’s left is to update the workbook data in Excel according to the following procedure:
The final task is to remove the VSTO code and clean up the workbook so that it renders without warnings in Excel Services (a completed workbook is attached to this article). Here are the steps:
Congratulations! This was a serious piece of work, but it was definitely worth it. The Twitter Analytics solution is now a full-blown PowerPivot team application, easily shared through SharePoint and accessible in the browser without the need for Excel 2010 and PowerPivot add-in on the client computer. You can also eliminate the data refresh burden. It was a somewhat complicated and challenging procedure for information workers to refresh the data in the original VSTO version, but now manual refresh is definitely deemphasized. Just configure Scheduled Data Refresh to do this job for you, as illustrated in the following screenshot.
In my test environment, I configured Scheduled Data Refresh to run daily at 4 AM. This was a good choice because my Timer Job tends to finish its import processing around three o’clock every morning according to the Execute Import Job – Finished entries in my TwitterImportStatus list. Refreshing the data during off-business hours after the Twitter Import job finishes ensures that the workbook contains the most recent data when the new workday starts.
And that’s it for now. As demonstrated, you do not necessarily have to use VSTO to build advanced PowerPivot workbooks on top of custom data sources. It can be a valid choice, but in many cases it is worthwhile to look for an alternative import method based on a supported data source. Non-VSTO workbooks are easier to share and can take full advantage of PowerPivot for SharePoint capabilities that are not available to their VSTO counterparts. Admittedly, converting an advanced VSTO workbook, such as the Analytics for Twitter solution, can be a time-consuming effort, especially if you are not the original workbook author, but it’s fortunately not a very common scenario. In any case, I hope you had fun reading this article and following the steps perhaps in your own test environment.