Welcome to MSDN Blogs Sign in | Join | Help

Microsoft Excel

The team blog for Microsoft Excel and Excel Services.
Excel Services part 10: Data Connection Libraries, or connecting to databases made easy, manageable, and secure

In the past few posts I have talked about all the work we have done to make managing, sharing, and securing your Excel workbooks better using SharePoint and Excel Services.  Today, I am going to cover a new feature that provides management, sharing, and security of data connections – the Data Connection Library (DCL).

What is a Data Connection Library?
A Data Connection Library (or DCL) is a new type of SharePoint library (much like a document library) that provides a place to store, share, and manage connection files.  By connection files, I mean Office Data Connection (ODC) files which contain all the information and parameters needed to form a data connection, such as server name, OLAP cube or table name, and query (note – ODC files are not a new feature – they were introduced in Excel 2002).  Since the DCL is a library in SharePoint, it comes with all the great SharePoint features you would expect – such as workflow support, file approval, library level/item level security, and sorting filtering based on metadata. You can create a DCL the same way you create any library, and DCLs can be created almost anywhere in SharePoint e.g. on a portal, team site, etc.   Here is what a DCL looks like in SharePoint 12.


(Click to enlarge)

Even though the DCL reuses the library concept in SharePoint, it is much more valuable than just a document library full of connection files – this is because of how Excel interacts with the DCL.  Let’s take a look at that and see how the DCL and Excel 12 solve a few problems.

Connecting to databases made easy …
Setting up a connection to a database in Excel is a task that many users struggle with – for example, if you want to connect to an ODBC datasource or SQL Server Analysis Services cube, users must know server names, cube names, table names, what type of connection to create, user credentials, etc.  Lots of clicks and knowledge are required. 

Excel 12 and DCLs make connecting to databases a much simpler, faster activity.  Users will simply need to know what data they want to work with, and that’s pretty much it.  Let’s take a look at an example of getting started with a PivotTable connected to a database in Excel 12.

To connect to a database (or other external data source like a web query) in Excel 12, users will use the Data tab.  Here is a shot of the Data tab in the beta build (note, this is not the final UI we will ship).


(Click to enlarge)

One of the buttons on the Data tab is the “Existing Connections” button.  To connect to the database they want to use for their PivotTable, the user will start by pressing this button.  When they do, they see a new dialog (“Existing Connections”) which lists the connections that are stored in the DCL. 


(Click to enlarge)

One point to note is that these connections all have friendly names and non-technical descriptions, so it is easy for users to choose the connection they want.  Those names and descriptions are provided by the person(s) who set up the DCL and populate it with ODC files – more on this below. 

Next, the user simply needs to highlight the connection they want to use and press open.  At this point, they see another dialog which allows the user to tell Excel 12 what to do with the data.


(Click to enlarge)

Another point to note is that Excel 12 looks at the connection the user has selected and only offers options that are possible for a particular database (for example, in this case, the datasource was a SQL Server Analysis Server database, which cannot be represented in Excel 12 as a table, so that option is disabled).  At this point, the user needs to simply press OK and they have a PivotTable connected to data in Excel 12.  That’s a total of 3-4 clicks.

Some of you are probably asking how the connections ended up in the DCL in the first place.  In general, we anticipate that either departmental “connection-savvy” power users or IT will author data connection files and put them in DCLs where the connections will be reused by many people in the organization. 

You may also be asking how Excel knows about the existence of DCLs.  SharePoint has a new feature that allows the administrator to “advertise” the location of the DCL to Office 12 clients, allowing connections from a DCL to show up in Excel 12.  Of course, the DCL only shows up if the user has permissions to access those connection files.

Solving connection management problems …
In addition to improving discoverability of connections, DCLs will help customers manage connections.  Information about data sources can change, such as server name, OLAP cube name, table name, etc. – a typical example is a database moving from a test server to a production server.  For organizations that have many authors, it may be difficult to communicate these changes to all the right people.  Worse yet, there may be hundreds of existing workbooks that need to have their data connections updated.  The DCL helps solve these problems because customers will only need to update a single connection file in the DCL with new information.  After they have done so, workbook authors will get the right connection information the next time they use that connection file, and any existing workbooks that were created using the connection file will now have their connections updated automatically the next time that workbook’s data is refreshed. 

How exactly does this work?  By default, workbooks will refresh their connection information from a DCL only when they fail to connect to the data source (you might think of this as a “failover” mechanism).  But we have also added the ability to force workbooks to always get the latest connection information before attempting to connect.  An example of when this might be useful is when you want workbook authors to start using a new database for business reports, but you still want to keep the old database around and functioning for auditing or test reasons.  Connections to the old database still work, but you want current and future workbooks to start using the new database.  The “always use this file to refresh data” setting is designed for exactly that kind of scenario.  The setting is a property in the ODC file itself – it can be set when the ODC file is created (pictured below).


(Click to enlarge)

Making data connectivity more secure …
Now that we have talked about discoverability and manageability, let’s conclude by looking at how the DCL can be used to make connecting to data more secure. 

One common security concern is knowing which data connections are safe to run – for example, data connections can contain malicious queries, or they could contain connection parameters that can slow an app down or compromise the integrity of the data.  By creating a DCL, and by only allowing most knowledgeable and trusted “connection authors” to save connections to the DCL, you add an extra layer of security that helps ensure that connections coming from a DCL safe to run. 

In a previous entry, I talked about Trusted Locations on Excel Services as a means to ensure that malicious workbooks were prevented from running on the server.  Much like Trusted Locations, Excel Services has “Trusted Connection Libraries” for data connections.  Excel Services has a mode where it will only process data connections from DCLs that the administrator has explicitly marked as “trusted” by the server.  As mentioned above, data connections have many security threats associated with them – in many ways processing a data connection can be like running code.  By providing Trusted Connection Libraries, Excel Services gives the administrator the ability to allow only specific data connections to be run on the server.    

That’s it for DCLs.  Next time we will take a look at how Excel Services integrates with SharePoint dashboards. 

Posted: Tuesday, November 29, 2005 9:57 AM by David Gainer
Filed under:

Comments

Jan Karel Pieterse said:

Hi David,

Interesting as Excel Services may be, I suspect the subject is not the most popular one around. Your last three posts have attracted no comments at all.

I'm certainly not saying Excel services shouldn't be written about. But I am under the impression that your writing for the wrong audience here.

Nevertheless: Great blog, keep it up.
# November 30, 2005 5:06 AM

David Gainer said:

Hi Jan Karel,

My goal for this blog is to cover everything we are doing with Excel client and server in v12, so I am going to be pretty thorough for all feature areas. I promise there will be lots more posts that stir up comments before we are all done :-)
# November 30, 2005 1:08 PM

Nigel Harper said:

David,

I have just got back to your blog after a 3 week break and finished wading my way through all your new entries and readers' comments. Wow! Lots to absorb and get through.

I'm sure that this lot on Web Services is being read but that it is very new to most of us and so we have not got anything to say. Rest assured by the time you start up a blog for the next version of Excel you will no doubt get a load more responses on this topic.

Thanks again,
Nigel
# November 30, 2005 1:32 PM

Harlan Grove said:

A question I posed under a previous topic seems to need to be asked again. Will Excel services provide any control over recalculation, i.e., anything akin to Excel's Application.Calculation property and its Application.Calculate and Application.CalculateFull methods? If not, do SetCellA1 calls trigger recalculation, and if so, is recalculation interupted upon subsequent SetCellA1 calls before the workbooks fully recalcs? Or is recalculation deferred until the first GetCellA1 call? Do workbooks need to be saved to servers with Calculation set to Automatic?
# November 30, 2005 7:49 PM

David Gainer said:

Hi Harlan,

1. Workbooks can be saved with calculation set to either automatic or manual.
2. If it is automatic, then SetCellA1 (and any of the other setting calls) will trigger a recalc.
3. If it is manual, we have 2 methods to trigger a recalc: Calculate and CalculateWorkbook. The former enables to calculate a range or the entire workbook, and works as Range.Calculate. The latter calculates the entire workbook, and takes an argument to control the calculation – essentially enabling the two operations that are equivalent to Excel’s Application.Calculate and Application.CalculateFull.
4. Subsequent calls with Excel Web Services are not possible while a call is still being processed. So if SetCellA1 did not return yet (e.g. because the workbook is still being calculated) and you call it again, you will get an exception. The way to interrupt the previous request is to call CancelRequest (the only method that can, actually, be called during the course of a previous request); if the cancel is successful, you can issue the next request. Note that CancelRequest may or may not succeed – the server is not always able to stop a request (e.g. if it is hanging on an external data query).
# December 1, 2005 12:22 AM

Harlan Grove said:

David Gainer...
...
|3. If it is manual, we have 2 methods to
|trigger a recalc: . . .

Good.

|4. Subsequent calls with Excel Web Services
|are not possible while a call is still being
|processed. So if SetCellA1 did not return yet
|(e.g. because the workbook is still being
calculated) and you call it again, you will get
|an exception. . . .

Not good. Any way to query whether a request has been completed? If each SetCellA1 call would trigger recalc if the workbook were saved with Automatic calculation, and the next SetCellA1 call couldn't be issued successfully until that recalculation ends, then either there'd need to be some means of querying the state or all SetCellA1 calls would need to be wrapped in exception handlers wrapped in loops.

Any way to turn off automatic calculation from Excel services? If not, wouldn't it be best to treat all workbooks as saved with calculation set to manual? Maybe I'm being overly naive, but if the first dozen requests posted to Excel Services were SetCellA1 calls, there'd be nothing gained by recalculating after each request. On the other hand, it'd be a good thing if the first GetCellA1 call automatically triggerred recalculation even without an explicit calculate request.

Dealing with automatic calculation in large models will likely be enough to convinve most of your potential customers that Excel Services wasn't a good idea. On the other hand, dealing with manual recalc but forgetting to issue an explicit recalc request before issuing GetCellA1 requests would also have unpleasant results which might lead potential customers to the same conclusion. You might want to poll the willing MPVs for their calculation war stories. There are serious trade-offs involved.
# December 1, 2005 4:00 PM

David Gainer said:

Hello Harlan,

>Not good. Any way to query whether a request
>has been completed? ....

The normal way of using a web service such as this, through the proxy class that Visual Studio generates, is synchronous. The call does not return until its processing is complete. So the issue that you raise goes away. Only code that tries to generate and send SOAP calls directly, and chooses to do so asynchronously, could hit that problem. Since we believe that most developers will use the Visual Studio framework, we did not add the ability to query for the request state, but we appreciate your feedback.

>Any way to turn off automatic calculation
>from Excel services? If not ...

No. Turning off recalc for all workbooks may be suitable for some cases, and not for others. As an example, if the application is designed with multiple iterations of set-calculate-get (each iteration in the loop both sets parameters and gets calculated results, and calculation is not very heavy), then an extra Calculate call each iteration could be a significant performance hit – an extra roundtrip to the server. So we did not want to force manual recalc for all custom applications. Ideally, it would be good to be able to know how the workbook was saved and to be able to change the setting – great feedback. This is just one example why an Excel-like object model for Excel Services is a desirable feature, which we hope to pursue at some point in the future.

>Dealing with automatic calculation in large
>models will likely be enough to convinve most
>of your potential customers that Excel
>Services wasn't a good idea

Again, thanks for the feedback, especially the implied conclusion that it is not clear cut, and really depends on the scenario. The tradeoffs are exactly why we did not wish to force one way or the other – both are possible, but require careful application design. If an application presents the issue of expensive calcs for each call to SetCellA1, there are a few workarounds. They are based on the assumption that you are in control of both the workbooks and the code.
1. Save the workbook with manual recalc, and call Calculate explicitly only once after the Set calls are done.
2. Organize the workbook parameters in a single range, and call SetRangeA1 once, instead of multiple calls to SetCellA1. This way, not only calculation happens only once, but you also save most of the roundtrips to the server – a performance gain.
# December 6, 2005 1:08 AM

Alex Barnett blog said:

Sander Viegers is a user experience (UX) designer in the Office Design Group who contributed to Excel...
# July 14, 2006 6:36 AM

Заметки с полей said:

При публикации форм InfoPath 2003 возникало несколько проблем. В частности, одной из проблем было то,...
# August 20, 2006 4:40 PM

Microsoft Excel 2007 (nee Excel 12) said:

Today we have a guest post from Dan Parish, who is a program manager on the Excel Services team. ...
# September 25, 2006 3:01 PM

Mark Bower said:

Today I needed to get an InfoPath Forms Services web-form to make a call out to a web service. I was

# December 5, 2006 5:55 PM

Mark Bower said:

Today I needed to get an InfoPath Forms Services web-form to make a call out to a web service. I was

# December 5, 2006 5:56 PM

The Mit's Blog said:

De retour dans une de mes couches logicielles préférées de MOSS : Excel Services Il est juste de reconnaitre

# October 9, 2007 8:48 AM
New Comments to this post are disabled
Page view tracker