Good Evening,

I sit down to write this blog article on the plane on my way to LA in preparation for tomorrow’s presentation at PDC on the same subject.   In my last article I described the ability to leverage the hard work that the SQL Server team put into building a spatial library for your own .NET applications.    In this article I will continue to describe how the SQL Server Spatial Library can be leveraged in many different types of applications, and in this case for use with Microsoft Office, and more specifically Microsoft Excel.

It will likely take a couple blog articles to cover the entire scope of this project, but this first article will show all the necessary steps to get the base functionality built and we will expand on it over the coming two articles.

Let’s get started building the ES2 (Excel Spatial Spreadsheet) add-in…

ES2-LogoImage

Pre-Requisites

In order to follow along you will need Visual Studio 2008, Microsoft Excel 2007 and you will also need to download the SQL Server Option Pack which contains the SQL Server types assemblies (or Install SQL Server 2008 on your development machine as it also includes the same assemblies).

Office Add-In Technologies

Microsoft Office offers a wide variety of options for developers to extend the core office suite.   There is of course the built in Visual Basic for Applications which allows developers to build applications hosted by Office.   You can also build different kinds of add-ins that plug into Office.   Some add-ins are more “GUI” oriented, allowing you to add task panes, new ribbons while others focus on providing more “backend” services like new formulas for Excel.    

There are also a few options for building add-ins for office.   Visual Studio includes the VSTO project types and you can also go with the more traditional and complex native extensions written in C / C++.

What the heck is a Spatial Function for Excel?

For this blog article, our focus is to build new spatial operators that can be used within Excel as if they were native Excel functions.  We want to enable an Excel User (Information Worker) who wants to find out the area of a state or zip code to type in something like:

=AREA(ZipCodeShape)

or to find out if a customer is located within a kilometer of a store location:

=IF(DISTANCE(CustomerLocation, StoreLocation)<1000,”YES”,”NO”)

it would also be nice if an Excel user could easily turn a user’s existing data like addresses  into spatial objects for analysis by calling out to web services offered by Bing maps for example:

=PARSEANDLOCATE(“One Microsoft Way, Redmond WA”)

In fact we’d like to enable someone who is not an expert in spatial analysis to just “try and learn” spatial operators quickly and painlessly, which is exactly what Microsoft Excel enables (you can always Undo and try out different scenarios by pasting a new formula in a different cell).

We want “The Power of SQL Spatial with the Simplicity of Excel”

To achieve that goal I’ll  be leveraging a type of add-in called an Automation Add-In.   For those who remember the days of COM / DCOM programming (yes I’m that old) and terms like late binding, IDispatch and automation interfaces then you’ll be right at home, for those who were able to skip that time period, don’t stress out, .NET makes it pretty easy to create COM Automation classes.  

Implementing the base Spatial Automation Add-In

Start Visual Studio 2008 or 2010 as Administrator (more on this later) and create a new .NET 3.5 C# Class Library Project and name it something interesting like MySpatialFunctions.

Create-Project

Once the newly created project is loaded in Visual Studio, right click on the project to bring up the project properties dialog.  

Click the [Assembly…] button on the Application Tab of the project properties and make sure the “Make Assembly COM-Visible” is checked.

Assembly-Info

We also need to ensure that the “Register for COM Interop” on the Build tab is checked.  This step tells Visual Studio to register any COM components in the project every time the project is rebuilt.  The alternative is to manually register the assembly using the regasm.exe .NET utility.

Project-Properties

Now rename the default class name to something a little nicer than Class1, for example MySpatialFunctions.  Please note that the name you pick for the class will show up in Excel’s drop down list of functions as the Category name for your functions (example shown below):

Excel-InsertFunction

Following the instructions from the previous article we need to add a reference to the SQL Server Type library which includes the Spatial data types.  Right click on the references folder in your newly created project and add a reference to the SQL Server Type assembly.

Insert-Reference

You will also need to add:

using Microsoft.SqlServer.Types;

to the beginning of your class file to bring in the .NET definitions for the spatial data types. 

Now let’s make sure your class can act as a COM object by bringing in the .NET COM interop support and decorating your class with the necessary attributes [ClassInterface(ClassInterfaceType.AutoDual)].  Without getting too deep into COM terminology, this tells .NET to expose our class as supporting both COM late-binding (IDispatch / Automation) and early binding clients.

Your class file should now look something like the following: 

   1:  using System;
   2:  using System.Collections.Generic;
   3:  using System.Linq;
   4:  using System.Text;
   5:   
   6:  using Microsoft.SqlServer.Types;
   7:  using System.Runtime.InteropServices;
   8:  using Microsoft.Win32;
   9:   
  10:  namespace MySpatialFunctions
  11:  {
  12:      [ClassInterface(ClassInterfaceType.AutoDual)]
  13:      public class MySpatialFunctions
  14:      {
  15:   
  16:          [ComRegisterFunctionAttribute]
  17:          public static void RegisterFunction(System.Type t)
  18:          {
  19:              Microsoft.Win32.Registry.ClassesRoot.CreateSubKey
  20:                  ("CLSID\\{" + t.GUID.ToString().ToUpper() + "}\\Programmable");
  21:          }
  22:   
  23:          [ComUnregisterFunctionAttribute]
  24:          public static void UnregisterFunction(System.Type t)
  25:          {
  26:              Microsoft.Win32.Registry.ClassesRoot.DeleteSubKey
  27:                  ("CLSID\\{" + t.GUID.ToString().ToUpper() + "}\\Programmable");
  28:          }
  29:      }
  30:  }

The above code will expose your C# class as a COM automation class and will expose all public methods as automation callable methods.   The RegisterFunction, decorated [ComRegisterFunctionAttribute] is called by Visual Studio as part of building the project.  These functions are used register the MySpatialFunctions class with the Windows Registry.  You can also use the .NET regasm.exe utility to achieve same result.

Now that the plumbing is in place, let’s implement the SPATIALAREA function which will take in the description of a Geography object in WKT (well known text) format from Excel and return the area of that geographic object.   WKT is really just text or a string, so the input parameter to the function will be a String and the COM interop between Excel and our automation add-in will make sure that whatever Excel is passing in will appear to us as a .NET String.   We will return the area as a double.

   1:  namespace MySpatialFunctions
   2:  {
   3:      [ClassInterface(ClassInterfaceType.AutoDual)]
   4:      public class MySpatialFunctions
   5:      {
   6:          public double SPATIALAREA(String Shape)
   7:          {
   8:              double fReturnValue = -1;
   9:   
  10:              // Convert the String representation of
  11:              // geographic object into a SqlGeography
  12:              SqlGeography geo = SqlGeography.Parse(Shape);
  13:   
  14:              // We now have a valid Geography, let's call
  15:              // it's area function and convert the results
  16:              // from SqlDouble to .NET Double.
  17:              fReturnValue = geo.STArea().Value;
  18:   
  19:              // Return the results
  20:              return fReturnValue;
  21:          }
  22:         
  23:          // ....
  24:      }
  25:  }

The above code receives a String from Excel and converts the string representation into a SqlGeography object using using the class method Parse() which expects a WKT encoded string.   Once the string is successfully parsed, we call the OGC standard STArea() method to return the shape’s area in square meters.   The SqlGeography and SqlGeometry objects return SQL types (SqlDouble, etc.), so we use the .Value method to return the underlying double value.  The area is then returned to Excel as a double.

I’ll stop with this one function for now to test that we have all the necessary pieces in place, but you could easily add additional public functions that expose additional methods from the SqlGeography class.   For example you could add a SPATIALDISTANCE and a SPATIALINTERSECTS method to return the distance between two spatial objects or to test if two spatial objects intersect.    I’ll in fact do that in the next blog article to demonstrate some more advanced Excel methods that operate on ranges of cells.

Now Build the Project (F5).  This will compile the class assembly and register it as a COM class library.   You need to run Visual Studio as an administrator in order for Visual Studio to correctly register the library as a COM Class Library.   When the add-in is packaged for distribution we can create an installer that will take care of registration, for local development and small distributions, it is very convenient to have VS do the registration.   Trust me… COM registration setup programs are an acquired taste and can be painful to get right.

Once the project is built and the assembly registered with Windows we can switch gears and launch Excel to bring in the new automation add-in we just created.

Manually Enabling the Automation Add-In within Excel 2007

Lanch Excel 2007 and click on the Office button (upper left round button):

Excel

Select [Excel Options…] on the bottom right of the dialog box and from the Options dialog box, select the [Add-Ins] tab.

Excel-Add-In-Options

Choose Manage: Excel Add-ins and click [Go…], on the dialog box that follows click [Automation….] and find your class MySpatialFunctions.MySpatialFunctions in the list, select it and click OK.  This will register your automation add-in with Excel and make its functions available to Excel user as Formulas.

Excel-Automation-Add-In

Let’s now make sure the method is available within Excel.   In Excel, select a cell and click the Formula button (Fx) and in the Category drop down list, find your MySpatialFunctions.

Excel-InsertFunction-2

You have now completed the basic process for registering a custom automation add-in with Excel.   In the next blog article I will show how to automate this process using a setup program for deployment to end-users.

This is really just a glimpse into what is feasible in terms of spatial analysis now that this plumbing is in place.

The next blog post will include example of functionality to operate over a range of cells.  These aggregate functions can act on multiple geographic shapes, and include UNION, INTERSECTION and a function to calculate the CONVEXHULL of a shape, etc. 

Testing the base Spatial Add-In

So we’ve built an Add-In that currently supports one spatial function, the SPATIALAREA function, which when given a WKT encoded string representing a shape will return its area.   So how do you get a set of WKT data to try this out?    WKT is a standard format for geospatial data and there are different tools and data sources on the internet that let you convert data to / from WKT.  Another such text representation is called GML and it is relatively easy to modify our Add-In to support both formats (and auto-detect what format).  

For now here is an example WKT object, which represents the country of Morocco, all the pairs of numbers are Latitude and Longitude values which form the vertices of the country’s border:

POLYGON ((-8.6666678252256713 27.666664129182578, -8.667222995748233 28.709442230680754, -7.6447228335782063 29.38499825838333, -7.4338894686020938 29.397220532368337, -7.1238890576883476 29.636943787478454, -6.5858335012886569 29.567218718351096, -6.400278105924599 29.804443265276372, -5.5383339200815715 29.902496352581256, -5.2758341886801494 30.055274928261763, -4.9205561051119027 30.508052873828575, -4.396945078096711 30.648330706130608, -4.0169448364146341 30.911109819241169, -3.6266669275411774 30.970554415405132, -3.6022224394530782 31.095554457582143, -3.8261113959581103 31.163887059367184, -3.8183335039199489 31.695552783902336, -3.5116668926113768 31.673332210341314, -2.9994444009703307 31.833331997926884, -2.8538890815261464 32.088333074020561, -2.3311115158732578 32.157493608115495, -1.1805555950618838 32.110549872177238, -1.2958334558623108 32.163055394625928, -1.2497221362066235 32.326942482097934, -1.0102780535866551 32.50833136699751, -1.3827780560041652 32.724441516585529, -1.5424999126467027 32.9394379321343, -1.4802780647792078 33.063049307516593, -1.6680558432224473 33.261108396767547, -1.6002779231388138 33.556938179949341, -1.730555759883178 33.70499415941044, -1.6544443650187513 34.083610597469317, -1.7933334168756687 34.378326347728262, -1.6867106306540376 34.485511790735742, -1.8575001956524209 34.6102751699483, -1.7472224616645127 34.7472152758781, -2.209444613630581 35.085830782093907, -2.6350001107359389 35.099716151314666, -2.8855556899671657 35.242775016942396, -2.7727781165503669 35.12344354594692, -2.8416665727057948 35.126937801752014, -2.9147223394202406 35.273605397275951, -2.946944586102064 35.329162557540613, -2.986111147028685 35.418052595660889, -3.0663889907443789 35.28944394203873, -3.33250047654772 35.191383412856624, -3.6591668811445603 35.2727737647203, -4.3538894581048853 35.146110476744205, -4.6958342727246034 35.208885297255144, -5.25305545748645 35.576942396678376, -5.345833746377215 35.841659555457895, -5.3955573480761734 35.916336164243809, -5.9187442140904558 35.790649477730661, -6.3152781038228216 34.834999113878069, -6.8430557148205375 34.018608010752295, -7.4966669134759023 33.648330689833216, -8.5383338847421175 33.250549332499993, -9.27639013723884 32.558326726694006, -9.2811126007121647 32.175552453361512, -9.6791668736386267 31.708610438323728, -9.8252793032591743 31.3849982174836, -9.8402786969497935 30.628051715409157, -9.611946167924998 30.412777042428619, -9.64166834548642 30.161941508800595, -10.242500183623481 29.302219297167209, -10.606668533428026 28.966110252659604, -11.081111961457841 28.738609209711591, -11.509445127963096 28.304996469296981, -12.065833980717276 28.083053697361194, -12.902500271008393 27.954166414933635, -13.174961157838167 27.666957939188453, -8.6666678252256713 27.666664129182578))

You can copy the above text into a Cell within a blank Excel Spreadsheet, for example copy the text into cell A1.  You can “truncate” the text by putting a space / blank in cell B2.

Test-Add-In-Part1

Now let’s try out our fancy SPATIALAREA function by typing =SPATIALAREA(A1) in cell A2 and pressing Enter.

  Test-Add-In-Part2

And we should get the answer in square meters, format the cell to show 2 decimal places (unless you really like scientific notation):

Test-Add-In-Part3 

Achieving A More Perfect Spatial Add-In

One of the issues with using the simple approach (Auto Dual Class Interface) to create and automation add-in is that some of the non-spatial related base Object methods are then exposed to Excel as callable methods and visible in the Function Picker in Excel.    You can see some of those unintended functions above as “Equals”, “GetType” and “ToString”  

It would be preferable from a user experience and ease of use perspective if we could hide those methods and only expose spatial methods within Excel.   In order to achieve finer control over the exposed methods it is necessary to get a tiny bit more involved with the COM interop support in .NET.   

With some refactoring we can in fact hide those base Object public methods and have finer control over the methods that are exposed to Excel.  

I will cover this in detail in the next blog article, but here is the approach:

1.   Define an Inteface in C# that represents all the methods to be exposed to Excel by the automation class. 

    •  Decorate that interface with [ComVisible(true)] attribute and specify a unique Guid for that interface [Guid(“unique-guid-here”)].

2.   Modify the MyFunctions class:

o   Derive / implements interface from step #1.

o   Add [ComVisible(true)] decoration to the class

o   Add [Guid(“unique-guid-for-the-class”)] decoration to the class.

o   Add [ProgId(“MySpatialFunctions.MySpatialFunctions”)]

o   Modify [ClassInterface] Attribute to reflect [ClassInterface(“ClassInterfaceType.None”)]

  1. Modify the RegisterFunction and UnRegisterFunction to reflect both the Interface and the Class.

In addition to “cleaning” up some of the unintended methods from above, It would also useful to get a reference to the Excel application to enable more communication back and forth between the add-in and Excel.    You can get a reference to the Excel Application object by implementing the Extensibility.IDTExtensibility2 interface as part of your class.   Once you have access to the Excel Application object, your add-in can actually start communicating and controlling Excel, automatically formatting cells for example that include a spatial formula.

What’s Next?

In the next article I will build on this base add-in and expand on the functions it exposes as well as describe the process of building a VSTO add-in to provide a Spatial Ribbon within Excel.  I’ll also cover how to build some spatial UI componentry for an Excel spreadsheet to allow the embedding of a visualizing / mapping spatial control within a spreadsheet.  Finally I also plan to cover the creation of a setup project to install your add-ins (VSTO and Automation) on end-user workstations.   

So some cool stuff is coming.   Here is a small teaser to show you what you can expect:

Poster-2

Hope to see you at PDC tomorrow and let me know if you find this Excel add-in helpful in learning the SQL Spatial operators and in performing simple spatial analysis without having to leave Excel.

Sincerely,

Olivier