In our last series of posts Zach and I talked about using SQL Server's new FILESTREAM support to store, manage and retrieve media in SQL Server to enable rich user experiences. We've been at it again this week trying to get some traction with the new spatial data types and query semantics in SQL Server 2008. This new functionality makes it possible to integrate location awareness into just about any type of application. What could be more compelling than allowing a user to interact with information that is relevant to locations that they care about, like home, work, school or vacation destinations? This technology has long been the domain of sophisticated GIS applications, but by integrating spatial capabilities into SQL Server 2008, Microsoft is making location awareness available to the average database developer.
Zach has some experience with GIS applications, but I have none at all. So the challenge was whether a couple of database geeks could figure out how to leverage this spatial stuff in an application in a couple of days. I'm happy to say that we made great progress and we'd like to share some of our experiences in hopes that it will demystify the spatial functionality in SQL Server 2008. Rather than regurgitate the same stuff you can read in books online when CTP5 comes out, I'm going to try to focus on some of the key discoveries we made in hopes that it might save you some time.
The scenario we focused on was "geo-locating" the media we stored in SQL Server 2008 in our previous FILESTREAM sample. By tagging these images and videos with some location information, it's possible to do spatial queries that retrieve all the pictures that were taken in a particular geographic area. There are a ton of cool applications for this kind of functionality, I'll leave it to your imagination to think of some.
I figured a logical place to start was by simply adding a geography column to our eventMedia table like so:
CREATE TABLE [dbo].[eventMedia] ( [mediaId] [bigint] NOT NULL IDENTITY PRIMARY KEY, [mediaKey] [uniqueidentifier] NOT NULL ROWGUIDCOL UNIQUE, [title] [nvarchar](256) NOT NULL, [dateCreated] [datetime] NOT NULL DEFAULT(GETDATE()), [createdBy] [nvarchar](256) NOT NULL, [fileName] [nvarchar](256) NOT NULL, [contentType] [nvarchar](256) NOT NULL, [location] [geography] NOT NULL, [binaryFile] [varbinary](max) FILESTREAM DEFAULT(0x));GO
Next I started thinking about how to get shapes into our new geography column. Turns out it was a whole lot easier than I expected. Under the covers, the geography type is implemented as a UDT. That means geography is not a scalar type like an integer or date, rather its an object with methods and properties. This is a great example of how the .NET Framework Common Language Runtime integration introduced in SQL Server 2005 is now enabling a whole new generation of features and capabilities in SQL Server 2008.
Everyone knows how to assign values to scalar types, but how do you create an instance of an object in Transact-SQL? In the case of the geography type there are a variety of ways, but the simplest that I found is the Parse() method. This takes a string representation of a shape as an argument. The format of that string is defined by the Well Known Text (WKT) specification published by the OGC. For example, here's some Transact-SQL code that creates an instance of a geometry object and initializes its value to a POINT shape:
DECLARE @g [geography];SET @g = geography::Parse('POINT (39.96296 -75.16834)');PRINT @g.ToString();
In this example the point shape's coordinates are the latitude and longitude of my old apartment building in Philadelphia. It's worth mentioning that the geography type utilizes a default coordinate system known as WGS 84 which is used by most GPS systems. Once you've initialized a geography instance, there's a ton of methods and properties you can use to interrogate them, compare them, and create new instances from existing ones. Most of these capabilities are defined in the OGC's Simple Features for SQL Specification.
The Parse() method is great because it can deal with all of the different types of shapes, allowing me to code a single INSERT statement that can load any kind of valid shape into the geography column. Here's some C# code we used to insert geography data into our our eventMedia table:
SqlCommand cmd = new SqlCommand("INSERT [dbo].[eventMedia]([mediaKey], [title], [createdBy], [fileName], [contentType], [location]) VALUES( @mediaKey, @title, @createdBy, @fileName, @contentType, geography::Parse(@location));", cxn, txn);cmd.Parameters.Add("@mediaKey", SqlDbType.UniqueIdentifier).Value = mediaKey;cmd.Parameters.Add("@title", SqlDbType.NVarChar, 256).Value = title;cmd.Parameters.Add("@createdBy", SqlDbType.NVarChar, 256).Value = createdBy;cmd.Parameters.Add("@fileName", SqlDbType.NVarChar, 256).Value = sourceFileName;cmd.Parameters.Add("@contentType", SqlDbType.NVarChar, 256).Value = contentType;cmd.Parameters.Add("@location", SqlDbType.NVarChar).Value = location;
Next we started thinking about what kind of shapes to load into our new geography column. My first inclination was to start loading a bunch of map data. Zach and I really didn't want to write a bunch of graphics code to draw maps, there are great services out there like Virtual Earth that draw maps way better than we could. We quickly realized that there really wasn't any need to load a whole map, all we needed to demonstrate "geo-locating" our media files was to create some shapes that represent the map and regions that we are interested in, then tag each of our media files with a specific location within that map. To keep things simple, we decided to store those shapes in the same table as our media files so we could query them all in the same place.
Our demo design uses the following "types" of rows in the eventMedia table:
For the demo we decided to focus on the Pro Cycling Tour International Championship in Philadelphia, PA. It's relatively self-contained and has some cool sections that correspond neatly to our concept of "regions of interest".
Map and Region Rows
One interesting thing I ran into when creating the geography polygons was that you have to define the polygon's points in a counter-clockwise fashion to have the proper "ring orientation". If you define geography points in a clockwise fashion you will get the following error:
Msg 6522, Level 16, State 1, Line 2A .NET Framework error occurred during execution of user defined routine or aggregate 'geography': Microsoft.SqlServer.Types.GLArgumentException: 24205: The specified input does not represent a valid geography instance because it exceeds a single hemisphere. Each geography instance must fit inside a single hemisphere. A common reason for this error is that a polygon has the wrong ring orientation.Microsoft.SqlServer.Types.GLArgumentException: at Microsoft.SqlServer.Types.GLNativeMethods.ThrowExceptionForHr(GL_HResult errorCode) at Microsoft.SqlServer.Types.GLNativeMethods.GeodeticIsValid(GeometryData g) at Microsoft.SqlServer.Types.SqlGeography.IsValidExpensive() at Microsoft.SqlServer.Types.SqlGeography.ConstructGeographyFromUserInput(GeometryData g, Int32 srid) at Microsoft.SqlServer.Types.SqlGeography.STGeomFromText(SqlChars geometryTaggedText, Int32 srid) at Microsoft.SqlServer.Types.SqlGeography.Parse(SqlString s).
I don't confess to understand this 100%, but Ed Katibah told me its kind of like turning the world inside-out. Ed also mentioned that sister data type geometry polygons don't have this limitation and can be loaded in either order (clockwise or counter-clockwise). Without getting into too much detail here, that led to a conversation about what the right type was to use for our demo. I had thought that geometry was limited to standard x and y coordinates and wouldn't handle lat/long coordinates, but was surprised to learn that it does. The basic difference between the two is that geography types account for the curvature of the earth, while geometry types don't. That means for relatively small surface areas they are roughly equivalent, but if you are dealing with larger surface areas you will definitely want to stick with the geography type.
Another important thing to remember about polygons is that you need to "close" them with a final point that is the same as the first point in the polygon. If you don't do this you get the following error:
Msg 6522, Level 16, State 1, Line 2A .NET Framework error occurred during execution of user defined routine or aggregate 'geography': System.FormatException: 24119: The Polygon input is not valid because the start and end points of the exterior ring are not the same. Each ring of a polygon must have the same start and end points.System.FormatException: at Microsoft.SqlServer.Types.GeometryDataBuilder.EndFigure() at Microsoft.SqlServer.Types.OpenGisWktReader.ParseLineStringText(FigureAttributes attributes) at Microsoft.SqlServer.Types.OpenGisWktReader.ParsePolygonText() at Microsoft.SqlServer.Types.OpenGisWktReader.ParsePolygonTaggedText() at Microsoft.SqlServer.Types.OpenGisWktReader.ParseGeometryTaggedText() at Microsoft.SqlServer.Types.OpenGisWktReader.ReadGeometry() at Microsoft.SqlServer.Types.SqlGeography.STGeomFromText(SqlChars geometryTaggedText, Int32 srid) at Microsoft.SqlServer.Types.SqlGeography.Parse(SqlString s).
This points out one of my few beefs about the CLR implementation in SQL Server, and that is that CLR exceptions always generate the same Transact-SQL error number (6522). Somebody on the CLR team once explained to me why it has to be this way, I think it has to do with limitations of the existing Transact-SQL error handling framework and RAISERROR in comparison to the extremely rich exception handling capabilities in the .NET Framework. I'm sure my buddy Bob Beauchemin would be able to explain this in gory detail.
One last tip about polygons, and any other shape that uses lat/long coordinates. I made several typos when initially recording my lat/long coordinates, resulting in less than satisfying results for the demo. I called Ed to get some debugging recommendations. Ed mentioned a great spatial partner named Safe Software who is working on a beta version of their spatial ETL tool named FME, which can be used to move spatial data in and out of SQL Server 2008 from other formats. One of their components will actually allow you to visualize shapes stored in a geography column, making it a heck of a lot easier to determine if the polygon you thought you created is actually the polygon you created.
Anyway back to the matter at hand, we created a .NET Framework command-line utility named FileStreamLoader designed to populate our eventMedia table. This was the easiest way for us to pump in all of our pictures and videos using FILESTREAM. It's tough to do that kind of stuff in a Transact-SQL script, so database geeks please go learn .NET programming it will make your life a whole lot easier. I'll post the code for FileStreamLoader up on CodePlex once CTP5 ships.
So after loading up our map/region rows using FileStreamLoader I started running some queries to see how they worked. Here's a good one that retrieves all of the polygons from eventMedia:
SELECT N'Map/Region Row' As [Label], [mediaKey] As [Key], [title] As [Title], ROUND([location].STArea(),2) AS [Area], ROUND([location].STLength(),2) AS [Length], [location].ToString() AS [Location]FROM [dbo].[eventMedia]WHERE [location].STDimension() = 2;
I used a few spatial methods to illustrate some of the rich functionality you get when using the geography type. The STArea() values are the total area of the polygon in square meters. The STLength() values are the total length around the polygon in meters. The ToString() values are WKT string representation of the geography instance. I used the STDimension() method to filter the result set down to just 2-dimensional objects. The results look like this:
The demo application needs to determine what the overall map space is. We can hard code this, but its better if the app can determine it dynamically using a spatial query. This query gets the map region from the table, which is the 2-dimensional geography object with the biggest area:
DECLARE @mapKey [uniqueidentifier];DECLARE @map [geography];
SELECT TOP 1 @mapKey = [mediaKey], @map = [location]FROM [dbo].[eventMedia]WHERE [location].STDimension() = 2ORDER BY [location].STArea() DESC;
Next, the demo application needs a list of sub-regions within the overall map space. Again, its best if the app can get this list using a spatial query:
SELECT N'Region Row' AS [Label], [mediaKey] AS [Key], [title] AS [Title]FROM [dbo].[eventMedia]WHERE ([mediaKey] <> @mapKey) AND ([location].STDimension() = 2) AND ([location].STIntersects(@map) = 1);
The STIntersects() function is one of the most important functions for enabling spatial queries. In this case, each 2-dimensional geography instance in the location column is evaluated against the map instance to see if they intersect. The results represent the set of "sub-regions" in our map:
While I don't discuss it in this blog post, its important to note that the STIntersects() method supports usage of the new spatial index type in SQL Server 2008. We didn't really need one for such a small number of rows, but when dealing with larger scale GIS systems indexing can become critical.
Finally, on to the basic spatial query the demo application needs to support. We need to retrieve a list of pictures that were taken within the boundaries of our map. This query gets all of the pictures that have been geo-located within our map:
SELECT N'Map Picture Row' AS [Label], [mediaKey] AS [Key], [title] AS [Title], [location].ToString() AS [Location], LEN([binaryFile]) AS [File Size]FROM [dbo].[eventMedia]WHERE ([contentType] = N'image/jpeg') AND ([location].STDimension() = 0) AND ([location].STIntersects(@map) = 1);
The results look like this:
Since the pictures are stored as a FILESTREAM, the demo application can go out and retrieve them using new Win32 file streaming API support in SQL Server 2008. Now let's do a final spatial query that gets all of the pictures in one of our "areas of interest", namely the Manayunk Wall:
DECLARE @wallArea [geography];
SELECT @wallArea = [location]FROM [dbo].[eventMedia]WHERE [wallKey] = 'dbd03037-7451-4451-8a26-b7b95fd2486f';
SELECT N'Wall Picture Row' AS [Label], [mediaKey] AS [Key], [title] AS [Title], [location].ToString() AS [Location], LEN([binaryFile]) AS [File Size]FROM [dbo].[eventMedia]WHERE ([contentType] = N'image/jpeg') AND ([location].STDimension() = 0) AND ([location].STIntersects(@wallArea) = 1);
Here are the results:
And there we have it! These are the two pictures that were taken in the Manayunk Wall area. Obviously the demo gets more interesting with more photos and more regions.
Now you know how to use the new spatial functionality in SQL Server 2008 to geo-locate media files in your application. Hopefully all you database developers out there won't be afraid to dive in and start using this amazing new functionality. Zach is going to do some additional posts in our spatial series which discuss how to leverage these new spatial features within a WPF application.
PingBack from http://msdnrss.thecoderblogs.com/2007/10/28/demystifying-spatial-support-in-sql-server-2008/
I've been stalling, trying not to say too much about spatial until it's actually available. The code
[Fair warning... this is a long post.] Great blog! I see you are using FME. FME is good stuff, but FME is not a GIS and so is very limited compared to using a full featured GIS for work with either GIS data or with SQL Server 2008 spatial.
I'll use Manifold as the GIS example in this post, since I'm a product manager for Manifold and most familiar with that. As far as I know, Manifold is the only production GIS that is shipping today with built-in support for SQL Server 2008 spatial capabilities (since August, 2007 - See the announcement at http://www.manifold.net/info/news.shtml ).
Manifold has a lot of spatial DBMS experience supporting other vendors, including Oracle Spatial, IBM DB2 with IBM's Spatial Extender, PostgreSQL/PostGIS, SQL Server 2005 using the Manifold Spatial Extender, and "generic" spatial DBMS capability for virtually any DBMS. I hope these comments, drawn from our experience with many tens of thousands of spatial DBMS applications, will help novice spatial DBMS users avoid re-inventing the wheel when putting the mighty power of SQL Server 2008 spatial to use.
GIS data often requires significant work besides simple format translations to be useful within spatial DBMS, and such work is usually best done in the visual / analytic / editing context provided by a real GIS. That speeds up workflow, and allows analytic processes that can transform data into the most useful form, which is what you want if you are going to be storing it within your SQL Server 2008 spatial data warehouse. It is rarely the case that data from legacy formats should be uploaded into a spatial DBMS like SQL Server 2008 without substantial alteration and editing.
A simple example: Let's say you want to grab data from a variety of different formats and different spatial DBMS servers for upload into SQL Server 2008 spatial storage. If you bring the data as layers into a real GIS that can do reprojection on the fly, and combine it with existing layers within SQL Server, you can see to what degree they overlay. Errors in projections will be immediately evident, especially if your GIS has the capability of combining layers from radically different sources, such as images, terrain elevation surfaces and many different formats of vector data, as Manifold automatically does.
What often causes trouble with data not seen in context is not something that can be solved by a simple act of reprojection, it is missing small nuances in matters such as datums or specific coordinate system parameters so that the data to be uploaded is slightly off even though the coordinate system (projection) information appears OK. Such problems are easiest to catch in a visual setting but cause no end of chaos once they get embedded into your corporate geospatial data warehouse.
An example where you might need to do more serious work before uploading: Suppose you have a data set which consists of boundary lines created with AutoCAD and stored in a DXF without any geographic context. That's a classic situation that afflicts about 300,000 jurisdictions in the US with civic spatial data as well as endless commercial users such as utilities with corporate CAD data.
Before uploading into SQL Server 2008 you'll need to get that data set georeferenced, you'll want to get it cleaned up to be free of classic CAD errors (dangles, overshoots, unclosed boundary lines, etc.), you'll most likely want to create area objects instead of boundary lines, assign data attributes automatically to the resultant area parcels and only then export into SQL Server 2008 spatial storage. Manifold can do all that in a highly visual, point-and-click environment using a variety of layers, such as automatically streamed-in Virtual Earth satellite or road map backgrounds, to assist the process. You can use hundreds of editing and analytic tools to assist as well. If a problem is not caught before upload, you can easily fix it afterwards because all those tools still work even after the data is in SQL Server 2008.
When a GIS like Manifold has strong DBMS capabilities, you can also slice and dice your data into desired form using data attributes as well as visual or spatial characteristics to control the process. For example, you might want to use spatial SQL to dynamically create new objects based upon spatial relationships such as intersections between buffer zones and existing objects that also take into account the heights of surfaces in the region or the value of data attributes within nearby polygons. In fact, with Manifold you can do that with data before it uploads into SQL Server 2008 or you can do it later, editing data stored in SQL Server 2008 as if it were a local layer.
A GIS like Manifold that can connect directly to a spatial DBMS and visually, seamlessly work with data using a full power, visual GUI brings a lot of additional value to the data stored in the DBMS. For every user whose choice of interface is SQL, there are many more who prefer a visual GUI using something like Manifold. Having both spatial SQL as well as a visual interface makes it a lot easier to manage data in the DBMS and to develop spatial applications because you can choose when you want to write code for server-side, DBMS processing and when you would prefer a more visual, point-and-click process to do the job.
For example, if you want to grab photos within a given area of interest it is a lot easier for most users to simply "lasso" such points with a mouse on a visual map than it is to write SQL to select them. A GIS like Manifold will give you such visual capabilities and will allow you to dynamically edit what's in your SQL Server 2008 spatial storage and to manage that storage using visual means. If you are developing applications, you can use the GIS as a visual workbench while you try out SQL or .NET scripts within Manifold to help you create a spatial application much faster, even if that application ultimately will be implemented exclusively within server-side code.
Keep in mind that getting data into a spatial DBMS is just the beginning. What comes after is use within applications. While there are plenty of big applications that merit server-side coding to create each application individually, there are far many more applications that become instantly available when people use existing, modern GIS capabilities to manipulate data stored within spatial DBMS in an ad hoc, interactive manner. In most cases, it is much easier and faster to simply point-and-click to accomplish the desired task than to write an application that does it.
For an example of working with layers in a visual context, you can do things like the tutorial at
The above tutorial shows an example where favorite restaurants are marked as points of interest using Virtual Earth background images and maps. That tutorial and any of the other hundreds of visual examples in the user manual can be done as layers within SQL Server 2008.
A general intro to Manifold is at
See the whole user manual at
Manifold has supported spatial capabilities in SQL Server 2008 since August (the spatial pre-release of that date), and that support is bult into in our standard, production release. See, for example,
Everything works: full multiuser editing of drawings in SQL Server, direct connect to IMS web applications, interaction with images and surfaces, visual copy and paste from Oracle Spatial or other spatial DBMS products to Katmai, zillions of editing and other tools, reprojection on the fly, etc.
CUrrent support in the shipping Manifold product is for Katmai pre-releases that are not public. When Microsoft issues a public SQL Server 2008 CTP that includes spatial capabilities, Manifold will immediately issue a free update that supports the public CTP as well. That update will also include detailed, visual examples featuring SQL Server 2008 to help users new to GIS get started.
As mentioned earlier, Manifold supports essentially supports all other spatial DBMS products as well as SQL Server 2008 spatial.
Having the ability to simultaneously connect to just about any other spatial DBMS is extremely useful when moving data into SQL Server 2008. For example, you could have four different windows simultaneously open showing a drawing each in Oracle, DB2, PostgreSQL and SQL Server 2008 and simply copy and paste between windows to move, say, parcel data, from DB2 into SQL Server 2008. Manifold will automatically match projections, do reprojection on the fly if need be, change geometry types into SQL Server 2008 native types and adjust attribute data so that the spatial data is correctly stored within SQL Server 2008.
Prior experience has also helped Manifold assure that support for SQL Server 2008 spatial includes mature support right from the beginning for other key technologies like native 64-bit Windows code, total support for Vista, automatic usage of Virtual Earth, automatic multicore and multiprocessor support, use of NVIDIA CUDA, integrated .NET scripting (including IronPython), etc.
The result is that right now, today, you can combine SQL Server 2008 spatial and Virtual Earth using seamless Manifold GIS running 64-bit in Windows Vista x64 on a quad-core processor and you will have a GIS / spatial DBMS tool of unprecendented power that works with data from just about any spatial data source known and scales from individual desktop use to thousands of simultaneous users within the largest enterprise. That's a real credit to the synergies of the Microsoft ecosystem.
We're really excited about SQL Server 2008 spatial capability. We are so excited about this that Manifold support for SQL Server 2008 spatial is built in at no additional cost into every Manifold edition from Enterprise Edition ($395 a seat) on up. We want our customers to consider SQL Server 2008 spatial a fundamental, standard building block for both server-side spatial applications and also for everyday desktop GIS.
Today at TechEd EMEA Microsoft announced 11 partners who will support SQL Server 2008 Spatial .
There are a bunch of features in SQL 2008 that have been on the ask list for a VERY long time. I suspect
Interested in using the newly-available GEOGRAPHY data type? This well-written article provides a detail
Okay, it's time to come clean. As a technical evangelist, particularly one that focuses on driving platform
Construyendo comandos para consultas a base de datos, es un tema que siempre despierta interés y siempre