I’ve been pretty quiet around here, and at least part of the reason is that we’ve been pretty tight-lipped about what we’ve been up to. We’ve now gone public—we’re providing support for geospatial data in our next version of SQL Server, codenamed Katmai. The Virtual Earth folks mentioned this in their blog (others, too) but now I can make a little more noise.
What I’d like to do first is try to explain to the folks out there who aren’t experts in the field what this all means and give a very high-level tour of what we’re providing.
First, what is geospatial data? For starters, we mean locations on the Earth. It’s more than location, though: we mean the location and shape of objects on the Earth. Think of the description of roads, states, lakes, etc. For example, think of something like Live Maps:
In addition to the normal objects, I’ve added a polygon that shows Microsoft’s main campus. We want to be able to store all of the data on this map—the roads, the parks, and user-generated polygon—in SQL Server. Being a database, we want to be able to ask questions about the data. For example, “What are the roads that intersect Microsoft’s main campus?” A more complex example would be “What is the area of all parks within 1 kilometer of Microsoft’s main campus?”
How do we do this? First, we need new data types to be able to store this information. Toward this end, we are introducing two new types in Katmai:
· The first type, “geography”, will store points, lines, polygons, and collections of these in latitude/longitude coordinates using a round-Earth model. Most commonly-available data is given in latitude/longitude coordinates, so we expect that most people will want to use this type. Furthermore, this type will give correct computations on a true ellipsoidal model of the planet. What is the area of Indonesia? Will my flight from Seattle to Beijing take me over North Korea? Where can I store my GPS readings? This is the type for you.
· A “geometry” type to support flat-earth data. For those of you who are familiar with such things, this type is our OGC-compliant offering. In some ways, this is a more specialty offering for people who need to work in projected map coordinates either for legacy or legal reasons, but this type can be used for things like interior spaces as well, e.g., “Where in this warehouse is my book?”
If we take our roads data above—perhaps all of the roads for the United States—we could store them in a table Roads:
Roads(name varchar(30), location geography)
I.e., geography is a column type just like any other. We expose a pretty comprehensive set of operations on these type through a method-based interface. For example, if we have a geometry variable @microsoft that represents Microsoft’s main campus, we can find out which roads intersect it with the query:
SELECT nameFROM RoadsWHERE location.STIntersects(@microsoft) = 1
Given a similar Parks table containing all US parks, we can ask our parks question from above:
SELECT SUM(location.STArea())FROM ParksWHERE location.STDistance(@microsoft) < 1.0
Of course, beyond answering this query, we need to be able to answer the query quickly. Since my data may be very large, speed is going to mean having a good spatial index. Perhaps that will be the subject of my next post. Watch this space for more.
Why this distinction between geography and geometry? To me, everything is geometry, and a geometry has a spacial reference. The spatial reference specifies whether this is in spherical (geographic), cartesian (projected) or geocentric space, what datum is used etc.
Are the any subtypes like point, polyline, polygon etc, or will a table always be this generic type where you can't enforce the specific geometry type?
Furthermore I don't see a unit on those distance and area queries, so how would that work on a geographic type? Is the distance in degrees which rarely makes much sense (and even less on an area). Furthermore the datum used will also change this. I know this is not part of the OGC stuff, but then again, you introduced this new (weird?) geographic type. Maybe you can elabotate a bit more on why we now have these two types.
Isaac @ MSDN is a new blog from "Isaac Kunen is a Program Manager on the SQL Server team working on Spatial support in SQL Server 2008." In a first post he introduces the basics of the implementation including the difference between the round earth and
I look forward to learning more details about the spatial features of Katmai.
One thing that I hope gets addressed early on is a "translation" feature to take data from existing GIS formats and convert them into the microsoft format (and vice versa). We use lots of boundary files from ESRI or MapInfo for things like county boundaries, zipcode boundaries, etc that would be useful for "point in polygon" type operations.
The Spatial sample for SQL2005 showed some initial promise for us, but there did not appear to be an easy way to convert complex boundaries from another format into something that could be used by it. If I am wrong about that, someone please post how this can be done.
Doesn't ESRI already provide GeoSpatial support for SQL Server using their SDE product?
You're right: distances in degrees---or areas in square degrees---doesn't make much sense. For geography, the results of operations will depend on which ellipsoid you use, so your SRID will matter. We'll return results in the unit of measure for the system, typically meters (or square meters in the area case).
I'll try to tackle the geography/geometry separation in a future post.
OK thanks, I'll look forward to that. I really don't see the reason for having the geographic datatype. From my point of view, it will just cause more confusion. Only reason I can see is for some weird bordering case at the date-line and poles where you would want a special type of indexing to handle these cases, but fact is that this is not in any way specific to Geographic - Projected data has the same issue.
To me, saying that "most people would want to use geographic" is clearly a statement indicating that You don't know the geospatial industry very well. Frankly, when I was working for a data vendor, I can't recall a single time where our clients requested data from us in geographic.
I think it depends on which “most people” you’re looking at. It’s true that most people in the GIS community use planar coordinates, but we’re looking to support people beyond the GIS establishment.
It’s true that the planar model is dominant in the established GIS market, but this is more for legacy reasons than anything else. Why would you work on such a distorted version of the Earth except for practicality? It’s easier to work with a flat piece of paper than a globe, but it distorts the data and introduces complexities that are now unnecessary now that most work is done on the computer. That said, we recognize that the GIS community is very heavily invested in planar mapping, so we’re supporting it.
For people without that legacy, however, there is little reason to work in anything other than a round-Earth model. GPS units output in geodetic coordinates, most data is available in geodetic coordinates, and visualization software can project to the plane on the fly for display. Why would someone fresh to the field deliberately distort things (and make the problem more complex) by projecting everything to the plane?
I'm not saying that geographic coordinates are bad. They have their place. I'm arguing that I don't see the need for two geometry models. It's just a spatial reference on the geometry, and the spatial reference specifies whether the units are degees, meters, feet or what-ever.
I think this is fantastic. It is about time, and makes complete sense, since GIS data has moved beyond the "GIS Professional" level to the masses. More people will understand geography than geometry.
<<It’s true that the planar model is dominant in the established GIS market, but this is more for legacy reasons than anything else>>
I wish the above statement was true. The fact is that even if we were starting from scratch, with no legacy to support, we would still probably utilize planar mapping and common map projections. The reason for this, of course is that the geodetic coordinate systems are based on a mathematical 'approximation' of the earth (the ellipsoid), not the true shape of the earth (the geoid). Of course, none of this matters if you are simply plotting points of interest, but makes a world of difference when it comes to subdivision boundaries, and other locations requiring pinpoint accuracy.
It is great to finally see spatial support coming. I hope it will follow Oracle's lead in providing a complete spatial model that will eliminate the need for middleware like ESRI's SDE.
How are you handling topology? Oracle uses a persistent (explicit) topology data model while ESRI uses on-the-fly topology discovery. What path are you going down for topology?
This is a little bit less of an introductory post than the last one , but there was quite a bit of discussion
Do you have plans to eventually come out with client tools to work with spatial data stored in SQL2K8, integrating those 70+ spatial functions in some Microsoft Suite, such as SS Management Studio or BI packages (SSIS)?
I see this is a great move to help larger enterprises integrate GIS with Business data. Just saw that there are already spatial ETL tools to help migrate data from hundreds of other spatial formats into SSIS (http://www.safe.com/microsoft).
Looking forward to the next post on Spatial...
I feel so old fashioned not being able to understand the world of SQL going beyond 2D. I found this great
I'm looking forward to seeing this. Currently I use MapInfo SpatialWare, which allows me to store my geometries as a blob in the SQL server table. From what I understand, this new spatial support will allow me to do the same thing (no like ArcSDE requiring middleware).
I can see some benefit in having the two data types (the maths is one), and I hope that as part of this development there will be functions within the database to convert between geographic and geometry datatypes (co-ordinate conversion). If this functionality is not available, then the product will have very limited value, as you would have to use a GIS to perform this conversion.
I am also interested in the precision of the stored geometries. I often work with survey accurate data, and some software stores this as a series of floating point values (which degrades the accuracy of the data), is this the case with these new data types?
Do these datatypes store 2D, 2.5D or 3D geometries (even a GPS will give co-ordinates in 3 dimensions)?