- My Overheating IBM ThinkPad T60p Laptop
-
My faithful old IBM T60p laptop has been with me for years. Over time, and every so slowly, it has been running hotter and hotter (especially when running complex spatial queries on SQL Server 2008 or generating map-based reports using the new SQL Server 2008 R2 Report Builder… ;-). A few weeks ago it got so hot that I had to take it off my lap … and then things went from bad to worse. Here’s the tale…
I decided that it was time to upgrade to Windows 7. I ordered a new Seagate 7200rpm 320GB hard drive for the laptop. I took the old internal hard drive out and put it into a NexStar3 USB2.0/eSATA 2.5-inch enclosure …

… and proceeded to do a fresh install of Windows 7.
The Windows 7 install went flawlessly. Even the most difficult (anticipated) parts of the install – new drivers for printers and other devices - went very smoothly. In my case, Windows 7 found all of the drivers automatically over the Net including the driver for the eSATA Express Port card necessary to connect to my NexStar3-enclosed hard drive with all my data on it. So far so good.
Then bad things begin to happen – for no apparent reason, the laptop started to shut down all by itself. After having this happen a few times (Windows 7, btw, recovered very nicely each time), I concluded that the shut downs occurred when the machine got very, very hot – perhaps a heat related shutdown? I did a web search to see if others were having this problem. I appeared to be all alone. I speculated that it might be one of the following reasons:
1. My use of an underpowered 65W power supply (its supposed to be a 90W supply but that brick is much larger and I had consequently been using the smaller power supply for the last couple of years).
2. The new operating system, Windows 7. I imagined that Windows 7 was using the CPU or GPU (or both) just a little bit harder than Vista and this was pushing the laptop “over the edge”.
After swapping out the 65W power brick for the proper 90W unit, it was still overheating and shutting down. I figured it had to be Windows 7. Desperate, I sought advice from the hardware technician who supports the SQL Server engineering team. He had a very simple suggestion:
1. Shut the machine down
2. Use compressed air to blow out accumulated dust from the CPU/GPU ventilation system in the upper left-hand corner of the laptop.
I was skeptical but, having nothing to lose, gave it a try. I liberally blasted both cooling ports on the laptop (see illustration, below)
Several of the air blasts dislodged a noticeable amount of dust. At the time it didn’t seem like enough to make any significant difference…
I powered up the laptop. Even after several hours of continuous use the machine was running cool and behaving perfectly. It’s now several days later and the laptop is running like new. Windows 7 is a delight. I installed SQL Server 2008 R2 on it easily and moved all of my spatial databases from the external hard drive to the laptop. I even got up the nerve to try the new Office 2010 beta distribution and again no problems.
The technician did say that it still might warrant a physical inspection of the cooling ports and further dust/cruft removal but this simple trick worked miracles on this particular machine.
- A Spatially-Enabled Dinner In Austin, Texas: Some Observations Outside the Office
-
Not having posted for a while, I thought I'd take a few minutes and provide some thoughts on spatial after successfully negotiating dinner here in Austin, Texas (October 6, 2009). I'm in town for the Texas GIS Forum, one of my favorite conferences, and always seek out barbeque when I can. A quick check on Bing Maps verified that my hotel was very close to Rudy's Bar-B-Q. I remembered that Rudy's was on a one-way road, next to a freeway, and wanted to verify my route using Live Search on my Windows Mobile Phone. The route that it provided seemed counter-intuitive: it wanted to take me across the freeway and have me reverse my route at an overpass (see below).
Bing Maps Route
I chose a more straight-forward route and finally coerced Live Search on my mobile to guide me that way (after repeated, "reroute from current location?" messages...). An added bonus was that Rudy's had an entrance off of my newly found route and I got there in a very direct fashion.
When I got back to the hotel, I decided to see how Google Maps handled the same route request. Google did the right thing and provided an almost ideal route (see below), though it is curious that Research Blvd, which parallels Highway 183 on both sides, is not shown - curious. It too missed the back entrance to the restaurant, but that's a minor detail.
Google Maps Route
When I got to Rudy's, I fired up the GPS-Photo tool on my mobile phone and took a series of images.
Entrance to Rudy's on Research Blvd.
View from parking lot
Outdoor seating is available...
For dinner, I had the lean brisket, white bread, beans, potato salad and an IBC Cream Soda - all for $12.19. The foot was great but the beans needed some spicing up. I added a bit of their famous barbeque sauce (below) to made the beans quite tasty... It was 91 degrees outside and quite humid (from my West Coast environment viewpoint) at 6:00 PM in the evening. Perfect barbeque weather.
Geocoded Barbeque Sauce (latitude 30.414167, longitude: -97.7467, WGS84)
When I got back to the hotel, I loaded each of the GPS-enabled images from the mobile phone into my favorite photo application: SpatialAlbum. I'll blog more about this application at a later point, but, in short, it allows you to load GPS-enabled digital images, using a drag-an-drop interface, into SQL Server 2008. It uses a lot of SQL Server 2008 goodies: spatial, sparse columns, filestreams, the new datetime2 type and overlays all spatial data on top of Bing Maps.
SpatialAlbum application
Here is the location of the photos I took using the Bing Maps feature of SpatialAlbum:
Rudy's location on Bing Maps
The following are some Bing Maps displays in SpatialAlbum showing various view/featues:
Synoptic view in the Aerial mode
Bird's eye view from the south
Bird's eye view showing back entrance...
Bird's eye view from the north showing parking lot
Bird's eye view of the main entrance off Research Blvd.
And there you have it - more information than you ever wanted on an obscure topic. Maybe I should do a WikiPedia page on Rudy's...
Technorati Tags:
SQL Server 2008 R2,
spatial,
sparse columns,
datetime2,
filestream,
Bing Maps,
Google Maps,
Bird's eye,
SpatialAlbum,
Austin,
Texas,
Rudy's,
barbeque,
IBC,
Texas GIS Forum
- Working with Invalid Data and the SQL Server 2008 Geography Data Type, Part 2
-
In a long past post, Working with Invalid Data and the SQL Server 2008 Geography Data Type, Part 1b (August 2008), I discussed techniques for dealing with invalid data and the geography data type. In my last post, I listed the new geography validation functions in the SQL Server Spatial Tools project on CodePlex. In this post, I'm going to use three of these new functions:
- IsValidGeographyFromGeometry
- MakeValidGeographyFromGeometry
- MakeValidGeographyFromText
For this exercise, I'm going to use the Zillow Neighborhood Boundaries
. Since the neighborhood data (polygons) are delivered as shapefiles, I used the well respected Shape2SQL data loader from SharpGIS. Note that there are 42 individual state-based shapefiles which will need to be individually downloaded and then loaded into SQL Server. Since the goal of this exercise is to end up with geography features, the data must be in geographic coordinates to start with. The projection file (.prj) associated with each shapefile confirmed this, containing the following information:
GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]
This is EPSG:4269, so that's the SRID I used in Shape2SQL.
My goal is to load this data into a column of type geography so it would make sense to load the data directly into a geography column. I'll save you the suspense and tell you right now that that some of this data will not load directly into a geography column - big surprise, right? Consequently, I loaded each shapefile's spatial data directly into a table, ZillowNeighborhoods, with a geometry column named GEOM. After the data loading was completed, I added a column of type geography named GEOG:
ALTER TABLE ZillowNeighborhoods
ADD GEOG geography NOT NULL
Being bold, I next attempted to use the new MakeValidGeographyFromGeometry function to update the new geography column:
UPDATE ZillowNeighborhoods
SET GEOG = dbo.MakeValidGeographyFromGeometry(GEOM)
Well, suffice-it-to-say, it wasn't quite that easy, even with our new fangled function, which returned the following error message:
Msg 6522, Level 16, State 1, Line 1
A .NET Framework error occurred during execution of user-defined routine or aggregate "MakeValidGeographyFromGeometry":
System.ArgumentException: 24200: The specified input does not represent a valid geography instance.
So, what to do next? Why not use the IsValidGeographyFromGeometry function so identify the candidate group of which geometries contained invalid data from a geography-perspective:
SELECT ID FROM ZillowNeighborhoods WHERE dbo.IsValidGeographyFromGeometry(GEOM) = 0
--Results:
--3499
--4319
--4328
--4428
--4458
--5182
--5201
--5214
--5224
Now I needed to figure out which of these geometries could not be converted, so I used the following query, once for each ID:
SELECT dbo.MakeValidGeographyFromGeometry(GEOM) FROM ZillowNeighborhoods WHERE ID = 3499
The queries which failed allowed me to isolate the problem geometries to the following rows (identified by the ID field):
3499, 4328, 5224
Before I go any further, let's update the geography column, sans the problematic polygons:
UPDATE ZillowNeighborhoods
SET GEOG = dbo.MakeValidGeographyFromGeometry(GEOM)
WHERE ID NOT IN (3499, 4328, 5224)
Now, let's investigate the problem geometries to see if we can determine what the issues are that prevent conversion. Here is the first troublesome geometry (ID = 3499):
SELECT GEOM FROM ZillowNeighborhoods WHERE ID = 3499
Here is the returned geometry in the Spatial results tab in Management Studio:

A careful visual analysis didn't find any obvious defects in the geometry.
Then, I decided to fiddle with the original geometry using the Reduce method with very small tolerances. Interestingly enough, I was able to get the geometry to convert to a geography using the following query:
SELECT dbo.MakeValidGeographyFromGeometry(GEOM.Reduce(.0000001))
FROM ZillowNeighborhoods WHERE ID = 3499
So what is happening here? First of all, the tolerance used in the Reduce method is very small. So how many vertex points were removed from the original geometry by Reduce? It turns out that 6 vertices were removed. A quick discussion with the SQL Server spatial developers revealed that during current validation processing some very close vertices can shift slightly and cause edges to cross/overlap (remember that geographic edges are not straight lines...). In the next release of SQL Server (SQL11), we have changed the underlying methodology and eliminated this issue. For now, we will have to workaround this issue using Reduce with very small tolerances.
By chance, I happened to look at the comments on my August 2008 post, referenced above, and found that I was not the first one to discover this technique:
-----------------------------------------------------------------------
# Craig Tadlock said on February 16, 2009 9:36 PM:
This is an excellent post. It works for 99.9% of the geometries Ive had to deal with. For the other .1%, try adding this..
Reduce the geometry value by a very small amount 1st. I'm not actually sure what this does internally, but it's solved most my issues.
Example...
select geography::STGeomFromWKB(geom.MakeValid().Reduce(.00000001).STUnion(geom.STStartPoint()).MakeValid().STAsBinary(), 4326) from tl_2008_us_csa
-----------------------------------------------------------------------
So, my hat's off to you, Craig, a true invalid geography pioneer!
-----------------------------------------------------------------------
NOTE: I've subsequently done a little more investigation and found that Alastair (author of the first published SQL Server Spatial book, Beginning Spatial with SQL Server 2008) had already figured this trick out in his excellent post, Fixing Invalid Geography Data (Oct. 18, 2008). Nothing like "discovering" what everybody else already knew ;-)
-----------------------------------------------------------------------
To complete the investigation, let's see if we can isolate the vertexes which were removed by Reduce. To do this I buffered the difference polygons between the original and generalized polygons using the following query:
--Find problem areas and mark with a buffer polygon
DECLARE @g0 GEOMETRY
DECLARE @g1 GEOMETRY
SELECT @g1 = GEOM.Reduce(.00000001) FROM ZillowNeighborhoods WHERE ID = 3499
SELECT @g0 = GEOM FROM ZillowNeighborhoods WHERE ID = 3499
SELECT @g0.STDifference(@g1).STBuffer(.001)
Using Safe Software's Feature Manipulation Engine's (FME) Universal Viewer, I've created the following visualization where I’ve overlaid the original and generalized geometry over each other with the modified areas indicated by the buffer regions generated with the above query. I then marked the obvious vertexes which were removed by the Reduce in the Windows Paint program. The remaining vertexes where so close to other vertexes that it was not possible to actually identify them, though it is clear where they are.
Needless-to-say, the generalized polygon with this tiny tolerance did not materially change the polygon's basic geometric quality. So I would posit that this technique is a valid method. To complete this sequence, I updated the row where ID=3499 with the following query:
UPDATE ZillowNeighborhoods
SET GEOG = dbo.MakeValidGeographyFromGeometry(GEOM.Reduce(.00000001))
WHERE ID = 3499
So what about the other two polygons which would not convert (ID's: 4328, 5224)? Well it turns out the "Reduce" technique works for these too. But there is a twist ... it wouldn't be any fun if I was done, right?
It turns out that when run through the Reduce-based technique, each of these geometries produces GeometryCollections:
-----------------------------------------------------------------------
NOTE: The SQL Server Spatial Team has entertained many inquiries on why Reduce, under certain circumstances, would take higher dimension objects (such as polygons) and create lower dimension objects (such as lines and points) from them (most other spatial databases maintain the dimensionality of the spatial object when generalizing). The following examples are a good illustration of how to capitalize on the behavior of Reduce when it encounters extremely thin polygons.
-----------------------------------------------------------------------
Here is the Well-Known Text (WKT) for ID = 5224 from the following query:
SELECT GEOM.Reduce(.0000001).STAsText() FROM ZillowNeighborhoods WHERE ID = 5224
GEOMETRYCOLLECTION (LINESTRING (-86.7028556317091 36.198983579874039, -86.702864557504654 36.198975399136543), LINESTRING (-86.72271192073822 36.195485174655914, -86.7227138876915 36.195484921336174), LINESTRING (-86.725671902298927 36.1651765704155, -86.7256792485714 36.165166437625885), POLYGON ((-86.724177449941635 36.164932250976562, -86.72216959297657 36.165002837777138, -86.720163896679878 36.165325194597244, ... , -86.725447744131088 36.164976820349693, -86.724177449941635 36.164932250976562)))
So what's with the LINESTRING's? Let's take a look.
First, let's create geometry instances from each of the LINESTRINGs. Since the LINESTRINGs are very small, let's buffer them to help make the area where there occur along the polygon visible:
DECLARE @g1 VARCHAR(MAX) = 'LINESTRING (-86.7028556317091 36.198983579874046, -86.702864557504654 36.19897539913655)'
DECLARE @g2 VARCHAR(MAX) = 'LINESTRING (-86.72271192073822 36.195485174655914, -86.7227138876915 36.195484921336167)'
DECLARE @g3 VARCHAR(MAX) = 'LINESTRING (-86.725671902298942 36.1651765704155, -86.7256792485714 36.165166437625885)'
SELECT GEOMETRY::STGeomFromText(@g1,4269).STBuffer(.001)
UNION ALL
SELECT GEOMETRY::STGeomFromText(@g2,4269).STBuffer(.001)
UNION ALL
SELECT GEOMETRY::STGeomFromText(@g3,4269).STBuffer(.001)
UNION ALL
SELECT GEOM FROM ZillowNeighborhoods WHERE ID = 5224
Here is the view from Management Studio:
Let's "zoom" in on a single problem area:
-----------------------------------------------------------------------
-- Isolate one of the problem areas which is identified by the linestring ...
-----------------------------------------------------------------------
-- Define linestring
DECLARE @g1 VARCHAR(MAX) = 'LINESTRING (-86.7028556317091 36.198983579874046, -86.702864557504654 36.19897539913655)'
-- Clip the original geometry instance by intersecting a buffer around the linestring with the original geometry instance
SELECT GEOMETRY::STGeomFromText(@g1,4269).STBuffer(.0001).STIntersection(GEOM)
FROM ZillowNeighborhoods WHERE ID = 5224
UNION ALL
SELECT GEOMETRY::STGeomFromText(@g1,4269).STBuffer(.00001)
Here is how this looks in Management Studio:
So the LINESTRING is a "spike". The "line" in the brown buffer region is actually an extremely thin extension of the base polygon. It turns out that the rest of the LINESTRINGs are also spikes. The last geometry (ID = 4328) was likewise festooned with spikes. A quick recap on why spikes cause validation issues, from above: "...during current validation processing some very close vertices can shift slightly to cause edges to cross/overlap".
To create the final geometry objects, ready for conversion to geography, simply extract the POLYGON WKT from the GEOMETRYCOLLECTIONS returned by the Reduce-based processing and create geometry instances which can then be used to update the original geometry after conversion by MakeValidGeographyFromText (note the use this new function):
DECLARE @geom VARCHAR(MAX) = POLYGON ((-86.724177449941635 36.164932250976562, -86.72216959297657 36.165002837777138, ..., -86.725447744131088 36.164976820349693, -86.724177449941635 36.164932250976562))'
UPDATE ZillowNeighborhoods
SET GEOG = dbo.MakeValidGeographyFromText(@geom,4269)
WHERE ID = 5224
Let's fix the last neighborhood polygon, ID=4328. A quick check shows that the core neighborhood polygon is actually a MULTIPOLYGON:
SELECT * FROM ZillowNeighborhoods WHERe ID = 4328
A quick check of the structure of the MULTIPOLYGON with the following query...
SELECT dbo.MakeValidGeographyFromGeometry(GEOM.Reduce(.00000001)).STAsText()
FROM ZillowNeighborhoods
WHERE ID = 4328
...reveals that there are 4 polygons (expected) and 1 line (a "spike") in the GEOMETRYCOLLECTION. I removed the LINESTRING from the WKT and, using the following query, updated the row:
DECLARE @geom VARCHAR(MAX) = 'GEOMETRYCOLLECTION (POLYGON ((-73.8411675542593 40.795716315507896, ..., -73.8411675542593 40.795716315507896)), POLYGON ((-73.853304713964462 40.7884576357901, ..., -73.853304713964462 40.7884576357901)), POLYGON ((-73.85657873749733 40.78686719387769, ..., -73.85657873749733 40.78686719387769)), POLYGON ((-73.85907343775034 40.785239797085531, ..., -73.85907343775034 40.785239797085531)))'
DECLARE @geog GEOGRAPHY
SELECT @geog = dbo.MakeValidGeographyFromText(@geom,4269)
UPDATE ZillowNeighborhoods
SET GEOG = @geog
WHERE ID = 4328
This completes the exercise, updating the ZillowNeighborhoods table with a column of type geography.
Technorati Tags:
SQL Server,
2008,
Spatial,
MakeValidGeographyFromGeometry,
MakeValidGeographyFromText,
IsValidGeographyFromGeometry,
Shape2SQL,
SharpGIS,
CodePlex,
Geography,
Validation,
Zillow,
Neighborhoods,
Safe Software,
Feature Manipulation Engine,
FME,
Universal Viewer
- Update to the SQL Server Spatial Tools project on CodePlex
-
As Isaac noted in his most recent post, the SQL Server Spatial Tools project on CodePlex has been updated with several new geography functions. This post provides additional detail on the new features:
- IsValidGeographyFromGeometry
- IsValidGeographyFromText
- MakeValidGeographyFromGeometry
- MakeValidGeographyFromText
- ConvexHullGeography
- ConvexHullFromText
Here are descriptions for each:
bool IsValidGeographyFromGeometry(SqlGeometry geometry)
Check if an input geometry can represent a valid geography without throwing an exception. This function requires that the geometry be in longitude/latitude coordinates and that those coordinates are in correct order in the geometry instance (i.e. latitude/longitude not longitude/latitude). This function will return false (0) if the input geometry is not in the correct latitude/longitude format, including a valid geography SRID.
bool IsValidGeographyFromText(string inputWKT, int srid)
Check if an input WKT can represent a valid geography. This function requires that the WTK coordinate values are longitude/latitude values, in that order and that a valid geography SRID value is supplied. This function will not throw an exception even in edge conditions (i.e. longitude/latitude coordinates are reversed to latitude/longitude).
SqlGeography MakeValidGeographyFromGeometry(SqlGeometry geometry)
Convert an input geometry instance to a valid geography instance. This function requires that the WKT coordinate values are longitude/latitude values, in that order and that a valid geography SRID value is supplied.
SqlGeography MakeValidGeographyFromText(string inputWKT, int srid)
Convert an input WKT to a valid geography instance. This function requires that the WKT coordinate values are longitude/latitude values, in that order and that a valid geography SRID value is supplied.
SqlGeography ConvexHullGeography(SqlGeography geography)
Computes ConvexHull of input geography and returns a polygon (unless all input points are collinear).
SqlGeography ConvexHullGeographyFromText(string inputWKT, int srid)
Computes ConvexHull of input WKT and returns a polygon (unless all input points are collinear). This function does not require its input to be a valid geography. This function does require that the WKT coordinate values are longitude/latitude values, in that order and that a valid geography SRID value is supplied.
- "Cartographic Adjustment" of Spatial Data for SQL Server Reporting Services, Part 5
-
The Report Builder, a component of Reporting Services, described in this post will be available as a separate download when the upcoming CTP2 release of SQL Server 2008 Release 2 (R2) is made available to the public, currently scheduled for August 2009.
This was to be the last post in this series but it's too interesting to stop here. As of this writing, I'm planning one additional post in this series which will expand on the new mapping capabilities of SQL Server Reporting Services. Please note that these new capabilities will be introduced in CTP2 of the upcoming SQL Server 2008, R2 release. Today's post will introduce the new SQL Server Report Builder and its new mapping capabilities. Here is the workflow presented so far:
- Locate data source and download (Part 1)
- Load the data into SQL Server (Part 1)
- Remove unwanted features, simplify Alaska and Hawaii spatial features (Part 2)
- "Reposition" Alaska and Hawaii cartographically (Part 3)
- Locate and load State-based population data (Part 4)
- Use the data in the new SQL Server Report Builder
- More SQL Server Report Builder
So that you don't have to immediately scroll to the end of this post to see the final result, here is a reduced scale version of the final map report, noting, of course, the repositioned Alaska and Hawaii features ;-):
For those of you interested in the Report Builder workflow to create this map, please keep reading.
When the new Report Builder is first opened, a design canvas is presented, allowing you to select on of three report wizards: Table, Chart or Map. Since we have just finished adjusting a set of spatial data, the Map wizard looks like the ticket:
When Map is selected, you will be presented with a wizard which will take your through the creation of your first map report.
The first wizard menu presents you with a selection of spatial data sources. For this post we are going to use SQL Server spatial data:
The next step is to create a new data source connection:
Selecting "New..." brings up the next screen:
We have the option to create a new data source name but for this exercise, we won't change the default name. Selecting the "Build..." button displays the Connection Properties menu:
Here we will define our database server instance, authenticate the connection and connect to a database:
Hitting "OK" brings up the next panel. Since this looks good, we'll hit "OK":
This looks good, so we'll hit "Next >":
The query designer is now presented. We'll select "Edit As Text":
Here is the basic query we intend to use:
SELECT STATEFP, GEOM FROM US_STATES_INSET
This initial query selected columns STATEFP and GEOM with the goal of using the State FIPS Codes in the STATEFP column as the join column with the US_STATES_POPULATION table. The join column in the US_STATES_POPULATION table is the STATES column, also containing FIPS Codes for the States. A quick review of the two join columns revealed a basic incompatibility despite the fact that they were both defined as character columns:
| STATEFP | STATES | NAME |
| 01 | 1 | Alabama |
| 02 | 2 | Alaska |
| 04 | 4 | Arizona |
| 05 | 5 | Arkansas |
| 06 | 6 | California |
| 08 | 8 | Colorado |
| 09 | 9 | Connecticut |
to make the two join columns compatible, we can cast the character columns to integer. Here is the rewrite for the current query:
SELECT CAST(STATEFP AS INT) AS STATEFP, GEOM FROM US_STATES_INSET
We will use this query in the query designer. Hitting "!" button will execute this query, with the results displayed in the bottom window, below:
Selecting the "Next >" button brings up the following menu:
This all looks good, so we will select "Next >" here also and bring up the visualization menu, where we will choose "Color Analytical Map":
Choosing "Next >" brings up the "Choose the analytical dataset" menu. Our analytical data will come from the US_STATES_POPULATION table, se we nee to select the "Create a dataset" radio button:
Choosing "Next >" brings up the menu panel which allows us to choose a connection to the new dataset. Since the US_STATES_POPULATION table is in the same database as the spatial table (US_STATES_INSET), we can use the existing connection, DataSource1:
Choosing "Next >" bring us to the query designer screen for the analytical data:
As with the query designer for the spatial data, we well choose "Edit As Text" and use the following query:
SELECT CAST(STATE AS INT) AS STATE, POPESTIMATE2008 FROM US_STATES_POPULATION
Our intent is to join the two tables and display the 2008 population estimates (POPESTIMATE2008 column) for each state on the resulting map. As will recall, we are casting the State FIPS Code to integer to guarantee join compatibility on the join columns.
After selecting "Next >", we are offered the opportunity to specify the join columns:
Checking the "Key" box allows us to specify the analytical fields in the US_STATES_POPULATION table query we just specified - in this case Map Binding Field: STATEFP, Analytical Field: STATE:
Select "Next >" to bring up the color theme and data visualization menu. Here we select the data field (analytical field) which will be symbolized by different polygon fill colors based on, in this case, population:
Selecting "Finish >>" brings up the design canvas. In the version of the Report Builder that is being used for this post, there is a bug which prevents the map display from rendering the underlying spatial data correctly. In order to remedy this, we need to set the coordinate system for the map display (It currently defaults to the geography data type. Since the GEOM column in the US_STATES_INSET table is of type geometry, the design-time display is not correct):
To set the correct coordinate system, left-mouse-click on the design canvas to select and then right-click with the mouse to bring up the contextual menu. What we are looking for are the "Viewport Properties..."
Select the "Viewport Properties..." allows us to set the Map coordinate system to Planar (X,Y). In future versions of the Report Builder, the coordinate system will be automatically detected.
Selecting "OK" completes the task and the design canvas correctly illustrates the United States inset map. Note that the map is not yet rendered and the design canvas map display is simply a placeholder.
In the design canvas, we can adjust the map a bit, moving the color scale bar to the right and adding a title. When done, we can render the final map by choosing "Run" from the main menu ribbon of the Report Builder:
When rendering is completed, the following report is produced:
In the next post, Part 6, we will explore some additional capabilities of the Report Builder to finish off the series.
- "Cartographic Adjustment" of Spatial Data for SQL Server Reporting Services, Part 4
-
In the previous posts I indicated that this would be the last post in this series but it turned out that I was a bit optimistic. In order to make the final Report Builder post relevant, some interesting data for analysis was needed. In this post we describe where to find and how to load State-based population data - our "interesting data".
- Locate data source and download (Part 1)
- Load the data into SQL Server (Part 1)
- Remove unwanted features, simplify Alaska and Hawaii spatial features (Part 2)
- "Reposition" Alaska and Hawaii cartographically (Part 3)
- Locate and load State-based population data
- Use the data in the new SQL Server Report Builder
LOCATING STATE POPULATION DATA
Our search for population data by State was satisfied at the US Census Bureau Population Estimates site. There are a number of data files which are available, but I chose Population, Population change and estimated components of population change: April 1, 2000 to July 1, 2008 (NST-EST2008-alldata).
This data is delivered as commas-separated-value (CSV) ASCII. This data was opened in Excel and saved as Text (Tab delimited).
LOADING STATE POPULATION DATA
Now we are going to use FME to do something for which it is not well known - loading data with no spatial component.
ADDING THE STATE POPULATION DATA AS THE SOURCE DATASET
In FME-speak, all ASCII data is format "Comma Separated Value (CSV)". Consequently, we choose CSV as our format - never mind that our data is tab separated values. Next we locate the data using the "..." menu button. After both the Format: and the Dataset: text fields are populated, choose the "Settings..." button.
The "Settings..." button brings up a new menu panel which allows us to correctly define our field separator as "tab". Since our ASCII file contains field names, we want to make sure that this record is not interpreted as data. In fact, we will use the field names as the columns names in the final database table. The CSV File Preview provides visual verification that we are correctly interpreting the file structure.
The source data now appears on the workbench canvas:
ADDING THE DESTINATION DATASET
The destination dataset will be our new database table which we will name us_states_population. Here is the first menu panel that appears when we add a new Destination Dataset:
Our first task is to choose the correct output format. We have two potential choices for SQL Server. In this case, since the new database table will not have a spatial column, we want the Microsoft SQL Server Non-spatial (MSSQL_ADO) format:
Note: If we chose the Microsoft SQL Server Spatial (MSSQL_SPATIAL) format, the data load would work correctly but would have a spatial column defined in the output table, populated with NULL values.
Next, we want to select the "Settings..." button. This allows us to define our server instance, dataset and authentication. The use of the term "Dataset" is a bit confusing for database users. What FME is asking for, in this case, is the database name (in our case, temp_spatial):
With the destination defined, hit "OK" to continue...
FME next asks if you would like to add a new feature type to the dataset. This can be interpreted as "Would you like to add a new table to the database?", so we need to answer Yes:
The Feature Type Properties (a.k.a. Table Properties) panel asks for the Feature Type Name. We supply the name of the database table to hold the population data: us_states_population
We hit OK at this point and the destination end point appears on the workbench canvas:
Next, we connect the source and the destination and the upper connection triangles turn green, indicating success. Now, right click on the destination data and choose "Copy Attributes From Feature Type...":
The Copy Attributes menu panel will now display. Choose, "Copy Attributes From:" the only option will be the correction option, in this case:
The workbench canvas now appears as follows. It is unclear why FME choose to rename the STATE column but it did (to STATE00). This causes the implicit connections between source and destination columns to be undefined (yellow connection triangle on source and red connection triangle on the destination).
To associate the STATE (source) and STATE00 (destination) columns, drag a line between the two:
The final task is to check the data types and definitions associated with each column. Here are the original definitions:
The SUMLEV and STATE00 columns are both defined as float. While we are not interested in the SUMLEV column, the STATE00 column needs to be defined as char 2 for compatibility with the STATEFP column in the us_states table, to which we will join this table later on:
LOAD THE STATES POPULATION DATA
Execute the FME workspace to load the data into a SQL Server table, us_states_population. The last step is to rename the STATE00 to STATE in Management Studio.
We are now ready to create our first map report in the Report Builder. This will be described in Part 5.
Technorati Tags:
Non-spatial,
data,
population,
Census,
loading,
Safe Software,
FME 2010 Beta,
FME,
Feature Manipulation Engine,
SQL Server,
2008,
non-spatial
- "Cartographic Adjustment" of Spatial Data for SQL Server Reporting Services, Part 3
-
In the previous posts we have covered items 1,2 and 3 of the workflow. In this post we address item 4.
- Locate data source and download (Part 1)
- Load the data into SQL Server (Part 1)
- Remove unwanted features, simplify Alaska and Hawaii spatial features (Part 2)
- "Reposition" Alaska and Hawaii cartographically
- Use the data in the new SQL Server Report Builder
SQL SERVER SPATIAL TOOLS ON CODEPLEX
Now that we have our spatial data suitably prepared, we need to move the Alaska and Hawaii spatial features to new locations. In order to do this, we need the help of a new function which we have published on the SQL Server Spatial Tools site at CodePlex.com. This function, AffineTransform, is one of several new functions, aggregates and data types which we have made available with source code. It is beyond the scope of this post to describe SQL Server Spatial Tools, but if you are not familiar with it, you should take a look.
REGISTERING THE SQL SPATIAL TOOLS ASSEMBLY
SQL Spatial Tools is delivered as an assembly (SQLSpatialTools.dll) which must be registered on the database you wish to use. The registration process is delivered as a register.sql file which needs two modifications (in red) before it is run:
-- Insert your database
use [Your_Database]
go
-- Enable CLR
sp_configure 'clr enabled', 1
reconfigure
go
-- Insert the path to the SQLSpatialTools assesmbly here
create assembly SQLSpatialTools from 'C:\Apps\SQLSpatialTools.dll'
go
...
REPOSITIONING ALASKA AND HAWAII, PASS 1
With SQL Server Spatial Tools registered to your database, the AffineTransform function is now available for use in T-SQL. The AffineTransform is a static function with 3 parameters: Translate, Scale and Rotate. Setting the initial values for these parameters, especially scale and rotate is based on an educated guess. The Scale and Translate parameters are addressable via independent X and Y component values. The Rotate parameter has a single value, with positive numbers rotating the object in a counter-clockwise direction, negative numbers in a clockwise direction. Displaying the grid lines (Show grid lines in Management Studio) can be used to assist in setting the X and Y Translate values.
Here is the T-SQL for the first pass at repositioning Alaska and Hawaii:
USE [Your_Database]
GO
----------------------------------------------------------------
-- Reposition Alaska
----------------------------------------------------------------
DECLARE @a GEOMETRY
SELECT @a = GEOM FROM us_states_inset2 WHERE STUSPS = 'AK'
SET @a = AffineTransform::Translate(-1500000, -2500000).Apply(@a)
SET @a = AffineTransform::Scale(.75, .75).Apply(@a)
SET @a = AffineTransform::Rotate(15).Apply(@a)
----------------------------------------------------------------
-- Reposition Hawaii
----------------------------------------------------------------
DECLARE @h GEOMETRY
SELECT @h = GEOM FROM us_states_inset2 WHERE STUSPS = 'HI'
SET @h = AffineTransform::Translate(2500000, -400000).Apply(@h)
SET @h = AffineTransform::Scale(2, 2).Apply(@h)
SET @h = AffineTransform::Rotate(15).Apply(@h)
----------------------------------------------------------------
-- Display Original and Repositioned Features
----------------------------------------------------------------
SELECT @a -- Repositioned Alaska
UNION ALL
SELECT @h -- Repositioned Hawaii
UNION ALL
SELECT GEOM FROM us_states_inset2 -- Original Feature Positions
GO
The red arrows show the displacement of the Alaska and Hawaii features, after the first pass, below:
REPOSITIONING ALASKA AND HAWAII, PASS "n"
I'd like to say that I was able to achieve final positioning of Alaska and Hawaii in 2 passes but it did take a few more iterations. Here is the T-SQL for my final pass, achieving the desired positioning of Alaska and Hawaii:
USE [Your_Database]
GO
----------------------------------------------------------------
-- Reposition Alaska
----------------------------------------------------------------
DECLARE @a GEOMETRY
SELECT @a = GEOM FROM us_states_inset WHERE STUSPS = 'AK'
SET @a = AffineTransform::Translate(-1500000, -4500000).Apply(@a)
SET @a = AffineTransform::Scale(.5, .5).Apply(@a)
SET @a = AffineTransform::Rotate(27).Apply(@a)
----------------------------------------------------------------
-- Reposition Hawaii
----------------------------------------------------------------
DECLARE @h GEOMETRY
SELECT @h = GEOM FROM us_states_inset WHERE STUSPS = 'HI'
SET @h = AffineTransform::Translate(5000000, -400000).Apply(@h)
SET @h = AffineTransform::Scale(1.5, 1.5).Apply(@h)
SET @h = AffineTransform::Rotate(30).Apply(@h)
----------------------------------------------------------------
-- Display Original and Repositioned Features
----------------------------------------------------------------
SELECT @a -- Repositioned Alaska
UNION ALL
SELECT @h -- Repositioned Hawaii
UNION ALL
SELECT GEOM FROM us_states_inset -- Original Feature Positions
GO
The original location and the final repositioned location of Alaska and Hawaii are shown below:

UPDATE FINAL POSITIONS IN THE TABLE
With Alaska and Hawaii in final position, the last step is to update the geometry columns (GEOM) for the Alaska and Hawaii rows with the new geometry values:
----------------------------------------------------------------
-- Reposition Alaska
----------------------------------------------------------------
DECLARE @a GEOMETRY
SELECT @a = GEOM FROM us_states_inset WHERE STUSPS = 'AK'
SET @a = AffineTransform::Translate(-1500000, -4500000).Apply(@a)
SET @a = AffineTransform::Scale(.5, .5).Apply(@a)
SET @a = AffineTransform::Rotate(27).Apply(@a)
----------------------------------------------------------------
-- Reposition Hawaii
----------------------------------------------------------------
DECLARE @h GEOMETRY
SELECT @h = GEOM FROM us_states_inset WHERE STUSPS = 'HI'
SET @h = AffineTransform::Translate(5000000, -400000).Apply(@h)
SET @h = AffineTransform::Scale(1.5, 1.5).Apply(@h)
SET @h = AffineTransform::Rotate(30).Apply(@h)
----------------------------------------------------------------
-- Update Alaska Geometry Instance With New Position
----------------------------------------------------------------
UPDATE us_states_inset
SET GEOM = @a
WHERE STUSPS = 'AK'
----------------------------------------------------------------
-- Update Hawaii Geometry Instance With New Position
----------------------------------------------------------------
UPDATE us_states_inset
SET GEOM = @h
WHERE STUSPS = 'HI'
----------------------------------------------------------------
-- Display Table
----------------------------------------------------------------
SELECT * FROM us_states_inset
And there you have it - Alaska and Hawaii neatly tucked under the southwest states, ready for use in the new mapping function in SQL Server Reporting Services.
- "Cartographic Adjustment" of Spatial Data for SQL Server Reporting Services, Part 1
-
I was recently presented with the problem of taking a US States dataset, in geographic coordinates, and creating a database table whose spatial column would render in a similar fashion to the map, below, with the Alaska and Hawaii insets.
The ultimate goal was to use the resulting table in the new map control of the upcoming Report Builder in SQL Server Reporting Services (SSRS).
APPROACH
To illustrate how do this, I choose to use Safe Software's Feature Manipulation Engine (FME 2010 Beta) and SQL Server 2008. Safe Software aficionados will point out that this could have been done completely in FME. Likewise, this also could have been done completely in SQL Server 2008, though it would have been much more difficult. The use of both tools to solve this problem represents a reasonable real-world approach.
WORKFLOW
Here is the basic workflow:
- Locate data source and download
- Load the data into SQL Server
- Remove unwanted features, simplify Alaska and Hawaii spatial features
- "Reposition" Alaska and Hawaii cartographically
- Use the data in the new SQL Server Report Builder
In this post we will cover workflow items 1 and 2. In subsequent post we will address the other workflow items.
DATA SOURCE
The original data came from the US Census Bureau's collection of geographic data:
http://www2.census.gov/geo/tiger/TIGER2008/
Specifically, the data was the Tiger Line 2008 US State ShapeFile (tl_2008_us_state) found at:
http://www2.census.gov/geo/tiger/TIGER2008/tl_2008_us_state.zip
Here is how the input Shapefile appears in the FME Universal Viewer:
DATA LOADING
Data loading was accomplished using the FME Workbench, using the following procedure:
- Identify the tl_2008_us_state.shp Shapefile as the input data source
- Project the spatial data into an Albers Equal Area projection suitable for the conterminous United States
- Writing the data to a SQL Server 2008 table using the geometry data type
Add a data source (Source Data), specifying the format as ESRI Shape, the dataset as tl_2008_us_state.shp and the coordinate system as EPSG: 4269 (NAD 83).
For visual appeal, an Albers Equal Area projection is chosen using the ESRIReprojector transformer. I wanted to use the Reprojector transformer but, to date, have not figured out how to locate an existing projection for the North America (-96 central meridian, etc.).
No we get to the important part - SQL Server Spatial, our format for the Destination Dataset. Choose settings, next.
The setting menu panel allows you to setup the database connection and specify the output spatial column and name:
Selecting OK bring you back to the Add Destination Dataset where you can select OK to move on...
I always find this next part in FME a wee bit confusing, but before we get into that, answer "Yes" to this menu.
When FME asks, in a database context, if you would like to add a new feature type to the data set, what is really trying to say is, "what would like to name the new database table?". In this case our new table table (FME Feature Type Name) will be set to us_states.
Hit "OK" and we now can see the (almost) completed FME workspace. Note that the destination end point is not yet "connected" to the Shapefile attribute data. You can tell this by the yellow connector on the ESRIReprojector transformer and the fact that no attributes are list below destination endpoint:
Right-clicking on the destination endpoint brings up a contextual menu which provides a number of choices. You want the "Copy Attributes From Feature Type" option:
In this case, you have only one choice and it's the correct one (what a coincidence ;-):
The workflow is now complete and the data load sequence is ready to run:
To run the workflow, select the green triangle "run" button from the FME menu:
During the run process, FME writes out all actions to a log file and displays the log to the screen. In this case, the log noted the following:
MS SQL Server (Spatial) Writer: Failed to write SRID because the feature's coordinate system could not be converted to an EPSG number
Let's take a look at the new table, us_states, in SQL Server Management Studio, to see how the SRID value is treated. Here is how the spatial data displays using the "Spatial results" tab after a SELECT * FROM us_states query:
To check the SRID values of each geometry instance we can use the following query:
SELECT geom.STSrid AS SRID FROM us_states
From this we can ascertain that when FME cannot figure out the proper EPSG-based SRID to assign to geometry instances, it does nothing and allows SQL Server to use its default SRID value of 0.
Technorati Tags:
SQL Server,
2008,
spatial,
Safe Software,
Feature Manipulation Engine,
FME,
Universal Viewer,
data loading,
Shapefile,
shape file,
SRID,
EPSG
- "Cartographic Adjustment" of Spatial Data for SQL Server Reporting Services, Part 2
-
In the previous post we covered items 1 and 2 of the workflow. In this post, we will cover item 3:
- Locate data source and download
- Load the data into SQL Server
- Remove unwanted features, simplify Alaska and Hawaii spatial features
- "Reposition" Alaska and Hawaii cartographically
- Use the data in the new SQL Server Report Builder
The challenge now is to remove unwanted spatial features which are not relevant to the Reporting Services map we want to create. Additionally, we will simplify the Alaska and Hawaii spatial features.
Let's define the features that we wish to complete remove:
- American Samoa
- Commonwealth of the Northern Mariana Islands
- Virgin Islands of the United States
- Guam
- Puerto Rico
To remove these features, we will use standard T-SQL queries:
DELETE FROM us_states WHERE NAME = 'American Samoa'
DELETE FROM us_states WHERE NAME = 'Commonwealth of the Northern Mariana Islands'
DELETE FROM us_states WHERE NAME = 'Virgin Islands of the United States'
DELETE FROM us_states WHERE NAME = 'Guam'
DELETE FROM us_states WHERE NAME = 'Puerto Rico'
GO
Now that we are done with the mundane, let's look at the interesting problem of taking the Alaska and Hawaii spatial features and simplifying them by removing the of out lying islands in the Aleutian Islands and the Hawaiian archipelago. The image, below, shows the features we wish to remove:
This sounds simple enough until we realize that the Alaska and Hawaii spatial features are multipolygons and the features that we want to remove are part of the geometry instances. To make these features accessible, we have to separate the multipolygons into their individual polygon components. To do this we will use FME, both the Workbench and the Universal Viewer. Here is the workflow:
The data source is the us_states database table. An AttributeFilter transformer is specified to isolate the Hawaii multipolygon based on the STUSPS attribute:
The output of the AttributeFilter is then sent to a Deaggregator transformer to separate the Hawaii multipolygon into individual polygons. The output from the Deaggregator transformer is sent to a Visualizer, below:
In the Visualizer, we can identify each unique polygon which makes up the Hawaii spatial feature using the FME attribute, _part_number. Using this approach, we determined that we want to keep _part_numbers 1,2,3,4,6 and 7.
The Alaska problem is the reverse since there are so many parts to the multipolygon. In this case we will identify the polygons which we want to remove: _part_numbers 22, 18, 37, 29, 6, 12, 27, 31, 10, 48, 47, 13, 28, 9, 17, 3, 34, 39 and 44.
This workflow takes the separated (disaggregated) Hawaii and Alaska multipolygons and applies attribute filter transformers to identify specific component polygon elements. For the Hawaii sub-flow, the new RangeFilter transformer (FME 2010 beta) is used since polygon parts could be easily identified by two ranges. For the Alaska sub-flow, the polygon components to be removed are listed in the AttributeFilter transformer. The <UNFILTERED> channel is then chosen, returning the remaining Alaska polygon components. For both sub-flows, a final Aggregator transformer is used to reassemble the polygon components into multipolygon instances. All 3 sub-flows (Alaska, Hawaii and <UNFILTERED>) from the first AttributeFilter are channeled to the output database table, us_states_inset. This new table with the simplified Alaska and Hawaii spatial features is illustrated below from SQL Server Management Studio:
With the spatial features suitably simplified, we are now ready for the next step, "repositioning" the Alaska and Hawaii spatial features as insets. This will be described in Part 3.
Technorati Tags:
SQL Server,
2008,
Spatial,
Feature Manipulation Engine,
FME,
AttributeFilter,
RangeFilter,
Disaggregator,
Aggregator,
Transformer,
FME 2010 Beta
- Address Geocoding With SQL Server 2008 Spatial
-
In his new book, Beginning Spatial with SQL Server 2008, Alastair Aitchison describes how to create a new CLR-based geocoding T-SQL function utilizing the MapPoint Web Service. Additionally, there is an SSIS-based MapPoint Batch Geocoder for SQL Server 2008 which also uses the MapPoint Web Service. The MapPoint Web Service, as the name implies, requires an Internet connection. But, what if your SQL Server 2008 instance is not connected to the Internet? Yuri Software has a server-based offering, which allows an address geocoding server to be configured for your intranet environment. This service, YAddress, offers US-based address geocoding and address data cleansing (address correction, validation and standardization) without the need for an Internet connection.
Yuri Software has created two SQL Server-compatible geocoding functions which have been released on CodePlex. In order to make these functions usable in a test environment, Yuri Software has create a web service to which these functions connect. In a production environment, a high performance local YAddress server would be used. Each of these functions is described, below, along with simple T-SQL coding examples.
ProcessAddress
The ProcessAddress function is the first of the YAddress geocoding implementations for SQL Server. Executing the following T-SQL illustrates the values returned by YAddress:
SELECT * FROM ProcessAddress('3641 Mt Diablo Blvd', 'Lafayette, CA')
ErrorCode: 0
ErrorMessage:
AddressLine1: 3641 MT DIABLO BLVD
AddressLine2: LAFAYETTE, CA 94549-9998
Number: 3641
PreDir:
Street: MT DIABLO
Suffix: BLVD
PostDir:
Sec:
SecNumber:
City: LAFAYETTE
State: CA
Zip: 94549
Zip4: 9998
County: CONTRA COSTA
CountyFP: 06-013
Latitude: 37.891068
Longitude: -122.126023
PreciseGeocode: 1
Since this function returns a latitude and longitude value as floating point numbers, it is compatible with both SQL Server 2005 and SQL Server 2008.
Let's use SQL Server 2008 to construct a geography instance from the latitude and longitude returned from the ProcessAddress function, in this case using the OGC-compliant STGeomFromText() constructor:
DECLARE @lat FLOAT
DECLARE @lon FLOAT
DECLARE @geo GEOGRAPHY
SELECT @lat = latitude, @lon = longitude FROM ProcessAddress('3641 Mt Diablo Blvd', 'Lafayette, CA')
SELECT @geo = GEOGRAPHY::STGeomFromText('Point(' + CAST(@lon AS VARCHAR(32)) + ' ' + CAST(@lat AS VARCHAR(32)) + ')',4326)
SELECT @geo.STAsText()
GO
--POINT (-122.126 37.8911)
While the STGeomFromText() constructor certainly works, it's an effort to construct the Well-Known Text Point string. Let's construct the T-SQL using the Point() constructor:
DECLARE @lat FLOAT
DECLARE @lon FLOAT
DECLARE @geo GEOGRAPHY
SELECT @lat = latitude, @lon = longitude FROM ProcessAddress('3641 Mt Diablo Blvd', 'Lafayette, CA')
SELECT @geo = GEOGRAPHY::Point(@lat,@lon,4326)
SELECT @geo.STAsText()
GO
--POINT (-122.126023 37.891068)
This construction is certainly much easier and undoubtedly more performant. But Yuri Software has another trick up their sleeve - a SQL Server 2008-specific function: ProcessAddress2008.
ProcessAddress2008
ProcessAddress2008() returns a geography instance directly, eliminating the need for the constructor. Here is the full set of data returned from the function:
SELECT * FROM ProcessAddress2008('3641 Mt Diablo Blvd', 'Lafayette, CA')
ErrorCode: 0
ErrorMessage:
AddressLine1: 3641 MT DIABLO BLVD
AddressLine2: LAFAYETTE, CA 94549-9998
Number: 3641
PreDir:
Street: MT DIABLO
Suffix: BLVD
PostDir:
Sec:
SecNumber:
City: LAFAYETTE
State: CA
Zip: 94549
Zip4: 9998
County: CONTRA COSTA
CountyFP: 06-013
Location: 0xE6100000010C8D4127840EF24240CEE2C5C210885EC0
PreciseGeocode: 1
The Location field contains an instance of type geography, directly. To show the advantage of this approach, let's construct the same basic query, used in ProcessAddress, above:
DECLARE @geo GEOGRAPHY
SELECT @geo = Location FROM ProcessAddress2008('3641 Mt Diablo Blvd', 'Lafayette, CA')
SELECT @geo.STAsText()
GO
--POINT (-122.126023 37.891068)
We could be even more concise and achieve the same result:
SELECT Location.STAsText() FROM ProcessAddress2008('3641 Mt Diablo Blvd', 'Lafayette, CA')
GO
--POINT (-122.126023 37.891068)
Registering The Function
The last observation surrounds the setup commands used to register the new function with SQL Server. There are occasions where databases that have been restored from backups do not have the database owner correctly specified. This issue surfaced while trying to create the assembly from the YAddressSqlFunction.dll:
The database owner SID recorded in the master database differs from the database owner SID recorded in database 'Sample_Zipcodes'. You should correct this situation by resetting the owner of database 'Sample_Zipcodes' using the ALTER AUTHORIZATION statement.
Msg 6582, Level 16, State 1, Procedure ProcessAddress, Line 3
Assembly 'YAddressSqlFunction' is not visible for creating SQL objects. Use ALTER ASSEMBLY to change the assembly visibility.
In order to ameliorate this issue, use the following stored procedure, setting the database owner to system administrator (sa) before creating the assembly:
sp_changedbowner 'sa'
- Loading GeoNames Data Into SQL Server 2008 (the PowerShell, bcp way...)
-
In the first installment on this topic, I used a commercial text editor, EditPad Pro, to perform the Unicode file conversion and SQL Server BULK INSERT to load the data. In this installment, I'm going to change to workflow a bit to use of Microsoft's PowerShell for the Unicode file conversion and bcp (Bulk Copy) to perform the data loading. Thus, the new workflow is:
- Download GeoNames data
- Create a database table for the GeoNames data
- Convert and load the GeoNames data using PowerShell and bcp
- Add, update and index the spatial data column
- Sample spatial query
For this post, I'm going to concentrate on step 2. For the other workflow steps, please refer back to the first installment on this topic, Loading GeoNames Data Into SQL Server 2008 (yet another way...).
Convert and load the GeoNames data using PowerShell and bcp
For this exercise, my colleague, Bob Beauchemin, gave me some very useful guidance. If you are not familiar with Bob's blog, it a great place to find a wealth of information about SQL Server. Bob has special affection for spatial and you can find some very interesting posts on this topic.
Install and Configure PowerShell
The basis for this solution is the Microsoft Windows PowerShell command line shell. Being an old UNIX guy, shell scripting was my first inclination when presented with the UTF-8 --> UTF-16 conversion problem. And I'm happy to say that PowerShell provides and excellent Windows-based shell for doing exactly what is needed. If you've got Windows Server 2008, it's already there. If you have Windows XP, Windows Vista, or Windows Server 2003, you will need to download the appropriate version for your platform.
Once installed, you can fire up PowerShell by issuing the command, powershell, in the command shell (Start-> Run -> Open: cmd). When loaded, you will get a PS> prompt. Before we can execute a script, we need to set the script execution policy. For our humble purposes, we don't need to be bothered by the security considerations which might be demanded in a more public execution environment. If you decide to put your script into wider use, please do set an execution policy suitable for your environment.
Try running the following commands:
PS> Get-ExecutionPolicy
Restricted
PS> Set-ExecutionPolicy Unrestricted
PS> Get-ExecutionPolicy
Unrestricted
Note: The execution policy persists between PowerShell sessions, so you only need to set it once.
To exit PowerShell, type exit at the PowerShell Prompt. To learn more about PowerShell see the PowerShell Help Center.
Create PowerShell Script
You have now configured PowerShell so that you can easily run the script we are about to describe.
In this script, we will do the following:
- Create a temporary file name to hold the output from the command which converts the UTF-8 data to UTF-16.
- Execute a command which converts the data from UTF-8 to UTF-16, changes the row terminators to '\r\n' and writes a Unicode signature at the beginning of the converted data (the BOM is described in the previous post).
- Load the converted data file into SQL Server 2008 using the Bulk Copy Program, bcp
- Delete the temporary file
All of these operations will be saved as a function which we can run from the PowerShell command line, called convert-and-load. Here is the PowerShell script:
function convert-and-load {
$tempfile = "$env:temp\TEMP$(Get-Date -format 'yyyyMMddhhmmss').txt"
get-content -Path $args[0] –encoding utf8 | out-file $tempfile -encoding Unicode
$serverinstance = $args[2]
$serveroption = "-S$serverinstance"
bcp $args[1] in $tempfile -w –T $serveroption
del $tempfile
}
The bcp -w and -T arguments do the following (from MSDN bcp Utility):
-w Performs the bulk copy operation using Unicode characters. This option does not prompt for each field; it uses nchar as the storage type, no prefixes, \t (tab character) as the field separator, and \n (newline character) as the row terminator. Note that even though it says \n (newline) for the row terminator, it really means \r\n (carriage return\newline).
- T Specifies that bcp connects to SQL Server with a trusted connection
The next step is to install this script so that PowerShell recognizes the function.
Install PowerShell Script
You can install this new function into your PowerShell profile. This profile should be placed in the C:\Documents and Settings\[your Windows login]\My Documents\WindowsPowerShell folder. You can name the profile anything you want but it has to have a prefix of .ps1. I named mine: profile.ps1. When PowerShell starts up, it reads this file and loads, in this case, our new convert-and-load function.
Execute PowerShell Script
The new function, convert-and-load, takes 3 arguments:
- UTF-8 filename - e.g. c:\temp\allCountries.txt
- fully qualified database table name - e.g. Sample_GeoNames.dbo.GeoNames
- database server\database instance name - e.g. sqlspatial\ss1600
Here is how the command looked in my environment, ready for execution:
PS> convert-and-load c:\temp\allCountries.txt Sample_GeoNames.dbo.GeoNames sqlspatial\ss1600
UTF-8 to UTF-16 File Conversion Performance
The entire convert and load process of the allCountries.txt GeoNames dump (6,906,334 records) took approximately 1 hour and 25 minutes to complete with the UTF-8 to UTF-16 conversion process consuming approximately 1 hour and 5 minutes of that time. This is comparable to the time it took to perform the same conversion in EditPad Pro from my previous post.
bcp Performance
I saved the temporary UTF-16 file by removing the del $filename line in the script and ran 3 scenarios to get a better feel for the performance of bcp. Here are the results:
Data residing on c: drive (same drive as the database storage):
6906334 rows copied.
Network packet size (bytes): 4096
Clock Time (ms.) Total: 1112859 Average: (6205.94 rows per second)
18.55 minutes
Data residing on e: drive (database storage on c:):
6906334 rows copied.
Network packet size (bytes): 4096
Clock Time (ms.) Total: 1009954 Average: (6838.27 rows per second)
16.83 minutes
Data residing on e: drive, increased network packet size:
6906334 rows copied.
Network packet size (bytes): 16192
Clock Time (ms.) Total: 1102265 Average: (6265.58 rows per second)
18.37 minutes
There are other tuning measures to improve bcp's performance and we will investigate this for a future post. Compare this with the 5.75 minutes it took to load the same data using the BULK INSERT T-SQL command (see the "Load the GeoNames data" section from my previous post).
Conclusion
While we give up some performance, mainly due to bcp, we do have a more convenient workflow. It is also important to recognize how bcp and BULK INSERT affect their respective execution environments. bcp places the computing load on the local client machine. BULK INSERT, on the other hand, places the load directly on the database server. If you are trying to minimize load on a busy database server, then bcp may be the best solution.
In my next installment on this topic I will explore the SQL PowerShell Extensions, providing access to BULK INSERT from within the PowerShell script.
Technorati Tags:
PowerShell,
bcp,
BULK INSERT,
data Loading,
GeoNames,
SQL SERVER,
SQL Server 2008,
Spatial,
UTF-8,
UTF-16,
Unicode
- Loading GeoNames Data Into SQL Server 2008 (yet another way...)
-
In his post last year, Integrating Virtual Earth and GeoNames, Johannes described how to load the GeoNames database into SQL Server using the "Import Data" Wizard in SQL Server Management Studio. In his scenario, Johannes did not need to take advantage of the spatial type support in SQL Server 2008, since there were columns of latitude and longitude available for use by Virtual Earth. But what if you wanted to load the GeoNames locations into SQL Server Spatial and use the new geospatial features? This post describes how to do that in 6 easy steps...
- Download GeoNames data
- Prepare GeoNames data
- Create a database table for the GeoNames data
- Load the GeoNames data
- Add, update and index the spatial data column
- Sample spatial query
So, let's get started...
Download GeoNames Data
You can find GeoNames data organized by country (and a few other organizational categories) by browsing the GeoNames Download page. If you wanted to download the GeoNames data for the United States, you would choose US.zip. For this exercise, we are going to download the allCountries.zip file, containing the full worldwide GeoNames database. Unzipping the downloaded file will produce the file allCountries.txt.
Note: This version of GeoNames data was download on January 13, 2009 and contained 6,906,334 records.
Prepare GeoNames Data
The GeoNames data is available as a tab-delimited, UTF-8 encoded text file. Each record is terminated with newline character ('\n'). SQL Server 2008 does not support UTF-8, expecting such data in UTF-16 encoding with each record delimited with a carriage return, newline combination ('\r\n'). Additionally, it needs a Byte Order Marker (BOM) at the beginning of the file. Johannes introduced NAnt as a way to convert UTF-8 data to UTF-16. I chose another way. Using the Windows text editor, EditPad Pro, I converted the allCountries.txt file into a form compatible with SQL Server 2008. Here is the workflow I used with EditPad Pro:
Note: While this conversion could conceivably be done in Word 2007, Word cannot open and operate efficiently on files of this size. EditPad Pro can reasonably handle data of this magnitude.
a. Open allCountries.txt file in EditPad Pro.
b. Set Record Delimiter. In the "Convert" menu (main menu), select "To Windows (CR LF)". This will set the record delimiter to '\r\n'.
This took several minutes to complete in EditPad Pro after the "To Windows (CR LF)" option was selected - presumably loading the file into memory and performing the requested operation...
c. Set text encoding. While still in the "Convert" menu, choose "Text Encoding".
In the Text Encoding menu note the Original Encoding, set to Unicode, UTF-8. Select the "Encode the original data with another character set." button. Under New Encoding, choose "Unicode, UTF-16 little endian".

After hitting "OK", my humble dual proc machine with 2GB of RAM squawked about low virtual memory, but continued to process the file...
d. Set the Byte Order Marker (BOM). In the "Options" menu (main menu), select "Configure File Types...".
Choose the "Encoding" tab. Under File Types, make sure "Text Document" is selected. Make sure that the Text Encoding is set as follows:
Note on the Default Line Break Style: While it appears that the line break was already set under stop b. (above), it appears that setting the line break in both places is required, though it is not obvious, why...
Note on the Byte Order Marker (BOM): You can see the BOM as the FF FE bytes at the
start of the file by pressing Crtl+H in EditPad Pro (this switches EditPad Pro to hex display mode). If your file does not include the BOM, SQL Server will complain, when loading data, that the "...file does not have a Unicode signature."
e. Create new file copy, correctly encoded for SQL Server 2008. In the "File" menu (main menu), select "Save Copy As...".
Save the file with the desired name (I used allCountries_utf16.txt):
The file, allCountries_utf16.txt, is now a Unicode UTF-16, BOM encoded text file with tab-delimited ('\t') fields and CR\LF row terminators ('\r\n'), ready for SQL Server 2008.
Before we leave this section, here are some statistics:
| FILE | | SIZE |
| allCountries.zip | | 174,594 KB |
| allCountries.txt | | 783,093 KB |
| allCountries_utf16.txt | | 1,563,976 KB |
Create a database table for the GeoNames data
Here is the T-SQL to create a table called GeoNames...
CREATE TABLE GeoNames(
geonameid int NOT NULL,
name nvarchar(200) NULL,
asciiname nvarchar(200) NULL,
alternatenames nvarchar(max) NULL,
latitude float NULL,
longitude float NULL,
feature_class char(2) NULL,
feature_code nvarchar(10) NULL,
country_code char(3) NULL,
cc2 char(60) NULL,
admin1_code nvarchar(20) NULL,
admin2_code nvarchar(80) NULL,
admin3_code nvarchar(20) NULL,
admin4_code nvarchar(20) NULL,
population int NULL,
elevation int NULL,
gtopo30 int NULL,
timezone char(31) NULL,
modification_date date NULL
)
GO
Note on the alternatenames column: This column holds the GeoNames field which requires Unicode (UTF-8, UTF-16).
Load the GeoNames data
I used the BULK INSERT command to load the allCountries_utf16.txt file. Note the DATAFILETYPE = 'widechar' parameter - this is required for loading UTF-16 data.
BULK
INSERT GeoNames
FROM 'C:\temp\allCountries_utf16.txt'
WITH(
DATAFILETYPE = 'widechar',
FIELDTERMINATOR = '\t',
ROWTERMINATOR = '\n'
)
GO
--(6906334 rows(s) affected) (00:05:44)
Note on the ROWTERMINATOR: While I have changed the Row Terminator in the allCountries_utf16.txt file to '\r\n', SQL Server, curiously, requires the Row Terminator to be set to '\n'. There is no good explanation for this...
Add, update and index the spatial data column
The current table does not contain a column to hold native spatial data. Since the data is in WGS84 geographic coordinates, I used the geography data type for the new column:
ALTER TABLE GeoNames
ADD geog GEOGRAPHY NULL
GO
To populate the new geography column (geog), I used the STGeomFromText constructor to create a POINT feature for each GeoNames row. Since this is a text-based constructor, the longitude and latitude data, coming from columns of like name, will need to be converted into text using the CAST function. The SRID for the STGeomFromText constructor is set to 4326, the EPSG value for WGS84 coordinates.
UPDATE GeoNames
SET geog = GEOGRAPHY::STGeomFromText
('POINT(' + CAST(longitude AS CHAR(20))
+ ' ' + CAST(latitude AS CHAR(20)) + ')',4326)
GO
--(6906334 rows(s) affected) (00:09:40)
Note on coordinate order: Since this is an OGC-based operator, the WKT is formed as: 'POINT (<longitude> <latitude>)'.
In order to create a spatial index, the table must have a primary key. While I could have created the primary key in the CREATE TABLE DDL, this would have meant that the data was loaded into an existing index, slowing down the load process considerably.
ALTER TABLE GeoNames
ADD CONSTRAINT pk_geonames_geonameid
PRIMARY KEY (geonameid )
GO
Here is the DDL to create a spatial index on the geography column, geog:
CREATE SPATIAL INDEX geonames_mmmm16_sidx
ON GeoNames(geog)
USING GEOGRAPHY_GRID
WITH (
GRIDS = (MEDIUM, MEDIUM, MEDIUM, MEDIUM),
CELLS_PER_OBJECT = 16,
PAD_INDEX = ON
)
GO
--(00:10:30)
Sample spatial query
As an example of the type of queries which can now be supported using the GeoNames table, consider the following: "Find all place names within 1 kilometer of the Travis County (Texas) boundary line":
Note: the distance parameter for the STBuffer() method (geography data type) is in meters...
DECLARE @g GEOGRAPHY
DECLARE @h GEOGRAPHY
DECLARE @i GEOGRAPHY
DECLARE @j GEOGRAPHY
SELECT @g = geog FROM US_Counties WHERE NAME = 'Travis'
--SQL Server Execution Times: Elapsed time = 8 ms
SELECT @h = @g.STBuffer(1000) -- Positive buffer around Travis County
--SQL Server Execution Time: Elapsed time = 16 ms
SELECT @i = @g.STBuffer(-1000) -- Negative buffer around Travis County
--SQL Server Execution Time: Elapsed time = 16ms
SELECT @j = @h.STDifference(@i)-- Difference Polygon
--SQL Server Execution Time: Elapsed time = 4 ms
SELECT name FROM GeoNames WHERE geog.STIntersects(@j)=1
ORDER BY name ASC
--91 rows affected
--SQL Server Execution Time: Elapsed time = 823 ms.
Here is the list of place names returned by the query:
| Anderson Mill Anderson Mill Elementary School Apple Spring Apple Spring Hollow Austin Air Ranch Airport Austin Raceway Park Balcones Country Club Balcones Country Club Lake Balcones Country Club Lake Dam Bear Creek Best Western Southgate Inn Ste Blessed Juan Diego Catholic Church Bonnet Cemetery Boultinghouse Mountain Bratton Cemetery Bridgeway Community Church Cambrian Creek Canyon Vista Middle School Canyon Vista Pool Cedar Park High School Center Union Church Chalk Knob Chalk Knob Branch Child Evangelism Fellowship Church Church of Christ - Pond Springs Clarence Washington Farm Dam Clarence Washington Farm Lake Cottonwood Branch Crestwood Suites - Austin Cypress Creek Baptist Church | | Cypress Elementary School Devils Hollow Esa Austin-Round Rock-South Fall Creek Cemetery Fitzhugh Fitzhugh Cemetery Garfield Pumping Station Gateway Community Church Gay Hollow Hamilton Hill Hammetts Crossing Hampton Inn Austin Round Rock Harris Branch Haynie Flat Cemetery Hilton Garden Inn Round Rock Hope Presbyterian Church Huddleston Cemetery Ingram Cemetery Jollyville Jollyville Cemetery Jollyville Elementary School KGTN-AM (Creedmoor) Koenig Ranch La Frontera in Round Rock La Quinta Inn & Suites Round Rock South Manchaca Optimist Youth Sports Complex Manchaca Springs Marriott Austin North Martin Hill McNeil | | McNeil High School Merrelltown Muleshoe Bend Muleshoe Bend Trail New Hope Community Church Niederwald Cemetery North Creek Park Pond Springs Presbyterian Church of the Hills Purple Sage Elementary School Rattan Creek Trail Red Bluff Creek Residence Inn By Marriott Austin Round Rock Rhodes Cemetery Rim Rock Trail Round Rock Korean Presbyterian Church Round Rock Opportunity Center Roy Creek Saint Vincent de Paul Catholic Church Shingle Hills Signal Hill Spicewood Elementary School Spillar Ranch Studio 6 Austin Northwest Tanglewood Spring Texas No Name Number 9 Dam The Marbridge School Turkey Bend Turkey Bend Trail Vasquez Cemetery Woods Hill |
Here is the visual result of the place names locations, within the county boundary buffer, presented in Management Studio:
Here is a more detailed view (the actual point locations are represented by the block dots in the center of each point symbol):
And there you have it - sub-second performance against 6.9 million rows with a complex spatial object on a very modest machine...
- Where is Ed?
-
After a rather lengthy hiatus, I'm back to blogging once again. I was awakened from my reverie by an old friend who noted that a few folks had wondered if I was still alive and kicking. Not to worry, I'm still busy with SQL Server Spatial. But back to the central question: Where is Ed?
One of the things I've been up to is working with my colleague Olivier on a little project, not coincidentally named "Where is Ed?" or WiE, for short. The goal of WiE started simply enough - build an application using off-the-shelf Microsoft technology to track my location, anywhere in the World (well, at least when I was visible to the GPS satellite constellation and had WiE tracking turned on in my Windows Mobile-equipped phone).
I won't go into a lot of detail about WiE since Olivier has written extensively about it on his blog and has created a CodePlex site from which the underlying code can be downloaded. The basic architecture is as follows:
The public face to WiE is presented at the site: http://www.wiecommunity.com.
You might note the references to George Curious and Thomas Engine in the list of Friends. These are test users, inspired by Olivier's children's interests ;-).
When either Olivier or I are active, you can observe our location and a trace of our route:
Please bear in mind that this project is a work in progress, with numerous minor details still being worked out. If you have any comments or suggestions, please let me know.
- Working with Invalid Data and the SQL Server 2008 Geography Data Type, Part 1b
-
"Fixing" Polygon Ring Orientation in SQL Server 2008 using T-SQL
In the last installment, I promised to try and correct invalid Geography data using FME. While I am making progress on that front, I decided to take a small detour and discuss an interesting but little known behavior of the SQL Server Geometry type, polygons and data validation.
One of the most common data errors when attempting to import spatial data into the Geography data type involves the ring orientation of polygons:
A .NET Framework error occurred during execution of user-defined routine or aggregate "geography":
Microsoft.SqlServer.Types.GLArgumentException: 24205: The specified input does not represent a valid geography instance because it exceeds a single hemisphere. Each geography instance must fit inside a single hemisphere. A common reason for this error is that a polygon has the wrong ring orientation.
Consequently, we often see inquiries about changing ring orientation in SQL Server, using T-SQL. Since SQL Server does not allow polygons with the "wrong" ring orientation into the Geography data type, any solution to this dilemma must use the Geometry data type. At this point, I should note that FME 2008, when loading data into a column of type Geography, attempts to correct polygon ring orientation. I say attempts, since there are edge conditions (+- 180 degree meridian and the Poles) which can derail this process. The same goes for the technique in T-SQL that I am about to describe.
Here is the aforementioned behavior - When executing an OGC-based method (which works on the entire geometry) on valid polygon features in the Geometry data type, SQL Server will check ring orientation. If SQL Server finds outer rings oriented in a clockwise direction, it will re-orient such rings to counter-clockwise - the direction required for outer rings in the Geography data type. The same goes for inner rings (holes), which SQL Server will orient to clockwise.
So, how might you take advantage of this behavior? I'll use a very simple test case:
--
-- Create table with Geometry column
--
CREATE TABLE foo (
id INTEGER,
geom GEOMETRY);
--
-- Create two similar polygons with opposite ring orders
--
DECLARE @g GEOMETRY = 'POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))';
DECLARE @h GEOMETRY = 'POLYGON ((0 0, 0 10, 10 10, 10 0, 0 0))';
--
-- Persist polygons to the table
--
INSERT INTO foo (id, geom) VALUES(1,@g);
INSERT INTO foo (id, geom) VALUES(2,@h);
--
-- Update rows, forcing validation of spatial objects
--
UPDATE foo
SET geom = geom.STUnion(geom.STStartPoint());
--
-- Verify new, consistent, ring order after forced validation
--
SELECT id,geom.ToString() FROM foo;
--
-- Results:
-- 1 POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))
-- 2 POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))
The key to this behavior is the the STUnion() method. Since this is an OGC-based method, working on the entire geometry for a given feature, it forces polygons into the orientation required for the method - which just happens to be the one used for the Geography type (quite a coincidence, eh? ;-). This method illustrated is quite efficient, keeping overhead small in the quest to achieve our goal of re-orienting the polygon rings.
Since the Geometry must be valid for this to work, you might want to add the MakeValid() method to the processing stream.
UPDATE foo
SET geom = geom.MakeValid().STUnion(geom.STStartPoint());
It is important to note that a MakeValid() on an already valid polygon geometry will not change ring orientation, hence the need to "force" a complete validation pass with the OGC-based method (STUnion, in this case).
To finish up this exercise, let's move the Geometry data to Geography data in a sister table. Before doing this, please bear in mind that this approach is naive in that it does not accommodate several potential edge conditions. Never-the-less, this approach should work in many cases.
--
-- Create table with Geography column
--
CREATE TABLE gfoo (
id INTEGER,
geog GEOGRAPHY);
--
-- Convert from Geometry to Geography using Well Known Text
--
INSERT INTO gfoo
SELECT foo.id, GEOGRAPHY::STGeomFromText(foo.geom.STAsText(),4326)
FROM foo;
--
-- Convert from Geometry to Geography using Well Known Binary
--
INSERT INTO gfoo
SELECT foo.id, GEOGRAPHY::STGeomFromWKB(foo.geom.STAsBinary(),4326)
FROM foo;
A special thanks to IsaacK and BobB for suggestions and assistance with the SQL syntax...
- GeoWeb 2008 Conference Dinner Cruise Photos (and SQL Server 2008 Spatial ;-)
-
On the Wednesday night GeoWeb 2008 dinner cruise around Vancouver harbor, several folks who had forgotten their cameras (and 1 who had broken their camera) noted that I was busy taking photos and asked if they could be made available. I am happy to report that I have successfully uploaded the photos to my Skydrive account. There you will find 13 zipped archives, each containing approximately 7 to 8 images (GW-1.zip,...,GW-13.zip). Why 13 zip archives, you ask? Because Skydrive does not allow any single file to be over 50 megabytes in size. The 94 images I took vary in size from 4.96 to 7.70 megabytes each, for a total size of 590 megabytes - thus the need for 13 archives. In the interest of time, I did not perform any significant editing, removing only the obvious duds – camera shake on a couple and some “misses” during the fireworks display.
I just put up another folder containing the individual photos, if you want to browse before downloading the archives or if you just want a few images.
The last question might you ask is how these images could possibly relate to SQL Server 2008 Spatial? Easy - each image has the latitude and longitude (or longitude and latitude, depending on your persuasion…) and UTC time encoded in the EXIF image metadata. This was accomplished through the use of my di-GPS equipped Nikon D300 camera. Back at the hotel room, I loaded the dinner cruise images into SQL Server 2008 (Geography data type for image location, DateTimeOffset data type for date/time/time zone, varbinary(max) data type with FileStream option for the image itself and I used the Sparse Columns option for the plethora of GPS-related data columns). With the location and image data for each photo in SQL Server 2008, Virtual Earth was used to display the location of the images:

Enjoy.
Technorati Tags:
GeoWeb 2008,
Photos