The Imprecise Nature of Geometry

The Imprecise Nature of Geometry

Hi Folks,

Let's kick this off with what may be a surprising result.  What should this code return?

`declare @g geometry = 'LINESTRING (0 0, 1 3)'`
`declare @h geometry = 'LINESTRING (0 3, 2 0)'`
`declare @i geometry = @g.STIntersection(@h).ToString()`
`select @i.STIntersects(@g), @i.STIntersects(@h)  -- returns 0, 0`

It would seem that @i should be the point where @g and @h intersect, and so a test whether @i intersects @g and @h should clearly return true (or 1).  However, for some reason @i does not appear to intersect either @g or @h.  How can that be true?

The reason is that geometric calculations are by nature imprecise, and this is an unavoidable result of the imprecise nature of the data types upon which these calculations are built.  To illustrate, let's take a look at a non-geometric example.  First, some values to work with:

`declare @u float = 1000`
`declare @v float = -1`
`declare @w float = 1.0001`

Note that none of these values are particularly large: they all fit comfortably within our 64-bit floating point variables.  Next let's look at a pair of equations:

`@u * (@v + @w)`
`(@u * @v) + (@u * @w)`

If we think back to basic algebra, we know that these two equations should be the same: in the second one we've simply distributed @u over (@v + @w).  But, if we test this...

`if  (@u * (@v + @w)) = ((@u * @v) + (@u * @w))`
`    print 'equal'`
`else `
`    print 'not equal'`

...we find that SQL Server doesn't think they're equal.  We can look at each of these values out to see what's going on:

`select @u * (@v + @w)        -- 0.099999999999989`
`select (@u * @v) + (@u * @w) -- 0.100000000000023`

Ack!  If we were working with true rational numbers, these values should be exactly the same (and should be exactly .1).  Our floating point calculations don't return the correct result, and don't even return the same results: they will never compare as equal.

There is, in fact, no way to avoid this type of problem completely with any bounded numerical representation, so we are stuck dealing with imprecise calculations no matter how we represent values.  One result of this is the maxim that you should never test floating point numbers for equality, and this general rule extends to geometries (and geographies) as well.

Let's look again at our geometry example.  If we again do a little algebra, we find that the actual intersection point should be (2/3, 2).  If we as SQL Server, however, our results are a little off:

`declare @g geometry = 'LINESTRING (0 0, 1 3)'`
`declare @h geometry = 'LINESTRING (0 3, 2 0)'`
`declare @i geometry = @g.STIntersection(@h).ToString()`
`select @i.ToString()  -- result: POINT (0.66666666666666674 2)`

Errors like these are unavoidable in principle.  One may be inclined to simply increase the precision of one's calculations, but no (finite) amount of tightening can truly eliminate this problem.

Cheers, and enjoy  SQL Server 2008.
-Isaac

• Wouldn't internally using a highly precise fixed-point (rather than 8-btye double) type address some of this? Grant decimal(38,n) isn't storage appealing, but...

• Hi Kent,

Unfortunately, fixed point---binary or decimal---is no panacea here.  For example, I think we'd agree that a * (x / a) should always equal x, so long as a is not zero.

But...

declare @a decimal(38,20) = 3

declare @x decimal(38,20) = 1

select  @a * (@x / @a)

...comes back with the result 0.999999, not 1.0.

The only true way to handle all computations like the ones in this post without error would be to actually store rational numbers, but the storage size of a rational is not bounded.  Rationals also don't account for what happens when we take a sine, or even a square root.

Cheers,

-Isaac

• Isaac,

Neat example - I was trying the same test in PostGIS and it is nicely equally as wrong as it is in SQL Server 2008.  Its nice when  the databases I use most commonly produce the same wrong answer then at least I can apply similar tricks to compensate.

The way I handle this in PostGIS is to use the ST_DWithin function.  Which is a short hand of saying are 2 objects within a certain distance of each other.  So I figure if things are this close for all intensive purposes they intersect.

The below returns true in PostgreSQL

SELECT ST_DWithin(ST_Intersection(ST_GeomFromText('LINESTRING (0 0, 1 3)'), ST_GeomFromText('LINESTRING (0 3, 2 0)')),ST_GeomFromText('LINESTRING (0 3, 2 0)'),0.00001)

I was trying to figure out how to do the same thing in SQL Server 2008, but wasn't sure which function to use.  The closest would seem to be

@geom1.STDistance(@geom2) < 0.00001

But I'm not sure if that would take advantage of indexes or not (in PostGIS it wouldn't thus the need for the aforementioned approach

• This version is slightly more readable and comparable to yours than the one I posted before

SELECT ST_DWithin(ST_Intersection(g,h), g, 0.00001) As intersectsg, ST_DWithin(ST_Intersection(g,h), h, 0.00001) as intersectsh,

ST_AsText(ST_Intersection(g,h)) As trep

FROM

(SELECT ST_GeomFromText('LINESTRING (0 0, 1 3)') As g,

ST_GeomFromText('LINESTRING (0 3, 2 0)') As h) as foo

Result is

true, true, POINT(0.666666666666667 2)

• Hi Regina,

This could probably use another post or two---there's quite a bit of meat here.  I would be very surprised (shocked, even!) if SQL Server actually matched PostGIS in all such cases.

SQL Server can make use of the index to answer the STDistance predicate you suggest, though.

Cheers,

-Isaac

• Using the STBuffer function is another way to specify the desired intersection within a desired range of error (conceptually the same as Regina's response):

declare @g geometry = 'LINESTRING (0 0, 1 3)'

declare @h geometry = 'LINESTRING (0 3, 2 0)'

declare @i geometry = @g.STIntersection(@h).STBuffer(.0001)

select @i.STIntersects(@g), @i.STIntersects(@h)  -- Returns 1, 1

Luke

Page 1 of 1 (6 items)