One Type, Two Types...


This is a little bit less of an introductory post than the last one, but there was quite a bit of discussion about our decision to split our spatial types in two---one for our planar ("flat-Earth") model and one for our ellipsoidal ("round-Earth") model---so I thought I'd address it.  To be honest, this is something that we struggled with, but in the end we favored the two-type model.  Here's a quick look at some of our reasoning.

Let me begin with a roughly analogous situation: SQL Server's floating- and fixed-point types, float and decimal.  These types really have the same interface, so we could have conceivably had one type with a floating-point/fixed-point option.  Why choose two types over one?


One reason is that mingling the two would be confusing, since their semantics differ.  If we didn't separate the types, then we could write a method that took that single numeric type as input.  If we performed any operations that depended on the difference in semantics, then we would have to be careful to check which one we were dealing with and proceed accordingly.  We could skip these checks if we always expected, say, fixed-point, but we could be almost certain that if our code survived long enough, some schmuck would eventually hand us a floating-point value.


By separating the types, there is a clear division, and given an instance of a particular type we know exactly what we're dealing with.  Additionally, we know that nobody could hand us the wrong type later on.
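To make the analogy a bit more concrete, here is a small Python sketch.  Python's built-in float and decimal.Decimal stand in for SQL Server's float and decimal; they are not the same types and behave differently in places, and the add_tax helper is purely hypothetical.  The point is only that the two kinds of number give different answers to the same question, and that a function can insist on one of them.

```python
from decimal import Decimal

# Binary floating point: 0.1 and 0.2 are stored as approximations,
# so their sum is not exactly equal to 0.3.
float_equal = (0.1 + 0.2 == 0.3)

# Decimal arithmetic: these particular values are represented exactly.
decimal_equal = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))


def add_tax(price: Decimal, rate: Decimal) -> Decimal:
    """Hypothetical helper that accepts decimal values only."""
    if not isinstance(price, Decimal) or not isinstance(rate, Decimal):
        raise TypeError("add_tax expects Decimal arguments")
    return price + price * rate


def mixing_is_rejected() -> bool:
    """Return True if adding a Decimal to a float raises TypeError."""
    try:
        Decimal("0.1") + 0.2
    except TypeError:
        return True
    return False
```

With two distinct types, the check inside add_tax is cheap and obvious, and Python itself refuses to add a Decimal to a float.  With a single merged type, every such function would need to inspect a mode flag instead.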


So by separating the types we've made things more explicit and clear, but one can argue that since the semantics are often the same---or at least close enough that we don't care---we could save code by unifying the types.  This is undoubtedly true: there's a tradeoff to be made.


We face a similar problem with spatial types.  The two types often behave quite similarly, but there are some key differences.  The tradeoff is the same: you can save code by unifying the types at the expense of clarity and robustness.


So, what's different between the planar and ellipsoidal types?  Well, beyond the obvious---one is flat, one is round---here are some examples:


  • In the planar system, distances and areas are given in the same unit of measure as coordinates.  E.g., the distance between (2, 2) and (5, 6) is 5 units, regardless of what the units are.  In a geodetic system, coordinates are given in degrees, but it hardly makes sense to give lengths and areas in degrees or square degrees---we'd much prefer something like meters.

  • In the planar system, we don't care about the orientation of a polygon.  E.g., a polygon described by ((0, 0), (10, 0), (0, 20), (0, 0)) is the same as one described by ((0, 0), (0, 20), (10, 0), (0, 0)).  The OGC Simple Features for SQL specification doesn't dictate a ring ordering, so we don't enforce one.  In a geodetic system, a polygon is ambiguous without an orientation.  E.g., which hemisphere would a ring around the equator describe?

  • In planar coordinates, it makes sense to use a bounding box as a cheap substitute for an object.  In geodetic coordinates, a bounding circle is more natural.

  • OGC talks about outer rings and inner rings, but this distinction makes little sense for a geodetic type: any ring of a polygon can be taken to be the outer one.
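The first two points above can be sketched in a few lines of Python.  This is an illustration under stated assumptions, not how the spatial types are implemented: the geodetic distance uses the haversine formula on a sphere with an assumed mean radius rather than a true ellipsoid, and all function names are made up for the example.

```python
import math

# Assumption: a spherical Earth with this mean radius, in meters.
# The real round-Earth model is ellipsoidal.
EARTH_RADIUS_M = 6_371_008.8


def planar_distance(p, q):
    """Euclidean distance, in whatever units the coordinates use."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def spherical_distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def signed_area(ring):
    """Shoelace formula for a closed planar ring.

    Positive for counter-clockwise rings, negative for clockwise ones.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2


# The two rings from the text: same triangle, opposite orientation.
ring_a = [(0, 0), (10, 0), (0, 20), (0, 0)]
ring_b = [(0, 0), (0, 20), (10, 0), (0, 0)]

print(planar_distance((2, 2), (5, 6)))          # 5.0, in coordinate units
print(signed_area(ring_a), signed_area(ring_b))  # 100.0 -100.0
```

In the plane, the two rings differ only in the sign of the shoelace sum, so taking the absolute value gives the same triangle either way.  On a sphere there is no such shortcut, because a ring splits the surface into two finite regions and the orientation is what picks one.  Likewise, one degree of longitude covers roughly half as many meters at 60 degrees latitude as it does at the equator under this spherical assumption, which is why degrees make a poor unit of length.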

Of course, this is only a start.  In addition, we are aiming to simplify things a little bit in our round-Earth type.  We thought that given all of this, merging the two types would be confusing, especially for the non-experts out there, so we decided to separate things into two types.


Is that the right decision?  We think so, but we recognize that not everyone may agree with us.  We look forward to hearing from you.




    • The idea of using a bounding circle for geodetic objects is interesting.  Presumably this is physically represented as a geographic location with a radius value?  How do you compute this efficiently (e.g. do you actually compute the Smallest Enclosing Circle, or do you just pick an obvious point such as the centroid)?

      And then the big question: how do you index these circular envelopes?  All the spatial indexes I'm familiar with use rectangles as their indexable item.  Is there an index which handles circles directly?

    • Ok, I buy why you need two different types, but please someone at least bring out the naming police here. You can't seriously consider naming it "Geography"? Should that really prevent confusion?

      "geography" (Noun): The study of the physical structure and inhabitants of the Earth.

      "geometry" (countable and uncountable; plural geometries)

      1. The branch of mathematics dealing with spatial relationships.

      2. A type of geometry with particular properties.

      spherical geometry

      3. The spatial attributes of an object, etc.

      What about "Geographic"? At least this relates to the type of coordinate system that it uses, or even better "Spheric" so it doesn't necessarily relate to Mother Earth.

    • Ah, the naming issue...  :)

      From :

       2 : the geographic features of an area

      This seems reasonable to me, and feels better than "geographic", which is an adjective.

      To be honest, our main concern with the name "geography" is that it sounds too close to "geometry".

    • My point exactly. The geography is the FEATURES of an area, but not the AREA itself. You can view it as the other columns in a datarow that describes certain characteristics relating to the geometric shape.

      In the sense you want it to be understood, the geometry type would also be 'geography'.

      Since the datatype is describing the type of the data, it only makes sense that it is an adjective. It tells you that this is data described in a spherical space. Inconsistently the 'geometry' on the other hand doesn't tell you anything about the type of coordinate space.

      Isn't 'Integer' considered an adjective too?

    • I'm sorry---I misread your comment.

      I'm no grammarian, but in general, we name types after nouns, not adjectives.  "Integer", for example, is a noun, and "integral" is the matching adjective.  

      For "geometry", I think we want definition 2 b : surface shape.

      The same goes for most types in SQL Server: "char" (short for character---the adjective definition of which seems a bit odd for our use), "money" (not monetary), "timestamp", etc.  

      It's true that some are a bit odd in normal English usage---none of the dictionary definitions of "float" make sense here---but they're all used as nouns when we use them.  (E.g., "Pass a float to the function.")



    • Martin,

      You nailed it: picking a minimal bounding circle is tough, so we take the centroid and the smallest radius we can.  I'll cover this more in a future post, but we don't actually use the MBR (or MBC) for indexing.



    • I'm a novice at GIS, but I love it so.  My comment on naming is this.  When coding (TSQL or .NET), I'm usually using variables called "Area" when it refers to geometric shapes.  It makes it more generic if I ever change that data type from say a Rectangle to a Region.  To me, everything is area whether it's a Polygon, Rectangle, etc.  Also, "Shape" makes sense but it always takes me back to the days of the VB6 Shape control.


    • I've been stalling, trying not to say too much about spatial until it's actually available. The code

