Determining the Geographic Extent of Spatial Features in a SQL Server Table

I’ve been using the SQL Server Denali CTP(s) for just over 6 months now. As a result, I’ve become accustomed to some of the new features and forget that they aren’t generally available in production environments yet… it always comes as a bit of a surprise when you have to do some work on a SQL Server 2008/R2 database and suddenly discover half of the feature you’ve come to rely on aren’t there any more (I pity those consultants who turn up at a client’s site to find SQL Server 2000/7/6.5 instances – fortunately, if you’re dealing with spatial data, this doesn’t really happen😉

One such example happened today, when I was trying to determine the extent of the features in a column of spatial data, in order to create a bounding box for a spatial index covering all the data. In SQL Server Denali, this can be done very easily using an EnvelopeAggregate. Although not documented, any envelope created by SQL Server (i.e. using STEnvelope() or EnvelopeAggregate()) will be always have its first point at the bottom left of the envelope, and its third point at the opposing, top right corner. You can therefore calculate the minimum and maximum extents of the features in the MyGeomColumn of geometry data as follows:

SELECT
  geometry::EnvelopeAggregate(MyGeomColumn).STPointN(1).STX AS MinX,
  geometry::EnvelopeAggregate(MyGeomColumn).STPointN(1).STY AS MinY,
  geometry::EnvelopeAggregate(MyGeomColumn).STPointN(3).STX AS MaxX,
  geometry::EnvelopeAggregate(MyGeomColumn).STPointN(3).STY AS MaxX
FROM BigOlTableOfData;

So what do you do in SQL Server 2008/R2, where the EnvelopeAggregate() is not available? Here’s a couple of options…

Option #1 : With a Cursor

Well, you could come up with your own envelope aggregate using a (dreaded) cursor. Something like this:

DECLARE
  @geom geometry,
  @geomaggregate geometry = 'GEOMETRYCOLLECTION EMPTY';
DECLARE @i int = 0;

DECALRE geocursor cursor fast_forward
FOR SELECT MyGeomColumn FROM BigOlTableOfData;
OPEN geoCursor;
FETCH NEXT FROM geoCursor INTO @geom;
WHILE @@FETCH_STATUS = 0 BEGIN
  SET @geomaggregate = @geomagregate.STUnion(@geom);
  FETCH NEXT FROM geoCursor INTO @geom;
  SET @i += 1;
END;
CLOSE geocursor;
deallocate geoCursor;

Then, create an envelope around @geomaggregate using STEnvelope(), and select the first and third points as before.

Option #2 : CLR Function

Alternatively, you could choose not to explicitly create the envelope, but create a CLR function that looped through all the points of each geometry (using STPointN() and STNumPoints()) , and keep track of the minimum and maximum X and Y values in the loop. Something like this:

double xMin = double.MaxValue;
double xMax = double.MinValue;
double yMin = double.MaxValue;
double yMax = double.MinValue;

// Loop through each geometry in the dataset
foreach(SqlGeometry geom in MyGeomColumn) {

  // Loop through each point in this geometry
  for(int i=1; i<= geom.STNumPoints(); i++)
  {
    SqlGeometry point = geom.STPointN(i);

    // Check whether this point is a new min/max value
    if (point.X < xMin)
    { xMin = point.X; }
    else if (point.X > xMax)
    { xMax = point.X; }
  
    if (point.Y < yMin)
    { yMin = point.Y; }
    else if (point.Y > yMax)
    { yMax = point.Y; }
  }
}

Option #3 : CTE

Here’s a nice alternative using two CTEs, which I first saw described here. The first CTE creates a table, ENVELOPE, containing the individual envelopes of each feature in the table. The second table selects just the corner points of each of those envelopes, with a UNION ALL query (note that this is a regular UNION, not a STUnion()).

Then, you simply select the minimum and maximum X and Y values of any of the points listed in the CORNERS table:

WITH
  ENVELOPE as 
  (
    SELECT MyGeomColumn.STEnvelope() as envelope from BigOlTableOfData
  ),
  CORNERS as 
  (
    SELECT envelope.STPointN(1) as point from ENVELOPE 
    UNION ALL select envelope.STPointN(3) from ENVELOPE
  )

SELECT
  MIN(point.STX) as MinX,
  MIN(point.STY) as MinY,
  MAX(point.STX) as MaxX,
  MAX(point.STY) as MaxY 
FROM
  CORNERS;

Option #4 : Persisted Envelopes

Another option, and the one I chose to use today, is to alter your table structure to include computed columns containing the minimum and maximum X and Y values of the envelope around each feature on a per-row basis. I chose to make my columns PERSISTED, so that the computed values would actually be materialised and saved in the database rather than computed on-the-fly:

ALTER TABLE BigOlTableOfData
ADD
  MinX AS (CONVERT(int, MyGeomColumn.STEnvelope().STPointN((1)).STX, 0)) PERSISTED,
  MinY AS (CONVERT(int, MyGeomColumn.STEnvelope().STPointN((1)).STY, 0)) PERSISTED,
  MaxX AS (CONVERT(int, MyGeomColumn.STEnvelope().STPointN((3)).STX, 0)) PERSISTED,
  MaxY AS (CONVERT(int, MyGeomColumn.STEnvelope().STPointN((3)).STY, 0)) PERSISTED;

This requires a little bit more processing up-front, (and on each additional INSERT into the table), but the data in my table was going to remain relatively static. The advantage of this approach is that you can now work out the full extent of values in the dataset at any time quickly and simply by:

SELECT
  MIN(MinX) AS MinX,
  MIN(MinY) AS MinY,
  MAX(MaxX) AS MaxX,
  MAX(MaxY) AS MaxY;

Seeing as these are just integer columns, this is much quicker than relying on the more complex spatial methods, of having to employ any looping-type behaviour. You can also add an index to the integer coordinate values in the computed columns, which might enable you to perform faster search and retrieval queries of e.g. values lying within a rectangular area of interest (such as what features are visible in a given Bing Maps view) than using the more powerful, but frequently slower, built-in spatial indexes.

This entry was posted in SQL Server and tagged , , . Bookmark the permalink.

6 Responses to Determining the Geographic Extent of Spatial Features in a SQL Server Table

  1. Monty says:

    Alastair,

    I’m looking to do exactly this in SQL 2008 R2. My order of preference for the four methods above is: 3, 4, 1, 2. I really don’t want to be tooling about with CLR functions if I can avoid it.
    However, my first three preferred methods all seem to require the use of STenvelope() function on a geography type, which isn’t available in 2008 R2.

    Am I missing something?

    Thanks,
    Monty

  2. StewLG says:

    geometry::EnvelopeAggregate(MyGeomColumn).STPointN(3).STX AS MaxX,
    geometry::EnvelopeAggregate(MyGeomColumn).STPointN(3).STY AS MaxX

    You’ve used MaxX twice here; the second should be MaxY

  3. Davide De Marchi says:

    Using the spatial index is much simpler (and efficient):

    select bounding_box_xmin,bounding_box_xmax,bounding_box_ymin,bounding_box_ymax from sys.spatial_index_tessellations where object_id = (select object_id from sys.tables where name = ‘your_table’ and type_desc = ‘USER_TABLE’)

    Hope it helps!

    • alastaira says:

      Davide – yes, but that only gives you the values for the bounding box you supplied when you created the spatial index😉 The point of this article is to explain how to determine those values…

  4. Anthony Hart says:

    DECLARE @minx FLOAT = (SELECT GEOMETRY::UnionAggregate (A.geog).STEnvelope().STPointN(1).STX AS xmin FROM [State] A),
    @miny FLOAT = (SELECT GEOMETRY::UnionAggregate (A.geog).STEnvelope().STPointN(1).STY AS ymin FROM [State] A),
    @maxx FLOAT = (SELECT GEOMETRY::UnionAggregate (A.geog).STEnvelope().STPointN(3).STX AS xmax FROM [State] A),
    @maxy FLOAT = (SELECT GEOMETRY::UnionAggregate (A.geog).STEnvelope().STPointN(3).STY AS ymax FROM [State] A);

    DECLARE @sql VARCHAR(MAX) = ‘CREATE SPATIAL INDEX idx_State_Geom ON [State] (Geog) WITH ( BOUNDING_BOX = ( ‘ + CONVERT(VARCHAR, @minx) + ‘, ‘ + CONVERT(VARCHAR, @miny) + ‘, ‘ + CONVERT(VARCHAR, @maxx) + ‘, ‘ + CONVERT(VARCHAR, @maxy) + ‘))’;

    PRINT @minx
    PRINT @miny
    PRINT @maxx
    PRINT @maxy
    PRINT @sql

  5. Mal says:

    Alistair,
    I know I’m late for the party on this …🙂
    Thanks for this tip – and many others I’ve gleaned from you. I’m on 2012 BTW.

    While EnvelopeAggregate is indeed convenient, my testing shows that your (or, rather, Barend’s) CTE method is somewhat quicker. I have a table of 225,000 points, and the CTE method runs in 19 seconds; the (modified – see below) EnvelopeAggregate method 24. Not a big deal, but this isn’t a big table – and I have plenty of tables.

    The modification: You compute the EnvelopeAggregate four times – this took my test 1 minute 50 (around 4 times 24, give or take). I calculated the envelope into a variable, and then extracted the corners.

    DECLARE @geom geometry
    SELECT @geom=geometry::EnvelopeAggregate(MyGeomColumn)
    FROM BigOlTableOfData
    SELECT
    geometry::EnvelopeAggregate(@geom).STPointN(1).STX AS MinX,
    geometry::EnvelopeAggregate(@geom).STPointN(1).STY AS MinY,
    geometry::EnvelopeAggregate(@geom).STPointN(3).STX AS MaxX,
    geometry::EnvelopeAggregate(@geom).STPointN(3).STY AS MaxY

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s