Posts tagged ‘Denali’

March 7, 2012

SQL Server 2012 – What’s New for Spatial?

So Microsoft launched SQL Server 2012 today, and you can download the SQL 2012 Express edition right now from http://www.microsoft.com/sqlserver/en/us/get-sql-server/try-it.aspx

No doubt there will be lots of people writing blog articles about the various new features – AlwaysOn, Column Store , Power View etc. but I thought I’d stick to what I know and write about what’s new for spatial.

In fact, I don’t really even need to do that, because there’s always a pretty good whitepaper about that from the SQLCAT website, here. This whitepaper is written for Denali CTP3 but, to my knowledge (and from my limited playing around with RTM this morning), there don’t appear to be any new features introduced between then and now.

There’s also already a very useful table here that compares the spatial features currently available in SQL Azure to those in SQL Server 2012.

SQL Server 2012 Spatial Was My Idea

So, what’s left to do? Well – I might start by mentioning the fact that I’m quite chuffed about some of the new features that have been included. Do you remember the Windows 7 advertising slogan with people saying things they did with their PC and that “Windows 7 was my idea”? Well, that’s how a little bit like how I feel about SQL 2012:

I’m not suggesting that these features were introduced solely to appease my personal whims, but don’t let it ever be said that Microsoft don’t listen to customer feedback!

Upgrading a Spatial Application from SQL Server 2008/R2?

Finally, some words of warning for anyone upgrading existing spatial databases from SQL Server 2008/R2 to SQL Server 2012:

Remember to account for Curved Geometries!

The introduction of curved geometry datatypes mean that the range of possible geometries that can be returned by existing methods is increased, and you may need to write additional code paths to deal with that. Take the STConvexHull() method, for example, which returns the convex hull around a geometry. In SQL Server 2008/R2, that method would always return a Polygon (other than in the special case in which all the input points lay on a straight line, in which case it returned a LineString). In SQL 2012, however, if any of the inputs are themselves curved geometries, the convex hull returned by STConvexHull() can be a CurvePolygon, e.g.:

DECLARE @g geometry = 'GEOMETRYCOLLECTION(CIRCULARSTRING(0 0, -5 5, 0 10), CIRCULARSTRING(10 10, 15 10, 10 0))';
SELECT
  @g.STConvexHull(),
  @g.STConvexHull().STAsText();

The result of STConvexHull() in this example is:

CURVEPOLYGON (COMPOUNDCURVE (CIRCULARSTRING (18.090169943749487 5.0000000000000213, 16.351988430497684 9.051195518771241, 12.240554713995163 10.584146142748239), (12.240554713995163 10.584146142748239, -0.23001591065449833 9.9947064659342839), CIRCULARSTRING (-0.23001591065449833 9.9947064659342839, -4.99999999999994 5.0000000000000213, -0.23001591065449833 0.00529353406575872), (-0.23001591065449833 0.00529353406575872, 12.240554713995163 -0.5841461427481951), CIRCULARSTRING (12.240554713995163 -0.5841461427481951, 16.351988430497684 0.94880448122880079, 18.090125812525294 4.9777873878621213), (18.090125812525294 4.9777873878621213, 18.090169943749487 5.0000000000000213)))

image

Calculated Results may differ due to Increased Precision

SQL Server 2012 now uses 48 bit precision for spatial calculations rather than 27 bit as under SQL Server 2008/R2. This means that results of certain spatial queries will differ between those obtained under SQL Server 2008/R2. This has been mentioned before, but it really is worth iterating because it’s both unusual and potentially application-breaking behaviour.

The following very simple example demonstrates the issue:

DECLARE @line1 geometry = 'LINESTRING(0 11, 430 310)';
DECLARE @line2 geometry = 'LINESTRING(0 500, 650 0)';

SELECT @line1.STIntersection(@line2).STIntersects(@line1);
  • When executed under SQL Server 2008/R2, you’ll get the result 0.
  • Upgrade to SQL Server 2012, and you’ll suddenly get the result 1.

For a method that returns a bit value, that’s pretty much the greatest difference it’s possible to get…. so be sure to check your code (especially if you ever rely on exact equality testing between instances).

September 11, 2011

Can’t Cast from SqlGeometry to SqlGeometry and Other Strange Version Conflicts with SQL Server Denali Microsoft.SqlServer.Types

I came across a curious error earlier today when attempting to use a SqlDataReader to read a column of geometry data from a SQL Server table:

System.InvalidCastException: Unable to cast object of type ‘Microsoft.SqlServer.Types.SqlGeometry’ to type ‘Microsoft.SqlServer.Types.SqlGeometry’

SqlGeometry to SqlGeometry… you’d think that would be a pretty easy cast, wouldn’t you? It turns out that this is a problem caused by a conflict between the spatial libraries used in SQL Server Denali compared to that in 2008/R2, and you’ll get this error depending on which version of Microsoft.SqlServer.Types.dll you use, and how you try to access geometry or geography columns from a datareader:

while (dataReader.Read())
{
  // This works in SQL Server 2008/R2, but errors with Denali
  SqlGeometry g = (SqlGeometry)dataReader.GetValue(0);

  // This works in SQL Server 2008/R2, but errors with Denali
  SqlGeometry g = (SqlGeometry)dataReader["GeomCol"];

  // This works in Denali, but not in SQL Server 2008/R2
  SqlGeometry g = SqlGeometry.Deserialize(reader.GetSqlBytes(0));

  // This works in Sql Server 2008/R2/Denali
  SqlGeometry g = new SqlGeometry();
  g.Read(new BinaryReader(reader.GetSqlBytes(0).Stream));
}

After a bit of digging around, it appears that using GetValue or square brackets notation [] to access a geometry/geography field in a SqlDataReader is hard-coded to load the 10.0 (SQL Server 2008) version of the Microsoft.SqlServer.Types library.

If you’ve got side-by-side installations of both SQL Server 2008/R2 and Denali (as I have), and try to reference the 11.0 (Denali) version of Microsoft.SqlServer.Types, you’ll therefore get an assembly mismatch when both versions of the library are loaded, which causes the slightly unhelpful error listed at the top of this post. Even if you’ve only got Denali installed, your code may still try to reference a (non-existent) 2008/R2 version of the Microsoft.SqlServer.Types.dll library, so you’ll get a different error instead:

Could not load file or assembly ‘Microsoft.SqlServer.Types, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91′ or one of its dependencies. The system cannot find the file specified.

The simplest way to resolve these errors is by changing the way you reference any geography/geometry (and, I imagine, hierarchyid) columns from your DataReader, as in the code example above. Alternatively, you can set up an assembly redirection in the application configuration file as explained here (about halfway down), which will allow you to correctly target the Denali version.

As per the What’s new in SQL Server Denali whitepaper, “… side-by-side installations of SQL Server Code-Named “Denali” CTP1 and CTP3 are not supported with existing SQL Server 2008 installations …”, so perhaps I only have myself to blame for this. Interestingly though, the person who raised this MS Connect issue, says that they have experienced exactly the same problem on a clean install of Denali CTP3. The response from Microsoft suggests that this may be due to older versions of the library being packaged with Visual Studio 2010, and also confirms that the problem will not be resolved prior to RTM of SQL Server Denali.

Strangely, I encountered another curious error a few months ago concerning version conflicts of Microsoft.SqlServer.Types. My CTP3 Management Studio Spatial Results tab does not plot curved geometries (selecting a CircularString or the result of BufferWithCurves etc. just produces a blank pane). I had originally assumed that, since this was only a CTP release, this feature had simply not been added yet. It turns out that curved geometries are supported in SSMS CTP3 Spatial Results tab but, if you have side-by-side SQL Server 2008 and Denali, this can corrupt this feature. I guess the reason is similar – that SSMS is somehow attempting to load the SQL Server 2008/R2 version of Microsoft.SqlServer.Types, which, of course, doesn’t support curved geometries. I wonder if this will be fixed by RTM….

Tags: ,
July 29, 2011

sp_help_spatial_geography_histogram and Indexing geography data in SQL Server Denali

Working with spatial data involves many dark arts, and performance tuning spatial indexes is perhaps the blackest of all of them. Although there are only a few parameters to configure: GRIDS, CELLS_PER_OBJECT, and BOUNDING_BOX, the number of different index combinations possible and the way in which these settings interact with each other is pretty complex. This is particularly true when dealing with geography data, since even the fundamental concept of defining a “grid” becomes quite tricky when dealing with an ellipsoidal surface.

Fortunately, there are two new system stored procedures in SQL Server Denali that make the process of spatial index-tuning much simpler, by allowing you to recreate (and therefore visualise) polygons representing the decomposition of a set of data under different index settings. These are sp_help_spatial_geography_histogram, and its geometry sister, sp_help_spatial_geometry_histogram. If you Google either of these terms right now, you’ll get only 7 results (Hopefully after I publish this post there might be 8 results…). I find this pretty surprising because, to me, these procedures (particularly the geography flavour) are pretty awesome.

Why? Well, when creating spatial indexes you generally want each feature to be described using the greatest possible accuracy (in order to increase the primary filter efficiency), but requiring the least number of cells to do so (in order to ensure that features are fully tessellated without blowing the CELLS_PER_OBJECT limit, or the index becoming large and unwieldy). But, when you create a index on a spatial column, how do you know in which grid cell(s) a given feature lies?

For the geometry datatype, this is pretty easy. LOW resolution is a 4×4 grid, MEDIUM resolution corresponds to an 8×8 grid, and HIGH resolution is a 16×16 grid. There are four levels of nested grids. So, to work out the size of each cell at Level 1 in a MEDIUM geometry index, you take the extent of the index bounding box in both x and y dimensions, and divide by 8. If the next grid resolution is also MEDIUM, then each cell at Level 2 will be 1/8th of a Level 1 cell, i.e. 1/64th of the height and width of the bounding box of the index. etc. etc.

So, if you have a bounding box that covers a map of the whole globe, from (–180 –90) to (180 90), then the level 1 cells in a HIGH resolution grid look like this:

image

Since the geometry datatype operates on a simple, planar surface, it’s actually always been pretty easy to work out this grid of cells programmatically, and determine which (and, crucially for performance-tuning, how many) cells a given feature intersects. But, the sp_help_spatial_geometry_histogram  makes this even easier. You can now create the above grid for yourself using the following T-SQL:

CREATE TABLE #geometry_histogram_test (
  geom geometry
);
INSERT INTO #geometry_histogram_test VALUES (
  geometry::STPolyFromText('POLYGON((-180 -90, 180 -90, 180 90, -180 90, -180 -90))', 4326)
  );

DECLARE @histogram_results table (
CellID int,
Cell geometry,
IntersectionCount int
);
INSERT INTO @histogram_results
EXEC
sp_help_spatial_geometry_histogram 
@tabname=#geometry_histogram_test,
@colname=geom,
@xmin=-180,
@xmax=180,
@ymin=-90,
@ymax=90,
@resolution=16;

SELECT CELL.STBoundary().STBuffer(0.05) FROM @histogram_results;

Tweak the @resolution value to get the grids at a given level and resolution. For example, to determine the set of Level 1 cells at HIGH resolution, supply a @resolution parameter of 16. To determine the set of Level 2 cells in a MEDIUM, MEDIUM grid, the @resolution parameter should be (8*8) = 64. But what’s more – the output of the procedure won’t just create the grid for you, but will give you a count of each feature from the underlying table that intersects each feature. For this example, I just created a single polygon spanning the full extent of the bounding box so the intersection count of every cell is just 1, but for distribution analysis of a real dataset, this is very useful.

But what’s even more useful is the sp_help_spatial_geography_histogram counterpart, because, despite the explanation in the  MSDN spatial indexing overview, it’s always been hard to visualise exactly how an index grid is applied to an ellipsoidal surface. Now, using code very similar to the above, you can see exactly how the geography index cells are created (and, again, determine how many features intersect each cell). For example, the following code listing demonstrates the set of Level 1 HIGH geography index cells:

CREATE TABLE #histogram_test (
  geog geography
);

INSERT INTO #histogram_test VALUES (
  geography::STGeomFromText('FULLGLOBE', 4326)
  );

DECLARE @histogram_results table (
CellID int,
Cell geography,
IntersectionCount int
);

INSERT INTO @histogram_results
EXEC
sp_help_spatial_geography_histogram 
@tabname=#histogram_test,
@colname=geog,
@resolution=16;

SELECT CELL.RingN(1).STBuffer(10000) FROM @histogram_results;

I’ve run the procedure against a table containing a single FULLGLOBE geometry to ensure that every cell intersects some feature (cells that don’t intersect anything in the base table aren’t included in the output).  Since geography indexes have no bounding box and are implicitly assumed to tessellate the whole globe, these Level 1 cells will be the same for every HIGH Level 1 geography index. And here’s what they look like:

image

And the Level 3 cells in a LOW, LOW, LOW geography grid (at which point, the resolution is 4*4*4 = 64)? They look slightly more like this:

image

Pretty, huh? But more importantly, these procedures make visualising, analysing, and ultimately tuning geography indexes much easier.

July 26, 2011

Determining the Geographic Extent of Spatial Features in a SQL Server Table

I’ve been using the SQL Server Denali CTP(s) for just over 6 months now. As a result, I’ve become accustomed to some of the new features and forget that they aren’t generally available in production environments yet… it always comes as a bit of a surprise when you have to do some work on a SQL Server 2008/R2 database and suddenly discover half of the feature you’ve come to rely on aren’t there any more (I pity those consultants who turn up at a client’s site to find SQL Server 2000/7/6.5 instances – fortunately, if you’re dealing with spatial data, this doesn’t really happen ;)

One such example happened today, when I was trying to determine the extent of the features in a column of spatial data, in order to create a bounding box for a spatial index covering all the data. In SQL Server Denali, this can be done very easily using an EnvelopeAggregate. Although not documented, any envelope created by SQL Server (i.e. using STEnvelope() or EnvelopeAggregate()) will be always have its first point at the bottom left of the envelope, and its third point at the opposing, top right corner. You can therefore calculate the minimum and maximum extents of the features in the MyGeomColumn of geometry data as follows:

SELECT
  geometry::EnvelopeAggregate(MyGeomColumn).STPointN(1).STX AS MinX,
  geometry::EnvelopeAggregate(MyGeomColumn).STPointN(1).STY AS MinY,
  geometry::EnvelopeAggregate(MyGeomColumn).STPointN(3).STX AS MaxX,
  geometry::EnvelopeAggregate(MyGeomColumn).STPointN(3).STY AS MaxX
FROM BigOlTableOfData;

So what do you do in SQL Server 2008/R2, where the EnvelopeAggregate() is not available? Here’s a couple of options…

Option #1 : With a Cursor

Well, you could come up with your own envelope aggregate using a (dreaded) cursor. Something like this:

DECLARE
  @geom geometry,
  @geomaggregate geometry = 'GEOMETRYCOLLECTION EMPTY';
DECLARE @i int = 0;

DECALRE geocursor cursor fast_forward
FOR SELECT MyGeomColumn FROM BigOlTableOfData;
OPEN geoCursor;
FETCH NEXT FROM geoCursor INTO @geom;
WHILE @@FETCH_STATUS = 0 BEGIN
  SET @geomaggregate = @geomagregate.STUnion(@geom);
  FETCH NEXT FROM geoCursor INTO @geom;
  SET @i += 1;
END;
CLOSE geocursor;
deallocate geoCursor;

Then, create an envelope around @geomaggregate using STEnvelope(), and select the first and third points as before.

Option #2 : CLR Function

Alternatively, you could choose not to explicitly create the envelope, but create a CLR function that looped through all the points of each geometry (using STPointN() and STNumPoints()) , and keep track of the minimum and maximum X and Y values in the loop. Something like this:

double xMin = double.MaxValue;
double xMax = double.MinValue;
double yMin = double.MaxValue;
double yMax = double.MinValue;

// Loop through each geometry in the dataset
foreach(SqlGeometry geom in MyGeomColumn) {

  // Loop through each point in this geometry
  for(int i=1; i<= geom.STNumPoints(); i++)
  {
    SqlGeometry point = geom.STPointN(i);

    // Check whether this point is a new min/max value
    if (point.X < xMin)
    { xMin = point.X; }
    else if (point.X > xMax)
    { xMax = point.X; }
  
    if (point.Y < yMin)
    { yMin = point.Y; }
    else if (point.Y > yMax)
    { yMax = point.Y; }
  }
}

Option #3 : CTE

Here’s a nice alternative using two CTEs, which I first saw described here. The first CTE creates a table, ENVELOPE, containing the individual envelopes of each feature in the table. The second table selects just the corner points of each of those envelopes, with a UNION ALL query (note that this is a regular UNION, not a STUnion()).

Then, you simply select the minimum and maximum X and Y values of any of the points listed in the CORNERS table:

WITH
  ENVELOPE as 
  (
    SELECT MyGeomColumn.STEnvelope() as envelope from BigOlTableOfData
  ),
  CORNERS as 
  (
    SELECT envelope.STPointN(1) as point from ENVELOPE 
    UNION ALL select envelope.STPointN(3) from ENVELOPE
  )

SELECT
  MIN(point.STX) as MinX,
  MIN(point.STY) as MinY,
  MAX(point.STX) as MaxX,
  MAX(point.STY) as MaxY 
FROM
  CORNERS;

Option #4 : Persisted Envelopes

Another option, and the one I chose to use today, is to alter your table structure to include computed columns containing the minimum and maximum X and Y values of the envelope around each feature on a per-row basis. I chose to make my columns PERSISTED, so that the computed values would actually be materialised and saved in the database rather than computed on-the-fly:

ALTER TABLE BigOlTableOfData
ADD
  MinX AS (CONVERT(int, MyGeomColumn.STEnvelope().STPointN((1)).STX, 0)) PERSISTED,
  MinY AS (CONVERT(int, MyGeomColumn.STEnvelope().STPointN((1)).STY, 0)) PERSISTED,
  MaxX AS (CONVERT(int, MyGeomColumn.STEnvelope().STPointN((3)).STX, 0)) PERSISTED,
  MaxY AS (CONVERT(int, MyGeomColumn.STEnvelope().STPointN((3)).STY, 0)) PERSISTED;

This requires a little bit more processing up-front, (and on each additional INSERT into the table), but the data in my table was going to remain relatively static. The advantage of this approach is that you can now work out the full extent of values in the dataset at any time quickly and simply by:

SELECT
  MIN(MinX) AS MinX,
  MIN(MinY) AS MinY,
  MAX(MaxX) AS MaxX,
  MAX(MaxY) AS MaxY;

Seeing as these are just integer columns, this is much quicker than relying on the more complex spatial methods, of having to employ any looping-type behaviour. You can also add an index to the integer coordinate values in the computed columns, which might enable you to perform faster search and retrieval queries of e.g. values lying within a rectangular area of interest (such as what features are visible in a given Bing Maps view) than using the more powerful, but frequently slower, built-in spatial indexes.

Follow

Get every new post delivered to your Inbox.

Join 53 other followers