Splitting Multi Geometries into single geometries

Unlike some spatial formats/systems (such as the ESRI shapefile), SQL Server lets you store different types of geometry within a single column. That is to say that a single geometry or geography column within a SQL Server table may contain Points, LineStrings, Polygons, or Multi-elements of those types. Since every individual item of geometry data defines its own SRID, it is even possible to define items of different SRIDs within the same column.

Whilst in theory I approve of this flexibility to define the structure of your spatial data, in practice I rarely see situations in which it is suitable to store different types of geometries within the same column, i.e. Points and LineStrings. And I have never heard a good argument why you would want to store items with different SRIDs in the same column. In fact, I normally place a constraint on my columns to ensure that I only insert geometries of the same SRID – anything else is just asking for trouble.

However, there are many situations where it makes sense to mix both single items and collections of the same type of geometry. i.e. a table of rivers or roads would frequently need to store both LineStrings and MultiLineStrings. A table of countries would probably contain both Polygons (i.e. single land masses – Poland, Algeria, Mongolia etc.) and MultiPolygons (any country with separate islands – UK, Japan, New Zealand etc.). In other words, each item of data would have the same number of dimensions – 0, 1, or 2.

Consider the following table, which contains four rows of data – two MultiPolygons (a set of squares and a set of triangles), and two Polygons (both rectangles).


CREATE TABLE #MixedData (id int, geom geometry);
INSERT INTO #MixedData VALUES
(1, 'MULTIPOLYGON(((0 0, 2 0, 2 2, 0 2, 0 0)), ((6 6, 10 6, 10 10, 6 10, 6 6)), ((18 18, 48 18, 48 48, 18 48, 18 18)), ((60 60, 100 60, 100 100, 60 100, 60 60)))'),
(2, 'POLYGON((30 6, 100 6, 100 10, 30 10, 30 6))'),
(3, 'POLYGON((36 60, 42 60, 42 90, 36 90, 36 60))'),
(4, 'MULTIPOLYGON(((5 30, 10 35, 5 40, 5 30)), ((18 60, 18 80, 10 70, 18 60)), ((80 20, 90 20, 85 25, 80 20)))');

SELECT id, geom FROM #MixedData

The table looks like this:

image

And as shown on the spatial results tab below – each element from the same multi-geometry is shown in the same colour:

image

Now suppose you wanted to “break apart” the MultiPolygons in this table into single elements, so that we just had a table of 9 separate polygons. What would you do?

Method 1: Using a User-Defined Function

You can create a user-defined function that splits a provided multigeometry into its component elements as follows:

CREATE FUNCTION dbo.SplitMultiGeometry (@MultiGeom geometry)
RETURNS @Table TABLE( Geom geometry )
BEGIN
DECLARE @n int = 1;
WHILE (@n <= @MultiGeom.STNumGeometries())
BEGIN
INSERT INTO @Table VALUES(@MultiGeom.STGeometryN(@n));
SET @n = @n + 1;
END
RETURN
END
GO

This function operates on an individual supplied geometry and returns a table containing each element it contains (for single element Points, Linestrings, and Polygons, it simply returns the original supplied value). You can use it as follows:

DECLARE @g geometry = 'MULTIPOINT(0 0, 2 4, 10 7)';

SELECT geom.STAsText() FROM SplitMultiGeometry(@g);

Which returns a table with three rows as follows:

POINT (0 0)
POINT (2 4)
POINT (10 7)

To use this function to split an entire column of values, you can then use the CROSS APPLY operator, as follows:

SELECT
id,
split.geom
FROM
#MixedData
CROSS APPLY (
SELECT geom FROM SplitMultiGeometry(geom)
) split

Note that, even though I’ve split each individual geometry out,  I’ve retained the id column from the original table so that you can still attribute which “parent” element each child was derived from, as shown below:

image

And, on the spatial results tab:

image

If you’re using the spatial types in a .NET application, you can also easily adapt the logic used in this function to operate on (and return) a List<> of SqlGeometry or SqlGeography values instead of a table.

public List<SqlGeometry> SplitMultiGeometry(SqlGeometry MultiGeom) {
 List<SqlGeometry> splitGeometries = new List<SqlGeometry>();
 int numGeometries = (int)MultiGeom.STNumGeometries();

 for (int n = 1; n <= numGeometries; n++)
 {
splitGeometries.Add(MultiGeom.STGeometryN(n));
 }
 return splitGeometries;
 }

Method 2: Using a Numbers Table

Shortly after I first published this article, Simon Sabin wrote to me suggesting that an alternative approach would be to use a numbers table and, of course, he’s absolutely correct ;)

To do so, first create a table containing the numbers from 1 to 100,000 (which should be plenty to accommodate the number of individual elements in a multielement geometry), as follows:

SELECT TOP 100000 IDENTITY(int,1,1) AS n
INTO Numbers
FROM MASTER..spt_values a, MASTER..spt_values b;

You can then join from your spatial table to the numbers table to retrieve each individual geometry using a query as follows:

SELECT
 id,
 geom.STGeometryN(Numbers.n)
FROM
 #MixedData JOIN Numbers
 ON Numbers.n <= geom.STNumGeometries()

The results obtained will be identical to those from the first method, although since it’s a purely set-based solution, you may well get better performance over large datasets than using the CROSS APPLY method. Thankyou to Simon for the suggestion.

About these ads
This entry was posted in SQL Server and tagged , , , . Bookmark the permalink.

One Response to Splitting Multi Geometries into single geometries

  1. Pingback: Tweets that mention Splitting Multi Geometries into single geometries | Alastair Aitchison -- Topsy.com

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s