Posts tagged ‘OGR2OGR’

September 29, 2011

Exporting Spatial Data From SQL Server to ESRI Shapefile

I love the site stats you get from WordPress. This morning I noticed that, in the 9 months that I’ve been writing this blog, nearly 200 people have come to my site after searching for the term “OGR2OGR” in a search engine. That’s more than any other single word search term. Around another 200 people have come here after searching for variations and combinations on this theme, including “OGR2OGR Sql Server”, “OGR2OGR MSSqlSpatial”, “OGR SQL”, “OGR Import shapefile” etc.

What I’m not sure, though, is how to interpret this statistic or best act upon it. Does it mean, for example, that my articles about OGR2OGR – the open source toolkit for spatial data conversion and manipulation – are helpful, or high quality? Sure, several of them get listed on the first page of results for a Google search, but does that simply mean that nobody else on the internet is writing about OGR2OGR? And do the people that come here searching for information about OGR2OGR actually find what they’re looking for, or do they end up leaving empty-handed?

I don’t know the answers to these questions, but since there are obviously some people out there looking for this kind of information (and since I know, from experience, that OGR2OGR can be an absolute bugger to get working correctly) here’s another post about OGR2OGR anyway… and this time it’s about exporting spatial data from SQL Server. There’s plenty of information on the ‘net describing how to import spatial data from ESRI shapefiles into SQL Server (including my own posts, such as here, here and here, for example). However, what I haven’t seen any examples of yet is how to do the reverse: taking geometry or geography data from SQL Server and dumping it into an ESRI shapefile.

There’s plenty of reasons why you might want to do this – despite its age and relative limitations (such as the maximum filesize per file, limited fieldname length, the fact that each file can contain only a single homogenous type of geometry, etc…), the ESRI shapefile format is still largely the industry standard and pretty much universally read by all spatial applications. Recently, I needed to export some data from SQL Server to shapefile format so that it could be rendered as a map in mapnik, for example (which, sadly, still can’t directly connect to a MS SQL Server database).

So, here’s a step-by-step guide, including the pitfalls to avoid along the way.

Setup

To start with, make sure you’ve got a copy of OGR2OGR version 1.8 or greater (earlier versions do not have the MSSQL driver installed). You can either build it from source supplied on the GDAL page or, for convenience, download and install pre-compiled windows binaries supplied as part of the OSgeo4W package.

Now, let’s set up some test data in a SQL Server table that we want to export. To test the full range of OGR2OGR features (or should that say, “the full range of error messages you can create”?), I’m going to create a table that contains two different geometry columns – an original geometry and a buffered geometry, populated using a range of different geometry types:

CREATE TABLE OGRExportTestTable (
  shapeid int identity(1,1),
  shapename varchar(32),
  shapegeom geometry,
  bufferedshape AS shapegeom.STBuffer(1),
  bufferedshapearea AS shapegeom.STBuffer(1).STArea()
);

INSERT INTO OGRExportTestTable (shapename, shapegeom) VALUES
('Point #1', geometry::STGeomFromText('POINT(13 10)', 2199)),
('Point #2', geometry::STGeomFromText('POINT(7 12)', 2199)),
('Line #1', geometry::STGeomFromText('LINESTRING(0 0, 8 4)', 2199)),
('Polygon #1', geometry::STGeomFromText('POLYGON((2 2, 4 2, 4 4, 2 4, 2 2))', 2199)),
('Line #2', geometry::STGeomFromText('LINESTRING(0 10, 10 10)', 2199));

Here’s what the contents of this table looks like in the SSMS Spatial Results tab:

image

Exporting from SQL Server to Shapefile with OGR2OGR (via a string of errors along the way)

The basic pattern for OGR2OGR usage is given at http://www.gdal.org/ogr2ogr.html, with additional usage options for the SQL Server driver at http://www.gdal.org/ogr/drv_mssqlspatial.html. So, let’s start by just trying out a basic example to export the entire OGRExportTestTable from my local SQLExpress instance to a shapefile at c:\temp\sqlexport.shp, as follows: (change the connection string to match your server/credentials as appropriate)

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport.shp"
"MSSQL:server=localhost\sqlexpress;database=tempdb;tables=OGRExportTestTable;trusted_connection=yes;"

This will fail with a couple of errors, but the first one to address is as follows:

ERROR 1: Attempt to write non-point (LINESTRING) geometry to point shapefile.

SQL Server will allow you to mix different types of geometry (Points, LineStrings, Polygons) within a single column of geometry or geography data. An ESRI shapefile, in contrast, can only contain a single homogenous type of geometry. To correct this, rather than trying to dump the entire table by specifying the tables=OGRExportTable in the connection string, we’ll have to manually select rows of only a certain type of geometry at a time by specifying an explicit SQL statement. Let’s start off by concentrating on the points only. This can be done with the –sql option, as follows:

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport.shp"
"MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;"
-sql "SELECT * FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'POINT'"

This time, a new error occurs:

ERROR 1: C:\temp\sqlexport.shp is not a directory.

ESRI Shapefile driver failed to create C:\temp\sqlexport.shp

Although our first attempt to export the entire table failed, it still created an output file at C:\temp\sqlexport.shp. Seeing as we didn’t specify the behaviour for what to do when the output file already exists, when we run OGR2OGR for the second time it has now errored. To correct this, we’ll add the –overwrite flag.

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport.shp"
"MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;"
-sql "SELECT * FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'POINT'"
-overwrite

Running again and the error is now:

ERROR 6: Can't create fields of type Binary on shapefile layers.

Ok, so remember that our original shapefile had two geometry columns. When you create a shapefile, one column is used to populate the shape information itself, while every other column becomes an attribute of that shape, stored in the associated .dbf file. Since the remaining geometry column is a binary value, this can’t be stored in the .dbf file. To resolve this, rather than using a SELECT *, we’ll explicitly specify each column to be included in the shapefile. You could, if you want, then omit the second geometry column completely from the list of selected fields. Instead, I’ll use the ToString() method to convert it to WKT, which can then be stored as an attribute value:

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport.shp"
"MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;"
-sql "SELECT shapeid, shapename, shapegeom, bufferedshape.ToString(), bufferedshapearea  FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'POINT'"
-overwrite

This should now run without errors, although you’ll still get a warning:

Warning 6: Normalized/laundered field name: ‘bufferedshapearea’ to bufferedsh

The names of any attribute fields associated with a shapefile can only be up to a maximum of 10 characters in length. In this case, OGR2OGR has manually truncated the name of the buferedshapearea column for us, but you might not like to use the garbled “bufferedsh” as an attribute name in your shapefile. A better approach would be to specify an alias for long column names in the –sql statement itself. In this case, perhaps just “area” will suffice:

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport.shp"
"MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;"
-sql "SELECT shapeid, shapename, shapegeom, bufferedshape.ToString(), bufferedshapearea AS area  FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'POINT'"
-overwrite

And now, for the first time, you should be both error and warning free:

image

So, now, just repeat the above but substituting the POINT, LINESTRING, and POLYGON geometry types, and creating three separate corresponding shapefiles:

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport_point.shp" "MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;" -sql "SELECT shapeid, shapename, shapegeom, bufferedshape.ToString(), bufferedshapearea AS area  FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'POINT'" -overwrite

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport_linestring.shp" "MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;" -sql "SELECT shapeid, shapename, shapegeom, bufferedshape.ToString(), bufferedshapearea AS area  FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'LINESTRING'" -overwrite

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport_polygon.shp" "MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;" -sql "SELECT shapeid, shapename, shapegeom, bufferedshape.ToString(), bufferedshapearea AS area  FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'POLYGON'" -overwrite

All done, right? Let’s just have a quick check on the created files to make sure they look ok. You can do this using ogrinfo with the –al option, which will give you a summary of the elements contained in any spatial data set. I mean, I’m sure they’re fine and everything, but…. hang on a minute:

ogrinfo -al c:\temp\sqlexport_point.shp

Here’s the POINT shapefile:

image

Initially looks ok, the shapefile contains 2 Point features – they’ve got the correct attribute fields, but both points seem to have been incorrectly placed at coordinates of POINT(0.0 2.0). Huh?

What about the LINESTRING shapefile:

image

This is even worse – even though we specified that only LineString geometries should be returned from the SQL query, the created shapefile thinks it contains 2 Point features (Geometry: Point, near the top). And those points both lie at POINT (0.0 0.0)…

And the POLYGON shapefile:

image

Same problem as the LineString – it’s effectively an empty Point shapefile.

The problem seems to be that, even though we’re only returning geometries of a certain type from the SQL query, we haven’t explicitly stated that to the shapefile creator, so OGR2OGR is creating three empty shapefiles first (each set up to receive the default geometry type of POINT), and then trying to populate them with unmatching shape types, leading to corrupt data. To explicitly state the geometry type of the shapefiles created, we need to supply the SHPT layer creation option for each shapefile as specified at http://www.gdal.org/ogr/drv_shapefile.html, by adding –lco “SHPT=POLYGON”, –lco “SHPT=ARC” (for LineStrings) etc. as follows:

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport_point.shp"
"MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;"
-sql "SELECT shapeid, shapename, shapegeom, bufferedshape.ToString(), bufferedshapearea AS area  FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'POINT'"
-overwrite
-lco "SHPT=POINT"

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport_linestring.shp"
"MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;"
-sql "SELECT shapeid, shapename, shapegeom, bufferedshape.ToString(), bufferedshapearea AS area  FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'LINESTRING'"
-overwrite
-lco "SHPT=ARC"

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport_polygon.shp" "MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;"
-sql "SELECT shapeid, shapename, shapegeom, bufferedshape.ToString(), bufferedshapearea AS area  FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'POLYGON'"
-overwrite
-lco "SHPT=POLYGON"

Unfortunately, that brings us round to almost exactly the same error as we first started with when creating the LineString and Polygon shapefiles (Gah! I thought we’d got rid of them!):

ERROR 1: Attempt to write non-linestring (POINT) geometry to ARC type shapefile

ERROR 1: Attempt to write non-polygon (POINT) geometry to POLYGON type shapefile

We are only selecting geometries of the matching type for each shapefile by filtering the query based on STGeometryType(), so why does OGR2OGR think that we are selecting other types of geometries? What’s more, we haven’t yet explained why the point shapefile (which was, after all, being populated only with point geometries), incorrectly placed both points at coordinates POINT(0.0 2.0). It seems that something is corrupting the results of the SQL statement.

And here’s the “Ta-dah!” moment. According to http://www.gdal.org/ogr/drv_mssqlspatial.html, when retrieving spatial data from SQL Server, “The default [GeometryFormat] value is 'native', in this case the native SqlGeometry and SqlGeography serialization format is used”. However, this doesn’t actually appear to hold true. SQL Server stores geometry and geography data in a format very similar to, but slightly different from Well-Known Binary (WKB). The SQL Server binary values for the two points in the OGRExportTestTable are:

0x00000000010C0000000000002A400000000000002440
0x00000000010C0000000000001C400000000000002840

The Well-Known Binary of these two points is , instead, as follows:

0x01010000000000000000002A400000000000002440
0x01010000000000000000001C400000000000002840

As you can see, they’re very similar – the coordinate values are serialised as 8-byte floating point binary values in both cases, but the MSSQL Server native serialisation has a different and slightly longer (i.e. one byte more) header. Thus, if OGR2OGR is expecting to receive one type of data, but actually gets the other, all the bytes will be displaced slightly. This could explain both why OGR believed it was receiving the incorrect geometry types and also why the point coordinates were wrong.

To correct this, rather than retrieving the shapegeom value directly, use the STAsBinary() method in the SQL statement to retrieve the WKB of the shapegeom column instead.

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport_point.shp"
"MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;"
-sql "SELECT shapeid, shapename, shapegeom.STAsBinary(), bufferedshape.ToString(), bufferedshapearea AS area  FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'POINT'"
-overwrite
-lco "SHPT=POINT"

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport_linestring.shp"
"MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;"
-sql "SELECT shapeid, shapename, shapegeom.STAsBinary(), bufferedshape.ToString(), bufferedshapearea AS area  FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'LINESTRING'"
-overwrite
-lco "SHPT=ARC"

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport_polygon.shp" "MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;"
-sql "SELECT shapeid, shapename, shapegeom.STAsBinary(), bufferedshape.ToString(), bufferedshapearea AS area  FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'POLYGON'"
-overwrite
-lco "SHPT=POLYGON"

Right, we are definitely making progress now. Trying ogrinfo again reveals that all of the layers have the correct number of features, or the appropriate type, and all have the right coordinate values. Yay!

image

There’s just one nagging thing and that’s the Layer SRS WKT: (unknown). When we converted from the SQL Server native serialisation to Well-Known Binary, we lost the metadata of the spatial reference identifier (SRID) associated with each geometry. This information contains the details of the datum, coordinate reference system, prime meridian etc. that make the coordinates of each geometry relate to an actual place on the earth’s surface. In the shapefile format, this information is contained in a .PRJ file that usually accompanies each .SHP file but, since we haven’t supplied this information to OGR2OGR, none of the created shapefiles currently have associated .PRJ files, so ogrinfo reports the spatial reference system as “unknown”.

To create a PRJ file, we need to append the –a_srs option, supplying the same EPSG id as was supplied when creating the original geometry instances in the table. (Of course, if you have instances of more than one SRID in the same SQL Server table, you’ll have to split these out into separate shapefiles, just as you have to split different geometry types into separate shapefiles)

The Final Product

So here’s the final script that will separate out points, linestrings, and polygons from a SQL Server table into separate shapefiles, via the WKB format but maintaining the SRID of the original instance, and retaining all other columns as attribute values in the .dbf file. Tested and confirmed to work with columns of either the geometry or geography datatype in SQL Server 2008/R2, or SQL Server Denali CTP3.

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport_point.shp"
"MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;"
-sql "SELECT shapeid, shapename, shapegeom.STAsBinary(), bufferedshape.ToString(), bufferedshapearea AS area  FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'POINT'"
-overwrite
-lco "SHPT=POINT"
-a_srs "EPSG:2199"

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport_linestring.shp"
"MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;"
-sql "SELECT shapeid, shapename, shapegeom.STAsBinary(), bufferedshape.ToString(), bufferedshapearea AS area  FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'LINESTRING'"
-overwrite
-lco "SHPT=ARC"
-a_srs "EPSG:2199"

ogr2ogr -f "ESRI Shapefile" "C:\temp\sqlexport_polygon.shp"
"MSSQL:server=localhost\sqlexpress;database=tempdb;trusted_connection=yes;"
-sql "SELECT shapeid, shapename, shapegeom.STAsBinary(), bufferedshape.ToString(), bufferedshapearea AS area  FROM OGRExportTestTable WHERE shapegeom.STGeometryType() = 'POLYGON'"
-overwrite
-lco "SHPT=POLYGON"
-a_srs "EPSG:2199"

And here’s the resulting shapefile layers loaded into qGIS:

image

July 1, 2011

Loading OS VectorMap data to SQL Server with Powershell and OGR2OGR

I’ve already written several posts about loading spatial data into SQL Server – using OGR2OGR, or Shape2SQL, for example. In this post, I’m going to demonstrate how you can call OGR2OGR from a PowerShell script in order to loop through and load the entire set of OS VectorMap layers into SQL Server in a few simple lines of script. Note that you’ll need the latest version of OGR2OGR (1.8) in order to play along at home, since I’ll be using the direct MSSQLSpatial driver introduced in that version.

Create the Tables

OS Vectormap data is supplied in a number of different layers, including roads, natural features, settlement areas, railway lines etc. Some of these layers represent point data, some polylines, and some polygons. (Note that there are no multi_xxx features, and a given layer contains only features of one type). Unlike some spatial formats (such as ESRI shapefiles), SQL Server lets you mix ‘n’ match different types of geometry within the same column. You could, if you wanted to, merge all the OS VectorMap features into a single table. However, since there are different additional attributes associated with each type of feature, it generally makes sense to define separate tables for each feature layer.

OGR2OGR can create tables for each feature layer in SQL Server directly, creating geometry or geography columns to hold spatial data itself and additional columns to hold any non-spatial attributes in the source data file. However, I find this to be quite unreliable; sometimes, the field lengths of the columns in the table created by OGR2OGR are set too small (e.g. numeric(5,2)), causing attribute data inserted into the table to be truncated or lead to an overflow error. Sometimes, column field lengths are set too large, which is wasteful (i.e. using nvarchar(max) to store a fixed-length two-byte string). And, sometimes, the datatype chosen for a column is just plain wrong. i.e. creating a float column to store a set of purely integer values.

So, I’d always recommend that you create the destination tables for each layer manually, specifying the correct column datatypes yourself before using OGR2OGR to load data into it. The following script can be used to create individual tables for each of the feature layers in the OS Vector Map dataset – road/railway/settlement layer/heritage/community services etc. – including all associated attribute values. Notice that I’m using the geometry datatype, and I’ve named the spatial column geom27700, indicating that all the data inserted in this column will be projected data using the EPSG:27700 spatial reference system (the National Grid of Great Britain, as used by all Ordnance Survey data):

CREATE TABLE dbo.administrativeboundary(
	ogr_fid int IDENTITY(1,1) NOT NULL,
	geom27700 geometry NULL,
	featdesc varchar(255) NULL,
	featcode int NULL,
 CONSTRAINT PK_administrativeboundary PRIMARY KEY CLUSTERED 
(
	ogr_fid ASC
));

CREATE TABLE dbo.road_line(
	ogr_fid int IDENTITY(1,1) NOT NULL,
	geom27700 geometry NULL,
	featdesc varchar(254) NULL,
	featcode int NULL,
	name varchar(254) NULL,
	number varchar(30) NULL,
 CONSTRAINT PK_road_line PRIMARY KEY CLUSTERED 
(
	ogr_fid ASC
));

CREATE TABLE dbo.railway_line(
	ogr_fid int IDENTITY(1,1) NOT NULL,
	geom27700 geometry NULL,
	featdesc varchar(254) NULL,
	featcode int NULL,
 CONSTRAINT PK_railway_line PRIMARY KEY CLUSTERED 
(
	ogr_fid ASC
));

CREATE TABLE dbo.communityservices(
	ogr_fid int IDENTITY(1,1) NOT NULL,
	geom27700 geometry NULL,
	featdesc varchar(254) NULL,
	featcode int NULL,
	name varchar(254) NULL,
 CONSTRAINT PK_communityservices PRIMARY KEY CLUSTERED 
(
	ogr_fid ASC
));

CREATE TABLE dbo.height(
	ogr_fid int IDENTITY(1,1) NOT NULL,
	geom27700 geometry NULL,
	featdesc varchar(254) NULL,
	featcode int NULL,
	height varchar(254) NULL,
 CONSTRAINT PK_height PRIMARY KEY CLUSTERED 
(
	ogr_fid ASC
));

CREATE TABLE dbo.heritage(
	ogr_fid int IDENTITY(1,1) NOT NULL,
	geom27700 geometry NULL,
	featdesc varchar(254) NULL,
	featcode int NULL,
	name varchar(254) NULL,
 CONSTRAINT PK_heritage PRIMARY KEY CLUSTERED 
(
	ogr_fid ASC
));

CREATE TABLE dbo.naturalfeature_area(
	ogr_fid int IDENTITY(1,1) NOT NULL,
	geom27700 geometry NULL,
	featdesc varchar(254) NULL,
	featcode int NULL,
 CONSTRAINT PK_naturalfeature_area PRIMARY KEY CLUSTERED 
(
	ogr_fid ASC
));

CREATE TABLE dbo.naturalfeature_line(
	ogr_fid int IDENTITY(1,1) NOT NULL,
	geom27700 geometry NULL,
	featdesc varchar(254) NULL,
	featcode int NULL,
 CONSTRAINT PK_naturalfeature_line PRIMARY KEY CLUSTERED 
(
	ogr_fid ASC
));
CREATE TABLE dbo.railway_point(
	ogr_fid int IDENTITY(1,1) NOT NULL,
	geom27700 geometry NULL,
	featdesc varchar(254) NULL,
	featcode int NULL,
	name varchar(30) NULL,
 CONSTRAINT PK_railway_point PRIMARY KEY CLUSTERED 
(
	ogr_fid ASC
));

CREATE TABLE dbo.settlement_line(
	ogr_fid int IDENTITY(1,1) NOT NULL,
	geom27700 geometry NULL,
	featdesc varchar(254) NULL,
	featcode int NULL,
 CONSTRAINT PK_settlement_line PRIMARY KEY CLUSTERED 
(
	ogr_fid ASC
));

CREATE TABLE dbo.settlement_area(
	ogr_fid int IDENTITY(1,1) NOT NULL,
	geom27700 geometry NULL,
	featdesc varchar(254) NULL,
	featcode int NULL,
 CONSTRAINT PK_settlement_area PRIMARY KEY CLUSTERED 
(
	ogr_fid ASC
));

CREATE TABLE dbo.text(
	ogr_fid int IDENTITY(1,1) NOT NULL,
	geom27700 geometry NULL,
	featdesc varchar(254) NULL,
	featcode int NULL,
	name varchar(254) NULL,
	xml_name varchar(254) NULL,
	fonttype varchar(150) NULL,
	fontcolour varchar(30) NULL,
	fontheight varchar(30) NULL,
	orientatio numeric(30, 1) NULL,
 CONSTRAINT PK_text PRIMARY KEY CLUSTERED 
(
	ogr_fid ASC
));

CREATE TABLE dbo.tidalboundary(
	ogr_fid int IDENTITY(1,1) NOT NULL,
	geom27700 geometry NULL,
	featdesc varchar(254) NULL,
	featcode int NULL,
 CONSTRAINT PK_tidalboundary PRIMARY KEY CLUSTERED 
(
	ogr_fid ASC
));

Loading the geometry data with Powershell

The OS VectorMap dataset is cut into grid squares 100km x 100km across, denoted by a two letter combination, e.g. TG. Each of these squares are further subdivided into 100 10km x 10km squares, denoted by a two letter and two number combination e.g. TG20. So the directory structure of the OS VectorMap data looks like this:

image

To load the complete set of data, you need to loop through all subdirectories starting from a given base directory, then call OGR2OGR to append the data in each shapefile found therein into the appropriate feature table based on its filename. Note that not every feature layer will be found in every subdirectory (you won’t, for example, find many tidal features in Derbyshire).

And here’s a powershell script to do that loading for you. Just change the name of the base directory from which to search (c:\osdata), the path to ogr2ogr.exe (c:\warmerda\bld\bin), and the connection string to your SQL Server instance as appropriate:

get-childitem c:\osdata -include *.shp -recurse | foreach ($_) {
  echo "Now loading: " $_.fullname
  C:\warmerda\bld\bin\ogr2ogr -progress -append -f "MSSQLSpatial" "MSSQL:server=zangief\SQLEXPRESS;database=OSVectorMap;trusted_connection=yes;" $_.fullname -a_srs "EPSG:27700" -lco "GEOM_TYPE=geometry" -lco "GEOM_NAME=geom27700"
}

The layer creation options (-lco) supplied to ogr2ogr specify that this is geometry data in the EPSG:27700 coordinate reference system, and should be loaded into the geom27700 of each table. The other attribute values of each feature will be copied straight into the corresponding columns. Execute the script and, if all goes to plan, you should see the following as the files start to load:

image

Note that loading the full set of OS Vectormap will take some time so, at this point, I’d go and get a coffee or browse facebook or something. Then, when you return, you should be in possession of one table containing the combined data for each OS VectorMap feature layer, a bit like this:

image

June 18, 2011

Importing Spatial Data to SQL Server with OGR2OGR–now even easier!

A few months back, I posted an article explaining how to import spatial data into SQL Server 2008 from any format supported by the OGR library (including ESRI shapefiles, GML, and TIGER data), using OGR2OGR. That article was written using OGR2OGR from v1.7 of the GDAL 1.7 library, which doesn’t support SQL Server 2008 directly, so I instead used OGR2OGR to create a CSV file containing spatial data in Well-Known Text format and then parsed that data in SQL Server using the STGeomFromText() method.

The good news is that things have become a bit easier since then, and version 1.8 of the GDAL library now has a MSSQLSpatial driver that can interface directly with geometry and geography data in SQL Server 2008.

The bad news is that most of the places that offer pre-compiled GDAL binaries for Windows have yet to update to the new version. FWTools, for example, still comes packaged only with v1.7. Likewise, the osgeo download site at http://download.osgeo.org/gdal/win32/ also lists GDAL versions up to v1.7.

So, if you want to get hold of the latest GDAL to import directly into SQL Server you’ll have to build it yourself from source, which can be downloaded from http://download.osgeo.org/gdal/gdal180.zip

Fortunately, the source has been very considerately packaged, and includes solution files that will build GDAL out-of-the-box in VS2005, 2008, and 2010. Simply load the .sln file, click build, and wait a few minutes:

Building GDAL 1.8 in VS2010

Then, if you look in the output directory (\warmerda\bld\bin, by default) you should see a lovely collection of utilities for working with spatial data – GDAL (for working with raster data), and OGR (for its vector sibling).

Here’s the output of calling ogr2ogr –formats, which retrieves the list of supported vector spatial formats – note the MSSQLSpatial format supported for both read/write:

image 

Example usage to load a shapefile to SQL Server as follows:

ogr2ogr -overwrite -f MSSQLSpatial "MSSQL:server=.\MSSQLSERVER2008;database=spatial;trusted_connection=yes" "TG20.shp"

And here’s some lovely OS vector shape data in SQL Server:

image

Follow

Get every new post delivered to your Inbox.

Join 53 other followers