CaSe-SEnsItIvtY when Handling Multiple Spatial Columns in Safe FME Transformations

Some spatial data formats can store only a single logical set of features. Any given ESRI shapefile, for example, can store only POINTs, LINESTRINGs, or POLYGONs – not a mixture of each type of geometry. What’s more, all of those geometries must all be defined using the same spatial reference system, as defined in the associated .PRJ file.

Some other formats are more flexible. A SQL Server table, for example, can have many separate geometry or geography columns, and the values can contain both a mixture of geometry types, and even different spatial reference systems within the same column. I frequently use multiple spatial columns as a way of storing pre-calculated geometries relating to each feature at different levels of approximation, or different coordinate systems. For example, suppose that I have a table of features that were originally supplied in EPSG:27700 (OS National Grid of Great Britain). In addition to keeping the original geometry, I re-project those features into EPSG:4326 (WGS84) for displaying on Bing Maps. I also frequently simplify those features (using Reduce()) to create simpler approximations for display at far zoom levels. I keep each of these modified versions of the geometry in separate columns of the same table, named according to the datatype, SRID, and any modifications that have been performed, such as this:

CREATE TABLE (
  FeatureID int,
  geom27700 geometry,
  geog4326 geography,
  geog4326_reduce1000 geography
);

I came across a situation earlier today trying out Safe FME against a table with this structure that held me up for a few hours, so I thought I’d write about it here.

If you create a Feature Reader from a SQL Server table that has only a single geometry or geography column (as, I suppose, is probably the norm), you don’t need to explicitly specify the name of the column, as Safe FME is intelligent enough to work it out for you. Looking in the log, you’ll see a line that says

MS SQL Server (Spatial) Reader: Spatial column not specified for Feature Type 'dbo.Mastermap'. 
Using Column 'GEOMETRY' with type 'geometry'

If you’ve only ever used tables with one spatial column, you’ve probably not even noticed this as it shoots past in the log screen – you just drag your MS SQL Feature Reader onto the workspace and use it just as you would use a ESRI shapefile reader, or any other single data source. FME loads the values in the spatial column as features, and the values of all other columns of the table as attributes of those features.

But what if, like me, your source table has more than one spatial column? How do you specify which geometry/geography column should be used in the MSSQL reader to populate the feature type? Well, if you go to the parameters tab of the feature type you’ll see a couple of options:

  • “Specify Geography (geodetic) Column” – Yes if you’re using the geography datatype, or No if you’re using a column of the geometry datatype.
  • Geometry Column Name / Geography Column Name – The name of the column to retrieve.

In my dataset, the source column was of the geometry datatype and named Geom27700, so I set the parameters in my reader feature type as follows:

clip_image001[7]

I then carried on with building my (rather simple) workflow, as follows – check for valid data, reproject to WGS84, orient polygons using the left-hand rule, and then output to a writer:

image

Rather than insert the records into a new dataset, the writer in this case was designed to update the same table from which I’d read the features, but populate the geog4326 column (remember that features had been read from the geom27700 column).  So, I set the Writer Mode to UDPATE, selected the FEATURE_ID (the primary key of the table) as the key on which features would be matched, and set the other  parameters as follows:

 clip_image001[5]

Then, I clicked execute.

10 hours later, and FME was reporting that the transformation had been a success, with 48 million rows loaded into the destination table. But when I came to look at the table in SQL Server Management Studio, the geog4326 column that should have been populated by my writer was empty. Blank. NULL. Zilch. Zip.

So, what had gone wrong? I went back and looked through the FME log – everything seemed sound. I tried searching the internet and found some old posts about whether Safe FME supported multiple spatial columns per feature (like this one), so wondered if this was a limitation of the software, but nothing concrete came up. So then I contacted Safe FME’s “Doctors” over twitter. They asked me to send them a simple repro script, so I ditched my original package and started again. I built up a simple script, this time just looking at a test table with 100 rows, and…. it worked.

Much head-scratching later and I found the reason – when you specify the Geometry Column Name or Geography Column Name in a Feature Reader, the entered column is CASE-SENSITIVE, and must match exactly the definition of that column in the database. Frustratingly, there is no drop-down list of columns from which to choose this value – it must be manually typed. What’s more, if you enter the correct column name but using the wrong case, no error will be thrown. The package will run quite happily, in fact, loading all the attribute values from the table, but not the spatial data from the geometry/geography column itself. So your transformation will still load ‘000s of rows from the reader, but not with the (generally rather crucial) spatial feature itself.

For some reason, in this particular case, I’d named the column from which I was reading data GEOM27700 rather than geom27700. This is why, in my earlier example, the geog4326 column was NULL – it’s not that my UPDATE query hadn’t executed – the column had in fact been updated, but with NULL values that had been carried right the way through the FME transformation, since the reader had never successfully retrieved any values from the geom27700 column (since it was actually called GEOM27700).

I found this slightly odd behaviour, since my SQL Server collation itself is not case-sensitive, and will quite happily accept either GEOM27700 or geom27700 in a SELECT statement. What’s also odd is that this same behaviour is not mirrored in the Safe FME MS SQL Writer. In the writer, case sensitivity of the spatial column name is still significant, but if you attempt to specify a column with the wrong case, an error is thrown straight away and you’ll be unable to run the transformation, rather than the “let you carry on but don’t read anything” approach of the reader.

So, a slightly frustrating experience, and 10hrs of processing spent updating a table with a load of NULLs, but never mind – fortunately I’m only still learning Safe FME and this was just an experiment, so no angry clients ringing up asking where their data is (this time) Smile

And, to give credit where it’s due, having found the problem myself I then contacted the FME doctors again over twitter and they immediately replied, acknowledging the problem and informing that an update was in the works. I can think of a few software companies who could learn a thing or two about customer service like that….

This entry was posted in Spatial and tagged . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s