Posts tagged ‘Geocoding’

May 4, 2012

Geocoding in SQL Server with the Bing Maps Locations API

Almost every SQL Server database contains “spatial” data. That information might not use the dedicated geography or geometry spatial datatypes; more likely, it is a table of customer addresses, the name of the city or region for which a sales manager is responsible, the itinerary of locations at which a delivery vehicle is scheduled to stop, and so on.

All of these are examples of spatial information – data that describes the location of objects on the surface of the earth. The problem is that this information is typically stored as a free-text string – “10 Downing Street”, “Round the Back of Sainsbury’s car park”, or “Manchester”, for example. Even semi-structured information such as a postcode, “NR1 6NN”, is not particularly useful for spatial queries. Instead, what is needed is a way to turn this text-based, descriptive information into a structured spatial format (typically a single pair of latitude/longitude coordinates). And that’s where geocoding comes in.

In my SQLBits session last year, “Who Needs Google Maps?”, I talked a little about what is involved in creating your own geocoding function, based on parsing a supplied text address string and looking up the relevant coordinate location from a local gazetteer table. In practice, however, it’s very unlikely that you’ll ever want to create your own geocoding function when you can just use one somebody else has already made (such as the Bing Maps Locations API).

So, here’s a step-by-step guide to creating a geocoding function in SQL Server that calls the Locations API instead:

Step 1. Get a Bing Maps key

Use of the Bing Maps Locations API is free for many applications, but may involve a cost depending on how many geocodes you request, and what you’re using them for (you can check the terms of use at http://www.microsoft.com/maps/product/terms.html). Either way, before using the service you first have to sign up for a key, which you can do at http://www.bingmapsportal.com – it only takes a few seconds to do and you can get an evaluation key instantly. The key is an alphanumeric string that looks a bit like: AhGSgD1Twhjx9WqxjJZznG3tY3r0wnFr!gg1ngK3yGnp9b3hupQUVbNdv6Wb0qW

Step 2. Create a Geocoding Function

Fire up Visual Studio and create a new C# Class Library project. Once created, add a reference to the Microsoft.SqlServer.Types.dll library (Project –> Add Reference, and select the Microsoft.SqlServer.Types.dll library from the .NET tab).

Then, edit the default Class1.cs file to read as follows, inserting your Bing Maps key where indicated:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Net;
using System.IO;
using System.Xml;
using Microsoft.SqlServer.Types;
using System.Collections.Generic; // Used for List

namespace ProSpatial
{
  public partial class UserDefinedFunctions
  {

    /* Generic function to return XML geocoded location from Bing Maps geocoding service */
    public static XmlDocument Geocode(
      string countryRegion,
      string adminDistrict,
      string locality,
      string postalCode,
      string addressLine
    )
    {
      // Variable to hold the geocode response
      XmlDocument xmlResponse = new XmlDocument();

      // Bing Maps key used to access the Locations API service
      string key = "ENTERYOURBINGMAPSKEYHERE";

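      // URI template for making a geocode request
      string urltemplate = "http://dev.virtualearth.net/REST/v1/Locations?countryRegion={0}&adminDistrict={1}&locality={2}&postalCode={3}&addressLine={4}&key={5}&output=xml";

      // Insert the supplied parameters into the URL template, URL-encoding each value
      // so that spaces, ampersands, etc. in an address don't corrupt the query string
      string url = string.Format(urltemplate,
        Uri.EscapeDataString(countryRegion ?? ""),
        Uri.EscapeDataString(adminDistrict ?? ""),
        Uri.EscapeDataString(locality ?? ""),
        Uri.EscapeDataString(postalCode ?? ""),
        Uri.EscapeDataString(addressLine ?? ""),
        key);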

      try
      {
        // Make request to the Locations API REST service
        HttpWebRequest webrequest = (HttpWebRequest)WebRequest.Create(url);
        webrequest.Method = "GET";

        // Retrieve the response; the using blocks clean up the response,
        // stream, and reader even if reading or parsing fails
        using (HttpWebResponse webresponse = (HttpWebResponse)webrequest.GetResponse())
        using (Stream stream = webresponse.GetResponseStream())
        using (StreamReader streamReader = new StreamReader(stream))
        {
          xmlResponse.LoadXml(streamReader.ReadToEnd());
        }
      }
      catch (Exception ex)
      {
        // Rethrow with some context so the caller knows the request failed,
        // rather than silently receiving an empty document
        throw new Exception("Geocode request failed: " + ex.Message, ex);
      }

      // Return an XMLDocument with the geocoded results 
      return xmlResponse;
    }

    /* Wrapper method to expose geocoding functionality as SQL Server User-Defined Function (UDF) */
    [Microsoft.SqlServer.Server.SqlFunction(DataAccess = DataAccessKind.Read)]
    public static SqlGeography GeocodeUDF(
      SqlString countryRegion,
      SqlString adminDistrict,
      SqlString locality,
      SqlString postalCode,
      SqlString addressLine
      )
    {

      // Document to hold the XML geocoded location
      XmlDocument geocodeResponse = new XmlDocument();

      // Attempt to geocode the requested address
      try
      {
        geocodeResponse = Geocode(
          (string)countryRegion,
          (string)adminDistrict,
          (string)locality,
          (string)postalCode,
          (string)addressLine
        );
      }
      // Failed to geocode the address: return NULL rather than carrying on with an empty document
      catch (Exception)
      {
        return SqlGeography.Null;
      }

      // Declare the XML namespace used in the geocoded response
      XmlNamespaceManager nsmgr = new XmlNamespaceManager(geocodeResponse.NameTable);
      nsmgr.AddNamespace("ab", "http://schemas.microsoft.com/search/local/ws/rest/v1");

      // Check that we received a valid response from the geocoding server
      XmlNodeList statusCodes = geocodeResponse.GetElementsByTagName("StatusCode");
      if (statusCodes.Count == 0 || statusCodes[0].InnerText != "200")
      {
        throw new Exception("Didn't get correct response from geocoding server");
      }

      // Retrieve the list of geocoded locations
      XmlNodeList Locations = geocodeResponse.GetElementsByTagName("Location");

      // No match was found for the supplied address
      if (Locations.Count == 0)
      {
        return SqlGeography.Null;
      }

      // Create a geography Point instance of the first matching location
      // (parse with the invariant culture, as the response uses "." as the decimal separator)
      double Latitude = double.Parse(Locations[0]["Point"]["Latitude"].InnerText, System.Globalization.CultureInfo.InvariantCulture);
      double Longitude = double.Parse(Locations[0]["Point"]["Longitude"].InnerText, System.Globalization.CultureInfo.InvariantCulture);
      SqlGeography Point = SqlGeography.Point(Latitude, Longitude, 4326);

      // Return the Point to SQL Server
      return Point;
    }
  };
}

This code listing contains both a generic geocoding method to call the Bing Maps Locations API (called Geocode), and a wrapper function to expose that functionality as a scalar User-Defined Function (called GeocodeUDF) that selects the top-matching geocoded coordinates for a supplied address string.

There are several other ways you might want to expose geocoding functionality in SQL Server. Geocoding is an imprecise operation, and you will often get more than one possible match for a given address. To allow a choice between the candidate locations, you might therefore prefer to return the results in a table, using a Table-Valued Function rather than a scalar UDF. Such a table could also return other columns of information, such as the confidence of each match, or the bounding box around a large feature instead of a single point to represent it. These ideas and more are discussed in the Geocoding chapter of Pro Spatial with SQL Server 2012, but for now we’ll stick with the simple case of a UDF that just selects the top hit.
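As a rough sketch of the first step towards such a table-valued function, the method below collects every Location element in a response rather than only the first. The GeocodeCandidate class and the ParseCandidates method are hypothetical names that do not appear in the listing above; the Name and Confidence elements are read from the Locations API XML response alongside the Point used earlier. Wiring this up as an actual SQLCLR table-valued function is left out here.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Xml;

// Hypothetical container for one candidate match (not part of the original listing)
public class GeocodeCandidate
{
  public string Name;
  public double Latitude;
  public double Longitude;
  public string Confidence;
}

public static class GeocodeParsing
{
  // Collect every <Location> in a Locations API XML response, not just the first
  public static List<GeocodeCandidate> ParseCandidates(XmlDocument response)
  {
    List<GeocodeCandidate> candidates = new List<GeocodeCandidate>();

    foreach (XmlElement location in response.GetElementsByTagName("Location"))
    {
      XmlElement point = location["Point"];

      // Skip any entry that has no usable coordinates
      if (point == null || point["Latitude"] == null || point["Longitude"] == null)
      {
        continue;
      }

      GeocodeCandidate candidate = new GeocodeCandidate();
      candidate.Name = location["Name"] != null ? location["Name"].InnerText : "";
      candidate.Confidence = location["Confidence"] != null ? location["Confidence"].InnerText : "";
      candidate.Latitude = double.Parse(point["Latitude"].InnerText, CultureInfo.InvariantCulture);
      candidate.Longitude = double.Parse(point["Longitude"].InnerText, CultureInfo.InvariantCulture);
      candidates.Add(candidate);
    }

    return candidates;
  }
}
```

Each item in the returned list would become one row of the result table, leaving the caller free to pick between matches, for example by preferring those with the highest confidence.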

Build the project containing the code above (Build –> Build Solution).

Step 3. Import the Assembly and Register the Function in SQL Server

In SQL Server, first make sure that user-defined CLR code is enabled:

EXEC sp_configure 'clr enabled', '1';
GO
RECONFIGURE;
GO

Then, to interact with the Locations API service, you need to set the appropriate database security permissions to allow access to external services. The easiest way to do this is to simply set the database to be trustworthy (in this example, I’m going to create my geocoding function in the ProSpatial database – change this line to match your database name as appropriate):

ALTER DATABASE ProSpatial SET TRUSTWORTHY ON;
GO

Now, import the assembly containing the geocoding function, and remember to give it EXTERNAL_ACCESS permission. You’ll have to edit the path and filename to match that of the project you created in Step 2:

CREATE ASSEMBLY Geocoder
FROM 'C:\Users\Alastair\Visual Studio 2012\Projects\Geocoder\bin\Debug\Geocoder.dll'  
WITH PERMISSION_SET = EXTERNAL_ACCESS;
GO

Finally, register the function that will call into the GeocodeUDF method in the Geocoder assembly:

-- Parameters are passed to the CLR method by position,
-- so they are declared in the same order as GeocodeUDF expects them
CREATE FUNCTION dbo.Geocode(
  @countryRegion nvarchar(max),
  @adminDistrict nvarchar(max),
  @locality nvarchar(max),
  @postalCode nvarchar(max),
  @addressLine nvarchar(max)
  ) RETURNS geography
AS EXTERNAL NAME
Geocoder.[ProSpatial.UserDefinedFunctions].GeocodeUDF;

Step 4. Geocode!

Now, to geocode any address from SQL Server, call the Geocode function, supplying the street address, locality, administrative district, postcode, and country/region, as in the following examples: (PLEASE SEE UPDATE BELOW!)

-- Create a new geography Point by geocoding a provided address
-- For any address elements unknown, simply supply an empty string
DECLARE @g geography;
SET @g = dbo.Geocode('10 Downing Street', 'London', '', 'SW1A 2AA', 'UK');

-- Retrieve the WKT of the created instance
SELECT @g.ToString();

Here are the results of running this query in SQL Server:

POINT(-0.127707 51.50355)

Now let’s check these coordinates on Bing Maps. Unsurprisingly (since the coordinates of the point in SQL Server were obtained from the Bing Maps Locations API), they line up pretty well.

UPDATE

Following a few comments from folks who were having trouble getting this code to work, I realised that the example code listing above is wrong – the method signature of the Geocode SQLCLR method expects the countryRegion parameter *first*, then district, locality, postcode, and finally street address. So, my example should really have read:

SELECT dbo.Geocode('UK', 'LONDON', 'London', 'SW1A 2AA', '10 Downing Street');

Or, if you prefer to supply the parameters in the more common street address, locality, postcode, country order, just change the method signature and recompile the assembly. Sorry for any confusion caused!

April 6, 2011

Parsing Free-Text Addresses and a UK Postcode Regular Expression Pattern

In my upcoming session at SQLBits this Saturday, I’ll be attempting to replicate the functionality of Google Maps using nothing but freely-available tools and data – SQL Server Express, OS Open Data, and a dash of Silverlight.

One of the features I’ll be demonstrating is a basic geocoding function – i.e. given an address, placename, or landmark, how do you look up and return the coordinates representing the location so that the map can centre on that place? This is not really a spatial question at all – it’s a question of parsing a free-text user input and using that as the basis of a text search of the database.

The simplest way of doing this is to force your users to enter Street Number, Street Name, Town, and Postcode in separate input elements that match the fields in your database. In this case, your query becomes straightforward:

SELECT X, Y FROM AddressDatabase WHERE StreetNumber = '10' AND StreetName = 'Downing Street' AND Town = 'London';

Most databases don’t contain the location of every individual address. If there is no exact matching StreetNumber record, then you typically find the closest matching properties on the same road and interpolate between them (it seems reasonable to assume that Number 10 Downing Street will be somewhere between Number 9 and Number 11).
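The interpolation idea can be sketched as follows. This is a minimal illustration that assumes simple linear interpolation between two known properties on the same road; the KnownAddress type and the InterpolateAddress method are hypothetical names, and the coordinates in the usage note are made up for demonstration.

```csharp
using System;

// Hypothetical record of a property whose location is known
public struct KnownAddress
{
  public int StreetNumber;
  public double X;
  public double Y;

  public KnownAddress(int streetNumber, double x, double y)
  {
    StreetNumber = streetNumber;
    X = x;
    Y = y;
  }
}

public static class AddressInterpolation
{
  // Estimate the location of a street number from two known neighbours on the same road
  public static KnownAddress InterpolateAddress(KnownAddress lower, KnownAddress upper, int streetNumber)
  {
    // Both neighbours have the same number: nothing to interpolate
    if (lower.StreetNumber == upper.StreetNumber)
    {
      return new KnownAddress(streetNumber, lower.X, lower.Y);
    }

    // Fraction of the way from the lower neighbour to the upper neighbour
    double t = (double)(streetNumber - lower.StreetNumber) / (upper.StreetNumber - lower.StreetNumber);

    return new KnownAddress(
      streetNumber,
      lower.X + t * (upper.X - lower.X),
      lower.Y + t * (upper.Y - lower.Y));
  }
}
```

For example, with number 9 at (100, 200) and number 11 at (120, 200), InterpolateAddress places number 10 at (110, 200). A real road is rarely this tidy (odd and even numbers often sit on opposite sides of the street), so treat the result as an estimate only.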

Forcing users to enter each element of the address separately doesn’t necessarily create the most attractive UI, however. What’s more common is to use a single free-text search box into which users can type whatever they’re searching for – a placename, address, landmark, postcode etc. Nice UI, but horrible to make sense of the input. In these cases, the user might supply:

“10 Downing Street, London”

“Downing Street, St James’, LONDON”

“10, Downing St. SW1A 2AA”

…not to mention “10 Downig Street. London”, and any number of other misspellings or alternative formats.

One approach you might want to take in these cases is to use a RegEx pattern matcher to determine if any part of the string supplied is a postcode. The UK postcode format is defined by British Standard BS7666, and can be described using the following regular expression pattern:

(GIR 0AA|[A-PR-UWYZ]([0-9][0-9A-HJKPS-UW]?|[A-HK-Y][0-9][0-9ABEHMNPRV-Y]?) [0-9][ABD-HJLNP-UW-Z]{2})

Matching the supplied address string against this RegEx doesn’t prove that a valid postcode was supplied, but just that some part of the user input matched the format for a postcode. The matching substring can then be looked up (say, against the CodePoint Open dataset) to confirm that it is real.
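A minimal sketch of that matching step, using the pattern exactly as quoted above, might look like this. The PostcodeParsing class and FindPostcode method are hypothetical names, and upper-casing the input first is an assumption made here because the pattern only covers capital letters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PostcodeParsing
{
  // The postcode pattern quoted above, unchanged
  private static readonly Regex PostcodePattern = new Regex(
    @"(GIR 0AA|[A-PR-UWYZ]([0-9][0-9A-HJKPS-UW]?|[A-HK-Y][0-9][0-9ABEHMNPRV-Y]?) [0-9][ABD-HJLNP-UW-Z]{2})");

  // Return the first substring that has the format of a postcode, or null if there is none
  public static string FindPostcode(string input)
  {
    // The pattern only matches upper-case letters, so normalise the case first
    Match match = PostcodePattern.Match(input.ToUpperInvariant());
    return match.Success ? match.Value : null;
  }
}
```

Called with “10, Downing St. SW1A 2AA”, FindPostcode returns “SW1A 2AA”; called with “10 Downing Street, London”, it returns null, and the search has to fall back on the placename instead.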

Once you’ve identified the postcode, you can then run a query to retrieve a list of roadnames that lie in that postcode, from something like the OSLocator dataset, and scan the remainder of the input to see if it contains any of those names. You can also scan for any numeric characters in the first part of the text input, which might represent a house number. If you find a matching property, with the same road name and valid postcode, you can be pretty sure you’ve found a match.
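The scan of the remaining input could be sketched along these lines. The AddressScanning class and both method names are hypothetical, and the list of road names is assumed to have already been retrieved for the postcode by a separate query that is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AddressScanning
{
  // Return the first candidate road name that appears in the input, or null if none does
  public static string FindRoadName(string input, IEnumerable<string> roadNames)
  {
    foreach (string road in roadNames)
    {
      // Case-insensitive search, since users rarely match the casing of the dataset
      if (input.IndexOf(road, StringComparison.OrdinalIgnoreCase) >= 0)
      {
        return road;
      }
    }
    return null;
  }

  // Return a short run of digits at the start of the input as a possible house number, or null
  public static int? FindHouseNumber(string input)
  {
    Match match = Regex.Match(input, @"^\s*([0-9]{1,5})\b");
    return match.Success ? (int?)int.Parse(match.Groups[1].Value) : null;
  }
}
```

Given “10 Downing Street, London SW1A 2AA” and a list containing “DOWNING STREET”, this finds both the road and the number 10. An input such as “10, Downing St.” would still need abbreviations like “St.” expanding before the road name could be matched.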

If you find more than one valid match, or possibly several partial matches only, then you can of course present a disambiguation dialogue box – “Is this the 10 Downing Street you meant?”. For example, there are many “10 Downing Street”s in the UK, from Liverpool to Llanelli and Farnham to Fishwick. Without knowing either the town or the postcode, the input could have referred to any of them.

January 12, 2011

Who needs Google Maps? Build your own Mapping, Geocoding, and Routing service with SQL Server

Back in November 2009, I was lucky enough to present a session at the SQLBits V conference, on “Creating High Performance Spatial Databases”. I say “lucky” not because I enjoy presenting at conferences (because I don’t particularly), but because SQLBits is a fantastic conference, organised by a highly-dedicated, bloody-hardworking, talented, and generally nice bunch of people, and it was an honour to be associated with them and to learn from them.

The next SQLBits conference, SQLBits 8, is happening in Brighton from 7th to 9th April 2011, and I’ve just submitted a new session for it, titled (as is this post) “Who needs Google Maps? Build your own Mapping, Geocoding, and Routing service”. If my session gets accepted, I’m planning to demonstrate practical uses of the spatial datatypes in SQL Server to perform, well, mapping, geocoding, and routing.

Following feedback I got from Simon Sabin (a SQL Server MVP with much more presenting experience than me) after my last presentation, I’m going to be ditching the PowerPoint slides and the theory and, if my session is selected, I’ll be presenting a lot more eye-candy like this instead:

Routefinding

Route-finding in SQL Server

Mapping Features

Norwich OS Map in SQL Server Management Studio
