MapTiler, GDAL2Tiles, and Raster Resampling

So I’ve been using MapTiler to create some quick raster tilesets from a set of GeoTIFF images. MapTiler is a Windows application that will reproject and cut any GDAL-supported datasource into a set of 256px x 256px image tiles, suitable for use as a custom tile layer in Google Maps or Bing Maps et al.

MapTiler is essentially just a GUI wrapper around the GDAL2Tiles.py python script, which is distributed as part of GDAL.

Like all GUI wrappers, it provides a convenient front-end interface, and doesn’t require the end user to have any knowledge of command-line processing or build tools. This is an admirable objective in itself because, as noted in previous posts, simply trying to compile the components of a typical spatial tool chain can be quite a headache. Generally speaking, the application is reliable, easy-to-use, and generates pretty good quality results.

However, like many GUI applications, in the process of simplifying the experience for the end user, MapTiler also loses some of the flexibility and power of the underlying script. The current version of MapTiler (download from http://maptiler.googlecode.com/files/maptiler-1.0-beta2-setup.exe), for me, has three main shortcomings:

Firstly, the source code repository reveals that the current MapTiler executable was compiled using version 15748 of the GDAL2Tiles.py script, dated November 2008. However, the GDAL2Tiles.py script maintained as part of the GDAL library, and packaged with the latest GDAL 1.8.1RC2release, is version 19288, dated April 2010. Reading through the release notes on the GDAL site shows several bug fixes and enhancements have been introduced to the GDAL2Tiles.py script between these two versions.
Secondly, the front-end MapTiler interface does not expose all of the possible parameters of the underlying script. Crucially, I could see no way to specify the resampling method used when warping the input raster file into the Spherical Mercatorprojection suitable for Bing Maps / Google Maps, and resizing the data for different zoom levels. Nor is there a way to control the compression/quality level of the PNG/JPG images created by the tile-cutting process.
Finally, the process always runs on a single thread, which means that, assuming you’re running on a multicore processor (as even most humble laptops and home computers are, these days), you’re not using the full processing capacity of your system to process those tiles.

To correct those issues, you’re going to have to ditch the pretty MapTiler front-end and get dirty with the command-line instead, manually running the GDAL2Tiles.py script that sits at the heart of MapTiler.

Setting up GDAL2Tiles

Although GDAL2Tiles.py comes bundled with GDAL library, it has dependencies on other libraries. If you attempt to run the GDAL2Tiles.py script straight out of a newly compiled GDAL build, for example, you’ll most likely get the “No module named gdal” errors, as follows:

The reason is that, in addition to the GDAL library itself, you also need to install the python bindings for GDAL – allowing you to call into the GDAL functions from a python script. I don’t know much about python, so I won’t even attempt to explain this further, other than to say that the easiest fix is to install the requisite libraries from the OSGEO4W package, highlighted as follows:

(If you don’t have the GDAL library itself installed, you can add this from the OSGEO4W installer as well). With these two in place, you can get on with calling the script. The syntax and list of possible parameters can be retrieved by opening up a command shell and calling:

Python GDAL2Tiles.py –help

which produces the following output:

Usage: gdal2tiles.py [options] input_file(s) [output]

Options:
–version             show program’s version number and exit
-h, –help            show this help message and exit
-p PROFILE, –profile=PROFILE
Tile cutting profile (mercator,geodetic,raster) -default ‘mercator’ (Google Maps compatible)
-r RESAMPLING, –resampling=RESAMPLING
                        Resampling method (average,near,bilinear,cubic,cubicspline,lanczos,antialias) – default ‘average’
-s SRS, –s_srs=SRS The spatial reference system used for the source input data
-z ZOOM, –zoom=ZOOM Zoom levels to render (format:’2-5′ or ’10’).
-e, –resume          Resume mode. Generate only missing files.
-a NODATA, –srcnodata=NODATA
NODATA transparency value to assign to the input data
-v, –verbose         Print status messages to stdout

KML (Google Earth) options:
Options for generated Google Earth SuperOverlay metadata

   -k, –force-kml     Generate KML for Google Earth – default for ‘geodetic’ profile and ‘raster’ in EPSG:4326. For a dataset with different projection use with caution!
    -n, –no-kml        Avoid automatic generation of KML files for EPSG:4326
-u URL, –url=URL   URL address where the generated tiles are going to be published

Web viewer options:
Options for generated HTML viewers a la Google Maps

    -w WEBVIEWER, –webviewer=WEBVIEWER
                        Web viewer to generate (all,google,openlayers,none) -default ‘all’
-t TITLE, –title=TITLE
                        Title of the map
    -c COPYRIGHT, –copyright=COPYRIGHT
Copyright for the map
    -g GOOGLEKEY, –googlekey=GOOGLEKEY
                        Google Maps API key from http://code.google.com/apis/maps/signup.html
    -y YAHOOKEY, –yahookey=YAHOOKEY
Yahoo Application ID from http://developer.yahoo.com/wsregapp/

The option I was particularly interested in is the “Resampling” parameter, since I wasn’t particularly satisfied with the results that I was obtaining from MapTiler. To test it out, I tried tiling a sample GeoTIFF file from the Ordnance Survey Street View dataset. The source image in question is the southwest corner of Grid Square TQ38, and looks like this:

Note that the full size image is 5000 x 5000 pixels, and projected using the British National Grid system, EPSG:27700. Producing tiles for Bing Maps will necessarily involve warping the image and resampling it to the appropriate zoom levels. To start with, I just tried the default method (‘average’), as follows:

python gdal2tiles.py TQ38sw.tif

Unfortunately, this did little more than generate an error. But at least it was a helpful error – informing me that my source TIFF was not in the correct RGBA colour bands. Fortunately, that can be corrected by creating a VRT pointing at the source TIFF using gdal_translate:

gdal_translate –of vrt –expand rgba tq38sw.tif tq38sw_rgba.vrt

Now, simply try GDAL2Tiles again, but pointing at the VRT rather than the original TIFF:

python gdal2tiles.py TQ38sw_rgba.vrt

This generated a different error – this time stating that the source file had no coordinate reference system. Somehow, in the gdal_translate process, this appeared to have been stripped off. Since I knew the coordinate system was British National Grid, I supplied that using the –s_srs parameter:

python gdal2tiles.py TQ38SQ_rgba.vrt –s_srs EPSG:27700

(These are both examples of the sorts of things MapTiler automatically does for you behind the scenes). Now the script seemed to be working successfully, and started to generate subdirectories of 256 x 256px output tiles, following the TMS tile numbering system. So, I started to try different options for the –r resampling parameter.

Resampling Methods

Tiling the previous image at level 16 using different resampling methods for my sample input produced the following (only one representative tile shown for each method):

Average (same as MapTiler produces)

Near (quick but ugly)

Bilinear (Similar to average)

Cubic (eurgh)

Cubicspline (too blurry)

Lanczos (quite nice)

Antialias (error)

Oh no. Not again. It turns out that the antialias resampling method has additional dependencies – PIL and numpy.

numpy can be installed, as with gdal and the gdal python bindings, from the OSGEO4W setup:

Installing PIL is a little bit more involved. Firstly, you need to register your Python installation. To do save and execute the following script from within python. (i.e. save this as a new file, register_python.py, in your OSGEO4W directory. Then load up the OSGEO4W command shell and type python register_python.py)

#
# script to register Python 2.0 or later for use with win32all
# and other extensions that require Python registry settings
#
# written by Joakim Low for Secret Labs AB / PythonWare
#
# source:
# http://www.pythonware.com/products/works/articles/regpy20.htm

import sys

from _winreg import *

# tweak as necessary
version = sys.version[:3]
installpath = sys.prefix

regpath = "SOFTWARE\\Python\\Pythoncore\\%s\\" % (version)
installkey = "InstallPath"
pythonkey = "PythonPath"
pythonpath = "%s;%s\\Lib\\;%s\\DLLs\\" % (
    installpath, installpath, installpath
)

def RegisterPy():
    try:
        reg = OpenKey(HKEY_LOCAL_MACHINE, regpath)
    except EnvironmentError:
        try:
            reg = CreateKey(HKEY_LOCAL_MACHINE, regpath)
            SetValue(reg, installkey, REG_SZ, installpath)
            SetValue(reg, pythonkey, REG_SZ, pythonpath)
            CloseKey(reg)
        except:
            print "*** Unable to register!"
            return
        print "--- Python", version, "is now registered!"
        return
    if (QueryValue(reg, installkey) == installpath and
        QueryValue(reg, pythonkey) == pythonpath):
        CloseKey(reg)
        print "=== Python", version, "is already registered!"
        return
    CloseKey(reg)
    print "*** Unable to register!"
    print "*** You probably have another Python installation!"

if __name__ == "__main__":
    RegisterPy()

Having registered Python, you now you need to download and install the Python Imaging Library (PIL). You should download the latest version of the PIL library that targets the version of Python you’re using. At the time of writing, the latest version of PIL is 1.17, and the version of Python that my GDAL uses (the one that comes with OSGEO4W) is 2.5. You can download the corresponding binary for PIL from http://effbot.org/media/downloads/PIL-1.1.7.win32-py2.5.exe

Once the PIL installation has been downloaded and executed, you can finally create anti-aliased (and probably the best-looking) tiles as follows:

Was it worth the effort? Probably not. The anti-aliased tile is, I think, slightly better quality than the average tile produced by GDAL2Tiles or MapTiler but, looking at my source TIFF file again, I notice that it was never that great resolution to begin with. Crap in -crap out, I guess, but at least I’ve learned a bit in the process.

The results will obviously be very different if I was using a different type of raster source input – perhaps a terrain image, or photographic image and, for these, perhaps one of the other resampling types will produce better results.

Safe FME

It’s probably worth mentioning that everything done by MapTiler (and, therefore by GDAL2Tiles.py) can be done in Safe FME also, with the following simple workflow:

The properties of the WebMapTiler provides many of the same resampling methods as described above, and produce essentially identical outputs:

The main advantage of Safe FME is that it requires absolutely none of the faffing about that I had to do to get GDAL to produce the same result. In fact, I’ve summarised the entire process in just two diagrams. But then again GDAL is free. So, I guess, what you pay for with FME is not having to spend hours in front of your computer swearing and looking up error messages on Google Winking smile

Parallelising Tile Production

The last point I was going to make concerned how to create tiles more efficiently on a multi-core machine. gdal2tiles.py does not make use of any parallel-processing logic, and I don’t know nearly enough about python to go about hacking it to make it so. However, there are some easy tricks you can do to divide the work between CPUs.

Firstly, you can divide your source data into two virtual datasets using gdalbuildvrt. You can either specify different bounding boxes for the two sets or, if your source data is comprised of a set of files anyway, simply load one subset of files into one datasource and some into another. Launch one instance of gdal2tiles.py to process one dataset, and another to process the other dataset, and set the affinity of each process to different CPU cores. The problem with this approach is that you will need to re-render the “edges” of each set – tiles that contain features from more than one source.

A better approach (based on the answer provided at http://gis.stackexchange.com/questions/2336/converting-huge-geotiffs-to-tilepyramid ) is to make a modified copy of the gdal2tiles.py script that processes tiles in reverse order. You can do this by modifying line 1180 from:

for ty in range(tmaxy, tminy-1, -1):

to:

for ty in range(tminy, tmaxy+1):

Then, launch both the original and the modified script to process the same data source of tiles at the same time, setting the affinity of each script to different CPU cores. Make sure the scripts are set to not overwrite existing tile images (by adding the –e switch) and, as the scripts come close to meeting in the middle, quit one of them and allow the other to finish off cleanly. Alternatively, you can adapt the logic in one or more modified scripts to specify an exact, distinct batch of tiles that each should process (i.e. divide the range between tminy and tmaxy into the number of cores across which the job is to be processed), and run each script to process a different batch.

UPDATE

Based on the comments provided by Klokan below, I also compared a tile output from the commercial parallised version of MapTiler. I don’t know if this uses exactly the same algorithms as the free version, but the output looks like this (crisp, but to me the colours look faded):

8 Responses to MapTiler, GDAL2Tiles, and Raster Resampling

Klokan Petr Pridal says:

July 12, 2011 at 12:14 pm

Parallelized version of MapTiler/GDAL2Tiles is available from the authors of these utilities at http://www.maptiler.com/.
The utility was completely rewritten into C/C++, it is about 5x faster on a single core, when used on multi-core CPU every core is almost multiplying the speed too (boost up to 16x on todays computers). Tiling is done in a clever way, to optimize disk I/O significantly. The software can run on a computer cluster too, including Amazon EC2 (thanks to industrial standard MPI / tests done also with MapReduce approach).
The utility is able to merge even huge number of supplied input files into a seamless layer and directly watermark the tiles. Optimization of tiles (jpeg/png) is integrated and done in memory, tiles are saved to the disk already optimized. All standard tile formats are supported (WMTS/TMS/QuadKey) and tiles can be written also into alternative archives / databases / cloud storage, next to the standard directory structure on a local disk.

- alastaira says:
  
  July 12, 2011 at 4:40 pm
  
  Hi Klokan, Thanks for the comment, and great work with the GDAL2Tiles.py script and the MapTiler program!
  
  Can you confirm whether the (commercial) parallelised version also addresses the other points mentioned in this post? i.e. Can you control the resampling method or the quality/compession level of tiles produced, or is it the same as an the MapTIler Beta version? And are you planning a formal 1.0 release of the “free” version?
  
  - Klokan Petr Pridal says:
    
    July 12, 2011 at 5:00 pm
    
    All problematic features including the rescaling and compression levels were addressed and are done properly in the parallelised version, the MapTiler Cluster.
    
    MapTiler Cluster is a command line application with options similar to GDAL2Tiles, but significantly extended. It is a professional tiling tool. You find tiles rendered with this software in several well-known iPad/iPhone/Android applications. Datasets of whole countries are rendered regularly with this software.
    
    We plan to release also quite cheap alternative: MapTiler Pro, a desktop application, based on the same C/C++ rendering core, and interface similar to open-source MapTiler – with the features you identified as problematic.
Klokan Petr Pridal says:

July 12, 2011 at 12:17 pm

We have rendered whole Ordnance Survey OpenData data set in the past, so it is possible to compare the quality and file size of the tiles mentioned in this article: http://os.tileserver.com/.

- Klokan Petr Pridal says:
  
  July 13, 2011 at 6:13 am
  
  The color in OS StreetView dataset were faded intentionally, because we wanted to use the map as a background map for another data and one level up is OS VectorMap which is completely faded anyway.
  
Michael says:

July 13, 2011 at 11:27 am

Hi Alastair,
Great article. Have you seen this codeplex project to create a .NET wrapper for GDAL – http://gdalnet.codeplex.com/. Looks promising, but early stages. I’m currently looking for a cost effective method to render large areas in one go. Loathed to spend sizeable sums of money on a application that might be used only say 5 times a year. Any recommendations?
Petr – I’d be interested in learning more about the Pro version you talk about.
Thanks again for the great article.

dualsportmaps.com (@DualSportMaps) says:

December 3, 2011 at 2:57 pm

The images gdal2tiles is creating are too large to suit me. Is is possible to have gdal2tiles create indexed PNGs rather than RGB PNGs? If I open one of the tiles in GIMP and Image=>Mode=>Indexed – Web Optimized the file looks about the same but is 90% smaller. I’d like them like that! Help would be appreciated!

pvanb says:

May 23, 2012 at 7:27 am

If you want indexed tiles, you can try to run imagemagick to convert the images from rgb to indexed, see e.g., http://studio.imagemagick.org/pipermail/magick-users/2008-August/021670.html. Note the limitations mentioned and the other options (like pngnq).