Straws in the Wind Blog Articles



Creative Commons License
All Blog Articles, Data Models and Free Source Code by Simon Greener, The SpatialDB Advisor is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

Tiling Very Large Rasters

Monday January 31 2011 at 04:57

One often meets a question like this (asked on the PostGIS discussion list) about the performance of raster images:

Can anyone comment on the “Performance” of the PostGIS Raster or WKTRaster?

If you were to serve this onto a web application, how effective are they? Well, I’m talking of very large datasets like 10GB each and 100 of them. How does it compare with serving the images as files directly.. from a GeoTIFF, ECW, Sid, etc. (obviously using Geoserver/Mapserver or some specialized image serving solution).

Been searching all over the web for the answers, but no luck.

Hope to get some insights from the users.

(The issue is not just restricted to databases with GeoRaster support like PostgreSQL/PostGIS or Oracle GeoRasters, but also affects the design and layout of disk based rasters/tiles.)

While there are many different aspects to raster data management, compression, storage and delivery, this short article will restrict itself to two:

  1. Tiling
  2. Pyramiding (though only briefly).

Tiling

Putting aside issues relating to ECW/MrSID format imagery (I may cover these in another post if there is any interest in this one), the main problem facing someone asked to provide internet mapping access to a very large image (say > 100MB) is deciding how to break it up to enable efficient access.

Let’s assume that there are good, empirical reasons for breaking up the image before providing access (cf. Google Maps, Bing Maps, Yahoo Maps and other software implementing a tile cache). One comment will suffice for completeness: normally, any one request for a portion of an untiled image (this does not apply to ECW/MrSID) requires the mapping program to open the image and read large amounts of it, with associated processing, before the required sub-image can be found and returned. This entails a lot of disk access and CPU, and can even place heavy demands on a server’s network interface card (NIC), especially where application and data servers are separated. Add in multi-user access and the issues compound quickly.

Now, what tile size?

The first thing to do is determine the optimal zoom range/map scale at which this image’s data is to be displayed. For example, if it is high-resolution urban ortho-photography (0.1m / pixel) then one would probably be using it at large scales (e.g. 1:5,000) and not at smaller map scales in the range of, say, 1:15,000 to 1:50,000 and beyond.

An easy way to determine your optimal display scale is to open the image in a desktop GIS (QGIS, Manifold GIS, MapInfo, ArcGIS etc), and zoom around until the image is at its sharpest. Try to set the screen size of the GIS so that it mirrors the size of the images that will be served by your Internet mapping software: for argument’s sake let’s assume that your Internet mapping software will be serving images of size 800 × 600 pixels. From Dr Peter Lamb’s Tiling Very Large Rasters paper, we can see that the optimal tile size for this data will be 1/4 of each side:

X: 800 / 4 = 200
Y: 600 / 4 = 150

That is, the optimal tile size will be 200 × 150 pixels.
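
The arithmetic above fits in a few lines of Python. This is only a minimal sketch: the function name `optimal_tile_size` and its `divisor` parameter are illustrative names chosen here, and the default of 4 simply encodes the quarter-of-each-side rule quoted above.

```python
def optimal_tile_size(view_width, view_height, divisor=4):
    """Return (tile_width, tile_height) as a fraction of the served image size.

    divisor=4 encodes the "1/4 of each side" rule of thumb; it is an
    assumption you may want to vary for your own data.
    """
    return view_width // divisor, view_height // divisor


tile_width, tile_height = optimal_tile_size(800, 600)
print(f"Tile size: {tile_width} x {tile_height} pixels")
```

If your mapping software serves a different image size, change the two arguments and the tile size follows.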

You will also notice from the figures in the paper that performance degrades much faster with tiles that are too small than with tiles that are too large. So, it is to be expected that tiles of size 50 × 30 pixels will cause a greater degradation in performance than tiles of 400 × 300 pixels.
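
One crude way to get a feel for this trade-off is to count how many tiles a single 800 × 600 request overlaps, and how many pixels have to be read to satisfy it, for each candidate tile size. The Python sketch below does only that. It is not the model used in the paper, the viewport offset is an arbitrary made-up value, and the function name `tiles_touched` is hypothetical.

```python
def tiles_touched(x_off, y_off, view_w, view_h, tile_w, tile_h):
    """Count the tiles a viewport overlaps and the pixels those tiles hold.

    x_off, y_off: top-left pixel of the requested viewport in the full image.
    Returns (tile_count, pixels_read), assuming whole tiles are always read.
    """
    first_col = x_off // tile_w
    last_col = (x_off + view_w - 1) // tile_w
    first_row = y_off // tile_h
    last_row = (y_off + view_h - 1) // tile_h
    count = (last_col - first_col + 1) * (last_row - first_row + 1)
    return count, count * tile_w * tile_h


# Arbitrary viewport position (an assumption, purely for illustration).
for tile_w, tile_h in [(50, 30), (200, 150), (400, 300)]:
    count, pixels = tiles_touched(1234, 987, 800, 600, tile_w, tile_h)
    print(f"{tile_w} x {tile_h}: {count} tiles, {pixels} pixels read")
```

Very small tiles keep the pixel count down but multiply the number of separate reads, each with its own overhead. Larger tiles need only a handful of reads at the cost of some wasted pixels.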

If you wish to use this 0.1m ortho-photography at smaller scales (i.e. supporting zooming out from the optimal) then choosing a larger tile size (such as 400 × 300) will probably not unduly affect performance.

Once you have a solution for a raster data set, use whatever tiling software you have (e.g. FME etc) to tile and load the data.

Then you need to repeat this with all other raster data that will provide support at other map scales.
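
Whatever tool does the cutting, it needs a list of pixel windows, one per tile. The sketch below generates them in plain Python; the function name `tile_windows` and the 500 × 400 pixel image are made up for illustration, and edge tiles are clipped to the image boundary, which is one possible convention, not the only one.

```python
def tile_windows(image_w, image_h, tile_w, tile_h):
    """Yield (col, row, x_off, y_off, width, height) for every tile.

    Tiles on the right and bottom edges are clipped so that they never
    extend past the image (an assumption; some tools pad instead).
    """
    for row, y_off in enumerate(range(0, image_h, tile_h)):
        for col, x_off in enumerate(range(0, image_w, tile_w)):
            width = min(tile_w, image_w - x_off)
            height = min(tile_h, image_h - y_off)
            yield col, row, x_off, y_off, width, height


# Hypothetical 500 x 400 pixel image cut into 200 x 150 pixel tiles.
for window in tile_windows(500, 400, 200, 150):
    print(window)
```

Each window could then be handed to your tiling software of choice. For example, the `-srcwin` option of `gdal_translate` takes this kind of pixel offset and size.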

Pyramiding

One great benefit of ECW and MrSID encoding is that they provide multi-scale access to a single raster dataset. However, there are issues with these formats that one must be careful about, such as their use with cartographic map images rather than remotely sensed or other georeferenced photography.

Let us assume you don’t have ECW/MrSID data or the ability to create it. What do you do to provide efficient multi-scale access to your data?

Let us continue to assume you have 0.1m / pixel ortho-photography that is designed for use at large scales, and that you want to use it as the basis for multi-scale visualisation.

Some software, such as Oracle GeoRaster, has the ability to resample a base image layer into different resolutions. These resampled images are organised into a pyramidal structure so that map access at a particular scale can be directed to the imagery at the matching pyramid level.

It is a nice feature!

But what if you don’t have something like Oracle GeoRaster?

Then the approach you should take is to repeat what I described above for creating an optimally tiled image, with some additional steps:

1. Determine the required map scales for the image.
2. Use your GIS or image processing software to resample the image so that it is optimal for your needs at each of the map scales from step 1 (Manifold GIS’s image resampling capabilities are fantastic for the price paid).
3. Display each resampled image in your GIS to determine the optimal display scale and thus tile size.
4. Tile the image and load into your database or write to disk.

Repeat for each required map scale.
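
To plan the work before resampling anything, it can help to tabulate what each level will look like. The Python sketch below assumes a power-of-two pyramid (each level halves the resolution of the previous one) and the same tile size at every level. Both are simplifying assumptions: the steps above let you pick any scales and a different tile size per scale. The function name `pyramid_plan` and the 20000 × 15000 pixel base image are hypothetical.

```python
import math


def pyramid_plan(base_w, base_h, base_res, levels, tile_w, tile_h):
    """Return one dict per pyramid level with its size and tile count.

    base_res is the ground resolution of level 0 in metres per pixel.
    Each further level halves the pixel dimensions (an assumption).
    """
    plan = []
    for level in range(levels):
        factor = 2 ** level
        width = math.ceil(base_w / factor)
        height = math.ceil(base_h / factor)
        plan.append({
            "level": level,
            "resolution": base_res * factor,
            "width": width,
            "height": height,
            "tiles": math.ceil(width / tile_w) * math.ceil(height / tile_h),
        })
    return plan


# Hypothetical 0.1m / pixel image of 20000 x 15000 pixels, 200 x 150 tiles.
for entry in pyramid_plan(20000, 15000, 0.1, 4, 200, 150):
    print(entry)
```

The table only tells you how many tiles to expect at each level. The resampling itself is still done by your GIS or image processing software.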

I hope this article is of use for someone.

regards
Simon


Comment [2]

Interesting article, although it doesn’t answer the original question, i.e. what the performance of PostGIS Raster or WKTRaster is compared to other forms of serving the data. You mainly explain the different concepts of how rasters can be prepared for web serving. Any ideas on the performance?

— KD · 20 May 2011, 08:52 · #

KD,

Your observation is true enough: I did not answer the question per se.

I meant only to address the one issue that affects serving large raster images regardless of method: the tile size.

Whether one uses a database or file based storage is dependent on many factors other than speed. For example, I would not put rasters in a database unless interoperability with business data was required at the database transaction level, or the architecture of the deployment framework did not allow access to physical file systems.

I have not done any comparative analysis of database vs file based storage and access. It would be interesting to see the results. Personally I would expect file based storage of small GeoTIFF tiles to be faster than a database when everything is on the same machine (and the folders of the file based catalog are organised so that few files exist per directory).

regards
Simon

Simon Greener · 31 May 2011, 04:55 · #