White Paper: Caching to Improve GeoWeb Reliability

The Problem

The OGC model of distributed services, in which each organization runs its own web services, has a number of advantages and is being embraced worldwide. Its major problem, however, is reliability.

While some organizations have great IT departments committed to keeping their services highly available, some of the best data resides with organizations that struggle to keep their services live. This problem is exacerbated for centralized aggregators, like Statens kartverk or ESDIN's forthcoming aggregation services, as their service will appear less reliable if the cascaded services are not themselves reliable. This problem is most apparent with Web Map Service (WMS) delivery, where Google and other commercial players have set high availability and performance standards.

The Solutions

The solution favored by those with central authority is to require a certain level of uptime and reliability. This works reasonably well when there is political will and money to back it up, but it can be easily complemented by more bottom-up solutions to achieve higher levels of service.

The halfway solution is “provider caching”: making copies of the relevant information and serving them elsewhere. The OGC is in the process of adopting a Web Map Tile Service (WMTS) specification that will extend the WMS specification to create tiles that are more easily cached.

OpenGeo has been working on GeoWebCache, a caching solution that works with any WMS, and there are several other similar projects. Caching services sit between a WMS and the client, not only greatly increasing speed but improving reliability as well.

If a back-end WMS goes down, the cache can still serve the tiles it already holds. And since the caching engine is a much simpler and less resource-intensive product, it has greater overall availability. It can also be clustered easily for even higher availability. Requiring partners to install a cache in addition to their default service will ensure a greater degree of reliability.
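The behavior described above can be illustrated with a minimal sketch. It is not GeoWebCache code: `FallbackTileCache`, `TileKey`, and the `fetch_from_wms` callable are hypothetical names, and an in-memory dictionary stands in for a real disk store.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical tile address: (layer, zoom, column, row).
TileKey = Tuple[str, int, int, int]


class FallbackTileCache:
    """Answer from the local store first; contact the back end only on a miss."""

    def __init__(self, fetch_from_wms: Callable[[TileKey], bytes]) -> None:
        # fetch_from_wms is an assumed callable that renders one tile via the
        # back-end WMS and raises OSError (e.g. ConnectionError) on failure.
        self._fetch = fetch_from_wms
        self._store: Dict[TileKey, bytes] = {}

    def get_tile(self, key: TileKey) -> Optional[bytes]:
        # A cached tile never touches the back end, so it survives an outage.
        if key in self._store:
            return self._store[key]
        try:
            tile = self._fetch(key)
        except OSError:
            # Back end is down and the tile was never cached: nothing to serve.
            return None
        self._store[key] = tile
        return tile
```

In this sketch an outage only affects tiles that were never requested or seeded before the back end failed, which is why seeding popular areas ahead of time matters.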

The friendliest bottom-up solution with the lowest barrier to entry for data-serving partners is for an aggregator to provide them with centralized caches. This incentivizes partners to go through the aggregator, since it not only increases the reliability of their service but also reduces the load on their servers. The centralized cache is best if it is quite large, but it could also leverage techniques to only cache the most popular tiles requested in order to take the most stressful load off of partners.
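One way to keep only the most requested tiles within a fixed disk budget is least-recently-used (LRU) eviction. The sketch below is an illustration under that assumption, not a description of how GeoWebCache manages storage; `PopularTileCache` and `max_bytes` are hypothetical names, and LRU is used here as a simple stand-in for "most popular".

```python
from collections import OrderedDict
from typing import Hashable, Optional


class PopularTileCache:
    """Keep recently used tiles within a byte budget, evicting the least recent."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._tiles: "OrderedDict[Hashable, bytes]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[bytes]:
        tile = self._tiles.get(key)
        if tile is not None:
            # Mark the tile as recently used so it is evicted last.
            self._tiles.move_to_end(key)
        return tile

    def put(self, key: Hashable, tile: bytes) -> None:
        # Replacing an existing tile must not double-count its size.
        if key in self._tiles:
            self.used_bytes -= len(self._tiles.pop(key))
        self._tiles[key] = tile
        self.used_bytes += len(tile)
        # Evict from the least recently used end until the budget is met.
        while self.used_bytes > self.max_bytes and self._tiles:
            _, evicted = self._tiles.popitem(last=False)
            self.used_bytes -= len(evicted)
```

Tiles that are requested often stay near the "recent" end and are rarely evicted, so the cache absorbs the heaviest load while the long tail of rarely viewed tiles still falls through to the partner's service.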

The Product

A few key extensions to GeoWebCache could turn it into an ideal product for allowing an aggregator to cache partner WMS services, evolving from a provider cache to an aggregator cache. This service could take the cascading WMS services and start caching them to the maximum allowed disk space while keeping the most used tiles available for everyone.

To work with cascading WMS providers, the key component is to let their administrators truncate the cache: they notify the aggregator cache when they change the data or styles of their services, which triggers a re-cache. Partners should have control over the cache in several ways. They should be able to log in and manually expire the cache by drawing an area of the map and selecting their layer. Optionally, they should also be able to limit the rate at which the cache is seeded so as not to overload their service.
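Expiring the cache for a drawn area comes down to finding which tiles intersect a bounding box. The following sketch assumes a global EPSG:4326 grid with two columns and one row at zoom 0, tile indices counted from the lower-left corner, and a hypothetical function name `tiles_covering_bbox`; a real deployment would use whatever grid the cache is configured with.

```python
import math
from typing import Tuple


def tiles_covering_bbox(min_lon: float, min_lat: float,
                        max_lon: float, max_lat: float,
                        zoom: int) -> Tuple[int, int, int, int]:
    """Return (min_col, min_row, max_col, max_row) of tiles touching the box."""
    tile_deg = 180.0 / (2 ** zoom)   # tile width and height in degrees
    cols = 2 ** (zoom + 1)           # 360 degrees of longitude
    rows = 2 ** zoom                 # 180 degrees of latitude

    def clamp(value: int, upper: int) -> int:
        return max(0, min(upper, value))

    min_col = clamp(math.floor((min_lon + 180.0) / tile_deg), cols - 1)
    min_row = clamp(math.floor((min_lat + 90.0) / tile_deg), rows - 1)
    # ceil(...) - 1 keeps a box that ends exactly on a tile edge from
    # pulling in the neighbouring tile.
    max_col = clamp(math.ceil((max_lon + 180.0) / tile_deg) - 1, cols - 1)
    max_row = clamp(math.ceil((max_lat + 90.0) / tile_deg) - 1, rows - 1)
    # A degenerate (zero-area) box still touches at least one tile.
    return min_col, min_row, max(min_col, max_col), max(min_row, max_row)
```

Running this for every cached zoom level yields the tile ranges to delete and, if desired, to re-seed.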

In addition, the cache should provide automated notifications, such as GeoRSS feeds, to allow dynamic services to broadcast what has changed and what the cache needs to update. This will work well with implementations like GeoServer, which has a Geosynchronization module in the OGC OWS-5 testbed that sends notifications of transactions.
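As a rough illustration of feed-driven truncation, the sketch below pulls changed areas out of a feed that uses GeoRSS-Simple `georss:box` elements, which list the lower corner then the upper corner as latitude/longitude pairs. The function name `changed_bboxes` is hypothetical, and the feed layout a given service actually publishes is an assumption.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

GEORSS_NS = "http://www.georss.org/georss"

# (min_lon, min_lat, max_lon, max_lat)
BBox = Tuple[float, float, float, float]


def changed_bboxes(feed_xml: str) -> List[BBox]:
    """Collect every georss:box in the feed as a lon/lat bounding box."""
    root = ET.fromstring(feed_xml)
    boxes: List[BBox] = []
    for element in root.iter("{%s}box" % GEORSS_NS):
        parts = (element.text or "").split()
        if len(parts) != 4:
            continue  # skip malformed entries rather than guessing
        # GeoRSS-Simple order is "south west north east".
        south, west, north, east = (float(p) for p in parts)
        boxes.append((west, south, east, north))
    return boxes
```

Each returned box can then be handed to whatever truncation routine the cache exposes, so only the tiles covering changed data are discarded.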

Extending GeoWebCache to provide sophisticated reporting of statistics that document usage, report load patterns, and identify areas of interest would allow the cascading WMS providers to make informed decisions about infrastructure requirements and other priorities. In addition to the aggregate statistics, the results would be easily visualized as a heat map overlay on top of the original data.
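A statistics hook of this kind could be as small as a per-tile hit counter. The sketch below is an assumption about how such a hook might look, not an existing GeoWebCache feature; `TileUsageStats` and its method names are hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple


class TileUsageStats:
    """Count tile requests so usage can be reported or drawn as a heat map."""

    def __init__(self) -> None:
        # Key: (layer, zoom, column, row) -> number of requests.
        self._hits: Counter = Counter()

    def record(self, layer: str, zoom: int, col: int, row: int) -> None:
        self._hits[(layer, zoom, col, row)] += 1

    def hits_by_layer(self) -> Dict[str, int]:
        """Aggregate statistic: total requests per layer."""
        totals: Counter = Counter()
        for (layer, _zoom, _col, _row), count in self._hits.items():
            totals[layer] += count
        return dict(totals)

    def heat_grid(self, layer: str, zoom: int) -> Dict[Tuple[int, int], int]:
        """Per-tile counts for one layer and zoom, ready to render as a heat map."""
        return {
            (col, row): count
            for (name, z, col, row), count in self._hits.items()
            if name == layer and z == zoom
        }
```

Because the counts are keyed by tile, the heat map overlay is simply the tile grid colored by request count.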

GeoWebCache is a great base to build upon, as it already has a solid caching core that can use various back-ends, different cache expiry mechanisms, a REST API to programmatically control what is cached, and a basic user interface. The needed extensions would be:

  • Friendly user interface to allow partners to log in and truncate, seed, and update their caches
  • Improved security integration, handling of multiple users with per layer permissions
  • WMTS specification implementation
  • Truncation of cache based on GeoRSS feed notifications
  • Clustering / lateral cache synchronization / increased reliability
  • Queued requests to back-end servers to control load
  • Statistics collection hooks and presentation modules
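The queued-requests item in the list above could, for example, be realized by pacing the seeding queue. The sketch below is one possible approach under stated assumptions: `throttled` is a hypothetical helper, and the `clock` and `sleep` parameters exist only so the pacing can be tested without real waiting.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def throttled(items: Iterable[T],
              max_per_second: float,
              clock: Callable[[], float] = time.monotonic,
              sleep: Callable[[float], None] = time.sleep) -> Iterator[T]:
    """Yield queued items no faster than max_per_second."""
    interval = 1.0 / max_per_second
    next_allowed = clock()
    for item in items:
        now = clock()
        if now < next_allowed:
            # Wait until the partner's service is allowed another request.
            sleep(next_allowed - now)
        # Schedule the following item one interval after this one is released.
        next_allowed = max(now, next_allowed) + interval
        yield item
```

A seeding job would iterate over `throttled(tile_keys, max_per_second=...)` and fetch each tile as it is released, with the rate chosen by the partner.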

Together these would create a great back-end interface that provides a real win to partners, who would get clear value out of registering their services with a centralized catalog or aggregator. An OpenGeo Suite Enterprise contract would readily cover the work needed to accomplish these improvements.
