
Cascading and Federated WFS and the Concept of Geolinking

As many people have pointed out, especially here in Canada, there is a great deal of geographic or geographically related information that does not reside in spatial or GIS databases. Nonetheless, there is a need to link this information with associated geospatial entities (e.g. administrative or jurisdictional boundaries) for the purposes of spatial analysis and map display. In fact, a geolinking service has been proposed at the Open Geospatial Consortium (OGC) for just this purpose.

An apparently unrelated issue is the modeling of geographic features and their consequent support in a Web Feature Service (WFS). One organization might, for example, model a road as a generic “RoadWay” and define specific subtypes for “Street”, “Highway” and “Expressway”, while another might simply add a classification attribute to the “RoadWay”. Clearly the two models are not equivalent, but they are very similar.

Another apparently unrelated issue is that of traditional conflation. In this case you may have two different descriptions of a building, with different geometric and non-spatial properties. In conflating these two descriptions, you might like to use the geometry from one description and the non-spatial properties of the other.
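A minimal sketch of such a merge, taking the geometry from one description and the remaining properties from the other. The function name `conflate`, the property names and the two building records are hypothetical and only illustrate the idea; real conflation also has to decide that the two records describe the same building in the first place.

```python
def conflate(geometry_source, attribute_source, geometry_key="geometry"):
    """Merge two descriptions of one feature.

    Keeps every non-geometry property of attribute_source and the
    geometry of geometry_source.
    """
    merged = {k: v for k, v in attribute_source.items() if k != geometry_key}
    merged[geometry_key] = geometry_source[geometry_key]
    return merged


# Two hypothetical descriptions of the same building (placeholder values).
survey_record = {"geometry": "<footprint from survey>", "name": "Bldg 12"}
assessment_record = {
    "geometry": "<footprint from assessment>",
    "name": "City Hall",
    "storeys": 3,
}

building = conflate(survey_record, assessment_record)
print(building)
```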

How are these three issues (geolinking, model representation and conflation) related to each other? And what does this have to do with the WFS (Web Feature Service)?

For a quick review, a WFS is a web service that provides transactional access (insert, update, delete and query) to geospatial data using XML messages, in a manner that is vendor neutral and that hides the underlying data store (e.g. storage technology, schemas, etc.). Most WFS have been implemented as client-server architectures, and many even employ RPC (Remote Procedure Call), but this is not really required. REST-based architectures are not inconsistent with the WFS specification.

Let’s start with the idea of geolinking. To make matters more concrete, assume that we have two databases. The first is a relational database containing population data (birth rates, mortality rates, and current populations) for a variety of jurisdictional entities (e.g. cities, municipalities, provinces, counties, states, etc.). The schema of this database is as follows:

Jurisdiction | Jurisdiction Type | Birth Rate (births/year) | Mortality (deaths/year) | Population (current)

A sample fragment from the database table might then look like:

Jurisdiction | Jurisdiction Type | Birth Rate (births/year) | Mortality (deaths/year) | Population (current)
Niagara      | County            | 10.62                    | 6.81                    |
Welland      | Municipality      | 10.7                     | 6.9                     | 51,275

Completely separate from this database (i.e. located in a different data store and likely managed by a different organization) is a database that contains the boundaries or extents of the jurisdictional entities, for example the Province of Ontario (the Canadian province containing Niagara and Welland). Assume that this database provides the spatial extent, expressed as a polygon, for all of the jurisdictional entities in Canada, and that there are features defined for Municipalities, Counties and Provinces, with each instance of these feature types having an ID value (e.g. type = Municipality, ID = “Welland”).

Now let’s proceed to link these two databases together, that is, to geolink them: to associate the attributes in the relational database with the geometry in the spatial database.

To begin with, take a feature perspective on the relational database and implicitly assert features, with types defined by the values of the enumerated attribute “JurisdictionType”, and with the local database resource identifier “Jurisdiction”. Such a mapping could readily be supported by installing a WFS on the relational database and suitably configuring the WFS schema mapping (see http://www.galdosinc.com/archives/525 ). Note that this will give rise to a particularly simple GML schema representing the demographic data.
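The feature perspective described above can be sketched in a few lines. This is only an illustration of the idea, not how any particular WFS product configures its schema mapping: the column names `BirthRate`, `Mortality` and `Population`, the dictionary layout and the function names are assumptions made for the example, and Niagara's population is left empty because the sample table does not give it.

```python
# Rows as they might come back from the relational table in the sample above.
ROWS = [
    {"Jurisdiction": "Niagara", "JurisdictionType": "County",
     "BirthRate": 10.62, "Mortality": 6.81, "Population": None},
    {"Jurisdiction": "Welland", "JurisdictionType": "Municipality",
     "BirthRate": 10.7, "Mortality": 6.9, "Population": 51275},
]

KEY_COLUMNS = ("Jurisdiction", "JurisdictionType")


def row_to_feature(row):
    """Treat one row as a feature: type from JurisdictionType, ID from Jurisdiction."""
    properties = {k: v for k, v in row.items() if k not in KEY_COLUMNS}
    return {
        "type": row["JurisdictionType"],
        "id": row["Jurisdiction"],
        "properties": properties,
    }


def features_by_type(rows):
    """Group the implied features by their feature type."""
    grouped = {}
    for row in rows:
        feature = row_to_feature(row)
        grouped.setdefault(feature["type"], []).append(feature)
    return grouped


features = features_by_type(ROWS)
print(sorted(features))
```

Each distinct value of “JurisdictionType” becomes a feature type, which is why the resulting schema is so simple: every type carries the same three demographic properties.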

Now let’s install a WFS onto the geospatial database as well, so that in both cases we can request features by ID and other properties, using the WFS request protocol.
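As a rough illustration of what such a request looks like, the sketch below builds a GetFeature request in the key-value-pair (HTTP GET) form. The parameter names follow the WFS 1.1.0 encoding as I understand it, and they differ slightly between WFS versions, so check them against the version your server implements. The host names are hypothetical.

```python
from urllib.parse import urlencode


def get_feature_url(base_url, type_name=None, feature_id=None, version="1.1.0"):
    """Build a GetFeature request URL, by feature ID or by feature type."""
    if type_name is None and feature_id is None:
        raise ValueError("need a type name or a feature ID")
    params = {"service": "WFS", "version": version, "request": "GetFeature"}
    if feature_id is not None:
        params["featureID"] = feature_id
    else:
        params["typeName"] = type_name
    return base_url + "?" + urlencode(params)


# The same kind of request can be sent to either WFS (hypothetical hosts).
demographic_url = get_feature_url(
    "http://demographics.example.org/wfs", feature_id="Welland")
boundary_url = get_feature_url(
    "http://boundaries.example.org/wfs", type_name="Municipality")
print(demographic_url)
print(boundary_url)
```

The point is that the client does not need to know that one service sits on a relational database and the other on a spatial one.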

To link these two datasets there needs to be a special kind of cascading WFS that can perform the needed schema mapping and effect the desired geolinking. This special WFS presents the usual WFS interfaces to the rest of the world, namely the GetCapabilities, DescribeFeatureType, and GetFeature operations. It then translates these operations into further operations against the two WFS installed above. A GetCapabilities operation on this Cascading WFS would result in GetCapabilities requests to each of the underlying WFS, with the Cascading WFS using its mapping rules to create a single Capabilities document in response. For example, it could return a single list of feature types, namely Province, County, and Municipality. If we then issued a DescribeFeatureType request for Municipality, it would return a single application schema for Municipality that combined the spatial information from one WFS and the attribute information from the other (Mortality, Birth Rate, etc.), by doing a “join” on the feature ID. To generate a map of Ontario showing the birth rate by county, a client would make a request to the Cascading WFS, which would in turn translate this request into queries to the other two WFS and effect the required join operation on the returned data.
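The behaviour described above can be sketched with two in-memory stand-ins for the underlying services. Everything here is an assumption made for illustration: the class names `StubWFS` and `CascadingWFS`, the method names, the property names and the placeholder geometry strings. A real implementation would exchange XML with remote services rather than Python dictionaries.

```python
class StubWFS:
    """In-memory stand-in for a WFS: feature type -> {feature ID -> properties}."""

    def __init__(self, data):
        self.data = data

    def get_capabilities(self):
        return sorted(self.data)

    def describe_feature_type(self, type_name):
        names = set()
        for props in self.data.get(type_name, {}).values():
            names.update(props)
        return sorted(names)

    def get_feature(self, type_name):
        return self.data.get(type_name, {})


class CascadingWFS:
    """Presents one WFS-like interface over a spatial and an attribute WFS."""

    def __init__(self, spatial, attributes):
        self.spatial = spatial
        self.attributes = attributes

    def get_capabilities(self):
        # One merged list of feature types from both services.
        return sorted(set(self.spatial.get_capabilities())
                      | set(self.attributes.get_capabilities()))

    def describe_feature_type(self, type_name):
        # One combined schema: geometry plus demographic properties.
        return sorted(set(self.spatial.describe_feature_type(type_name))
                      | set(self.attributes.describe_feature_type(type_name)))

    def get_feature(self, type_name):
        # Join on feature ID; features missing from either side are dropped.
        geoms = self.spatial.get_feature(type_name)
        attrs = self.attributes.get_feature(type_name)
        return {fid: {**geoms[fid], **attrs[fid]} for fid in geoms if fid in attrs}


spatial = StubWFS({
    "Province": {"Ontario": {"extent": "<polygon for Ontario>"}},
    "County": {"Niagara": {"extent": "<polygon for Niagara>"}},
    "Municipality": {"Welland": {"extent": "<polygon for Welland>"}},
})
demographics = StubWFS({
    "County": {"Niagara": {"BirthRate": 10.62, "Mortality": 6.81}},
    "Municipality": {"Welland": {"BirthRate": 10.7, "Mortality": 6.9,
                                 "Population": 51275}},
})

cascade = CascadingWFS(spatial, demographics)
print(cascade.get_capabilities())
print(cascade.get_feature("County"))
```

The join here is an inner join, so Ontario, which has no demographic record in this sample, is not returned. Whether to keep such features with empty attributes instead is a design decision for the mapping rules.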

Such a specialized cascading or federated WFS can also deal with the issue of variant models for geographic features. Consider the two spatial databases for roads discussed earlier. Suppose we now deploy a Cascading WFS which exposes a different road model, namely one with a generic notion of a NavigablePath and with subtypes for Road, Railway, and FerryRoute. For the Road subtype, assume also a “type” attribute specifying the kind of road as an enumerated value, namely Road, Street, Boulevard, Highway, Freeway, or Tollway. The Cascading WFS is then configured to map its feature types to the feature types of the cascaded WFS. For example, (Road, type = “Road”, “Street” or “Boulevard”) is mapped to the feature type “Street” of one database, and to (“RoadWay”, classification = “Street”) in the other. When a client issues a request to the Cascading WFS, the WFS uses its mapping rules to generate queries to the cascaded WFS, and then transforms and integrates the responses. With this approach, different models for the road system can be handled using an extended Cascading WFS.
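One way to picture such mapping rules is as a small lookup table. The sketch below encodes only the single mapping given in the example above; the rule layout, the backend labels `subtype_db` and `attribute_db`, and the function name `translate` are hypothetical.

```python
# One rule per group of cascading-level road types. Only the mapping from the
# example above is encoded; the other road types are deliberately left unmapped.
MAPPING_RULES = [
    {
        "cascading": {"feature_type": "Road",
                      "types": {"Road", "Street", "Boulevard"}},
        "backends": {
            # Database that models roads as separate subtypes.
            "subtype_db": {"feature_type": "Street", "filter": {}},
            # Database that models roads as RoadWay plus a classification.
            "attribute_db": {"feature_type": "RoadWay",
                             "filter": {"classification": "Street"}},
        },
    },
]


def translate(feature_type, road_type):
    """Return the per-backend queries for one cascading-level request."""
    for rule in MAPPING_RULES:
        target = rule["cascading"]
        if target["feature_type"] == feature_type and road_type in target["types"]:
            return rule["backends"]
    return {}  # no rule configured for this combination


queries = translate("Road", "Boulevard")
for backend, query in queries.items():
    print(backend, query)
```

Keeping the rules as data rather than code means a new cascaded WFS, or a revised road model, only requires new rules and not a new Cascading WFS.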

By now it should be apparent that conflation is also something that could be handled by a suitably configured Cascading WFS. Of course, this discussion has glossed over issues of performance and the complexity of the mappings involved, some of which clearly will require numerical, string or even geometric transformations. Nonetheless, it makes sense to think of these three different issues as related to particular Cascading WFS implementations, each performing data translation in addition to cascading of requests, and then to explore the needed types of translations. This will come in a future blog post.
