In a recent national experiment here in Canada, using Web Feature Service (WFS) servers to aggregate and update road and other data, it came as a surprise to some of the participants that schema mapping was REQUIRED. This same issue has been raised more recently in a number of public forums, and again as something of a revelation: “You mean schema mapping is necessary?”
Clearly the whole point of WFS has not been well explained, or we simply have not made people aware of it. Schema mapping is simply key!
Now what does this schema mapping mean anyway? Suppose you wish to share, aggregate or integrate data from a number of different spatial (or non-spatial) databases. How do you do it?
One approach (I will talk about this more in a subsequent blog) is to create a public schema that represents the feature types of interest for all of the prospective users. We might for example have a common model for roads, buildings and land parcels that spans all the counties of a state, municipalities of a province or country. This common model is expressed in terms of a schema (e.g. GML Application Schema) that states the names of the feature types, and the names, types and multiplicities of their properties. For example our schema might define the feature types (road, river, bridge, building), and for a road it might specify the properties name, description, numberOfLanes, surfaceType, and centerline. The road might have any number of names, a single description, a single numberOfLanes (integer), a single surfaceType (enumeration) and a single centerline (LineString).
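To make the road example concrete, here is a minimal sketch in Python of how the names, types and multiplicities of the properties might be written down and checked. This is not GML and not any particular WFS product's format; the names PropertyDef, ROAD_PROPERTIES and multiplicity_errors are hypothetical, and reading "a single" as "exactly one" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PropertyDef:
    name: str
    type_name: str
    min_occurs: int = 1
    max_occurs: Optional[int] = 1  # None means "unbounded"


# Hypothetical in-memory form of the public "road" feature type from the text.
ROAD_PROPERTIES = [
    PropertyDef("name", "string", min_occurs=0, max_occurs=None),  # any number
    PropertyDef("description", "string"),
    PropertyDef("numberOfLanes", "integer"),
    PropertyDef("surfaceType", "enumeration"),
    PropertyDef("centerline", "LineString"),
]


def multiplicity_errors(feature, properties=ROAD_PROPERTIES):
    """Return messages for properties that occur too few or too many times.

    `feature` maps each property name to a list of its values.
    """
    errors = []
    for prop in properties:
        count = len(feature.get(prop.name, []))
        if count < prop.min_occurs:
            errors.append(f"{prop.name}: expected at least {prop.min_occurs}, got {count}")
        elif prop.max_occurs is not None and count > prop.max_occurs:
            errors.append(f"{prop.name}: expected at most {prop.max_occurs}, got {count}")
    return errors
```

A road with two names passes the check, while a road with two numberOfLanes values is reported as a multiplicity error.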
Now, it is unlikely that we will get all of the counties, municipalities, states, etc. among whom we are trying to share data to adopt this common schema. They will all have existing use cases for their data, and they will have already developed schemas that model roads, buildings, and land parcels in order to support those use cases. They will be neither willing nor able to change their schemas to comply with the public one. This is where schema mapping comes in.
The purpose of a WFS is to provide, in part, vendor-neutral access to geospatial data, but more importantly to be able to provide this data relative to an external public schema. This means that the client of the WFS sees a schema (obtained via a DescribeFeatureType request) that is the public schema and is unaware of the schema used by the underlying and (to her/him) opaque data store. Note that this requirement exists even if all of the nodes in the network (for sharing or aggregation) were using the same vendor software. See Figure 1.
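The schema request mentioned above corresponds to the WFS DescribeFeatureType operation. As a rough sketch, a client could build the key-value-pair form of that request as follows. The endpoint https://example.org/wfs and the type name pub:road are made up for illustration, the helper name describe_feature_type_url is hypothetical, and the typeName parameter shown is the WFS 1.1.0 spelling (WFS 2.0 uses typeNames instead).

```python
from urllib.parse import urlencode


def describe_feature_type_url(endpoint, type_name, version="1.1.0"):
    """Build a DescribeFeatureType URL; no network request is made here."""
    params = {
        "service": "WFS",
        "version": version,
        "request": "DescribeFeatureType",
        "typeName": type_name,  # WFS 1.1.0 spelling; WFS 2.0 uses "typeNames"
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and feature type name, for illustration only.
url = describe_feature_type_url("https://example.org/wfs", "pub:road")
```

Whatever the server returns for this request is the public schema; nothing in the request or response needs to reveal how the data store behind the WFS models its roads.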
The differences between the internal and external schemas may be trivial (e.g. just name or spelling changes) or they may be very significant (e.g. geometry in the internal database is represented using a different model than in the public schema). Note that the desired schema mappings may not always be possible, and this needs to be assessed when selecting and configuring the WFS, as some products have more restricted mapping capabilities than others (and some have NONE).
Note that schema mapping has to be carried forward in both directions in most cases, meaning that one needs to map the data on requests and also on transactions such as updates, inserts and deletes.
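A toy sketch of this two-way mapping is shown below. The internal column names (RD_NAME, LANES, SURF_CD) and the surface codes are invented for illustration, and a real WFS would normally express such a mapping in its own configuration rather than in application code.

```python
# Hypothetical internal column names mapped to public property names.
FIELD_MAP = {"RD_NAME": "name", "LANES": "numberOfLanes", "SURF_CD": "surfaceType"}
# Hypothetical internal surface codes mapped to public enumeration values.
SURFACE_CODES = {1: "paved", 2: "gravel"}

# The reverse tables are derived so both directions stay consistent.
REVERSE_FIELD_MAP = {public: internal for internal, public in FIELD_MAP.items()}
REVERSE_SURFACE_CODES = {value: code for code, value in SURFACE_CODES.items()}


def to_public(row):
    """Map an internal record to the public schema (used when answering queries)."""
    feature = {FIELD_MAP[k]: v for k, v in row.items() if k in FIELD_MAP}
    if "surfaceType" in feature:
        feature["surfaceType"] = SURFACE_CODES[feature["surfaceType"]]
    return feature


def to_internal(feature):
    """Map a public feature back to the internal schema (used on inserts and updates)."""
    row = {REVERSE_FIELD_MAP[k]: v for k, v in feature.items() if k in REVERSE_FIELD_MAP}
    if "SURF_CD" in row:
        row["SURF_CD"] = REVERSE_SURFACE_CODES[row["SURF_CD"]]
    return row
```

In this simple case the two functions are inverses of each other, so a record survives the round trip unchanged. Mappings that lose information in one direction, such as merging two internal fields into one public property, are the ones that make transactions harder than queries.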
With this approach we can think of replicating data between an ESRI spatial database (say ESRI ArcGIS Server over MS SQL Server) and an Oracle Spatial database, in either direction. More importantly, the same approach can deal with the schema (modeling) differences between the two nodes.
To make all of this practical we have to manage the public schemas themselves, since there is typically some level of work in establishing the schema mappings. Management of such schemas is thus an important function of the spatial data infrastructure (SDI), a point we will come back to in a later blog.