- Galdos Systems Inc. - https://www.galdosinc.com -

Richer Semantics for Geography – CSW-ebRIM

When I was first working on the Geography Markup Language (GML) in 1998, I was much taken with the efforts of the Resource Description Framework (RDF) and the associated RDF Schema (RDFS) to incorporate greater meaning into the description of things, mostly web pages. A meeting with Tim Berners-Lee probably had something to do with it, but using RDF, rather than DTDs, to encode geography seemed to me to be the right direction to go. RDFS had notions of classes and properties, and RDF/RDFS was very much focused on the encoding of meaning. For that reason, GML was originally written in RDFS; the idea of an application schema just did not seem possible with DTDs.

Problems with RDF/RDFS did arise, however, especially with respect to the expression of concrete types. Here, RDFS had nothing to say, other than to refer to type definitions in XML Schema Datatypes. Furthermore, while the assignment of properties to classes in RDF was very attractive, there was no way to determine all of the properties that were assigned to a given class without checking the entire Internet (remember that schemas in RDFS could each assign properties to a class defined elsewhere).

These issues were not well received by the (mostly relational) database world of the time. Significant effort was made to port GML from RDF to XML Schema, and all versions of GML since Version 2.0 have been encoded only in XML Schema. While every attempt was made to retain most of the features of the RDF/RDFS version (e.g. global identifier, object-property-value, remote property value, application schema), this was not totally possible, and one might say that semantics suffered in the bargain.

GML has been quite successful as a vehicle for interoperability in that it has spawned, or been directly used to encode, a number of domain-specific languages such as CityGML, AIXM, DIGGS, WaterML, GeoSciML, and so on. This is entirely in keeping with the program from the earliest days of GML. On the other hand, GML has been less successful as a vehicle for semantic interoperability where no such standard vocabularies are, or can be, defined. Unfortunately, this is a common case in wide area data and systems integration such as Spatial Data Infrastructures. This is not to say that GML is not useful in such situations, simply that it is not sufficient by itself.

Some will say that this “problem” in GML could be solved by returning to its roots and porting 2013 GML into RDF. Certainly, today, RDF has more tools and support than it did in its early heyday of 1998-2000. While RDF can indeed be part of the solution, I think it is insufficient when used alone, and I include in that statement the existence of triple-store databases.

There is, however, another approach to expressing geo-semantics, one that uses another OGC standard: the ebXML Registry Information Model profile of Catalogue Services for the Web (CSW-ebRIM). CSW-ebRIM leverages GML and overcomes most of these difficulties, and we will see that this approach enables us to put most of the semantics back into GML.

While our focus in this article is on CSW-ebRIM as a data model, it is equally important to realize that the CSW-ebRIM specification also defines transactional interfaces for a registry service, a fact which is of considerable importance in this discussion and to which we will return in a following article.

The CSW-ebRIM data model provides a number of constructs, including ExtrinsicObject, ExternalLink, ClassificationScheme, Association, and Slot, which are used to build a business application-specific information model, such as a model for land use or for an urban infrastructure. The following diagram shows the basic ebRIM data model.

Version 3.0 ebRIM Data Model – Class Hierarchy

CSW-ebRIM adds spatial constructs to the above data model, providing Slots (i.e. properties) whose values are encoded in GML Simple Features. Any RegistryObject can have such properties, so CSW-ebRIM can readily be used to create all sorts of geospatially-enabled information models.

The ExtrinsicObject is somewhat analogous to a feature in GML, in that ExtrinsicObjects can be subtyped. For example, we can create a Road or Building ExtrinsicObject and we can assign it all manner of properties (Slots) including one or more geometric or temporal ones (using GML). In this sense, the type of the object is a subtype of ExtrinsicObject, just as, in a GML application schema, a feature is a subtype of gml:AbstractFeatureType. Unlike GML, however, CSW-ebRIM allows us to additionally classify an ExtrinsicObject according to a user-defined classification scheme. This classification scheme can then add semantics to the feature model in a manner which is not possible in GML. Furthermore, in CSW-ebRIM, we can classify a given ExtrinsicObject under multiple taxonomies at the same time, hence multiple semantic inheritance is supported – again something that is NOT possible in GML.
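
As a rough illustration of classifying one object under two taxonomies at once, the fragment below applies two Classification elements to the same object. The element and attribute names are those of the ebRIM 3.0 schema; every identifier (the urn:x-example values, the two schemes, and the "Landmark" node) is a hypothetical placeholder chosen for this sketch and is not taken from a real registry.

```xml
<!-- Sketch only: all urn:x-example identifiers are hypothetical placeholders. -->
<rim:RegistryObjectList xmlns:rim="urn:oasis:names:tc:ebxml-regrep:xsd:rim:3.0">
  <!-- First taxonomy: an assumed scheme of structure types, with a Tower node. -->
  <rim:Classification
      id="urn:x-example:classification:cn-tower-structure"
      classifiedObject="urn:x-example:object:cn-tower"
      classificationNode="urn:x-example:scheme:structures:Tower"/>
  <!-- Second taxonomy: an assumed tourism scheme, with a Landmark node. -->
  <rim:Classification
      id="urn:x-example:classification:cn-tower-tourism"
      classifiedObject="urn:x-example:object:cn-tower"
      classificationNode="urn:x-example:scheme:tourism:Landmark"/>
</rim:RegistryObjectList>
```

Because each Classification points at the object it classifies, a further taxonomy can be applied by adding another Classification, without touching the object's type.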

Another powerful feature of the CSW-ebRIM data model is the Association. Associations are typed entities that express relationships between RegistryObjects; in particular, ExtrinsicObjects can have properties, including spatial properties. Thus it is easy to say that the ExtrinsicObject “CN Tower Restaurant” “isAPartOf” the “CN Tower” by first classifying the “CN Tower Restaurant” as a Restaurant and the “CN Tower” as a Tower, and then associating one with the other via the user-defined “isAPartOf” relationship. Expressing relationships in GML is possible, but only by putting properties on the respective objects.

We would have to write something like:

<abc:Restaurant gml:id="CNTowerRestaurant">
    <abc:isAPartOf xlink:href="#CNTower"/>
</abc:Restaurant>
<abc:Tower gml:id="CNTower">
    <abc:contains xlink:href="#CNTowerRestaurant"/>
</abc:Tower>

while GML, embedded in an ExtrinsicObject, might look something like this:

<wrs:ExtrinsicObject id="urn:uuid:31e65425-d0aa-495c-afde-8f49b3da6e11" lid="urn:uuid:31e65425-d0aa-495c-afde-8f49b3da6e11" objectType="urn:uuid:569F3323-9E6F-B770-252E-C120408EA19E:Tower" status="urn:oasis:names:tc:ebxml-regrep:StatusType:Submitted" xmlns:rim="urn:oasis:names:tc:ebxml-regrep:xsd:rim:3.0" xmlns:wrs="http://www.opengis.net/cat/wrs/1.0" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="http://www.opengis.net/cat/csw/2.0.2">
  <rim:Slot name="location" slotType="urn:ogc:def:dataType:ISO-19107:2003:GM_Point">
    <wrs:ValueList>
      <wrs:AnyValue>
        <gml:Point gml:id="G1" srsName="urn:ogc:def:crs:EPSG::4326" xmlns:gml="http://www.opengis.net/gml">
          <gml:pos>43.64265265625022 -79.38708975573326</gml:pos>
        </gml:Point>
      </wrs:AnyValue>
    </wrs:ValueList>
  </rim:Slot>
  <rim:Slot name="height" slotType="urn:x-dgiwg:def:uom:DFDD::metre"> … </rim:Slot>
  <rim:Name><rim:LocalizedString xml:lang="en" value="CN Tower"/></rim:Name>
  <rim:VersionInfo versionName="20130827T190221Z"/>
  <rim:Classification … </rim:Classification>
</wrs:ExtrinsicObject>

Note that, in GML, we have no way to express the relationship after the fact except by modifying the data, which is something we may NOT be able to do. There is no such problem in CSW-ebRIM as Associations are stored separately from the instances that they relate to, and contain pointers to the source and target objects participating in the Association. This means that Associations can be created between objects even if no such relationship was established when the objects were created. Moreover, the objects being associated do not need to live in the same data store and can be located anywhere on the Internet – yet another thing that is not possible in GML.
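
To make the contrast concrete, a stand-alone Association for the “isAPartOf” relationship might look roughly like the sketch below. The rim:Association element and its associationType, sourceObject, and targetObject attributes come from the ebRIM 3.0 schema; the Association id, the “isAPartOf” type identifier, and the restaurant's identifier are assumptions made up for illustration. The targetObject reuses the Tower identifier from the ExtrinsicObject example above.

```xml
<!-- Sketch only: the Association id, the associationType value, and the
     sourceObject id are hypothetical. The targetObject is the Tower's id. -->
<rim:Association xmlns:rim="urn:oasis:names:tc:ebxml-regrep:xsd:rim:3.0"
    id="urn:x-example:association:restaurant-isAPartOf-tower"
    associationType="urn:x-example:associationType:isAPartOf"
    sourceObject="urn:x-example:object:cn-tower-restaurant"
    targetObject="urn:uuid:31e65425-d0aa-495c-afde-8f49b3da6e11"/>
```

Neither the Restaurant nor the Tower record is modified here: the relationship lives entirely in the Association, which refers to both objects by identifier.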

In summary, by using CSW-ebRIM, with its embedded GML, we can go a long way toward capturing and expressing semantics in a world of location.