Communities of interest often overlap, and as a result there are frequently many database representations of a given real world object, typically reflecting different points of view about that object. From the point of view of a ship a lighthouse is an aid to navigation, while for an aircraft it is merely an obstacle, and for someone on land, walking or in a vehicle, it is a feature of interest. These different perspectives on the same object are often difficult to reconcile. Furthermore, it is often difficult to know that we are even talking about the same object.
Names are assigned to physical features, and gazetteers have long been employed to link these names to the associated physical feature. To the toponymist, the name has priority, as from their perspective the corresponding physical feature may be ill defined geographically or at least defined in some fuzzy manner. There are often many names that are linked together, with one of the names being recognized in some sense as the official one, although I suspect that, with the advent of Web 2.0, agreement on an official name may become increasingly problematic.
Addresses are another sort of common attribute that we might associate with a physical object such as a building, a land parcel, or a prospective building site (the building need not yet exist). A given physical object might easily have several addresses associated with it, just as it may have several geometric properties.
Addresses are of course critical since they serve as a tie point for legal matters, postal delivery and emergency response.
On reflection one might think that these different notions of names, addresses, and object identification could be unified in some way, and I would propose that this be done using a Geographic Entity Registry. Such a registry could be global (whole planet) or quite local, such as a province or state. Identifiers would be globally unique in either case. The registry would not seek to capture any more information about an entity than is required to identify it. This might include a globally unique ID, a nominal position, a bounding rectangle, some textual description, and perhaps a photograph or a satellite or aerial image. Remember, this information is NOT intended to be used for anything other than the identification of the real world object and its discrimination from other real world objects.
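To make the idea of an identification-only record more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the class and field names, the use of a random UUID as the globally unique ID, and the coordinates of the example lighthouse. It is not a proposed schema.

```python
import uuid
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned rectangle in degrees (hypothetical; ignores antimeridian crossing)."""
    west: float
    south: float
    east: float
    north: float

    def contains(self, lon: float, lat: float) -> bool:
        return self.west <= lon <= self.east and self.south <= lat <= self.north


@dataclass(frozen=True)
class EntityRecord:
    """Holds only what is needed to identify an entity, nothing descriptive beyond that."""
    entity_id: str                          # globally unique, otherwise meaningless tag
    nominal_position: Tuple[float, float]   # (longitude, latitude)
    bounding_box: BoundingBox
    description: str
    image_url: Optional[str] = None         # optional photograph or aerial image


def new_entity_id() -> str:
    # Assumption: a random UUID stands in for whatever scheme a real registry would use.
    return uuid.uuid4().hex


# Illustrative record; the coordinates are made up for the example.
lighthouse = EntityRecord(
    entity_id=new_entity_id(),
    nominal_position=(-123.25, 49.35),
    bounding_box=BoundingBox(-123.26, 49.34, -123.24, 49.36),
    description="Lighthouse on a rocky point (illustrative record)",
)
```

Note that the record carries no light characteristics, heights, or visitor information. Those belong in the nautical, aviation, and land databases that describe the entity from their own point of view.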
It might also be important to the identification to provide specific associations between the object in question and other objects in the registry. For example, a tower might be a component of a bridge, and the bridge a component of a highway system.
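The tower, bridge, and highway example can be sketched as a small index of "part of" associations between identifiers. The class name, method names, and identifier strings below are hypothetical, and a real registry would likely support other association types as well.

```python
from collections import defaultdict


class AssociationIndex:
    """Stores 'part of' links between entity identifiers (hypothetical sketch)."""

    def __init__(self):
        # Maps a part's identifier to the identifiers of the wholes it belongs to.
        self._wholes = defaultdict(set)

    def add_part_of(self, part_id: str, whole_id: str) -> None:
        self._wholes[part_id].add(whole_id)

    def all_wholes(self, entity_id: str) -> set:
        """Return every entity that the given entity is directly or indirectly part of."""
        seen = set()
        stack = [entity_id]
        while stack:
            current = stack.pop()
            for whole in self._wholes.get(current, ()):
                if whole not in seen:   # guards against cycles in bad data
                    seen.add(whole)
                    stack.append(whole)
        return seen


index = AssociationIndex()
index.add_part_of("ge-tower-1", "ge-bridge-1")      # the tower is a component of the bridge
index.add_part_of("ge-bridge-1", "ge-highway-1")    # the bridge is a component of the highway system

print(index.all_wholes("ge-tower-1"))
```

Following the links upward from the tower reaches both the bridge and the highway system, which is the kind of context that can help discriminate one tower from another.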
Geographic entities should in no way be restricted, and can include things that are transient in nature, non-physical (e.g. a political boundary), or which are planned or have been removed, in addition to more familiar and supposedly permanent objects. In fact the notion of a real world identifier must take this into account, with a new identifier being created at the point when a given object has been modified so much that its identity has changed fundamentally.
Some of these geographic entities may have addresses assigned to them, and in principle many such addresses might be assigned to a given entity. Should this be part of the identification information or part of a particular description of the entity maintained in another database? The same question applies to names. I think this is likely more a question of the business processes associated with the maintenance of this information than anything else. One could certainly conceive of a geographic entity registry that contains the identifiers for all geographic entities and, where appropriate, associates multiple names and multiple addresses with these entities.
The issue of which databases are describing the same real world object is now answered by having these different database entries refer to the appropriate geographic entity identifier. Hence the lighthouse identifier would be referenced alike by the aviation, nautical and land databases, even though they provide completely different descriptions of that lighthouse. It is then perfectly clear that all of these databases are talking about the same real world entity.
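As a rough illustration of the lighthouse case, the sketch below shows three independent databases whose records share nothing except a reference to the same entity identifier. The identifier string, field names, and attribute values are all invented for the example.

```python
LIGHTHOUSE_ID = "ge-0001"  # hypothetical identifier issued by the registry

# Each community keeps its own description; only entity_id is shared.
databases = {
    "nautical": [{"entity_id": LIGHTHOUSE_ID, "role": "aid to navigation", "light_range_nm": 15}],
    "aviation": [{"entity_id": LIGHTHOUSE_ID, "role": "obstacle", "height_m": 30}],
    "land": [{"entity_id": LIGHTHOUSE_ID, "role": "feature of interest"}],
}


def descriptions_of(entity_id: str, dbs: dict) -> dict:
    """Collect every record, per database, that refers to the given entity."""
    return {
        name: [record for record in records if record["entity_id"] == entity_id]
        for name, records in dbs.items()
    }


def same_entity(record_a: dict, record_b: dict) -> bool:
    """Two records describe the same real world object if their identifiers match."""
    return record_a["entity_id"] == record_b["entity_id"]


views = descriptions_of(LIGHTHOUSE_ID, databases)
for name, records in views.items():
    print(name, [record["role"] for record in records])
```

The comparison never looks at names, positions, or geometry. Whether two records concern the same object is decided entirely by the identifier they reference.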
What about the primacy of name from the toponymist’s perspective? Is this not subverted by our geographic entity registry? Not really. Our registry must really be seen as a registry of identifiers, and names are kinds of identifiers. All names attached to the same identifier are considered to be alternative names for the same thing. Furthermore, one can readily assign any weighting (alternative, official, etc.) that one may choose, and since the identifier is itself a meaningless tag, there is likely to be less conflict as to which name has primacy over any other. We can also see that different identifiers may be associated with the same name. For example, “Horseshoe Bay” is a part of West Vancouver, and a part of Howe Sound. Of course these are easily distinguished by having different identification strings, by being classified as different types of entities (taxonomies) in the entity registry, and by having different associations to other identifiers (e.g. the “part of” association). Yet the name “Horseshoe Bay” can be used to refer to each of them.
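The “Horseshoe Bay” case can be sketched as a name lookup that returns more than one identifier, with type and “part of” used to tell the results apart. The identifier strings, the type labels ("community" and "bay"), and the name statuses are assumptions for illustration, and the "part of" target is stored as a plain label here only to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NamedEntity:
    entity_id: str
    entity_type: str                 # taxonomy classification (hypothetical labels)
    part_of: str                     # label of the containing entity, for readability
    names: Dict[str, str] = field(default_factory=dict)   # name -> weighting


registry = [
    NamedEntity("ge-101", "community", "West Vancouver", {"Horseshoe Bay": "official"}),
    NamedEntity("ge-102", "bay", "Howe Sound", {"Horseshoe Bay": "official"}),
]


def find_by_name(entities: List[NamedEntity], name: str,
                 entity_type: Optional[str] = None) -> List[NamedEntity]:
    """Return all entities carrying the name, optionally narrowed by type."""
    return [
        entity for entity in entities
        if name in entity.names
        and (entity_type is None or entity.entity_type == entity_type)
    ]


matches = find_by_name(registry, "Horseshoe Bay")
only_bay = find_by_name(registry, "Horseshoe Bay", entity_type="bay")
print([(m.entity_id, m.entity_type, m.part_of) for m in matches])
```

The name alone is ambiguous, but each match has its own meaningless identifier, so the toponymist's name keeps its standing while the registry keeps the two entities distinct.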
What is emerging in this discussion are some of the key requirements for a Geographic Entity Registry, namely: 1) the ability to assign globally unique identifiers; 2) the ability to attach enough information to these identifiers so that we know what entity is being considered; 3) the ability to assign multiple names and possibly addresses to an identifier; 4) the ability to classify the identified entities by type (taxonomy); and 5) the ability to associate entities with one another.
I believe we will see global Geographic Entity Registries emerge in the next few years. This will be an important step in building the GeoWeb.