Classifications and Associations

Online travel companies are in the business of selling hotel rooms, tours, cruises, and a whole range of related services. Geographic objects, in the form of data about points of interest, draw travelers to a particular region. All of these objects are part of the key business data, or master data, of an online travel company.

Many Users and Millions of Objects

Master geographic data of this kind typically runs to millions of objects. The supporting information system must therefore handle both automatic online harvesting (e.g. pulling data from existing sites and databases) and efficient push insert/update transactions from a variety of data capture tools (e.g. digitizing over Google Earth or similar) operated by many data capture personnel working concurrently. Large data volumes and many concurrent users also call for scalability, which in turn implies web services that support both horizontal and vertical scaling.

Classifications and Collections

Classifications (or taxonomies) allow objects to be grouped with other objects according to common properties. An INdicio registry allows assets to be catalogued and classified under any number of classifications at the same time. A license, for example, might be classified both by its type (e.g. Creative Commons) and by its state (e.g. Expired).
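The idea of classifying one asset under several taxonomies at once can be sketched in a few lines of Python. This is only an illustrative in-memory model, not the INdicio API: the class `ClassificationIndex`, its methods, and the scheme names `license_type` and `state` are all hypothetical.

```python
from collections import defaultdict


class ClassificationIndex:
    """Hypothetical in-memory index: one asset may sit under many taxonomies."""

    def __init__(self):
        # (scheme, category) -> set of asset ids filed under that category
        self._members = defaultdict(set)
        # asset id -> {scheme: category}
        self._by_asset = defaultdict(dict)

    def classify(self, asset_id, scheme, category):
        """File an asset under a category; replaces any earlier category
        the asset had within the same scheme."""
        old = self._by_asset[asset_id].get(scheme)
        if old is not None:
            self._members[(scheme, old)].discard(asset_id)
        self._by_asset[asset_id][scheme] = category
        self._members[(scheme, category)].add(asset_id)

    def find(self, **criteria):
        """Return the assets that match every scheme=category pair given."""
        sets = [self._members.get((s, c), set()) for s, c in criteria.items()]
        return set.intersection(*sets) if sets else set()


index = ClassificationIndex()
index.classify("lic-1", "license_type", "Creative Commons")
index.classify("lic-1", "state", "Expired")
expired_cc = index.find(license_type="Creative Commons", state="Expired")
```

Keeping a separate membership set per category makes a multi-taxonomy lookup a simple set intersection, and reclassifying an asset touches only the scheme being changed.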

Collections

Assets can be grouped into user-defined collections. Collections can contain other collections, and can also be assigned properties, be classified, or be associated with other resources. A collection can be deleted at the touch of a button without affecting the objects it contains.
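A minimal sketch of this behavior, assuming a collection stores only references to its members: deleting the collection then removes the grouping and nothing else. The `Registry` class and its method names are hypothetical and are not taken from INdicio.

```python
class Registry:
    """Hypothetical registry in which collections hold member ids only."""

    def __init__(self):
        self.assets = {}        # asset id -> asset data
        self.collections = {}   # collection id -> set of member ids

    def add_asset(self, asset_id, data):
        self.assets[asset_id] = data

    def create_collection(self, coll_id, members=()):
        # Members may be asset ids or ids of other collections.
        self.collections[coll_id] = set(members)

    def delete_collection(self, coll_id):
        # Drop the grouping only; member assets are left untouched.
        del self.collections[coll_id]
        for members in self.collections.values():
            members.discard(coll_id)

    def expand(self, coll_id, _seen=None):
        """Return every asset id reachable through nested collections."""
        seen = _seen if _seen is not None else set()
        if coll_id in seen:      # guard against collections that contain each other
            return set()
        seen.add(coll_id)
        result = set()
        for member in self.collections[coll_id]:
            if member in self.collections:
                result |= self.expand(member, seen)
            elif member in self.assets:
                result.add(member)
        return result
```

Because membership is by reference, the same asset can appear in any number of collections, and removing a collection never cascades to the assets themselves.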

Classifications

Asset classifications can have properties, and can be associated with one another and with other objects. The same is true of every category within a classification.

Associations

Assets can be associated with one another using any number of user-defined associations. These associations can be traversed, searched (e.g. “find the objects associated with a given asset”), and graphically displayed.
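One way to picture typed associations is as a directed, labeled graph. The sketch below is an assumption for illustration only: `AssociationGraph`, its methods, and association names such as `locatedIn` are hypothetical.

```python
from collections import defaultdict, deque


class AssociationGraph:
    """Hypothetical store of typed, directed associations between assets."""

    def __init__(self):
        # source id -> list of (association type, target id)
        self._out = defaultdict(list)

    def associate(self, source, assoc_type, target):
        self._out[source].append((assoc_type, target))

    def associated(self, source, assoc_type=None):
        """Search: objects directly associated with `source`,
        optionally restricted to one association type."""
        return [t for a, t in self._out.get(source, [])
                if assoc_type in (None, a)]

    def traverse(self, start, assoc_type=None):
        """Breadth-first traversal; returns reachable ids in visit order."""
        seen, order, queue = {start}, [], deque([start])
        while queue:
            node = queue.popleft()
            for nxt in self.associated(node, assoc_type):
                if nxt not in seen:
                    seen.add(nxt)
                    order.append(nxt)
                    queue.append(nxt)
        return order
```

The `seen` set keeps the traversal finite even when associations form cycles, which is common once relationships such as “near” or “sameAs” are recorded in both directions.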

Classifying and Associating Objects

Because information about points of interest, hotels, and so on may come from many different sources, the information system for mastering geographic objects must be able to assign a unique identifier to each object and to represent relationships between identified instances (e.g. “sameAs”, “part of”). This is what keeps the same real-world object from being recorded more than once. It is equally essential that the system provide a flexible means of classifying any data object under one or many taxonomies at the same time (e.g. hotel type, point of interest type, hotel category, star ranking), and of quickly changing how objects are classified or related to one another.
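To make the identifier and “sameAs” requirements concrete, here is a small sketch that assumes UUIDs as identifiers and a union-find structure for merging duplicates. Both choices are assumptions made for this example, and the class `IdentityRegistry` and the source names are hypothetical.

```python
import uuid


class IdentityRegistry:
    """Hypothetical id registry: one UUID per harvested record,
    with sameAs links resolved to a single canonical id."""

    def __init__(self):
        self._ids = {}      # (source, key within source) -> object id
        self._parent = {}   # object id -> parent id in the sameAs forest

    def register(self, source, source_key):
        """Return the id for a source record, minting one on first sight."""
        key = (source, source_key)
        if key not in self._ids:
            new_id = str(uuid.uuid4())
            self._ids[key] = new_id
            self._parent[new_id] = new_id
        return self._ids[key]

    def canonical(self, obj_id):
        """Follow sameAs links to the representative id (with path halving)."""
        while self._parent[obj_id] != obj_id:
            self._parent[obj_id] = self._parent[self._parent[obj_id]]
            obj_id = self._parent[obj_id]
        return obj_id

    def same_as(self, a, b):
        """Record that two ids denote the same real-world object."""
        root_a, root_b = self.canonical(a), self.canonical(b)
        if root_a != root_b:
            self._parent[root_b] = root_a
```

Re-harvesting the same source record returns the existing identifier rather than creating a duplicate, while records from different sources keep their own identifiers and are linked through `same_as`.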

Working with Geographic Data

Objects in an INdicio registry can have geographic properties such as location and extent. For example, a photograph could carry the location where it was acquired, and a tourist region could have a polygon describing its extent. Objects with geographic properties can be displayed on maps and searched using geographic requests (e.g. find all photographs in a region drawn on a map).
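A request such as “find all photographs in a region drawn on a map” reduces to a point-in-polygon test. The sketch below uses the standard ray-casting method and treats coordinates as planar (x, y) pairs, which is an assumption that is reasonable only for small regions. The function names are hypothetical, and a production system would normally rely on a spatial index instead of scanning every object.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test. `polygon` is a list of (x, y) vertices in order;
    the last vertex is implicitly joined back to the first."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray cast to the right of the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def photos_in_region(photos, region):
    """`photos` maps photo id -> (x, y) location; returns ids inside `region`."""
    return [pid for pid, loc in photos.items() if point_in_polygon(loc, region)]
```

Points that fall exactly on the boundary may be reported either way by this test, so a real service would need to state its boundary rule explicitly.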

Location Matters

Such an information system must also be spatially aware, with full support for point, line, and area geometries, as needed to describe, for example, a point of interest, a tourist route, and an entertainment district respectively. In many cases it is also useful for a data object to have multiple geometric properties. For example, a tourist region may have a point associated with it that marks the “center of the action”, while a polygonal boundary defines the complete area that it encompasses.

Geospatial Queries – NoSQL Data Model

Geospatial querying is as important as geographic data representation. Data developers and online processing programs alike must be able to issue spatial, taxonomic, association, and free-text queries, or any logical combination of these query components, and quickly get responses that can be automatically transformed for additional filtering and/or presentation.
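The phrase “any logical combination of these query components” can be illustrated with composable predicates. The sketch below is a toy in-memory filter, not a real query language: `GeoObject`, the predicate builders, and `run_query` are hypothetical names, and the spatial component is simplified to a bounding-box test on a single point.

```python
from dataclasses import dataclass, field


@dataclass
class GeoObject:
    id: str
    name: str
    center: tuple                                    # (x, y) representative point
    categories: dict = field(default_factory=dict)   # taxonomy -> category
    related: set = field(default_factory=set)        # ids of associated objects


# Each builder returns a predicate: a function GeoObject -> bool.
def in_bbox(min_x, min_y, max_x, max_y):             # spatial component
    return lambda o: (min_x <= o.center[0] <= max_x
                      and min_y <= o.center[1] <= max_y)

def classified(scheme, category):                    # taxonomic component
    return lambda o: o.categories.get(scheme) == category

def associated_with(other_id):                       # association component
    return lambda o: other_id in o.related

def text_contains(term):                             # free-text component
    return lambda o: term.lower() in o.name.lower()

# Logical combinators over predicates.
def all_of(*preds):
    return lambda o: all(p(o) for p in preds)

def any_of(*preds):
    return lambda o: any(p(o) for p in preds)

def negate(pred):
    return lambda o: not pred(o)

def run_query(objects, pred):
    """Return the ids of the objects that satisfy the predicate."""
    return [o.id for o in objects if pred(o)]
```

A query such as “hotels or anything named museum inside this box” then reads `all_of(in_bbox(0, 0, 10, 10), any_of(classified("type", "hotel"), text_contains("museum")))`, and the list of ids it returns can be passed on for further filtering or presentation.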

Notifications

Automated notifications can alert administrators when something changes, such as the expiration of a license or the arrival of a new image or video clip. Any change to a registry object can automatically trigger a notification to a user by SMS, e-mail, etc. Because notification is built on a plug-in architecture, new notification types can be supported easily as they emerge.
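A plug-in notification design can be sketched as a dispatcher that knows nothing about individual channels. This is an assumed structure for illustration: the `Notifier` class, the event names, and the channel names are hypothetical, and the channels here would be plain callables supplied by the caller rather than real SMS or e-mail integrations.

```python
class Notifier:
    """Hypothetical dispatcher: channels are plug-ins registered by name."""

    def __init__(self):
        self._channels = {}        # channel name -> callable(recipient, message)
        self._subscriptions = []   # (event type, channel name, recipient)

    def register_channel(self, name, send):
        """Plug in a new delivery channel without changing the dispatcher."""
        self._channels[name] = send

    def subscribe(self, event_type, channel, recipient):
        if channel not in self._channels:
            raise KeyError(f"unknown channel: {channel}")
        self._subscriptions.append((event_type, channel, recipient))

    def publish(self, event_type, message):
        """Deliver the message to every subscriber of this event type;
        returns the number of notifications sent."""
        delivered = 0
        for subscribed_event, channel, recipient in self._subscriptions:
            if subscribed_event == event_type:
                self._channels[channel](recipient, message)
                delivered += 1
        return delivered
```

Supporting a new notification type then means registering one more callable with `register_channel`; the code that detects registry changes and calls `publish` stays the same.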

Data Governance

Finally, any supporting information platform must provide a suite of services for data governance, including a built-in audit trail for all data changes, automated notification of users or software systems, and user-configurable life cycle status management for all data objects.
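Two of these governance services, the audit trail and configurable life cycle status, can be sketched together. The example assumes that a life cycle is configured as a table of allowed status transitions; the `GovernedObject` class, its methods, and the status names are hypothetical.

```python
from datetime import datetime, timezone


class GovernedObject:
    """Hypothetical data object that records every change it undergoes."""

    def __init__(self, obj_id, transitions, initial_status):
        self.id = obj_id
        self.status = initial_status
        self.properties = {}
        self.audit_trail = []             # append-only list of change records
        # User-configurable life cycle: status -> set of allowed next statuses
        self._transitions = transitions

    def _record(self, user, action, old, new):
        self.audit_trail.append({
            "at": datetime.now(timezone.utc),
            "user": user,
            "action": action,
            "old": old,
            "new": new,
        })

    def set_property(self, user, key, value):
        old = self.properties.get(key)
        self.properties[key] = value
        self._record(user, f"set {key}", old, value)

    def change_status(self, user, new_status):
        allowed = self._transitions.get(self.status, set())
        if new_status not in allowed:
            raise ValueError(f"{self.status} -> {new_status} is not allowed")
        old, self.status = self.status, new_status
        self._record(user, "status", old, new_status)
```

A rejected transition raises before anything is written, so the audit trail contains only changes that actually took effect.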