Is GML a format?

One often hears the term "data format" without much discussion of what it means. People talk about converting from one format to another even when the two express distinctly different semantics, for example "the conversion from Shape format to SVG format". While this "abuse of language" may be convenient, it often obscures what is really going on.

What is a data format anyway?

Those who have been around long enough to remember when database theory was still in fashion will recall the terms "physical and logical independence" of programs and data. Loosely put, this meant that one could write computer programs that accessed data without knowing the physical location and structure of the data elements they were reading or writing. Without such independence, data access software was very brittle and broke any time someone added a new "field" to a data file, even if the program made no use of the field in question. Such independence of programs and data was much touted as a key rationale for databases. Databases allowed the writer or reader to perform data access operations without knowledge of the physical structure of the data.

Programs might read data into internal record structures, but these structures existed only in the program and were completely decoupled from the actual structures used by the database for data storage. New "fields" could be added to the database without any impact on existing programs and without requiring any change to their internal data structures.
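To make this concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `roads` table, its columns, and the `road_names` helper are hypothetical names chosen only for illustration. The reading program asks for the data it needs by name, so adding a new column leaves it untouched.

```python
import sqlite3

# Hypothetical table for illustration: a few roads held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE roads (object_id TEXT, name TEXT)")
conn.execute("INSERT INTO roads VALUES ('42', 'Main Street')")


def road_names(connection):
    """Read road names by column name, with no knowledge of storage layout."""
    cursor = connection.execute("SELECT name FROM roads ORDER BY object_id")
    return [row[0] for row in cursor]


before = road_names(conn)

# Someone adds a new "field" that this program never uses.
conn.execute("ALTER TABLE roads ADD COLUMN road_type TEXT")

after = road_names(conn)
```

The reader function is not changed between the two calls, and it returns the same names before and after the new column appears.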

What then are data formats? Essentially they are just data record structures that are written to a file. A format specification provides the structure of the records and their external semantics, e.g. the first 10 characters are the object ID, and so on. Often these formats are isolated behind APIs, but this does not change the nature of the format itself. Programs that deal with the formatted file are in the same position as data access software in the pre-database age.
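The brittleness can be sketched in a few lines of Python. The layout below (a 10-character object ID followed by a 20-character name) and the `parse_record` helper are assumptions made up for this example, not taken from any real format specification.

```python
# Hypothetical fixed-width layout: characters 0-9 are the object ID,
# characters 10-29 are the name.
def parse_record(line: str) -> dict:
    return {
        "object_id": line[0:10].strip(),
        "name": line[10:30].strip(),
    }


# A record written according to the original layout.
old_line = "0000000042" + "Main Street".ljust(20)

# The same record after someone inserts a 4-character "type" field
# right after the ID. Every later offset shifts by four characters.
new_line = "0000000042" + "ROAD" + "Main Street".ljust(20)

old_record = parse_record(old_line)
new_record = parse_record(new_line)
```

The parser makes no use of the new field, yet the name it extracts from `new_line` is wrong because the positions it relies on have moved.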

Is this changed by the emergence of XML? Can we speak of GML as a format?

I would argue no. GML is NOT a format. Creators of software that reads or writes GML do not think about how the XML is laid out in a file, and typically never touch that layout directly. There are NO specifications for the length of records or even for their physical position within a file structure. Software accesses the data through the interfaces provided by the parser (e.g. a DOM tree or a SAX event stream), in which the items of interest are defined by the associated XML Schema (the GML Application Schema). This means that such software is independent of the physical organization of the data and really does deal with the data in terms of the logical model defined by XML (i.e. the XML Infoset).
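A small sketch with Python's standard xml.etree.ElementTree parser illustrates the point. The `Road` feature, the `lanes` property, and the `http://example.org/app` namespace are hypothetical stand-ins for a real GML Application Schema, and no schema validation is performed here. The two documents differ in line breaks, indentation, namespace prefixes, and the presence of an extra property, yet the same reading code works on both.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"

# Everything on one line, using the prefixes "app" and "gml".
compact = (
    '<app:Road xmlns:app="http://example.org/app" '
    'xmlns:gml="http://www.opengis.net/gml">'
    "<gml:name>Main Street</gml:name></app:Road>"
)

# Same logical content with an XML declaration, different prefixes,
# different physical layout, and an extra property the reader ignores.
expanded = """<?xml version="1.0"?>
<r:Road xmlns:r="http://example.org/app"
        xmlns:g="http://www.opengis.net/gml">
    <g:name>Main Street</g:name>
    <r:lanes>2</r:lanes>
</r:Road>"""


def road_name(document: str) -> str:
    """Look up gml:name by its qualified name, not by position in the file."""
    root = ET.fromstring(document)
    return root.find(f"{{{GML_NS}}}name").text


name_from_compact = road_name(compact)
name_from_expanded = road_name(expanded)
```

The reader refers only to the element's namespace and local name, which is the logical identity the schema gives it. Byte offsets, record lengths, and prefixes never enter the code.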

One can thus think of GML (and any XML grammar) as a kind of "local database" that brings the independence of programs and data to the world of information exchange.

So GML is not a format.
