05 Mar 2006 - GML Complexity Re-visited
I have discussed the issue of GML complexity a number of times in this blog. Mostly we have looked at things like the number of tags, use of XML Schema, subject complexity and so forth. Most of it was pretty qualitative. We had no real measures of the complexity, nor comparisons to other established XML grammars to see how GML stacked up. Well, now some folks over at Microsoft, led by Stan Kitsis, have set about creating a number of XML Schema metrics and have applied these to a large number of schemas, GML among them. Their work used GML v3.1, which is close enough to the current release (GML v3.1.1) and the pending GML v3.2 that their results reflect the GML we are all working with or planning to work with.
The paper is entitled "Analysis of XML Schema Usage" and begins by developing a variety of metrics for XML Schema size and complexity, and for the utilization of particular XML Schema features (e.g. model-group operators, simple type features, occurrence features, subtyping and friends, mixed content, wildcards, identity constraints and modularization).
They then provide statistics on the application of these metrics to a set of 63 schema projects from different IT sectors. Some were internal to Microsoft and some were external, including of course GML. The schemas included some 6,000 individual schema files, with roughly 82,000 global element names.
So how did GML stack up? There is not space to go over all of the findings, and I will leave that to Stan and the Microsoft folks. However, just a few items will give you the general idea.
Schema Size based on Lines of Code (LOC)
The distribution of the 63 schemas across LOC-based size categories is shown in the table below, with GML's position marked.
| LOC-based category | Definition | Schema count |
|---|---|---|
| Mini | 0 - 100 | 0 |
| Small | 100 - 1,000 | 12 |
| Medium | 1,000 - 10,000 | 24 |
| Large (GML) | 10,000 - 100,000 (GML: 10,291 lines) | 23 |
| Huge | 100,000 - … | 4 |
It is clear from this measure that GML is at the bottom end of the large schemas.
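As a rough illustration of how the LOC categories above can be applied, here is a minimal Python sketch. The function name `loc_category` is hypothetical, and treating each range as half-open (so that exactly 100 lines counts as Small rather than Mini) is an assumption; the table does not say which side the boundaries fall on.

```python
from bisect import bisect_right

# Upper bounds of the Mini, Small, Medium and Large categories from the table.
# Assumption: each range is half-open, [low, high).
LOC_BOUNDS = [100, 1_000, 10_000, 100_000]
LOC_LABELS = ["Mini", "Small", "Medium", "Large", "Huge"]


def loc_category(loc: int) -> str:
    """Return the LOC-based size category for a schema with `loc` lines."""
    if loc < 0:
        raise ValueError("line count cannot be negative")
    # bisect_right counts how many bounds are <= loc, which is the label index.
    return LOC_LABELS[bisect_right(LOC_BOUNDS, loc)]


if __name__ == "__main__":
    # GML at 10,291 lines falls just past the Medium/Large boundary.
    print(loc_category(10_291))
```

With GML's 10,291 lines the function lands in the Large bin, only a few hundred lines above the 10,000 threshold.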
Schema Size - Based on size in kilobytes.
The schemas in the study ranged from 6 Kbytes to 18 Mbytes. Most of these schemas (26 of the 63) are in the range of 100 Kbytes to 1 Mbyte, and this is indeed where we find GML, at 532 Kbytes. There were not many small schemas (only 6 under 10 Kbytes) and, as one might expect, not many really large schemas (only 11 in that range).
Number of Complex Type Definitions
Some people think GML is complex because it declares so many complex types - well, does it?
According to the Microsoft study this metric ranged over the following:
| #CT-based category | Definition | Schema count |
|---|---|---|
| Mini | 0 - 32 | 13 |
| Small | 32 - 100 | 12 |
| Medium | 100 - 256 | 14 |
| Large | 256 - 1,000 | 12 |
| Huge | 1,000 - … | 12 |
And GML? It has 287 complex types, so it is again at the bottom end of the large schemas.
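To make the complex-type metric concrete, here is a small Python sketch that counts `xs:complexType` definitions in a schema document and places the count in one of the categories above. It is only an approximation under stated assumptions: the helper names are hypothetical, the sample schema is invented for illustration, the sketch counts both named and anonymous complex types (the study may count them differently), and the category ranges are treated as half-open.

```python
import xml.etree.ElementTree as ET
from bisect import bisect_right

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Upper bounds of the Mini, Small, Medium and Large categories from the table.
# Assumption: each range is half-open, [low, high).
CT_BOUNDS = [32, 100, 256, 1_000]
CT_LABELS = ["Mini", "Small", "Medium", "Large", "Huge"]


def count_complex_types(xsd_text: str) -> int:
    """Count every xs:complexType element, named or anonymous (an assumption)."""
    root = ET.fromstring(xsd_text)
    return sum(1 for _ in root.iter(f"{{{XSD_NS}}}complexType"))


def ct_category(count: int) -> str:
    """Return the #CT-based category for a given complex type count."""
    if count < 0:
        raise ValueError("count cannot be negative")
    return CT_LABELS[bisect_right(CT_BOUNDS, count)]


# Invented sample schema: one named and one anonymous complex type.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="RoadType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="Bridge">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="span" type="xs:double"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

if __name__ == "__main__":
    print(count_complex_types(SAMPLE))  # counts for the tiny sample schema
    print(ct_category(287))             # category for GML's reported 287
```

A real measurement would need to run over every schema file in a project and sum the counts, since the study reports figures per schema project rather than per file.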



