21 Mar 2007 - The Architecture of the GeoWeb
In the run-up to GeoWeb 2007 (http://www.geoweb.org), it is helpful to start thinking about the various interpretations of the GeoWeb. Some think of the GeoWeb as an aggregation of Spatial Data Infrastructures. Others think of it in terms of data supply chains analogous to the physical supply chains for manufactured goods. Still others see the GeoWeb as a particular instantiation of the Semantic Web. All are valid interpretations. In this note, we want to take a more basic perspective and compare, on an architectural basis, the conventional World Wide Web and the GeoWeb, which of course builds upon the former. As we shall see, the two have many things in common.
The conventional web is based on text indexes, as shown in Figure 1. A Search Engine is in effect a large index. It responds to word queries from a browser by returning text summaries together with the address of the web site (HTTP Server) where the referenced web pages or files can be found. It is important to realize that the web content does not reside at the search engine, but rather "out on the web". The Search Engine is populated by a so-called Robot or Web Crawler. This is a program which is given a set of starting HTTP addresses on the web, and which understands the structure of HTML documents. It "crawls" the web by reading each HTML page, creating a summary of the page content, then looking for and following any embedded links (which are again HTTP addresses). The Web Crawler continues in this manner until there are no more pages to traverse from the starting set of nodes.
The browser uses the information returned from the Search Engine (e.g. Google, Yahoo) to construct a list of alternatives for the user. The user's choice then drives a request to the selected web site, where the actual content resides. For this entire mechanism to work, we need only a few simple standards: HTML (document structure with embedded links to other web pages), and HTTP as the protocol for requesting and delivering information between the browser and a web site, and between the crawler and a web site. Note that the search engine is itself a web site. Over time this basic mechanism has been extended to include non-HTML files (such as SVG, Word documents, GML, images, etc.) and, more recently, KML files.
Figure 1. Conventional Web Search Architecture
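The crawl-and-index loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three example.org pages are hypothetical and are held in memory rather than fetched over HTTP, the names LinkAndTextParser, crawl and PAGES are invented for this sketch, and the "summary" of each page is reduced to a plain word index.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndTextParser(HTMLParser):
    """Collects link targets and lower-cased words from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(w.lower() for w in data.split())


def crawl(seeds, fetch):
    """Breadth-first crawl from the seed URLs; returns {word: set of URLs}."""
    index = {}
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        html = fetch(url)          # assumption: returns None for unknown URLs
        if html is None:
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        for word in parser.words:  # index the page under every word it contains
            index.setdefault(word, set()).add(url)
        for href in parser.links:  # follow embedded links not yet visited
            target = urljoin(url, href)
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return index


# Hypothetical three-page web, standing in for real HTTP servers.
PAGES = {
    "http://example.org/": '<html><body>GeoWeb home <a href="/gml.html">GML</a></body></html>',
    "http://example.org/gml.html": '<html><body>GML notes <a href="kml.html">KML</a> <a href="/">home</a></body></html>',
    "http://example.org/kml.html": "<html><body>KML notes</body></html>",
}

index = crawl(["http://example.org/"], PAGES.get)
print(sorted(index["notes"]))
```

Note that the index stores only words and addresses. The page content itself stays in PAGES, in the same way that real content stays "out on the web".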
The GeoWeb follows a similar model, except that it extends the picture to account for the fact that a great deal of information has an associated location or extent on the earth's surface. This spatial relationship can be relatively simple (e.g. the location of a building) or quite complex (e.g. geographic references in the body of a document). For a GeoWeb to work, we need to extend the Search Engine's index to incorporate some type of spatial index, such as an R-Tree, a Quad-Tree or some combination of the two. This enables objects to be indexed by their location (where on the earth's surface) rather than just by keyword.
Spatial indexing can proceed in a couple of different directions. Documents may contain references to places (e.g. place names), which can then be located and spatially indexed. Other information may be inherently spatial in nature, such as a GIS data set, an ESRI Shape file or a GML file. These can also be spatially indexed by an appropriate Spatial Robot, e.g. one that can construct and update R-trees by reading files on the Internet. Google has already implemented such a strategy using KML files. Clearly other spatial file types, such as GML files and Shape files, can be "crawled" and indexed in a similar manner.
Most geospatial information, however, does not live in files at all, but rather in managed applications or in databases (the same can be said of many other types of data). To access this information, the Web Crawler (spatial robot) will need some way to connect to these databases and determine the extent of the objects they contain. In some cases this might be handled through Web Feature Service (WFS) interfaces, since a WFS is able to provide the extent (as a minimum bounding rectangle, or MBR) for each of the feature types that it contains. This would make the task of the Web Crawler particularly simple. A GeoWeb version of Figure 1 might thus look as shown in Figure 2.
Figure 2. GeoWeb Search Architecture
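To make the idea of indexing by extent concrete, here is a minimal Python sketch of a spatial index over MBRs. It is deliberately not an R-Tree or Quad-Tree: it uses a uniform grid of cells, a much simpler stand-in that answers the same kind of "what is near here?" query. The class GridIndex, the function overlaps, the URLs and the rough bounding boxes are all hypothetical, chosen only for illustration. In practice a spatial robot would obtain the extents from the files or services it crawls.

```python
from collections import defaultdict
from math import floor


def overlaps(a, b):
    """True if two boxes (min_lon, min_lat, max_lon, max_lat) intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class GridIndex:
    """Uniform-grid spatial index: maps grid cells to the URLs of resources
    whose bounding box touches that cell."""

    def __init__(self, cell_deg=10.0):
        self.cell = cell_deg
        self.cells = defaultdict(set)
        self.boxes = {}

    def _cells_for(self, box):
        min_lon, min_lat, max_lon, max_lat = box
        for i in range(floor(min_lon / self.cell), floor(max_lon / self.cell) + 1):
            for j in range(floor(min_lat / self.cell), floor(max_lat / self.cell) + 1):
                yield (i, j)

    def insert(self, url, box):
        self.boxes[url] = box
        for key in self._cells_for(box):
            self.cells[key].add(url)

    def query(self, box):
        # Gather candidates from the touched cells, then test exact overlap.
        candidates = set()
        for key in self._cells_for(box):
            candidates |= self.cells.get(key, set())
        return sorted(u for u in candidates if overlaps(self.boxes[u], box))


# Hypothetical crawl results: resource address plus approximate extent.
spatial_index = GridIndex()
spatial_index.insert("http://example.org/vancouver.kml", (-123.3, 49.0, -122.9, 49.4))
spatial_index.insert("http://example.org/bc_roads.gml", (-139.1, 48.3, -114.0, 60.0))
spatial_index.insert("http://example.org/paris.kml", (2.2, 48.8, 2.5, 48.9))

# "What is indexed around Vancouver?"
print(spatial_index.query((-124.0, 48.0, -122.0, 50.0)))
```

As with the text index, only addresses and extents are stored. A client would still follow the returned address to retrieve the KML or GML content itself.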
It is likely that all of the big search engines will move to spatial crawling in the near future. This will add a new and exciting spatial dimension to the web, making the GeoWeb a reality that much sooner.



