Machine-Readable Data and the “Social” Network of Machines

Much has been said recently about the importance of social networks and how they have affected everything from the response to emergency events to the rise of the Arab Spring. The impact has been so strong that when one raises the need for “information sharing” tools, most people respond with Facebook or Twitter, as if these were the only possible answers.

While social networks have been enormously important in linking people together, there is another kind of linking that will be at least as important in the coming years, although it will never be as visible. This is linking within the community of machines. By a community of machines, I do not mean the network of machines that supports only (or almost exclusively) human interaction, although at some point any community of machines must indeed interact with humans. I am referring to the network of machines that acquire, process, interpret, and present information to humans, often in real time, in order for us to make more effective and timely decisions about events in the world around us. This community of machines is much less understood in the popular media and in the popular imagination, and yet it is every bit as vital to our future.

What then do machines need to communicate with one another, and what does it mean that we have a community of machines?

I should start by saying that I am not in any way talking about artificial intelligence, at least not in the sense of machines being self-aware or having any idea of what they are doing. The sort of communication that I am talking about does, however, relate to conveying meaning from one machine to another. Consider, for example, the difference between sending a picture of a Word document and the Word document itself. In the first case, only a human being could determine the title of the document, or perhaps the author and the date of publication. In the second case, a computer program could quite easily extract all three pieces of information. The difference between these two cases is in the information that is transmitted and in how it is encoded. We might phrase this in terms of the kinds of questions that could be answered by processing the received information using a computer program. Note that we are not talking about deducing implicit meaning by processing the received information. We don’t expect to determine whether the author was happy when the document was being written; while that might be inferred from the author’s choice of words, we are talking only about the explicit encoding of information. I know who the author is because the word “author” appears in front of their name. In a similar way, in machine-readable data, there is an explicit model of the document that says such-and-such a string denotes the author of the document.
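To make the contrast concrete, here is a minimal sketch in Python. It assumes the document arrives as JSON with explicitly labelled fields; the field names (`title`, `author`, `published`) and the sample record are illustrative assumptions, not a real document format.

```python
import json

# Hypothetical machine-readable encoding of a document. Each string is
# explicitly labelled, so a program does not have to guess what it denotes.
document_json = """
{
  "title": "A Tale of Two Cities",
  "author": "Charles Dickens",
  "published": "1859",
  "body": "It was the best of times, it was the worst of times..."
}
"""


def describe(encoded: str) -> dict:
    """Return the title, author, and publication date of an encoded document."""
    record = json.loads(encoded)
    return {key: record[key] for key in ("title", "author", "published")}


metadata = describe(document_json)
print(metadata["author"])
```

A picture of the same page would contain the same words, but no program could answer “who is the author?” without first recognising the text and then guessing which string plays that role.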

A similar example arises in the transmission of design information. In conventional Computer-Aided Design (CAD), a drawing is encoded as a set of geometric elements (lines, squares, points, symbols, etc.) without regard to what these elements mean within the application domain in which they are used. A human being can easily read CAD drawings and interpret the labels (e.g. door, beam, ground wire) to associate meaning with the geometric elements. A computer program cannot. In conventional CAD drawings, we cannot ask the software to color the doors red; we can only highlight a layer in the drawing and color it red, or select a set of line elements and do the same.

In the so-called Building Information Model (BIM) encodings, a built structure is modeled in terms of entities and entity relationships that are meaningful (to humans) in a selected domain (e.g. building structures), and associated with these entities and relationships are geometric and topological elements or properties. In BIM encodings, unlike conventional CAD drawings, it makes perfect sense to ask the software to color all the windows red, or all of the doors blue.
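The difference between the two encodings can be sketched in a few lines of Python. The `Line` and `Element` classes and the layer name `A-01` are hypothetical, invented for illustration; they do not follow any actual CAD or BIM file format.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """A CAD-style primitive: geometry and a layer name, nothing more."""
    start: tuple
    end: tuple
    layer: str
    color: str = "black"


@dataclass
class Element:
    """A BIM-style entity: a domain type with geometry attached to it."""
    kind: str        # e.g. "door", "window", "wall"
    geometry: list   # the Line primitives that draw this entity
    color: str = "black"


def color_by_kind(model: list, kind: str, color: str) -> int:
    """Color every entity of the given kind; return how many were changed."""
    changed = 0
    for element in model:
        if element.kind == kind:
            element.color = color
            changed += 1
    return changed


# CAD view: three lines on one layer. Nothing says which of them form a door.
cad_drawing = [
    Line((0, 0), (0, 2), layer="A-01"),
    Line((0, 2), (1, 2), layer="A-01"),
    Line((3, 0), (3, 1), layer="A-01"),
]

# BIM view: the same geometry, now attached to entities with a declared kind.
bim_model = [
    Element("door", [cad_drawing[0], cad_drawing[1]]),
    Element("window", [cad_drawing[2]]),
]

doors_colored = color_by_kind(bim_model, "door", "red")
```

With only `cad_drawing` in hand, the best a program can do is recolor everything on layer `A-01`; the request “color the doors red” has no meaning until the model names something a door.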

These two examples illustrate the basic elements of machine communication. The critical part is that there is a model for the information exchanged in the communication. This model provides the context by which the information is understood by humans, and can be processed (e.g. color the windows red, italicize the author’s name) by machines. Note that, broadly speaking, two different sets of humans are involved in the process. In the first group are the end recipients of the information, the ones who say “Yes, I read other works by Dickens”. In the second group are the programmers who create the programs that receive, process, and transform the data for consumption by the first group. The first group cares about the actual content of what is transmitted, but may not care at all about the difference between the picture of the document and the document itself (as long as they get pictures of every page). The second group requires the model of the transmitted information in order to do anything beyond the most basic receipt and storage of what is received. Machine communication is really about the communication between programmers with respect to models of the data to be exchanged between machines.

So why does all this matter? To begin with, whether we make the measurements remotely or in situ, measurements that are transmitted from sensing devices are typically processed through multiple steps before they are presented for human viewing and interpretation. In the case of remotely sensed imagery, for example, the images are processed to compensate for atmospheric effects or to classify or “interpret” the image (e.g. find areas that are “bright” in the infrared, indicating the presence of water) and, of course, to enable the image to be geo-registered. Having the data be machine-readable is more or less essential in dealing with any kind of measurement information; however, it goes much further than that. Think about the shape or geometry of roads and buildings. A picture may be worth a thousand words, but it does not tell us which buildings are connected to one another, or how those connections are achieved. Nor can a picture tell us which sensor values should be associated with which wall surface area or room volume. But machines can “know” these things and be architected to share this information with one another.
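As a rough illustration of what it means for a machine to “know” these things, the Python sketch below records connections and sensor placements explicitly. All room names, sensor identifiers, and readings are made up for the example, and rooms stand in for the buildings and wall surfaces mentioned above.

```python
# Hypothetical building model: which spaces are connected (and how), and
# which sensor is installed in which room. None of this is visible in a picture.
connections = [
    ("lobby", "office", "door"),
    ("lobby", "stairwell", "door"),
]
sensor_location = {"T-1": "lobby", "T-2": "office"}

# Raw readings as (sensor id, temperature in degrees Celsius) pairs.
readings = [("T-1", 21.5), ("T-2", 23.0), ("T-1", 22.5)]


def neighbors(links: list, room: str) -> list:
    """Return the rooms directly connected to the given room, sorted by name."""
    found = set()
    for a, b, _via in links:
        if a == room:
            found.add(b)
        elif b == room:
            found.add(a)
    return sorted(found)


def readings_by_room(locations: dict, values: list) -> dict:
    """Group raw sensor readings under the room each sensor belongs to."""
    grouped = {}
    for sensor_id, value in values:
        grouped.setdefault(locations[sensor_id], []).append(value)
    return grouped


lobby_neighbors = neighbors(connections, "lobby")
per_room = readings_by_room(sensor_location, readings)
```

Because the relationships are data rather than pixels, a second machine receiving this model can answer the same questions without any human interpretation in between.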

One often hears the expression “We cannot control or understand what we cannot measure”. This is indeed true. Moreover, the problems that we are facing in our physical environment are very serious and we will need to take a more scientific/engineering approach to these problems, using the best measurements we can acquire, and the best models that we can construct.

Evidence-based decision making is critical to all our futures and, increasingly, those are urban futures. We can better prepare for an earthquake if we can forecast which buildings or structures are most at risk, determine which routes might offer the safest access to and exit from impacted areas, and estimate what might be the potential effect on human life if the infrastructure supporting water, transportation, electricity, and so forth is impacted by the event. Such forecasts require not only models of earthquakes, but also extensive information on the distribution of structures, roadways, and utilities, their types, date of construction, and value, as well as detailed demographics by age, income, etc. To continuously acquire, process, forecast, and distribute the simulation results to emergency decision makers requires ongoing communication amongst multiple machines, including those responsible for property assessment, building/roadway/utility design, event (e.g. earthquake) simulation, and for the communication of impact scenarios to emergency planners and responders. A permanent and evolving “social” network of machines is essential to such an activity. The same argument can be made in our preparations for and responses to floods, hurricanes, rising sea levels, increasing drought, and rising global temperatures.

One can truly say that the last decade has been the decade of social networks and, while I think there will be no decline in their importance, the next decade will belong more to structured machine-readable data — the social network of machines.