Ultra-scale Information Management: no place for rigid standards
This year's ER conference (International Conference on Conceptual Modeling) was held in Tucson, Arizona, from Nov. 6 to Nov. 9. This year's ER was special because it marked the conference's 25th anniversary, and the organizers had prepared a few special events. For me, however, the best experience of ER 2006 was listening to a great talk given by Scott Renner from the MITRE Corporation.
Scott's talk was titled "Community Semantics For Ultra-Scale Information Management." Here is the abstract.
The U.S. Department of Defense (DoD) presents an instance of an ultra-scale information management problem: thousands of information systems, millions of users, billions of dollars for procurement and operations. Military organizations are often viewed as the ultimate in rigid hierarchical control. In fact, authority over users and developers is widely distributed, and centralized control is quite difficult – or even impossible, as many of the DoD core functions involve an extended enterprise that includes completely independent entities, such as allied military forces, for-profit corporations, and non-governmental organizations. For this reason, information management within the DoD must take place in an environment of limited autonomy, one in which influence and negotiation are as necessary as top-down direction and control.
This presentation examines the DoD’s information management problems in the context of its transformation to network-centric warfare (NCW). The key tenet of NCW holds that “seamless” information sharing leads to increased combat power. We examine several implications of the net-centric transformation and show how each depends upon shared semantic understanding within communities of interest. Opportunities for research and for commercial tool development in the area of conceptual modeling will be apparent as we go along.
The promise of successful information sharing is "when the right information is provided to the right people at the right time and place so that they can make the right decisions" [1]. In trying to protect these five rights (right information, right people, right time, right place, and right decision), DoD practice has shown that top-down standardization can achieve partial success but fails overall for ultra-scale information management systems. Once an information system becomes large enough, a single vocabulary simply does not work any more, even in an environment as rigidly organized as the military.
Straightforwardly, this conclusion leads to another prediction. At the scale of the World Wide Web, where (1) the number of users is far greater than the number of soldiers, (2) the domain is far larger and more complicated than the military domain, and (3) no authority has the power to enforce anything across the Web, there is absolutely no chance of success for any plan to design a single standard for global information sharing, such as the Semantic Web.
Information is more than data. Scott made an interesting point: information is data plus how that data is understood. While data itself is objective, the understanding of its meaning can vary from person to person. Different understandings may thus lead to completely different decisions based on the very same data. Hence it is necessary to separate the two kinds of representation. What the data is differs from what the data is for.
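A minimal sketch of this point, under assumptions of my own rather than anything shown in the talk: the field name, the two communities ("navy", "army"), and their units are all hypothetical. The same raw value is read through two community vocabularies and leads to opposite decisions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """Raw data: a bare value with no meaning attached."""
    field: str
    value: float


# Hypothetical community vocabularies: each community reads the same
# field in its own unit, given here as metres per unit.
COMMUNITY_UNITS = {
    "navy": ("nautical miles", 1852.0),
    "army": ("kilometres", 1000.0),
}


def interpret(reading: Reading, community: str) -> float:
    """Return the reading in metres, as this community understands it."""
    _unit_name, metres_per_unit = COMMUNITY_UNITS[community]
    return reading.value * metres_per_unit


def in_range(reading: Reading, community: str, max_metres: float) -> bool:
    """A decision that depends on the interpretation, not on the raw value."""
    return interpret(reading, community) <= max_metres


reading = Reading("distance_to_target", 100.0)
for community in COMMUNITY_UNITS:
    print(community, interpret(reading, community),
          in_range(reading, community, max_metres=150_000.0))
```

The `Reading` never changes; only the vocabulary applied to it does. Keeping the two apart is what lets each community retain its own semantics while still exchanging the same data.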
Let's look at the Semantic Web again. To build a real Semantic Web, its data must be existing, accessible, visible, and understandable. Existing means that data values and/or data descriptions have been created. Accessible means that the created data representation can be delivered to the correct destination in response to legitimate requests. Visible means that the data representation can be seen and identified by humans. Understandable means that the data representation can be correctly and unambiguously identified by machines.
One issue that current Semantic Web research overlooks is the relationship between visible and understandable. Many current Semantic Web practices assume that what machines understand must be what everybody expects; or, conversely, they assume that a small group's view of a domain can serve as the standard description of that domain for machines to act on. As the DoD experience has demonstrated, these assumptions are impractical in real-world ultra-scale applications, even in a military setting.
Semantic Web researchers must start to learn from Web 2.0 practices. In general, agreement on a public domain cannot be enforced. The enforcement model might be workable for a small domain with a limited number of participants, but it can never scale to something as large as the entire Web. This difficulty of scalability lies not only in the theory of conceptual modeling, which is no doubt a hard problem, but also in the intrinsic human expectation of freedom, which makes the difficult problem essentially unsolvable.
[1] Scott Renner, Net-Centric Information Management, 8th Int. C2 Research and Technology Symposium, McLean VA, 2005.