Monday, December 25, 2006

YOU: Time Magazine's Person of the Year, and LONG TAIL

The growing influence of Web 2.0 has carried the spirit of the Long Tail into every aspect of human life. Web 2.0 itself is probably nothing but a demonstration of the Long Tail: an aggregation of nobodies eventually becomes somebody. With this philosophy, Time Magazine's choice for Person of the Year 2006 becomes quite understandable. Although YOU are (most likely) a nobody by yourself, YOU can be part of a somebody through the effect of Web 2.0. Therefore, every one of YOU is the Person of the Year 2006, because YOU are somebody!

Web 2.0 is changing the world, and the Long Tail demonstrates its power. Never in history have we been so close to the real power of collectivism.

The Scottish philosopher Thomas Carlyle once said that history was about the "Great Man." But every "Great Man" came out of dozens or hundreds of lesser great men, who in turn came out of thousands or millions of nobodies: the long tail. We often overlook the impact of this long tail because the contribution of each of its members is so tiny compared to the power wielded by a "Great Man."

Fortunately, the rise of Web 2.0 is changing history. Web 2.0 gives everybody a chance to become a "Great Man" by engaging the Long Tail effect together with many others. The great diversity of the Web brings the world tremendous opportunities. Through the Long Tail, Web 2.0 realizes a genuine fantasy: "I work for everybody, and everybody works for me." Or, restated in a more fashionable style: I am in your long tail, and you are in mine.

So, the person of the year 2006 might better be YOU with a Long Tail!


Friday, November 24, 2006

Ultra-scale Information Management: no place for rigid standards

This year's ER conference (International Conference on Conceptual Modeling) was held in Tucson, Arizona, from Nov. 6 to Nov. 9. This year's ER was special: it was the 25th anniversary, so the organizers prepared a few special events. To me, however, the best part of ER 2006 was a great talk given by Scott Renner of the MITRE Corporation.

Scott's talk was titled "Community Semantics For Ultra-Scale Information Management." Here is the abstract:

The U.S. Department of Defense (DoD) presents an instance of an ultra-scale information management problem: thousands of information systems, millions of users, billions of dollars for procurement and operations. Military organizations are often viewed as the ultimate in rigid hierarchical control. In fact, authority over users and developers is widely distributed, and centralized control is quite difficult – or even impossible, as many of the DoD core functions involve an extended enterprise that includes completely independent entities, such as allied military forces, for-profit corporations, and non-governmental organizations. For this reason, information management within the DoD must take place in an environment of limited autonomy, one in which influence and negotiation are as necessary as top-down direction and control.

This presentation examines the DoD's information management problems in the context of its transformation to network-centric warfare (NCW). The key tenet of NCW holds that "seamless" information sharing leads to increased combat power. We examine several implications of the net-centric transformation and show how each depends upon shared semantic understanding within communities of interest. Opportunities for research and for commercial tool development in the area of conceptual modeling will be apparent as we go along.

The promise of successful information sharing is "when the right information is provided to the right people at the right time and place so that they can make the right decisions" [1]. To deliver these five rights (right information, right people, right time, right place, and right decision), DoD practice has shown that the top-down standardization approach can be a partial success but an overall failure for ultra-scale information management systems. Once an information system grows large enough, a single vocabulary simply no longer works, even in an environment as rigidly organized as an army.

This conclusion leads straightforwardly to a prediction. At the scale of the World Wide Web, where (1) the number of Web users is far greater than the number of soldiers, (2) the entire domain is far more complicated and vastly larger than the military domain, and (3) there is no enforcing power over the Web, there is absolutely no chance of success if we try to design a single rigid standard for global information sharing, such as the Semantic Web.

Information is more than data. Scott made an interesting point: information is data plus how that data is understood. While data itself is objective, the understanding of its meaning varies from person to person. Varied understandings may thus lead to completely different decisions based on the very same data. Hence it is necessary to separate the two types of information representation: what the data is differs from what the data is for.

Let's look at the Semantic Web again. To build a real Semantic Web, data on the Semantic Web must be existing, accessible, visible, and understandable. Existing means the data values and/or data descriptions must have been created. Accessible means the created data representation must be deliverable to its correct destination upon a legitimate request. Visible means the data representation can be viewed and identified by humans. Understandable means the data representation can be correctly and unambiguously interpreted by machines.

One issue overlooked by current Semantic Web research is the relationship between visible and understandable. Many current Semantic Web practices assume that what machines understand must be what everybody expects; or, the other way around, they assume that a small group's vision of a domain can serve as the standard description of that domain for machines to act upon. As the DoD experience has demonstrated, these assumptions are impractical in real-world ultra-scale applications, even in a military scenario.

Semantic Web researchers must start to learn from Web 2.0 practice. A public domain agreement cannot be enforced in general. The enforcement model may be workable for a small domain with a limited number of participants, but it can never scale to something as large as the entire Web. This difficulty of scaling lies not only in the theory of conceptual modeling, which is no doubt a hard problem, but also in the intrinsic human expectation of freedom, which makes the difficult problem essentially unsolvable.

[1] Scott Renner, Net-Centric Information Management, 8th Int. C2 Research and Technology Symposium, McLean VA, 2005.


Wednesday, November 01, 2006

Next Generation Web

What is the next generation of the World Wide Web? Is it the Semantic Web, or is it something else?

This is my view: the next-generation Web will be the coupling of Web 2.0 and the Semantic Web. The thought comes from observing how humans grow up. As a metaphor, we may compare the current World Wide Web (Web 1.0) to a baby. Indeed, Web 1.0 is a baby, because Web 1.0 pages only display information, without direct communication with readers and without sufficient machine-understanding. Similarly, babies care only about their own interests, without thinking of others and without considering whether their message can be understood by others.

When human babies grow up, they start to develop simultaneously in two ways. First, they learn to communicate with other people, especially other children. They begin to talk to each other and improve their own knowledge through the collective intelligence of groups. This is exactly the philosophy of Web 2.0.

On the other hand, human babies begin to learn from books. After they go to school, they start learning "standard" and "formal" specifications of world facts, which are then understandable by the general public. This is the philosophy of the Semantic Web.

As human children grow up, the two processes interact with each other. A proper balance between social activities and textbook learning is a key to a child's development.

This child-growing scenario is a model for the evolution of the World Wide Web. Nowadays, any overemphasis on either Web 2.0 or the Semantic Web is unhealthy for Web evolution. We must properly combine both aspects to achieve a well-functioning next-generation Web.


Sunday, October 15, 2006

Paper Review: Creating a Science of the Web

Science Magazine, 11 August 2006: Vol. 313. no. 5788, pp. 769 - 771
"Creating a Science of the Web," by Tim Berners-Lee, Wendy Hall, James Hendler, Nigel Shadbolt, and Daniel J. Weitzner

Understanding and fostering the growth of the World Wide Web, both in engineering and societal terms, will require the development of a new interdisciplinary field.
This is a remarkable observation. Web research is starting to move beyond the traditional scope of Computer Science, which, as Berners-Lee and his colleagues put it, "is concerned with the construction of new languages and algorithms in order to produce novel desired computer behaviors." Behaviors on the Web are not only computer behaviors but also human behaviors. On the Web, we are not using computers to simulate human behaviors; instead, we expect computers to behave cooperatively with humans. This is the portion of Web Science that lies beyond Computer Science.

Compared with physics and biology, Web Science analyzes Web behaviors and tries to "find microscopic laws that, extrapolated to the macroscopic realm, would generate the behavior observed." This perspective again differs from traditional Computer Science. There are no natural laws to discover in Computer Science research; we may adopt (or adapt) natural rules for Computer Science to follow or simulate, but we do not discover them. In Web Science research, however, there is a semi-natural existence: the World Wide Web itself. Although the Web is an artificial creation, it has grown into a nearly natural existence, because no single human (perhaps not even humanity as a whole) can shut it down. Therefore, Web Science is indeed unique: a hybrid branch of natural science and social science.


Saturday, October 14, 2006

Kelly's Theory of Personality

George Kelly's theory of personality could offer an alternative view for constructing the Semantic Web.

Out of these insights, Kelly developed his theory and philosophy. The theory we'll get to in a while. The philosophy he called constructive alternativism. Constructive alternativism is the idea that, while there is only one true reality, reality is always experienced from one or another perspective, or alternative construction. I have a construction, you have one, a person on the other side of the planet has one, someone living long ago had one, a primitive person has one, a modern scientist has one, every child has one, even someone who is seriously mentally ill has one.
This is exactly what the Semantic Web should be. We may view semantics on the Semantic Web from two distinct aspects: the community view and the individual view. For any specific domain, the community view is the description of the domain agreed upon by everyone in the community. By contrast, an individual view is a particular person's own description of the domain. Essentially, when people publish knowledge, they yield to the community view, anticipating broader public recognition. When people search for particular information, however, they yield to their own individual views, anticipating higher precision in the search results.

The relation between the community view and an individual view is not that of a superclass concept to a subclass concept. Ideally, the community view is the collective set of all individual views in the community. However, a collective set is not simply the union of all its small component pieces. In every individual view, people have special interests that may not interest other people in the same community. Therefore, the community view is a compromised agreement rather than a representative view of everybody in the community.

On the other hand, an individual view always relates to some community view. But any individual view may at the same time contain its own specifications that do not belong to, or even contradict, the adopted community view.

With this model, we cannot directly construct a community view. A community view is not directly constructable by any person or group; we may only approach it by gathering enough individual views. Thus, individual views must exist before the community view. This is in fact a basic assumption of Web 2.0, and now we must extend it to the Semantic Web.
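The idea that a community view can only be approached by aggregating individual views can be sketched in code. The data and function names below are hypothetical illustrations; agreement is modeled crudely as a vote that keeps only the (term, meaning) pairs enough individuals share, which also shows why the result is a compromise rather than a union of everyone's interests:

```python
from collections import Counter

# Hypothetical individual views: each maps a term to the meaning
# that one person assigns to it within the domain.
individual_views = [
    {"jaguar": "animal", "speed": "trait"},
    {"jaguar": "animal", "habitat": "trait"},
    {"jaguar": "car", "speed": "trait"},
]

def community_view(views, quorum):
    """Approximate a community view: keep only the (term, meaning)
    pairs that at least `quorum` individual views agree on."""
    votes = Counter()
    for view in views:
        for term, meaning in view.items():
            votes[(term, meaning)] += 1
    return {term: meaning for (term, meaning), n in votes.items() if n >= quorum}

# Minority readings ("jaguar" as a car) and purely personal interests
# ("habitat") fall out of the compromise; shared ones survive.
print(community_view(individual_views, quorum=2))
```

Raising the quorum shrinks the community view toward nothing, mirroring the observation above that the community view is a compromised agreement, not a representative view of everybody.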


Tuesday, October 03, 2006

Role of URI for Machine Understanding (Brainstorming with Tim Berners-Lee, issue 1)

(revised August 1st, 2008)

Well, where should I start? Beginning with a brainstorm over Tim's blog might be a good idea. Without his invention of the World Wide Web, this blog communication could not have happened.

In his blog, Tim first presented his opinions about URIs. In my understanding, a fundamental issue of machine-understanding is associating every piece of Web data with a URI. Two identical URIs simply mean two identical real-world objects. This philosophy is the cornerstone of current machine-understanding.

Human understanding begins from a similar fundamental agreement. When a foreigner tries to communicate with a native, they talk while pointing their fingers at the same items. Speaking in different terms, they gradually come to understand each other. These fingers are to humans what URIs are to machines.

Unless explicitly specified otherwise, different URIs by default mean different things (like two fingers pointing to different places). This rule is probably the most fundamental one in "machine-understanding." Without it, the generic problem of identifying Web objects would become very complicated.
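This default can be sketched in code. The small union-find structure below is an illustrative sketch (not any standard library): every URI denotes a distinct object until an explicit equivalence, analogous to OWL's owl:sameAs, is asserted between two of them:

```python
class UriIdentity:
    """URIs are distinct identifiers by default; only an explicit
    equivalence assertion (cf. owl:sameAs) merges two of them."""

    def __init__(self):
        self.parent = {}

    def _find(self, uri):
        # Each unseen URI starts as its own representative.
        self.parent.setdefault(uri, uri)
        while self.parent[uri] != uri:
            uri = self.parent[uri]
        return uri

    def same_as(self, a, b):
        # Explicitly assert that a and b name the same real-world object.
        self.parent[self._find(a)] = self._find(b)

    def same_object(self, a, b):
        return self._find(a) == self._find(b)

ids = UriIdentity()
a = "http://example.org/people#TimBL"      # hypothetical URIs
b = "http://example.net/staff/berners-lee"
print(ids.same_object(a, b))  # False: different URIs, different objects by default
ids.same_as(a, b)
print(ids.same_object(a, b))  # True: the equivalence was explicitly stated
```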

Everyone deserves a URI! This is a brilliant point. One valuable but challenging demand in current Web development is human identification. When we type a friend's name into a current search engine such as Google, we often get many results for different people who share the same name. If every Web user had a unique URI serving as his unique Web ID, it would be much easier for search engines to filter the results.

The question is, however, where a personal ID URI should point. It might point to a homepage, a picture, a short personal description, a string of numbers such as a social security number, or one of many other options. Any of these could work, but each has its limitations. For example, a string of numbers is easy to store and convenient for machine processing, but it is also easy to steal and forge. A biography, on the other hand, is semantically rich, harder to forge, and easier to check for integrity; but authoring a biography for every person is far more time-consuming, and it is unclear who would be authorized to maintain these biographies.

Tim suggested using FOAF RDF documents as unique personal identifications. FOAF defines well-designed, easy-to-process attributes for individual persons. A problem, however, is that its RDF content is designed for sharing friend relationships rather than identifying individuals. Is it really suitable for individual identification? This is an interesting question worth exploring in the future.
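As a concrete illustration, here is a minimal sketch (with a hypothetical person and helper names of my own) of the kind of FOAF description under discussion. One real FOAF detail it relies on: foaf:mbox_sha1sum is the SHA-1 digest of the full "mailto:" URI, which lets a document key a person to a mailbox without exposing the address itself:

```python
import hashlib

def mbox_sha1sum(email):
    """FOAF's foaf:mbox_sha1sum: the SHA-1 hex digest of the full
    "mailto:" URI, a privacy-preserving key for a person."""
    return hashlib.sha1(("mailto:" + email).encode("ascii")).hexdigest()

def foaf_card(name, email, homepage):
    """Compose a minimal FOAF description in Turtle syntax (helper
    name and data are illustrative, not part of FOAF itself)."""
    return "\n".join([
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .",
        "[] a foaf:Person ;",
        '   foaf:name "%s" ;' % name,
        '   foaf:mbox_sha1sum "%s" ;' % mbox_sha1sum(email),
        "   foaf:homepage <%s> ." % homepage,
    ])

print(foaf_card("Alice Example", "alice@example.org", "http://example.org/alice"))
```

Note that every attribute here describes the person herself; nothing in the document asserts who her friends are, which is precisely the tension between FOAF's friend-sharing purpose and the identification role proposed for it.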


Sunday, October 01, 2006

This is my first blog post!

Finally, today, October 1st, 2006, I have my first blog post. I hope this will be a place where my friends and I discuss fun research topics. It is always interesting to explore new ideas, chat with new friends, and brainstorm exciting stuff. Let's start our dreams, and let's have fun!