Sunday, August 10, 2008

A quick thought on the Semantic Web

As I continue reading Seth Lloyd's fabulous book Programming the Universe, a thought has taken hold of me and I cannot get rid of it: is the W3C interpretation of the Semantic Web a realizable plan according to the second law of thermodynamics?

The central point of the W3C interpretation of the Semantic Web is to upgrade the present Web of linked documents into a future Web of linked data. Viewed in terms of useful information (which corresponds to low entropy under the second law of thermodynamics), this means the entropy of the system (the entire Web) must inevitably decrease.

Certainly, the World Wide Web is an open system rather than a closed one, so such a systematic decrease of entropy does not directly violate the second law of thermodynamics. However, to truly decrease the entropy of an open system and keep it low, useful work must be continuously added from outside the system. For the World Wide Web, the human mind is the only available source of this external useful work, and we must add the assumption that humans are not part of the World Wide Web (will this assumption hold all the way into the future?).
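To make the bookkeeping explicit, the entropy balance behind the second law can be written as follows (the labels "Web" and "outside" are my own shorthand for this sketch, not notation from Seth's book):

\Delta S_{total} = \Delta S_{Web} + \Delta S_{outside} \ge 0

The Web's entropy can fall (\Delta S_{Web} < 0) only if the outside, here human minds and the machinery supporting them, generates at least a compensating amount of entropy, \Delta S_{outside} \ge -\Delta S_{Web}. The rest of this argument is about how large, and how costly, that compensation must be.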

Even with all these assumptions granted, the W3C interpretation of the Semantic Web still has a few intrinsic difficulties. In his book, Seth points out that the cost of decreasing a system's entropy is significantly high; and the W3C interpretation asks not only to maintain a low-entropy environment for a long time (potentially forever), but also to push the system's total entropy continuously lower and lower. It must therefore demand an unbelievably large amount of human mental effort as the external useful work. In other words, this requirement fundamentally contradicts the optimistic declaration, popular among Semantic Web researchers, that we may figure out a few low-cost, fabulous Semantic Web killer applications and the dream of the Semantic Web will suddenly come true.

The second law of thermodynamics tells us that this type of low-cost, fabulous Semantic Web killer application simply may not exist; otherwise it would be a typical perpetual motion machine of the second kind, which the second law says is theoretically impossible to build. (Anyone who believes it to be possible will eventually discover that the required external input quickly grows beyond the original expectation, just as Powerset has experienced.)

Does this observation sentence the Semantic Web to death? I still don't think so. Making Web content more machine-executable is not impossible. However, we may need new thoughts on how semantics might be cost-efficiently added to the Web. In other words, whatever tools we build should not violate the second law of thermodynamics, or the business will not be sustainable (as with Powerset).

More and more, I lean toward the vision of a human-directed Semantic Web, in contrast to the traditional vision of a machine-enhanced Semantic Web. In this new vision, we abandon the assumption that humans are excluded from the Web. Instead, we treat humans and the Web together as a comparatively closed system, rather than the Web alone as an open system. In this comparatively closed system (not a truly closed one, since it still excludes supporting equipment such as power plants), we accept a controlled, systematic increase of entropy in exchange for a few local, conditional decreases of entropy. It would be a much less perfect Semantic Web than the W3C interpretation, but it would be far more practical and realizable, and it would bring concrete benefits to human users. I look forward to Imindi becoming the first real-world example of this vision.

5 comments:

Zemantic dreams said...

Extremely interesting take on the topic. I think your conclusion is absolutely correct.

Now let's make that part real! :)

Andraz Tori, Zemanta

Kingsley Idehen said...

The issue is the Human Interaction with the Linked Data Web. The Document Web pretty much shows the way. We just need to remember the lessons of the Document Web as we bootstrap the Linked Data driven Semantic Web.

The key to all of this is a continuous stream of URIs from human activity, without the human users changing behaviour. This particular challenge (aka activation threshold) has dogged metadata for eons, but I am extremely confident we are about to knock this challenge on its head.

That's all I can say for now. I prefer to show rather than talk or postulate :-)

Kingsley

Yihong Ding said...

@Andraz, thank you, and we do need to try to make these things real.

@Kingsley. In fact, I am now cooperating with a Semantic Web startup company on bootstrapping the Semantic Web. In our experience, however, we have run into a few problems. They are not academic issues, but business issues.

As you may know, violating the second law of thermodynamics is different from violating the first law, when we look at them from the business perspective.

For Web companies, violating the first law of thermodynamics would mean creating a product whose total output information is greater than the total input information. Such a tool cannot exist, and any attempt at this goal will fail immediately.

Violating the second law of thermodynamics (especially via a perpetual motion machine of the second kind) is trickier and harder to recognize. In fact, even now (according to Seth) many proposals for this kind of machine are still put forward every year, and many of them come from very brilliant people.

The key to this type of violation is that the idea seems to work at small scale, because the investors have enough resources (such as money) to supply the external useful work that sustains the system. However, the need for this external support grows far faster than the designers and investors expect. Hence at large scale, the system faces a serious problem in keeping itself running at all. This is the essence of a second-law violation.
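A toy model may make this concrete (the exponent below is purely illustrative, not something measured): suppose the value a system delivers grows roughly linearly with the number of curated resources n, while the external work needed to keep those n interlinked resources consistent grows like n^\alpha for some \alpha > 1. Then

\frac{\text{maintenance cost}}{\text{value}} \sim \frac{n^{\alpha}}{n} = n^{\alpha - 1} \to \infty \quad (n \to \infty)

so a scheme that looks affordable in a small pilot inevitably becomes unaffordable at Web scale.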

The law tells us that, in the general case, such a violation is forbidden: the invention cannot be sustained in a generic situation even if we supply the energy of the entire world to support it.

As you know, I would rather not speak negatively of the Semantic Web, because I am one of the researchers I mention. But as a serious scientist, I have the responsibility to say what I truly believe, even if it hurts me.

After reading Seth's book, I have started to wonder whether this is the fundamental reason that the Semantic Web's progress into industry has been so slow despite the tremendous push from the W3C for so many years. In comparison, Web 2.0, though many people still consider it simple and without much valuable content of its own, has become a great worldwide business success without a pushing force such as the W3C behind it.

How to balance local decreases of entropy against the inevitable global increase of entropy is a serious issue for Semantic Web researchers to consider. This is not something merely to talk about or postulate; it is real, and it has been demonstrated repeatedly in the physical sciences and during the first and second industrial revolutions.

Yihong

denny said...

I disagree in one aspect: I think it is possible to invest the necessary amount of human power into the system and still keep it going. I can't nail it down exactly -- I haven't read "Programming the Universe" yet, so I can't really discuss it, but the feeling goes along the following lines: the value of a network increases superlinearly, perhaps even quadratically (Metcalfe's Law), whereas the amount of information increases sublinearly (due to redundancies in human knowledge). Or, to put it another way: add more people and Wikipedia or Linux gets better, because they have a constrained scope. The more you constrain the scope, the more value is added by each additional person.
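In rough symbols (these growth rates are my guesses, not measurements):

V(n) \propto \frac{n(n-1)}{2} \sim n^2 \quad \text{(Metcalfe's Law)}, \qquad I(n) \propto n^{\beta}, \ \beta < 1 \quad \text{(redundant human knowledge)}

so the value gained per unit of genuinely new information, V(n)/I(n) \sim n^{2-\beta}, keeps growing as more people join.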

This is an oversimplification.

Yihong Ding said...

denny,

First, thank you for the comment. I sincerely welcome any discussion of this issue. By posting this article I did not claim that all my points are 100% correct. As the theme of this blog says, Thinking Space is about thinking, not teaching.

One thing I think you have probably misread in the article: the argument is not about how fast the amount of online information increases compared to the speed at which the network grows. The issue is how much cost we must pay to upgrade the "usability" of online information, especially while the total amount keeps increasing.

The second law of thermodynamics tells us that it is impossible to increase the total usability of information (i.e., to decrease entropy) in a closed system. In an open system, however, it is possible to decrease entropy, but doing so requires a tremendous amount of useful work pushed into the system from outside. The question, then, is how costly that external useful work is.

As you said, it is possible to invest enough human power to keep things going. But unless we have actually figured out efficient ways to channel human minds into the process of upgrading linked documents to linked data, and of maintaining the upgraded state, it is nearly unbelievable that we might invent some low-cost, fabulous Semantic Web killer application that suddenly switches on the dreamed-of Semantic Web.

Yihong