Tuesday, July 31, 2007

What does tagging contribute to the web evolution? | An introduction of web thread

(Revised October 15, 2007)

Tagging has become popular with the prevalence of Web 2.0. A tag is a keyword or term bound to a piece of information. By default, a tag is assumed to be a correct partial explanation of the bound object. This explanation need not be complete, nor need it describe the key characteristics of the bound object. The purpose of tagging is primarily to associate web content with human conventions.

From the viewpoint of web evolution, however, there is another explanation of Web-2.0 tagging. In this view, tagging is a process of drawing threads across the Web. Tagging activity converts the traditional node-driven web into a new thread-driven web; this is a prediction based on web evolution.

Node-driven Web versus Thread-driven Web

A node-driven web is a web whose structure is primarily driven by how web users link their authored web nodes to each other. Albert-Laszlo Barabasi has made great contributions to this field by studying the so-called scale-free network, which is typically node driven. Whenever a new node is added to a node-driven web, its owner decides how to connect this new node to existing ones. These decisions, however, are often not random. Instead, people tend to link a new node to the most popular nodes already existing in the web. The chance that a node receives a new link grows with its current popularity, so less popular nodes are correspondingly less likely to attract new links. This theory is fundamental to the growth of node-driven webs.
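To make this growth rule concrete, here is a minimal sketch of preferential attachment. It assumes the rule used in the Barabasi-Albert scale-free model, where a new node links to an existing node with probability proportional to that node's current link count. The function name `grow_node_driven_web`, the two-node starting web, and the single link per new node are illustrative assumptions.

```python
import random


def grow_node_driven_web(total_nodes, seed=0):
    """Grow a web node by node; each new node adds one link to an existing
    node chosen with probability proportional to its current link count."""
    rng = random.Random(seed)
    # Assumed starting point: two nodes joined by a single link.
    degree = {0: 1, 1: 1}
    # Every link contributes both of its endpoints to this list, so a uniform
    # draw from it selects a node in proportion to its link count.
    endpoints = [0, 1]
    for new_node in range(2, total_nodes):
        target = rng.choice(endpoints)
        degree[new_node] = 1
        degree[target] += 1
        endpoints.extend([new_node, target])
    return degree


degree = grow_node_driven_web(1000)
hubs = sorted(degree.values(), reverse=True)[:5]
print("link counts of the five largest hubs:", hubs)
```

Running the sketch with different seeds should typically show a handful of nodes collecting far more links than the rest, which is the hub pattern described above.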


Web thread is a new term. In my mind, a web thread is a connection that links multiple web nodes to a fixed inbound target. Unlike a standard web link, which connects exactly one outbound node to one inbound node, a web thread may simultaneously connect more than two web nodes omnidirectionally. For example, a Web-2.0 tag is a web thread. Rather than linking from one node to another, a Web-2.0 tag simultaneously connects an arbitrary number of web nodes that share the same tag. This is why we call it a "thread" and not a "link" on the Web.

A thread-driven web is a web whose structure is primarily driven by the interconnections of web threads. When a new node is added to a thread-driven web, the web itself automatically decides where to locate this new node by assigning proper threads to it based on the node's content. Although we do not prohibit user-specified links or threads, the popularity of a web node is determined by the richness of its content (i.e. how many threads are woven through this node) instead of the number of human-specified links connected to this node.
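The difference between a link and a thread can be sketched as a data structure. The snippet below is only an illustration under assumed names (`build_threads`, the sample page names, and the tags are all hypothetical): a link is a pair of nodes, while a thread maps one tag to every node that carries it.

```python
from collections import defaultdict


def build_threads(tagged_nodes):
    """Invert a node -> tags mapping into a tag -> set-of-nodes mapping."""
    threads = defaultdict(set)
    for node, tags in tagged_nodes.items():
        for tag in tags:
            threads[tag].add(node)
    return dict(threads)


# Hypothetical pages and the tags their authors attached to them.
tagged_nodes = {
    "page-a": ["semantic-web", "tagging"],
    "page-b": ["tagging"],
    "page-c": ["tagging", "web-2.0"],
}

threads = build_threads(tagged_nodes)

# A standard link joins exactly two nodes ...
link = ("page-a", "page-b")
# ... while the "tagging" thread runs through all three pages at once.
print(sorted(threads["tagging"]))  # ['page-a', 'page-b', 'page-c']
```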

The popularity of web nodes in a node-driven web is primarily decided by the votes of humans. The most popular hubs in a node-driven web may not necessarily contain rich information themselves. For example, the homepage of Google is just a simple interface without much information. But humans subjectively decide that these hubs are more valuable than many other web nodes.

In contrast, the popularity of web nodes in a thread-driven web is primarily decided by the richness of content in these nodes. The most popular hubs in a thread-driven web may not necessarily be favorite sites for most human users. But these nodes contain the richest information on the Web for machines to process.

With this comparison, we can see that a node-driven web is human-oriented while a thread-driven web is machine-oriented. If the future is the Semantic Web, the Web is certainly evolving from a node-driven web to a thread-driven web. To machines, popular human favorites such as the main page of Google are less valuable because they do not really provide much useful information. Instead, machines look for the nodes woven through by the largest number of threads that concentrate on the searched topic. This shift in how we view the World Wide Web may eventually change the methods of web search.
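The two notions of popularity can be compared side by side. This is a rough sketch with made-up data: `inbound_links` and `node_tags` are hypothetical inputs, and counting distinct tags is only one assumed way to measure how many threads pass through a node.

```python
def link_popularity(inbound_links):
    """Node-driven view: popularity is the number of inbound links."""
    return {node: len(sources) for node, sources in inbound_links.items()}


def thread_popularity(node_tags):
    """Thread-driven view: popularity is the number of distinct threads
    (tags) that pass through the node."""
    return {node: len(set(tags)) for node, tags in node_tags.items()}


# Hypothetical data: a sparse but heavily linked search page, and a
# content-rich article that few people link to.
inbound_links = {
    "search-home": ["p1", "p2", "p3", "p4"],
    "survey-article": ["p1"],
}
node_tags = {
    "search-home": ["search"],
    "survey-article": ["semantic-web", "tagging", "web-2.0", "ontology"],
}

by_links = link_popularity(inbound_links)
by_threads = thread_popularity(node_tags)

print(max(by_links, key=by_links.get))      # search-home
print(max(by_threads, key=by_threads.get))  # survey-article
```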

Some Potential Impacts to the Future

This re-interpretation of tagging may bring us several positive impacts on developing next-generation web technologies.

(1) This re-interpretation brings us a new picture of the World Wide Web. The World Wide Web has been gradually turning from a random network into an increasingly well-organized network. Before Web 2.0, the World Wide Web was widely modeled as a scale-free network, a model formally introduced by Albert-Laszlo Barabasi. This new interpretation, however, states that underneath the scale-free architecture, the World Wide Web actually maintains a fairly organized latent structure whose backbones are web threads. A scale-free network is dominated by a few highly connected hubs. A well-organized thread-driven network, however, is dominated by highly adopted threads. This crucial difference between the two network architectures is a key to developing next-generation web technologies, especially next-generation web search techniques.

(2) This re-interpretation brings us a new understanding of what web tags are. Web tags are not standard web links. When we produce a new web tag, we are producing a new thread of the World Wide Web. The popularity of these threads, however, is determined by their acceptance among web users. On the other hand, the popularity of web threads may still follow the Yule-Simon distribution, a power-law relationship like the one the scale-free model follows. These thoughts might be helpful for the further study of web tags and threads.

(3) This re-interpretation suggests a new way of building a semantic web. Instead of creating semantic-web nodes (as we create normal web nodes), building a semantic web means creating web threads and throwing these threads across a network. By weaving these threads, we acquire semantic-web nodes at their intersections.

(4) This re-interpretation brings us a new vision for personalizing the World Wide Web. Based on this view of a thread-driven web, a personalized web becomes a web woven from personalized threads. By delicately mapping personalized threads to widely adopted threads, we can produce a personalized web for each individual. In return, these personalized webs become latent chaos patterns of the entire World Wide Web.

Wednesday, July 25, 2007

Weaving the Thread-Driven Semantic Web

The World Wide Web was designed to be a node-driven woven web. The current web is a set of nodes connected by manually assigned links. In contrast, the ideal Semantic Web (if ever realized) must be a thread-driven woven web. In such a Semantic Web, the machine-processable semantics are the objective threads that connect all the nodes, and the importance of individual nodes becomes less and less. Threads, on the contrary, become fundamental.

I posted a new article at Semantic Focus in which I presented a new vision of Thread-Driven Semantic Web.

Wednesday, July 18, 2007

Two Postulates, A View of Web Evolution, series No. 3

(Revised at May. 24, 2008)
(Revised at Sep. 28, 2007)

The reason behind the evolution of the World Wide Web is twofold. Fold one is a natural and general law of evolution. Fold two concerns the relation between humans and the Web; hence it is artificial and specific. The two folds are closely related to each other.

On the natural and general fold, the World Wide Web is an objective existence. When the Web evolves, it has to follow the natural laws of evolution. On the artificial and specific fold, the World Wide Web is a man-made project. We cannot neglect the impact of humans on the progress of Web evolution. The entire progress of Web evolution is thus a process of mutual interaction between the two folds.

Natural and General Fold of Web Evolution

Postulate 1: Web evolution is a directional, stagewise process.

By definition, evolution is a process in which something passes by degrees to a different stage (especially a more advanced or mature stage). In philosophy, such an evolutionary progress obeys a general law of dialectics---the Law of Transformation of Quantity into Quality. If the advance of World Wide Web is also an evolutionary process, Web evolution must also follow this general law of evolution.

The Law of Transformation of Quantity into Quality states that abrupt qualitative changes are always caused by gradual quantitative alterations. For example, if a book has 100 pages as opposed to 50 pages, that is a quantitative change. However, if we reduce its size to only one page, it is no longer a book. This is, thus, a qualitative change. As another example, a piece of ice remains ice while its temperature rises but stays below zero degrees centigrade. This is a quantitative change. When this quantitative change continues until the temperature passes zero degrees centigrade, the ice thaws and becomes water. We then say that the accumulation of this continuous quantitative change eventually causes a qualitative change, so that ice becomes water.

Web evolution must also obey this law. Rather than being a continuously gradual process, the Web must evolve through sudden leaps and catastrophes, which are the transitional periods between consecutive stages of Web evolution. Every stage of Web evolution can be described by a unique quality. The transition from a lower stage to a higher stage is a qualitative upgrade. Within a single stage, we have the quantitative accumulation of Web evolution. Given this recognition, a key issue in the study of Web evolution is to look for the evolutionary elements that can be measured by quantity and quality.

Artificial and Specific Fold of Web Evolution

Postulate 2: the growth of World Wide Web is a nonstop process that embodies human minds, step by step, in a virtual world.

By nature humans are afraid of death and desire eternal life. Religions are prevalent in human society because of this desire for life after death. Besides religion, however, humans have also invented other approaches to alternative kinds of immortality. One particular way is to author a great work (such as a book, a work of art, or a great building). Through a great work, not only do the names of the authors become immortal, but so do their thoughts, since the work represents the highlights of the authors' minds. This last approach to immortality is a popular one that satisfies almost everybody because it is probably the most "doable" way for humans, compared with all kinds of religious approaches.

There is another reason behind the popularity of authoring. It is a good way to preserve human knowledge through history. By preserving knowledge, we humans can improve ourselves by learning from the experiences and lessons of our ancestors. Hence, by encouraging authoring, human society may achieve a collectively maximized evolutionary advantage.

Although authoring is greatly attractive to both individuals and society, traditionally it has had one main drawback. Only a few really great works can be properly preserved over the long history of mankind, let alone the risk of all kinds of man-made or natural disasters such as wars or earthquakes. As a result, most people in history were not able to leave anything memorable in the world. Hence both the individual goal of immortality and the collective goal of preserving human knowledge have gone unfulfilled. This explains why humans are eager to find a cheap approach that can preserve the human mind for everybody and keep it forever, regardless of man-made or natural disasters.

The invention of the World Wide Web finally solves this long-standing issue. On the World Wide Web, everybody, rich or poor, wise or stupid, has the same privilege to author their mind. The Web will keep every authored work equally and forever, regardless of the quality of each work. This promise is a crucial reason behind the flourishing of the World Wide Web because it automatically satisfies both individuals and society as a whole.

One more phenomenon shows how much individual Web users like to author on the Web. It has been reported that many bloggers keep blogging about their daily lives every day with very few readers or none at all. Analysts have certainly suggested various explanations for this phenomenon. But many of these analysts miss one thing. These people are simply blogging their minds to demonstrate that they once existed in this world. This desire to be known goes beyond money and entertainment. And this is why the growth of the Web is primarily a process of nonstop materialization of the human mind into a virtual world.

Moreover, this nonstop materialization is a stepwise process. Due to technological limitations, we are not able to effectively materialize all types of mental activity onto the Web. Some types, such as conditioned reflexes, are feasible to implement with present technology. Others, such as emotion, are not. Hence the implementation of the human mind on the Web is a long-term progressive process. From the first postulate we learn that continuous quantitative accumulation must eventually cause qualitative change. We can then conclude that the progressive implementation process must also be stepwise.

Final Address

The evolution of the World Wide Web is driven primarily by web-resource producers rather than consumers. This conclusion is drawn directly from the two postulates. The World Wide Web would not have evolved so rapidly if it were a consumer-driven web, because only the few profitable Web nodes could survive the competition in a consumer-driven web. By contrast, the World Wide Web is a producer-driven web because the majority of web-resource producers contribute to the Web for purposes beyond money.

These two postulates are the foundation of the study of Web evolution. Although they seem to be highly abstract and philosophical, from them we will derive seven practical corollaries. The two postulates and seven corollaries will compose an entire theory of Web evolution.

Readers who are interested in this topic may also read the web evolution article (Part 2). The referenced link is an older version of this series (a little out of date), but it may contain more details of my thoughts about Web evolution.

The next: Web Evolution and Human Growth

Monday, July 16, 2007

Three Evolutionary Elements, A View of Web Evolution, series No. 2

(revised at May. 24, 2008)
(revised at Sep. 27, 2007)

When discussing Web evolution, the first thing we want to know is which elements are evolving. If the Web evolves, there must be some intrinsic elements of the Web that upgrade with time. Recognizing these elemental evolutionary issues is fundamental to the study of Web evolution.

Static content, dynamic behavior, and interconnective links are the three basic evolutionary elements. The progress of Web evolution can be determined by how the values of these three elements change.

In its physical structure, the WWW is a network of many atomic nodes. Different people may think of atomic Web nodes differently. For instance, an atomic Web node could be a Web page referenced by a URL, or it could be a Web object referenced by a URI. These different interpretations of atomic Web nodes lead to varied views of the WWW.

In this theory of Web evolution, an atomic Web node is represented by a web space. For now, readers may simply think of a web space as a personal homepage. This substitution is imprecise, but it is good enough until we need a formal specification of web space later in this series.

In a web space, static content is the written data, dynamic behaviors are the embedded functions and services, and interconnective links are the external Web links that connect one web space to the others. The specific values of these three elements reveal the particular evolutionary stages of the present Web.

For example, the value of static content for a 1.0 web space is raw data. The value of dynamic behaviors for a 1.0 web space is passive, non-portable functions. The value of interconnective links for a 1.0 web space is hardcoded links. In comparison, the respective values for a 2.0 web space are encapsulated data, portable services, and labelled links.
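The stage values above can be laid out as a small lookup structure. This is merely one possible encoding: the `WebSpaceStage` class and its field names are assumptions made for illustration, while the values themselves are the ones listed in the preceding paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebSpaceStage:
    """Values of the three evolutionary elements at one stage of the Web."""
    static_content: str
    dynamic_behavior: str
    interconnective_link: str


# Stage values as given in the text for 1.0 and 2.0 web spaces.
STAGES = {
    "1.0": WebSpaceStage("raw data", "passive, non-portable functions", "hardcoded links"),
    "2.0": WebSpaceStage("encapsulated data", "portable services", "labelled links"),
}

for stage, values in STAGES.items():
    print(stage, values)
```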

We can measure the progress of Web evolution by studying the values of these three evolutionary elements. This is a fundamental recognition of Web evolution. But how do the three elements evolve, and why are these three the most fundamental ones? These questions need further answers. Beginning with the next installment, we start a formal description of the progress of Web evolution.

The next: Two Postulates

Saturday, July 14, 2007

In the Beginning …, A View of Web Evolution, series No. 1

(revised at May. 24, 2008)
(revised at Sep. 26, 2007)

This series is a step-by-step introduction to a view of web evolution. Many of us believe in the evolution of the World Wide Web. Very few, however, have thought in depth about why and how the Web evolves. I believe that the World Wide Web is a self-evolving system that follows objective evolutionary laws. Hence the main goal of web evolution research is to discover these laws.

There is, however, a debate over whether Web evolution is an objective process or a human-guided process. From a philosophical point of view, this debate is equivalent to asking whether the progress of human history is determined by the general public or by a few heroes. If it is the general public that determines history, we may be able to predict the future by summarizing the behavior of the general public and figuring out the objective laws behind it. On the contrary, if it is a few heroes who determine history, the future is basically unforeseeable. Personally, I support the former viewpoint. Heroes in history are the ones whose behavior happens to match the objective laws of evolution. On the basis of this philosophical belief, I assert the existence of objective laws of Web evolution.

In the beginning

Everything has a beginning, and so does the World Wide Web. In the beginning a man invented the World Wide Web. His name was Tim Berners-Lee.

Objective evolutionary laws, however, are not applicable at the very beginning of an evolutionary event. The closer to the point of origin, the less applicable the evolutionary laws are. In the opposite direction, evolutionary laws gradually come to dominate the progress of the evolution.

At the very beginning, when Tim Berners-Lee wrote a private program for himself to share documents through a network, it was unlikely that he explicitly followed any evolutionary laws. When Berners-Lee was the only contributor, nothing constrained his development. No evolutionary laws made sense at that moment.

Later on, when more contributors joined the development of the WWW, they gradually felt the need for a formal organization to coordinate everybody's contributions. The W3C (World Wide Web Consortium) thus came into the world. The subjective will of individual developers started to be pressured by the will of the group. This transition also indicates that objective laws had started to form to guide the further progress of the World Wide Web.

The Web keeps on growing. Once it engages billions of contributors, a question becomes critical: could the progress of a project on such a scale still be controlled by a single organization such as W3C?

If the answer to the previous question is yes, I can then reasonably conclude that the whole theory of the "invisible hand" by Adam Smith must be wrong. Any long-term project that involves billions of people must have its own evolutionary laws. At this super-large scale, purely human guidance becomes unrealistic. In fact, we already have an example to support this claim.

Both the Semantic Web and Web 2.0 were proposed to the public almost simultaneously in 2001. The Semantic Web proposal was championed by leading scientists such as Tim Berners-Lee himself, with the full support of W3C. After it was proposed, thousands of the best Web researchers all over the world started working toward this vision of the Semantic Web. Web 2.0, on the other hand, was suggested by a few thinkers such as Tim O'Reilly, and there was no formal organization behind it to lead its progress. Seven years later, real-world practice shows the success of Web 2.0, while the practice of the Semantic Web is still inside research labs.

This Web-2.0 phenomenon strongly suggests the existence of objective laws of Web evolution. The execution of these laws is beyond the will of any individual or any special interest group.

In summary, the World Wide Web has grown mature enough to be a self-organizing system whose growth is determined by objective evolutionary laws instead of the particular will of any individual human or individual organization. This recognition is the foundation of Web evolution research.

The next: Three Evolutionary Elements