Monday, September 07, 2009

The real-time web in a nutshell for Web developers and researchers

The "real-time web" is mentioned more and more often in discussions. For example, ReadWriteWeb recently posted a three-part series, The Real-Time Web: A Primer. In this post, I would like to share some of my thoughts on this new jargon in a concise way. Primarily, the post explains the real-time web from the viewpoint of Web research, unlike the ReadWriteWeb series, which targets general, non-professional readers.

1) The real-time web is a web of information produced in real time.

The statement makes two points: (a) the real-time web is a data web; and (b) the information in the real-time web is genuine.

The two points are equally important. First, unlike the World Wide Web in general, which contains plenty of services and connective links as well as data, the real-time web is primarily a web of data. So far, no web services have been designed specifically for the real-time web (and indeed they are unlikely to be necessary anyway). The dominant majority of the links in the real-time web refer to the details of the data mentioned rather than connecting varied semantics across the Web.

Second, the information the real-time web produces is generally genuine. This means that the semantics produced among real-time-web data are often new on the Web (i.e., they have never occurred before or could not be found elsewhere). This implicit hint carries a tremendous amount of value. For example, once we know the time at which a piece of semantics was first coined on the Web, the difficulty of semantic search on the Web would be significantly reduced.
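To make the "start time" idea concrete, here is a minimal sketch of an index that remembers when each term was first observed. It assumes that messages arrive as (timestamp, text) pairs and that a "term" is simply a lowercase word or hashtag; the `FirstSeenIndex` class and its tokenization are hypothetical illustrations, not part of any existing service.

```python
import re
from datetime import datetime


class FirstSeenIndex:
    """Records the earliest timestamp at which each term was observed."""

    def __init__(self):
        # Maps a term to the earliest timestamp seen for it so far.
        self._first_seen = {}

    def add(self, timestamp, text):
        # Assumption: a term is a run of letters, digits, '#', '@' or "'".
        for term in set(re.findall(r"[a-z0-9#@']+", text.lower())):
            known = self._first_seen.get(term)
            # Keep the earliest timestamp even if messages arrive out of order.
            if known is None or timestamp < known:
                self._first_seen[term] = timestamp

    def first_seen(self, term):
        # Returns None when the term has never been observed.
        return self._first_seen.get(term.lower())


index = FirstSeenIndex()
index.add(datetime(2009, 9, 7, 10, 0), "Breaking: #quake hits")
index.add(datetime(2009, 9, 7, 9, 0), "#quake reported")
print(index.first_seen("#quake"))
```

A search engine could use such a lookup to restrict a query to documents published after the term first appeared, which is one way the hint might reduce the search space.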

2) The real-time web is built upon a network of instant messaging.

Not necessarily. But in reality the real-time web has developed as a network of instant messaging. Twitter plays a crucial role in this migration. It is Twitter that invented the 140-character limit in real-time data production. This invention ties data production in the real-time web tightly to the technology of instant messaging, since the latter favors the production of the former. By contrast, the real-time web could have turned out very differently if it had been led by other companies such as CNN (in which case the real-time web could be a network of greater chunks of data integrated with complex services but with timestamps).

3) The real-time web is a subset of the World Wide Web.

The real-time web is not the next stage of the World Wide Web. It is just a newly emerged subset of the Web. It is especially important for Web developers to recognize the distinction so that they do not misinterpret the evolution of the World Wide Web.

4) The real-time web is a web of heavily overloaded information, full of duplicated data.

The proportion of duplicated data (as well as duplicated semantics/meaning) is significantly greater than in the rest of the Web. Very often the same data (or the same semantics) repeats extremely frequently within a short period of time in the real-time web. Hence any attempt to consume real-time-web data must be carefully designed to handle this unusual environment, which is quite different from handling other Web data.
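One simple way a consumer might cope with such duplication is to suppress messages whose normalized text has already been seen within a recent time window. The sketch below is only an illustration under stated assumptions: timestamps are plain seconds, "duplicate" means identical text after lowercasing and stripping punctuation, and the names `fingerprint` and `WindowedDeduplicator` are hypothetical. It does not detect messages that repeat the same meaning in different words.

```python
import hashlib
import re


def fingerprint(text):
    """Hash of the text after lowercasing and removing punctuation."""
    normalized = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    normalized = " ".join(normalized.split())  # collapse repeated spaces
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


class WindowedDeduplicator:
    """Flags a message as new unless the same text appeared recently."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        # Maps a fingerprint to the timestamp of its latest occurrence.
        # Note: this sketch never evicts old entries, so memory grows.
        self._last_seen = {}

    def is_new(self, timestamp, text):
        key = fingerprint(text)
        previous = self._last_seen.get(key)
        self._last_seen[key] = timestamp
        # New if never seen, or if the last occurrence is outside the window.
        return previous is None or timestamp - previous > self.window_seconds


dedup = WindowedDeduplicator(window_seconds=60)
for ts, message in [(0, "Big news today!"), (10, "big news   TODAY"), (100, "Big news today!")]:
    print(ts, dedup.is_new(ts, message))
```

Because every repeat refreshes the stored timestamp, a message that keeps circulating stays suppressed until it has been quiet for longer than the window.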

5) The real-time web is a web of uncooked information.

The real-time web shows instinctive human consciousness, whereas the rest of the Web shows human memory. Again, this distinction implies that different data mining technologies are required for handling data in the real-time web.

6) The real-time web is not a new form of communication.

I disagree with the argument that the real-time web is a new form of communication. The argument not only misrepresents the essence of the real-time web but also leads readers away from the proper use and implementation of the real-time web.

The real-time web is not a form of interpersonal communication; instant messaging is. The real-time web is a platform for instant messages. Why is the distinction critical? The two views of the real-time web place significantly different demands on security and data integrity. If we treat the real-time web as a form of communication, we need to focus on restricting access to personal information. By contrast, if we treat the real-time web as a platform for instant messages, we need to pay more attention to guaranteeing the freedom of information broadcasting. Trying to mix the two fairly contradictory purposes could only add unnecessary complexity to the development of the real-time web.

The correct attitude is that (a) we need to have a public and free real-time web, and (b) we may need to invent better forms of private communication within the real-time web.

7) The real-time-web information decays.

Compared with the information stored in the regular Web, the information in the real-time web decays significantly faster. Once a piece of information has decayed, it is no longer meaningful to use. This implies that we need to invent a channel that moves a piece of information from real-time-web accessibility to regular-web accessibility to stop the decay, if the information is truly worth preserving. There is a lot more work to do in order to truly facilitate the real-time web.
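As a rough illustration of such a channel, the sketch below models decay with a half-life and selects the items that still carry enough value to be copied to a permanent store. Everything here is an assumption made for the example: the exponential decay model, the notion of an "initial value" per item (for instance, how often it was repeated), and the function names `decayed_value` and `select_for_archive`.

```python
def decayed_value(initial_value, age_seconds, half_life_seconds):
    """Value remaining after age_seconds, halving every half_life_seconds."""
    return initial_value * 0.5 ** (age_seconds / half_life_seconds)


def select_for_archive(items, now, half_life_seconds, threshold):
    """Return the texts whose decayed value is still at or above threshold.

    items is an iterable of (created_at, initial_value, text) tuples,
    with created_at and now expressed in seconds.
    """
    selected = []
    for created_at, initial_value, text in items:
        value = decayed_value(initial_value, now - created_at, half_life_seconds)
        if value >= threshold:
            selected.append(text)
    return selected


stream = [
    (0, 8.0, "widely repeated report"),
    (0, 1.0, "passing remark"),
]
print(select_for_archive(stream, now=3600, half_life_seconds=3600, threshold=2.0))
```

How to estimate the initial value and the half-life is exactly the open work mentioned above; the sketch only shows where such estimates would plug in.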

Summary

Do not underestimate the production of the real-time web. Do not overestimate the value of the real-time web. Think about the real-time web differently.

1 comment:

aangtce said...

///7) The real-time-web information decays.

Unlike the information stored in the regular Web, the information in the real-time web decays significantly faster.


Very true. Also, the real-time web data is a lot noisier (a corollary of 4).


/// The fact implies that we need to invent a channel that allows the transportation of an information from its real-time web accessibility to the regular web accessibility to stop the process of decaying, if the information is truly worth of preserving.


Guess it is time for all the statistics tools to be unleashed as real-time web services and integrated with real-time web data?