Wednesday, November 28, 2007

Blink: an embarrassment of collective intelligence

Blink is another best-selling book by Malcolm Gladwell, following his influential The Tipping Point. Blink is about the human unconscious. In the book, Gladwell argues that a decision made by a well-trained unconscious is often better than an alternative reached through thorough deliberation. The reason: the well-trained unconscious (through so-called "thin slicing") catches only the very core of a problem, while thorough deliberation often wanders into unessential branches that end up burying that core. This is "the power of thinking without thinking," as the book's subtitle puts it.

This observation about the importance of "thin slicing" exposes an embarrassing side of collective intelligence: if a decision made on a collective basis conflicts with an alternative made by the instinct of a few top experts, which one should we trust? Web 2.0 experience asks us to vote for the first, but Gladwell's book tells us that most of the time it is the second that is more trustworthy. Which one would you pick in reality?

This is a vague question that may not have an absolute answer. But at the very least it shows that collective intelligence is not a panacea. An opinion from a domain expert and an opinion from a layman should certainly be weighted differently when both are applied to a decision. Sometimes, as Blink tells us, the instinct of a very few experts is far more correct than a collective decision.

So is the YouBeTheVC competition a truly serious event? Maybe it is just another American Idol show. Think about it: would Larry Page and Sergey Brin (or Mark Zuckerberg) have attended this kind of idol show when they had the blueprint of Google (or Facebook) in mind? I doubt it. A distinctive idea more often comes out of a blink than out of a collective vote.

Friday, November 23, 2007

Multi-layer Abstractions: World Wide Web or Giant Global Graph or Others

(A compact version of this article is cross-posted at ZDNet.)

Sir Tim Berners-Lee has blogged again. This time he coined another new term---Giant Global Graph. Sir Tim uses GGG to describe the Internet at a new abstraction layer, different from both the Net-layer abstraction and the Web-layer abstraction. Quite a few technology blogs immediately reported the news over this Thanksgiving weekend. I am afraid, however, that few of them really told readers the deeper meaning of this new GGG. To me, it is a signal from the father of the World Wide Web: the Web (or the information on the Internet) has started to be reorganized from the traditional publisher-oriented structure into a new viewer-oriented structure. This claim from Sir Tim Berners-Lee matches my previous predictions about web evolution well.

Why another layer?

We need to look at a question---why do we need another layer of abstraction over the Internet? The answer: because the previous abstractions are no longer sufficient to foster the newest evolution of the Internet. According to Sir Tim, we previously had two typical abstraction layers. The first is the Net, in which the Internet is a network of computers. The second is the Web, in which the Internet is a network of documents. On top of these two, Sir Tim now declares a third abstraction layer named the Graph, in which the Internet is a network of individual social graphs.

We are all familiar with the Net layer of the Internet, which Sir Tim also calls the International Information Infrastructure (III). Whenever we buy a new computer and connect it online, the computer automatically becomes part of the III. Through this computer, humans can access information stored on all the other computers within the III. At the same time, the information stored on this new computer becomes generally accessible to all the other computers within the III. This abstraction layer is particularly useful when we discuss information transmission protocols on the Internet.

The Web layer of the Internet is usually called the World Wide Web (WWW). "It isn't the computers, but the documents which are interesting." Most of the time human users care only about the information itself, not about which computers physically store it. Whenever somebody uploads a piece of information online, that information automatically becomes part of the WWW. In general, a piece of information keeps its meaning regardless of whether it is physically stored on computer A or computer B. This abstraction layer is particularly useful when we discuss general information manipulation on the Internet.

Are these two abstractions enough for us to explore the full potential of the Internet? Sir Tim answers no, and I agree. The evolution of the Internet continuously brings new challenges. As I pointed out in my web evolution series, the primary contradiction on the Web is always between the unbounded quantitative accumulation of web resources and the limited resource-operating mechanisms available at the same time. We continuously need newer resource-operating mechanisms to resolve this primary contradiction at each new level, and those newer mechanisms are reflections of newer abstraction layers of the Internet. For Web 2.0 in particular, this primary contradiction shows up as the continuously growing amount of individually tagged information and the lack of ability to organize it coherently. The concept of the social graph helps resolve this contradiction.

Both Brad Fitzpatrick and Alex Iskold presented the same observation: every individual web user expects an organized social graph of the web information they are interested in. Independently, I made a similar argument using a different term: web space. At the current stage of web evolution, web users are going to look for ways to integrate the web information of interest they have explored into a personal cyberspace---a web space. Inside each web space, information is organized as a social graph from the perspective of the space's owner. This is the connection between web spaces under my interpretation and social graphs under Brad's and Alex's interpretation. Note that the web-space interpretation reveals another implicit but important aspect: the major role of a web-space owner is that of a web viewer rather than a web publisher.

The emergence of this new Graph abstraction of the Internet tells us that the Web (or information on the Internet) is now evolving from a publisher-oriented structure to a viewer-oriented structure. At the Web layer, every web page organizes information based on the view of its publishers. Web viewers generally have no control over how web information should be organized, so the Web layer rests on a publisher-oriented structure. At the newly proposed Graph layer, every social graph organizes information based on the view of the graph's owner, who is primarily a web viewer. In general, web publishers have little influence on how these social graphs are composed. "It's not the documents, it is the things they are about which are important." Who is going to answer what "the things they are about" are? The viewers, not the publishers. This is why information organization at the Graph layer becomes viewer-oriented. The composition of all these viewer-oriented social graphs becomes a giant graph at the global scale that is equivalent to the World Wide Web (though seen from a different angle); this giant composition is the Giant Global Graph (GGG).
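To make the idea concrete, here is a minimal sketch in Python of how a viewer-oriented space might organize resources pulled from many publisher sites, and how such spaces could compose into one global graph. The names (WebSpace, GiantGlobalGraph) and the example URLs are my own invented illustrations of the concept, not anything Sir Tim or anyone else has specified.

```python
from collections import defaultdict

class WebSpace:
    """A single viewer's space: resources from many sites, organized by the viewer."""
    def __init__(self, owner):
        self.owner = owner
        self.edges = defaultdict(set)   # viewer-chosen topic -> set of resource URLs

    def collect(self, topic, url):
        # The viewer, not the publisher, decides under which topic a resource lives.
        self.edges[topic].add(url)

class GiantGlobalGraph:
    """The composition of all viewer-oriented spaces into one global graph."""
    def __init__(self):
        self.spaces = []

    def add_space(self, space):
        self.spaces.append(space)

    def who_cares_about(self, topic):
        # Traverse every viewer's graph instead of any single publisher's site.
        return [s.owner for s in self.spaces if topic in s.edges]

# Usage: two viewers organize the same publisher's page in their own ways.
alice, bob = WebSpace("alice"), WebSpace("bob")
alice.collect("semantic web", "http://example.org/ggg-post")
bob.collect("social graphs", "http://example.org/ggg-post")

ggg = GiantGlobalGraph()
ggg.add_space(alice)
ggg.add_space(bob)
print(ggg.who_cares_about("semantic web"))   # ['alice']
```

The point of the sketch is only that the organizing keys (the topics) belong to the viewers, while the publishers' pages appear merely as leaves referenced by many different spaces.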

More Discussion

Turning from a publisher-oriented web to a viewer-oriented web is a fascinating transformation. From the viewpoint of web evolution, the core of this transformation is the upgrade of web spaces.

  • On Web 1.0, web spaces were homepages. Homepages typically represented the publishers' view. So Web 1.0 was a publisher-oriented web.

  • On Web 2.0, web spaces become individual accounts. Web 2.0 is a transition from the publisher-oriented web to the viewer-oriented web, and individual accounts are the representative units of this transition. Within an account, web viewers collect resources of interest and store them in the account, so these accounts carry significant viewer-oriented aspects. On the other hand, the accounts are isolated inside various web sites, which are typical information organizations built on the publisher-oriented view. Individual accounts on these sites therefore inevitably also carry significant publisher-oriented aspects. This mixture of the two views causes more problems than benefits: users find it difficult to organize information across the boundaries of web sites.

  • On the future Web 3.0, web spaces will become primarily viewer-oriented. In contrast to Web-2.0 accounts, Web-3.0 spaces (or graphs) are going to be collections of web resources from various web sites, organized essentially by the view of the web viewers. Web-3.0 spaces will become viewer-side home-spaces, in contrast to the publisher-side home-pages of Web 1.0.

This vision of a viewer-oriented web is exciting. But is anything still missing from it? If what we have discussed so far were all we needed, Twine would already be the example of the ultimate solution toward Web 3.0. But as I have also analyzed, Twine (or at least the current Twine beta) is at most Web 2.5. There is still a missing piece in this vision.

The missing piece is proactivity. In my web evolution article, I emphasized that implementing proactivity is a key to the next transition in web evolution. Unlike publishers, who fully control whether and what they publish, viewers control neither of these questions. A successful viewer-oriented information organization must therefore be equipped with proactive mechanisms so that viewers can continuously update their social graphs with newly published web information. Just as implementing activity (such as RSS) was a key to the success of Web 2.0, implementing proactivity will be a key to the success of Web 3.0, the Semantic Web, or the newly proposed Giant Global Graph.
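As a rough illustration only, and assuming the viewer-side space simply subscribes to ordinary RSS feeds, a minimal proactive loop might look like the Python sketch below: it polls the feeds and pulls matching items into the viewer's own graph. The feed URL, the subscription table, and the keyword filter are all hypothetical; this is not a description of any existing system.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical subscriptions: viewer-chosen topics mapped to RSS feed URLs.
SUBSCRIPTIONS = {
    "semantic web": ["http://example.org/feeds/semweb.rss"],
}

def poll_feed(url):
    """Fetch an RSS feed and yield (title, link) pairs from its <item> elements."""
    with urllib.request.urlopen(url, timeout=10) as response:
        tree = ET.parse(response)
    for item in tree.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        yield title, link

def proactive_update(space):
    """Pull newly published items into the viewer's space without the viewer asking."""
    for topic, feeds in SUBSCRIPTIONS.items():
        for feed_url in feeds:
            for title, link in poll_feed(feed_url):
                if topic in title.lower():          # a crude keyword filter
                    space.setdefault(topic, []).append(link)

my_space = {}                    # a stand-in for the viewer-side web space
# proactive_update(my_space)     # would be run periodically, e.g. by a scheduler
```

The design point is simply that the update runs on the viewer's behalf, on a schedule, rather than waiting for the viewer to revisit each publisher's site.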

Monday, November 19, 2007

An Introduction to ThinkerNet

ThinkerNet is the main component of Internet Evolution, a new site focused on exploring the future of the Internet. ThinkerNet is an interactive blog forum written by industry mavens, futurists, authors, entertainers, and other well-known Internet figures. Among the bloggers at ThinkerNet you can find Craig Newmark, the founder of Craigslist.com, Philip Rosedale, founder and CEO of Linden Lab, and many other names that come to mind whenever you think about the future of the Internet. ThinkerNet is a place where exceptional thinkers on the Web engage with each other.

I was invited to join this extraordinary group of thinkers. I regard the invitation as an honor, and I recommend ThinkerNet to the readers of Thinking Space. Many articles at the site are worth reading and thinking about. Check them out and enjoy!

Friday, November 16, 2007

My Talk with Talis

I had a talk with Paul Miller about the Semantic Web and web evolution. The first half of the talk is about me and my current PhD research; the second half turns to my view of web evolution and the Semantic Web. In general we had an easy and pleasant conversation, except that Skype dropped off four or five times in the middle. So please forgive us if you hear a few broken connections in the recording.

This talk is part of the Talking with Talis series, in which Paul has talked with web evangelists such as Danny Ayers, Nova Spivack, Thomas Vander Wal, and many more.

Thursday, November 15, 2007

The Curse of Knowledge and Semantic Web

In my newest Semantic Focus article, I casually introduced ontology mapping and its solutions by discussing an interesting issue---The Curse of Knowledge. The following is a quote from the article:

"Since the Curse of Knowledge is a major reason for ontology mapping on the Semantic Web, we may try to solve this problem by breaking the Curse of Knowledge. By breaking this curse, we may solve the problem of ontology mapping in reality more easily than trying to exploit the complex algorithms of computer science."

Check out the complete article at Semantic Focus if you are interested.

Monday, November 12, 2007

Chinese Version of Thinking Space

I have set up a Chinese version of Thinking Space (思维空间) dedicated to Chinese readers. Although at present my plan for 思维空间 is to host an official Chinese translation of Thinking Space (Google's translation is far from satisfactory), sooner or later I may start to post special articles particularly for Chinese readers. So, if you are Chinese or prefer reading in Chinese, keep an eye on this new Chinese version of Thinking Space. I hope to see your comments in the new space!

BTW: Any translated post will have a "read the story also in Chinese" label underneath the title.

Tuesday, November 06, 2007

The Implicit Web

(This article is cross-posted at ZDNet's Web 2.0 Explorer.)
(Read the article also in Chinese, translated by the author.)

The implicit web is a new concept coined in 2007. With the first Defrag conference happening right now, discussion of this new term is timely.

Generally, the concept of the implicit web is meant to alert us to a fact: besides all the explicit data, services, and links, the Web involves far more implicit information, such as which data users have browsed, which services users have invoked, and which links users have clicked. This type of information is usually too tedious to be worth reading by humans, so inevitably it is only implicitly stored (if it is stored at all) on the Web. The implicit web describes a network of this implicit information.

Implicitness Everywhere

Implicit information is everywhere. Implicit information on the Web is about the things human web users have paid attention to. For example, it is about which web pages are frequently read, how often they are read, and who reads them. It is also about which services are frequently invoked, how often they are invoked, and who invokes them. Considering the number of web users and how many activities each of them performs on the Web every day, the amount of implicit information must be astonishing. Implicit information co-exists with every web page, every web service, and every web link. In short, a great amount of implicitness co-exists with every little piece of explicitness on the Web.

Implicit does not mean insignificant or unimportant. On the contrary, implicit web information is often valuable and even crucial in various situations. For example, implicit click-rate information can help editors decide which news stories are the most popular and therefore belong on the front page. Similarly, the same kind of implicit click rates can help salespeople decide which merchandise is in the greatest demand so they can plan the next round of supply.
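As a minimal sketch of this kind of use (not any particular site's system; the event log and story names below are made up), simply counting clicks per item and ranking them is already enough to surface the most popular stories:

```python
from collections import Counter

# Hypothetical implicit click events: (user, item_clicked)
click_log = [
    ("u1", "story/election"), ("u2", "story/election"), ("u1", "story/election"),
    ("u3", "story/weather"),  ("u2", "story/weather"),
    ("u1", "story/sports"),
]

def top_stories(events, n=2):
    """Rank items by click count; the implicit log drives the explicit front page."""
    counts = Counter(item for _user, item in events)
    return counts.most_common(n)

print(top_stories(click_log))
# [('story/election', 3), ('story/weather', 2)]
```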

Many companies have already started to collect implicit information and benefit from it. Alex Iskold has written a compact introduction to how some companies use implicit information in their products. One well-known example is Amazon.com, which lists related buyer recommendations alongside each piece of merchandise. Many readers must be familiar with the label "Customers Who Bought This Item Also Bought." More importantly, many web users do care about the content underneath this label. This is a typical example of how implicit web information helps.

Amazon is not the only company that benefits from implicit information, nor is it one of just a few. In fact, nowadays almost every website that sells something, from baby toys to cars, has some back-end mechanism for analyzing its traffic (a typical piece of implicit information) and adjusting its sales plan based on the analysis. Implicitness is indeed everywhere.

Connect Implicitness

Implicitness is everywhere, but it is fragmented everywhere. Implicit information on the Web is not connected. This is a problem.

Until now, implicit web information has generally been stored separately, typically by individual companies. For example, both Gap.com and jcrew.com keep their own visitor histories without sharing them with each other, although we can imagine that this information would connect well, since both companies sell apparel and accessories. Someone may argue that Gap and J. Crew are competitors, so let us switch the pair to Banana Republic and Victoria's Secret. The products of these two companies complement (rather than compete with) each other, yet their implicit information still remains isolated, even though both sides could benefit from connecting it. Readers can find many more examples of this kind.

If sharing implicit information among big companies is still questionable (because these big players hardly believe they could be helped by smaller ones), this type of sharing is much more critical for small websites. Numerous individual sites cannot make good use of their own implicit information because they are too small. At the same time, there is no effective way for them to share and find helpful implicit information, even though everybody knows there is plenty of it on the Web.

All of this leads to one demand: we need the implicit web, which does not exist yet. The goal of the implicit web is to defragment all the fragments of implicitness (which is where the Defrag conference gets its name). But how can we actually connect all the different types of implicitness on the Web into a coherent implicit web? This is a grand challenge for the newly formed community of implicit web research. We do not have a clear answer yet.
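Purely to illustrate what "connecting implicitness" could mean, here is a hypothetical Python sketch in which different sites emit implicit events in one common shape so they can be merged and queried together. The field names, sites, and identifiers are all invented; no such shared format exists today, and it is only one imaginable direction.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ImplicitEvent:
    """A hypothetical common shape for implicit information shared across sites."""
    site: str        # which site observed the event
    user: str        # an anonymized visitor identifier
    action: str      # e.g. "view", "click", "purchase"
    resource: str    # the URL or item the action was about

def merge(streams: List[List[ImplicitEvent]]) -> List[ImplicitEvent]:
    """Defragment: combine per-site event streams into one pool."""
    return [event for stream in streams for event in stream]

# Two (invented) sites contribute their fragments of implicitness.
site_a = [ImplicitEvent("gap.example", "anon42", "view", "/chinos")]
site_b = [ImplicitEvent("jcrew.example", "anon42", "purchase", "/chinos")]

pool = merge([site_a, site_b])

# A cross-site question neither fragment could answer alone:
viewed = {e.user for e in pool if e.action == "view"}
bought = {e.user for e in pool if e.action == "purchase"}
print(viewed & bought)   # {'anon42'}
```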

Whatever the answer turns out to be, however, the solution must go beyond web links. The implicit web involves complex kinds of semantics, the amount of information on it is gigantic, and it is highly dynamic. The traditional model of the web link is too simple, too shallow, and too static to deal with all these challenges at the same time. We need big, creative ideas for storing and linking all this implicitness.

The greatest potential problem for the implicit web is privacy. For companies, some implicit information may be too confidential to be shared. For individuals, some implicit information may be too private to be made public. We need innovative methods of privacy control on the implicit web.

The Implicit Web in a Nutshell

In summary, I briefly list my beliefs about the implicit web.

1. The implicit web is a network that defragments every piece of implicitness on the explicit web, i.e., the World Wide Web as we generally know it.

2. If the explicit web reveals the static side of human knowledge through posted data, services, and links, the implicit web reveals the dynamic side of human knowledge by recording how users access these data, services, and links.

3. The explicit web engages collective human intelligence. The implicit web engages collective human behaviors.

4. The implicit web is not part of the Semantic Web, but the two are closely related. If the Semantic Web constructs a conceptual model of the World Wide Web, the implicit web constructs a behavioral model of the World Wide Web.