Thursday, September 20, 2007

Semantic Web: Difficulties and Opportunities

This post responds to a recent post by Alex Iskold that focuses on the difficulties of realizing the Semantic Web. I do not disagree with Alex's claims. In fact, realizing the Semantic Web is not just difficult, it is VERY difficult. Beyond what Alex said, however, I have several points to add after reading his post.

Realizing the Semantic Web: is it a human project or an evolutionary event?

I believe there is a common bias here. Many people think of realizing the Semantic Web as a great human project similar to the famous Apollo program, whose goal was "landing a man on the Moon and returning him safely to the Earth." In fact, the two are very different from each other.

An apparent difference between the Apollo program and constructing the Semantic Web is that the latter is not just an attempt to reach a goal, but a process of transformation driven by the emergence of mutant forms. We are not designing a semantic web and then building it from that design. As Alex mentioned in his post, this kind of top-down human design is hardly practical. This is why "for the past decade [Semantic Web] has been a kind of academic exercise rather than a practical technology."

Rather than achieving a well-planned goal, we are experiencing a process in which the Web is transformed through the continual filtering of its mutant forms. When individuals or groups create a new form of web application, they create a mutation of one segment of the Web. The fate of this mutation is determined, however, by public vote rather than by the dictating power of any single organization. This filtering mechanism follows the standard law of natural selection: favorable traits are kept and flourish during evolution, while unfavorable traits become less common. This is the way we are approaching the Semantic Web.

A realized Semantic Web must be more than a web of data.

The W3C gave the Semantic Web a short and simplified explanation: a web of data. Although this explanation illustrates the goal of the Semantic Web, it is somewhat misleading to web developers who want to build it.

The point is that if we focus purely on constructing a web of data (even though this is what we want to build), the Semantic Web cannot be built. Constructing a web of data requires user participation. This is not a third-party task; it requires the participation of the information contributors themselves, i.e., almost all web users.

A crucial requirement is willingness. How can we make users willing to explain the meaning of their contributions to machines? The answer is not simply that we need automated semantic annotation tools. Such tools are certainly important, but the more crucial question is why users would need these tools and ontologies in the first place. Resolving this willingness issue is fundamental if we really want to make the Semantic Web real. By comparison, Web 2.0 became popular because users were willing to adopt Web 2.0 products.

Ordinary web users may not care whether the Web is a web of data. But they do care whether, by communicating with machine agents, they can get these agents to work proactively for them. Proactive machine agents can understand their masters' requests and execute those requests on the web by themselves. Implementing this proactivity is at least as essential as creating links among data if someone truly wants to construct a semantic web. This implementation is a key to ensuring user participation.

For people who still doubt that machine agents capable of understanding semantics can be implemented, I repeat the claim from the beginning: think of this task as an evolutionary event, not just an attempt to reach a goal.

The Commercial Side of the Semantic Web

Alex raised several questions about the business challenges facing the Semantic Web. One of his sentences was especially precise: "The way the semantic web is presented today makes it very difficult to market."

We are not on target. We cannot market the Semantic Web by describing it only as a web of data. This standard explanation is hard to market because it has no direct connection to consumers. For comparison, consider the following three sentences.

"Users can travel from one web site to another by clicking on hyperlinks." This is interesting, and users are willing to explore more information. So the World Wide Web is marketable.

"Users can freely exchange their opinions by blogging and leaving comments at remote sites." This is also interesting, and users are willing to have free spaces where they can exchange ideas without restrictions. So Web 2.0 is marketable.

"Users can educate machine agents by guiding them learning semantics. Then these machine agents can help their masters do what they have learned." This could be interesting, and users would be glad to "educate" machines by specifying data with machine-processable semantics once they are assured that their work benefits themselves rather than a web of data that means nothing to them (who cares?). This is a philosophical shift of target, from a web of data to individual machine agents. A web of data would be woven automatically once these machine agents start to communicate and work on the web. This is a practical way to market, as well as construct, the Semantic Web.

Please allow me to repeat my vision of the semantic web from a previous post, a simple picture of web evolution. Web 1.0 connects real people to the World Wide Web. Web 2.0 connects real people who use the World Wide Web. The future semantic web, however, will connect virtual representatives of real people who use the World Wide Web.


The realization of the Semantic Web is an evolutionary event, not just an attempt to reach a goal. This understanding is essential to constructing the Semantic Web. Therefore, we had better forget about any simple bottom-up or top-down approach, since this is not a single project. Many people are simultaneously creating mutations of the current web. The most favored ones will be adopted by the public, and the Web evolves. Any insistence that the Semantic Web must take one particular form is likely to wane in this natural selection process.

Selfishness is a main obstacle to realizing the Semantic Web. At the same time, however, it is also the key to marketing it.


Anonymous said...

Interesting thoughts, especially the comments on "agents". While this sort of technology isn't widespread today, I'm fairly confident we'll see an uptake in agents, proxies, background processes, or other mechanisms as tools for discovering, filtering, and manipulating online data.

Just as Google improves the quality of search results by analyzing past user activity, users' interaction with these tools will provide valuable data that can be leveraged to further bootstrap the semantic web / discovery / filtering / categorization process.

NitinK said...

Great article, Yihong! I like your exploration of the commercial aspects of the semantic web - this area has not received much focus, but is sorely needed to spur interest and innovation (not to mention, investment).

Unknown said...

"Users can educate machine agents by guiding them learning semantics. Then these machine agents can help their masters do what they have learned."

Yihong, you said it right! Agents are the key to the semantic web. Unfortunately, not many people talk about agents as much as they talk about the "web of data". Alex totally missed the clear distinction that Tim Berners-Lee made between the semantic web and natural language processing.

- Ramani
(Blogger with huge interest in semantic web)