Friday, June 29, 2007

Epistemological extension to ontologies: a key to realizing the Semantic Web?

(updated Dec. 21, 2007)

When discussing the Semantic Web, we often think of ontologies. An ontology contains formal specifications of conceptualizations. By linking world facts to ontological declarations, machines can "understand" the meanings of these facts. Moreover, machines can reason over ontology declarations to derive latent conclusions from explicit ontological expressions. These logical computations form the foundation of the Semantic Web.
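
As a minimal sketch of what "deriving latent conclusions" can mean, the following Python snippet infers class membership through a subclass hierarchy. The class names (Sedan, Car, Vehicle), the individual my_car, and the helper infer_types are all hypothetical, invented here for illustration; a real Semantic Web system would express this in a language such as RDFS or OWL and use a proper reasoner.

```python
# Hypothetical ontology: each class maps to its direct superclass.
SUBCLASS_OF = {"Sedan": "Car", "Car": "Vehicle"}

# Hypothetical world fact linked to the ontology: an individual and its asserted class.
ASSERTED_TYPE = {"my_car": "Sedan"}


def infer_types(individual):
    """Return the asserted class plus every superclass reachable from it."""
    types = []
    current = ASSERTED_TYPE.get(individual)
    while current is not None:
        types.append(current)
        current = SUBCLASS_OF.get(current)
    return types


# Only "Sedan" was stated explicitly; "Car" and "Vehicle" are derived.
print(infer_types("my_car"))
```

The only explicit fact is that my_car is a Sedan. That it is also a Car and a Vehicle is the kind of latent conclusion a machine can derive from the declarations.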

But there is a problem. The process of linking world facts to ontology declarations is often ambiguous and subjective. This process, usually called semantic annotation, is related to another branch of philosophy: epistemology. In short, epistemology studies how we know what we know. A narrower definition of epistemology could be expressed as the study of how we recognize what we know. Human recognition is often subjective.

Beyond the distinction between subjectivity and objectivity, epistemology is more about the knowledge of recognition. This knowledge is generally bound to the superficial side of meanings, such as form and display, whereas ontology is more about the intrinsic side of the concepts themselves. The knowledge in an ontology should not be affected by its external representations, such as form and display. Understanding this distinction is critical to understanding semantic annotation.

To realize the Semantic Web, we should equip machines with not only ontologies but also epistemologies. Without epistemologies, machines with standard ontologies alone can neither recognize new facts nor verify the correctness of the facts assigned to them. Hence they must rely on external procedures (such as data extraction routines) to supply these missing functions. Such external procedures are hard to modify and update. As a result, we may ask why we must adopt the epistemological declarations produced by others for our own ontology declarations. To solve this problem, we need epistemological extensions to ontologies that are declarative rather than hard-coded as programming subroutines.

Epistemological extensions of ontologies allow individuals to specify different ways of recognition with respect to the same ontology. This requirement is common in real society. For example, different people may have varied external interpretations of an agreed, shared definition of the concept "beauty." By separating epistemological declarations from ontological declarations, we can handle semantics better and more flexibly. This is a key to realizing the Semantic Web.
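
To make the separation concrete, here is a rough Python sketch under my own assumptions. It is not the formalism of the paper mentioned in this post. The shared ontology holds one agreed definition of "Beauty", while each person keeps a separate, purely declarative recognition rule. The names ONTOLOGY, EPISTEMOLOGIES, recognizes, alice, and bob, as well as the feature-set rule format, are hypothetical.

```python
# Shared ontological declaration: what the concept is,
# independent of how anyone recognizes it.
ONTOLOGY = {"Beauty": "the quality of giving aesthetic pleasure"}

# Hypothetical per-person epistemological declarations: plain data describing
# how each person recognizes an instance of the shared concept. Because these
# are data rather than subroutines, they can be edited or swapped freely.
EPISTEMOLOGIES = {
    "alice": {"Beauty": {"required_features": {"symmetry", "bright colors"}}},
    "bob": {"Beauty": {"required_features": {"simplicity"}}},
}


def recognizes(person, concept, observed_features):
    """Check whether `person` would annotate the observation with `concept`."""
    if concept not in ONTOLOGY:
        raise KeyError(f"{concept!r} is not declared in the shared ontology")
    rule = EPISTEMOLOGIES[person][concept]
    # The observation matches when it shows every feature the person requires.
    return rule["required_features"] <= set(observed_features)


painting = {"symmetry", "bright colors", "large canvas"}
print(recognizes("alice", "Beauty", painting))
print(recognizes("bob", "Beauty", painting))
```

Both people share the same concept, yet the same painting is annotated as beautiful by one and not by the other. Changing a person's recognition rule touches neither the ontology nor the code.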

For readers interested in exploring this topic further, here is a paper recently accepted by the First International Workshop on Ontologies and Information Systems for the Semantic Web (ONISW 2007), co-located with ER 2007.

Monday, June 25, 2007

Article Review: Embracing "Web 3.0"

From time to time, I write short reviews of interesting articles. To be honest, being a critic is much easier than being a creative writer. The original authors are tremendous, and they offer excellent insights in their articles. My purpose is only to share my understanding after reading their work and to bring this brilliant work to more people.

Not long ago, two leading scientists of web research, Ora Lassila and James Hendler, published an article titled Embracing "Web 3.0" in IEEE Internet Computing. It is a great article that presents how top Semantic Web scientists understand the buzzword "Web 3.0" and even the web beyond it. The article begins by discussing a popular New York Times article by John Markoff, which claimed Web 3.0 to be the Semantic Web. Many Semantic Web researchers, including these two authors and me, do not like this naming. But all of us agree, as stated in the article, that "we enthusiastically embrace the technologies it is bringing to the field."

Brief Summary

This article contains three parts. In the first part, the authors briefly introduce some background on the Semantic Web, probably for readers who lack knowledge of it. In the second part, the authors present their understanding of what "Web 3.0" might look like. In particular, they discuss the adoption of RDF and SPARQL in industry. In the third part, the authors describe their dream beyond Web 3.0. As a matter of fact, the descriptions seem to go back to the original dream of the Semantic Web, which has been discussed repeatedly among Semantic Web researchers for years. In the authors' minds, the so-called "Web 3.0" is nothing but a short period in the infancy of the Semantic Web.

Comment and Thought

In this article, the authors state that the Semantic Web "supports ... a view of information processing that emphasizes information rather than processing." Although I agree with what the authors want to emphasize, this sentence itself is dangerous. It hints that the Semantic Web is primarily a stationary web (since it is more about information) rather than an active web (since it is less about processing information). This latent message is dangerous because, without the ability to process, why would anyone produce semantics at all? Information is produced in order to be processed. Without a focus on processing, the momentum for producing information is lost. In fact, this is a problem I have blogged about repeatedly. This is why the realization of the Semantic Web has been so slow, and why it is still overlooked by many people. Unless we refocus our attention on the processing side, the Semantic Web will not be realized in the near future.

A lesson we learned from the previous "AI hype," as the authors put it, is that "you can't sell a stand-alone 'AI-application'." I totally agree with this point. Unfortunately, however, it seems that many people still have not learned from it. This lesson basically tells us that a semantic Google is impossible to build. No company by itself can build a super-program, however well parallelized and distributed, that semantically processes all knowledge on the web. To realize real, pragmatic semantic search, the collaboration of almost every web user is required. That is, the solution can only be a search network operated by everyone, NOT one operated by a single big node. This is the vision of the semantic search web.

"Web 2.0 is most a social revolution in the use of Web technologies," the authors write. Although Web 2.0 is a social revolution, it is far more than that. Web 2.0 is also a major landmark in the web revolution. Unlike many other Semantic Web devotees, I rather like the term "Web 2.0." From the view of web evolution, this term nicely reveals the fact that our web is evolving and is now at the second major stage of its evolution. The hype of Web 2.0 changes not only our real human society but, more importantly, our virtual human society, i.e., the World Wide Web itself. The quality of web resources has been significantly upgraded. Semantic Web researchers must not focus only on folksonomies and microformats. Though these two things are unquestionably important, other things, such as the implementation of community collaboration, might be at least as important, if not more so.

Web 3.0 is more than RDF and SPARQL. In short, the trend of web evolution is that the web is becoming more and more alive. Machines (or machine agents) on the web will start to think for themselves based on the semantics they have learned. No single machine can reason over all knowledge. But if individual machines each reason over their own limited knowledge, their collaboration can handle close to the entire set of web knowledge. Similarly, no individual human knows much of human knowledge. Yet the whole of human knowledge has become so vast because we are not just individuals. Rather, we collaborate, and our society has evolved to coordinate individual knowledge well. This is what Web 3.0 and its successors will deliver to us.

Friday, June 22, 2007

The Key to Initiate the Semantic Web

I have posted a new article to Semantic Focus, a blog of which I was a joint author. The title of the article is Satisfying the Nature of Selfishness: The Key to Initiate the Semantic Web.

Thursday, June 21, 2007

Reality Divide

A recent post by Sean Ness is interesting. He mentions a phenomenon that is likely to happen in the future: the Reality Divide. In short, it means that we are going to confuse the real and virtual worlds. Isn't the popularity of Second Life a perfect example of the Reality Divide?

"The poor will have access to virtual, while the rich will get to experience the real thing." This is Sean's assessment, and I believe it too. In fact, quite a few new sci-fi novels have started to address this division, predicting that in the future poor people will live only in a virtual world, with very poor real living conditions, while only rich people will have a life in reality, where even real sunshine is a luxury.

Since most people could then live in their dreams, this world might be able to support many more people at a low real cost while allowing them to dream every day. Ironically, it may help achieve world peace! But is this type of peace really preferred?

Moving toward machine processing---the certain destiny of web evolution

This is indeed not new. But I felt strongly that I should say something after reading a new post at Read/WriteWeb. In the post, Alex Iskold discussed a new, but common, phenomenon after the rise of Web 2.0: attention distraction.

Although Web 2.0 provides us with powerful facilities to build virtual social networks on the web, there is a downside to this advancement. Unlike in previous ages, we are more and more easily "caught alive" on the web. As a result, we are distracted regularly. We often have to interrupt our normal work flow to handle exceptions, an inevitable negative side effect of "being popular." This phenomenon is known as the problem of continuous partial attention.

This "continuous partial attention" is a very interesting issue, especially when we think of it from the view of web evolution. In this view, we liken the progress of the World Wide Web to the growth of humans. In particular, we have likened Web 1.0 to a society of newborns, Web 2.0 to a society of pre-school kids, and the ideal Semantic Web to a society of educated people. In fact, this analogy can also explain the cause of "continuous partial attention" on the web and foresee how this issue could be solved gradually as the WWW evolves.

We are seldom interrupted by newborns. In fact, though newborns may cry, we can ignore them if we want because they do not have the ability to interrupt our normal work flow. On Web 1.0, machines can deliver emails (a type of interruption) to us at any time. But we can choose to ignore them as they arrive and take care of them only at a scheduled time. Our normal schedule is kept as usual in the environment of Web 1.0.

When children grow up, parents start feeling the pain of "continuous partial attention" caused by their kids. Pre-school kids in particular still do not have much ability to do things by themselves. But unfortunately (or fortunately), they have learned a limited amount of knowledge and have started to make requests. They ask questions and want companions to play with them. Moreover, they deliver messages. Though this is often seen as positive, these messages are in fact irregular interruptions because the kids often demand the highest priority for handling the messages they deliver. This is what we encounter at present, as in Alex's post: the issue of "continuous partial attention" on Web 2.0.

On Web 2.0, machines have been augmented with limited knowledge. They are equipped with various widgets and active functions. At run-time, we (as virtual parents of these machines) are often interrupted by messages delivered by these kids from other parents (other web users). The prevalence of Twitter only worsens the already disturbed schedule. We are often caught alive online, and we often have no choice but to interrupt our regular work flow to handle these exceptions so that we can maintain good relationships in our online social networks. This is the pain of having growing children, and it is the pain of all Web 2.0 devotees.

How can we solve this problem? Certainly we do not want our children to go back to their newborn stage. Likewise, we certainly do not want to switch back to Web 1.0 or shut ourselves off from the online world just to avoid this "continuous partial attention" issue. Instead, we want our children to grow up and become able to handle things, from simple to complicated, by themselves. For humans, this process is called education, and the people who have gone through it are called educated people. For the web, this process is called annotation (or adding semantics), and the web that has gone through it is called the semantic web. We need to educate machines. Let them understand semantics, from simple to complicated. This is the certain direction of web evolution.

In summary, continuous partial attention is a certain side-effect in the process of web evolution. In this Web 2.0 stage, the severity of this problem will reach its peak. But this problem will be gradually solved during the process of web evolution when more and more machine-processable semantics are added to the web. Though it may not be solved totally (just like in our real life we cannot totally avoid being interrupted), it would not be a serious problem in the future web with rich machine-processable semantics.

Trackback list:

*** Continuous Partial Attention: Software & Solutions

*** Dealing with partial attention issues

*** Supernova 2005: Attention

Monday, June 18, 2007

Yahoo! has a new CEO. Can Jerry Yang lead the company to a new level?

In very recent news, Jerry Yang has replaced Terry Semel as Yahoo's CEO. Though Yahoo! survived the dot-com bust under Semel's leadership, its influence has declined conspicuously in recent years with the rise of Google. Can the crowning of Yang slow down or even reverse the decline of Yahoo? It is hard to tell at this moment. But one thing is for sure: Yang will have a long to-do list to accomplish. In a recent survey, as of now (people can still vote), 44% of respondents believed that Yang might not be the right choice for Yahoo CEO, while only 22% voted yes.

Indeed, I don't think the outcome of the battle between Yahoo and Google will be any different merely because Yahoo has a new CEO, even if that CEO is Jerry Yang. Personally, I have great respect for Yang and his vision in founding such a great company. But the former glory of Yahoo has gone with the rise of Google. Yahoo once established a new model of web search, but Google perfected this model. Once upon a time, Google was a little follower of Yahoo. Except for the PageRank algorithm, Google followed everything Yahoo had invented. But now, much evidence shows that Yahoo is following Google. Google has invented so many great applications by making use of searched web resources. It is difficult for Yahoo even to keep up, let alone beat the ambitious Google. At present, Google's opponent is no longer Yahoo; it is Microsoft.

In the second part of my web evolution article, I presented a brief study of the rise of Google, as well as the decline of Yahoo. As in several of my previous posts, I believe that the future of Yahoo lies in a completely new vision of web search. I doubt anybody can beat Google any more under the traditional web search model. Google has executed it too well, and this model allows the winner to take all the shares eventually. Yahoo must figure out a new solution, an alternative solution. Otherwise, Novell's present will be Yahoo's future. Novell was once a company leading the world in the network realm, and it had the power to decide the fate of others. But now Novell is barely more than a normal mid-size company struggling for its own survival among the big brothers (it was once one of them).

Can Yang again lead Yahoo onto a new route as he did several years ago? Can my vision of web evolution be realized by Yahoo? Best wishes to Yang and Yahoo!

Wednesday, June 13, 2007

Semantic search has two legs

The discussion of semantic search has gradually become popular. Not long ago, semantic search was thought to be barely more than a dream. At present, optimistic researchers have started to believe it is possible in the near future. Very recently at Read/WriteWeb, Dr. Riza C. Berkan, the CEO of Hakia (a company that claims to perform "semantic search"), posted an article about semantic search that attracted much attention. Although I agree with the post, here are some more thoughts about semantic search.

Semantic search has two legs: semantic understanding and proactive collaboration. Until now, however, most semantic search articles have focused only on the first one. An "ideal" semantic search engine, Hakia included, is popularly thought to be like a "semantics-enhanced Google." This is, however, a narrow view. Semantic search means more than "semantics + search."

To better understand these two legs, we may look at a regular semantic search scenario in human society that is often overlooked. We humans have practised a type of semantic search daily and very successfully for centuries. We ask questions; everybody asks questions, from children to adults. We ask questions to look for answers. These question-and-answer behaviors are typical semantic search activities.

When we are young, we look for answers from parents, whose words are oracles to us. When we grow older, we look for answers from teachers, whose words are oracles to us. When we grow even older, we start to realize that there are in fact no oracles. We start to look for answers by ourselves. In particular, we make friends with various specialities. These friends become our sources of answers when we run into trouble in particular realms. In the meantime, we ourselves also become such sources for our friends. These links constitute a delicate, complicated, and successfully executed network of semantic search in our human society.

If we take a closer look at this successful semantic search network, we can find two fundamental factors that support its execution. First, its success relies on the ability of semantic understanding at every node, not just some of them. It is generally believed that the set of human knowledge is too rich and too complicated to be handled in a centralized way. For instance, Mor Naaman at Yahoo! Research very recently said in a WWW2007 panel that "there is no way that we can engage the masses in annotating media with 'semantic' labels." Therefore, representations of global semantics are better distributed widely than accumulated on only a few special nodes. Consequently, every node in this semantic search network is able to perform a certain level of search, depending on its own capability of semantic understanding. This is the basis of a successful semantic search network.

Beyond local semantic understanding on every node, a successful semantic search network also requires proactive collaboration. In a search network, some nodes (such as professors) may have much greater capability than others (such as first-grade students). But even the node with the greatest ability is still very limited in its search capability when the search space is the whole set of human knowledge. A successful semantic search network demands good collaboration among individual search nodes. Moreover, this collaboration needs to be proactive.

Proactiveness is a unique factor in the network of friendship. The network of friendship is not only a regular social network but also a search network. When we run into trouble, we usually go to our friends for help. Moreover, we often make friends on purpose, i.e., not randomly or aimlessly. A successful semantic search network in human society is built primarily on the shared or complementary interests of individuals. For example, both John and Mary love music, so John actively makes friends with Mary. As another example, John plays the piano but does not know how to tune one; Kate, however, is good at tuning pianos. Because he will need piano tuning in the future, John proactively makes friends with Kate. As a third example, Rose is good at history literature; John, however, does not like to read history literature. Consequently, John does not actively seek Rose's friendship. These examples show that establishing a search network depends very much on the proactiveness of these nodes in making connections, which in turn is decided by their semantic understanding of interests.
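
The friendship examples above can be turned into a toy search network. The following Python sketch is only an illustration under my own assumptions. The Node class, its befriend and ask methods, and the sample answers are hypothetical and do not describe any existing system. Each node answers from its own limited knowledge and otherwise forwards the question to its friends.

```python
class Node:
    """A participant with limited local knowledge and a list of friends."""

    def __init__(self, name, knowledge):
        self.name = name
        self.knowledge = dict(knowledge)  # topic -> answer
        self.friends = []

    def befriend(self, other):
        # Friendship is mutual: each side becomes a source for the other.
        if other not in self.friends:
            self.friends.append(other)
        if self not in other.friends:
            other.friends.append(self)

    def ask(self, topic, visited=None):
        """Answer locally if possible; otherwise forward the question to friends."""
        visited = set() if visited is None else visited
        visited.add(self.name)
        if topic in self.knowledge:
            return self.name, self.knowledge[topic]
        for friend in self.friends:
            if friend.name not in visited:
                found = friend.ask(topic, visited)
                if found is not None:
                    return found
        return None  # nobody reachable through friendships knows the answer


john = Node("John", {"playing piano": "practice scales daily"})
kate = Node("Kate", {"piano tuning": "use a tuning lever and a reference pitch"})
mary = Node("Mary", {"music": "try the local concert series"})
rose = Node("Rose", {"history literature": "start with primary sources"})

# John connects on purpose, based on shared or complementary interests.
john.befriend(mary)
john.befriend(kate)

print(john.ask("piano tuning"))        # answered by Kate through the friendship link
print(john.ask("history literature"))  # no link to Rose, so no answer is found
```

Because John never made friends with Rose, her knowledge is unreachable from his side of the network. What a node can find depends on the connections it chose to make.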

In summary, semantic search naturally runs counter to the centralized web search strategy. To bring semantic search to a practical level, we need a search network in which all web users participate, beyond a few independent, aggregated semantic search nodes such as Hakia. The entire web search strategy must undergo revolutionary change rather than simple touch-ups. In the second part of my article about web evolution, I discuss collaborative search for the future semantic web in more detail.

The initial draft of this post was published at SemanticFocus.