Monday, January 01, 2007

2007, New Year, New Web

The World Wide Web has come to a critical turning point. Many web evangelists have predicted that a new age will arrive in 2007. With the continuous advances in Web 2.0 practices, many new philosophies have been introduced to web design and applications. More importantly, these new philosophies (such as collaborative intelligence) have gradually been accepted by the majority of ordinary web users. The help from ordinary web users decreases the cost of both starting up and maintaining a new company. Web researchers and developers can now turn their research results into marketable products faster than ever. As a result, the progress of web evolution is accelerating.

Many web evangelists predict that adding richer semantics to Web 2.0 applications will be a focus of achievement in 2007. This prediction has produced new hype around an "old" term: Semantic Web. Though I consider myself a Semantic Web researcher, I am not as optimistic about this movement in 2007 as several other web researchers are. Many fundamental problems in Semantic Web research will probably not be solved any sooner because of this hype from the Web 2.0 realm. The realization of the Semantic Web relies heavily on major progress on three main issues: knowledge collection and formalization (essentially ontology construction), knowledge instantiation (essentially semantic annotation and authoring), and knowledge processing (essentially logical inference and reasoning). So far, however, there is no practical solution that scales to the size of the web for any of the three. Nor do we know of any promising solutions that may come soon. The only sure thing is that the Web 2.0 hype may not bring any immediate breakthrough on any of these main issues of Semantic Web research.

This year, I am going to continue my blog with brainstorms and new thoughts on the progress of web technologies. In January, I plan to publish an online article presenting my historical and analogical view of web evolution. In the article I am going to introduce a novel explanation of Web 2.0 that may better illustrate its natural properties. Moreover, I am going to make detailed predictions about the next-generation web. I hope that these new thoughts will bring the WWW community some fresh ideas, especially for the study of web evolution.

1 comment:

Francesco Sclano said...

TermExtractor is online! It's a FREE and high-performing tool for terminology extraction.

TermExtractor, my master's thesis, is online at the

TermExtractor is a FREE and high-performing software
package for Terminology Extraction.
The software helps a web community to
extract and validate relevant terms in their
domain of interest by submitting an archive of
domain-related documents in any format
(txt, pdf, ps, dvi, tex, doc, rtf, ppt, xls, xml,
html/htm, chm, wpd and also zip archives.)

TermExtractor extracts terminology that is consensually
referred to in a specific application domain. The
software takes as input a corpus of domain documents,
parses the documents, and extracts a list of
"syntactically plausible" terms (e.g. compounds,
adjective-nouns, etc.).
Document parsing assigns greater importance
to terms that appear in emphasized text layouts (title,
abstract, bold, italic, underlined, etc.). Two entropy-based
measures, called Domain Relevance and Domain Consensus,
are then used.
Domain Consensus is used to select only the terms
that are consensually referred to throughout the corpus
documents. Domain Relevance is used to select only the
terms that are relevant to the domain of interest; it
is computed with reference to a set of
contrastive terminologies from different domains.
Finally, extracted terms are further filtered using
Lexical Cohesion, which measures the degree of
association of all the words in a terminological
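
The description above does not give formulas for the two measures, so the
following Python sketch is only a rough illustration under stated assumptions:
Domain Consensus is taken to be the entropy of a term's frequency distribution
across the corpus documents, and Domain Relevance is simplified to a frequency
ratio against contrastive corpora. The function names and the data layout
(each document as a list of candidate terms) are hypothetical and are not
TermExtractor's actual interface.

```python
import math


def domain_consensus(term, documents):
    """Entropy (in bits) of the term's distribution over the documents.

    Assumption: each document is a list of already-extracted candidate terms.
    A term spread evenly over many documents scores high; a term confined
    to a single document scores 0.
    """
    counts = [doc.count(term) for doc in documents]
    total = sum(counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts:
        if count:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def relative_frequency(term, corpus):
    """Share of all term occurrences in the corpus that are `term`."""
    terms = [t for doc in corpus for t in doc]
    return terms.count(term) / len(terms) if terms else 0.0


def domain_relevance(term, target_corpus, contrastive_corpora):
    """Simplified relevance: target frequency divided by the highest
    frequency observed in the target or any contrastive corpus.

    Assumption: this ratio stands in for the measure named in the text;
    it is 1.0 when the term is most frequent in the target domain.
    """
    target = relative_frequency(term, target_corpus)
    best = max([target] + [relative_frequency(term, c) for c in contrastive_corpora])
    return target / best if best else 0.0
```

Under these assumptions, a candidate term would be kept only if both scores
exceed thresholds chosen for the corpus at hand; the thresholds themselves
are not stated in the description.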

NEW: TermExtractor now allows a group of users to
validate an extracted terminology. See the news at

Francesco Sclano
home page:
skype: francesco978