Wednesday, April 30, 2008

DiSCo '08: The 1st International Workshop on Data Semantics in Social Computing Systems

DiSCo '08 is a new workshop that focuses on data management research in the context of social computing and data semantics. It is another attempt to bridge the gap between the Semantic Web and Web 2.0.

I am one of the PC co-chairs of the workshop, along with James Caverlee from Texas A&M University and Ying Ding from the University of Innsbruck. We are also honored to have Stefan Decker from the National University of Ireland and Ling Liu from the Georgia Institute of Technology as the general chairs of the workshop.

The workshop is affiliated with OTM 2008, which will be held in Monterrey, Mexico, from November 9 to November 14, 2008.

The paper submission deadline is June 30, 2008 (two months from today), and you are more than welcome to submit your papers to the workshop. You can find the paper submission guidelines here. I look forward to seeing your work!

Tuesday, April 29, 2008

Microsoft Windows: more than an operating system

(revised May 3, 2008)

Recent reports (such as this one) describe the decline of Microsoft Windows in the operating system (OS) market. How should Microsoft react to this slide? There are several options: (1) accelerate the improvement of Windows Vista, (2) speed up the migration to Windows 7, (3) buy Yahoo, or (4), as I am going to suggest in this post, explore the potential of Windows as something more than an operating system. In particular, I devote this post to the Microsoft Live Search team, because Live Search might become a central piece of the new Windows, as I suggest below.

Windows, losing the momentum to keep going?

The progress of Windows seems to have slowed down in recent years. I once asked myself whether I should upgrade the Windows XP installed on my laptop to Windows Vista. In the end, I decided to wait for a while because I could not see any urgency in the upgrade. Now it seems that this decision was a smart one. Many users do not like Vista, and they would rather stay with the old Windows XP even on their new computers.

Besides this Vista story, another threat to Windows comes from a competitor of Microsoft---Apple. The growing influence of the Mac operating system has gradually become a serious problem for Microsoft. Now that operating system technology on PCs is becoming more and more mature, adopting a Mac instead of Windows becomes a more and more persuasive choice, especially since the UI design of the Mac is generally thought to be prettier than that of Windows.

Did Microsoft engineers suddenly lose their minds and create this unsatisfactory Vista product? Or are Apple engineers becoming so much smarter that they can now compete with Microsoft? Actually, neither is true. The truth is that any further improvement of the OS on the PC has become so difficult, and there is so little room left to grow in advanced OS technologies on the PC, that the version-upgrading strategy Microsoft has executed for years may come to its end if no major change of action is made. We are not actually complaining about how bad Vista is (in fact, Vista is not a bad product). The real issue is that Vista is not innovative enough to be a new version of Windows. Hence if someone really wants to experience a new operating system, switching to a Mac is probably a better option than upgrading to Vista. This is the problem.

Windows, more than an operating system!

To save Windows from collapsing, buying Yahoo is not the only option, contrary to what some experts have suggested. Although it becomes harder and harder to upgrade Windows purely as an operating system, Windows is actually more than an operating system.

The term "Windows" stands for an operating system when we talk about personal computers (PCs). But there is a question: does Windows mean something more when a PC is linked to the World Wide Web? This question is crucial.

Here are several statements about Windows and the World Wide Web.

1. Windows was born when the Web was in its infancy.
2. Windows was not designed for the Web.
3. Windows is an operating system for PCs, but it is not a Web operating system.
4. There are no Web operating systems so far, and probably there should never be one.

But none of these statements touches the question we just asked: what does Windows mean to users when a Windows-installed PC is connected to the Web?

To answer this question, we should first look at what a PC becomes when it is linked to the Web. By linking a PC to the Web, the PC automatically becomes part of the Web. Or more precisely, the storage space on the PC becomes a small portion of the gigantic space of the Web. When Windows manages the space on the PC, Windows manages part of the Web. Hence in this local environment, Windows plays the role of a web-resource operating mechanism (WebROM). This is a critical recognition of a new function of PC operating systems.

There is another effect caused by linking PCs to the Web. When PC users explore the Web through their PCs, their activities on the Web produce a local topology of the Web on the PC. If we connect all the Web nodes navigated by the users of a PC, we obtain a topology of a typical portion of the Web that reflects the interests of those users. Therefore, linking a PC to the Web not only puts the PC on the Web but also maps the Web into the PC.
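The idea of a local topology can be made concrete with a minimal sketch. It assumes, purely for illustration, that a PC records each navigation step as a (source page, target page) pair; the function name `build_local_topology`, the sample URLs, and the record format are all hypothetical and do not describe anything Windows actually does.

```python
from collections import defaultdict


def build_local_topology(visits):
    """Build a directed graph of visited pages from navigation steps.

    `visits` is a list of (source_url, target_url) pairs. A source of
    None means the user opened the page directly (typed URL, bookmark).
    """
    graph = defaultdict(set)
    for source, target in visits:
        graph[target]  # make sure every visited page appears as a node
        if source is not None:
            graph[source].add(target)
    # Return plain, sorted structures so the result is easy to inspect.
    return {node: sorted(links) for node, links in graph.items()}


# Hypothetical browsing session on one PC.
visits = [
    (None, "a.example/home"),
    ("a.example/home", "a.example/news"),
    ("a.example/news", "b.example/post"),
    ("a.example/home", "b.example/post"),
]
topology = build_local_topology(visits)
for page, links in topology.items():
    print(page, "->", links)
```

The resulting graph contains only the pages these users actually reached, which is why it is a small, closed portion of the Web rather than the Web itself.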

Based on the discussion above, a PC operating system (such as Windows) is indeed a WebROM not only of a particular Web space (the particular local PC) but also of a typical topology of Web resources that are physically stored in other places. This is why Windows is more than an operating system in the traditional sense.

I must emphasize, however, that the last statement does not suggest that Windows is a Web OS. As I discussed in an earlier post, a WebROM is fundamentally different from a WebOS. What each Windows installation manages is a topology of a very small portion of the entire Web. Such a topology is always a closed world, in contrast to the entire Web, which is always an open world. I do not believe in a generic Web OS.

Live Search, bringing new life to Windows

One Microsoft product will be crucial if Microsoft decides to explore the new Windows by expanding its ability to manage Web resources. The product is Live Search.

In Web resource operation, a central task is search. Unlike local resources stored on personal computers, Web resources are generally not under the full control of PC users. Many traditional OS issues on PCs, such as I/O device management, are not critical to the role of a WebROM. By contrast, PC users do have the right to SEARCH all Web resources even though they cannot control most of them. The resource search operation thus becomes primary.

To date, Live Search at Microsoft has followed a route that Google and Yahoo have traveled successfully. Microsoft is developing a generic, centralized search engine that is supposed to index the entire Web for users to search. Yahoo has succeeded with this strategy, and so has Google.

A problem for Microsoft in adopting this "successful experience" is that the strategy makes Microsoft forget its unique and powerful weapon, one that neither Yahoo nor Google has. The weapon is Windows. By giving up this weapon, Microsoft is just "a three-year-old kid comparing to the 12-year-old big boy Google" on Web search, as Mr. Ballmer once said. However, what might happen if the three-year-old kid decides to pick up the WMD (weapon of mass destruction) at hand? Google will be really afraid when Microsoft starts to give Windows a new interpretation for the new Web age.

In addition to the current strategy, Microsoft may consider an alternative Live Search strategy to defeat Google. In my mind, this new search strategy should follow a decentralized paradigm. Based on the network of registered Windows users, Microsoft can develop a novel social-search-style strategy by embedding Live Search into Windows (as opposed to adding a Live Search link to Web browsers; I will explain the difference at the end of the post). This new strategy will put Live Search at the center of the new Microsoft Windows.

I will not discuss more details of this new search strategy in this post since it has already gotten too long. I may write another post on this topic later, depending on my time. But I will share the details of my thoughts with Microsoft Live Search scientists and engineers next week in Redmond.

More about the Yahoo Deal

Lastly, I want to say a little more about the Microsoft-Yahoo deal. In an earlier post, I briefly argued that the deal may hurt Live Search in the long run. But certainly it is not because of Yahoo! Search that Microsoft wants to buy Yahoo.

Microsoft wants to jump into the online advertising market, and buying Yahoo may be the fastest way to get into it and obtain a decent market share immediately. Unquestionably there are sound reasons for Microsoft to take this action. The question, however, is whether the action is really the best option on the table, let alone the only option, as some analysts have argued. I think it is not.

Merging with Yahoo could be a huge burden for Microsoft. It would cost Microsoft both the time and the money needed to develop a new-age Web-resource management strategy, which is critical to next-generation Web search. After all, Yahoo was built on top of its originally successful Web search portal, and Google's success in online advertising is likewise built on top of its successful Web search platform. Designing a new-age Live Search is much more important for the future of Microsoft than merging with Yahoo. Moreover, if Windows can keep growing in strength, buying Yahoo is actually much less critical to Microsoft, as the same analysts have implicitly suggested. Let's choose Live Search and the new Windows instead of merging with Yahoo!

(The binding of Live Search and Windows does not mean that Windows forces all search traffic to live.com. Otherwise Microsoft would be sued immediately by Google and all the other search engines. The binding is actually about reinterpreting Microsoft as a provider of WebROM, with Windows as the product. Under this reinterpretation, Web search is simply another basic function of the new Windows, like other ordinary Windows operations such as creating a new file on a PC. With this change, it does not matter which search engine is set as the default on a computer. Even if Google is set as the default search engine on a PC with this new Windows installed, Microsoft may still gain a big (or probably the greatest) share of online advertising because Windows is always the default search platform, as opposed to the particular search engine. Windows becomes the manager of the Web.)

Friday, April 25, 2008

ZCubes: towards Web 3.0

A few weeks ago, I had a nice conversation with Joseph Pally, CEO of ZCubes. Joe is a very keen person and full of energy when discussing his product. He introduced the company to me and demoed some of the most recent advances in ZCubes services. I must say that ZCubes is one of the most amazing startup companies I have ever seen. The product services are fantastic.

ZCubes services basically allow a user to drag any piece of data or any service on the existing Web into a self-defined ZSpace and then seamlessly compose the dragged items into a coherent new Web page. Through this effort, ZCubes services help transform the current publisher-oriented Web into the future viewer-oriented Web.

ZCubes in a Nutshell

ZCubes services begin with the construction of ZSpaces---personal spaces that store individually dragged Web items. Users can create as many ZSpaces as they want. After users drag Web objects (each of which could be an entire Web page or a specific figure or table in a page) into a ZSpace, ZCubes services can compose the dragged items into an ordinary Web page.

When a user drags an object from an existing Web page, ZCubes produces a default ZWrapper (not an ordinary frame; thanks to Joe for correcting me) that encapsulates the dragged object. The ZCubes services can automatically detect the type of the dragged object and then apply the respective type of HTML encoding to contain the object in the frames. One unique feature of ZCubes is that all ZWrapper elements are treated as image items, so users can freely place them in a ZSpace with the mouse. This technique greatly helps ordinary users construct Web pages with creative designs.

Besides these basic functions, ZCubes has also invented an innovative Web-driven spreadsheet that is equipped with hundreds of built-in functions. During the demo, Joe showed me that the new ZCubes spreadsheet could compute better than Microsoft Excel in several situations, not to mention that this spreadsheet can be seamlessly embedded into the Web.

Remaining Issues

ZCubes is fantastic, but there is still some room for improvement.

One drawback of the current ZCubes is the lack of social effects. By using ZCubes, users obtain the freedom to re-compose the existing Web from their own perspectives. Because of this freedom, ZCubes has claimed to be a "Web-3.0" age product. A problem, however, is that the ZCubes-generated pages are more like Web-1.0 style pages than Web-2.0 pages. There is basically no community element among ZCubes-generated pages. Moreover, users have no way to automatically transport the associated Web-2.0 social tags with the dragged items, even if the items are obtained from Web-2.0 resources. In a broader sense, this latter issue belongs to the general topic of data portability. But we cannot deny that it is a flaw that ZCubes may want to resolve in the future.

Another problem of ZCubes is its user interface. ZCubes has given users a great deal of freedom to re-compose the Web. But the freedom given by ZCubes is probably too much for ordinary users to handle. In short, ZCubes provides so many functions in so many ways that they are not easy for non-professional users to learn. Better user interface design is an urgent issue for ZCubes.

The last issue I want to address is the engagement of data semantics. It is generally believed that a real "Web-3.0" product must process machine-processable semantics. In other words, the product must show its way towards the ideal Semantic Web. Twine from Radar Networks is a typical example in this category. Until now, ZCubes has not shown progress in this direction. Augmenting ZSpace to be not only a graphical container of HTML components but also an aggregator of machine-processable semantics would be a milestone for ZCubes in becoming a real "Web-3.0" age product.

Final Address

In two days, I have gone over two leading "Web-3.0" services---Twine and ZCubes. Through two different approaches, the two services focus on one common thing---enabling a better organization of online information. Both efforts support my prediction that the Web is indeed in transition from the publisher-oriented view to the viewer-oriented view. Radar Networks and ZCubes are currently at the front of this trend. The accomplishment of this transformation would be an important signal of Web 3.0.

Wednesday, April 23, 2008

Twine: the second impression

Thanks to John Munro for sending me an invitation to test Twine. As I promised earlier, I am writing a follow-up post discussing more pros and cons of this novel product. So here I am.

Review of the Goal

In my first impression of Twine, I broke the goal of Twine down into four aspects. In this test, I would like to go over the four aspects and see whether the current beta version has reached the goal.

1) Twine produces knowledge networks.

The design of Twine follows a pattern: (1) initiate a twine, (2) add items into the twine, and (3) specify tags for every item. Compared to the standard Web-2.0 design, in which each item (such as a blog post, a YouTube video, or a Flickr photo) is associated with multiple tags, Twine adds an additional layer above items that is named "twine". A twine represents a more general categorization over the individual topics of items.

We should not overlook this new level of knowledge abstraction represented by "twines". This design is a significant jump forward in knowledge representation. With the standard Web-2.0 design, all tags are equally weighted. For instance, there is no weight difference between the tags "Twine" and "Web 2.0" that I have specified for this blog post. It is, however, technically difficult for machines to mine structured information out of all these evenly weighted social tags. With the new Twine design, the topics of twines are weighted differently from the ordinary social tags that are associated with specific items. For instance, I could create a twine with the topic "Twine", add this blog post as an item to this twine, and assign the tag "Web 2.0" to the item. With this specification, "Web 2.0" becomes a feature that describes the topic "Twine", rather than the two being peer social tags of equal standing. Hence we obtain richer, hierarchical semantics for knowledge representation based on human behaviors. This type of machine-processable hierarchical information is the basis for the coming Semantic Web.
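The difference between flat tags and the twine layer can be sketched as a small data structure. This is only an illustration of the pattern described above (twine, then items, then tags); the class names `Twine` and `Item` and the method `features` are hypothetical and are not taken from Twine's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A stored resource (blog post, video, photo) with ordinary tags."""
    title: str
    tags: set = field(default_factory=set)


@dataclass
class Twine:
    """A topic that sits one level above the items it contains."""
    topic: str
    items: list = field(default_factory=list)

    def features(self):
        """Collect item tags, read as features that describe the topic."""
        found = set()
        for item in self.items:
            found |= item.tags
        return found


# Web-2.0 style: both labels are peers with no structure between them.
flat_tags = {"Twine", "Web 2.0"}

# Twine style: "Twine" is the topic, "Web 2.0" is a feature of that topic.
post = Item("Twine: the second impression", {"Web 2.0"})
twine = Twine("Twine", [post])
features = twine.features()
print(twine.topic, "is described by", features)
```

In the flat set a machine cannot tell which label is the subject and which is the description; in the nested form that relation is explicit in the structure.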

In short, this beta version has met the baseline of creating knowledge networks with Twine.

2) The knowledge networks produced by Twine are personalized.

All the stored items in Twine are placed inside a user's personal space. Unless explicitly declared public, the stored items or twines are private and can only be viewed by the owner. Nova Spivack once emphasized that "in Twine, about 50% of the content is actually NOT public but rather is personal to individuals or shared in private groups." So Twine has really been designed primarily for personalized knowledge management rather than public knowledge sharing.

In the first-impression post, I questioned whether the personalization of knowledge networks could cross the boundaries of individual networks. The current beta version has not yet provided a clear answer to this question. In the beta, it is not straightforward for users to view the topology of their individual knowledge networks. Thus it is difficult to tell precisely how different knowledge networks communicate. Radar Networks may want to resolve this issue in the future.

On the other hand, however, I would suggest that Radar Networks be careful about supporting knowledge sharing among varied knowledge networks. Without a solid reasoning mechanism underneath, knowledge sharing across network boundaries could significantly decrease the precision of semantic search. The difference between semantic search within a single knowledge network and semantic search across multiple knowledge networks is that the latter requires a much better solution for data mapping, while the former may bypass this extremely difficult issue.

3) A knowledge network in Twine allows users to find, share, and organize information.

In this beta version, it seems that Twine supports only keyword-based search. The current UI design suggests that Radar Networks is working on semantics-based complex search. But the semantic search has not yet gone live. We may expect more advanced search to be added to Twine in the future.

On the issue of knowledge organization, Radar Networks claims that Twine can automatically produce tags for newly added items by parsing their titles, content, and authors. This process is a typical semantic analysis, and thus Twine is declared to be a Semantic-Web application. A problem, however, is that "it doesn't work very well," as Marshall Kirkpatrick said a month ago. Maybe both Marshall and I expected too much from Twine at the beginning, and thus we both became a little disappointed with the test results of this beta version.

During my test, I added this blog to a twine to see how Twine suggests tags for the new item. Unfortunately, Twine suggested only two tags---blog and Yihong, where the first identifies the new item as a blog and the second is my login name at Twine. Apparently the underlying engine did nothing to parse the content of this blog. To test further, I tried adding another new item---my Web-1.0 style homepage---to see whether the original mistake was caused by a Web-2.0 input. This time, Twine again suggested only two tags---Yihong and Web-1, where the first is my login name at Twine and the second is the name I specified for the item I had just added. Again, apparently no content parsing was done. These simple tests show that there are still many bugs in this beta version.

4) Information in a knowledge network is from people who are trusted by the owners of the knowledge network.

It seems that this beta version has been equipped with a basic trust mechanism that allows the owners of twines to decide membership. Since the quality of knowledge networks is closely related to the degree of trust that has been implemented, we should expect more sophisticated trust mechanisms to be implemented in later versions of Twine. This issue might not be as critical as the others for a beta service.

In summary, this beta version has generally met the fundamental claims Radar Networks originally made. Is this Twine beta a Semantic-Web service now? Not yet. But it certainly points us in the right direction towards the Semantic Web. In my opinion, the contribution of this Twine beta lies more in an attempt to construct a semantic web than in providing the world with the first semantic-web application.

Three Suggestions to Improve Twine

There is still a long way to go before Radar Networks can make Twine "the first mainstream Semantic Web application". I have three suggestions for Radar Networks to improve the Twine service.

Above all, Twine should not look like a "del.icio.us 2.0", and indeed Twine is more than an upgraded del.icio.us. To date, however, most of the comments I have seen about Twine have described it as a new-generation bookmarking service, or in other words a new-generation del.icio.us. Although bookmarking is one way of using Twine, Twine is much more than bookmarking. Twine is designed to produce real knowledge networks.

A critical task for the Twine management team is to design a service for viewing and managing the topology of individual knowledge networks. A visual display of knowledge networks is necessary for Twine users to issue reasonable semantic search queries in the future (if Twine goes in this direction). Unlike del.icio.us, which only supports a flat structure of tags, Twine allows hierarchical knowledge specification. It is, however, not straightforward for users to issue good semantic queries unless they can have a broad view of all items bound to a particular twine. This augmentation will eventually bring Twine out of the shadow of del.icio.us.

Second, Twine needs to learn more from Squidoo, whose motto is that "everyone's an expert on something." I have held to this motto for a long time, and it is part of my vision of web evolution. More importantly, I sincerely believe that this motto fits Twine perfectly.

There is a basic question Radar Networks should ask about Twine: what does a twine really mean to its creator? Is a twine a garbage collector for its creator, as someone suggested in one of the comments on Marshall's post? Surely a garbage collector is not the original intent. The challenge, however, is how to prevent twines from becoming garbage collectors. The Squidoo motto provides an answer.

A twine should be a place where a user can demonstrate his/her expertise on something. To realize this goal and minimize attempts to use Twine as a garbage collector, Radar Networks should develop careful algorithms that let users define the scope of a twine and clean up the items inside it. With these functions, Twine can keep users engaged and really build a well-structured semantic web based on user practices.

Last but not least, data portability is an issue Radar Networks must consider for Twine. Since the primary goal of Twine is to help users organize their private knowledge space, Radar Networks should allow users to transport their twines to other preferred locations such as local PCs, and also allow users to upload their locally organized twines back to the remote servers maintained by Radar Networks. This data portability would extend the usage of Twine by making the service not only an online resource organizer but also a resource organizer that users fully control, whether online or offline.
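The kind of portability suggested here can be illustrated with a minimal round trip. The sketch assumes a twine can be represented as a plain dictionary with a topic and a list of tagged items, and uses JSON as the exchange format. The functions `export_twine` and `import_twine` and the field names are hypothetical, since Twine's real storage format is not described in this post.

```python
import json


def export_twine(twine):
    """Serialize a twine to JSON text that could be saved on a local PC."""
    return json.dumps(twine, sort_keys=True, indent=2)


def import_twine(text):
    """Load a twine back from JSON text, checking the expected fields."""
    data = json.loads(text)
    if "topic" not in data or "items" not in data:
        raise ValueError("not a twine: missing 'topic' or 'items'")
    return data


# Hypothetical twine with one tagged item.
twine = {
    "topic": "Semantic Web",
    "items": [
        {"title": "Twine: the second impression", "tags": ["Twine", "Web 3.0"]},
    ],
}

exported = export_twine(twine)     # what a user would download
restored = import_twine(exported)  # what the server would accept back
print(restored == twine)
```

A plain-text format such as JSON is used here only because it survives the trip between a remote server and a local PC without any special tooling.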

Final Address

Twine is an impressive new service. Although it is not yet a true Semantic-Web application, it is a demonstration that we are moving towards the Semantic Web. At present, Twine is still at a very early stage. There are still many pieces missing in the current beta version. But I am confident that Radar Networks is working on these missing pieces, and we can expect a much better Twine service in the future.

(You may be interested in reading the previous post on this topic, Twine: the first impression.)

Friday, April 18, 2008

Thinkers and the New Web Age

On ThinkerNet, I posted a new article that addresses the role of thinkers in the new Web age. Here are a few additional background thoughts about the post.

As we know, throughout the long history of mankind, humans have been seeking new knowledge to understand the unknown world and to invent new products. A standard (and probably the most standard) way humans have expanded knowledge is a method called "divide-and-conquer." In ancient times, there were very few branches of knowledge, such as mathematics, philosophy, and literature. With the evolution of human society, these coarse distinctions of knowledge were no longer enough for people to explore and invent knowledge. Hence gradually people came to focus specifically on niche domains such as physics, chemistry, and biology. In our modern age, these old niches have already grown into major branches of knowledge, and thus we have divided them further into more detailed niches such as molecular thermophysics and physical biology. This process of division is still going on, and the scope of new niche domains grows narrower and narrower. This is the so-called "divide-and-conquer" method: when we do not know the details of a niche and want to understand it, we develop a new branch of science to conquer it.

Although this "divide-and-conquer" paradigm has led to the flourishing of modern science, it has its negative impacts---we need aggregation beyond division. Nowadays, the branches of science have become so narrow and specialized that our PhDs might no longer deserve to be called "Doctor of Philosophy." How many new PhDs (even graduates of MIT or Stanford) really have a unique philosophy in mind besides knowledge of their professional realms? Ours is a pitiful age that has no "new Einstein". This process of division can continue to raise scientists who are good at their professional areas. But it is harmful to raising thinkers who can aggregate knowledge in creative new ways. As Lee Smolin pointed out in his article, this type of creativity is hardly appreciated by mainstream academia, and thus fewer real thinkers can emerge.

Fortunately, the evolution of the World Wide Web opens a new door for thinkers. With the Web, new-age thinkers can emerge without the recognition of academia. People all over the world will vote for real creativity. At the level of thinking, we can trust that the collective wisdom of crowds will always beat the biased view from the ivory tower. Hence new-age thinkers will rise from the grass roots instead of the ivory tower. Let's all wait and watch.