Monday, December 08, 2008

How to construct a high-quality ontology?

More and more people are starting to understand and use the term "ontology" when discussing machine intelligence. Informally, an ontology is a specified body of human intelligence that machines can run automatically. How to build high-quality ontologies, however, remains a hard problem.

In general, an ontology construction procedure has four steps:

• Identify purpose
• Encode content
• Evaluate construction
• Document product

Among the four steps, identifying the purpose is the first, the most crucial, and the hardest. We often have dozens (if not hundreds) of ways to describe a domain. Many times, however, only one of them is the best fit for a designated purpose. Unless we identify the purpose precisely at the beginning, we often end up with low-quality ontologies that cause much trouble in the long term.

In principle, there are five ways to identify the purpose of an ontology construction assignment.

(1) Decide goal. We obtain a precise description of the target domain, from which we capture the key entities and relationship sets. This is a typical middle-out ontology construction methodology.

(2) Describe scenario. We obtain a set of competency questions about the application scenario. From the scenario descriptions we define the entities and relationship sets. By applying these entities and relationship sets we should be able to precisely express the questions as well as the potential answers to the questions. This is also a middle-out ontology construction methodology.
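As a rough illustration of the competency-question check, the following Python sketch tests whether each question can be phrased with the entities and relationships defined so far. The `Ontology` class, the `missing_terms` helper, and the library-domain terms are all hypothetical names made up for this example, and the mapping from a question to its required terms is assumed to be done by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """A toy ontology: entity names plus (domain, relation, range) triples."""
    entities: set = field(default_factory=set)
    relations: set = field(default_factory=set)

    def missing_terms(self, question_terms):
        """Return the terms a question needs that the ontology does not define."""
        vocabulary = self.entities | {name for _, name, _ in self.relations}
        return sorted(set(question_terms) - vocabulary)


# Hypothetical draft ontology for a small library scenario.
library = Ontology(
    entities={"Book", "Author", "Publisher"},
    relations={("Book", "writtenBy", "Author")},
)

# Each competency question is paired with the terms needed to express it.
competency_questions = {
    "Who wrote a given book?": ["Book", "writtenBy", "Author"],
    "Who published a given book?": ["Book", "publishedBy", "Publisher"],
}

# An empty list means the question is already expressible.
gaps = {q: library.missing_terms(terms) for q, terms in competency_questions.items()}
```

Any question with a non-empty gap list points at an entity or relationship that the draft still has to define before it serves the scenario.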

(3) Narrow down domain description. Starting with a general and broad description of the target domain, we gradually winnow the unrelated portions out of the scope until the remainder becomes satisfactory. This is a typical top-down methodology.
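One way to picture the narrowing step is as pruning a broad concept hierarchy. The sketch below assumes the domain description is held as a plain parent-to-children dictionary; the `prune` function and the concept names are hypothetical and only meant to show the idea.

```python
def prune(hierarchy, out_of_scope):
    """Return a copy of the hierarchy without the given concepts and their descendants."""
    removed = set()
    stack = list(out_of_scope)
    while stack:
        concept = stack.pop()
        if concept in removed:
            continue
        removed.add(concept)
        # Descendants of an out-of-scope concept are out of scope as well.
        stack.extend(hierarchy.get(concept, []))
    return {
        parent: [child for child in children if child not in removed]
        for parent, children in hierarchy.items()
        if parent not in removed
    }


# Hypothetical broad description of the domain.
broad = {
    "Thing": ["Publication", "Person", "Vehicle"],
    "Publication": ["Book", "Journal"],
    "Person": ["Author"],
    "Vehicle": ["Car", "Truck"],
}

# Vehicles are unrelated to a library-oriented purpose, so they are winnowed out.
narrowed = prune(broad, ["Vehicle"])
```

In practice the pruning is repeated, one unrelated portion at a time, until what remains matches the purpose.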

(4) Determine typical activity. This methodology is particularly useful for creating service-oriented ontologies. In this case, ontologies are created not for domain description but for application presentation. Thus, we need detailed descriptions of the service activities to build the right ontologies. This methodology is middle-out.

(5) Grow seeds. To produce an ontology for a broad domain, sometimes it is easier to start by building a few small ontologies as seeds. Then we can gradually grow these seed ontologies by adding more concepts and relationships to them and connecting them when appropriate, until we reach the final goal. This is a typical bottom-up approach to ontology construction. Executing the method properly demands a good understanding of many subtle ontology technologies such as modular ontologies and ontology reuse.
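The seed-growing idea can be sketched in a few lines of Python. This is a minimal illustration that assumes each seed ontology is just a set of (subject, relation, object) triples; the `grow` and `connect` functions and all the terms are hypothetical, and real modular ontologies involve much more (namespaces, imports, consistency checking).

```python
def concepts(triples):
    """Collect every concept that appears as a subject or an object."""
    return {s for s, _, _ in triples} | {o for _, _, o in triples}


def grow(seed, new_triples):
    """Return the seed extended with additional triples."""
    return seed | set(new_triples)


def connect(seed_a, seed_b, bridges):
    """Merge two seeds, adding bridge triples that link their concepts."""
    known = concepts(seed_a) | concepts(seed_b)
    for subject, _, obj in bridges:
        if subject not in known or obj not in known:
            raise ValueError(f"bridge uses an unknown concept: {subject} or {obj}")
    return seed_a | seed_b | set(bridges)


# Two small hypothetical seeds, built independently.
books = {("Book", "writtenBy", "Author")}
people = {("Author", "memberOf", "Organization")}

# Grow one seed, then connect the two through a shared notion.
books = grow(books, [("Book", "publishedBy", "Publisher")])
merged = connect(books, people, [("Publisher", "isA", "Organization")])
```

Rejecting bridges that mention unknown concepts is one simple guard against connecting seeds before they have grown enough to meet.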

In practice, ontology developers need to choose the method of purpose identification carefully according to their particular assignment. A bad choice often leads to longer development time and poorer results. This first step is indeed tough but truly crucial for creating high-quality ontologies.
