Sunday, January 25, 2009

Learnable and Unteachable

In this world, there are many things we can learn and a few we cannot. What we cannot learn, we know through our genes. Take, for instance, the concepts "I" and "you". Newborns generally grasp them very well right after birth. When babies pull a thing towards themselves, they mean "I"; when they start crying on seeing or missing you, they are calling "you" (which is different from "he" or "she"). Babies do not learn these concepts; the knowledge is born with them. On the other hand, for babies who do not know the distinction between "you" and "I" when they are born (those born with a mental disability), it is impossible to teach them the concepts afterward. Concepts such as "you" and "I" (not the words representing them) are unteachable. It seems that they are part of the hardware (i.e., our body) rather than the software (i.e., the knowledge we learn after birth).

This distinction raises interesting questions for machine learning and the construction of the Semantic Web. An essential question is: which semantics must be pre-coded inside machines (i.e., realized through design), and which semantics can be specified or modified gradually by the machines themselves (i.e., realized through evolution)? If the eventual Semantic Web is to be a place where we can issue orders to machines, we have to think seriously about this basic issue.

In the following I try to give a formal specification of learnable semantics and unteachable semantics. Any comments on the issue are highly welcome.

Semantics that can only be expressed in the default setting of a machine-processing system, and can never be taken as an input to that system, is unteachable to the system. Any semantics that is not unteachable is learnable to the system. There exists certain semantics that is "unteachable" to every machine-processing system.
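To make the definition a little more concrete, here is a minimal Python sketch of one possible reading of it. Everything in it is an assumption made for illustration: the `ToySystem` class, its method names, and the choice of "I" and "you" as the built-in concepts are hypothetical and not part of any real system. Built-in concepts live only in the default setting and are refused as input; any other concept can be taken as input, as long as it is defined in terms of concepts the system already knows.

```python
class ToySystem:
    """Hypothetical machine-processing system, for illustration only."""

    def __init__(self):
        # Default setting: semantics fixed by design (assumed primitives).
        self._builtin = {
            "I": lambda ctx: ctx["speaker"],
            "you": lambda ctx: ctx["addressee"],
        }
        # Semantics accepted as input after the system is built.
        self._learned = {}

    def knows(self, name):
        return name in self._builtin or name in self._learned

    def is_unteachable(self, name):
        # In this toy model, exactly the built-in concepts are unteachable.
        return name in self._builtin

    def teach(self, name, parts):
        """Take a new concept as input, defined by concepts already known."""
        if name in self._builtin:
            raise ValueError(f"{name!r} is built in and cannot be taken as input")
        unknown = [p for p in parts if not self.knows(p)]
        if unknown:
            raise ValueError(f"cannot ground {name!r}: unknown concepts {unknown}")
        self._learned[name] = list(parts)

    def resolve(self, name, ctx):
        """Return the referents of a concept in a given context."""
        if name in self._builtin:
            return [self._builtin[name](ctx)]
        referents = []
        for part in self._learned[name]:
            referents.extend(self.resolve(part, ctx))
        return referents


system = ToySystem()
system.teach("we", ["I", "you"])  # learnable: grounded in built-in concepts
print(system.resolve("we", {"speaker": "Alice", "addressee": "Bob"}))
print(system.is_unteachable("I"), system.is_unteachable("we"))
```

In this sketch, "we" is learnable because it can be handed to the system as input, while "I" and "you" exist only because the designer put them there. A concept that refers to something the system has no built-in grounding for cannot be taught at all.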

By the way, Happy Chinese New Year!


Test Information Space said...

Yihong: These are good points. We seek the nature of a true teacher. It is hard to abandon our intrinsic assumptions. So far, relatively short human life-spans have paradoxically limited individual progress yet allowed social progress (since a particular mind-set is not enforced long-term). It seems likely that something will eventually process data, as a form of learning, much more rapidly and deeply than classic humans. Machine logic is a form of design semantics, e.g. one that handles power, signals, timing, alarms, instructions, and so forth. These can vary within, or translate across, domains, e.g. hardware/software/firmware, analog/digital/network, spatial/temporal/frequency/thermal/electrical/mechanical/chemical, and so on. Counting is trickier when biological and energy model hybrids are added, so fuzzy pronouns become a fluid dynamic that can fix on a state solution for only as long as it is useful or perceived, and then follow the semantics for instantaneous evolution.

Yihong Ding said...


Thank you for the comment. Yes, because of the limits of human life, we have not yet experienced how much knowledge might grow if a mind's computation could continue beyond that scale. However, the invention of the computer and the evolution of the World Wide Web offer us such a possibility. If we can invent a mind network that stores our minds externally, so that they can continue their computation after the death of the original host, we may witness a fascinating future of knowledge explosion.

On the other hand, there are still some fundamental things we must carefully address before this exciting picture can be realized. For instance, the implementation of unteachable semantics is a critical issue. A computation of semantics/mind can only be productive if the necessary unteachable mind has somehow been implemented in the procedure, even though such semantics may be impossible to express descriptively. This could be a very subtle task, but it might be essential to the success of our goal.

Some things we may expect machines to learn. But some things machines cannot learn unless we tell them. Successfully implementing unteachable semantics is an essential requirement for the success of the Semantic Web.