C. H. Bennett, P. Gács, M. Li, P. M. B. Vitányi, and W. H. Zurek. Information distance. *IEEE Transactions on Information Theory*, 44(4):1407–1423, July 1998. [url]

——————————-

This article contains the definitions of the *Information Distance* marker, as long as other related concept such as *minimal/maximal correlation* and *Kolmogrov complexity*. This ID is defined as the minimal quantity of information sufficient to translate between x and y, generating either string effectively from the other.

The algorithmic information distance is defined in the same way as the the shortest binary program thatcommputes x from yas well as computing y from x. Being the shortest, such program should take advantage of any redoundancy between the information required to go from x to y and the information required to go from y to x.

Finally another intersting definition is that of the cognitive distance. The author refers to metrics as the Humming or the Euclidean distance. However, the author stress the fact that a small change of bits in the binary represenation of two pictures, for intance, reflects on a higher euclidean distance. On the contrary the manipulated pictures still will look the same. So they came our with another metrics that should take into account patter similarities. They assume that similarities between objects can be represented by effectively computable functions of binary strings.

Tags: algorithmic information theory, pattern recognition, information metric, text data mining, universal cognitive distance