Mining Knowledge in Geographical Data

K. Koperski, J. Han, and J. Adhikary. Mining knowledge in geographical data. Communications of the ACM, pages 1–8, 1998. [url]


Data mining represents the confluence of several research fields, including machine learning, database systems, data visualization, statistics, and information theory.

The authors define spatial data mining as the extraction of implicit knowledge, spatial relationships, or other patterns not explicitly stored in spatial databases. Spatial data differ from ordinary relational data in that they carry topological information and are usually organised by multidimensional spatial indexing structures.

The authors introduce several methods for knowledge mining in geographical data. The first is generalization-based mining, which abstracts data from a low concept level to higher, more general concepts. This method can be implemented following two philosophies, spatial-data-dominant generalization and non-spatial-data-dominant generalization, which differ in the importance given to the spatial dimension of the data.
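
The difference between the two philosophies can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the paper: the station names, the station-to-region hierarchy, the temperature cut-offs, and the function names. The spatial-data-dominant variant climbs the spatial hierarchy first and then summarises the attribute; the non-spatial-data-dominant variant generalises the attribute first and then collects the locations that share each concept.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical spatial concept hierarchy: weather station -> region.
STATION_TO_REGION = {
    "s1": "north", "s2": "north",
    "s3": "south", "s4": "south",
}

# Hypothetical low-level observations: (station, temperature in Celsius).
OBSERVATIONS = [("s1", 4.0), ("s2", 6.0), ("s3", 21.0), ("s4", 25.0)]


def temperature_concept(celsius):
    """Map a numeric temperature to a higher-level concept (assumed cut-offs)."""
    if celsius < 10:
        return "cold"
    if celsius < 20:
        return "mild"
    return "hot"


def generalize_spatial_dominant(observations, hierarchy):
    """Group by the spatial hierarchy first, then summarise the attribute."""
    by_region = defaultdict(list)
    for station, temp in observations:
        by_region[hierarchy[station]].append(temp)
    return {region: temperature_concept(mean(temps))
            for region, temps in by_region.items()}


def generalize_nonspatial_dominant(observations):
    """Generalise the attribute first, then collect locations per concept."""
    by_concept = defaultdict(set)
    for station, temp in observations:
        by_concept[temperature_concept(temp)].add(station)
    return dict(by_concept)


print(generalize_spatial_dominant(OBSERVATIONS, STATION_TO_REGION))
print(generalize_nonspatial_dominant(OBSERVATIONS))
```

In the first result the regions are fixed in advance and each one receives a description; in the second the descriptions are fixed and the regions emerge from the data.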

Another methodology introduced is clustering: the identification of clusters, or densely populated regions, according to some distance measure, in a large multi-dimensional data set.
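
As a rough illustration of distance-based clustering, the sketch below groups points into connected components in which every point lies within a given gap of some other member. This is a deliberately simple stand-in and not the algorithm used by the authors; the function name `cluster_by_distance`, the `max_gap` parameter, and the sample points are assumptions.

```python
from math import dist


def cluster_by_distance(points, max_gap):
    """Group point indices into clusters of points chained within max_gap."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            # Unvisited points close enough to the current one join the cluster.
            near = [i for i in unvisited
                    if dist(points[current], points[i]) <= max_gap]
            for i in near:
                unvisited.remove(i)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(sorted(cluster))
    return sorted(clusters)


# Hypothetical sample: two well-separated groups of 2-D points.
POINTS = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
print(cluster_by_distance(POINTS, max_gap=1.5))
```

The quadratic neighbour search is acceptable for a small example; on a large data set one would rely on a spatial index to find nearby points.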

A third methodology is the exploration of spatial associations, that is, rules that associate one or more spatial objects with other spatial objects. A threshold can be applied to control which associations are filtered out.
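
A minimal sketch of threshold-based filtering follows. It assumes the thresholds are the usual minimum support and minimum confidence of association rule mining, which is an assumption of this example and not stated above; the town data, the predicate names, and the function `mine_rules` are likewise hypothetical. Only rules with a single predicate on each side are considered.

```python
from itertools import permutations

# Hypothetical data: spatial predicates that hold for each town.
TOWNS = {
    "a": {"close_to_water", "close_to_highway"},
    "b": {"close_to_water", "close_to_highway"},
    "c": {"close_to_water"},
    "d": {"close_to_highway"},
}


def mine_rules(objects, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) for rules passing both thresholds."""
    n = len(objects)
    predicates = sorted(set().union(*objects.values()))
    rules = []
    for lhs, rhs in permutations(predicates, 2):
        lhs_count = sum(1 for preds in objects.values() if lhs in preds)
        both_count = sum(1 for preds in objects.values()
                         if lhs in preds and rhs in preds)
        support = both_count / n                 # share of objects with both
        confidence = both_count / lhs_count if lhs_count else 0.0
        if support >= min_support and confidence >= min_confidence:
            rules.append((lhs, rhs, support, confidence))
    return rules


for lhs, rhs, support, confidence in mine_rules(TOWNS, 0.5, 0.6):
    print(f"{lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

Raising either threshold shrinks the rule set, which is how the filtering of weak associations is controlled in this sketch.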

