Efficient and Effective Clustering Methods for Spatial Data Mining

R. T. Ng and J. Han. Efficient and effective clustering methods for spatial data mining. In Proceedings of the 1994 International Conference on Very Large Data Bases (Sept. 12-15, Santiago, Chile), pages 144–155. Morgan Kaufmann, San Francisco, CA, 1994. [url]


This article describes CLARANS, a clustering algorithm for spatial databases that is based on randomised search. The problem with most of the methods used so far is that they require a priori knowledge to be initialised. The authors' approach, by contrast, starts from scratch.

The method builds on the CLARA algorithm of Kaufman and Rousseeuw, which the authors extend with support for randomised search. The method first finds k_nat, the most natural number of clusters. It then assigns objects to the clusters and finds the medoids.
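To make the randomised search concrete, here is a minimal sketch in Python of a CLARANS-style medoid search. It is an illustration under assumptions, not the authors' implementation: the function names, the Euclidean default distance, and the default values of `numlocal` (number of restarts) and `maxneighbor` (failed random swaps tolerated before declaring a local minimum) are chosen here for the example, and the number of clusters `k` is taken as given rather than derived as k_nat.

```python
import random


def total_cost(points, medoids, dist):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def clarans(points, k, numlocal=2, maxneighbor=50, dist=None, seed=None):
    """CLARANS-style randomised search for k medoids (requires k < len(points)).

    Returns (medoid indices, cluster label per point, total cost).
    """
    rng = random.Random(seed)
    if dist is None:
        # Assumption: Euclidean distance on coordinate tuples.
        dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    n = len(points)
    best_medoids, best_cost = None, float("inf")

    for _ in range(numlocal):
        # Start each local search from a random set of k medoids.
        current = rng.sample(range(n), k)
        current_cost = total_cost(points, current, dist)
        failed = 0
        while failed < maxneighbor:
            # A neighbour differs from the current set in exactly one medoid.
            neighbour = current[:]
            swap_in = rng.choice([i for i in range(n) if i not in current])
            neighbour[rng.randrange(k)] = swap_in
            cost = total_cost(points, neighbour, dist)
            if cost < current_cost:
                # Move to the better neighbour and reset the failure counter.
                current, current_cost = neighbour, cost
                failed = 0
            else:
                failed += 1
        # Keep the best local minimum seen over all restarts.
        if current_cost < best_cost:
            best_medoids, best_cost = current, current_cost

    # Assign every object to the cluster of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: dist(p, points[best_medoids[j]]))
        for p in points
    ]
    return best_medoids, labels, best_cost


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    medoids, labels, cost = clarans(data, k=2, maxneighbor=200, seed=0)
    print(medoids, labels, cost)
```

The sketch recomputes the full cost for every candidate swap, which keeps it short but is slower than an incremental cost update; for large data sets that would be the first thing to change.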

Two versions of the method, one spatial-dominant and one non-spatial-dominant, are tested against CLARA on a real estate data set. The results confirm the efficacy of the new method.
