Newsgroup Exploration with WEBSOM Method and Browsing Interface

T. Honkela, S. Kaski, K. Lagus, and T. Kohonen. Newsgroup exploration with WEBSOM method and browsing interface. Technical Report A32, Helsinki University of Technology, Laboratory of Computer and Information Science, Rakentajanaukio 2 C, SF-02150 Espoo, Finland, 1996. [url]


This paper presents the application of the SOM (self-organising map) method to the exploration of a large number of Usenet messages. The SOM, introduced by Kohonen in 1982, is a means of automatically arranging high-dimensional statistical data so that similar inputs are, in general, mapped close to each other.
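As a rough illustration of how a SOM arranges similar inputs near each other, here is a minimal sketch in plain Python. It is not the implementation used in the report: the function names (train_som, best_matching_unit), the rectangular grid, the Gaussian neighbourhood, and the linear decay of learning rate and radius are all assumptions chosen for brevity.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the unit whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda unit: sum((w - xi) ** 2 for w, xi in zip(weights[unit], x)),
    )


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a small rows x cols map; returns {(row, col): weight vector}.

    Hypothetical helper: the parameter names and decay schedule are assumptions.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = max(rows, cols) / 2.0
    # Every map unit starts with a random weight vector.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        samples = list(data)
        rng.shuffle(samples)
        for x in samples:
            # Learning rate and neighbourhood radius shrink as training proceeds.
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)
            radius = max(radius0 * (1.0 - frac), 0.5)
            bmu = best_matching_unit(weights, x)
            # Pull the winner and its grid neighbours towards the input.
            for unit, w in weights.items():
                grid_d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * radius * radius))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


# Toy data: two loose groups of 3-dimensional vectors.
data = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [0.9, 1.0, 0.9], [1.0, 0.9, 1.0]]
som = train_som(data, rows=3, cols=3)
for x in data:
    print(x, "->", best_matching_unit(som, x))
```

Because neighbouring units are updated together, inputs that resemble each other tend to end up on the same or nearby units, which is the property the paper relies on.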

The general idea is that words are first organised into categories on a word category map; the documents can then be encoded in a way that explicitly expresses the similarity of word meanings.

This word category map is a self-organising semantic map, as defined by Ritter and Kohonen, 1989, that describes relations between words based on their averaged short contexts. The SOM is an unsupervised method, although a trained map can be calibrated with labels afterwards. Each document is encoded as a histogram of the word categories that its words fall into, and the document map is then formed with the SOM algorithm using these histograms as fingerprints of the documents.
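The document encoding step can be sketched as follows. This is only an illustration under stated assumptions: the word_to_unit lookup stands in for a trained word category map, the function name encode_document is hypothetical, and normalising the histogram to sum to 1 is a choice made here, not something taken from the report.

```python
from collections import Counter


def encode_document(tokens, word_to_unit, n_units):
    """Build a word-category histogram ("fingerprint") for one document.

    tokens       : list of words in the document
    word_to_unit : hypothetical lookup from word to the index of its unit
                   on the word category map
    n_units      : number of units on the word category map
    """
    # Count how often each word-category unit is hit; unknown words are skipped.
    counts = Counter(word_to_unit[t] for t in tokens if t in word_to_unit)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * n_units
    # Normalise so that documents of different lengths are comparable (assumption).
    return [counts.get(u, 0) / total for u in range(n_units)]


# Toy lookup: words with similar meaning share a unit on the word category map.
word_to_unit = {"neural": 0, "network": 0, "som": 1, "map": 1, "usenet": 2}
fingerprint = encode_document(["neural", "network", "map", "hello"], word_to_unit, 3)
print(fingerprint)
```

Two documents that use different but related words (here "neural" and "network") produce similar fingerprints, so they would tend to land close together when the fingerprints are used as inputs to the document map.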

WEBSOM Architecture

Word Category Map

