K. Rodden, W. Basalaj, D. Sinclair, and K. Wood. Does organisation by similarity assist image browsing? In Proceedings of CHI 2001, Seattle, WA, USA, March 31-April 4, 2001. Association for Computing Machinery.
The title of this paper nicely summarizes the authors’ research question. They were interested in understanding whether organizing pictures by similarity might benefit a picture retrieval task. Defining the similarity of two pictures is itself an interesting problem, and many researchers have tried to provide solutions based on image features or on the multi-modal information that might be associated with the images. The research reported in this paper was concerned with understanding whether one kind of organization might be more effective than another in helping the retrieval process.
The authors used two kinds of organization: 1) similarity of visual features; 2) similarity of text annotations. For the retrieval experiments they used information retrieval’s vector model, with binary term weighting and the cosine coefficient as the similarity measure. They also used a simulated work task situation, in which they asked graphic designers to look for sets of pictures to accompany articles in a magazine.
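To make the vector model with binary term weighting and the cosine coefficient more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the whitespace tokenization, the function names and the example captions are all my own assumptions for illustration. With binary weights each annotation reduces to the set of terms it contains, so the cosine of two annotations is the number of shared terms divided by the square root of the product of the two set sizes.

```python
import math


def binary_terms(text):
    """Return the set of distinct lowercase terms in a caption.

    Binary weighting: a term either occurs (weight 1) or does not (weight 0),
    so a set is enough to represent the vector. Splitting on whitespace is an
    assumption made for this sketch.
    """
    return set(text.lower().split())


def cosine_binary(caption_a, caption_b):
    """Cosine coefficient between two captions under binary term weighting."""
    terms_a = binary_terms(caption_a)
    terms_b = binary_terms(caption_b)
    if not terms_a or not terms_b:
        return 0.0  # an empty caption is treated as similar to nothing
    shared = len(terms_a & terms_b)
    return shared / math.sqrt(len(terms_a) * len(terms_b))


if __name__ == "__main__":
    # Hypothetical captions, not taken from the paper's picture collection.
    print(cosine_binary("red car street", "red car"))
    print(cosine_binary("red car street", "mountain lake"))
```

Under these assumptions, two captions with exactly the same terms score 1 and two captions with no terms in common score 0; everything else falls in between.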
They conducted two experiments. In the first, they tried to understand whether a text-based organization was more useful than a visual-based organization or a combination of the two. The majority of participants favoured the textual arrangement of pictures. In the second experiment, they compared, more quantitatively, a similarity-based arrangement with a random arrangement of pictures. They took the time required to complete the task as the main dependent variable and analyzed the results with a linear regression model. Participants were slower with the visual arrangement than with the random arrangement. In their analysis, the authors suggested that the visual arrangement made it easy to find the target pictures, but that placing similar pictures together sometimes caused them to appear to merge, which made them more difficult to parse.