IEEE 16th Signal Processing and Communications Applications Conference, Aydın, Türkiye, 20-22 April 2008, pp. 517-520
Methods developed for image annotation usually rely on region clustering algorithms. Visual codebooks, generated from region clusters using low-level features, are matched with words in various ways. In this work, we make the clustering more meaningful by using the words in the associated text, in addition to the image data, when clustering image regions to generate a codebook. We first compute the topic probabilities of the text document associated with each image in the training set. Next, we eliminate low-probability topics and use the highly probable ones to supervise the region clustering algorithm. To implement this supervision, we force the region clustering algorithm to assign each region to one of the clusters reserved for the high-probability topics of the associated text. Consequently, the regions in each generated cluster are not only visually closer, but are also more likely to belong to the same topic. Experimental results show that image annotation with semi-supervised clustering is more successful than existing methods. Parallel computation methods were used to implement the algorithm.
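
The supervision step described above can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes a k-means-style clustering, a fixed number of clusters reserved per topic, and a simple probability threshold for discarding low-probability topics. The names `select_topics`, `topic_constrained_kmeans`, `threshold`, and `clusters_per_topic` are hypothetical and chosen only for illustration.

```python
import numpy as np


def select_topics(topic_probs, threshold=0.3):
    """Keep topics whose probability reaches `threshold` (an assumed cutoff).

    topic_probs: (n_images, n_topics) array of per-image topic probabilities.
    The most probable topic of each image is always kept, so that every
    image has at least one admissible topic.
    """
    keep = topic_probs >= threshold
    keep[np.arange(len(topic_probs)), topic_probs.argmax(axis=1)] = True
    return keep


def topic_constrained_kmeans(features, allowed_topics, clusters_per_topic=2,
                             n_iter=20, seed=0):
    """K-means in which each region may only join clusters of its allowed topics.

    features:       (n_regions, dim) float array of low-level region features.
    allowed_topics: (n_regions, n_topics) boolean array; True where the topic
                    is a high-probability topic of the region's image text.
    """
    rng = np.random.default_rng(seed)
    n_regions, n_topics = allowed_topics.shape
    if not allowed_topics.any(axis=1).all():
        raise ValueError("every region needs at least one allowed topic")

    n_clusters = n_topics * clusters_per_topic
    # Cluster k is reserved for topic k // clusters_per_topic.
    cluster_topic = np.arange(n_clusters) // clusters_per_topic
    # mask[i, k] is True if region i may be assigned to cluster k.
    mask = allowed_topics[:, cluster_topic]

    # Initialise each centroid from a region that is allowed in that cluster.
    centroids = np.empty((n_clusters, features.shape[1]), dtype=float)
    for k in range(n_clusters):
        candidates = np.flatnonzero(mask[:, k])
        if candidates.size == 0:
            candidates = np.arange(n_regions)
        centroids[k] = features[rng.choice(candidates)]

    for _ in range(n_iter):
        # Squared Euclidean distance from every region to every centroid.
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        # Forbidden clusters get infinite distance, so they are never chosen.
        dists = np.where(mask, dists, np.inf)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            members = features[labels == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    return labels, centroids
```

In this sketch, a region inherits the admissible topics of its image (for example `allowed = select_topics(topic_probs)[region_image]`, where `region_image` maps each region to its image index), and the resulting centroids play the role of the visual codebook. Masking the distance matrix is only one way to enforce the constraint; the method described above may realise it differently.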