Archive for the ‘Developing an image retrieval’ Category

As I promised, here are some useful notes on developing a complete image retrieval engine using color and texture:

1. Image Gallery

I used a free database that has 1000 medium-sized images in 10 different categories.

2. Feature Extraction

  • Color: 3 * 64-bin histograms in the HSV color space (one 64-bin histogram for each of the H, S and V dimensions).
  • Texture: computing the co-occurrence matrix for each image and extracting the “Contrast, Correlation, Energy and Homogeneity” of the texture. Results show that, ranked by importance, these features are: Correlation, Homogeneity, Contrast and Energy.

Some custom blocking methods are used to extract both features, so that the main parts of the image have a higher impact and importance.
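The two feature types listed above can be sketched roughly as follows. This is only a minimal illustration in plain Python, not my actual code: the function names, the 8 gray levels and the one-pixel horizontal offset for the co-occurrence matrix are assumptions, and the blocking step is left out entirely.

```python
import colorsys
import math


def hsv_histograms(pixels, bins=64):
    """Return three normalized histograms (H, S, V), each with `bins` bins.

    `pixels` is an iterable of (r, g, b) tuples with 0-255 channel values.
    """
    hists = [[0.0] * bins for _ in range(3)]
    count = 0
    for r, g, b in pixels:
        hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        for channel, value in enumerate(hsv):
            # Values lie in [0, 1]; a value of exactly 1.0 goes to the last bin.
            hists[channel][min(int(value * bins), bins - 1)] += 1.0
        count += 1
    if count == 0:
        return hists
    return [[v / count for v in h] for h in hists]


def glcm_features(gray, levels=8, dx=1, dy=0):
    """Contrast, correlation, energy and homogeneity of a co-occurrence matrix.

    `gray` is a 2-D list of integers in range(levels); (dx, dy) is the
    pixel offset used to pair neighbors (assumed: one pixel to the right).
    """
    rows, cols = len(gray), len(gray[0])
    glcm = [[0.0] * levels for _ in range(levels)]
    total = 0
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                glcm[gray[y][x]][gray[ny][nx]] += 1.0
                total += 1
    if total == 0:
        raise ValueError("image too small for the chosen offset")
    p = [[v / total for v in row] for row in glcm]
    cells = [(i, j) for i in range(levels) for j in range(levels)]
    mu_i = sum(i * p[i][j] for i, j in cells)
    mu_j = sum(j * p[i][j] for i, j in cells)
    var_i = sum((i - mu_i) ** 2 * p[i][j] for i, j in cells)
    var_j = sum((j - mu_j) ** 2 * p[i][j] for i, j in cells)
    cov = sum((i - mu_i) * (j - mu_j) * p[i][j] for i, j in cells)
    denom = math.sqrt(var_i * var_j)
    return {
        "contrast": sum((i - j) ** 2 * p[i][j] for i, j in cells),
        # Correlation is undefined for a constant texture; 0.0 is used here.
        "correlation": cov / denom if denom > 0 else 0.0,
        "energy": sum(p[i][j] ** 2 for i, j in cells),
        "homogeneity": sum(p[i][j] / (1 + abs(i - j)) for i, j in cells),
    }
```

A real image would first have to be converted to a pixel list (for the histograms) and to a quantized gray-level grid (for the texture features); how that is done depends on the image library in use.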

3. Clustering

I’ve used the k-means algorithm to partition my feature space into 7 clusters, corresponding to the 7 feature vectors mentioned previously. But the main criterion for the decision is the histogram clusters.
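For readers who have not used k-means before, here is a small sketch of the standard (Lloyd's) algorithm in plain Python. It is an illustration only: the function names, the random initialization and the iteration limit are assumptions, and it says nothing about how the feature vectors were actually arranged in my experiments.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=50, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct input vectors chosen at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return centroids, labels
```

Calling `kmeans(histogram_vectors, 7)` would give 7 centroids plus a cluster label for each image, where `histogram_vectors` is a hypothetical list holding one histogram vector per image.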

I am not satisfied with this method, so I am looking for a better way to cluster my feature space. I found some articles that address this issue:

Thomas Deselaers, et al. “Clustering visually similar images to improve image search engines”, …

Ioan Cleju, et al. “Clustering by principal curve with Tree Structure”, …

Xin Zheng, et al. “Locality Preserving Clustering for Image Database”, …

4. Similarity Measure

I’ve used the Minkowski distance of order 1 for comparing histograms and order 3 for the texture-related features.

(For the Minkowski distance formula, refer to: Long F. ; Zhang H. and Dagan Feng D., “Fundamentals of content-based image retrieval,” in Multimedia Information Retrieval and Management – Technological Fundamentals and Applications, Springer-Verlag, pp. 1-26, 2003)
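As a quick reminder, the Minkowski distance of order p between two vectors is the p-th root of the sum of the absolute coordinate differences raised to the power p. A minimal sketch (the function name is my own choice for illustration):

```python
def minkowski_distance(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)
```

With p = 1 this is the city-block distance (used here for the histograms), with p = 2 the Euclidean distance, and p = 3 is what I used for the texture features.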

I’ve used the inverse of the calculated distance to find the similarity rank of each image. These ranks should be added together to calculate the final rank of each image. At this point, IT SHOULD NOT BE FORGOTTEN TO NORMALIZE EACH RANK. Since each feature has a different importance, each rank should be multiplied by a coefficient that corresponds to the importance of its feature.

To learn how to obtain a normalized rank, I refer you to this paper:

Li X. ; Chen S.C. ; M.L. Shyu and Furht B., “Image retrieval by Color, Texture, and Spatial Information,” in 8th International Conference on Distributed Multimedia Systems (DMS’2002), San Francisco Bay, California, USA, 2002, pp. 152-159.
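To make the idea of normalized, weighted ranks more concrete, here is a rough sketch. Please note the assumptions: it uses 1 / (1 + d) as the inverse of the distance (to avoid dividing by zero) and simple min-max scaling as the normalization. It is not necessarily the normalization described in the paper above, and the weights are placeholders.

```python
def similarity_from_distance(distances):
    """Turn distances into similarities; assumed form: 1 / (1 + d)."""
    return [1.0 / (1.0 + d) for d in distances]


def min_max_normalize(values):
    """Scale values to [0, 1]; assumed normalization, not the paper's."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def combined_rank(per_feature_distances, weights):
    """Weighted sum of normalized per-feature similarities.

    `per_feature_distances` maps a feature name to a list of distances
    (one per database image); `weights` maps the same names to coefficients.
    """
    n = len(next(iter(per_feature_distances.values())))
    totals = [0.0] * n
    for feature, distances in per_feature_distances.items():
        normalized = min_max_normalize(similarity_from_distance(distances))
        for i, score in enumerate(normalized):
            totals[i] += weights[feature] * score
    return totals
```

Without the normalization step, a feature whose distances happen to be numerically small would dominate the sum regardless of its coefficient.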

5. Final Result

To find the images in the database that are similar to a user-supplied image, first of all the FV (= feature vector) should be extracted in the same way as for the other images in the database. Since I use the 256-element histogram vector to partition the image database, the histogram part of the FV is used to find the respective cluster. After this step, the comparable images are limited to the images that belong to that cluster. Now, by applying the similarity measure, finding the individual ranks and then adding them in the way described in the previous section, a similarity rank is assigned to every comparable image. The last thing to do is to sort these ranks in descending order and show the n highest-ranked images to the user.
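The whole query procedure described above can be sketched as follows. Again, this is a simplified illustration under assumptions: the dictionary keys, the function names, the equal default weights, the 1 / (1 + d) similarity and the min-max normalization are all placeholders, not my exact implementation.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def normalize(values):
    """Min-max scaling to [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def retrieve(query, database, centroids, labels, n=5, w_hist=1.0, w_tex=1.0):
    """Return the names of the n most similar images in the query's cluster.

    `query` and each database entry are dicts with "histogram" and "texture"
    vectors (entries also have a "name"); `centroids` live in histogram space
    and `labels[i]` is the cluster index of `database[i]`.
    """
    # 1. Use the histogram part of the FV to find the nearest cluster.
    cluster = min(
        range(len(centroids)),
        key=lambda c: minkowski(query["histogram"], centroids[c], 2),
    )
    # 2. Only images of that cluster remain comparable.
    candidates = [img for img, lab in zip(database, labels) if lab == cluster]
    if not candidates:
        return []
    # 3. Per-feature similarities: order 1 for histograms, order 3 for texture.
    hist_sim = normalize([
        1.0 / (1.0 + minkowski(query["histogram"], img["histogram"], 1))
        for img in candidates
    ])
    tex_sim = normalize([
        1.0 / (1.0 + minkowski(query["texture"], img["texture"], 3))
        for img in candidates
    ])
    # 4. Weighted sum, then sort in descending order and keep the top n.
    scored = [
        (w_hist * h + w_tex * t, img["name"])
        for h, t, img in zip(hist_sim, tex_sim, candidates)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored[:n]]
```

Because only one cluster is searched, a query that falls near a cluster boundary can miss good matches in the neighboring cluster; this is one of the reasons I am looking for a better clustering method.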

Note that each phase can be improved.

