Abstract
With the growing use of multimedia such as images and videos in industry and in daily life, image retrieval has become a vital technology for helping users consume valuable multimedia resources effectively and efficiently; browsing or searching a large image collection, for example, is not easy. Content-based image retrieval, which relies on low-level visual features, has achieved limited success in multimedia asset management and rapid information retrieval. However, humans normally access multimedia assets by semantic concepts. A significant semantic gap exists between the low-level visual features processed by machines and the semantic concepts interpreted by humans, and it is generally understood that the problem of image retrieval is still far from being solved. As the literature indicates, image semantic analysis and visualisation are well-known research areas that aim to overcome this gap and to enhance the capability of content-based image retrieval systems.
This thesis proposes an approach to intelligent image collection navigation and semantic analysis that bridges the gap between visual features and semantics. Several MPEG-7 colour and texture descriptors based on global and local visual features are selected as multiple representations of images, as they have been intensively and successfully evaluated in many image retrieval experiments. Image semantic analysis is treated as a pattern classification problem: two types of classifiers are designed, according to the different characteristics of global and local visual features, to classify images into predefined classes. Combinations of classifiers are also investigated. Leave-one-out cross-validation is employed to evaluate performance across different visual features and combination schemes. To increase the influence of high-precision classifiers on the final classification decision, a precision-based combination rule is proposed that weights each classifier's result by its precision. For the visualisation of image collections, an intelligent image collection navigation system is developed by combining self-organising map (SOM) based image visualisation in the visual feature spaces with the semantic concepts extracted by semantic analysis.
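The idea behind a precision-based combination rule can be illustrated with a minimal sketch. It assumes that each classifier outputs a score per class and that its weight is its precision normalised over all classifiers; the function name, the input layout, and the normalisation are illustrative assumptions, not the exact formulation used in the thesis.

```python
from typing import Dict, List


def precision_weighted_combination(
    scores_per_classifier: List[Dict[str, float]],
    precisions: List[float],
) -> str:
    """Return the class with the highest precision-weighted score.

    Hypothetical sketch: each classifier contributes its per-class scores,
    scaled by its precision divided by the sum of all precisions.
    """
    if not scores_per_classifier or len(scores_per_classifier) != len(precisions):
        raise ValueError("need one precision value per classifier")
    total = sum(precisions)
    if total <= 0:
        raise ValueError("precisions must sum to a positive value")

    # Normalise precisions so that the weights sum to 1.
    weights = [p / total for p in precisions]

    # Accumulate the weighted score of every class over all classifiers.
    combined: Dict[str, float] = {}
    for weight, scores in zip(weights, scores_per_classifier):
        for label, score in scores.items():
            combined[label] = combined.get(label, 0.0) + weight * score

    return max(combined, key=combined.get)


# Illustrative values only: one classifier on global features and one on
# local features that disagree about an image.
global_scores = {"indoor": 0.70, "outdoor": 0.30}
local_scores = {"indoor": 0.35, "outdoor": 0.65}

equal = precision_weighted_combination([global_scores, local_scores], [1.0, 1.0])
weighted = precision_weighted_combination([global_scores, local_scores], [0.5, 0.9])
```

With equal weights the two classifiers average to "indoor"; once the second classifier's higher precision is taken into account, the decision shifts to "outdoor", which is the effect the rule is intended to have.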
Experiments show that the proposed approach improves the accuracy of indoor and outdoor scene classification and reveals the structure of an image collection both in the visual feature spaces and at the semantic level. With further work, the system will be able to help users develop automatic interpretations of an image collection and navigate to and access images of interest much more easily.