|dc.description.abstract||There is a growing evidence that saliency can be better modelled using top-down mechanisms that incorporate semantic or object information. This suggests a new direction for image and video analysis, where semantics extraction can be effectively utilized to improve image retrieval, video summarization, indexing and retrieval.
Novelty detection is an important functionality that has found many applications in information retrieval and processing. In this thesis, we first propose a computational framework that deals with novelty detection for multiple-scene image sets, secondly we extend the usage of semantic modelling in proposing another computational framework to facilitate video summarization.
Working with wildlife image data, the framework undergoes image segmentation, feature extraction and matching of image blocks, and then a co-occurrence matrix of semantic labels is constructed to represent the semantic context within the scene. An algorithm for outliers detection that employs multiple one-class models is proposed. An advantage of our approach is that it can be used for novelty detection and scene classification at the same time. Our experiments show that the proposed approach gives favourable performance for the task of detecting novel wildlife scenes as examples.
On the other hand, the proposed high-level semantic feature in this thesis can also facilitate key-frame extraction in video analysis task. Semantic context of video frames is extracted and its sequential changes are monitored so that significant novelties are located using our proposed one-class classifier. Experiments and user evaluations have been conducted to compare the key-frame set obtained from our approach and other conventional low-level features or techniques. Results have shown that our approach is able to produce favourable key-frame set that is semantically representative as compared with its low-level counterparts.
Therefore, this thesis opens up a new facet of semantic analysis in image processing by considering the path from local low-level feature to global high-level semantic feature for further problem solving, such as novelty detection in image collections and video summarization.||