Crowd Scene Analysis in Video Surveillance
There is an increasing interest in crowd scene analysis in video surveillance due to the ubiquitously deployed video surveillance systems in public places with high density of objects amid the increasing concern on public security and safety. A comprehensive crowd scene analysis approach is required to not only be able to recognize crowd events and detect abnormal events, but also update the innate learning model in an online, real-time fashion. To this end, a set of approaches for Crowd Event Recognition (CER) and Abnormal Event Detection (AED) are developed in this thesis. To address the problem of curse of dimensionality, we propose a video manifold learning method for crowd event analysis. A novel feature descriptor is proposed to encode regional optical flow features of video frames, where adaptive quantization and binarization of the feature code are employed to improve the discriminant ability of crowd motion patterns. Using the feature code as input, a linear dimensionality reduction algorithm that preserves both the intrinsic spatial and temporal properties is proposed, where the generated low-dimensional video manifolds are conducted for CER and AED.Moreover, we introduce a framework for AED by integrating a novel incremental and decremental One-Class Support Vector Machine (OCSVM) with a sliding buffer. It not only updates the model in an online fashion with low computational cost, but also adapts to concept drift by discarding obsolete patterns. Furthermore, the framework has been improved by introducing Multiple Incremental and Decremental Learning (MIDL), kernel fusion, and multiple target tracking, which leads to more accurate and faster AED. In addition, we develop a framework for another video content analysis task, i.e., shot boundary detection. Specifically, instead of directly assessing the pairwise difference between consecutive frames over time, we propose to evaluate a divergence measure between two OCSVM classifiers trained on two successive frame sets, which is more robust to noise and gradual transitions such as fade-in and fade-out. To speed up the processing procedure, the two OCSVM classifiers are updated online by the MIDL proposed for AED. Extensive experiments on five benchmark datasets validate the effectiveness and efficiency of our approaches in comparison with the state of the art.
Advisor: Deng, Jeremiah; Woodford, Brendon
Degree Name: Doctor of Philosophy
Degree Discipline: Information Science
Publisher: University of Otago
Research Type: Thesis