Abstract
Text detection in videos is a well-known research area. The detection of static superimposed text, such as captions, has been studied with particular success, but the resulting algorithms rely on many assumptions that call into question their applicability to moving scene text. In this dissertation, I propose a scene text area detection approach that comprises simple key frame extraction, feature extraction, feature code generation, and text area classification.
Common edge- and variance-based features of scene text areas are evaluated and combined into a "combined feature scheme". For the detection of text areas, two classifiers are compared: one using a self-organising map and one using a feed-forward neural network. Ground truth video data with different characteristics is used to compare the two neural computing methods. A combination of detection performance measures, evaluated over varying feature sets, shows the applicability of edge- and variance-based features and leads to proposed improvements to the "combined feature scheme". Car license plates serve as sample text areas in this research.