Abstract
The paper is a study on a new class of spatial-temporal evolving fuzzy neural network systems (EFuNNs) for on-line adaptive learning, and their applications for adaptive phoneme recognition. The systems evolve through incremental, hybrid (supervised / unsupervised) learning. They accommodate new input data, including new features, new classes, etc. through local element tuning. Both feature-based similarities and temporal dependencies, that are present in the input data, are learned and stored in the connections, and adjusted over time. This is an important requirement for the task of adaptive, speaker independent spoken language recognition, where new pronunciations and new accents need to be learned in an on-line, adaptive mode. Experiments with EFuNNs, and also with multi-layer perceptrons, and fuzzy neural networks (FuNNs), conducted on the whole set of New Zealand English phonemes, show the superiority and the potential of EFuNNs when used for the task. Spatial allocation of nodes and their aggregation in EFuNNs allow for similarity preserving and similarity observation within one phoneme data and across phonemes, while subtle temporal variations within one phoneme data can be learned and adjusted through temporal feedback connections. The experimental results support the claim that spatial-temporal organisation in EFuNNs can lead to a significant improvement in the recognition rate especially for the diphthong and the vowel phonemes in English, which in many cases are problematic for a system to learn and adjust in an adaptive way.