Connectionist-Based Adaptive Speech Recognition Systems
Ghobakhlou, Ali Akbar
This item is not available in full-text via OUR Archive.
Cite this item:
Ghobakhlou, A. A. (2017). Connectionist-Based Adaptive Speech Recognition Systems (Thesis, Doctor of Philosophy). University of Otago. Retrieved from http://hdl.handle.net/10523/7404
Permanent link to OUR Archive version:
http://hdl.handle.net/10523/7404
Abstract:
It has been a long-standing technical challenge to create machines that can perform human intellectual tasks such as speech processing. Speech recognition is important not only because it is the most common means of human communication, but also because in some cases, it is the most efficient way to interact with computers or other smart devices. Despite great advances over recent years in the development of Speech Recognition Systems (SRS), these systems do not come close to human recognition and thus speech recognition (in computers) remains an unsolved problem.

The main obstacle to building successful SRS in real-world environments is a lack of robustness. SRS must work for as many people as possible, and should perform well under everyday listening conditions. Differences in articulation, accents and speaking cadence combine to form one of the more pervasive speech recognition problems. The biggest challenge for SRS is a lack of adequate methods for handling intrinsic variations in speech. A key human cognitive characteristic is the ability to learn and adapt to new patterns. Thus, it is important to enable intelligent systems to learn and generalise even from single instances or limited samples of data, so that new or changed signals (e.g., accented speech, noise) could be correctly understood. It has been well demonstrated that adaptation in SRS is very beneficial.

Evolving Connectionist Systems (ECoS) are neural networks that evolve their structure through incremental adaptive learning to recognise input and/or output streams of data. The ECoS paradigm was adopted for the first time in this research in order to develop novel algorithms designed to address the problem of adapting SRS to new speakers. A case study was conducted using two sets of speakers from the TIMIT corpus; speakers of the same dialect region (intra-accent) were adopted as the baseline data and speakers of a different dialect region (inter-accent) as the adaptation data. A comparative analysis of ECoS networks against Multi-Layer Perceptron (MLP) and Fuzzy ARTMAP was undertaken. The Simple Evolving Connectionist System (SECoS) was shown to outperform the other algorithms used in this study, demonstrating high generalisation while remaining resistant to forgetting.

In order to demonstrate the generalisation and adaptivity of the SECoS networks, a further case study was undertaken using a small-vocabulary adaptive word recognition system. This was developed to control the navigation of a robot named ROKEL. The implementation facilitated on-line speaker adaptation. An adaptive connectionist method was also developed to accurately determine the boundaries of speech and non-speech segments within incoming speech signals. It allowed adaptation to environmental background noise in order to correctly determine the boundaries of spoken words.

Significant performance differences exist between noisy and clean speech data in an otherwise identical task. The effect of various noise levels and conditions on speech perception in an in-vehicle environment was investigated. It was hypothesised that filtering noisy speech should improve the performance of SRS because it improves speech intelligibility; however, an evaluation of several widely used denoising techniques showed that, in general, SECoS performance decreased on filtered speech. The ANOVA for recognition results as a function of SNR and speed indicated that both SNR levels and speed conditions have a significant effect on recognition performance. The effect of acceleration was not significant when considered independently from SNR. However, if acceleration conditions create more (engine) noise, the recognition rate would decrease due to the decreased SNR. The experimental analysis of the research work presented showed that the methods and algorithms developed throughout this thesis were viable.
Date:
2017
Degree Name:
Doctor of Philosophy
Degree Discipline:
Information Science
Publisher:
University of Otago
Keywords:
Speech Recognition Systems; TIMIT; Simple Evolving Connectionist System (SECoS); Evolving Connectionist System (ECoS); Evolving Fuzzy Neural Network (EFuNN)
Research Type:
Thesis
Languages:
English
Collections
- Information Science [486]
- Thesis - Doctoral [3081]