A Statistical Model of Error Correction for Computer Assisted Language Learning Systems
This thesis presents a study in the area of computer assisted language learning systems, focussing on the automatic correction of student errors. I describe a novel statistical model of error correction, trained on a newly gathered corpus of language data from Malaysian EFL learners and tested on a different group of students from the same population. The main novelty of the model is that it explicitly represents 'corrections', i.e. circumstances where a language teacher corrects a student's language. Most statistical models used in language learning applications are simply models of the target language being taught: their aim is just to define the kinds of sentence that are expected in the target language. Such models are good at recognising when a student's utterance contains an error, since any sentence that is sufficiently improbable according to a model of the target language can be hypothesised to contain an error. They are less good at suggesting how to correct errors. In any student's sentence there are many things that could be changed, so the space of possible corrections is too large to be searched exhaustively. A statistical system that explicitly models the incidence of corrections can help guide the search for good corrections. The system I describe in this thesis learns about the kinds of context in which particular corrections are made and, after training, is able to make quite good suggestions about how to correct sentences containing errors.
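The contrast drawn in the abstract, between a model of the target language and a model of corrections, can be illustrated with a minimal sketch. This is not the model developed in the thesis. It assumes a toy add-one-smoothed bigram model for detecting improbable sentences, and a simple count table of word-for-word corrections keyed on the preceding word. All function names and the sample data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_bigram(sentences):
    """Count unigrams and bigrams over target-language sentences."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def avg_logprob(sentence, unigrams, bigrams):
    """Average add-one-smoothed bigram log probability per word.
    A low score suggests the sentence may contain an error."""
    tokens = ["<s>"] + sentence.split()
    vocab = len(unigrams) + 1  # +1 reserves mass for unseen words
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab))
    return total / (len(tokens) - 1)

def train_corrections(pairs):
    """Count teacher corrections by context.
    Assumption: each pair is (student sentence, corrected sentence),
    aligned word for word, so only substitutions are modelled."""
    table = defaultdict(Counter)
    for student, corrected in pairs:
        s = ["<s>"] + student.split()
        c = ["<s>"] + corrected.split()
        for prev, word, fixed in zip(s, s[1:], c[1:]):
            if word != fixed:
                table[(prev, word)][fixed] += 1
    return table

def suggest(sentence, table):
    """Apply the most frequent correction seen for each (previous word, word) context."""
    tokens = ["<s>"] + sentence.split()
    out = []
    for prev, word in zip(tokens, tokens[1:]):
        options = table.get((prev, word))
        out.append(options.most_common(1)[0][0] if options else word)
    return " ".join(out)
```

In this sketch the language model can only say that a sentence looks improbable, whereas the correction table proposes a specific change directly, so no search over all possible edits is needed. A realistic system would also need to handle insertions, deletions and wider contexts.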
Advisor: Knott, Alistair; Robins, Anthony
Degree Name: Doctor of Philosophy
Degree Discipline: Computer Science
Publisher: University of Otago
Keywords: computer assisted language learning systems; natural language processing; language modeling
Research Type: Thesis