Catastrophic Forgetting in Reinforcement-Learning Environments
Reinforcement learning (RL) problems are a fundamental part of machine learning theory, and neural networks are among the best-known and most successful general tools for solving machine learning problems. Despite this, there is relatively little research on combining these two fundamental ideas. A few successful combined frameworks have been developed (Lin, 1992), but researchers often find that their implementations perform unexpectedly poorly (Rivest & Precup, 2003). One explanation is Catastrophic Forgetting (CF), a problem that neural networks usually face when solving supervised sequential learning problems and that is even more pressing in reinforcement learning. Several techniques have been designed to alleviate the problem in supervised learning, and this research investigates how useful they are in an RL context. Previous researchers have comprehensively investigated CF in many different types of supervised learning networks; consequently, this research focuses on the problem of CF in RL agents that use neural networks for function approximation. There has been some previous research on CF in RL problems, but it has tended to be incomplete (Rivest & Precup, 2003) or to involve complex, many-layered, recurrent, constructive neural networks, which can be difficult to understand and even more difficult to implement (Ring, 1994). Instead, this research investigates CF in RL agents using simple feed-forward neural networks with a single hidden layer, and applies the relatively simple approach of pseudorehearsal to solve reinforcement learning problems effectively. By doing so, we provide either an easily implemented benchmark for more sophisticated continual-learning RL agents, or a simple, "good enough" continual-learning agent that can avoid the problem of CF with reasonable efficiency.
The open source RL-Glue framework was adopted for this research in an attempt to make the results more accessible to the RL research community (Tanner, 2008).
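As a rough illustration of the pseudorehearsal idea named above, the following minimal sketch generates pseudo-items by passing random inputs through a single-hidden-layer network and mixes them with new training items. It is not the thesis implementation: the function names, the tanh hidden units, the linear output layer, the uniform input range, and the batch-mixing scheme are all assumptions made for illustration, and the weight-update step itself is omitted.

```python
import math
import random

def forward(weights, x):
    """Single-hidden-layer feed-forward pass (assumed: tanh hidden, linear output)."""
    w_hidden, w_out = weights
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]

def make_pseudo_items(weights, n_items, n_inputs, rng):
    """Sample random inputs and label them with the network's current outputs.

    These (input, output) pairs approximate what the network has already
    learned, so no previously seen training data needs to be stored.
    """
    items = []
    for _ in range(n_items):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]  # assumed input range
        items.append((x, forward(weights, x)))
    return items

def build_training_batch(new_items, pseudo_items, n_pseudo, rng):
    """Mix the new items with a random sample of pseudo-items for rehearsal."""
    batch = list(new_items) + rng.sample(pseudo_items, n_pseudo)
    rng.shuffle(batch)
    return batch

# Hypothetical usage: 2 inputs, 3 hidden units, 1 output.
rng = random.Random(0)
weights = ([[0.1, 0.2], [0.3, -0.4], [0.0, 0.5]], [[1.0, -1.0, 0.5]])
pseudo = make_pseudo_items(weights, n_items=5, n_inputs=2, rng=rng)
batch = build_training_batch([([0.0, 0.0], [1.0])], pseudo, n_pseudo=3, rng=rng)
```

In an RL setting the inputs would presumably be state features and the outputs value estimates, with each batch passed to whatever update rule the agent uses; pseudo-items must be generated before training on the new items so that they reflect the network's earlier behaviour.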
Advisor: Robins, Anthony
Degree Name: Master of Science
Degree Discipline: Computer Science
Publisher: University of Otago
Keywords: Catastrophic Forgetting; Pseudorehearsal; Reinforcement Learning; Neural Network; Markov Decision Problem; Temporal Transition Hierarchies
Research Type: Thesis