Neural network models for learning representations of 3D objects via tactile exploration
How does the brain learn the geometry of 3D objects? Although this question has been studied for decades, it remains poorly understood. Most researchers approach it through vision. However, infants first learn about 3D objects through the haptic system -- that is, by tactile exploration of objects. Our main hypothesis is that the methods used to learn representations of 2D environments are computationally similar to those used to learn representations of 3D objects. We first propose a neural network model that learns about the structure of a 3D cuboid, using inputs from the motor system that controls a simulated hand navigating its surfaces. The model is a simple unsupervised network that learns to represent frequently experienced sequences of motor movements. Through this learning, the network implicitly acquires an approximate mapping from egocentric (i.e., agent-centered) movements to allocentric (i.e., object-centered) locations on the cuboid's surfaces. We then investigate how this mapping can be improved by the addition of tactile landmarks, object asymmetries, and the shape of the simulated hand. We also show that the learned geometry of an object can support a reinforcement learning scheme that enables the agent to learn simple paths to goal locations on the object.
The thesis makes contributions on four levels. First, it provides a new avenue for investigating how the brain learns 3D object representations: instead of the well-developed supervised-learning approach of computer vision, it develops a computer touching system, which remains largely unexplored despite its significance, using unsupervised learning. Second, the proposed models are very parsimonious: they simply use a general-purpose sequence-processing network, of the kind implicated in many other cognitive functions. That is, the thesis gives researchers economical, lightweight models for learning the geometry of 3D objects. Third, we show that this general model can exploit several features of objects, including asymmetries in their geometry and tactile landmarks, and can also make use of information about the configuration of a simple articulated hand. This offers guidance for designing and tuning a computer touching system (e.g., by adding object features to it). Fourth, we show that the model is of practical use to an agent in the context of a reinforcement learning task. This offers insights for designing reinforcement learning models for tactile exploration, for instance in systems for blind people.
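The unsupervised sequence-learning network described above is a recurrent self-organising map (see the keywords). As a rough illustration of the general idea only, and not the thesis's actual implementation, the sketch below trains a small RSOM on one-hot motor-movement codes from a repeated movement sequence; the 1-D map layout, the four-movement alphabet, and all parameter values are assumptions chosen for the example.

```python
import numpy as np

# Minimal Recurrent Self-Organising Map (RSOM) sketch.
# Inputs are one-hot codes for hand movements on an object's surface
# (e.g. up/down/left/right); the map learns to represent frequently
# experienced movement sequences. Names and parameters are illustrative.

rng = np.random.default_rng(0)

N_UNITS = 16   # map units (a 1-D map for simplicity)
N_MOVES = 4    # movement alphabet: up, down, left, right
ALPHA = 0.5    # leak rate of the recurrent difference trace
ETA = 0.1      # learning rate
SIGMA = 1.5    # neighbourhood width

W = rng.normal(0, 0.1, (N_UNITS, N_MOVES))   # unit weight vectors
y = np.zeros((N_UNITS, N_MOVES))             # leaky difference traces

def rsom_step(move, W, y):
    """Process one movement; return the winning unit and updated state."""
    x = np.eye(N_MOVES)[move]
    # Leaky integration of the input-weight difference: the trace carries
    # sequence history, so the winner depends on recent context, not just
    # the current movement.
    y = (1 - ALPHA) * y + ALPHA * (x - W)
    winner = int(np.argmin(np.linalg.norm(y, axis=1)))
    # Neighbourhood-weighted update pulls weights toward the current input.
    dist = np.abs(np.arange(N_UNITS) - winner)
    h = np.exp(-dist**2 / (2 * SIGMA**2))
    W = W + ETA * h[:, None] * (x - W)
    return winner, W, y

# Train on a repeated movement sequence (e.g. a loop over the surfaces).
sequence = [0, 0, 1, 3, 3, 2] * 200
winners = []
for move in sequence:
    winner, W, y = rsom_step(move, W, y)
    winners.append(winner)

# After training, the same movement occurring in different sequence
# contexts tends to activate different units: the map implicitly encodes
# where in a familiar movement sequence the agent currently is.
```

Because the winner is chosen from the leaky trace rather than the raw input, units come to stand for movements-in-context, which is what allows an egocentric movement stream to be mapped onto allocentric surface locations.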
Advisor: Knott, Alistair; Mills, Steven
Degree Name: Doctor of Philosophy
Degree Discipline: Department of Computer Science
Publisher: University of Otago
Keywords: Tactile Exploration, 3D Object Representation, Recurrent Self-Organising Map, Implicit Geometric Information, Affordance, Sequence Learning
Research Type: Thesis