Neural network models for learning representations of 3D objects via tactile exploration
Doctoral Thesis   Open access

Xiaogang Yan
Doctor of Philosophy - PhD, University of Otago
University of Otago
2020
Handle: https://hdl.handle.net/10523/10399

Abstract

Keywords: Tactile Exploration; 3D Object Representation; Recurrent Self-Organising Map; Implicit Geometric Information; Affordance; Sequence Learning
How does the brain learn the geometry of 3D objects? Although the question has been explored for decades, it remains poorly understood. Most researchers considering it focus on vision. However, infants first learn about 3D objects through the haptic system, that is, by tactile exploration of objects. Our main hypothesis is that the methods used to learn 2D environment representations are computationally similar to those used to learn 3D object representations. We first propose a neural network model that learns aspects of the structure of a 3D cuboid, using inputs from the motor system that controls a simulated hand navigating on the cuboid's surfaces. It does this with a simple unsupervised network that learns to represent frequently experienced sequences of motor movements. Through this learning, the network implicitly acquires an approximate mapping from egocentric (i.e., agent-centered) movements to allocentric (i.e., object-centered) locations on the cuboid's surfaces. We then investigate how this mapping can be improved by the addition of tactile landmarks, object asymmetries and the shape of the simulated hand. We also show that the learned geometry of an object can support a reinforcement learning scheme that enables the agent to learn simple paths to goal locations on the object.
The thesis makes contributions on four levels. Firstly, it provides a new avenue for investigating how the brain learns 3D object representations: instead of designing a computer vision system using supervised learning, an approach that is already widely developed, it develops a computer touching system, which is largely unexplored despite its significance, using unsupervised learning. Secondly, the models proposed in the thesis are very parsimonious, because they simply make use of a general-purpose sequence-processing network of the kind that is implicated in many other cognitive functions. That is, the thesis gives researchers economical and lightweight models for learning the geometry of 3D objects. Thirdly, we show that this general model can take advantage of several features of objects, including asymmetries in their geometry and tactile landmarks, and can also make use of information about the configuration of a simple articulated hand. This can guide researchers in designing and tuning a computer touching system (e.g., by adding features of objects to the system). Fourthly, we show that this model is of practical use to an agent in the context of a reinforcement learning task. This can give insights to researchers designing reinforcement learning models for tactile exploration, for instance in systems for blind people.
PDF: YanXiaogang2020PhD.pdf
