Abstract
Natural language processing (NLP) is concerned with enabling computers to understand human language. Despite extensive research in NLP, some challenging problems remain. In this thesis, we examine three challenging areas of NLP: identifying the target of sarcasm, assessing human judgement, and appraising the quality of medical evidence. Our choice to focus on these specific problems was guided by challenges from the Australasian Language Technology Association (ALTA) Shared Tasks.
The main goal of this thesis is to examine how well machine learning approaches perform in these areas and to provide a deeper understanding of what makes them challenging. Throughout our investigation, we conducted experiments using various techniques and evaluated them on publicly available data sets. For two of the three challenges (sarcasm target detection and assessing human judgement), we have created state-of-the-art models.
Our experimental results show that deep learning classifiers perform well in identifying the target of sarcasm and assessing human judgement because of the complexity of the models. However, when it comes to evaluating the quality of medical evidence, traditional machine learning classifiers perform better because of the use of handcrafted features. We also found that humans and machines face similar challenges when we measured the performance of humans against machine learning classifiers for evaluating human behaviour.
Overall, we have addressed the primary goal of this thesis through our experimentation with publicly available data sets and open-source machine learning frameworks.