Enhancing Citation Context based Information Services through Sentence Context Identification
Doctoral Thesis   Open access

Angrosh Annayappan Mandya
Doctor of Philosophy - PhD, University of Otago
2012
Handle: https://hdl.handle.net/10523/2520

Abstract

Keywords: Sentence Context Identification; Citation Classification; Conditional Random Fields; Sequential Classification; Citation Services

Over the years, scientific articles have played a vital role in disseminating scientific knowledge. Typically, published scientific work builds upon existing scientific knowledge, and citations play a key role in presenting the argument of an article and defining relationships across articles. Understanding the use of citations within articles and citation relationships across articles is essential for conducting good research. However, the search and browsing capabilities for tracing these relationships are currently limited to finding documents that cite a given document and providing links to the citing documents; they do not provide insights into the reasons for citations, either within an article or across articles. Information systems capable of identifying these reasons could provide useful services for the research community. Against this background, we investigate in this thesis the possibility of identifying the contexts associated with sentences in scientific articles and using this information to provide useful citation context based information services. To achieve this objective, we developed an annotation scheme that defines context types for sentences in scientific articles. An inter-rater reliability study was carried out to examine the reliability of our scheme. We achieved an overall agreement of 89.93% among annotators, indicating the acceptability of our scheme. In order to develop a contextual model of sentence context types in scientific articles, we developed the Sentence Context Ontology, which generates the semantic contextual data used by applications that provide contextual information services. We also developed a text extraction and preparation system for processing scientific documents. Another key aspect of citation context based information services is the ability to automatically identify the contexts of sentences. We achieved this by using Conditional Random Fields (CRFs), a sequential probabilistic classifier. We trained the CRF classifier on a dataset of 1000 paragraphs extracted from 71 research articles and achieved an accuracy of 91% in classifying sentences according to our proposed scheme. Finally, using the results obtained from the tasks described above, we developed various applications for providing citation context based information services. These included a standalone system and applications using linked data principles and Web APIs.
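
As an illustration of why a sequential classifier suits this task, the following minimal sketch shows Viterbi decoding over a linear-chain model, the kind of inference a CRF performs at prediction time to label all sentences of a paragraph jointly. It is not the thesis's implementation: the label names, the per-sentence scores, and the transition scores are hypothetical and hand-set here, whereas a trained CRF would learn them from annotated paragraphs.

```python
from typing import Dict, List, Tuple


def viterbi(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence for one paragraph.

    emissions[t][y] is the score of label y for sentence t;
    transitions[(p, y)] is the score of label y following label p.
    Missing entries are treated as a score of 0.0.
    """
    if not emissions:
        return []

    # best[t][y]: best total score of any label path ending in y at sentence t.
    best = [{y: emissions[0].get(y, 0.0) for y in labels}]
    back: List[Dict[str, str]] = []

    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            # Pick the previous label that maximises the path score into y.
            prev, score = max(
                ((p, best[t - 1][p] + transitions.get((p, y), 0.0)) for p in labels),
                key=lambda item: item[1],
            )
            scores[y] = score + emissions[t].get(y, 0.0)
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical context types and hand-set scores, for illustration only.
labels = ["background", "citing", "own_work"]
emissions = [
    {"background": 2.0, "citing": 0.5, "own_work": 0.0},
    {"background": 0.9, "citing": 1.0, "own_work": 0.8},  # ambiguous in isolation
    {"background": 0.0, "citing": 0.2, "own_work": 2.0},
]
transitions = {
    ("background", "background"): 1.0,
    ("background", "own_work"): 0.5,
    ("citing", "own_work"): 0.2,
}

path = viterbi(emissions, transitions, labels)
# path == ["background", "background", "own_work"]
```

Taken in isolation, the second sentence scores highest as "citing", but the transition scores make "background" the better choice once its neighbours are considered. This dependence on neighbouring labels is what distinguishes a sequential classifier from one that labels each sentence independently.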
PDF: MandyaAngroshA2012PhD.pdf

Metrics

2364 File views/downloads
921 Record Views
