Abstract
Vocabulary mismatch is an impediment to responding to user queries with relevant results. Stemmers address this problem by conflating terms with similar spellings. In this thesis, we use machine learning to create a stemmer optimised for information retrieval performance. We also investigate using corpus information to improve stemmers further. With the goal of stemming selectively for further performance gains, we investigate the prediction of query performance.