Abstract
Recent research reports that a query reformulation system based on deep reinforcement learning improves search performance, and claims that it outperforms “traditional” techniques such as BM25. However, there is strong evidence that BM25 has not been systematically outperformed since the early 1990s, because it may be the upper bound of ad-hoc retrieval. Given these two contrasting claims, this thesis implements a query reformulation system based on deep reinforcement learning and conducts a series of experiments to gather further empirical evidence on the subject.