Abstract
There are many competing models for the indexing process of an information retrieval system, one of which is the pipeline-based model. Indexing is also an inherently parallel process, because each document can be indexed independently of every other document. A pipeline model makes it easy to experiment with parallelism within an indexer. In this paper we investigate the stages of an indexing pipeline where throughput can be increased and where the inherent parallelism of indexing can be exploited.