Estimating Overdispersion in Sparse Multinomial Data
Afroz, Farzana

Cite this item:
Afroz, F. (2018). Estimating Overdispersion in Sparse Multinomial Data (Thesis, Doctor of Philosophy). University of Otago. Retrieved from http://hdl.handle.net/10523/8595
Permanent link to OUR Archive version:
http://hdl.handle.net/10523/8595
Abstract:
Overdispersion arises when the data are more variable than expected under the fitted model. It is often encountered when fitting a Poisson or a binomial model. When overdispersion is present, ignoring it may lead to misleading conclusions, with standard errors being underestimated and overly complex models being selected. In our research we considered overdispersed multinomial data, which arise in a number of research areas. Two approaches can be used to analyze overdispersed multinomial data: (i) the use of quasilikelihood or (ii) explicit modelling of the overdispersion using, for example, a Dirichlet-multinomial (Mosimann n.d.) or finite-mixture distribution. Quasilikelihood has the advantage of requiring only the first two moments of the response variable to be specified, and is therefore likely to be more robust than a specific model for overdispersion. Quasilikelihood is most useful when we can assume that var(Y) = φV, where V is the variance assumed by the multinomial model. We derive a new estimator of the overdispersion parameter φ for multinomial data by generalizing the results of Farrington (1996), Fletcher (2012) and Deng & Paul (2016). We consider six estimators of φ, including the new estimator, discuss their theoretical properties and provide simulation results showing their performance in terms of bias, variance and mean squared error. The Dirichlet-multinomial distribution and a finite mixture of Dirichlet-multinomial distributions were used to generate data in the simulation study. When the data are generated by the Dirichlet-multinomial distribution, the new estimator shows the lowest root mean squared error (RMSE) of all the estimators as φ and the level of sparsity increase. For the finite-mixture case, Farrington’s estimator sometimes performed better than the new estimator in terms of RMSE.
We derived the new estimator subject to a condition on the third cumulant of the response variable, and this condition is satisfied for the Dirichlet-multinomial distribution. It would be interesting to check the condition for a finite mixture of Dirichlet-multinomial distributions and for other types of overdispersed multinomial models.
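As a rough illustration of the quasilikelihood assumption var(Y) = φV, the sketch below simulates Dirichlet-multinomial counts and estimates φ with the classical Pearson-statistic estimator (Pearson chi-square divided by the residual degrees of freedom). This is not the new estimator derived in the thesis. The function names, the intercept-only model (one common probability vector for all observations) and the parameter values are assumptions made for this example only.

```python
import numpy as np


def simulate_dirichlet_multinomial(rng, n_obs, size, alpha):
    """Draw n_obs multinomial vectors whose probabilities are Dirichlet(alpha)."""
    probs = rng.dirichlet(alpha, size=n_obs)  # shape (n_obs, k)
    return np.array([rng.multinomial(size, p) for p in probs])


def pearson_phi(counts):
    """Pearson-based estimate of phi for an intercept-only multinomial model.

    Assumes every row shares one probability vector, estimated by the
    pooled category proportions (hypothetical setting for this sketch).
    """
    counts = np.asarray(counts, dtype=float)
    # Drop categories never observed, so no expected count is zero.
    counts = counts[:, counts.sum(axis=0) > 0]
    n_obs, k = counts.shape
    totals = counts.sum(axis=1, keepdims=True)   # multinomial index per row
    p_hat = counts.sum(axis=0) / counts.sum()    # pooled proportions
    expected = totals * p_hat
    chi2 = ((counts - expected) ** 2 / expected).sum()
    df = (n_obs - 1) * (k - 1)                   # residual degrees of freedom
    return chi2 / df


rng = np.random.default_rng(2018)
alpha = np.array([2.0, 1.2, 0.8])   # assumed Dirichlet parameters, sum A = 4
size = 5                            # small multinomial index, i.e. sparse counts
y = simulate_dirichlet_multinomial(rng, n_obs=2000, size=size, alpha=alpha)

# For the Dirichlet-multinomial, the variance inflation relative to the
# multinomial is (size + A) / (1 + A), with A = alpha.sum().
true_phi = (size + alpha.sum()) / (1 + alpha.sum())
print("true phi:", true_phi)
print("Pearson estimate:", pearson_phi(y))
```

With a large number of observations the Pearson estimate should fall close to the true value. The sketch does not reproduce the thesis's comparison of estimators under increasing sparsity.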
Date:
2018
Advisor:
Fletcher, David; Parry, Matthew
Degree Name:
Doctor of Philosophy
Degree Discipline:
Mathematics and Statistics
Publisher:
University of Otago
Keywords:
Overdispersion; Sparse; Multinomial; Dirichlet-Multinomial
Research Type:
Thesis
Languages:
English
Collections:
- Mathematics and Statistics
- Thesis - Doctoral