Estimating Overdispersion in Sparse Multinomial Data

Farzana Afroz

Doctoral Thesis

Open access

Doctor of Philosophy - PhD, University of Otago

University of Otago

2018

Handle: https://hdl.handle.net/10523/8595

Abstract

Keywords: Overdispersion, Sparse, Multinomial, Dirichlet-Multinomial

Overdispersion arises when the data are more variable than expected under the fitted model, and it is often encountered when fitting a Poisson or binomial model. Ignoring overdispersion when it is present may lead to misleading conclusions, with standard errors being underestimated and overly complex models being selected. In our research we considered overdispersed multinomial data, which arise in a number of research areas. Two approaches can be used to analyze overdispersed multinomial data: (i) quasilikelihood, or (ii) explicit modelling of the overdispersion using, for example, a Dirichlet-multinomial (Mosimann n.d.) or finite-mixture distribution. Quasilikelihood has the advantage of requiring specification of only the first two moments of the response variable, and is therefore likely to be more robust than a specific model for overdispersion. Quasilikelihood is most useful when we can assume that var(Y) = φV, where V is the variance assumed by the multinomial model. We derive a new estimator of the overdispersion parameter φ for multinomial data by generalizing the results of Farrington (1996), Fletcher (2012) and Deng & Paul (2016). We consider six estimators of φ, including the new estimator, discuss their theoretical properties, and provide simulation results showing their performance in terms of bias, variance and mean squared error. The Dirichlet-multinomial distribution and a finite mixture of Dirichlet-multinomial distributions were used in the simulation study. When the data are generated by the Dirichlet-multinomial distribution, the new estimator shows the lowest RMSE (root mean squared error) of the estimators considered as φ and the level of sparsity increase. For the finite mixture case, Farrington’s estimator sometimes performed better than the new estimator in terms of RMSE.
We derived the new estimator subject to a condition on the third cumulant of the response variable, and this condition is satisfied for the Dirichlet-multinomial distribution. It would be interesting to check the assumption for the mixture of Dirichlet-multinomial distributions and for other types of overdispersed multinomial models.
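
As an illustration of the quasilikelihood assumption var(Y) = φV, the following minimal Python sketch simulates Dirichlet-multinomial counts with a known φ and recovers it with the classical Pearson estimator (the Pearson chi-square statistic divided by the residual degrees of freedom). It is not the new estimator derived in the thesis. The function names, the intercept-only model and the parameter values are assumptions chosen for illustration, and the setting is deliberately not sparse.

```python
import numpy as np


def simulate_dirichlet_multinomial(rng, n_obs, n_trials, probs, phi):
    """Draw n_obs Dirichlet-multinomial count vectors with overdispersion phi.

    For a Dirichlet-multinomial with total concentration a0 and n trials,
    var(Y) = phi * V with phi = (n + a0) / (1 + a0), where V is the
    multinomial variance. Solving for a0 gives a0 = (n - phi) / (phi - 1),
    which requires 1 < phi < n.
    """
    probs = np.asarray(probs, dtype=float)
    a0 = (n_trials - phi) / (phi - 1.0)
    # Each observation gets its own probability vector from the Dirichlet.
    p = rng.dirichlet(a0 * probs, size=n_obs)
    return np.array([rng.multinomial(n_trials, p_i) for p_i in p])


def pearson_phi(y, fitted, n_params):
    """Classical Pearson estimator of phi: X^2 / residual degrees of freedom."""
    n_obs, k = y.shape
    x2 = np.sum((y - fitted) ** 2 / fitted)
    df = n_obs * (k - 1) - n_params
    return x2 / df


# Illustrative (assumed) settings: 3 categories, 20 trials, true phi = 3.
rng = np.random.default_rng(0)
n_obs, n_trials, true_phi = 2000, 20, 3.0
probs = np.array([0.5, 0.3, 0.2])

y = simulate_dirichlet_multinomial(rng, n_obs, n_trials, probs, true_phi)

# Intercept-only multinomial fit: pooled category proportions (k - 1 parameters).
p_hat = y.sum(axis=0) / y.sum()
fitted = n_trials * np.tile(p_hat, (n_obs, 1))

phi_hat = pearson_phi(y, fitted, n_params=len(probs) - 1)
print(f"true phi = {true_phi}, Pearson estimate = {phi_hat:.2f}")
```

With many observations and moderate expected counts, the Pearson estimate should sit close to the true φ. The estimators compared in the thesis are aimed at the sparse case, where expected counts are small.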

Details

Record Identifier: 9926480032501891
Title: Estimating Overdispersion in Sparse Multinomial Data
Creators: Farzana Afroz
Contributors: David Fletcher (Advisor / Supervisor); Matthew Parry (Advisor / Supervisor)
Theses and Dissertations: Doctor of Philosophy - PhD, University of Otago
Academic Unit: Mathematics and Statistics
Awarding Institution: University of Otago
Publisher: University of Otago
Date published ; e-published: 2018
Language: English
Resource Type; Subtype: Doctoral Thesis
Format: application/pdf