Bayesian statistical models for predicting software development effort
van Koten, Chikako

Cite this item:
van Koten, C. (2005). Bayesian statistical models for predicting software development effort (Information Science Discussion Papers Series No. 2005/08). University of Otago. Retrieved from http://hdl.handle.net/10523/933
Permanent link to OUR Archive version:
http://hdl.handle.net/10523/933
Abstract:
Constructing an accurate effort prediction model is a challenge in software engineering. This paper presents new Bayesian statistical models for predicting the development effort of software systems in the International Software Benchmarking Standards Group (ISBSG) dataset. The first is a Bayesian linear regression (BR) model and the second is a Bayesian multivariate normal distribution (BMVN) model. Both models are calibrated using subsets randomly sampled from the dataset. The models’ predictive accuracy is evaluated using other subsets, which consist only of cases unknown to the models. Predictive accuracy is measured in terms of absolute residuals and magnitude of relative error, and is compared with that of the corresponding linear regression models. The results show that, in general, the Bayesian models have predictive accuracy equivalent to that of the linear regression models. However, the advantage of the Bayesian statistical models is that they do not require a calibration subset as large as that required by their regression counterparts. For the ISBSG dataset, it is confirmed that the predictive accuracy of the Bayesian statistical models, in particular the BMVN model, is significantly better than that of the linear regression model when the calibration subset consists of five or fewer software systems. This finding justifies the use of Bayesian statistical models in software effort prediction, in particular when the system of interest has only a very small amount of historical data.
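The evaluation procedure described in the abstract (random calibration subsets, held-out cases, absolute residuals and magnitude of relative error) can be illustrated with a minimal sketch. This is not the paper's code: the function names, the seed, and the effort values are hypothetical, and the sketch assumes the commonly used definition of magnitude of relative error, |actual - predicted| / actual.

```python
import random


def absolute_residual(actual, predicted):
    """Absolute difference between actual and predicted effort."""
    return abs(actual - predicted)


def magnitude_of_relative_error(actual, predicted):
    """Absolute residual scaled by the actual effort (actual must be non-zero)."""
    return abs(actual - predicted) / actual


def split_cases(cases, calibration_size, seed=0):
    """Randomly split cases into a calibration subset and a held-out subset.

    The held-out subset contains only cases that were not used for
    calibration, mirroring the "cases unknown to the models" in the abstract.
    """
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    return shuffled[:calibration_size], shuffled[calibration_size:]


# Hypothetical effort values (e.g. person-hours), for illustration only.
actual_effort = [1200.0, 800.0, 2500.0]
predicted_effort = [1000.0, 1000.0, 2500.0]

residuals = [absolute_residual(a, p)
             for a, p in zip(actual_effort, predicted_effort)]
mres = [magnitude_of_relative_error(a, p)
        for a, p in zip(actual_effort, predicted_effort)]
mean_mre = sum(mres) / len(mres)

# A calibration subset of five systems drawn from 20 hypothetical case IDs.
calibration, held_out = split_cases(range(20), calibration_size=5)
```

Any prediction model could supply `predicted_effort`; the sketch covers only how predictions on held-out cases would be scored.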
Date:
2005-10
Publisher:
University of Otago
Pages:
27
Series number:
2005/08
Keywords:
effort prediction; Bayesian statistics; regression; software metrics
Research Type:
Discussion Paper
Collections:
- Information Science
- Software Metrics Research Laboratory
- Discussion Paper