subject	predicate	object
45347	Creator	9ac1d268bb57f50a76301a873fb56d23
45347	Creator	ext-c1dc802d361ac3ee086d4e4304ee7069
45347	Date	2013-10-09
45347	Is Part Of	repository
45347	abstract	Background: In our previous research, we built defect prediction models by using confirmation bias metrics. Due to confirmation bias, developers tend to perform unit tests to make their programs run rather than to break their code. This, in turn, leads to an increase in defect density. The performance of the prediction model built using confirmation bias metrics was as good as that of the models built with static code or churn metrics. Aims: Collection of confirmation bias metrics may result in partially "missing data" due to developers' tight schedules, evaluation apprehension and lack of motivation, as well as staff turnover. In this paper, we employ the Expectation-Maximization (EM) algorithm to impute missing confirmation bias data. Method: We used four datasets from two large-scale companies. For each dataset, we generated all possible missing data configurations and then employed Roweis' EM algorithm to impute missing data. We built defect prediction models using the imputed data. We compared the performance of our proposed models with that of the models that used complete data. Results: In all datasets, when the missing data percentage is less than or equal to 50% on average, our proposed model that used imputed data yielded performance results comparable to those of the models that used complete data. Conclusions: We may encounter the "missing data" problem in building defect prediction models. Our results in this study showed that instead of discarding missing or noisy data, in our case confirmation bias metrics, we can use effective techniques such as EM-based imputation to overcome this problem.
45347	authorList	authors
45347	presentedAt	ext-b9cec4b52e0e1f04fa6b29d54624c260
45347	status	peerReviewed
45347	uri	http://data.open.ac.uk/oro/document/404594
45347	uri	http://data.open.ac.uk/oro/document/404599
45347	uri	http://data.open.ac.uk/oro/document/404604
45347	uri	http://data.open.ac.uk/oro/document/404605
45347	uri	http://data.open.ac.uk/oro/document/404606
45347	uri	http://data.open.ac.uk/oro/document/404607
45347	uri	http://data.open.ac.uk/oro/document/407832
45347	type	AcademicArticle
45347	type	Article
45347	label	Calikli, Gul and Bener, Ayse (2013). An Algorithmic Approach to Missing Data Problem in Modeling Human Aspects in Software Development. In: PROMISE '13: 9th International Conference on Predictive Models in Software Engineering, ACM, New York, USA, article no. 10.
45347	Publisher	ext-2af1883e4bbfa0356fcedb366171cb38
45347	Title	An Algorithmic Approach to Missing Data Problem in Modeling Human Aspects in Software Development
45347	in dataset	oro
