Identifying influential metrics in the combined metrics approach of fault prediction

Springerplus. 2013 Nov 23:2:627. doi: 10.1186/2193-1801-2-627. eCollection 2013.

Abstract

Fault prediction is a pre-eminent area of empirical software engineering which has witnessed a huge surge over the last couple of decades. In the development of a fault prediction model, combination of metrics results in better explanatory power of the model. Since the metrics used in combination are often correlated, and do not have an additive effect, the impact of a metric on another i.e. interaction should be taken into account. The effect of interaction in developing regression based fault prediction models is uncommon in software engineering; however two terms and three term interactions are analyzed in detail in social and behavioral sciences. Beyond three terms interactions are scarce, because interaction effects at such a high level are difficult to interpret. From our earlier findings (Softw Qual Prof 15(3):15-23) we statistically establish the pertinence of considering the interaction between metrics resulting in a considerable improvement in the explanatory power of the corresponding predictive model. However, in the aforesaid approach, the number of variables involved in fault prediction also shows a simultaneous increment with interaction. Furthermore, the interacting variables do not contribute equally to the prediction capability of the model. This study contributes towards the development of an efficient predictive model involving interaction among predictive variables with a reduced set of influential terms, obtained by applying stepwise regression.

Keywords: Fault prediction; Interaction; Object oriented metrics; Regression; Stepwise regression.

Publication types

  • Case Reports