Article
An Empirical Study of Training Data Selection Methods for
Ranking-Oriented Cross-Project Defect Prediction
Haoyu Luo 1, Heng Dai 2, Weiqiang Peng 3, Wenhua Hu 4,* and Fuyang Li 4,*
Citation: Luo, H.; Dai, H.; Peng, W.; Hu, W.; Li, F. An Empirical Study of Training Data Selection Methods for Ranking-Oriented Cross-Project Defect Prediction. Sensors 2021, 21, 7535. https://doi.org/10.3390/s21227535
Academic Editors: Kim Phuc Tran, Athanasios Rakitzis and Khanh T. P. Nguyen
Received: 26 October 2021
Accepted: 10 November 2021
Published: 12 November 2021
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1 School of Computer Science, South China Normal University, Guangzhou 510631, China; hluo@m.scnu.edu.cn
2 School of Mechanical and Electrical Engineering, Wuhan Qingchuan University, Wuhan 430204, China; daiheng726@163.com
3 School of Computer Science, Wuhan University, Wuhan 430072, China; pengweiqiang@whu.edu.cn
4 School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan 430070, China
* Correspondence: whu10@whut.edu.cn (W.H.); fyli@whut.edu.cn (F.L.); Tel.: +86-158-2735-4612 (W.H.); +86-27-8721-6780 (F.L.)
Abstract: Ranking-oriented cross-project defect prediction (ROCPDP), which ranks the software modules of a new target industrial project by their predicted defect number or density, has been suggested in the literature. A major concern of ROCPDP is the distribution difference between the source project (i.e., cross-project) data and the target project (i.e., within-project) data, which clearly degrades prediction performance. To investigate the impact of training data selection methods on the performance of ROCPDP models, we examined the practical effects of nine training data selection methods, including a global filter, which does not filter out any cross-project data. Additionally, the prediction performance of ROCPDP models trained on cross-project data filtered by these training data selection methods was compared with that of ranking-oriented within-project defect prediction (ROWPDP) models trained on sufficient and on limited within-project data. Eleven available defect datasets from industrial projects were considered, and the models were evaluated using two ranking performance measures, i.e., FPA and Norm(Popt). The results showed no statistically significant differences among the nine training data selection methods in terms of FPA and Norm(Popt). The performance of ROCPDP models trained on filtered cross-project data did not match that of ROWPDP models trained on sufficient historical within-project data. However, ROCPDP models trained on filtered cross-project data achieved better performance values than ROWPDP models trained on limited historical within-project data. Therefore, we recommend that software quality teams exploit other project datasets to perform ROCPDP when there is no or limited within-project data.
Keywords: fault prediction; machine learning; data selection
1. Introduction
Software defect prediction (SDP), also known as software fault prediction, is an active research topic that has drawn considerable attention from both industry and academia [1,2]. Defect prediction identifies the presence of defects in a system or in industrial software, which helps to determine the category, location, and scale of those defects [3–7]. It has long been recognized as an important means of improving the reliability of industrial system software [8–10].
With the development of artificial intelligence algorithms, the reliability of automatic defect prediction keeps increasing. The general approach of software defect prediction is to learn a classification model from historical datasets using machine learning algorithms, and then to predict whether new software modules contain bugs [11]. Accurate predictions help teams allocate testing resources reasonably by focusing on the modules predicted to be defect-prone [12,13].
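The learn-then-predict workflow described above can be illustrated with a minimal sketch. It is not the method studied in this article: a nearest-centroid classifier stands in for "a classification model", and the module names, the two metrics (lines of code and cyclomatic complexity), and all data values are hypothetical.

```python
from statistics import mean

def fit_centroids(features, labels):
    """Learn one centroid per class (0 = clean, 1 = defective) from historical modules."""
    centroids = {}
    for cls in set(labels):
        rows = [x for x, y in zip(features, labels) if y == cls]
        # Column-wise mean of the metric vectors belonging to this class.
        centroids[cls] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, module):
    """Return the class whose centroid is closest (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(module, c))
    return min(centroids, key=lambda cls: dist(centroids[cls]))

# Hypothetical historical data: [lines of code, cyclomatic complexity] per module.
history_X = [[120, 4], [80, 2], [950, 31], [700, 25]]
history_y = [0, 0, 1, 1]

model = fit_centroids(history_X, history_y)

# Hypothetical new modules; those predicted defective would get testing priority.
new_modules = {"parser.c": [860, 28], "util.c": [95, 3]}
flagged = [name for name, m in new_modules.items() if predict(model, m) == 1]
print(flagged)
```

In practice the metrics would usually be scaled before distances are computed, and a stronger learner would replace the centroid rule; the sketch only shows how historical labels drive predictions for unseen modules.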
Sensors 2021, 21, 7535. https://doi.org/10.3390/s21227535 https://www.mdpi.com/journal/sensors