Graph-based feature selection method for software defect prediction using TMFG-generated topological graphs
Software defect prediction is an important way to reduce software testing costs, and feature selection is a crucial component of it. However, traditional feature selection algorithms consider only bilateral relationships and pairwise correlations between features, so they cannot effectively handle more complex multilateral relationships and multidirectional interactions. To address this issue, this paper proposes a novel graph-based feature selection method for software defect prediction that uses the TMFG (Triangulated Maximally Filtered Graph). The method first introduces a topological graph into the feature selection algorithm, representing features as nodes and using symmetric uncertainty as the measure of feature relevance, thereby constructing a fully connected feature graph. Next, the TMFG edge removal algorithm removes selected edges from the fully connected graph, and graph clustering is applied to the result. Features within each cluster are then ranked, and a specific number of features from each cluster are selected to form the final feature subset. Finally, comparative experiments on datasets from the PROMISE repository show that the proposed method achieves favorable results in further improving the quality of the selected feature subset, with greater advantages on larger datasets.
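The first two steps of the pipeline (the symmetric-uncertainty feature graph and TMFG filtering) can be illustrated with a minimal Python sketch. It is not the paper's implementation and rests on several assumptions: features are already discretized; symmetric uncertainty is taken in its usual form SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)); and the TMFG step is a simplified greedy construction that seeds a tetrahedron from the four nodes with the largest total weight, then repeatedly inserts the node and triangular face with the largest weight gain. The function names `symmetric_uncertainty`, `su_matrix`, and `tmfg_edges` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(values):
    """Shannon entropy (in bits) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); lies in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both features constant: define SU as 0 (assumption)
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def su_matrix(columns):
    """Symmetric weight matrix of the fully connected feature graph."""
    k = len(columns)
    w = [[0.0] * k for _ in range(k)]
    for i, j in combinations(range(k), 2):
        w[i][j] = w[j][i] = symmetric_uncertainty(columns[i], columns[j])
    return w


def tmfg_edges(w):
    """Simplified TMFG: keep 3 * (n - 2) edges of a weighted complete graph."""
    n = len(w)
    if n < 4:
        raise ValueError("TMFG needs at least 4 nodes")
    # Seed: the 4 nodes with the largest total weight form a tetrahedron.
    seed = sorted(range(n), key=lambda i: sum(w[i]), reverse=True)[:4]
    edges = {frozenset(p) for p in combinations(seed, 2)}
    faces = [frozenset(f) for f in combinations(seed, 3)]
    remaining = set(range(n)) - set(seed)
    while remaining:
        # Choose the (node, face) pair that adds the most weight.
        _, v, face = max(
            ((sum(w[v][u] for u in f), v, f)
             for v in sorted(remaining) for f in faces),
            key=lambda t: t[0],
        )
        remaining.remove(v)
        faces.remove(face)
        a, b, c = face
        # Connect v to the three corners; the face splits into three faces.
        edges.update(frozenset((v, u)) for u in face)
        faces.extend(frozenset(t) for t in ((v, a, b), (v, a, c), (v, b, c)))
    return sorted(tuple(sorted(e)) for e in edges)


if __name__ == "__main__":
    # Toy, already-discretized feature columns (made up for illustration).
    cols = [
        [0, 0, 1, 1, 2, 2],
        [0, 0, 1, 1, 2, 1],
        [1, 0, 1, 0, 1, 0],
        [0, 1, 1, 0, 2, 2],
        [2, 2, 0, 0, 1, 1],
    ]
    print(tmfg_edges(su_matrix(cols)))
```

The retained edges would then feed the graph clustering and per-cluster ranking steps, which are omitted here because the abstract does not specify which clustering or ranking procedure is used.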