Application of a Tri-Training Semi-Supervised Learning Method for Non-Functional Requirement Classification in Industrial Software
We combine the ability of the Word2Vec Skip-gram model to capture subtle semantic differences in complex software requirement documents with Tri-Training semi-supervised learning, and propose a non-functional requirement classification method based on this combination. The approach addresses the shortage of labeled samples in software requirements engineering and mitigates the resulting performance degradation in non-functional requirement classification. Unlike traditional semi-supervised learning algorithms that rely on fully redundant views or a single classifier, the Tri-Training algorithm initializes three distinct classifiers on three different labeled datasets generated by bootstrap sampling. It then applies majority voting among these classifiers to produce pseudo-labeled data, which relaxes the constraints on the training set and improves the generality and applicability of the classification framework. We apply the method to the PROMISE software requirements dataset, which covers multiple industrial domains. The results show that the Tri-Training-based non-functional requirement classification method performs well across datasets with various proportions of labeled data, particularly when labeled data are insufficient. Compared with supervised learning and other semi-supervised learning algorithms, the method shows significant advantages in recall and F1 score.
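The bootstrap-and-vote procedure summarized above can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper: it runs a single simplified Tri-Training round without the error-rate checks of the full algorithm. The `NearestCentroid` toy classifier, the function names `bootstrap`, `tri_training_round`, and `vote`, and the use of plain lists of floats in place of Word2Vec embeddings are all assumptions made for illustration.

```python
import random
from collections import Counter


class NearestCentroid:
    """Toy base classifier (an assumption; any real learner could be used)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for xi, yi in zip(X, y):
            acc = sums.setdefault(yi, [0.0] * len(xi))
            for j, v in enumerate(xi):
                acc[j] += v
        # One centroid per class: the mean of that class's feature vectors.
        self.centroids = {c: [s / counts[c] for s in acc]
                          for c, acc in sums.items()}
        return self

    def predict_one(self, x):
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        # Sorting the labels makes tie-breaking deterministic.
        return min(sorted(self.centroids), key=dist)


def bootstrap(X, y, rng):
    """Draw len(X) labeled examples with replacement."""
    idx = [rng.randrange(len(X)) for _ in X]
    return [X[i] for i in idx], [y[i] for i in idx]


def tri_training_round(X_lab, y_lab, X_unlab, seed=0):
    """One simplified Tri-Training round (hypothetical helper)."""
    rng = random.Random(seed)
    # Three classifiers, each trained on its own bootstrap sample.
    models = [NearestCentroid().fit(*bootstrap(X_lab, y_lab, rng))
              for _ in range(3)]
    preds = [[m.predict_one(x) for x in X_unlab] for m in models]

    retrained = []
    for i in range(3):
        j, k = [t for t in range(3) if t != i]
        # Pseudo-label the unlabeled examples on which the other two agree.
        extra = [(x, preds[j][n]) for n, x in enumerate(X_unlab)
                 if preds[j][n] == preds[k][n]]
        X_new = X_lab + [x for x, _ in extra]
        y_new = y_lab + [label for _, label in extra]
        retrained.append(NearestCentroid().fit(X_new, y_new))
    return retrained


def vote(models, x):
    """Final prediction by majority vote of the three classifiers."""
    return Counter(m.predict_one(x) for m in models).most_common(1)[0][0]
```

Calling `tri_training_round(X_lab, y_lab, X_unlab)` returns the three retrained classifiers, and `vote(models, x)` gives the ensemble label for a new requirement vector `x`. In practice the round would be repeated until the pseudo-labeled sets stop changing.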