Bug Report Reformulation Method Based on Topic Consistency Maintenance and Pseudo-correlation Feedback Library Extension
To help developers locate software bugs faster, a family of bug localization techniques based on text retrieval has been proposed. These techniques automatically recommend potentially suspicious code files for the bug reports that users submit. However, because users differ in professional expertise, the quality of bug reports is inconsistent, and the bugs described in some low-quality reports cannot be localized successfully. A common remedy is to reformulate such bug reports. Existing mainstream reformulation methods, which rely on query expansion and query reduction, often suffer from inconsistent query topics before and after reformulation or from poor-quality pseudo-correlation feedback libraries. To address these problems, this paper proposes a bug report reformulation method based on topic consistency maintenance and pseudo-correlation feedback library extension. The method consists of two stages. The query reduction stage maintains topic consistency by combining a concise problem description with keywords extracted from the text. The query expansion stage uses several localization tools (Lucene, BugLocator, and Blizzard) to build a more comprehensive pseudo-correlation feedback library, from which additional keywords are extracted for query expansion; this addresses the low reformulation quality caused by inadequate existing pseudo-correlation feedback libraries. Finally, the outputs of the two stages are combined to form the reformulated query. Experiments on six Java projects show that, among low-quality bug reports for which existing bug localization methods fail to rank a buggy file among the top 10 recommended files, 21% to 39% can be localized with the proposed reformulation method, i.e., Accuracy@10 is 21% to 39% and MRR@10 is 10% to 16%. Compared with existing reformulation techniques, the proposed method improves Accuracy@10 by 7% to 32% and MRR@10 by 2% to 13%.
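The two-stage reformulation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how keywords are selected or how the outputs of the localization tools are merged, so this sketch assumes plain term-frequency keyword selection and a simple union of each tool's top-ranked files. All function names, parameters, and the stopword list are hypothetical, and the tool rankings are passed in as ready-made lists rather than obtained by calling Lucene, BugLocator, or Blizzard.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list, not the one used in the paper.
STOPWORDS = {"the", "and", "for", "when", "with", "that", "this", "from", "are", "was"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens longer than 2 characters, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower())
            if len(t) > 2 and t not in STOPWORDS]


def dedupe(items):
    """Remove duplicates while preserving first-occurrence order."""
    return list(dict.fromkeys(items))


def top_terms(tokens, k):
    """Return the k most frequent tokens (ties keep first-occurrence order)."""
    return [term for term, _ in Counter(tokens).most_common(k)]


def reduce_query(summary, description, k=5):
    """Query reduction: keep the concise summary terms and add k description keywords.

    Assumption: keywords are chosen by raw term frequency.
    """
    return dedupe(tokenize(summary) + top_terms(tokenize(description), k))


def build_feedback_library(rankings, top_n=3):
    """Pseudo-feedback library: union of the top_n files ranked by each tool.

    `rankings` maps a tool name to its ranked list of file names.
    """
    library = []
    for ranked_files in rankings.values():
        library.extend(ranked_files[:top_n])
    return dedupe(library)


def expand_query(library, sources, exclude, k=5):
    """Query expansion: pick k frequent terms from library files, skipping known terms."""
    tokens = [t for name in library for t in tokenize(sources[name])
              if t not in exclude]
    return top_terms(tokens, k)


def reformulate(summary, description, rankings, sources,
                k_reduce=5, k_expand=5, top_n=3):
    """Combine the reduction and expansion outputs into one reformulated query."""
    reduced = reduce_query(summary, description, k_reduce)
    library = build_feedback_library(rankings, top_n)
    expanded = expand_query(library, sources, set(reduced), k_expand)
    return reduced + expanded
```

Keeping the summary terms at the front of the query is one simple way to keep the reformulated query on the original topic, while drawing expansion terms from the union of several tools' results reduces the dependence on any single tool's ranking quality.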
Bug localization; Query reformulation; Query reduction; Query expansion; Pseudo-correlation feedback libraries; Quality of bug report
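For readers unfamiliar with the two evaluation metrics named in the abstract, the following Python sketch shows how Accuracy@10 and MRR@10 are commonly computed in bug localization studies. The abstract does not give the paper's exact definitions, so the sketch assumes the usual ones: Accuracy@k is the fraction of bug reports with at least one buggy file in the top k results, and MRR@k is the mean reciprocal rank of the first buggy file within the top k (counted as 0 when none appears). The function names and the data layout are hypothetical.

```python
def first_hit_rank(ranked_files, buggy_files, k=10):
    """Return the 1-based rank of the first buggy file within the top k, or None."""
    for rank, name in enumerate(ranked_files[:k], start=1):
        if name in buggy_files:
            return rank
    return None


def accuracy_at_k(results, k=10):
    """Fraction of bug reports with at least one buggy file in the top k.

    `results` is a list of (ranked_files, buggy_files) pairs, one per bug report.
    """
    if not results:
        return 0.0
    hits = sum(1 for ranked, buggy in results
               if first_hit_rank(ranked, buggy, k) is not None)
    return hits / len(results)


def mrr_at_k(results, k=10):
    """Mean reciprocal rank of the first buggy file, counting misses as 0."""
    if not results:
        return 0.0
    total = 0.0
    for ranked, buggy in results:
        rank = first_hit_rank(ranked, buggy, k)
        if rank is not None:
            total += 1.0 / rank
    return total / len(results)
```

Under these definitions MRR@k can never exceed Accuracy@k, because each localized report contributes at most 1 to the reciprocal-rank sum.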