Sequence-based Program Semantic Rule Mining and Violation Detection
In software development, source code that violates semantic rules may compile and run normally yet still contain performance or functional defects, which makes such defects difficult to detect accurately. Existing research usually adopts itemset-based rule mining and detection methods, but these methods leave significant room for improvement in detection ability and accuracy because they fail to integrate the order information and control flow information of source code effectively. To address this problem, this paper proposes SPUME, a sequence-based method for extracting and detecting program semantic rules. The method converts program source code into an intermediate representation sequence, extracts semantic rules from it using sequence rule mining algorithms, and detects defects in the source code based on these rules. To verify the effectiveness of SPUME, it is compared with three baseline methods: PR-Miner, Tikanga, and Bugram. Experimental results show that, compared with PR-Miner, which is based on unordered itemset mining, and Tikanga, which combines graph models, SPUME significantly improves detection performance, speed, and accuracy. Compared with Bugram, which is based on N-gram language models, SPUME detects more program defects more efficiently while maintaining a similar level of accuracy.
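To make the mine-then-detect idea concrete, the following minimal sketch illustrates ordered rule mining on toy token sequences. It is not SPUME's implementation: the abstract does not specify the intermediate representation or the sequence rule mining algorithm, so the token names (`lock`, `unlock`, and so on), the helper functions, and the support and confidence thresholds are all assumptions chosen for illustration. The sketch mines rules of the form "a is later followed by b" and flags sequences in which a occurs but b never follows.

```python
from collections import Counter
from itertools import combinations


def ordered_pairs(seq):
    """Return the set of (a, b) pairs where some a occurs before some b in seq."""
    pairs = set()
    for i, j in combinations(range(len(seq)), 2):
        if seq[i] != seq[j]:
            pairs.add((seq[i], seq[j]))
    return pairs


def mine_rules(sequences, min_support=3, min_confidence=0.7):
    """Mine ordered rules (a, b), meaning 'a is later followed by b'.

    support    = number of sequences in which a occurs before b
    confidence = support / number of sequences that contain a
    Both thresholds are hypothetical values for this toy example.
    """
    item_count = Counter()
    pair_count = Counter()
    for seq in sequences:
        item_count.update(set(seq))
        pair_count.update(ordered_pairs(seq))
    rules = {}
    for (a, b), support in pair_count.items():
        confidence = support / item_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


def find_violations(sequences, rules):
    """Return (sequence index, rule) for sequences where a occurs but b never follows."""
    violations = []
    for idx, seq in enumerate(sequences):
        present = set(seq)
        pairs = ordered_pairs(seq)
        for rule in rules:
            if rule[0] in present and rule not in pairs:
                violations.append((idx, rule))
    return violations


# Hypothetical token sequences, one per function body.
demo_sequences = [
    ["lock", "read", "unlock"],
    ["lock", "write", "unlock"],
    ["lock", "update", "unlock"],
    ["lock", "read", "unlock"],
    ["lock", "write"],  # "unlock" is missing here
]

demo_rules = mine_rules(demo_sequences)
demo_violations = find_violations(demo_sequences, demo_rules)
```

On this toy data the only rule that passes both thresholds is `("lock", "unlock")`, and the last sequence is reported as violating it. Because the rule is ordered, a sequence containing `unlock` before `lock` would also be flagged, which an unordered itemset rule could not distinguish.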