Journal information
International journal of software engineering and knowledge engineering
World Scientific Publishing Co. Pte. Ltd.
ISSN: 0218-1940
Indexed in: SCI, ISTP, EI, AHCI
Abstract: With the rapid development of the software industry, the escalating issue of software vulnerabilities poses significant risks to users. Symbolic execution, as a vulnerability mining technology, offers high test coverage. Existing reviews of symbolic execution methods focus on summarizing various techniques and tools. While some studies have analyzed the technical challenges, classification frameworks and development trends of these methods, they lack a comprehensive and systematic review. This study aims to address that gap by providing a comprehensive, systematic analysis of symbolic execution techniques for vulnerability mining. We conducted a detailed review of 60 peer-reviewed papers published between 2005 and 2024. First, we reviewed the main techniques used in the symbolic execution process, including program instrumentation, path selection strategies and constraint-solving techniques. Second, we extracted the main information from the selected papers and presented detailed information on the symbolic execution tools in tabular form, comparing and analyzing their research objects, execution processes, software architectures and applications on different system platforms. Finally, we present a comprehensive and systematic summary of current challenges and corresponding solutions in the field. This study provides an in-depth analysis of vulnerability detection technologies based on symbolic execution, serving as a valuable guide for researchers in this domain.
Abstract: Assessing the degree of similarity of code fragments is crucial for ensuring software quality, but it remains challenging due to the need to capture the deeper semantic aspects of code. Traditional syntactic methods often fail to identify these semantic connections. Recent advancements have addressed this challenge, though they frequently sacrifice interpretability. To improve interpretability, we present an approach that aims to make the similarity assessment more transparent by using GraphCodeBERT, which enables the identification of semantic relationships between code fragments. This approach identifies similar code fragments and clarifies the reasons behind that identification, helping developers better understand and trust the results.
Abstract: Open source software has evolved into a fundamental element of the contemporary information sector; however, security threats within its supply chain are persistently rising. Within the collaborative development framework of open source, the introduction of malicious code can lead to significant security vulnerabilities. Conventional methods for detecting these vulnerabilities, which rely on machine learning, face challenges such as a lack of sufficient datasets, inadequate deep semantic understanding, and limitations to single-vulnerability detection. To address these challenges, we introduce a novel approach named VulnTrace, which analyzes historical records of submissions in open source projects to construct a high-quality dataset of vulnerabilities with accurate labels. VulnTrace employs Word2Vec alongside Abstract Syntax Tree (AST) technologies to capture both the semantic and structural details of code segments and utilizes a Transformer model for precise vulnerability identification, thereby enhancing accuracy and interpretability in detection. Experimental results indicate that VulnTrace achieves approximately 93% accuracy, 95% precision, 83% recall and an F1 score of 88% in vulnerability detection tasks, significantly reducing false positives and demonstrating remarkable robustness.
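The F1 score reported above can be sanity-checked against the reported precision and recall, since F1 is the harmonic mean of the two. The following is a minimal sketch; the function name `f1_score` is illustrative and does not come from the paper, and the inputs are the rounded figures quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values quoted in the abstract (approximate, not raw data).
reported_precision = 0.95
reported_recall = 0.83

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")
```

With these rounded inputs the result is close to 0.89, which is in line with the reported figure of about 88% given that precision and recall are themselves approximate.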
Abstract: Multi-modal entity alignment aims to identify equivalent entities across diverse knowledge graphs by leveraging multiple modalities of entity information. This process is crucial for the fusion of multi-modal knowledge graphs. While current research primarily investigates how to utilize side information from entity visuals, relations, and attributes, it often overlooks the significant role of entity-type information. Furthermore, multi-modal data embedding encounters noise that negatively impacts the performance of the entity alignment task. To address these gaps, this paper introduces MTCEA, a multi-modal entity alignment method guided by entity-type information. The proposed method captures the constraints associated with entities based on the entity-type information obtained from the knowledge graph ontology; then, it utilizes two embedding strategies for type constraints to enhance the model's performance in knowledge representation. This allows effective modal fusion that integrates more fine-grained semantic constraints related to types, which improves the alignment accuracy across various cross-lingual knowledge graphs. MTCEA is validated on three subsets of DBP15K. Experimental results demonstrate that our model achieves good results overall on the Hits@1, Hits@10, and MRR metrics. In an experimental setting without using entity names, MTCEA outperforms state-of-the-art baselines.
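For readers unfamiliar with the Hits@1, Hits@10 and MRR metrics named above, the sketch below shows how they are commonly computed from the rank of the correct counterpart entity for each query. The function names and the `ranks` list are hypothetical illustrations, not data or code from the paper.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of queries whose correct entity is ranked in the top k (1-based ranks)."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all queries (1-based ranks)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks of the true aligned entity for five source entities.
ranks = [1, 3, 2, 15, 1]

print("Hits@1 :", hits_at_k(ranks, 1))
print("Hits@10:", hits_at_k(ranks, 10))
print("MRR    :", round(mean_reciprocal_rank(ranks), 3))
```

Hits@k rewards only whether the correct entity appears within the top k candidates, while MRR also reflects how high it is ranked, which is why the two are usually reported together.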
Abstract: During software maintenance and evolution, developers spend more than half of their time on code comprehension activities. To understand an unfamiliar code base, they naturally ask different types of questions about code snippets and try to find the answers. In this paper, we conduct initial work to explore the possibility of automatic question generation for program comprehension. We construct a large-scale data set containing pairs of source code and questions that are automatically transformed from inline comments based on dependency analysis and semantic role labeling. We also build a comprehensive taxonomy of question types so as to generate questions concerning different aspects of code snippets, such as purpose and implementation details. Then, we propose a deep learning-based prototype, CodeQG, to automatically generate multiple types of questions for code snippets. We evaluate CodeQG using both typical performance metrics and manual evaluation. The results show that (1) we achieve a value of 42.02 on BLEU4 and 60.81 on ROUGE-L for the generated questions; (2) overall, the questions are largely correct in grammar, semantics and format; (3) the questions are related to the corresponding code snippet and are helpful for developers in source code comprehension activities. Our work gives insights into automatically generating multiple types of questions for code comprehension. We expect this exploration will improve the applicability and generality of machine code comprehension.
Abstract: Just-in-time software defect prediction (JIT-SDP) is a defect prediction technique that targets changes in software code, offering significant advantages in quickly identifying potential defects and improving development efficiency. However, most existing methods assume that the importance of features remains stable over time, overlooking the dynamic changes in feature distributions and the evolution of class imbalance in real-world development environments. This limitation eventually degrades predictive performance. To address this issue, this paper proposes an Imbalance-oriented Online Feature Selection (IOFS) method, which dynamically adjusts the feature importance and uncertainty parameters to adapt in real time to concept drift and class imbalance in data streams, thereby enhancing model performance and generalization. Experimental validation on 14 open-source project datasets demonstrates that IOFS significantly improves G-Mean on 11 datasets and effectively reduces the average absolute difference between the per-class recalls at each time step, exhibiting robustness to dynamic feature changes and sensitivity to development-phase feature differences. This study provides an effective solution for online JIT-SDP.
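The two evaluation quantities mentioned above can be illustrated with a short sketch. It assumes the usual binary-classification definition of G-Mean (the geometric mean of the recalls of the defect-inducing and clean classes) and reads the second quantity as the mean of the absolute recall gap over time steps; the paper's exact definitions may differ, and all names and numbers below are hypothetical.

```python
import math
from typing import Sequence, Tuple


def g_mean(recall_defective: float, recall_clean: float) -> float:
    """Geometric mean of the two per-class recalls (assumed binary definition)."""
    return math.sqrt(recall_defective * recall_clean)


def mean_abs_recall_diff(recalls_per_step: Sequence[Tuple[float, float]]) -> float:
    """Average over time steps of |recall_defective - recall_clean|."""
    return sum(abs(r_def - r_clean) for r_def, r_clean in recalls_per_step) / len(recalls_per_step)


# Hypothetical (recall_defective, recall_clean) pairs at three time steps.
steps = [(0.9, 0.5), (0.8, 0.6), (0.7, 0.7)]

for r_def, r_clean in steps:
    print("G-Mean:", round(g_mean(r_def, r_clean), 3))
print("Mean |recall gap|:", round(mean_abs_recall_diff(steps), 3))
```

A high G-Mean requires both classes to be recalled well, so it penalizes models that favor the majority class, while a small recall gap indicates that the two classes are treated evenly as the data stream evolves.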