Journal information
The Journal of systems and software
Elsevier Science Inc.

0164-1212

Indexed in: SCI, ISTP, AHCI, EI
Status: officially published

    Cyber-physical systems with Human-in-the-Loop: A systematic review of socio-technical perspectives

    Torkil Clemmensen, Mahyar Tourchi Moghaddam, Jacob Norbjerg
    pp. 112348.1-112348.22
    Abstract: Understanding and designing Cyber-physical systems (CPS) with humans in the loop (HITL) is a basic cross-scientific research problem with large implications for industry. Current software engineering knowledge already explains how to include humans in the operation of machines in terms of interfaces, architectures, adaptive systems, and design methodologies for including the Human-in-the-Loop. This paper extends existing knowledge with a systematic review of socio-technical perspectives on CPS with HITL. The review was software engineering focused: it searched the body of research on CPS with HITL and, within that body, only those papers that included socio-technical perspectives. The results indicated four main areas in the socio-technical literature (ST literature). Validating these insights through interviews with industry CPS experts showed some alignment and also fundamental differences between the ST literature insights and the industry experts' viewpoints. The discussion identifies useful crossings between the ST literature and research into CPS with HITL adaptation, and touches on the issues of non-alignment in industry practice. The conclusion is that the ST perspectives on the body of knowledge on CPS with HITL have much to offer researchers in terms of innovative ways to look at the HITL, but the literature needs further development before industry experts can effectively use it. Future research possibilities are outlined.

    Poisoned source code detection in code models

    Ehab Ghannoum, Mohammad Ghafari
    pp. 112384.1-112384.12
    Abstract: Deep learning models have gained popularity for conducting various tasks involving source code. However, their black-box nature raises concerns about potential risks. One such risk is a poisoning attack, where an attacker intentionally contaminates the training set with malicious samples to mislead the model's predictions in specific scenarios. To protect source code models from poisoning attacks, we introduce CodeGarrison (CG), a hybrid deep-learning model that relies on code embeddings to identify poisoned code samples. We evaluated CG against the state-of-the-art technique ONION for detecting poisoned samples generated by DAMP, MHM, ALERT, as well as a novel poisoning technique named CodeFooler. Results showed that CG significantly outperformed ONION with an accuracy of 93.5%. We also tested CG's robustness against unknown attacks, achieving an average accuracy of 85.6% in identifying poisoned samples across the four attacks mentioned above.

    Behavioral decision-making and safety verification approaches for autonomous driving system in extreme scenarios

    Ying Zhao, Yi Zhu, Li Zhao, Junge Huang...
    pp. 112385.1-112385.12
    Abstract: Autonomous vehicles are crucial for improving traffic efficiency and reducing accidents, yet the complexity of driving scenarios and behavioral uncertainty pose challenges for decision-making. Recent research integrates virtual simulation with decision algorithms to enhance system intelligence and performance. Nonetheless, the potential hazards associated with extreme weather conditions are often overlooked. To mitigate this issue, this paper proposes a Bayesian network decision-making model based on hazard probability inference. The model enables the driver assistance system to take over the control of the vehicle in extreme scenarios and dynamically adjust decision strategies based on the potential hazard values under multivariate data. First, safety elements of sporadic hazardous scenarios are extracted using the Accidental and Catastrophic Automatic Driving Scenario Modeling Language and used as nodes to construct a Bayesian network for inferring potential driving hazards. Second, a Bayesian decision-making model is designed based on the semantic hierarchy of the autonomous driving system domain ontology, aiming to derive the optimal driving behavior for the current vehicle in extreme scenarios. The safety of these decisions is verified using the UPPAAL-SMC statistical model checker. Finally, the model's validity is confirmed through a real-world autonomous vehicle accident, with results indicating more rational decisions and improved safety performance.

    Improving the performance of software fault localization with effective coverage data reduction techniques

    Chih-Chiang Fang, Chin-Yu Huang, Shou-Yu Lee, Yao-Hsien Tseng...
    pp. 112388.1-112388.22
    Abstract: Fault localization (FL) techniques are widely used to identify the exact location of faulty statements in programs. Three common FL families are SBFL, MBFL, and deep learning-based FL. Before any FL method is run, coverage data is usually taken as the input to the FL stage. Coverage data therefore plays an important role in the FL field. Moreover, if coverage data can be reduced effectively, the performance of FL will be greatly improved. In past studies, filtering out fault-irrelevant statements based solely on failed test cases, traditional principal component analysis (PCA), and revised PCA techniques were applied to minimize coverage data. However, these approaches have a high chance of removing the actual faulty statement, especially in multiple fault localization (MFL). The root cause is that they do not reflect the actual status of each statement. In this paper, we propose two approaches to reduce the deletion of faulty statements. The first approach, called Revised PCA with Ensemble Weight Integration (RPCA-EWI), updates the contribution value of each statement based on revised PCA and incorporates the results of different combinations of failed and passed test cases. In the second approach, called Revised PCA with Important List Checking (RPCA-ILC), we establish a list of the top N% important statements by using the results of different test case combinations. If a deleted statement appears in this list, it is preserved in the reduced coverage data; otherwise, it is discarded. We selected three Linux open-source programs (Gzip, Grep, and Sed) with 4 fault injections to validate the correctness. Analyzed from various perspectives, the experimental results show a significant reduction in the execution time of the FL process and fewer removed faulty statements compared to PCA and the revised PCA methods.
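    As a rough illustration of why coverage data is the input to fault localization, the sketch below ranks statements from a small coverage matrix with the Ochiai formula, a widely used spectrum-based metric. This is an assumption chosen for illustration: the abstract does not say which SBFL formula the authors use, and the function name, matrix, and test outcomes here are hypothetical. It does not implement RPCA-EWI or RPCA-ILC.

```python
import math


def ochiai_scores(coverage, failed):
    """Return one Ochiai suspiciousness score per statement.

    coverage[t][s] is True when test t executes statement s.
    failed[t] is True when test t fails.
    """
    total_failed = sum(failed)
    n_statements = len(coverage[0]) if coverage else 0
    scores = []
    for s in range(n_statements):
        # ef / ep: failing / passing tests that execute statement s
        ef = sum(1 for t, row in enumerate(coverage) if row[s] and failed[t])
        ep = sum(1 for t, row in enumerate(coverage) if row[s] and not failed[t])
        denom = math.sqrt(total_failed * (ef + ep))
        scores.append(ef / denom if denom > 0 else 0.0)
    return scores


# Hypothetical data: 3 tests (rows) x 3 statements (columns); only test 0 fails.
coverage = [
    [True, True, False],
    [True, False, True],
    [True, True, True],
]
failed = [True, False, False]

scores = ochiai_scores(coverage, failed)
# Statement indices ordered from most to least suspicious.
ranking = sorted(range(len(scores)), key=lambda s: scores[s], reverse=True)
```

    Every column dropped from the matrix is a statement that can no longer be ranked, which is the risk of removing the faulty statement that the abstract describes.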

    KPAMA: A Kubernetes based tool for Mitigating ML system Aging

    Ding Wenjie, Liu Zhihao, Lu Xuhui, Du Xiaoting...
    pp. 112389.1-112389.13
    Abstract: As machine learning (ML) systems continue to evolve and be applied, their user base and system size also expand. This expansion is particularly evident with the widespread adoption of large language models. Currently, the infrastructure supporting ML systems, such as cloud services and computing hardware, is becoming foundational to the ML system environment and is increasingly adopted to support continuous training and inference services. Nevertheless, it has been shown that increased data volume, computational complexity, and extended run times challenge the stability, efficiency, and availability of ML systems, precipitating system aging. To address this issue, we develop a novel solution, KPAMA, leveraging Kubernetes, the leading container orchestration platform, to enhance the autoscaling of computing workflows and resources, effectively mitigating system aging. KPAMA employs a hybrid model to predict key aging metrics and uses decision and anti-oscillation algorithms to achieve system resource autoscaling. Our experiments indicate that KPAMA markedly mitigates system aging and enhances task reliability compared to the standard Horizontal Pod Autoscaler and systems without scaling capabilities.
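    The abstract does not describe KPAMA's decision and anti-oscillation algorithms, so the following is only a generic sketch of one common way to damp scaling oscillation: a hysteresis band combined with a cooldown period. The class name, thresholds, and sample predictions are all hypothetical and are not taken from the paper or from the Kubernetes API.

```python
class ScalingDecider:
    """Toy scaling decision with a hysteresis band and a cooldown (hypothetical)."""

    def __init__(self, upper=0.8, lower=0.4, cooldown_steps=3,
                 min_replicas=1, max_replicas=10):
        self.upper = upper                    # scale out above this predicted value
        self.lower = lower                    # scale in below this predicted value
        self.cooldown_steps = cooldown_steps  # steps to hold after any change
        self.min_replicas = min_replicas
        self.max_replicas = max_replicas
        # Start "cooled down" so the first call is allowed to act.
        self.steps_since_change = cooldown_steps

    def decide(self, predicted_metric, replicas):
        """Return the replica count to use for the next step."""
        self.steps_since_change += 1
        if self.steps_since_change <= self.cooldown_steps:
            return replicas  # still cooling down: ignore the prediction
        if predicted_metric > self.upper and replicas < self.max_replicas:
            self.steps_since_change = 0
            return replicas + 1
        if predicted_metric < self.lower and replicas > self.min_replicas:
            self.steps_since_change = 0
            return replicas - 1
        return replicas  # inside the hysteresis band: no change


decider = ScalingDecider()
replicas = 2
history = []
# Hypothetical predicted aging metric, one value per control step.
for predicted in [0.9, 0.3, 0.3, 0.3, 0.3]:
    replicas = decider.decide(predicted, replicas)
    history.append(replicas)
```

    With these sample values the decider scales out once, holds for three steps despite low predictions, and only then scales back in, so a brief dip right after a scale-out does not cause an immediate reversal.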

    NoSQL database education: A review of models, tools and teaching methods

    Nirnaya Tripathi
    pp. 112391.1-112391.17
    Abstract: NoSQL databases are essential for managing modern data-intensive applications. While SQL education is a crucial part of the software engineering and computer science curriculum, it is insufficient in addressing the rise of big data and cloud infrastructures. Despite extensive research on SQL education, there is limited exploration of NoSQL education, particularly in teaching methods and data models. This study addresses this gap by conducting a systematic literature review on NoSQL database education, aiming to assess current research, teaching practices, models, tools, scalability, and security mechanisms while offering a framework for integrating NoSQL into academic curricula. Out of 386 articles, 28 were selected for detailed analysis, focusing on NoSQL teaching methods, models, and curriculum development. Findings revealed that document-oriented and graph databases, especially MongoDB, Cassandra, and Neo4j, are the most taught. The project-based learning approach was the most common teaching method. Challenges identified include adapting to technological advancements, addressing diverse student needs, and the shift to online learning. This review contributes valuable insights into NoSQL education and offers recommendations for improving teaching practices in software engineering curricula.

    Pandemic pedagogy: Evaluating remote education strategies during COVID-19

    Daniel Russo
    pp. 112392.1-112392.15
    Abstract: The COVID-19 pandemic triggered an unprecedented transformation in the educational landscape, requiring universities to swiftly pivot from in-person to online instruction. This rapid transition left many educators navigating the complexities of remote teaching for the first time. Now that we have moved past the pandemic, we present a critical retrospective study to analyze and assess the remote teaching practices employed during this challenging period. By conducting a cross-sectional analysis of 300 computer science students who experienced a full year of online education during the lockdown, we discovered that while remote teaching practices had a moderate impact on learning outcomes, they significantly influenced student satisfaction. Importantly, these trends were not isolated; they reflect a shared experience across various demographics, including country, gender, and educational background. This research delivers vital evidence-based recommendations that can guide educational strategies in the event of future challenges. By applying these insights, we can enhance both student satisfaction and the effectiveness of learning in online settings, ensuring that we are better prepared for whatever lies ahead.

    A systematic mapping study of crowd knowledge enhanced software engineering research using Stack Overflow

    Minaoar Hossain Tanzil, Shaiful Chowdhury, Somayeh Modaberi, Gias Uddin...
    pp. 112405.1-112405.22
    Abstract: Developers continuously interact in crowd-sourced community-based question-answer (Q&A) sites. Reportedly, ~30% of all software professionals visit the most popular Q&A site, Stack Overflow (SO), every day. Software engineering (SE) research studies are also increasingly using SO data. To find out the trend, implication, impact, and future research potential utilizing SO data, a systematic mapping study needs to be conducted. Following a rigorous reproducible mapping study approach, from 18 reputed SE journals and conferences, we collected 384 SO-based research articles and categorized them into 10 facets (i.e., themes). We found that SO contributes to 85% of SE research compared with popular Q&A sites such as Quora and Reddit. We found that 18 SE domains directly benefited from SO data, whereas the Recommender Systems and the API Design and Evolution domains use SO data the most (15% and 16% of all SO-based research studies, respectively). The API Design and Evolution and the Machine Learning with/for SE domains show a consistent upward publication trend. The Deep Learning Bug Analysis and Code Cloning research areas have the highest potential research impact recently. With the insights, recommendations, and facet-based categorized paper list from this mapping study, SE researchers can find potential research areas according to their interest to utilize large-scale SO data.

    Soley: Automated detection of logic vulnerabilities in Ethereum smart contracts using large language models

    Majd Soud, Waltteri Nuutinen, Grischa Liebel
    pp. 112406.1-112406.22
    Abstract: Context: Modern blockchains, such as Ethereum, support the deployment and execution of so-called smart contracts, autonomous digital programs with significant value of cryptocurrency. Executing smart contracts requires gas costs paid by users, which define the limits of the contract's execution. Logic vulnerabilities in smart contracts can lead to excessive gas consumption and financial losses, and are often the root cause of high-impact cyberattacks. Objective: Our objective is threefold: (i) empirically investigate logic vulnerabilities in real-world smart contracts extracted from code changes on GitHub, (ii) introduce Soley, an automated method for detecting logic vulnerabilities in smart contracts, leveraging Large Language Models (LLMs), and (iii) examine mitigation strategies employed by smart contract developers to address these vulnerabilities in real-world scenarios. Method: We obtained smart contracts and related code changes from GitHub. To address the first and third objectives, we qualitatively investigated available logic vulnerabilities using an open coding method. We identified these vulnerabilities and their mitigation strategies. For the second objective, we extracted various logic vulnerabilities, focusing on those containing inline assembly fragments. We then applied preprocessing techniques and trained the proposed Soley model. We evaluated Soley along with the performance of various LLMs and compared the results with the state-of-the-art baseline on the task of logic vulnerability detection. Results: Our results include the curation of a large-scale dataset comprising 50,000 Ethereum smart contracts, with a total of 428,569 labeled instances of smart contract vulnerabilities, including 171,180 logic-related vulnerabilities. Our analysis uncovered nine novel logic vulnerabilities, which we used to extend existing taxonomies. Furthermore, we introduced several mitigation strategies extracted from observed developer modifications in real-world scenarios. Experimental results show that Soley outperforms existing approaches in automatically identifying logic vulnerabilities, achieving a 9% improvement in accuracy and a maximum improvement of 24% in F1-measure over the baseline. Interestingly, the efficacy of LLMs in this task was evident with minimal feature engineering. Despite the positive results, Soley struggles to identify certain classes of logic vulnerabilities, which remain for future work. Conclusion: Early identification of logic vulnerabilities from code changes can provide valuable insights into their detection and mitigation. Recent advancements, such as LLMs, show promise in detecting logic vulnerabilities and contributing to smart contract security and sustainability.

    Requirements extraction from model-based systems engineering: A systematic literature review

    Jefferson L. Santos, Luiz Eduardo G. Martins, Jefferson Seide Molleri
    pp. 112407.1-112407.19
    Abstract: Collaboration and easy data exchange are crucial in modern systems that involve hardware, electronics, software, and users. Requirement Engineering (RE) and Systems Engineering (SE) are challenging fields that require tool support to automate activities. Natural language (NL) requirement documents can create processing issues. To address these issues, detailed models have been developed to represent a system effectively. These models are intended to replace inconsistent documents over time by using model-based methodologies like Model-Based SE (MBSE). Within the MBSE methodologies, Arcadia/Capella has proven its capabilities as a comprehensive tool in the SE community to define and validate complex system architecture. Thus, this paper aims to investigate the tools, methods, techniques, or processes for extracting requirements from the MBSE environment or generating models from NL requirements. Furthermore, it discusses how these approaches are applied specifically in Arcadia/Capella and how requirement transformation is addressed to support textual requirements. We conducted a systematic literature review (SLR), selecting 97 articles to examine advances in this field across various aspects of these approaches. The results of this SLR uncovered several key findings that have important implications for future research, such as the dominance of model generation from NL, the finding that transforming model-based requirements to NL requires more data, and the fact that requirements extraction in Arcadia/Capella needs more evidence.