
Comparative analysis of text mining and clustering techniques for assessing functional dependency between manual test cases

Text mining techniques, particularly those leveraging machine learning for natural language processing, have gained significant attention for qualitative data analysis in software testing. However, their complexity and lack of transparency can pose challenges, especially in safety-critical domains where simpler, interpretable solutions are often preferred unless accuracy is heavily compromised. This study investigates the trade-offs between complexity, effort, accuracy, and utility in text mining and clustering techniques, focusing on their application for detecting functional dependencies among manual integration test cases in safety-critical systems. Using empirical data from an industrial testing project at ALSTOM Sweden, we evaluate various string distance methods, NCD compressors, and machine learning approaches. The results highlight the impact of preprocessing techniques, such as tokenization, and intrinsic factors, such as text length, on algorithm performance. Findings demonstrate how text mining and clustering can be optimized for safety-critical contexts, offering actionable insights for researchers and practitioners aiming to balance simplicity and effectiveness in their testing workflows.
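The NCD (normalized compression distance) methods mentioned above score two texts by how much better they compress together than apart: NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. The following is a minimal sketch of that idea, not the study's implementation. It assumes `zlib` as the compressor (the abstract does not name the compressors evaluated), and the three test case descriptions are hypothetical.

```python
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two strings.

    Values near 0 suggest the texts share much content; values
    near 1 suggest they share little.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical manual test case descriptions (not taken from the study).
brake_on = "Apply the service brake and verify that the brake pressure signal is active."
brake_off = "Release the service brake and verify that the brake pressure signal is inactive."
door = "Open the passenger door and check that the door indicator lamp turns green."

# Pairwise distances; a clustering step could group test cases on these.
print("brake_on vs brake_off:", round(ncd(brake_on, brake_off), 3))
print("brake_on vs door:     ", round(ncd(brake_on, door), 3))
```

On short texts such as these, fixed compressor overhead inflates the distances, which is one reason text length can affect how well such methods perform.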

Keywords: Artificial intelligence, Clustering, Natural language processing, Text mining, Software testing

Sahar Tahvili, Leo Hatvani, Michael Felderer, Francisco Gomes de Oliveira Neto, Wasif Afzal, Robert Feldt


Compute Platforms Engineering Unit, Ericsson AB, Stockholm, Sweden; Department of Industrial AI Systems, Maelardalen University, Vaesteras, Sweden

Department of Industrial AI Systems, Maelardalen University, Vaesteras, Sweden

Institute of Software Technology, German Aerospace Center, Cologne, Germany; Department of Mathematics and Computer Science, University of Cologne, Cologne, Germany

Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden; Department of Computer Science and Engineering, University of Gothenburg, Gothenburg, Sweden

Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden; Department of Computer Science and Engineering, Blekinge Institute of Technology, Karlskrona, Sweden


2025

Software quality journal


ISSN:0963-9314
Year, volume (issue): 2025, 33(2)
  • 80