PypiGuard: A novel meta-learning approach for enhanced malicious package detection in PyPI through static-dynamic feature fusion

Elsevier

Abstract: The increasing reliance on open-source software repositories, especially the Python Package Index (PyPI), has introduced serious security vulnerabilities as malicious actors embed malware into widely adopted packages, threatening the integrity of the software supply chain. Traditional detection methods, often based on static analysis, struggle to capture the complex and obfuscated behaviors characteristic of modern malware. To address these limitations, we present PypiGuard, a hybrid ensemble meta-model for malicious package detection that integrates both static metadata and dynamic Application Programming Interface (API) call behaviors, improving detection accuracy and reducing error rates. Using the MalwareBench dataset, our approach applies a preprocessing pipeline that fuses metadata features with categorized API behaviors. The PypiGuard model employs a hybrid ensemble composed of Random Forest (RF), Gradient Boosting (GB), Decision Tree (DT), K-Nearest Neighbors (KNN), LightGBM, and an Artificial Neural Network (ANN), assembled through a dynamically optimized stacking-based meta-learning framework that adapts to model-specific prediction strengths. Compared to Deep Learning (DL) baselines such as Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) models, PypiGuard achieves significant improvements in accuracy and False Positive Rate (FPR), with a detection accuracy of 98.43% and a markedly low FPR, confirming its effectiveness in identifying malicious packages.

Keywords:

Malicious package detection; Hybrid ensemble meta-model; Static and dynamic feature fusion; Python package security (PyPI); Software supply chain security; CHALLENGES

Authors:

Iqbal, Tahir; Wu, Guowei; Iqbal, Zahid; Mahmood, Muhammad Bilal; Shafique, Amreen; Guo, Wenbo

Author affiliations:

Dalian University of Technology School of Software Technology

Northeastern University Software College

Nanyang Technological University

Publication year:

2025

DOI:

10.1016/j.jisa.2025.104032

Journal of Information Security and Applications

SCI

ISSN: 2214-2126

Year, volume (issue): 2025, 90 (May)

Number of references: 56