
Multi-label sampling based on local label imbalance

Class imbalance is an inherent characteristic of multi-label data that hinders most multi-label learning methods. One efficient and flexible strategy to deal with this problem is to employ sampling techniques before training a multi-label learning model. Although existing multi-label sampling approaches alleviate the global imbalance of multi-label datasets, it is actually the imbalance level within the local neighbourhood of minority class examples that plays a key role in performance degradation. To address this issue, we propose a novel measure to assess the local label imbalance of multi-label datasets, as well as two multi-label sampling approaches, namely Multi-Label Synthetic Oversampling based on Local label imbalance (MLSOL) and Multi-Label Undersampling based on Local label imbalance (MLUL). By considering all informative labels, MLSOL creates more diverse and better labeled synthetic instances for difficult examples, while MLUL eliminates instances that are harmful to their local region. Experimental results on 13 multi-label datasets demonstrate the effectiveness of the proposed measure and sampling approaches for a variety of evaluation metrics, particularly in the case of an ensemble of classifiers trained on repeated samples of the original data. © 2021 Elsevier Ltd. All rights reserved.
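The abstract does not give the formula for the proposed measure, so the following is only a minimal sketch of one plausible reading: for each instance and each label, take the fraction of the k nearest neighbours whose value for that label differs from the instance's own. The function name `local_label_imbalance`, the Euclidean distance, and the default `k` are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def local_label_imbalance(X, Y, k=3):
    """Assumed definition (not from the abstract): for every instance i and
    label j, the fraction of i's k nearest neighbours whose value for
    label j differs from Y[i, j]."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=int)
    # Pairwise Euclidean distances; the diagonal is set to inf so that an
    # instance is never counted as its own neighbour.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)
    # Indices of the k nearest neighbours of each instance, shape (n, k).
    nn = np.argsort(dist, axis=1)[:, :k]
    # Y[nn] has shape (n, k, q); compare with each instance's own labels.
    disagree = Y[nn] != Y[:, None, :]
    # Average over the k neighbours, giving an (n, q) matrix in [0, 1].
    return disagree.mean(axis=1)
```

Under this assumed definition, a value near 1 for a minority-label instance means it is surrounded by instances of the opposite class, which is the kind of locally difficult example the abstract says the sampling approaches focus on.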

Multi-label learning; Class imbalance; Oversampling and undersampling; Local label imbalance; Ensemble methods; CLASSIFICATION

Liu, Bin; Blekas, Konstantinos; Tsoumakas, Grigorios


Aristotle Univ Thessaloniki

Univ Ioannina

2022

Pattern Recognition


EI; SCI
ISSN:0031-3203
Year, volume (issue): 2022, 122