Ensemble learning strategy for birdsong recognition under data imbalance

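Chinese abstract: Birdsong recognition is an important application of passive acoustic monitoring. Ensemble learning methods are valuable for improving bird recognition accuracy, but effective ensemble strategies are lacking when the data are imbalanced. To address this, the base learners are adapted through transfer learning so that they capture different aspects of the bird sound signal, which makes training feasible when few labeled samples are available. In addition, feature fusion with a self-attention mechanism and a sensitive regularization term are designed to increase the model's attention to rare birds, so that the ensemble model reaches a globally optimal solution during inference under information asymmetry. Samples of 10 bird species were collected in Laoshan Forest Park, Nanjing, and the pre-trained models were fine-tuned on them. In birdsong classification experiments, the overall classification accuracy reached 95.29% on the imbalanced self-built dataset and 90.17% on the BirdCLEF 2023 dataset. The proposed ensemble learning strategy improves sensitivity to classes with few samples and enhances the model's generalization ability and training efficiency. Compared with mainstream ensemble learning methods, it is better suited to passive acoustic monitoring and recognition of locally rare birds, supporting the precise conservation of bird habitats.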
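
The abstract does not give the form of the sensitive regularization term. As a minimal sketch of the general idea (weighting the loss by how rare each species is), the Python below uses inverse-frequency class weights inside a cross-entropy loss. The functions `rarity_weights` and `weighted_cross_entropy` and the inverse-frequency formula are assumptions made for illustration, not the authors' definition of the rarity coefficient.

```python
import math
from collections import Counter


def rarity_weights(labels):
    """Inverse-frequency class weights (an assumed stand-in for a rarity coefficient).

    weight[c] = N / (K * count[c]), where N is the number of samples and K the
    number of classes, so the weights average to 1 over the samples.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Class-weighted cross-entropy, normalized by the total weight.

    probs  : list of dicts mapping class name -> predicted probability
    labels : list of true class names
    weights: dict mapping class name -> weight
    """
    num = 0.0
    den = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        num += -w * math.log(max(p[y], 1e-12))  # clamp to avoid log(0)
        den += w
    return num / den


# Toy imbalanced label set: 8 recordings of a common species, 2 of a rare one.
labels = ["common"] * 8 + ["rare"] * 2
weights = rarity_weights(labels)


def loss_with_one_mistake(mistaken_index):
    """Loss when every sample is predicted well except the one at mistaken_index."""
    probs = []
    for i, y in enumerate(labels):
        p_true = 0.1 if i == mistaken_index else 0.9
        other = "rare" if y == "common" else "common"
        probs.append({y: p_true, other: 1.0 - p_true})
    return weighted_cross_entropy(probs, labels, weights)


loss_common_mistake = loss_with_one_mistake(0)  # a common-species sample is missed
loss_rare_mistake = loss_with_one_mistake(9)    # a rare-species sample is missed
```

With these weights, missing a rare-species recording raises the loss more than missing a common one, which is the behavior a cost-sensitive term is meant to produce.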

English abstract: Aim & Summary: The dynamics and distribution changes of bird populations are essential components of ecosystems and critical for maintaining ecological balance. Recently, the rapid development of acoustic monitoring technologies has made passive acoustic bird recognition an efficient and non-invasive method for bird monitoring. However, the collection and annotation of bird sound data face numerous challenges in practical applications, particularly data imbalance and sample scarcity, which severely limit improvements in recognition accuracy. We focus on the application of ensemble learning methods to bird recognition in order to identify rare bird species under data imbalance while enhancing the generalization ability and training efficiency of the model. Our study designs a cost-sensitive ensemble learning strategy to overcome the limitations posed by imbalanced and scarce bird sound data, thereby improving the recognition accuracy of rare bird species. By integrating transfer learning, self-attention mechanisms, and sensitive regularization terms, we construct an efficient and accurate passive acoustic bird recognition system that supports the precise conservation of avian environments.

Methods: To achieve these objectives, we propose an improved cost-sensitive stacking ensemble learning strategy (cost-sensitive stacking ensemble for bird sound recognition, CSE-BSR). The specific methods include: (1) preprocessing the collected bird sound data, including noise reduction, feature extraction, and spectrogram analysis, to improve model performance and reduce training time; (2) selecting deep learning models pre-trained on large bird sound datasets as base learners and fine-tuning them through transfer learning to better adapt to new recognition tasks; (3) designing a feature fusion method based on self-attention mechanisms to effectively integrate the homogeneous yet heterogeneous features output by the base learners, enhancing feature representation and model generalization; (4) performing recognition and classification by incorporating sensitive regularization terms into the loss function of the ensemble model and dynamically adjusting weights according to the rarity coefficients of bird species, so that the model obtains a globally optimal solution during inference.

Results: To verify the effectiveness of the proposed method, we constructed a proprietary dataset using samples from ten bird species in Laoshan Forest Park, Nanjing. Additionally, experiments were conducted on the publicly available BirdCLEF 2023 dataset. Experimental results show that the proposed method achieved overall classification accuracies of 95.29% and 90.17% on the imbalanced proprietary dataset and the BirdCLEF 2023 dataset, respectively, significantly outperforming mainstream ensemble learning methods. Specifically, the proposed method exhibited higher sensitivity and generalization capability in recognizing rare bird species.

Conclusion: We address the issues of data imbalance and sample scarcity in bird sound recognition by proposing a cost-sensitive ensemble learning strategy. Recognition accuracy and generalization ability for rare bird species are enhanced through transfer learning, self-attention mechanisms, and sensitive regularization terms. The proposed approach demonstrates superior performance and scalability in practical applications compared to mainstream ensemble learning methods. However, despite the strong recognition performance, the training and inference processes remain time-consuming and resource-intensive. Future research will focus on optimizing model structures, reducing computational costs, and enhancing model interpretability to better serve the precise conservation of avian environments.
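
Step (3) of the methods, self-attention fusion of base-learner features, can be illustrated with a small sketch. The abstract does not specify the architecture, so the Python below makes simplifying assumptions: each base learner emits one feature vector of the same length, queries, keys, and values are the raw vectors (no learned projection matrices), and the attended vectors are mean-pooled into a single fused feature. The name `self_attention_fuse` is hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention_fuse(features):
    """Fuse per-learner feature vectors with scaled dot-product self-attention.

    features: list of M vectors of equal length d, one per base learner.
    Returns (fused, attn): the mean-pooled attended vector of length d and the
    M x M attention matrix, whose rows each sum to 1.
    Assumption: identity projections are used in place of learned Q/K/V weights.
    """
    d = len(features[0])
    if any(len(f) != d for f in features):
        raise ValueError("all base-learner features must have the same length")
    scale = math.sqrt(d)
    # Row i holds the attention of learner i over every learner's features.
    attn = [softmax([dot(q, k) / scale for k in features]) for q in features]
    # Each attended vector is an attention-weighted sum of the value vectors.
    attended = [
        [sum(a * v[j] for a, v in zip(row, features)) for j in range(d)]
        for row in attn
    ]
    m = len(features)
    fused = [sum(vec[j] for vec in attended) / m for j in range(d)]
    return fused, attn


# Three hypothetical base learners: two agree, one emphasizes another dimension.
example_features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
fused, attn = self_attention_fuse(example_features)
```

In a trained model the projections would be learned and the fused vector would feed the stacking meta-classifier; the sketch only shows how attention weights let agreeing learners reinforce one another.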

Keywords:

birdsong recognition; data imbalance; ensemble learning; transfer learning; sensitive cost

Authors:

Shen Xiaohu (申小虎), Li Guanyu (李冠宇), Shi Hongfei (史洪飞), Wang Chuanzhi (王传之)

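Author affiliations:

Department of Criminal Science and Technology, Jiangsu Police Institute, Nanjing 210031

Key Laboratory of State Forestry and Grassland Administration on Wildlife Evidence Technology, Nanjing 210023

School of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning 116026

iFLYTEK (科大讯飞科技有限公司), Hefei 230088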

Publication year:

2024

DOI:

10.17520/biods.2024215

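Journal: Biodiversity Science (生物多样性)

Sponsors: Biodiversity Committee of the Chinese Academy of Sciences; Botanical Society of China; Institute of Botany, Chinese Academy of Sciences; Institute of Zoology, Chinese Academy of Sciences; Institute of Microbiology, Chinese Academy of Sciences

Indexing: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 1.274

ISSN: 1005-0094

Year, volume (issue): 2024, 32(10)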