Pig state audio recognition based on underdetermined blind source separation and deep learning

[Objective] To address the difficulty of separating and recognizing pig audio in group-rearing environments, a pig state audio recognition method based on underdetermined blind source separation and ECA-EfficientNetV2 is proposed. [Method] Four types of pig audio signals in a simulated group-rearing environment were used as the observation signals. After the signals were sparsely represented, the mixing matrix was estimated by hierarchical clustering, and an lp-norm reconstruction algorithm was used to solve the lp-norm minimization problem and reconstruct the pig audio signals. The reconstructed signals were converted into spectrograms and divided into four classes: eating, roaring, humming and estrus sounds. The ECA-EfficientNetV2 network model was then used to recognize the audio and obtain the state of the pigs. [Result] The lowest normalized mean square error of the mixing matrix estimation was 3.266×10⁻⁴, and the signal-to-noise ratio of the separated and reconstructed audio ranged from 3.254 to 4.267 dB. ECA-EfficientNetV2 recognized the spectrograms with an accuracy of 98.35%, which was 2.88 and 1.81 percentage points higher than the classical convolutional neural networks ResNet50 and VGG16, respectively. Compared with the original EfficientNetV2, the accuracy decreased by 0.52 percentage points, but the number of model parameters was reduced by 33.56%, the floating-point operations (FLOPs) by 1.86 G, and the inference time by 9.40 ms. [Conclusion] The method based on blind source separation and an improved EfficientNetV2 separates and recognizes the audio signals of group-reared pigs in a lightweight and efficient manner.
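The mixing-matrix step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that are not in the source: two observation channels, three sources, idealized sparsity (exactly one active source per sample), centroid-linkage agglomerative clustering on the direction angle of each observation, and NMSE defined as the ratio of squared Frobenius norms. The names `estimate_mixing_matrix` and `nmse` are hypothetical, and the lp-norm reconstruction and the classifier are not shown.

```python
import math
import random


def estimate_mixing_matrix(observations, n_sources):
    """Estimate unit-norm mixing columns from 2-channel sparse observations."""
    # Fold each observation onto a direction angle in [0, pi): x and -x
    # lie along the same mixing column.
    angles = [math.atan2(x2, x1) % math.pi for x1, x2 in observations]
    clusters = [[a] for a in angles]  # start with one cluster per sample
    while len(clusters) > n_sources:  # agglomerative merging
        means = [sum(c) / len(c) for c in clusters]
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = abs(means[i] - means[j])  # centroid linkage (assumption)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))
    centers = sorted(sum(c) / len(c) for c in clusters)
    # Two rows (channels), one column per estimated source direction.
    return [[math.cos(t) for t in centers], [math.sin(t) for t in centers]]


def nmse(a_true, a_est):
    """Normalized mean square error, assumed here as ||A - A_hat||^2 / ||A||^2."""
    num = sum((t - e) ** 2
              for row_t, row_e in zip(a_true, a_est)
              for t, e in zip(row_t, row_e))
    den = sum(t ** 2 for row in a_true for t in row)
    return num / den


# Synthetic demo: 3 sources observed on 2 channels (underdetermined case).
random.seed(0)
true_angles = [math.radians(d) for d in (20, 75, 130)]  # illustrative values
A = [[math.cos(t) for t in true_angles], [math.sin(t) for t in true_angles]]

X = []
for t in range(60):
    k = t % 3  # exactly one active source per sample (idealized sparsity)
    s = random.choice([-1, 1]) * random.uniform(0.5, 2.0)
    X.append((A[0][k] * s + random.gauss(0, 0.01),
              A[1][k] * s + random.gauss(0, 0.01)))

A_hat = estimate_mixing_matrix(X, 3)
error = nmse(A, A_hat)
print(f"NMSE = {error:.2e}")
```

Sorting the cluster centres by angle only resolves the column-order ambiguity because the true columns in this demo are also sorted by angle. In general, blind source separation recovers the columns up to permutation and scaling, so they must be matched before an error is computed.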

Keywords: Pig; Blind source separation; Spectrogram; Audio recognition; Sparse reconstruction; Convolutional neural network

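Authors: Pan Weihao (潘伟豪), Sheng Huizi (盛卉子), Wang Chunyu (王春宇), Yan Shunpi (闫顺丕), Zhou Xiaobo (周小波), Gu Lichuan (辜丽川), Jiao Jun (焦俊)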

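School of Information and Artificial Intelligence, Anhui Agricultural University, Hefei 230036, Anhui, China

Anhui Xilejia Biotechnology Co., Ltd. (安徽喜乐佳生物科技有限公司), Bozhou 233500, Anhui, China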

Funding: Anhui Provincial Key Research and Development Program (2023n06020051, 202103B06020013); Anhui Provincial Graduate Quality Engineering Project (2022lhpysfjd023, 2022cxcyjs010)

2024

Journal of South China Agricultural University (华南农业大学学报)
South China Agricultural University

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.837
ISSN: 1001-411X
Year, volume (issue): 2024, 45(5)