A Review of Deep Learning-Based Polyphonic Sound Event Detection
张珑 1, 张恒远 1, 魏育华 2, 杨烁祯 1
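Author information
- 1. College of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China
- 2. School of Computer Information Engineering, Guangzhou Huali Science and Technology Vocational College, Guangzhou 511325, China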
Abstract
Polyphonic sound event detection is currently one of the research hotspots in speech processing. This paper reviews recent deep learning-based models for polyphonic sound event detection. First, 4 supervised learning models and 13 weakly supervised learning models are introduced; the weakly supervised models comprise mean-teacher-based, attention-based, source-separation-based, and self-training-based models, as well as other models, and the characteristics, structure, and performance of each model are analyzed. Next, the datasets and evaluation metrics used by these models are briefly described. Finally, future research directions in the field are discussed.
Keywords
deep learning/polyphonic sound event detection/weakly supervised learning/semi-supervised learning
Publication year
2024