
Sound event localization and detection based on deep learning

Acoustic source localization (ASL) and sound event detection (SED) are two widely pursued, independent research fields. In recent years, sound event localization and detection (SELD) has become a very active research topic, motivated by the goal of a more complete spatial and temporal representation of the sound field. This paper presents a deep learning-based algorithm for localizing and detecting multiple overlapping sound events in three-dimensional space. Log-Mel spectrum and generalized cross-correlation spectrum features are concatenated along the channel dimension as the network input. A trained neural network then performs classification and regression on these features in parallel to obtain the sound recognition and localization results, respectively. A channel attention mechanism is also introduced in the network to selectively enhance features that carry essential information and suppress uninformative ones. Finally, a thorough comparison confirms the efficiency and effectiveness of the proposed SELD algorithm. Field experiments show that the proposed algorithm is robust to reverberation and to the environment, and achieves higher recognition and localization accuracy than the baseline method.
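The input features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the PHAT weighting, the frame length, the number of lags, the four-microphone array, and the helper names `gcc_phat` and `stack_features` are all assumptions made for illustration. The sketch computes a generalized cross-correlation between two microphone frames and then concatenates placeholder log-Mel and GCC maps along the channel axis.

```python
import numpy as np

def gcc_phat(x, y, n_lags):
    """GCC with PHAT weighting (an assumed weighting) for two equal-length frames.

    Returns n_lags values; index n_lags // 2 corresponds to lag 0, and a
    positive lag means that y is delayed relative to x.
    """
    n = 2 * len(x)                      # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = np.conj(X) * Y              # cross-power spectrum
    cross /= np.abs(cross) + 1e-12      # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)
    half = n_lags // 2
    # Reorder so that negative lags come first and lag 0 sits at index `half`.
    return np.concatenate((cc[-half:], cc[:n_lags - half]))

def stack_features(logmel, gcc):
    """Concatenate (channels, time, bins) feature maps along the channel axis."""
    return np.concatenate((logmel, gcc), axis=0)

# Delay check: y is x delayed by 5 samples (hypothetical test signal).
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
delay = 5
y = np.concatenate((np.zeros(delay), x[:-delay]))
n_lags = 64
est = int(np.argmax(gcc_phat(x, y, n_lags))) - n_lags // 2

# Channel stacking with placeholder maps: 4 microphones give 4 log-Mel maps
# and 6 microphone pairs; 8 frames and 64 bins are arbitrary sizes.
logmel = np.zeros((4, 8, 64))
gcc = np.zeros((6, 8, 64))
feats = stack_features(logmel, gcc)
```

Choosing the number of GCC lags equal to the number of Mel bins is what lets the two feature types share one tensor; whether the paper makes the same choice is not stated in the abstract.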

sound event localization and detection (SELD); deep learning; convolutional recursive neural network (CRNN); channel attention mechanism

ZHAO Dada, DING Kai, QI Xiaogang, CHEN Yu, FENG Hailin


School of Mathematics and Statistics, Xidian University, Xi'an 710071, China

Science and Technology on Near-Surface Detection Laboratory, Wuxi 214035, China

National Natural Science Foundation of China (61877067); Foundation of Science and Technology on Near-Surface Detection Laboratory (TCGZ2019A002, TCGZ2021C003, 6142414200511); Natural Science Foundation of Shaanxi Province (2021JZ-19)

2024

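Journal of Systems Engineering and Electronics
Defense Technology Academy of China Aerospace Science and Industry Corporation; Chinese Society of Astronautics; Systems Engineering Society of China; China Simulation Federation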


CSTPCD
Impact factor: 0.64
ISSN: 1004-4132
Year, volume (issue): 2024, 35(2)