Deep Scattering Spectra with Deep Neural Networks for Acoustic Scene Classification Tasks

As one of the most commonly used features, Mel-frequency cepstral coefficients (MFCCs) are less discriminative at high frequencies. A novel technique, known as the deep scattering spectrum (DSS), addresses this issue and aims to preserve more detail. DSS features have shown promise in both classification and recognition tasks. In this paper, we extend the use of DSS features to the acoustic scene classification task. Results on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 and 2017 show that DSS provides 4.8% and 17.4% relative improvements in accuracy, respectively, over MFCC features within a state-of-the-art time-delay neural network framework.

Acoustic scene classification; Time-delay neural network; Deep scattering spectrum; Detection and classification of acoustic scenes and events (DCASE)

ZHANG Pengyuan, CHEN Hangting, BAI Haichuan, YUAN Qingsheng

Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China

University of Chinese Academy of Sciences, Beijing 100049, China

National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China

This work is supported by the National Natural Science Foundation of China (No. 11590774, No. 11590770), the Key Science and Technology Project of the Xinjiang Uygur Autonomous Region (No. 2016A03007-1), and the Pre-research Project for Equipment of General Information System (No. JZX2017-0994/Y306).

2019

Chinese Journal of Electronics

CSTPCD, CSCD, SCI, EI
ISSN:1022-4653
Year, Volume (Issue): 2019, 28(6)