Deep Scattering Spectra with Deep Neural Networks for Acoustic Scene Classification Tasks

Abstract: As one of the most commonly used features, Mel-frequency cepstral coefficients (MFCCs) are less discriminative at high frequencies. A novel technique, known as the Deep scattering spectrum (DSS), addresses this issue and aims to preserve greater detail. The DSS feature has shown promise in both classification and recognition tasks. In this paper, we extend the use of the DSS feature to the acoustic scene classification task. Results on Detection and classification of acoustic scenes and events (DCASE) 2016 and 2017 show that DSS provided 4.8% and 17.4% relative improvements in accuracy, respectively, over MFCC features within a state-of-the-art time-delay neural network framework.

Keywords:

Acoustic scene classification; Time-delay neural network; Deep scattering spectrum; Detection and classification of acoustic scenes and events (DCASE)

Authors:

ZHANG Pengyuan, CHEN Hangting, BAI Haichuan, YUAN Qingsheng

Author affiliations:

Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China

University of Chinese Academy of Sciences, Beijing 100049, China

National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029, China

Funding:

This work is supported by the National Natural Science Foundation of China, the Key Science and Technology Project of the Xinjiang Uygur Autonomous Region, and the Pre-research Project for Equipment of General Information System.

Project numbers:

11590774; No.11590770; 2016A03007-1; JZX2017-0994/Y306

Publication year:

2019

DOI:

10.1049/cje.2019.07.006

Chinese Journal of Electronics

Indexed in: CSTPCD, CSCD, SCI, EI

ISSN: 1022-4653

Year, volume (issue): 2019, 28(6)

Number of references: 12