A Soundscape Keynote Sound Classification Method Based on Second-Order Difference MFCC and Deep Learning

Deng Zhiyong (邓志勇)¹, Zhang Wanyi (张万亿)², Liu Aili (刘爱利)³

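Author information

1. College of Music, Capital Normal University, Beijing 100048
2. Department of Music Artificial Intelligence and Music Information Technology, Central Conservatory of Music, Beijing 100031
3. College of Resource Environment and Tourism, Capital Normal University, Beijing 100048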

Abstract

This paper proposes a second-order difference MFCC feature for use with convolutional neural network (CNN) classifiers, as an attempt to address the binary classification of keynote versus non-keynote sounds in soundscape studies, a subjective task with a "humanistic" character. Using a soundscape sample dataset from the central axis of Old Beijing and the network architecture designed in this paper, a binary classifier trained on the second-order difference MFCC feature achieves a keynote-sound recognition accuracy of 80.23%. This is far higher than the accuracy obtained with the RMS or Mel-spectrogram features alone, or with the RMS and second-order difference MFCC features combined.
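As an illustration of the feature named in the abstract, the following is a minimal sketch of how a second-order difference (delta-delta) can be computed from an MFCC matrix. It assumes the widely used regression-style delta formula with a window of two frames and edge padding. The abstract does not state which difference formula, window size, or number of coefficients the authors used, so the function names `delta` and `second_order_delta`, the `width` parameter, and the 13 x 20 input shape are illustrative assumptions only. The input here is synthetic; in practice the MFCC matrix would come from an audio feature extractor.

```python
import numpy as np


def delta(features, width=2):
    """Regression-style first-order difference along the time (last) axis.

    Assumed formula (not taken from the paper):
        d[t] = sum_{n=1..width} n * (c[t+n] - c[t-n]) / (2 * sum_{n=1..width} n^2)
    Frames beyond the edges are filled by repeating the first/last frame.
    """
    features = np.asarray(features, dtype=float)
    n_frames = features.shape[-1]
    # Pad only the time axis so that t-n and t+n always exist.
    pad_spec = [(0, 0)] * (features.ndim - 1) + [(width, width)]
    padded = np.pad(features, pad_spec, mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(features)
    for n in range(1, width + 1):
        ahead = padded[..., width + n: width + n + n_frames]
        behind = padded[..., width - n: width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom


def second_order_delta(mfcc, width=2):
    """Second-order difference: the delta operator applied twice."""
    return delta(delta(mfcc, width), width)


if __name__ == "__main__":
    # Synthetic stand-in for an MFCC matrix: 13 coefficients x 20 frames.
    rng = np.random.default_rng(0)
    fake_mfcc = rng.normal(size=(13, 20))
    dd = second_order_delta(fake_mfcc)
    print(dd.shape)  # same shape as the input matrix
```

The output has the same shape as the input, so it can be passed to a CNN as a two-dimensional map in the same way as the original MFCC matrix. Whether the paper feeds the network the second-order difference alone or stacked with other channels is not specified in the abstract.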

Keywords

soundscape / keynote sound / convolutional neural network / second-order difference MFCC

Funding

Beijing Social Science Fund, Key Project (22GLA014)

National Natural Science Foundation of China, General Program (41871130)

Publication year

2023

Journal of Communication University of China (Science and Technology)

Communication University of China

CHSSCD

Impact factor: 0.514

ISSN：1673-4793

Citations: 1

References: 16