Bi-directional long short-term memory network-based synthetic voice detection method

Original Chinese title: 利用双向长短时记忆网络的合成语音检测方法

Wanfang Data

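Abstract (translated from the Chinese): The rapid development of artificial intelligence has brought synthetic speech technology into wide use, and with it security problems such as identity forgery and fraud. This paper proposes an improved synthetic speech detection method based on deep learning that uses a bidirectional long short-term memory network (BiLSTM). Mel-frequency cepstral coefficient (MFCC) features are extracted and fed into a hybrid CNN-BiLSTM model, which combines the feature extraction of the CNN with the sequence processing of the BiLSTM to learn the differences between natural and synthetic speech, improving detection accuracy and robustness. Experiments on the ASVspoof 2019 and 2021 datasets show an equal error rate of about 5%, and the method outperforms some existing methods in detection accuracy and robustness.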

English abstract: The rapid development of artificial intelligence technology has brought about the wide application of synthetic speech technology, but it has also caused security problems such as identity forgery and fraud. In this paper, we propose an improved synthetic speech detection method using deep learning and a bidirectional long short-term memory network (BiLSTM). Mel-frequency cepstral coefficient (MFCC) features are extracted and fed into a hybrid CNN-BiLSTM model; the method uses the feature extraction of the CNN and the sequence processing capability of the BiLSTM to learn the differences between natural and synthetic speech, improving detection accuracy and robustness. Experiments on the ASVspoof 2019 and 2021 datasets show that the method has an equal error rate of about 5% and is superior to some existing techniques in detection accuracy and robustness.
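The equal error rate (EER) quoted in the abstract is the operating point at which the false acceptance rate and the false rejection rate coincide. This record does not describe how the authors computed it, so the following is only a minimal sketch under stated assumptions: the function name `equal_error_rate` is hypothetical, a higher score is assumed to mean "more likely natural (bona fide) speech", and the toy scores are made up purely for illustration and are not results from the paper.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by scanning every observed score as a threshold.

    Assumption: higher score = more likely bona fide (natural) speech.
    Returns (eer, threshold) at the threshold where FAR and FRR are closest.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # False rejection: natural speech scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: synthetic speech scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy, made-up scores (not data from the paper).
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(eer, threshold)  # 0.25 0.6
```

With a finite set of scores the two error rates rarely cross exactly, which is why the sketch reports the midpoint at the closest threshold; evaluation toolkits typically interpolate instead.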

English keywords:

synthetic voice detection; bidirectional long short-term memory network; deep learning

Authors:

苏卓艺 (Su Zhuoyi), 陈园允 (Chen Yuanyun)


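Affiliations:

School of Electronic Information Engineering, Guangzhou City Polytechnic, Guangzhou 510000

School of Artificial Intelligence, Guangdong Engineering Polytechnic, Guangzhou 510000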

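Keywords (translated from the Chinese):

synthetic speech detection; bidirectional long short-term memory network; deep learning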

Year of publication:

2024

DOI:

10.3969/j.issn.1007-1423.2024.24.009

Journal: Modern Computer (现代计算机)

中大控股


Impact factor: 0.292

ISSN: 1007-1423

Year, volume (issue): 2024, 30(24)