Parkinson's Disease Detection Method Based on Cross-Language Acoustic Analysis

Speech-based Parkinson's disease detection is non-invasive, non-intrusive, and low-cost. Most publicly available Parkinson's disease speech datasets come from a single language, so they tend to be small and to show little variation in the pronunciation characteristics of the subjects' native languages. A detection model trained on a single-language dataset suffers performance degradation when applied to cross-language speech data. To avoid the impact of language differences and improve detection performance in cross-language scenarios, this paper introduces the ideas of adversarial transfer learning and feature decoupling and proposes a Parkinson's disease Cross-Language Speech Analysis Model (CLSAM). First, a Transformer encoder block based on multi-head self-attention is cascaded with a multi-layer neural network to form a feature extractor module, which preliminarily decouples the raw Fbank speech features extracted from source-domain and target-domain speech into two vectors: a domain-invariant pathological information representation vector and a domain information representation vector. Second, a dual adversarial training module with inconsistent target tasks is designed to explicitly separate domain-invariant pathological information from domain information. Finally, the domain-invariant pathological information extracted from cross-language speech data is used for Parkinson's disease detection. The effectiveness of the proposed method is verified with ten-fold cross-validation on the publicly available MaxLittle Parkinson's disease speech dataset and a self-collected Parkinson's disease speech dataset. Experimental results show that, compared with traditional machine learning methods and existing transfer learning algorithms, the proposed model markedly improves accuracy, sensitivity, and F1 score in cross-language scenarios.
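The evaluation protocol named in the abstract (ten-fold cross-validation scored by accuracy, sensitivity, and F1) can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `k_fold_indices`, `binary_metrics`, and `cross_validate`, and the `fit`/`predict` callables, are hypothetical, and label 1 is assumed to mean Parkinson's disease. The abstract does not say whether folds are split by subject or by recording; this sketch splits by sample.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on class 1), and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity, "f1": f1}


def cross_validate(features, labels, fit, predict, k=10, seed=0):
    """Average the three metrics over k held-out folds.

    fit(train_x, train_y) returns a model; predict(model, test_x) returns a
    list of 0/1 labels. Both are placeholders for any classifier.
    """
    totals = {"accuracy": 0.0, "sensitivity": 0.0, "f1": 0.0}
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        preds = predict(model, [features[i] for i in held_out])
        scores = binary_metrics([labels[i] for i in held_out], preds)
        for name in totals:
            totals[name] += scores[name] / k
    return totals
```

Sensitivity is reported alongside accuracy because, in a screening setting, a missed patient (false negative) is usually the costlier error.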

Cross-language speech analysis; Parkinson's disease; Adversarial transfer learning; Feature decoupling

Ji Wei, Wang Chuanyu, Wu Di, Li Yun, Zheng Huifen

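School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003

School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023

Geriatric Hospital Affiliated to Nanjing Medical University, Nanjing 210024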

Major Project of Basic Science (Natural Science) Research in Higher Education Institutions of Jiangsu Province

21KJA520003

2024

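Journal of Electronics & Information Technology
Institute of Electronics, Chinese Academy of Sciences; Department of Information Sciences, National Natural Science Foundation of China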

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.302
ISSN:1009-5896
Year, volume (issue): 2024, 46(2)