Parkinson's Disease Detection Method Based on Time-frequency Feature Fusion of Speech Signals
Dysphonia is one of the earliest symptoms of Parkinson's disease (PD). In recent years, many studies on speech-based PD detection have combined deep neural network models with Mel-scale features. However, existing models cannot adequately capture the global time-series information of speech signals, and Mel-scale features have limited ability to characterize the pathological information of PD accurately. To address these problems, a speech-based PD detection method built on time-frequency feature fusion is proposed. First, Mel-frequency cepstral coefficients (MFCCs) are extracted from the speech signals and used as the input to the subsequent models. Then, the encoder module of the Conformer is introduced into the S-vectors model to extract global time-domain features of speech. Finally, global frequency-domain features relevant to PD speech detection are embedded into the time-domain features, fusing time and frequency information for the final PD detection. The effectiveness of the proposed model was verified on both a public PD dataset and a self-collected speech dataset.
Parkinson's disease; Mel frequency cepstrum coefficient; S-vectors; Conformer; time-frequency feature fusion
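
The fusion idea summarized in the abstract can be illustrated with a minimal sketch. It is not the proposed model. It assumes a matrix of MFCC frames is already available (random numbers stand in for real features here), it replaces the Conformer encoder and S-vectors model with a single untrained self-attention step followed by mean pooling, and it assumes that per-coefficient mean and standard deviation over time are a reasonable stand-in for the global frequency-domain features. The function names `attention_pool`, `frequency_stats`, and `fuse` are hypothetical, and concatenation is only one possible way to embed one feature set into the other.

```python
import numpy as np


def attention_pool(frames: np.ndarray) -> np.ndarray:
    """Global time-domain summary of a (T, d) frame matrix.

    Assumption: one untrained self-attention step plus mean pooling stands in
    for the Conformer encoder. Every frame attends to every other frame, so
    the pooled vector depends on the whole utterance.
    """
    d = frames.shape[1]
    scores = frames @ frames.T / np.sqrt(d)        # (T, T) frame similarities
    scores -= scores.max(axis=1, keepdims=True)    # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    attended = weights @ frames                    # (T, d) context-mixed frames
    return attended.mean(axis=0)                   # (d,)


def frequency_stats(frames: np.ndarray) -> np.ndarray:
    """Hypothetical global frequency-domain feature: mean and std per coefficient."""
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])  # (2d,)


def fuse(frames: np.ndarray) -> np.ndarray:
    """Fuse both views by concatenation (an assumed, simple fusion rule)."""
    return np.concatenate([attention_pool(frames), frequency_stats(frames)])


# Placeholder input: 200 frames x 13 coefficients of random values, not real MFCCs.
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((200, 13))
fused = fuse(mfcc)
print(fused.shape)  # (39,)
```

In a trained system the fused vector would feed a classifier that separates PD speech from healthy speech; the sketch only shows how a time-domain summary and a frequency-domain summary of the same MFCC frames can be combined into one representation.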