Sentence-Level Temporal Convolutional Networks for Multimodal Depression Recognition

Multimodal depression recognition models suffer from three problems: feature extraction captures only weak correlations between sentences, features from different modalities are fused in an ad hoc manner, and generalization to Chinese datasets has not been verified. By analyzing the audio, text, and visual features related to depression, this paper proposes STCMN (Sentence-level Temporal Convolutional Memory Network), a multimodal depression recognition model based on an improved TCN, and applies it to the auxiliary diagnosis of clinical depression. The model first uses a fusion module composed of residual blocks, a GRU, and Self-Attention to extract sentence-level features in each modality, which strengthens contextual connections. It then uses a TCN to extract the global features of each modality and applies Cross Attention to fuse these global features, with the multimodal fusion features taken as primary. Finally, a LogSoftmax layer outputs the model's depression recognition result. On the public DAIC-WOZ dataset, the proposed method reaches an accuracy of 91.3%, a precision of 93.6%, and a recall of 89.7% for depression recognition, outperforming the other methods on all of these metrics and better meeting the needs of clinical practice. On the private Chinese dataset MMD2022, STCMN still achieves the best recognition results, indicating that the model generalizes well to Chinese depression recognition tasks.
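Two of the building blocks named in the abstract, the causal dilated convolution at the core of a TCN and the LogSoftmax output layer, can be illustrated with a minimal pure-Python sketch. This is not the STCMN implementation: the abstract gives no kernel sizes, dilation rates, or channel counts, so the function names, toy inputs, and weights below are all hypothetical, and the residual connections, nonlinearities, GRU, and attention modules are omitted.

```python
import math


def causal_dilated_conv1d(x, kernel, dilation=1):
    """Causal dilated 1-D convolution over a sequence of floats.

    y[t] = sum_k kernel[k] * x[t - (K - 1 - k) * dilation],
    where samples before the start of the sequence count as zero,
    so y[t] never depends on inputs later than t.
    """
    K = len(kernel)
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k in range(K):
            src = t - (K - 1 - k) * dilation
            if src >= 0:  # implicit left zero-padding
                acc += kernel[k] * x[src]
        y.append(acc)
    return y


def log_softmax(logits):
    """Numerically stable log-softmax: logits[i] - log(sum_j exp(logits[j]))."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


# Toy sequence standing in for per-sentence feature values (made up).
sequence = [1.0, 2.0, 3.0, 4.0]

# Two stacked layers with kernel size 2 and dilations 1 and 2 (assumed values).
layer1 = causal_dilated_conv1d(sequence, [0.5, 1.0], dilation=1)
layer2 = causal_dilated_conv1d(layer1, [0.5, 1.0], dilation=2)

# Two hypothetical class scores derived from the last time step.
log_probs = log_softmax([layer2[-1], -layer2[-1]])
print(layer1, layer2, log_probs)
```

With kernel size 2 and dilations 1 and 2, each output of the second layer depends on the current input and the three before it, which is how stacking dilated layers widens the temporal context without looking ahead.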

depression; TCN; GRU; self-attention; cross attention

Wang Fengfei (王烽飞), Zhuo Guangping (卓广平), Zhou Jinbao (周金保), Liu Guoqiang (刘国强), Zhang Guanghua (张光华)

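School of Computer Science and Technology, Taiyuan Normal University, Jinzhong 030619, Shanxi

Department of Intelligence and Automation, Taiyuan University, Taiyuan 030032, Shanxi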

depression; temporal convolutional network; gated recurrent unit; self-attention mechanism; cross-attention mechanism

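General Program of the Shanxi Provincial Natural Science Foundation; Shanxi Provincial Key Research and Development Program; Taiyuan Normal University Graduate Education Innovation Funding Project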

201801D121147; 202202150401019; SYYJSYC-2399

2024

Journal of North University of China (Natural Science Edition)
North University of China

Impact factor: 0.258
ISSN: 1673-3193
Year, volume (issue): 2024, 45(3)