3DRes-ViT knee osteoarthritis classification model based on multimodal fusion

Song Yu¹, Xu Rui¹, Cai Xiaodong², Wang Xin¹

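Author information

1. School of Computer Science and Engineering, Changchun University of Technology, Changchun, Jilin 134000
2. Information Department, Jilin Province Qianwei Hospital, Changchun, Jilin 130012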

Abstract

To address the low accuracy of multi-class knee osteoarthritis (KOA) classification and the insufficient extraction of features from knee joint images, this paper proposes 3DRes-ViT, a network model based on multimodal fusion. First, a 3D convolutional neural network (3D CNN) is designed to extract three-dimensional shallow features separately from two magnetic resonance imaging (MRI) sequences: dual echo steady state (DESS) and turbo (fast) spin echo (TSE). The study finds that the two sequences carry complementary information, so their features are fused. Second, an Efficient Channel Attention (ECA) module captures the dependencies among the fused feature channels, and the result is fed into a Vision Transformer (ViT) encoder; combining the strengths of the 3D CNN and ViT allows the local and global features of the two modalities to be aggregated efficiently. Finally, the ViT output is fused with X-ray image features extracted by a 2D convolutional neural network (2D CNN) to further improve classification performance. Experimental results show that the method performs well on the four-class KOA classification task, with an average classification accuracy of 91.2%, an average precision of 91.6%, an F1 score of 0.914, and a mean absolute error reduced to 8.8%. The proposed model surpasses current mainstream methods in the field and significantly improves the accuracy of multi-class knee osteoarthritis classification.
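The channel-attention step can be illustrated with a minimal sketch. It follows the general ECA formulation (global average pooling per channel, a shared 1D convolution across the channel axis, a sigmoid gate, then channel-wise rescaling) and is not the authors' implementation: the abstract does not state the kernel size, channel count, or framework used, so the function names, the toy feature values, and the fixed kernel below are all assumptions chosen for illustration. Pure Python is used so the example runs without a deep-learning library.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive odd kernel size, following the heuristic from the ECA-Net paper.

    gamma and b are the defaults from that paper, not values from this article.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def eca_attention(features, kernel):
    """Apply ECA-style channel attention to a small feature tensor.

    features: list of C channels, each a flat list of voxel values
              (a flattened D x H x W volume per channel).
    kernel:   1D convolution weights of odd length, shared by all channels.
              In a trained network these weights are learned.
    Returns (gates, scaled_features).
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd")
    pad = k // 2

    # 1) Global average pooling: one descriptor per channel.
    pooled = [sum(ch) / len(ch) for ch in features]

    # 2) 1D convolution across the channel axis with zero padding,
    #    so each channel only interacts with its k nearest neighbours.
    padded = [0.0] * pad + pooled + [0.0] * pad
    logits = [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(pooled))
    ]

    # 3) Sigmoid gate per channel.
    gates = [1.0 / (1.0 + math.exp(-z)) for z in logits]

    # 4) Rescale every value in a channel by that channel's gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, features)]
    return gates, scaled


# Toy input (made-up values): 3 channels with 2 voxels each.
toy_features = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]]
# Hypothetical kernel that passes each channel's own descriptor through.
toy_kernel = [0.0, 1.0, 0.0]
gates, scaled = eca_attention(toy_features, toy_kernel)
```

Because the convolution kernel is shared and small, this style of attention adds only k parameters regardless of the number of channels, which is the usual argument for placing it on fused features before a heavier encoder.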

Key words

medical image processing/deep learning/knee osteoarthritis/multi-modal fusion/X-ray/magnetic resonance imaging/vision transformer

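Publication year

2024

Optics and Precision Engineering

Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences; China Instrument and Control Society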

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 2.059

ISSN: 1004-924X