Recognition of the Liver Depression Line in Tongue Images via Vision Transformer Self-Attention

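Intelligent tongue diagnosis is a typical application of computer vision in traditional Chinese medicine, and automatic recognition of tongue images is its core problem. The liver depression line, located on both sides of the tongue surface, is an important basis for clinical diagnosis in traditional Chinese medicine, and studying its automatic recognition can promote the further development of intelligent tongue diagnosis. Drawing on the advantages of the self-attention mechanism of the vision Transformer, this paper proposes attention guidance and local region reuse strategies based on Swin Transformer, and builds on them the ST-LDL network for liver depression line recognition. Attention guidance helps ST-LDL extract fine-grained features of the liver depression line; local region reuse exploits the local strong-response regions on both sides of the tongue surface, on the one hand to train the local network branch of ST-LDL and on the other to enhance its global network branch. Ablation experiments show that using either the attention guidance strategy or the region reuse strategy alone improves recognition performance, and that all metrics improve substantially when the two strategies are used together. Comparative experiments show that the proposed algorithm outperforms general-purpose algorithms and also outperforms existing algorithms designed specifically for liver depression line recognition.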
Automatic Recognition of Liver Depression Line in the Tongue Based on the Self-Attentive Mechanism of Vision Transformer
Automatic recognition of tongue images is the core issue of intelligent tongue diagnosis, which is a typical application of computer vision in the field of traditional Chinese medicine. The liver depression line (LDL), which is located on both sides of the tongue surface, is an important clinical diagnostic basis in traditional Chinese medicine. Its automatic visual recognition can promote the further development of intelligent tongue diagnosis. To improve the accuracy of LDL recognition, we propose attention guidance and local region reuse strategies that build on the advantages of the self-attention mechanism of the vision Transformer. We then design a network, ST-LDL, for LDL recognition based on Swin Transformer. The attention guidance strategy helps ST-LDL extract fine-grained features of the LDL. The local region reuse strategy uses the local strong-response areas on both sides of the tongue surface to train the local network branch of ST-LDL on the one hand and to enhance the global network branch of ST-LDL on the other. Ablation experiments show that using either the attention guidance strategy or the region reuse strategy alone improves recognition performance; all metrics increase significantly when the two strategies are combined. Comparative experiments show that our algorithm outperforms other existing advanced algorithms.
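
The abstract does not say how the local strong-response areas are selected, so the following is only a minimal sketch of one plausible reading, not the ST-LDL procedure itself. It assumes a 2-D response map (for example, an attention map resized to the image resolution) and picks, in each half of the tongue image, the fixed-size window with the largest summed response. The function names `strongest_window` and `crop_side_regions`, the fixed window size, and the left/right split at the image midline are all hypothetical choices made for illustration.

```python
import numpy as np


def strongest_window(response, win_h, win_w):
    """Return (top, left) of the win_h x win_w window with the largest summed response."""
    h, w = response.shape
    # Integral image with a zero border, so each window sum needs four lookups.
    integral = np.zeros((h + 1, w + 1), dtype=np.float64)
    integral[1:, 1:] = response.cumsum(axis=0).cumsum(axis=1)
    sums = (integral[win_h:, win_w:] - integral[:-win_h, win_w:]
            - integral[win_h:, :-win_w] + integral[:-win_h, :-win_w])
    top, left = np.unravel_index(np.argmax(sums), sums.shape)
    return int(top), int(left)


def crop_side_regions(image, response, win_h, win_w):
    """Crop the strongest-response window from the left and right halves.

    Assumption: `image` and `response` share the same height and width.
    """
    mid = response.shape[1] // 2
    halves = (("left", 0, response[:, :mid]), ("right", mid, response[:, mid:]))
    crops = {}
    for name, offset, half in halves:
        top, left = strongest_window(half, win_h, win_w)
        left += offset  # convert back to full-image column coordinates
        crops[name] = image[top:top + win_h, left:left + win_w]
    return crops
```

In a two-branch design such as the one the abstract describes, crops like these could be fed to a local branch while the full image goes to a global branch; how ST-LDL actually couples the two branches is not stated here.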

Intelligent tongue diagnosis; liver depression line; vision Transformer; self-attention mechanism; attention guidance; local area reuse

Zhang Dongxiao (张东晓), Du Xinkang (杜鑫康)

School of Science, Jimei University, Xiamen 361021, China

Intelligent tongue diagnosis; liver depression line; vision Transformer; self-attention mechanism; attention guidance; local region reuse

National Natural Science Foundation of China (12271211); Natural Science Foundation of Fujian Province (2020J01710, 2021J01861)

2024

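Journal of Systems Science and Mathematical Sciences
Academy of Mathematics and Systems Science, Chinese Academy of Sciences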

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.425
ISSN: 1000-0577
Year, volume (issue): 2024, 44(5)