基于多特征融合的食品图像分类

Food Image Classification Based on Multi-Feature Fusion

Abstract: As living standards rise, demand for healthy eating keeps growing, and food image recognition has become an active research topic. Differences in processing and cooking cause foods of the same category to vary in shape and color, while foods of different categories can look alike, so food images are harder to recognize than general images. To address this, the paper proposes MTFNet, a food image classification network based on multi-feature fusion. First, the RGB color channels of the image are fused with the texture features given by the Local Binary Pattern (LBP), and the result is fed to the backbone Squeeze-and-Excitation Network (SENet). Next, a detail attention module mines per-channel weights at different spatial positions and uses them to locally enhance the feature map of each layer, improving its local representation ability. Then, a self-attention mechanism computes self-attention weights between the channels of the feature map, capturing correlations between feature maps and extracting global features of the image. Finally, the locally enhanced features and the global features are concatenated and fused for classification. Experiments on the ETH Food101, ChineseFoodNet, and ISIA Food-500 food image datasets show that, compared with the current best model, the Multi-scale Jigsaw Reconstruction Network (MJR-Net), MTFNet improves Top-1 accuracy by 0.44, 1.01, and 0.66 percentage points, respectively, achieving better recognition performance.
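
The first step described in the abstract, fusing the RGB channels with an LBP texture channel before the backbone, can be illustrated with a minimal sketch. The abstract does not specify the LBP variant or how the fusion is done, so everything below is an assumption made for illustration: a basic 3x3, 8-neighbour LBP with ">=" thresholding, BT.601 luma weights for the grayscale conversion, border pixels left at 0, and fusion by appending the LBP code as a fourth channel. The function names `to_gray`, `lbp`, and `fuse_rgb_lbp` are hypothetical and not taken from the paper.

```python
# Sketch only: build a 4-channel (R, G, B, LBP) input from an RGB image.
# Assumptions (not from the paper): 3x3 neighbourhood, 8 neighbours,
# ">=" thresholding, BT.601 luma weights, border pixels left at 0.

# Neighbour offsets (dy, dx), clockwise from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def to_gray(rgb):
    """Convert an H x W grid of (r, g, b) tuples to integer luma values."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def lbp(gray):
    """Basic 8-neighbour LBP code per pixel; border pixels stay 0."""
    h, w = len(gray), len(gray[0])
    codes = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            centre = gray[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(NEIGHBOURS):
                # Set this neighbour's bit when it is at least as bright
                # as the centre pixel.
                if gray[y + dy][x + dx] >= centre:
                    code |= 1 << bit
            codes[y][x] = code
    return codes


def fuse_rgb_lbp(rgb):
    """Append the LBP code to each pixel, giving (r, g, b, lbp) tuples."""
    codes = lbp(to_gray(rgb))
    return [[(r, g, b, codes[y][x]) for x, (r, g, b) in enumerate(row)]
            for y, row in enumerate(rgb)]


# Tiny 3x3 gray-valued image: only the centre pixel has a full neighbourhood.
image = [[(10, 10, 10), (20, 20, 20), (30, 30, 30)],
         [(40, 40, 40), (50, 50, 50), (60, 60, 60)],
         [(70, 70, 70), (80, 80, 80), (90, 90, 90)]]
fused = fuse_rgb_lbp(image)
print(fused[1][1])  # (50, 50, 50, 120)
```

In the example, the four neighbours to the right of and below the centre (60, 90, 80, 70) are brighter than 50, so bits 3 to 6 are set and the code is 8 + 16 + 32 + 64 = 120. Pure-Python loops are used only to keep the sketch dependency-free; on real images a vectorized implementation would be the practical choice.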

Keywords:

food image classification; Local Binary Pattern (LBP); Squeeze-and-Excitation Network (SENet); detail attention; self-attention

Authors:

Ye Zhipeng (叶志鹏), Jiang Feng (姜枫)

Affiliation:

Taizhou Institute of Sci. & Tech., NJUST (南京理工大学泰州科技学院), Taizhou 225300, Jiangsu, China

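Keywords (Chinese):

食品图像分类 (food image classification); 局部二值模式 (local binary pattern); 挤压和激励网络 (squeeze-and-excitation network); 细节注意力 (detail attention); 自注意力 (self-attention)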

Publication year:

2024

DOI:

10.19678/j.issn.1000-3428.0068731

Computer Engineering (计算机工程)

East China Institute of Computer Technology (华东计算技术研究所); Shanghai Computer Society (上海市计算机学会)

CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.581

ISSN: 1000-3428

Year, volume (issue): 2024, 50(12)