
Food Image Classification Based on Multi-Feature Fusion

As living standards improve, demand for a healthy diet keeps growing, and food image recognition has become an active research topic. Differences in processing and cooking cause foods of the same category to vary in shape and color, while foods of different categories can look alike, so food images are harder to recognize than general images. To address this, a multi-feature fusion food image classification network, MTFNet, is proposed. First, the RGB color channel data of the image are fused with the texture features produced by the Local Binary Pattern (LBP) and used as the input of the backbone Squeeze and Excite Network (SENet). Next, a detail attention module mines the weight of each channel at different positions and locally enhances the feature map of each layer, improving its local representation ability. Then, a self-attention mechanism computes self-attention weights between the channels of the feature map, mining the correlation between feature maps and extracting the global features of the image. Finally, the locally enhanced features and the global features are concatenated and fused for image classification. Experimental results on the food image datasets ETH Food101, ChineseFoodNet, and ISIA Food-500 show that, compared with the Multi-scale Jigsaw Reconstruction Network (MJR-Net), the best existing model, MTFNet improves Top-1 accuracy by 0.44, 1.01, and 0.66 percentage points, respectively, achieving better recognition performance.
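Two of the steps named in the abstract, fusing RGB with an LBP texture map and computing self-attention between channels, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the fusion is performed, so stacking the LBP map as a fourth input channel is an assumption, as are the basic 3x3 LBP variant, the unlearned dot-product attention, and the function names `lbp_8neighbor`, `fuse_rgb_lbp`, and `channel_self_attention`.

```python
import numpy as np


def lbp_8neighbor(gray: np.ndarray) -> np.ndarray:
    """Basic 3x3 LBP code per pixel (edge-padded), returned as uint8."""
    h, w = gray.shape
    padded = np.pad(gray, 1, mode="edge")
    # Neighbour offsets into the padded image, clockwise from top-left.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = np.zeros((h, w), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = padded[dy:dy + h, dx:dx + w]
        # Set this bit where the neighbour is at least as bright as the centre.
        code |= (neighbour >= gray).astype(np.uint8) * np.uint8(1 << bit)
    return code


def fuse_rgb_lbp(rgb: np.ndarray) -> np.ndarray:
    """Assumed fusion: stack R, G, B and the LBP map into (H, W, 4) in [0, 1]."""
    gray = rgb.astype(np.float64) @ np.array([0.299, 0.587, 0.114])
    lbp = lbp_8neighbor(gray)
    return np.dstack([rgb, lbp]).astype(np.float32) / 255.0


def channel_self_attention(feat: np.ndarray):
    """Unlearned channel self-attention on a (C, H, W) feature map.

    Returns the re-weighted feature map and the (C, C) attention weights.
    """
    c = feat.shape[0]
    x = feat.reshape(c, -1)                      # each channel as a vector
    scores = x @ x.T / np.sqrt(x.shape[1])       # channel-to-channel similarity
    scores = scores - scores.max(axis=1, keepdims=True)  # stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return (weights @ x).reshape(feat.shape), weights
```

Calling `fuse_rgb_lbp` on an `(H, W, 3)` uint8 image yields a four-channel array that a backbone with a four-channel first layer could consume. A trained network would normally add learned query, key, and value projections before the attention step, which this sketch omits.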

food image classification; Local Binary Pattern (LBP); Squeeze and Excite Network (SENet); detail attention; self-attention

Ye Zhipeng (叶志鹏), Jiang Feng (姜枫)


Taizhou Institute of Science and Technology, Nanjing University of Science and Technology, Taizhou, Jiangsu 225300, China


2024

Computer Engineering (计算机工程)
East China Institute of Computer Technology (华东计算技术研究所); Shanghai Computer Society (上海市计算机学会)


CSTPCD; Peking University Core Journals
Impact factor: 0.581
ISSN: 1000-3428
Year, volume (issue): 2024, 50(12)